OpenAI Introduces New Safeguards in ChatGPT to Prevent AI Prompt Injection
OpenAI is tightening the bolts on ChatGPT as attackers zero in on AI systems.
In a Feb. 13 announcement, the company introduced two new safeguards to combat prompt injection attacks, a growing threat that can trick AI into exposing sensitive data:
- The first is an “Elevated Risk” label that warns users before they take potentially dangerous actions such as opening external links or connecting to internal networks.
- The second is Lockdown Mode, which can limit or fully disable high-risk features, such as web browsing, to reduce the risk of data exfiltration.
As ChatGPT becomes more capable and agentic, OpenAI is signaling a shift in focus: advanced AI needs visible, built-in security controls, not just smarter outputs.
Prompt injection attacks and Lockdown Mode
Prompt injection attacks exfiltrate data by injecting AI-readable malicious commands into webpages. When an AI system visits these pages, it can unintentionally execute these commands, resulting in data leaks. For example, a page can embed an instruction that forces the AI to ignore its security guardrails and reveal internal system prompts or confidential documents.
OpenAI’s Lockdown Mode deterministically limits features in ChatGPT that are exploitable for data exfiltration. The strict measure is optional but highly recommended for security-conscious individuals.
The company’s release stated Lockdown Mode can either limit or completely disable high-risk features when guarantees of safety are unavailable.
Understanding the new ‘Elevated Risk’ label
Certain actions, like connecting ChatGPT to an internal network or opening an external link, carry inherent security risks. Rather than blocking these features outright, OpenAI allows users to proceed but displays an “Elevated Risk” label as a clear warning.
The label notifies users across ChatGPT, ChatGPT Atlas, and Codex of the potential risk before they move forward.
OpenAI confirmed that activities carrying the warning can change at any time. For example, opening a link triggers the warning only when OpenAI cannot verify the destination’s safety. When the company establishes that the activity no longer carries the risk, the warning label is removed.
What users should do now
The use of AI tools is rapidly changing how the internet works… and security isn’t isolated from this. While conventional means of staying safe on the internet remain effective, here are some key points to adhere to:
- Reduce your attack surface area: ChatGPT has many add-ons. It is best to always enable the ones you need. If you do not need to connect a service like Google Drive to ChatGPT, keep that option disabled.
- Manually check source sites: Hovering over a suggested site shows its URL at the bottom left of your screen when using a computer. In the mobile app, tap and hold the suggested source to display the website’s logo. If it looks odd to you, you should probably not visit it.
- Add custom instructions to account memory: ChatGPT’s memories can help address some issues on your end. For instance, you can request that it never suggest links to you while using it.
- High-risk users should act fast: C-level executives and security teams are particularly at risk of these data-exfiltration attacks.
Availability to users
OpenAI said the protection would roll out for users in the coming months. While this suggests a batched rollout, we are still unsure whether it will apply to all payment tiers.
However, users on business plans already have this protection implemented for them, configured to their category. Available ones include ChatGPT Enterprise, ChatGPT Edu, ChatGPT for Healthcare, and ChatGPT for Teachers.
The press release by OpenAI also stated that admins of these categories’ plans will be able to exercise granular controls over how Lockdown Mode is executed in their workspaces.
In other OpenAI news: The company’s internal, unreleased GPT model just solved five of 10 “impossible” math problems.
The post OpenAI Introduces New Safeguards in ChatGPT to Prevent AI Prompt Injection appeared first on eWEEK.