ChatGPT Vulnerabilities Could Let Hackers Hijack Conversations and Steal Data

ChatGPT’s hidden vulnerabilities could expose users to data theft and long-term AI manipulation.

DevOps code

Key Takeaways:

  • Tenable researchers discovered multiple critical vulnerabilities in OpenAI’s ChatGPT models.
  • The flaws could enable data leaks, prompt manipulation, and persistent malicious control.
  • Some vulnerabilities remain exploitable despite partial fixes by OpenAI.

Cybersecurity researchers have disclosed a new set of vulnerabilities in OpenAI’s ChatGPT. These security flaws could allow attackers to hijack conversations, leak private data, and even implant persistent malicious instructions.

Specifically, Tenable Research has discovered seven critical vulnerabilities in OpenAI’s ChatGPT models that expose users to significant security risks, including data leakage, prompt manipulation, and persistent session compromise. These flaws demonstrate how attackers can exploit the model’s browsing, search, and memory features to execute hidden commands.

How attackers exploit prompt injection and hidden commands

A major security vulnerability that Tenable researchers highlighted is prompt injection, where attackers embed hidden instructions in web content or URLs. These can be triggered indirectly or even without user interaction, if the model pulls data from a compromised site during a search. In some cases, simply clicking a link with a crafted query parameter can activate the attack.

Another set of vulnerabilities involves bypassing safety mechanisms. For instance, attackers can exploit trusted redirect links (like those from Bing) to sneak past OpenAI’s URL filters and exfiltrate data character by character. Moreover, attackers can hide malicious prompts within Markdown-rendered code blocks, which makes them invisible to users but still readable by the model. This allows for stealthy manipulation of ChatGPT’s behavior.

A memory injection vulnerability could allow attackers to trick ChatGPT into storing malicious instructions in its persistent memory. This means the model could carry out harmful actions across multiple sessions, even after the original conversation ends.

ChatGPT Vulnerabilities Could Let Hackers Hijack Conversations and Steal Data
Researchers get SearchGPT to make ChatGPT update its memories, as noted by ‘Memory updated (Source: Tenable)

Tenable demonstrated how combining discovered vulnerabilities can lead to full attack chains, in which users are tricked into clicking seemingly safe links that silently inject malicious prompts into ChatGPT. These prompts can then exploit markdown rendering to secretly extract private data, and even manipulate the model’s memory to maintain control over future sessions.

“By mixing and matching all of the vulnerabilities and techniques we discovered, we were able to create proofs of concept (PoCs) for multiple complete attack vectors, such as indirect prompt injection, bypassing safety features, exfiltrating private user information, and creating persistence,” Tenable researchers explained.

Tenable reported these security vulnerabilities in ChatGPT to OpenAI in April 2025, and the company has addressed some of them. However, researchers found that several vulnerabilities remain exploitable in GPT-5 and GPT-4o.

Recommendations for securing AI systems against exploits

To protect themselves from these types of AI-driven vulnerabilities, organizations should implement a multi-layered security strategy. This includes limiting AI access to sensitive data, monitoring AI interactions for anomalies, and educating employees about risks like prompt injection and phishing via AI-generated content.

Additionally, organizations should regularly audit and test AI integrations, apply strict input sanitization, and work closely with vendors to ensure timely patching of discovered vulnerabilities. It’s also advised to disable or restrict features like persistent memory or web browsing in AI tools to reduce exposure to long-term threats.