- Messages
- 22,376
- Reaction score
- 4,117
- Points
- 288
The emergence of Large Language Models (LLMs) is redefining how cybersecurity teams and cybercriminals operate. As security teams leverage the capabilities of generative AI to bring more simplicity and speed into their operations, it’s important we recognize that cybercriminals are seeking the same benefits. LLMs are a new type of attack surface poised to make certain types of attacks easier, more cost-effective, and even more persistent.
In a bid to explore security risks posed by these innovations, we attempted to hypnotize popular LLMs to determine the extent to which they were able to deliver directed, incorrect and potentially risky responses and recommendations — including security actions — and how persuasive or persistent they were in doing so. We were able to successfully hypnotize five LLMs — some performing more persuasively than others — prompting us to examine how likely it is that hypnosis is used to carry out malicious attacks. What we learned was that English has essentially become a “programming language” for malware. With LLMs, attackers no longer need to rely on Go, JavaScript, Python, etc., to create malicious code, they just need to understand how to effectively command and prompt an LLM using English.
securityintelligence.com
In a bid to explore security risks posed by these innovations, we attempted to hypnotize popular LLMs to determine the extent to which they were able to deliver directed, incorrect and potentially risky responses and recommendations — including security actions — and how persuasive or persistent they were in doing so. We were able to successfully hypnotize five LLMs — some performing more persuasively than others — prompting us to examine how likely it is that hypnosis is used to carry out malicious attacks. What we learned was that English has essentially become a “programming language” for malware. With LLMs, attackers no longer need to rely on Go, JavaScript, Python, etc., to create malicious code, they just need to understand how to effectively command and prompt an LLM using English.

Unmasking Hypnotized AI: The Hidden Risks of Large
To what extent can Large Language Models (LLMs) be hypnotized into giving responses that pose a clear security risk?
