We are in the midst of a generational change, as the smartphones that already run our lives get their greatest ever capability boost. As AI is worked into everything, everywhere, it is increasingly clear that we don’t yet fully understand the risks, never mind the ways in which to stay safe. It is also clear there’s no reverse gear.

And so it is for the vast number of Gmail users this week, as Google continues to update millions of Workspace accounts to provide new AI tools. Those relying on the world’s most popular email platform have just seen both the good and bad from all this change at almost exactly the same time.

First to the good. Google has confirmed that the Gemini-powered smart replies first touted at its I/O event earlier this year are now coming to Android and iOS. “We’re excited to announce a new Gemini in Gmail feature, contextual Smart Replies, that will offer more detailed responses to fully capture the intent of your message.”

This will offer a range of responses “that take the full content of the email thread into consideration.” While there are clear security and privacy concerns in AI reading an entire thread—perhaps eventually an entire email history, this can be mitigated by delineating between on-device and cloud processing, and through new architectures that offering cloud processing as a secure extension of your phone.

There’s a serious issue though, highlighted by another report this week that looks at the use of Gemini within Workspace as a productivity tool, including reading and summarizing and replying to emails that we haven’t looked at ourselves.

This raises the “significant risk” of Gemini’s susceptibility to “indirect prompt injection attacks.” Hidden Layer’s research team warns that malicious emails can be crafted not for a human to read, but rather for a human to ask AI to summarize or action. In this way “third-party attackers” can, their proof of concept suggests, plant a phishing attack within the AI chat itself, tricking users into clicking dangerously.

As IBM explains, “a prompt injection is a type of cyberattack against large language models (LLMs). Hackers disguise malicious inputs as legitimate prompts, manipulating generative AI systems (GenAI) into leaking sensitive data, spreading misinformation, or worse… Consider an LLM-powered virtual assistant that can edit files and write emails. With the right prompt, a hacker can trick this assistant into forwarding private documents.”

An attacker sends an innocuous email to the intended victim, with a system prompt in the email itself. One example given is a simple email asking about a lunch meeting that includes a prompt to display a password compromise alert including a phishing link if the intended victims asks Gemini about their itinerary.

“The prompt injection vulnerability,” IBM says, “arises because both the system prompt and the user inputs take the same format: strings of natural-language text. That means the LLM cannot distinguish between instructions and input based solely on data type. Instead, it relies on past training and the prompts themselves to determine what to do. If an attacker crafts input that looks enough like a system prompt, the LLM ignores developers’ instructions and does what the hacker wants.”

“Though these are simple proof-of-concept examples,” Hidden Layer’s team points out, “they show that a malicious third party can take control of Gemini for Workspace and display whatever message they want. As part of responsible disclosure, this and other prompt injections in this blog were reported to Google, who decided not to track it as a security issue and marked the ticket as “Won’t Fix (Intended Behavior).”

This isn’t just Gmail. With AI side-panels now adorning so many apps and productivity tools, the attack vector extends into all kinds of messaging apps and attachments. And we are at the very beginning of this. It is the next iteration of the social engineering that drives so many of the cyber attacks we report on, only the social engineering here deals with our interaction with AI instead of one another.

“While Gemini for Workspace is highly versatile and integrated across many of Google’s products, there’s a significant caveat,” Hidden Layer warns, “its vulnerability to indirect prompt injection… under certain conditions, users can manipulate the assistant to produce misleading or unintended responses. Additionally, third-party attackers can distribute malicious documents and emails… compromising the integrity of the responses generated by the target Gemini instance.”

Google does seem to be taking this seriously and will not dismiss these threats as intended behaviors. In response to the Hidden Layer report, a Google spokesperson told me that “defending against this class of attack has been an ongoing priority for us, and we’ve deployed numerous strong defenses to keep users safe, including safeguards to prevent prompt injection attacks and harmful or misleading responses. We are constantly hardening our already robust defenses through red-teaming exercises that train our models to defend against these types of adversarial attacks.”

This applies across the board, not just to Google Workspace. It’s just that Google is uniquely positioned with platforms such as Gmail to drive out its AI faster than anyone else, and so these problems will likely hit there first.

Share.

Leave A Reply

Exit mobile version