
Meta AI security researcher Summer Yue’s X post initially reads like satire. She instructed the OpenClaw AI agent to check her full email inbox and suggest what to delete or keep.
The agent continued to run wild. I started deleting all my emails in a “speed run,” ignoring my phone’s commands to stop.
“I had to run to my Mac mini like I was defusing a bomb,” she said, posting an image of the ignored stop message as a receipt.
The Mac Mini, an inexpensive Apple computer that lies flat on your desk and fits in the palm of your hand, has become the preferred device these days for running OpenClaw. (The Mini is selling “like hotcakes,” one “confused” Apple employee apparently told famed AI researcher Andrej Karpathy when he bought a Mini to run an OpenClaw alternative called NanoClaw.)
It is an open source AI agent that gained notoriety through OpenClaw, as well as Moltbook, a social network dedicated to AI. The OpenClaw agents were at the center of a now largely debunked episode of Moltbook in which the AI appears to be conspiring against humans.
However, according to its GitHub page, OpenClaw’s mission is not focused on social networks. It aims to become a personal AI assistant that runs on your own device.
The Silicon Valley crowd was so enamored with OpenClaw that “claw” and “claws” became the buzzwords of choice for agents running on private hardware. Other agents include ZeroClaw, IronClaw, and PicoClaw. Y Combinator’s podcast team even appeared in lobster costumes in the most recent episode.
Tech Crunch Event
Boston, Massachusetts
|
June 9, 2026
But Yue’s post serves as a warning. As others on X have pointed out, if AI security researchers can face this problem, what hope can mere mortals have?
“Are you intentionally testing the guardrail, or did you make a rookie mistake?” A software developer asked her about X.
“It was a rookie mistake.” she answered. She’s been testing the agent using what she calls a small “toy” inbox, and it’s been running well on less important emails as well. Since it won her trust, she thought she would reveal the real thing.
Yue believes that the large amount of data in his physical inbox “created compression.” Compression occurs when the context window, a running record of everything the AI heard and did in a session, becomes so large that the agent begins to summarize, compress, and manage the conversation.
At that point, the AI may skip instructions that humans consider very important.
In this case, you may have skipped the last message telling you not to act and went back to the instructions in your “toys” inbox.
As several others on X have pointed out, prompts cannot be trusted to act as security guardrails. The model may misinterpret or ignore this.
Various people provided various suggestions, ranging from the exact syntax that Yue should have used to stop the agent, to various ways to better comply with the guardrails, such as writing instructions in a dedicated file or using other open source tools.
In the interest of full transparency, TechCrunch was unable to independently verify what happened to Yue’s inbox. (She responded to many of our questions and comments about X, but did not respond to our request for comment.)
But that doesn’t really matter.
The point of the story is that agents targeting knowledge workers currently in the development phase are risky. Those who say they are using it successfully are weaving together a number of methods to protect themselves.
It may be ready for widespread use someday, perhaps soon (2027? 2028?). We know that many of us out there want help with our emails, ordering groceries, or scheduling dental appointments. But that day hasn’t come yet.