You may know the story by now: A Meta exec asked the viral OpenClaw AI tool to triage her inbox and suggest messages to delete, then watched in horror as the agent went rogue and nuked more than 200 emails, her frantic “STOP OPENCLAW” prompt lost amid the bot’s massive undertaking. The twist? The exec was Meta’s lead AI safety officer, Summer Yue.

Yes, Yue unwittingly made herself a guinea pig for OpenClaw and its runaway automations–and indeed, pretty much anyone using OpenClaw right now is a guinea pig. But Yue’s email apocalypse also highlighted a way we can prevent similar agentic AI horror stories, and it’s a method that most coders–and even plenty of vibers–are already familiar with.

It goes by different names; I’ve heard it called “agent git flow” and “agentic feature branching,” for example. But mostly, it’s about applying the methodology of “git”–the command-line utility that’s essential for tracking changes in code–to AI agents.

The best part of this solution? It lets us have our cake (the cake being the ultra-cool things AI agents can do) and eat it, too.

Chicken, fish, and OpenClaws

First, a thought experiment. Pretend you’re at a restaurant, and there are two items on the menu: chicken or fish. The chicken sure sounds good, but the fish–salmon! Tough choice.

Imagine, instead of risking a costly mistake by choosing the chicken over the fish (what if the chicken is spoiled!), you could create a “branch” of your immediate future–a temporary copy of your timeline that lets you test a choice before permanently making it.

So, you go ahead and create (or “check out”) a new branch of your “main” lifeline–we’ll call it the “chicken branch”–and you then order and taste the chicken. Eww! It’s gross. No problem; we discard the chicken branch, go back to the “main” branch, and check out a new, second branch–the “fish” branch. Now we taste the salmon–delicious!
We like this fish branch, so now we merge it with our “main” life branch, and commence with a meal that’s guaranteed to be yummy.

In the code-tracking world of git, we call this functionality (which I’ve described only crudely) feature branching, and it’s an ingenious, battle-tested way to test big changes and new features in our code before committing them to our main project.

A feature branch in git is really just a copy of the “main” branch. We check it out like a book from the library, make all the changes we want, test it, find bugs, make more changes, and so on. All the while, the “main” branch of our project is safe and untouched. Only after we’ve subjected our feature branch to a battery of tests–some automated, some performed by the human user–and determined that it’s in tip-top shape do we even think of merging our “feature” branch with the main branch. And if we don’t like how the feature branch is going, we can discard it–no harm, no foul.

My point? This code-branching methodology can work with AI agents, too. (And no, I’m not the first person to consider this idea.)

How this could have gone better

Let’s go back to Summer Yue and try our “branching” scenario on for size. This time, Yue sits down with OpenClaw and prompts it with, “Go through my inbox and suggest deletions.” (Her other prompt in the real-world story–“wait for approval”–was likely dropped from OpenClaw’s context window due to the sheer number of email messages it was wading through.)

Now, instead of OpenClaw diving into the live inbox, it creates a branch–call it the “triage” branch–that allows it to simulate the results of sifting, organizing, and culling her inbox, all in a sandboxed environment and all without touching her actual email messages.
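To make the pattern concrete, here’s a minimal Python sketch of the idea–a toy “agent” proposing deletions against a copy of the inbox rather than the real thing. Every name here (the inbox structure, the agent, the functions) is invented for illustration; this is not OpenClaw’s actual API.

```python
import copy

def agent_triage(inbox):
    # A stand-in for the AI agent: flag anything that looks like spam for deletion.
    # (A real agent is far less predictable -- which is the whole problem.)
    return [msg["id"] for msg in inbox if "WIN A PRIZE" in msg["subject"]]

def run_in_triage_branch(live_inbox, agent):
    # The agent only ever sees a deep copy -- the "triage branch."
    branch = copy.deepcopy(live_inbox)
    doomed = set(agent(branch))
    proposed = [msg for msg in branch if msg["id"] not in doomed]
    return proposed, sorted(doomed)

live_inbox = [
    {"id": 1, "subject": "Q3 safety review"},
    {"id": 2, "subject": "WIN A PRIZE today!!!"},
    {"id": 3, "subject": "Lunch on Friday?"},
]

proposed, deletions = run_in_triage_branch(live_inbox, agent_triage)
print("Agent proposes deleting message IDs:", deletions)

# However badly the agent behaved on its branch, the live inbox is untouched...
assert len(live_inbox) == 3

# ...and only an explicit, human-approved "merge" makes the changes real.
approved = True
if approved:
    live_inbox = proposed
```

Discarding the branch is just as cheap as in git: if the human doesn’t approve, the copy is thrown away and the live inbox was never touched.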
OpenClaw does its thing, maybe gets carried away, and starts deleting messages willy-nilly. If that happened, Yue could simply look at the triage branch, decide she’s not happy with the results, and then either discard the branch or keep working with it, testing different iterations of the OpenClaw prompt or adding markdown-formatted “scaffolding” documents that govern OpenClaw’s actions from the word go. In the meantime, her real inbox is safe and sound.

Now, will such “feature branching” work for every AI agent scenario? Probably not. It’s easy to put branched computer code into a sandbox and safety-test any number of actions and outcomes. But just as you can’t actually sandbox the chicken-versus-fish choice, there are plenty of real-world agentic AI actions and roles (like, say, HR-focused AI agents) that can’t easily be simulated.

That said, more–and potentially scarier–versions of Summer Yue’s terrible, horrible, no good, very bad email day will happen again if we don’t give this “agentic feature branching” idea a fair shake.