{"id":24626,"date":"2026-02-24T06:02:44","date_gmt":"2026-02-24T06:02:44","guid":{"rendered":"https:\/\/microvibenews.com\/?p=24626"},"modified":"2026-02-24T06:02:44","modified_gmt":"2026-02-24T06:02:44","slug":"ai-agents-promise-to-work-while-you-sleep-the-reality-is-far-messier","status":"publish","type":"post","link":"https:\/\/microvibenews.com\/?p=24626","title":{"rendered":"AI agents promise to work while you sleep. The reality is far messier"},"content":{"rendered":"<p><img src=\"https:\/\/fortune.com\/img-assets\/wp-content\/uploads\/2026\/02\/GettyImages-2214158029-e1771874524975.jpg?w=2048\" \/><\/p>\n<p>Summer Yue may work on safety and alignment on Meta\u2019s superintelligence team, but even she admits she isn\u2019t immune to overconfidence when it comes to autonomous AI agents.\u00a0<\/p>\n<div>\n<p>In a post on X Monday, Yue described how her OpenClaw autonomous AI agents\u2014built to run locally on a Mac mini computer\u2014deleted her entire inbox, ignoring instructions to pause and ask for confirmation first.<\/p>\n<p>\u201cI had to RUN to my Mac Mini like I was defusing a bomb,\u201d she said. It was, she added, a \u201crookie mistake.\u201d The workflow had been working in a test inbox she used to safely trial the agent for weeks, she explained, but in the real inbox the agent lost her original instruction.\u00a0<\/p>\n<p>Yue\u2019s experience stands in stark contrast to viral posts such as The Lobster Revolution: Why 24\/7 AI Agents Just Changed Everything, in which Peter Diamandis claims always-on AI is far more frictionless.\u00a0<\/p>\n<p>\u201cLet me tell you what it feels like to use this,\u201d Diamandis wrote. \u201cYou wake up in the morning and your agent\u2014mine is named Skippy, cheerfully sarcastic and absurdly capable\u2014has done eight hours of work while you slept. It read a thousand pages of markdown. It organized your files. It drafted three project plans. It booked your travel. 
It researched that question you had at 11 PM and forgot about.<\/p>\n<p>\u201cWhen my Mac mini went offline for six hours, I felt withdrawal,\u201d he added. \u201cLike my best friend disappeared.\u201d<\/p>\n<p>Together, these dueling accounts of the power of AI agents capture the tension at the heart of today\u2019s push toward \u201calways-on\u201d AI. As tools like OpenClaw and Claude Code make it technically possible for agents to run for long periods, excitement is growing around the idea of AI that works while you sleep. But in practice, early users say that autonomy remains fragile, unpredictable, and labor-intensive to manage. Rather than replacing human work, today\u2019s agents often require constant monitoring, guardrails, and intervention, especially when the stakes rise beyond low-risk experiments.<\/p>\n<h2 class=\"wp-block-heading\"><strong>AI agents work best when tasks are simple and low-stakes<\/strong><\/h2>\n<p>Shyamal Anadkat, who previously worked as an applied AI engineer at OpenAI, said most of today\u2019s successful agents still require frequent human check-ins or are limited to tightly bounded, well-defined tasks\u2014though he emphasized that this will change as measurement and evaluation techniques improve.<\/p>\n<p>\u201cA system that\u2019s 95% accurate on individual steps becomes chaotic over a 20-step autonomous workflow,\u201d Anadkat said. (If errors at each step are independent, the chance that all 20 steps succeed is 0.95<sup>20<\/sup>, or roughly 36%.) \u201cLong-horizon planning is still weak.\u201d As a result, he explained, agents may perform well on short task chains but tend to fall apart when asked to manage complex, multiday projects. Memory is another major limitation: \u201cIn many agents, memory is either nonexistent or fragile. 
You need systems that can maintain a coherent model of your work context, priorities, and constraints.\u201d<\/p>\n<p>That doesn\u2019t mean the promise of AI agents is all smoke and mirrors, according to Yoav Shoham, a former principal scientist at Google, a professor emeritus at Stanford, and cofounder of AI21 Labs. But it does mean there is a danger of people getting ahead of themselves. Today\u2019s AI agents, he explained, work best when the task is low-risk, loosely defined, and cheap to get wrong.<\/p>\n<p>\u201cDevelopers like toys, and you have this toy that can do wonderful things,\u201d he told <em>Fortune<\/em>. \u201cAs long as what they\u2019re doing is fairly simple and fairly low-stakes with high tolerance for error, that\u2019s fine.\u201d For example, an agent that reads 10,000 websites overnight and distills the results into tidbits of information could be useful.<\/p>\n<p>But for mission-critical enterprise workflows, the bar is much higher. Companies need systems that are verifiable, repeatable, and cost-effective\u2014requirements that quickly erode the set-it-and-forget-it promise of fully autonomous, always-on agents. In highly structured domains like coding or math, deeper automation is already possible. But for most real-world business processes, Shoham said, the work required to make agents reliable often outweighs the benefit.<\/p>\n<p>Bret Greenstein, chief AI officer at consulting firm West Monroe, pointed out that tools like OpenClaw feel like a tipping point similar to what happened with generative AI when ChatGPT launched in 2022\u2014for the first time, they have made the idea of AI agents accessible. Still, it\u2019s not a 24\/7 \u201cmagic solution.\u201d\u00a0<\/p>\n<p>\u201cIt can work for a long time, cranking away on things, but it\u2019s like a toddler that needs to be overseen,\u201d he said. 
Some tasks are reasonable to do while you are sleeping, like scanning LinkedIn messages or tracking news. \u201cI\u2019m not sure I would have it answering customer feedback while I\u2019m sleeping,\u201d he said.\u00a0<\/p>\n<h2 class=\"wp-block-heading\"><strong>Ability to delegate to an AI agent feels powerful<\/strong><\/h2>\n<p>Still, there is little doubt that the ability to delegate real-world tasks to an AI agent is deeply compelling for users, Greenstein emphasized. He pointed to his own experience handing an AI agent the mundane task of getting his clothes picked up to be dry-cleaned\u2014and watching it quietly complete the job end to end.<\/p>\n<p>The agent independently contacted the cleaner, worked out pickup logistics through email exchanges, coordinated timing, monitored a doorbell camera to confirm the pickup, and notified Greenstein once the task was complete. The episode illustrated how agents can operate across multiple systems and adapt when things don\u2019t go as planned. But it also underscored why such tools still require strict guardrails and oversight\u2014especially before they are deployed in enterprise settings.<\/p>\n<p>\u201cOpenClaw is set up so it shouldn\u2019t feel safe for most people,\u201d Greenstein said. \u201cIt doesn\u2019t feel mature enough to be a trusted part of our lives yet.\u201d For AI to be welcomed into everyday life or business operations, he added, it has to earn trust over time\u2014much the way trust is established socially.<\/p>\n<p>Even so, demand is already evident. Greenstein pointed to meetups and early industry gatherings dedicated to OpenClaw, a rapid emergence he described as unusual for such a young tool. 
\u201cIt shows the hunger people have for AI that\u2019s actually useful,\u201d he said\u2014systems that move beyond answering questions and start taking action.<\/p>\n<p>Aaron Levie, CEO of cloud-based content management and collaboration company Box, called what is happening now with AI agents \u201clittle glimmers\u201d of what might happen in the future.\u00a0<\/p>\n<p>\u201cSome glimmers end up not manifesting, some glimmers just become the standard,\u201d he explained, pointing to two years ago when AI company Cognition introduced an early agent called Devin that would integrate with Slack for task delegation, bug fixes, data analysis, and code review. At the time, it was still seen as futuristic, but today, \u201cno one is confused that this is a standard practice,\u201d he said. \u201cYou can just Slack Claude Code to go work on stuff\u2014what seemed like a totally crazy idea is now basically the standard of any modern engineering team.\u201d\u00a0<\/p>\n<p>But while AI agents are becoming very good at automating specific, discrete tasks, they remain poor at handling the broader, context-heavy work that makes up most jobs, Levie emphasized. AI agents may fully automate a handful of tasks, but struggle with the rest\u2014including navigating relationships and participating in meetings.\u00a0<\/p>\n<p>\u201cWhen you hear an AI lab say we\u2019re going to automate all knowledge work in 24 months, that\u2019s usually a very narrow definition of jobs,\u201d he said. 
\u201cThe definition of what an agent can do is not the same definition of what the job is that gets hired in the economy.\u201d\u00a0<\/p>\n<h2 class=\"wp-block-heading\"><strong>The trust factor matters for when things can go wrong<\/strong><\/h2>\n<p>Avinash Vootkuri, a staff data scientist at a top Fortune 500 retailer, said that most enterprise AI agents \u201cabsolutely require a babysitter\u201d and, for now, can work only in enterprise settings with tightly bounded autonomy and extensive guardrails. \u201cThe stakes are massive,\u201d he explained.\u00a0<\/p>\n<p>For example, he described building an agentic system for enterprise cybersecurity where AI agents don\u2019t simply trigger alerts and wait for human review but actively investigate them. Instead of flooding analysts with thousands of warnings, the agents gather evidence in real time\u2014querying threat-intelligence databases, analyzing behavioral patterns, and filtering out false positives\u2014before deciding whether a situation warrants escalation.\u00a0<\/p>\n<p>The system relies on tightly bounded autonomy and extensive guardrails, reducing human workload without removing oversight.<\/p>\n<p>In cybersecurity, he explained, if the agent gets it wrong, the consequences are immediate and severe. \u201cThe AI either blocks legitimate customers (causing massive revenue loss) or it lets a sophisticated threat actor into the network,\u201d he said. \u201cIt absolutely matters if things go wrong.\u201d\u00a0<\/p>\n<p>According to Breeanna Whitehead, who runs an AI operations consultancy where she builds AI-powered systems for executives and founders, the industry is in a \u201ctrust calibration phase.\u201d\u00a0<\/p>\n<p>AI agents can do more than most people let them, but less than the hype suggests.\u00a0<\/p>\n<p>\u201cThe real skill isn\u2019t building the agent\u2014it\u2019s designing the handoff,\u201d she explained. 
\u201cMost people either over-trust agents and end up cleaning up messes, or they micromanage every output and wonder why AI feels like more work instead of less.\u201d The idea, she said, is to design clear handoff points: some tasks are fully delegated, others get a quick review, and others stay with humans.\u00a0<\/p>\n<p>For now, she said, agents are \u201cgenuinely excellent\u201d at what she called the middle layer of knowledge work\u2014\u201cthe stuff that used to eat two to three hours of a smart person\u2019s day, like synthesizing meeting notes into action items, drafting follow-up emails in someone\u2019s voice, pulling together research briefs, organizing competing priorities into a clear plan.\u201d\u00a0<\/p>\n<p>But anything that requires reading a room, navigating ambiguity, or making judgment calls that depend on relationships is not ready for AI agent prime time. \u201cI had a client who wanted to fully automate their investor communications,\u201d she said. \u201cThe AI could draft beautifully, but it couldn\u2019t sense when a funder was losing interest and needed a different approach. The agent drafted the email, but the human had to decide whether to send it.\u201d\u00a0<\/p>\n<h2 class=\"wp-block-heading\"><strong>For now, sleep may be elusive when working with AI agents<\/strong><\/h2>\n<p>For now, working with AI agents may have less to do with sleeping while they work than with staying half-awake while they do. Tools like OpenClaw can run for hours at a time, but for many early users, that autonomy comes with a new kind of vigilance\u2014checking logs, reviewing outputs, and stepping in before things go wrong.<\/p>\n<p>That dynamic was captured in a recent viral post titled \u201cToken Anxiety,\u201d in which investor Nikunj Kothari described a friend leaving a party early\u2014not because he was tired, but because he wanted to get back to his agents. \u201cNobody questions it anymore,\u201d Kothari wrote. 
\u201cHalf the room is thinking the same thing. The other half are probably checking the progress of their agents. At a party.\u201d<\/p>\n<p>The dream of AI that works while you sleep may be real. But for now, it\u2019s still keeping a lot of people awake.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Summer Yue may work on safety &hellip; <\/p>\n","protected":false},"author":1,"featured_media":24627,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[2],"tags":[2083,704,883,1827,3775,703,1900,3563,5687,1606],"_links":{"self":[{"href":"https:\/\/microvibenews.com\/index.php?rest_route=\/wp\/v2\/posts\/24626"}],"collection":[{"href":"https:\/\/microvibenews.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/microvibenews.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/microvibenews.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/microvibenews.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=24626"}],"version-history":[{"count":0,"href":"https:\/\/microvibenews.com\/index.php?rest_route=\/wp\/v2\/posts\/24626\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/microvibenews.com\/index.php?rest_route=\/wp\/v2\/media\/24627"}],"wp:attachment":[{"href":"https:\/\/microvibenews.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=24626"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/microvibenews.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=24626"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/microvibenews.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=24626"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}