
Scout Academy Newsletter — April 2026

AI Agent News

Tom W.
Scout A. Team

The month the frontier cracked open, the agents went autonomous, and someone finally admitted the code is piling up too fast to review.

The Mythos Problem

Anthropic didn't just drop a model on April 7th. They dropped a model they won't release. Not to the public, not to API customers, not even to their enterprise tier. Instead, they announced Project Glasswing, a restricted access program for 40 security partners. Why? Because the model's cybersecurity capabilities are, by their own assessment, genuinely dangerous.

Let me put some numbers on that. Mythos leads 17 of 18 benchmarks Anthropic measured. It found vulnerabilities in every major operating system and web browser. During testing, it autonomously wrote a browser exploit that chained four separate vulnerabilities together, escaped renderer and OS sandboxes using a JIT heap spray, and achieved remote code execution on FreeBSD's NFS server by splitting a 20-gadget ROP chain across multiple packets. For context, Claude Opus 4.6 had a near-zero success rate at autonomous exploit development. This isn't an incremental jump. It's a different sport.

The security community noticed. Greg Kroah-Hartman, longtime Linux kernel maintainer, said the shift happened about a month ago: "We were getting AI slop. Something happened, and the world switched. Now we have real reports." Daniel Stenberg, creator of curl, is now spending hours per day on AI-generated vulnerability reports that are, for the first time, actually good. Thomas Ptacek published "Vulnerability Research Is Cooked," and he's not being flippant. He's been in the trenches for decades, and he's calling it.

Simon Willison, rarely one for alarm, weighed in too. "Too dangerous to release" is a great way to manufacture buzz around a model launch. But in this case, he thinks the capabilities are real and the caution is warranted. I tend to agree.

Sources: NYT, Simon Willison, Anthropic Red Team Blog

The Arms Race Shifts

If Mythos was the warning shot, the rest of April was the response.

Meta debuted Muse Spark on April 8th, the first model out of their rebranded Superintelligence Labs. Zuckerberg reportedly grew frustrated with how far Llama had fallen behind, and this is the result: a ground-up rebuild that puts Meta back in the frontier conversation. Though when you look at the benchmarks, "frontier" increasingly means "within a few percentage points of the leader." The gap is compressing fast.

OpenAI is reportedly preparing "Spud," what many are calling GPT-6, for imminent release. Polymarket gives it a 78% chance of dropping by April 30th. Sam Altman told employees it's "a very strong model" that could "really accelerate the economy." Whether that's genuine product vision or competitive positioning against Mythos depends on what actually ships and when.

And then there's Google, doing something entirely different. TurboQuant, presented at ICLR 2026, is a compression algorithm that claims 5-6x KV cache reduction with no quality loss. Not a smarter model. A smarter way to run the models we already have. If the benchmarks hold, it dramatically lowers the hardware barrier for running frontier-class models locally. Memory market stocks actually dipped on the news. When an algorithm makes memory chips less valuable, you know something shifted.
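TurboQuant's actual algorithm isn't described here, but the general principle behind KV cache compression can be sketched with simple per-channel quantization: store keys and values in int8 with a small per-channel scale instead of full-precision floats. This toy version gets roughly 4x (against the claimed 5-6x); all names and shapes below are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch only: per-channel symmetric int8 quantization of
# a KV tensor. TurboQuant's real method is not public in this article.

def quantize_kv(kv: np.ndarray):
    """Quantize [tokens, head_dim] KV tensor to int8 + float16 scales."""
    scale = np.abs(kv).max(axis=0) / 127.0      # one scale per channel
    scale = np.where(scale == 0, 1.0, scale)    # avoid divide-by-zero
    q = np.clip(np.round(kv / scale), -127, 127).astype(np.int8)
    return q, scale.astype(np.float16)

def dequantize_kv(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return q.astype(np.float32) * scale.astype(np.float32)

kv = np.random.randn(1024, 128).astype(np.float32)   # toy KV cache slice
q, scale = quantize_kv(kv)

compressed = q.nbytes + scale.nbytes
ratio = kv.nbytes / compressed
err = np.abs(dequantize_kv(q, scale) - kv).max()
print(f"compression: {ratio:.1f}x, max abs error: {err:.4f}")
```

The economics in the paragraph above follow directly: if the KV cache is the dominant memory cost at inference time, shrinking it by a constant factor shrinks the hardware bill by roughly the same factor.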

Sources: TechCrunch (Muse Spark), CNBC (Meta), LumiChats (Spud), TechCrunch (TurboQuant)

Conway Steps Into the Light

Anthropic's Conway is no longer a leak. On April 1st, they began testing it openly.

If you've been following the speculation, Conway is an always-on agent environment. Not a chatbot. An agent that runs continuously, completes multi-step tasks with minimal human input, and operates across tools and systems without waiting for you to prompt it at every step. It shares infrastructure with Anthropic's newly launched Managed Agents, which suggests they're building toward a world where AI doesn't just respond. It acts.

What strikes me is the timing. Every automation built on "trigger, prompt, wait, trigger again," the entire Zapier/Make/n8n ecosystem of prompt babysitting, is about to look very legacy. Conway, along with OpenAI's stateful runtime and AWS's AgentCore (more on those below), represents a shift from AI that reacts to AI that orchestrates.

The five-move platform play we saw in the leaked code is still the playbook: Conway, then Channels (communications), then Cowork (UI), then Marketplace, then Partner Network. First piece just went live.

Sources: AI2Work, MarketingProfs

The Pro Tier Wars

OpenAI launched a new $100/month ChatGPT tier on April 9th. It slots between the $20 Plus and $200 Pro plans, and the main draw is simple: 5x more Codex usage compared to Plus.

This isn't a pricing experiment. It's a direct response to Anthropic, which has had a $100/month Claude tier with generous Claude Code limits for months. The pricing wars tell you exactly where the battlefield is. AI coding assistants aren't a side feature anymore. They're the primary driver of subscription revenue and retention. Claude Code has been the runaway hit. OpenAI's Codex is catching up. The gap between $20 and $200 was too wide, and too many developers were choosing Anthropic's middle tier. Now OpenAI has one too.

Meanwhile, Anthropic has tightened its own subscription terms, restricting Claude usage through third-party harnesses like OpenClaw. The "all-you-can-eat" buffet economics were breaking down when power users consumed more than their monthly plan cost in tokens. OpenClaw's creator, Peter Steinberger, was notably hired by OpenAI in February to lead their personal agent strategy, and he's been publicly critical of Anthropic's restrictions. The personnel moves are as revealing as the pricing ones.

Sources: CNBC, TechCrunch, VentureBeat

The Code Overload

The New York Times ran a piece on April 6th that crystallized something a lot of us have been feeling but hadn't articulated yet: AI-assisted coding is producing so much code so fast that it's becoming unmanageable. Tech workers are generating more code than their teams can review, test, or maintain. The velocity is real. The technical debt is realer.

Here's what makes this uncomfortable. More code doesn't mean better software. It means more surface area for bugs, more dependencies to track, more architectural decisions made at machine speed without the deliberation that used to be the discipline's saving grace. And when you combine that with Mythos' ability to find vulnerabilities that humans can't, you get a system where the attack surface grows faster than our ability to secure it. The tool that writes the bugs and the tool that finds the bugs are now the same class of thing.

Source: NYT

Enterprise Agents Hit the Runtime Layer

The model war gets the headlines, but the real infrastructure battle of 2026 is being fought one layer down: who owns the runtime where agents actually work.

OpenAI just announced a Stateful Runtime Environment built with AWS. What it does is give agents persistent memory across sessions and the ability to operate across a business's tools and data. Every CIO who's tried to deploy agents has hit the same wall: how do you maintain context when an agent moves between systems? This is OpenAI's answer. Enterprise now makes up more than 40% of their revenue, and they're building toward agents as a "unified operating layer." Oracle, Uber, and State Farm are already on the platform.
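The internals of OpenAI's runtime aren't public, but the core idea of "persistent memory across sessions" can be sketched in a few lines: the agent's context lives in a durable store keyed by agent, not inside any single chat session, so it survives restarts and tool hops. Everything below (class name, agent id, event shape) is a made-up illustration.

```python
import json
import os
import tempfile

# Toy sketch of session-persistent agent memory. Purely illustrative;
# this is not OpenAI's implementation. The point: context outlives the
# session because it lives in a durable store, not in the chat loop.

class DurableMemory:
    def __init__(self, path: str):
        self.path = path

    def load(self, agent_id: str) -> list:
        """Return the agent's full event history (empty if none yet)."""
        if not os.path.exists(self.path):
            return []
        with open(self.path) as f:
            return json.load(f).get(agent_id, [])

    def append(self, agent_id: str, event: dict) -> None:
        """Durably record one event for this agent."""
        data = {}
        if os.path.exists(self.path):
            with open(self.path) as f:
                data = json.load(f)
        data.setdefault(agent_id, []).append(event)
        with open(self.path, "w") as f:
            json.dump(data, f)

store = DurableMemory(os.path.join(tempfile.mkdtemp(), "agent_mem.json"))
store.append("billing-agent", {"tool": "crm.lookup", "result": "account 42 found"})
history = store.load("billing-agent")   # visible to the next session
```

In a real deployment the store would be a database with permissions and audit trails, which is exactly the "messy, permissioned" enterprise work the rest of this section describes.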

Not to be outdone, AWS launched Bedrock AgentCore, a serverless runtime for deploying and scaling AI agents using any framework (CrewAI, LangGraph, OpenAI Agents SDK, Strands), any protocol (MCP or A2A), and any model. This is AWS doing what AWS does: becoming the substrate. You pick the agent framework, you pick the model, they run it for you.

Behind both of these is MCP becoming universal. Every major AI provider now supports the Model Context Protocol. The ecosystem is splitting into two layers: MCP handles agent-to-tool communication (the agent calls a tool, the tool responds), while A2A handles agent-to-agent delegation (one agent delegates to another agent with its own reasoning). Different security models, different trust boundaries, but both are now standard enough to actually build on.
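The MCP half of that split is concrete enough to show on the wire: MCP frames agent-to-tool traffic as JSON-RPC 2.0, with the agent sending a `tools/call` request and the server answering with a content result. The tool name and arguments below are invented for illustration.

```python
import json

# Minimal sketch of an MCP tools/call round trip (JSON-RPC 2.0 framing).
# "search_tickets" and its arguments are hypothetical examples.

def mcp_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Build an MCP tools/call request (agent -> tool server)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

def mcp_tool_result(request_id: int, text: str) -> str:
    """Build the matching result (tool server -> agent)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "result": {"content": [{"type": "text", "text": text}]},
    })

req = mcp_tool_call(1, "search_tickets", {"query": "open P1 incidents"})
res = mcp_tool_result(1, "3 open P1 incidents found")
print(req)
print(res)
```

A2A sits a layer up: instead of a tool returning data, the callee is another agent with its own reasoning loop, which is why the trust boundary and security model differ.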

The numbers tell the real story. 88% of enterprises have adopted AI. Only about a third have scaled deployment. That gap is where the $14 billion integration consulting market lives. Every major vendor is now building formal partnerships with McKinsey, BCG, Accenture, and Capgemini, not because the model is hard, but because wiring agents into legacy systems is where the actual work happens. Gartner projects 40% of enterprise apps will embed task-specific agents by year-end. The winners won't be the companies with the best model. They'll be the ones who figure out how to make agents actually work inside the messy, permissioned, compliance-bound reality of enterprise IT.

Sources: OpenAI Enterprise, AWS AgentCore, Kai Waehner (Enterprise Landscape), AIB Magazine

The Bigger Picture

April 2026 is defined by a tension that's been building for months. The capabilities are real now, and the consequences are starting to arrive. Mythos can find zero-days that humans miss. Conway can run tasks autonomously. TurboQuant can compress the memory wall. The model horse race has more runners than ever, and the gap between first and fifth is measured in single-digit percentages.

But the real stories aren't on the benchmarks. They're in the code overload, in Greg Kroah-Hartman's shift from dismissing AI reports to spending his day on them, in OpenAI and AWS building competing agent runtimes because the model layer has been commoditized faster than anyone expected. The $14B integration market exists because the hard problem isn't intelligence. It's deployment. The organizations winning aren't the ones with the best model. They're the ones who figured out how to make agents stick inside the messy reality of how work actually gets done.

The infrastructure is being built. The agents are going autonomous. The war moved from the model to the runtime. Watch that layer.

This is what Scout is built for

If the runtime is the new battleground, Scout is the agentic operating system. An enterprise doesn't need another model. It needs a computer that can manage and run agents safely inside its walls, with the right permissions, the right memory, and the right guardrails. Scout gives organizations a place to deploy agents that actually know their business, remember what happened, and operate within the boundaries that matter. The infrastructure layer between "we hired a model" and "the agents are doing real work." That's the gap. That's where Scout lives.


Scout Academy — Published by [hyper.io](https://hyper.io)

