I hear the soft clacking of the mechanical keys as I type, shortly
after the update finishes. I type /model and select Opus
4.5, Anthropic’s then-newest AI coding model. It’s mid-December 2025 and
I’m getting ready for the Christmas season, wrapping up an assignment at
work.
I’d been using Claude Code to augment my software engineering work for a few months, but I had learned its strengths and weaknesses and come to see it as a limited but helpful code generator, one I had to micromanage for it to be useful. Something was different that evening, though. The model got smarter.
The model could generate entire applications from my prompts. I had a TheONE Smart Piano, and its iPad app for teaching piano was terrible; every interesting song was an add-on purchase. Out of curiosity, I went to see if Opus 4.5 could build the exact app I had in mind. I simply wrote down what I wanted, and in minutes it had a working demo that integrated with the MIDI controller and could interact with my piano.
The model kept performing in a way that genuinely shook me. It was far better than I had any reason to believe was possible. Around that time, 𝕏.com (the website formerly known as Twitter) was absolutely on fire about this. The METR chart dropped six days before Christmas and put into numbers what a lot of us had experienced qualitatively. Plotted on a log scale, the same chart is a straight line, which is another way of saying the growth is exponential.
Everyone else on the timeline had seen and felt this unreasonably effective AI model and was sharing their shock and euphoria about it. It was electric. It was a wild time. I still remember those days.
Fifty subjective years later, in the month of March, in the year of Our Lord Two Thousand and Twenty Six, so much has changed. And yet I get the impression that the wider world, outside of tech, hasn’t really felt the g-forces that we software engineers who use these models underwent last December.
Since then, OpenAI has released GPT-5.3-Codex, which, according to its release notes, was the first model that was “instrumental in creating itself”. Here’s the quote from their February 5th release notes:
GPT-5.3-Codex is our first model that was instrumental in creating itself. The Codex team used early versions to debug its own training, manage its own deployment, and diagnose test results and evaluations.
That same day, February 5, 2026, Anthropic released Opus 4.6, which, only a few months after Opus 4.5, quintupled the “context window”, the short-term memory of the AI model. The context window is one of the fundamental constraints on what you can do with a model. Now that it has been raised so dramatically, we haven’t really seen what the limits of this model are yet.
As I write this in the present, March 8, 2026, GPT-5.4 is out, and it has already solved a superhuman challenge that no previous model could: in less than 15 minutes, it reverse-engineered a GPT-2 model and wrote a small 5-kilobyte program that implements inference (the process that makes the model work like a chatbot). The newer models already have a superhuman understanding of the older ones, and they are already being used to create newer, better models.
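(If you’re wondering what “inference” actually is: it is nothing more than a loop that predicts the next token over and over until the text is done. Here’s a toy sketch of that loop; to be clear, this is not GPT-2 and not the 5-kilobyte program I just described. The little lookup table here stands in for the trained weights, which is where all the real work happens.)

```python
# Toy sketch of an inference loop. FAKE_MODEL is a stand-in for trained
# weights; a real GPT-2 computes next-token probabilities with matrix math
# over roughly 124 million learned parameters instead of a lookup table.
import random

FAKE_MODEL = {
    "the":   ["piano", "model", "keys"],
    "piano": ["plays", "waits"],
    "model": ["answers", "improves"],
    "keys":  ["clack"],
}

def predict_next(tokens):
    """Guess a plausible next word from the last word seen."""
    return random.choice(FAKE_MODEL.get(tokens[-1], ["<end>"]))

def generate(prompt, max_tokens=5):
    """Inference: append one predicted token at a time until done."""
    tokens = list(prompt)
    for _ in range(max_tokens):
        nxt = predict_next(tokens)
        if nxt == "<end>":
            break
        tokens.append(nxt)
    return " ".join(tokens)

print(generate(["the"]))
```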
I could go on, but it’s time to shift gears.
This unprecedented explosion in AI model capability has already had an impact on geopolitics. In February, WSJ reported that Anthropic’s AI model was involved in Operation Absolute Resolve, where the United States 1st Special Forces Operational Detachment-Delta captured Venezuela’s president to bring him to justice in New York.
On February 26, 2026, Dario Amodei, CEO and Co-Founder of Anthropic, issued a statement in which he laid out two red lines he did not want the U.S. Department of War to cross when using Anthropic’s models: mass domestic surveillance and fully autonomous weapons. The following day, the Secretary of War designated Anthropic’s models as a Supply-Chain Risk. As of March 4, 2026, that designation is still in place, but Anthropic remains in talks with the Department.
The fact that the software update we got back in December has already had a measurable impact on geopolitics is something the world hasn’t fully digested yet.
What did I mean when I said “it’s time to shift gears”? I meant that I should spend some of my time communicating to people outside of tech that these capabilities are now part of the world, and that they are already reshaping capital flows and geopolitical power balances.
I can’t give you a clear vision of what 2030 will be like, but I do have some advice for anyone who is concerned about what this means for their job. White-collar work is going to be disrupted before blue-collar work, but if the models can reverse-engineer earlier versions of themselves and recursively self-improve, they will solve humanoid robot control as soon as the hardware is built and deployed at scale.
My advice: If you are the adventurous type, install Claude Code or OpenAI’s Codex and ask it to build an app that solves some problem you have. Becoming personally familiar with the capabilities is one of the best ways to limit the downsides of disruption. The best case is that you actively thrive by making good use of them for your own work.
If you would rather have guidance, book some time on my calendar and I can walk you through it or give you a demo, because sometimes you need to see it to believe it.
Or hear it, like the sound of the keys clacking right before you hit enter and your mind is blown.