January 26, 2026

The Lingering Weakness

A year ago I made it a goal to document and share my experience using AI. I kept hearing how the world was on the cusp of a massive change, and I wanted to try all the tech and build some opinions based on real experiments and use cases. It's hard to believe that AI-assisted coding tools like Claude Code didn't exist a year ago and Cursor was still primarily awesome AI-driven tab completion. That's a testament to how quickly this is moving and reshaping the lives of software teams. It's also true that the storm hasn't hit the mass market yet. The big game-changing app has yet to appear and upend the lives of many. For those who build products and code, the storm has definitely reached shore.

On February 2nd, 2025, Andrej Karpathy coined the term "vibe coding," and in a way, AI coding was born. Around this time I wrote my first entry about how empowering AI coding was, and how I was able to build again for the first time in years after letting my coding skills atrophy. I also shared how disappointing it was when AI just forgot what it was doing, which was my first encounter with the context window. Claude 3.5 Sonnet was the coding model of choice with no real viable alternatives in sight. It was exciting but generally calm. There was no imminent concern of AI doing all the coding work; it was just helpful.

The Lull, Then the Leap

By late summer and early fall of 2025, news of an AI bubble was everywhere. AI wasn't delivering for big business and seemed on the verge of collapse. It was a lull after a wild start to the year, when the release of DeepSeek shook things up. People and the media questioned whether we'd hit a plateau, and whether this was as good as it was going to get. OpenAI released GPT-5 in August [2] and it failed to change the world as Sam Altman had promised [8]. People expected a step change, but GPT-5 didn't feel like much of a leap. The bubble talk grew louder.

This changed in November when Google released Gemini 3 [3]. People saw a real step change this time. Claude Sonnet finally had competition from a coding perspective, and the Nano Banana Pro image generation lit up socials with images so scary-accurate you couldn't tell they were AI. Gemini 3 woke everyone up. Google had moved into the top spot and freaked its competition out. OpenAI declared its own "code red," a nice echo of Google's code red back in 2022 when ChatGPT first hit the scene. For the first time in about six months, the bubble talk started to fade.

Then Claude Opus 4.5 arrived in late November [4]—though it took a few weeks for people to really notice how much of a beast it was at coding. This release was a true inflection point. I was in the middle of building a personal finance app to replace some old spreadsheets using Claude Sonnet 4.5. I was moving through it but it was still the frustrating battle of context management and a model that was wild and difficult to steer. When I switched over to Opus 4.5, the change was immediately noticeable. Everything just started working. My methods for context management were actually effective, and my progress building the app accelerated significantly. I thought to myself, others working with Opus must have been having the same experience, but the response was muted… for the moment.

Around this same time, the no-code AI tool Lovable announced it had doubled from $100M to $200M ARR, after going from $1M to $100M in just eight months. Lovable lets anyone prompt their way to a working website, and it's a glimpse of how much value AI can unlock when it's truly accessible to everyone. Lovable puts building power in everyone's hands, and I think this is how AI becomes the economic juggernaut we've been promised: by expanding who can ship to anyone with an idea. When you visit their website a simple prompt says "Let's build something" with a blinking cursor in the prompt box. Endless possibilities are just a few keywords away.

Pandemonium

Through December more people started using Opus, and Anthropic incentivized everyone to code by doubling its usage limits over the holidays so people could feel the vibes during their time off. With a little more time to burn, the community dug into Claude Code with Opus 4.5 and the excitement blew up. So many people were building amazing things and feeling the impact of a coding model that was an astonishingly large step change, dare I say a game changer. In a lot of cases Opus 4.5 just works, which is the unofficial litmus test for something great. Back in March 2025, Dario Amodei had predicted that by mid-to-late 2025, 90% of code would be AI-written [1]. Most people thought he was over-promising. This time, though, he seemed to hit the nail on the head, off by only a month or two.

By early 2026, Claude Code was the belle of the ball, with Cursor in the conversation, finding a path between a new agent interface and a visual editor to perhaps compete with Lovable. OpenAI wasn't out of the game by any means. They had released GPT-5.2 and it was a strong coder when paired with Codex, OpenAI's answer to Claude Code [5]. Google also joined the party with Antigravity, a coding tool from the founders of Windsurf [6][7]. Similar to a year ago when DeepSeek hit, we've entered another period of immense acceleration in this wild AI ride. Even some of the most revered programmers, like Linus Torvalds, were on board.

At the end of 2025, Boris Cherny, Claude Code's creator, said that 100% of his code was now written by Opus. A single engineer doesn't mean the world has stopped writing code, but it at least shows a trend in that direction. In that light, Dario's prediction was on point.

January Hysteria

Here we are at the end of January 2026 and the buzz is flying high, the bar pushed further and further up. The industry seems to have accepted that AI coding isn't going to fade away—it's here to stay.

Ralph loops came first. Back in November, a developer named Geoffrey Huntley created a simple loop that called AI over and over to solve his problem until certain acceptance criteria were met. Named after the Simpsons character Ralph Wiggum, "Ralph loops" took the internet by storm in early January. My feeds were almost all Ralph this, Ralph that, "if you're not using Ralph you're losing" type posts. Ralph was a great demonstration of agents validating and improving their own work, albeit not really useful for any real production app. Can you really run "dangerously accept permissions" on code outside of startups and hobby sites? I wouldn't. I think Ralph was a way to see how agents could run for long periods, doing large chunks of work and actually producing something of value. I'd wager that Ralph loops were enabled by Opus 4.5. If you ran these on older models, I wonder if the output would look like those nightmarish images you get when you feed a picture through ChatGPT 100 times.
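For the curious, the core idea fits in a few lines. This is a minimal sketch of the pattern as I understand it, not Huntley's actual script: `ralph_loop`, `run_agent`, and `criteria_met` are names I made up for illustration, the agent is a toy stand-in so the code runs without any AI service, and the iteration cap is my own assumption so the loop can't spin forever.

```python
from typing import Callable, Tuple


def ralph_loop(
    prompt: str,
    run_agent: Callable[[str, str], str],
    criteria_met: Callable[[str], bool],
    max_iterations: int = 10,
) -> Tuple[str, int, bool]:
    """Call the agent with the same prompt until the acceptance check passes.

    Returns (final_work, iterations_used, success).
    """
    work = ""  # stands in for the current state of the project
    for iteration in range(1, max_iterations + 1):
        work = run_agent(prompt, work)  # one agent call per pass
        if criteria_met(work):          # e.g. tests pass, build is green
            return work, iteration, True
    # Budget exhausted: stop and report failure instead of looping forever.
    return work, max_iterations, False


# Hypothetical stand-ins: a real loop would call a coding agent here.
def fake_agent(prompt: str, work: str) -> str:
    # Pretend each call completes one more step of the task.
    return work + "step;"


def three_steps_done(work: str) -> bool:
    return work.count("step;") >= 3


result, used, ok = ralph_loop("build the thing", fake_agent, three_steps_done)
print(result, used, ok)
```

In practice the acceptance check is the interesting part. The loop is only as trustworthy as whatever decides the work is done.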

Cowork was the next thing to swing into focus, stealing the spotlight from Ralph. Built by the same team as Claude Code, it aims to bring that agent orchestration to non-technical users. The claim is that it was created from nothing in 10 days, with Opus 4.5 doing all of the coding. The truly astonishing fact to me is the zero-to-product-in-10-days part. Our assumptions about development timelines have been completely overturned. This reduction in the effort and time it takes to go to market with an idea is powerful stuff. Imagine all those ideas that never saw the light of day. This is what I find most exciting: the potential for a small team to realize a vision and see if they can make it fly or not. If you test an idea and it doesn't work, at least you can say you tried instead of wondering what might have been.

Clawdbot/Molt exploded this week and brought the frenzy to near foaming-at-the-mouth levels. It's an open source tool that runs on a local machine (Mac Minis are selling out everywhere) and connects Claude to all your services, such as Gmail, Obsidian, and Calendar, while building a running memory of who you are. If you thought "dangerously accept permissions" was peak YOLO, Clawdbot is next-level, Red Bull and Mountain Dew-drinking YOLO. You're giving an untested early AI system access to all your personal data across numerous services and letting it "run while you sleep." Exciting, but also… what? I guess I should thank early adopters for crashing through all this and stress-testing it for the rest of us. They make it better for everyone to try later. Godspeed, my YOLO friends, may your data not be pillaged (or deleted or both).

Staying Grounded

I'm genuinely excited to try Clawdbot, but I'm not ready to unlock my life to it. As much as I'm a fan of Opus 4.5 and Claude Code, I still experience it making mistakes and silly errors, and I feel the "dumb zone" creeping in as I prompt away. I worry about what happens when the dumb zone hits Clawdbot while it's working on critical files or sending an email I'll probably wish it hadn't. It's an awesome idea, but it all hinges on one thing: context management.

A year later, after countless hours of pulling my hair out, the context window has to be this technology's Achilles heel. In my very first post I wrote about it, and a year later I'm still writing about it. It's the weakness that keeps this technology out of reach for most people. Highly technical users understand how to manage context well; most others don't have the time or will to figure it out. It doesn't "just work" like great tech that takes over the world needs to.

Looking ahead, in January 2027, will I be writing about how terrible the context window limitations are on Claude Opus #.x? I hope not. I hope Cowork has taken over as our primary way of using a computer. I hope we have desktop devices that run agents like Clawdbot, but securely and privately. Even better would be a locally running version of Claude that hacks away while I sleep on said desktop box. I can dream. If my phone connects to this box and I have a true second brain that I trust with my data—one that isn't being pillaged by big tech—then 2027 is a good place to be. I hope we're not wearing dystopian pin wearables or glasses. I don't need another device to carry around harvesting data to display the perfectly placed ad.

These are my captain's logs for January 2026. This is my view of a world rapidly changing in so many ways. I look forward to reading this in a year and seeing how far off I was. In my last post I called 2026 the year of the AI harness, and I stand by it. Context management needs to be abstracted away into the shadows. If we can build a layer around this limitation and make the tech into something workable and easy to use, something that "just works" for as many people as possible, then I'm feeling pretty optimistic about where this goes for "idea holders" to become builders.

---

References

[1] Anthropic CEO: AI will write 90% of code

[2] Introducing GPT-5

[3] Gemini 3

[4] Claude Opus 4.5

[5] Introducing GPT-5.2 and Codex

[6] Windsurf's CEO goes to Google

[7] Varun Mohan on X

[8] GPT-5: Overdue, overhyped and underwhelming