2026-01-06 Models improving models

I can no longer understand those online who, with high confidence, scoff at the idea of recursive self-improvement of AI models.

Perhaps they are conflating any “self-improvement” with the “FOOM” scenario of a super-fast intelligence takeoff on the order of days or even hours. FOOM does not seem very likely at the moment. But we know that:

  1. Ten years ago (2016), machine learning was all image-recognition-everything. People had only just figured out how to GPU-ize their training code. Natural language models were toys, things like bigram Markov models (see the sketch after this list).
  2. Five years ago, LLMs had become popular among researchers, but were still little chatbots. Tinkerers and researchers would try to make new versions and delight when a coherent sentence popped out.
  3. Two years ago, we had models that could “generate plausible business ideas” (plausible BS) and “make me a sonnet about X”.
  4. One year ago, with “reasoning”, models could give feedback on technical writing that was more useful than many professors'.
  5. Now, people are coming down with Claude Code Mania: they realize that Claude can do basically any task they want done, as long as it is all virtual computer work.
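
For a sense of how toy-like the models in item 1 were, here is a minimal sketch of a bigram Markov text generator. The corpus and function names are my own illustration, not from any particular system:

```python
import random
from collections import defaultdict

def train_bigrams(text: str) -> dict:
    """Map each word to the list of words observed to follow it."""
    words = text.split()
    successors = defaultdict(list)
    for prev, nxt in zip(words, words[1:]):
        successors[prev].append(nxt)
    return successors

def generate(successors: dict, start: str, length: int = 20) -> str:
    """Walk the chain: repeatedly sample a random observed successor."""
    word = start
    out = [word]
    for _ in range(length - 1):
        choices = successors.get(word)
        if not choices:  # dead end: this word was never followed by anything
            break
        word = random.choice(choices)
        out.append(word)
    return " ".join(out)

corpus = "the cat sat on the mat and the dog sat on the log"
model = train_bigrams(corpus)
print(generate(model, start="the"))
```

Even trained on a large corpus, a model like this rarely stays coherent past a few words, since each word is chosen by looking only one word back. That is the baseline the rest of this list is measured against.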

What happened during this progression? There were some physical changes (enormous data centers had to be built), but most of the difficult work, the work that only a few highly trained and specialized humans could perform, happened in the digital world: understanding complex math, writing software to test algorithmic ideas, communicating new ideas and results to others.

These are all things that frontier models can do. Anthropic notes in the Opus 4.6 system card that many of the research tasks involved in improving Opus 4.5 to 4.6 were done by Claude. I suppose you could think everything coming out of an AI lab is a lie meant to hype marketing, but the much simpler explanation, the one that matches my own experience, is that the self-improvement claims are no longer future hype; they are happening right now.