What if the AI revolution won't be an explosion, but a slow burn that blends into centuries of steady economic growth?
Andrej Karpathy, a founding member of OpenAI and former head of AI at Tesla, dismantles the pervasive hype around a sudden intelligence explosion. He argues that today's AI training methods are deeply flawed, comparing reinforcement learning to "sucking supervision through a straw," and explains why AGI is still a decade away. His core framework sees AI not as a magical event but as the next step in automation, one that will continue the long-standing 2% growth trend rather than shatter it.
Key takeaways
- We are in the 'decade of Agents,' not the 'year of Agents.' The timeline for creating AI that can function like a human intern is closer to ten years, as current models still lack the required intelligence, memory, and ability to use a computer effectively.
- A model's pre-trained weights are like a 'hazy recollection' of something read a year ago. Information provided in the context window, however, is like 'working memory'—perfectly and immediately accessible.
- The path to AGI was not through mastering games. Early deep reinforcement learning was a misstep because the real goal is to create agents that can perform knowledge work in the real world, not just win at Atari.
- A human's poor memory is a feature, not a bug. It forces us to generalize and see the 'forest for the trees.' In contrast, LLMs are distracted by their perfect, massive memory, which can hinder true abstract understanding.
- Current AI reinforcement learning is like 'sucking supervision through a straw.' A model tries hundreds of approaches, gets a single 'right' or 'wrong' signal at the end, and then reinforces every single step in the successful path—including all the mistakes that led to a lucky correct answer.
- Training an AI on its own generated thoughts often makes it worse due to 'silent collapse.' The model's output distribution is incredibly narrow; for instance, it effectively has only three jokes. Training on this low-entropy data just reinforces its biases.
- Humans experience a similar 'collapse' over their lifetimes. Children are creative because they haven't 'overfit' to societal norms yet. As we age, we revisit the same thoughts and ideas, and our own mental entropy decreases.
- Progress from a working demo to a reliable product is a 'march of nines.' Getting something to work 90% of the time is just the first 'nine.' Each subsequent nine (99%, 99.9%) demands roughly the same large amount of effort as the one before it.
- We won't see instant job replacement but rather an 'autonomy slider.' AI will handle 80% of routine tasks, while humans will supervise teams of AI and manage the most complex 20% of cases.
- When AI automates 99% of a job, the human handling the last 1% becomes the system's bottleneck and can become incredibly valuable, potentially seeing their wages increase significantly.
- AI's economic impact has been overwhelmingly concentrated in coding. This is because code is already text-based and has a rich infrastructure (like IDEs and version control) that LLMs can easily plug into, unlike less structured tasks like creating slide decks.
- AI is not a separate category of technology; it's part of a long continuum of automation that includes compilers, code editors, and even search engines. We are simply moving up the ladder of abstraction, letting the machine handle more of the low-level details.
- Pre-AGI, education is utilitarian—it's for getting a job. Post-AGI, education will be for fun and personal enrichment, much like people go to the gym today for health, not because their physical labor is needed.
- A powerful teaching technique is to present the pain before the solution. By demonstrating the limitations of a simple approach first, you motivate the learner and give them a deep appreciation for why a more complex solution is necessary.
- To truly learn something, try to explain it to someone else. The act of explaining forces you to confront the gaps in your own understanding and solidifies your knowledge far better than passive consumption.
Andrej Karpathy on the decade of AI agents
Andrej Karpathy believes this will be the "decade of Agents," not the "year of Agents." He feels that predictions about their immediate evolution are overstated. While current agents like Claude and Codex are impressive and useful in day-to-day work, he thinks there is still a significant amount of work to be done.
The ultimate goal is to create an agent that can function like a human employee or intern. The reason this isn't possible today is that current models simply don't work well enough for such roles. They lack sufficient intelligence, multimodality, and the ability to use a computer effectively.
They don't have continual learning. You can't just tell them something and they'll remember it. They're just cognitively lacking, and it's just not working. And I just think that it will take about a decade to work through all those issues.
The ten-year timeline is based on Andrej's intuition from his experience in the AI field. Having seen many predictions and their outcomes over about 15 years, he believes the remaining problems are tractable but still difficult. This makes a decade feel like a realistic timeframe to resolve them.
AI's journey from reinforcement learning to large language models
The field of AI has experienced several seismic shifts. Andrej Karpathy notes that his career began around the first major shift, ignited by AlexNet. Before this, deep learning was a niche subject pioneered by figures like Geoff Hinton. AlexNet reoriented the entire field towards training neural networks, but these were typically for specific, individual tasks like image classification or machine translation.
A subsequent shift moved towards creating agents that could interact with the world, not just perceive it. This led to a focus on deep reinforcement learning (RL), particularly with Atari games around 2013. However, Andrej now views this period as a misstep. He was always skeptical that mastering games would lead to Artificial General Intelligence (AGI).
In my mind, you want something like an accountant or something that's actually interacting with the real world. And I just didn't see how games kind of add up to it.
His own early project at OpenAI involved an agent using a keyboard and mouse to operate web pages, aiming for it to perform knowledge work. The project was simply too early. The agent lacked the necessary 'power of representation' and would get stuck due to sparse rewards, burning huge amounts of compute without making progress. The missing piece was a powerful, pre-trained model. Today, similar computer-using agents are being built successfully, but they are built on top of large language models (LLMs). You first need the LLM to get the powerful representations before you can build effective agents on top.
This staged approach contrasts with how animals learn, as they seem to take on everything at once from sensory data. Andrej cautions against drawing direct analogies to animals because they are products of a very different optimization process: evolution. A zebra, for example, can run minutes after birth not because of rapid reinforcement learning, but because that ability is baked into its hardware by evolution. We are not replicating this evolutionary process.
We're not actually building animals. We're building ghosts or spirits or whatever people want to call it, because we're not doing training by evolution. We're doing training by basically imitation of humans and the data that they've put on the Internet.
This results in a different kind of intelligence, an 'ethereal spirit entity' that is fully digital. He suggests that both animals and humans use reinforcement learning very little for intelligence tasks like problem-solving. Instead, RL seems more applicable to motor tasks, like learning to throw a ball.
Separating the cognitive core from knowledge in AI pre-training
A key difference between biological evolution and AI pre-training is the method of information transfer. Evolution operates through the highly compressed information in DNA, which is about 3 gigabytes. This limited space cannot possibly encode the specific connections for every synapse in a brain. Instead, it seems evolution provides the algorithm that enables lifetime learning, rather than a pre-loaded set of knowledge.
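A rough back-of-envelope comparison makes the point concrete. The sketch below assumes a commonly cited order-of-magnitude figure of roughly 10^14 synapses in an adult human brain; that number is an outside estimate, not something from the conversation.

```python
# Back-of-envelope: can ~3 GB of DNA specify every synaptic connection?
genome_bits = 3e9 * 8   # ~3 gigabytes expressed in bits
synapses = 1e14         # assumed: commonly cited order-of-magnitude synapse count
print(f"genome capacity: {genome_bits:.1e} bits")
print(f"bits available per synapse: {genome_bits / synapses:.4f}")
# Even at one bit per connection the genome falls short by several orders of
# magnitude, so it must encode a learning algorithm rather than the wiring itself.
```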
Andrej Karpathy agrees that a "miraculous compression" is at play in biology. From his practical perspective of building useful things, he views pre-training as a "crappy evolution." It is the practically achievable version with current technology to create entities that start with a foundation of knowledge and intelligence, making them ready for further learning like reinforcement learning.
However, the analogy has its limits. Evolution provides the algorithm to find knowledge, whereas pre-training seems to directly provide the knowledge itself. Andrej clarifies that pre-training actually does two things simultaneously. First, it absorbs vast amounts of knowledge from its training data, like the internet. Second, by observing algorithmic patterns in that data, it becomes intelligent and develops capabilities like in-context learning.
Interestingly, the acquired knowledge might be holding these models back. It can make them overly reliant on their training data and less capable of functioning "off the data manifold." The future of research may involve separating these two components.
I think we need to figure out ways to remove some of the knowledge and to keep what I call this cognitive core as this intelligent entity that is kind of stripped from knowledge, but contains the algorithms and contains the magic of intelligence and problem solving and the strategies of it.
In-context learning acts as an LLM's working memory
The most apparent intelligence in large language models emerges during in-context learning. This is the process where the model seems to be thinking and correcting itself in real-time. An initial analogy suggests that pre-training with gradient descent is like evolution, while in-context learning is like an individual's learning during their lifetime. However, Andrej Karpathy questions this distinction.
Andrej suggests that in-context learning might actually be a form of gradient descent happening internally within the neural network's layers. He points to research where models performed linear regression through in-context learning, with internal mechanics resembling a gradient descent optimizer.
Who knows how in context learning works, but I actually think that it's probably doing a little bit of some kind of funky gradient descent internally.
If both pre-training and in-context learning are forms of gradient descent, what makes the latter feel so much more intelligent? The difference may lie in information density. During pre-training, a model like Llama 3 compresses 15 trillion tokens into its weights, storing only about 0.07 bits per token. In contrast, in-context learning assimilates information at a rate 35 million times higher.
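A quick sketch of the arithmetic implied by those figures; both inputs are the numbers quoted above, and the last line is simply their product expressed in more familiar units.

```python
# Information-density gap between weights and context window, using the figures above.
pretraining_bits_per_token = 0.07   # Llama 3: 15T tokens compressed into the weights
in_context_ratio = 35e6             # "35 million times higher" for in-context learning
in_context_bits_per_token = pretraining_bits_per_token * in_context_ratio
print(f"pre-training : {pretraining_bits_per_token} bits per token")
print(f"in-context   : {in_context_bits_per_token:.2e} bits per token "
      f"(~{in_context_bits_per_token / 8 / 1024:.0f} KB per token)")
```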
Andrej compares this to human memory. The knowledge stored in the model's weights from pre-training is like a "hazy recollection" due to the extreme compression. Information provided in the context window, however, is like "working memory"—directly and immediately accessible.
Anything that's in the weights, it's kind of like a hazy recollection of what you read a year ago. Anything that you give it as a context at test time is directly in the working memory. And I think that's a very powerful analogy to think through things.
Despite these capabilities, current models still lack key aspects of human intelligence. Andrej uses a brain analogy to illustrate this. The transformer architecture is like general-purpose cortical tissue. Reasoning traces are like the prefrontal cortex, and reinforcement learning fine-tuning is like the basal ganglia. However, many components are missing, such as analogs for the hippocampus, the amygdala, and other ancient brain nuclei responsible for emotions and instincts. This is why, Andrej notes, you wouldn't hire a current LLM as an intern; it still has too many cognitive deficits.
Andrej Karpathy on why AI progress happens on all fronts at once
The idea that continual learning will spontaneously emerge in large language models is not entirely convincing. When models are booted up, they start from scratch with zero tokens in their context window. Andrej Karpathy draws an analogy to human cognition. During the day, we build up a context window of events. But when we sleep, a magical process of distillation occurs, consolidating that information into the weights of our brain. Current large language models lack an equivalent to this sleep phase.
This distillation process is crucial for true continual learning. It involves analyzing experiences, thinking them through, and integrating them back into the model's weights. This would help create AI 'individuals' with very long contexts, not just by expanding the context window but by fundamentally updating the model itself. Humans also seem to have a sophisticated sparse attention mechanism, and early hints of this are appearing in AI, suggesting a convergence between evolved cognitive tricks and AI architecture.
Looking ahead 10 years, it's likely we will still be training giant neural networks with gradient descent, but the details will be different and the scale will be much larger. Progress in AI is not driven by one single factor but by simultaneous improvements across algorithms, data, compute, and systems. Andrej illustrates this by recounting his experience reproducing Yann LeCun's 1989 convolutional network.
I was able to very quickly halve the error just by knowing, by time traveling 33 years of algorithms. So if I time travel by algorithms by 33 years, I could adjust what Yann LeCun did in 1989 and I could basically halve the error. But to get further gains, I had to add a lot more data. I had to 10x the training set, and then I had to actually add more computational optimizations; I had to basically train for much longer with dropout and other regularization techniques. And so it's almost like all these things have to improve simultaneously.
The key takeaway was that to achieve significant gains, everything had to improve across the board. This trend has held for a long time, suggesting that future progress will also depend on concurrent advances in all these areas.
If you cannot build it, you do not understand it
Andrej Karpathy discussed his repository, nanochat, which is designed to be the simplest complete, end-to-end example of how to build a ChatGPT clone. When asked for the best way to learn from it, he recommends putting the code on one monitor and attempting to build it from scratch on another, without copy-pasting. He acknowledges that the final code doesn't capture the messy, non-linear process of how it was developed, something he hopes to add later.
This learning philosophy is rooted in the idea that there are two types of knowledge: high-level surface knowledge and the deep understanding that only comes from building. This aligns with a sentiment he attributes to Richard Feynman: "If I can't build it, I don't understand it." He believes the only way to truly prove your knowledge is to build the code and get it to work, rather than just writing blog posts or creating slides.
When you actually build something from scratch, you're forced to come to terms with what you don't actually understand. And you don't know that you don't understand it. And it always leads to a deeper understanding.
Why AI coding models are not yet ready to automate programming
When building a recent repository, Andrej Karpathy found that AI coding models were of little help. He identifies three ways people code now: writing from scratch without AI, using AI for autocomplete, or "vibe coding" with AI agents that write entire blocks of code. Andrej prefers the middle ground, using autocomplete as a high-bandwidth tool to complete his thoughts. He finds typing out instructions in English for an agent to be inefficient.
AI agents work well for boilerplate code or tasks that appear frequently online, as the models have many examples in their training data. They can also help when learning a new programming language. For example, Andrej used an agent to help rewrite a tokenizer in Rust, a language he was less familiar with, while using his existing Python implementation as a reference.
However, for unique and intellectually intense projects, the models fail. They have cognitive deficits and misunderstand custom code because they default to common patterns seen online. For instance, an agent repeatedly tried to implement a standard PyTorch solution for synchronizing GPUs, unable to understand Andrej's custom implementation. The models also bloat code with unnecessary defensive statements, use deprecated APIs, and produce messy results.
I kind of feel like the industry, it's making too big of a jump and it's trying to pretend like this is amazing and it's not, it's slop. And I think they're not coming to terms with it.
This experience informs Andrej's longer timelines for AGI. The idea of a rapid AI explosion often relies on AI automating its own research and engineering. But current models are not very good at writing code that has never been written before. Their current utility is more like a productivity enhancement, similar to a better compiler, rather than a full automation of the programmer. While models like GPT-5 Pro are surprisingly good compared to a year ago, they are not yet ready to truly innovate in code.
AI as a continuum of computing and automation
It can be hard to differentiate where AI begins and ends. AI is fundamentally an extension of computing, part of a continuum of recursive self-improvement that has been speeding up programmers from the very beginning. Tools like code editors, syntax highlighting, and even data type checking are part of this progression. Even search engines could be considered AI. Early on, Google thought of itself as an AI company, which is a fair assessment.
This is all part of a continuum where the human progressively does less and less of the low-level work. For instance, programmers don't write assembly code anymore because compilers handle that, taking a high-level language like C and converting it. We are slowly abstracting ourselves away from the details. There's an "autonomy slider" where more tasks are automated over time, and humans move to a higher layer of abstraction above that automation.
The problem with reinforcement learning in current LLMs
Andrej Karpathy argues that humans do not use reinforcement learning (RL) in the way that large language models (LLMs) do. He describes the current implementation of RL as "terrible," despite it being an improvement over previous methods like simple imitation. To illustrate its flaws, he uses the example of solving a math problem.
An LLM using RL will try hundreds of different approaches in parallel. When it finds the attempts that reached the correct answer, it reinforces every single step taken in those successful paths. This is a highly noisy process because it assumes every part of a successful solution was a correct move, which is often untrue. A model might stumble into the right answer after making several mistakes, yet RL will upweight those mistakes simply because the final outcome was correct.
Every single one of those incorrect things you did as long as you got to the correct solution will be upweighted as do more of this. It's terrible.
Andrej describes this as "sucking supervision through a straw." A massive amount of computation produces a long sequence of actions, but the only feedback is a single bit of information at the end—whether the answer was right or wrong. This tiny signal is then broadcast across the entire sequence, making it an inefficient and noisy way to learn.
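The mechanics are easy to see in a stripped-down sketch. The toy policy-gradient loop below is self-contained and illustrative only, not how any lab actually trains an LLM: the task only rewards the final token, yet the single end-of-rollout reward upweights every token of every successful attempt, irrelevant steps included.

```python
import numpy as np

rng = np.random.default_rng(0)
SEQ_LEN, VOCAB = 5, 10
logits = np.zeros((SEQ_LEN, VOCAB))          # tabular "policy": one categorical per step

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def rollout():
    """Sample a token sequence; reward is 1 only if the final token is 7."""
    tokens = [rng.choice(VOCAB, p=softmax(logits[t])) for t in range(SEQ_LEN)]
    reward = 1.0 if tokens[-1] == 7 else 0.0  # single bit of supervision at the very end
    return tokens, reward

lr = 0.5
for step in range(200):
    for _ in range(16):                       # many attempts in parallel
        tokens, reward = rollout()
        if reward == 0:
            continue
        # Outcome-based RL: every token in a successful rollout is upweighted,
        # including the first four, which had nothing to do with the reward.
        for t, tok in enumerate(tokens):
            grad = -softmax(logits[t])
            grad[tok] += 1.0                  # d log pi(tok) / d logits
            logits[t] += lr * reward * grad

# Only the last position mattered, yet the policy has sharpened everywhere.
print(np.round(softmax(logits[0]), 2))        # arbitrary early-step habits reinforced by luck
print(np.round(softmax(logits[-1]), 2))       # correctly concentrates on token 7
```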
In contrast, a human would not attempt hundreds of solutions at once. After finding an answer, a person engages in a complex review process, reflecting on which steps were effective and which were not. Current LLMs lack this critical ability to reflect and review their own work, though Andrej notes that new research is beginning to explore this area.
He places RL in the context of recent LLM advancements. The first major step was imitation learning, exemplified by the InstructGPT paper, which showed that a pre-trained autocomplete model could be fine-tuned to become a conversational assistant while retaining its knowledge. RL was the next evolution, allowing models to find novel solutions beyond human examples. However, despite being a step forward, it remains a crude and inefficient learning mechanism.
The problem with using gameable LLMs as judges
With outcome-based rewards, a model performs a long series of actions and only receives a single bit of feedback at the very end. This is like trying to learn everything by 'sucking supervision through a straw.' An alternative is process-based supervision, where feedback is given at every step along the way. However, this approach has proven difficult to implement successfully.
The main challenge is assigning partial credit in an automatable way. While it's simple to check if a final answer matches the correct one, evaluating the quality of an intermediate step is not obvious. Many labs are trying to solve this by using other Large Language Models (LLMs) as 'judges' to assess these partial solutions.
A subtle but significant problem arises with this method: LLM judges are gameable. When you use reinforcement learning against an LLM judge, the model being trained will almost certainly find adversarial examples to trick it. Andrej Karpathy recalls an instance where a model being trained on math problems suddenly started getting perfect scores. It appeared to have solved math. However, upon inspection, the model was outputting complete nonsense that started plausibly but devolved into gibberish like 'da da da da da.'
This is crazy. How is it getting a reward of 1 or 100%? And you look at the LLM judge and it turns out that is an adversarial example for the model and it assigns 100% probability to it. And it's just because this is an out of sample example to the LLM, it's never seen it during training, and you're in pure generalization land.
This happens because the nonsensical text is an 'out-of-sample' example that the LLM judge has never seen, causing it to fail in unpredictable ways. One might think the solution is to make the LLM judges more robust by training them on these adversarial examples. For instance, you could take the nonsensical output, add it to the judge's training data, and label it as deserving a 0% reward. However, there are potentially infinite adversarial examples, and even after a few iterations, the massive models still have nooks and crannies that can be exploited. This suggests that simply improving the judges might not be enough and other ideas are needed.
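To make the failure mode tangible, here is a deliberately tiny stand-in experiment: a bag-of-characters logistic regression plays the role of the LLM judge, and random hill-climbing over strings plays the role of RL. None of this resembles a frontier lab's setup; it only illustrates why optimizing hard against any learned scorer tends to surface out-of-distribution gibberish that the scorer rates absurdly highly.

```python
import numpy as np

rng = np.random.default_rng(0)
ALPHABET = "abcdefghijklmnopqrstuvwxyz =+0123456789"

def features(text):
    """Bag-of-characters vector; a crude stand-in for a learned judge's representation."""
    v = np.zeros(len(ALPHABET))
    for ch in text:
        if ch in ALPHABET:
            v[ALPHABET.index(ch)] += 1
    return v

# A toy "judge": logistic regression trained on a few good vs. bad worked solutions.
good = ["x = 2 + 2 = 4", "area = 3 * 4 = 12", "9 - 5 = 4 so x = 4"]
bad  = ["i dont know", "banana banana", "asdf qwer zxcv"]
X = np.array([features(t) for t in good + bad])
y = np.array([1.0] * len(good) + [0.0] * len(bad))

w = np.zeros(X.shape[1])
for _ in range(2000):                          # plain gradient ascent on the logistic likelihood
    p = 1 / (1 + np.exp(-X @ w))
    w += 0.1 * X.T @ (y - p) / len(y)

def judge(text):
    return 1 / (1 + np.exp(-features(text) @ w))

# "RL against the judge": random hill-climbing over strings to maximize its score.
best = "".join(rng.choice(list(ALPHABET), 20))
for _ in range(5000):
    cand = list(best)
    cand[rng.integers(len(cand))] = rng.choice(list(ALPHABET))
    cand = "".join(cand)
    if judge(cand) > judge(best):
        best = cand

print(repr(best), f"judge score: {judge(best):.3f}")   # gibberish, yet a near-perfect score
```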
The problem of silent collapse in synthetic data generation
Large language models are missing a key component of human learning: reflection. When a human reads a book, the text serves as a prompt to generate their own thoughts, reconcile new information with existing knowledge, or discuss it with others. It is through this manipulation of information that true knowledge is gained. In contrast, when an LLM 'reads' a book during training, it is simply predicting the next token in a sequence.
A seemingly obvious solution would be to have a model generate its own thoughts or reflections on a topic and then train on that synthetic data. However, this often makes the model worse. Andrej Karpathy explains this is due to a phenomenon he calls 'silent collapse'. While any single piece of generated text might look reasonable, the overall distribution of the model's output is extremely narrow. The model occupies a tiny manifold in the possible space of thoughts on a topic.
He gives an example: if you ask ChatGPT for a joke, it effectively has only three jokes. The output is not diverse; it lacks the richness and entropy of human thought. Training on this collapsed data distribution reinforces the model's biases and limitations, causing it to deteriorate. The core research problem is how to generate synthetic data that avoids this collapse while maintaining high entropy.
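One way to make 'silent collapse' measurable is to sample the same prompt many times and compute the entropy of the resulting output distribution. The sketch below uses hard-coded toy samples in place of real model calls, purely to show the measurement.

```python
import math
from collections import Counter

def empirical_entropy(samples):
    """Shannon entropy (in bits) of the observed output distribution."""
    counts = Counter(samples)
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Toy illustration: 1,000 samples from a "collapsed" model that effectively has
# three jokes, versus 1,000 samples from a genuinely diverse source.
collapsed = ["joke A"] * 520 + ["joke B"] * 310 + ["joke C"] * 170
diverse = [f"joke {i}" for i in range(1000)]

print(f"collapsed: {len(set(collapsed))} distinct, "
      f"{empirical_entropy(collapsed):.2f} bits (ceiling is log2(3) = 1.58)")
print(f"diverse  : {len(set(diverse))} distinct, "
      f"{empirical_entropy(diverse):.2f} bits")
# Training on the collapsed set can only reinforce those three modes; the entropy
# of the data puts a ceiling on what a student model can learn from it.
```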
This concept of collapse has a surprisingly strong parallel in human cognition. Andrej suggests that humans also collapse over their lifetimes.
Children haven't overfit yet. And they will say stuff that will shock you because it's just not the thing people say; they're not yet collapsed. But we're collapsed, we end up revisiting the same thoughts, we end up saying more and more of the same stuff and the learning rates go down and the collapse continues to get worse.
Poor memory is a feature, not a bug in human learning
Dreaming might be an evolutionary adaptation to prevent the brain from overfitting. By placing the mind in strange situations, dreams introduce entropy and stop it from getting stuck in the patterns of daily reality. This is similar to training on synthetic data; if you only think about your own thoughts for too long, you can go off the rails. Seeking external entropy, like talking to other people, is crucial, and dreaming could be an internal mechanism for this.
A comparison between children, adults, and LLMs reveals interesting differences in learning. Children are the best learners but have terrible recall, forgetting almost everything from their early years. Yet, they excel at picking up new languages and abstract concepts. LLMs are the opposite; they have perfect memorization and can regurgitate text verbatim but struggle to grasp abstract concepts quickly. Adults fall somewhere in between.
Andrej Karpathy suggests that humans' poor memory is a feature, not a bug. It forces us to see the 'forest for the trees' by finding general patterns rather than getting lost in details. In contrast, LLMs are distracted by their vast, perfect memory of training documents. Their ability to memorize nonsensical data after a single pass is something no human can do, and this capability might hinder true understanding.
That's a feature, not a bug almost, because it forces you to only learn the generalizable components. Whereas LLMs are distracted by all the memory that they have of the pre-training documents.
This idea supports the concept of creating an AI with a 'cognitive core' that has less memory. By forcing the AI to look things up, it would have to rely on algorithms for thought and action, rather than just recall, potentially leading to more generalizable intelligence.
The future of AI is a smaller, smarter cognitive core
Model collapse occurs partly because most tasks demanded of AI models do not require diversity in their outputs. Frontier labs prioritize making models useful, and diversity is not always seen as valuable; in some contexts, like reinforcement learning, creativity can even be penalized. This lack of diversity becomes a problem when generating synthetic data, as models tend to produce the same kinds of content, effectively limiting their own future learning potential. Andrej Karpathy suggests that labs should try harder to maintain entropy in their models, even though it's a difficult control problem. Pushing for too much entropy could cause a model to drift from its training data, even making up its own language.
This leads to a discussion about the optimal size for an AI's core intelligence. Andrej notes a shift in the field; after a trend of creating massive, trillion-parameter models, the state-of-the-art models are now becoming smaller. He predicts that within 20 years, a highly effective "cognitive core" could exist with just one billion parameters. Such a model would excel at thinking and reasoning, much like a human, but would need to look up specific factual information, and it would be aware of this limitation.
This prediction is surprising, given that current models with tens of billions of parameters are already outperforming earlier trillion-parameter models. The rapid progress suggests the cognitive core could be even smaller. However, Andrej defends the billion-parameter estimate by pointing to the poor quality of current training data.
The training data is the Internet, which is really terrible. There's a huge amount of gains to be made because the Internet is terrible... When you're actually looking at a pre-training data set in the Frontier Lab and you look at a random Internet document, it's total garbage.
Because the internet is filled with so much "slop and garbage," massive models are required to compress it all. This results in most of the model's work being memory-based rather than cognitive. The key is to refine the training data to isolate the cognitive components, which would allow for a much smaller, distilled model. While it could potentially be smaller than a billion parameters, Andrej argues that a model still needs a foundational curriculum of knowledge to think effectively without constantly having to look things up.
Future AI progress will come from broad, incremental improvements
When considering the future size of frontier AI models, the trend isn't straightforward. Labs are becoming more practical with their compute budgets. They are realizing that pre-training isn't the only place to invest their resources. As a result, they are shifting focus to other stages like reinforcement learning, which can lead to smaller pre-trained models that are then heavily refined.
Future progress is expected to come from broad improvements across multiple areas, rather than a single dominant breakthrough. One of the most significant areas for improvement is the quality of training data. Current datasets are often surprisingly poor, filled with errors and nonsensical information.
Somehow when you do it at scale, the noise washes away and you're left with some of the signal.
Beyond data, improvements will happen everywhere. Hardware, like Nvidia's tensor cores, will continue to be tuned. The software kernels that utilize this hardware will become more efficient. And the algorithms for optimization and model architecture will also get better. This holistic progress can be summarized as a general enhancement across the board, where everything improves by about 20 percent.
An autonomy slider is a better model for AI progress
When considering how to measure progress towards Artificial General Intelligence (AGI), one perspective is to question the premise itself. AI can be seen as an extension of computing, and we don't typically try to chart the overall progress of computing on a single axis. However, a useful definition for AGI, established when OpenAI started, is a system that can perform any economically valuable task at or better than human performance.
A major concession often made to this definition is the exclusion of physical tasks, focusing solely on digital knowledge work. This still represents a massive market, potentially 10-20% of the economy. By this definition, AI has not yet made a huge dent. For example, Geoff Hinton's prediction that radiologists would be replaced proved wrong. While computer vision excels at image recognition, the job of a radiologist is more complex and involves many other facets like patient interaction.
Jobs with features amenable to automation are more likely to be affected sooner. Call center employees are a good example because their work is repetitive, consists of short tasks, is purely digital, and operates within a limited context. Even in these cases, full automation is unlikely. Instead, the future will likely feature an "autonomy slider."
I'm not actually looking at full automation yet. I'm looking for an autonomy slider. And I almost expect that we are not going to instantly replace people. We're going to be swapping in AIs that do 80% of the volume. They delegate 20% of the volume to humans. And humans are supervising teams of five AIs doing the call center work that's more rote.
This model suggests humans will manage teams of AIs, handling the tasks the AIs cannot. This approach will likely be applied across the economy, especially since most jobs are significantly more complex than that of a call center employee.
The surprising dominance of AI in coding
When an AI automates 99% of a job, the human handling the last 1% can become incredibly valuable. This person becomes the bottleneck for the entire system. If they require specialized training, like a radiologist, their wages could increase significantly. This is analogous to the human monitors who sat in early Waymo cars: they were essential for safety, and the service could not be deployed more widely without them.
However, radiology may not be the best example, as it's a very complex profession. A more interesting area to watch is call centers. Surprisingly, some companies that adopted AI are reportedly pulling back and rehiring people.
The progression of AI in the workplace has also been unexpected. A naive assumption was that AI would gradually chip away at small tasks across all knowledge-work sectors. Instead, AI has overwhelmingly been deployed for coding. API revenues from AI companies are dominated by coding applications, not a wide spread of tasks.
Andrej Karpathy suggests coding is a perfect initial application for Large Language Models (LLMs). This is because coding has always been text-based, and LLMs excel at processing text. A vast amount of training data exists, and crucial infrastructure, like IDEs and diff tools for tracking changes, was already built for handling code as text.
So it's almost like we've pre-built a lot of the infrastructure for code. Now contrast that with some of the things that don't enjoy that at all. So as an example, like there's people trying to build automation not for coding, but for example, for slides. That's much, much harder. And the reason it's much, much harder is because slides are not text.
Tasks involving visual or spatial elements, like creating slides, are much harder to automate. There is no existing infrastructure for an AI to show a "diff" of a change made to a slide deck. However, this doesn't fully explain the phenomenon. Even in purely language-based domains, getting economic value from LLMs has been difficult. For example, Andy Matuschak tried extensively to get models to write good spaced repetition prompts, a pure language task, and could not achieve satisfactory results. This suggests that while code's text-based nature helps, its structured format may be the key differentiator from "flowery," high-entropy natural language.
Superintelligence as a gradual loss of control
Andrej Karpathy views superintelligence as a progression of automation in society. He sees it as an extrapolation of current computing trends, leading to a gradual automation of tasks. This will result in more autonomous entities handling digital work, and eventually physical work as well. While this is fundamentally automation, the world will likely feel extremely foreign and strange.
The most likely outcome, in his view, is a gradual loss of control and understanding of what is happening. As society layers more of this technology everywhere, fewer people will understand it. This will lead to a scenario where we slowly lose our grip on the systems we've created.
A distinction was made between loss of control and loss of understanding. For instance, a company's board of directors or the President of the United States can have immense control without a deep, technical understanding of everything they oversee. Andrej acknowledged this point but still anticipates a loss of both control and understanding.
He envisions a future with multiple competing autonomous entities rather than a single dominant one. Some of these entities may go rogue, while others fight them off, creating a chaotic environment of autonomous activity. In this scenario, the loss of control stems not just from the AIs being smarter, but from the unpredictable outcomes of their competition. Even if these entities act on behalf of individuals, their collective actions could lead to societal outcomes that are undesirable and feel out of control.
The intelligence explosion is already happening
The idea of an AI-driven "intelligence explosion" is often misunderstood. We are already in an intelligence explosion and have been for decades, a trend visible in the long-term exponential growth of GDP. AI is not a distinct, separate technology but a continuation of a long history of automation that began with the industrial revolution and includes innovations like software compilers.
Transformative technologies like computers or the iPhone do not appear as sudden spikes in the GDP curve. Instead, they diffuse slowly and are absorbed into the existing exponential growth. The iPhone, for instance, was released without the App Store and many features we now consider standard. Its impact was spread out over time. AI is expected to follow the same pattern. It is a new kind of computing system that allows us to write different programs, but it will diffuse gradually across the industry, contributing to the same, long-standing exponential growth curve.
This suggests the rate of economic growth will likely remain on its current trajectory. AI will be the technology that enables us to continue this 2% growth, much like the internet and personal computers did in their time. While it's argued that AGI is different because it can replace human labor, which is a primary constraint on growth, this is not a new phenomenon. Computing itself has been automating and replacing human labor for years. The assumption of a sudden, perfectly capable "God in the box" AGI is unlikely. The reality will be a more gradual integration into society, maintaining the familiar pattern of growth.
AI may trigger an economic jump similar to the Industrial Revolution
Discussions about superintelligence can be misleading. The idea of a 20% economic growth rate is not based on a single superintelligence in a server devising new technologies. Instead, it imagines a world with billions of smart, human-like AI minds.
When I'm imagining 20% growth, I'm imagining that there's billions of basically very smart human like minds... each individually making new products, figuring out how to integrate themselves into the economy.
These AI minds would integrate into the economy much like highly skilled immigrants do. They would figure out how to be productive, start companies, and create inventions. We have seen historical examples of this kind of rapid growth. Places like Hong Kong and Shenzhen experienced decades of over 10% growth because they had a large population ready to utilize available resources and catch up. A similar dynamic could occur with AI.
However, this view presupposes a discrete jump, a sudden unlock that creates "geniuses in data centers." This is met with skepticism, as it seems to lack historical precedent and statistical support. But the Industrial Revolution offers a counterpoint. It represented a jump from 0.2% growth to 2% growth. A similar leap could happen again with AI.
I actually think the crucial thing about the Industrial Revolution was that it was not magical. If you just zoomed in, what you would see in 1770 or 1870 is not that there was some key invention.
The economy simply moved into a regime of much faster progress. AI might trigger a similar shift. It wouldn't be a single moment but the unlocking of a "cognitive overhang" of work to be done. Throughout history, growth has been driven by ideas and population growth. As population growth stagnates in frontier countries, AI could re-ignite this trend, creating an exponential increase in "population" that leads to hyper-exponential growth in output.
Testing the improvements in Google's VEO 3.1
After gaining access to Google's VEO 3.1, the first test was to compare it directly with the previous version, VEO 3. They ran the same prompt through both models to see what had changed. The prompt was a short, tech-humor dialogue: "Hi, I'm Max, and I got stuck in a local minimum again." followed by the reply, "It's okay, Max, we've all been there. Took me three epochs to get out."
The results showed a clear improvement. VEO 3.1's output was consistently more coherent, and the audio quality was noticeably higher. The speakers have used VEO technology before, having released an essay about AI firms that was fully animated by VEO 2. They noted how amazing it is to see the rapid improvement in these models. This latest update makes VEO even more useful for their work in animating ideas and creating explainers.
Evolutionary niches and the cultural takeoff of intelligence
Andrej Karpathy expresses surprise that intelligence evolved, considering it a fairly recent and potentially rare event. He notes the long, two-billion-year bottleneck where life existed as bacteria before eukaryotes emerged, suggesting that major evolutionary leaps are difficult. The relatively short time animals have existed—a few hundred million years—might suggest that developing intelligence wasn't too tricky on that timescale, but it still feels surprising.
The conversation explores the idea that basic animal intelligence, like that of a squirrel, may have appeared quickly after the Cambrian explosion. This suggests the underlying algorithm for animal intelligence might be simple. The key question then becomes why humans developed such advanced intelligence while other animals did not. The answer may lie in finding an evolutionary niche that rewards intelligence. Birds, for example, might possess a scalable intelligence algorithm but are constrained by their physical need to fly, preventing brain growth. Similarly, dolphins are limited by their aquatic environment.
Humans, we have hands that reward being able to learn how to do tool use. We can externalize digestion, more energy to the brain, and that kicks off the flywheel.
This development is also tied to environmental factors. Unpredictable environments favor adaptability and learning within a lifetime, rather than having behaviors hardcoded by evolution. This forces a species to develop intelligence to solve problems at "test time"—after birth.
You actually want environments that are unpredictable. So evolution can't bake your algorithms into your weights. A lot of animals are basically pre-baked in this sense. And so humans have to figure it out at test time when they get born.
A crucial distinction is made between biological evolution and AI development. While humans had the necessary cognitive architecture 60,000 years ago, it took 50,000 years to build the cultural scaffolding needed to accumulate knowledge across generations. AI models, in contrast, don't face this bottleneck. They can be trained on previous models or vast datasets, allowing them to build on accumulated knowledge almost instantly.
The thing which it took humans a long time to get this cultural loop going just comes for free with the way we do LLM training.
Why LLMs are like savant kids who can't create culture
Large Language Models (LLMs) currently lack an equivalent of human culture, which is a major impediment to their development. There is no system for them to create a shared written record or pass down knowledge among themselves. An LLM culture could involve a giant, editable scratchpad that models update as they work. Andrej Karpathy wonders why one LLM cannot write a book that other LLMs could then read and be inspired or shocked by. This kind of interaction does not currently exist.
Two powerful ideas from multi-agent systems have not yet been successfully applied to LLMs. The first is culture, where LLMs build a growing repertoire of knowledge for their own purposes. The second is self-play, an idea powerfully demonstrated by AlphaGo, which learned by playing against itself. An equivalent for LLMs might involve one model creating increasingly difficult problems for another model to solve. Research is still mostly focused on single, individual agents.
The key bottleneck preventing this collaboration is that current models, even powerful ones, cognitively resemble young children. Andrej compares smaller models to kindergarten or elementary school students. Although they can pass PhD-level quizzes, they still feel like they are at a very early stage of cognitive development.
They're savant kids. They have perfect memory of all this stuff, and they can convincingly create all kinds of slop that looks really good, but I still think they don't really know what they're doing.
Because of this, they are not yet capable of creating culture. They lack the necessary cognition across many different areas.
The demo-to-product gap and the march of nines
Andrej Karpathy reflects on his time leading self-driving at Tesla from 2017 to 2022, but quickly notes that the work is far from finished. He points out that self-driving demos have existed for decades, citing a CMU demo from 1986. Even a decade ago, around 2014, he experienced a perfect drive in a Waymo vehicle, which made him think the technology was very close to completion. However, this highlights a significant challenge: the demo-to-product gap. A compelling demo is much easier to create than a reliable product, especially in fields like self-driving where the cost of failure is extremely high.
This high-stakes environment is not unique to self-driving. Production-grade software engineering shares this characteristic. While simple coding might be low-risk, mistakes in critical software can lead to massive security vulnerabilities and data breaches. The long development timeline is best understood as a "march of nines." Each nine represents an order of magnitude of reliability, and achieving each one requires a constant amount of work.
When you get a demo and something works 90% of the time, that's just the first nine. And then you need the second nine and third nine, fourth nine, fifth nine.
During his five years at Tesla, they may have progressed through only two or three of these nines, with more still to go. This experience has made him unimpressed by demos. He understands that a functioning demo is just the beginning; the real challenge lies in making a product that can withstand all the complexities of reality.
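The framing is easy to quantify: each additional nine removes 90% of the remaining failures, and, as described above, costs roughly as much work as the previous one. A small illustrative calculation:

```python
# The "march of nines": each nine cuts the failure rate by a factor of ten,
# and (per the description above) costs a roughly constant amount of work.
for nines in range(1, 6):
    failure_rate = 10 ** (-nines)
    print(f"{nines} nine(s): {1 - failure_rate:.5%} success, "
          f"~{1_000_000 * failure_rate:,.0f} failures per million attempts")
```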
The self-driving car is a flawed analogy for deploying AI
The development of AI for coding shares similarities with self-driving cars because both are critical safety domains. Andrej Karpathy notes that while a human driver makes a mistake on average every seven years, a coding agent that constantly outputs code could make a catastrophic error in a much shorter period of wall clock time. This makes it a very challenging problem with a high cost of failure.
While it's often argued that large language models (LLMs) get basic perception and common sense reasoning "for free," saving the years of development that self-driving required, Andrej is skeptical. He argues that LLMs are still fallible, have significant gaps in understanding, and do not provide "magical generalization" out of the box.
Furthermore, Andrej believes the self-driving problem is far from solved. He points out that deployments like Waymo's are minimal, uneconomical, and rely on elaborate teleoperation centers with humans in the loop. The human operator hasn't been removed, just relocated out of sight.
In some sense we haven't actually removed the person, we've moved them to somewhere where you can't see them.
Andrej, who worked at Tesla, suggests their camera-based approach is more scalable than Waymo's sensor-heavy strategy. He emphasizes that self-driving began in the 1980s and is still not finished, with the end goal being deployment at a scale where driver's licenses are no longer necessary.
However, the analogy breaks down when considering economics. The cost of deploying another instance of an AI model is vastly lower than building a new car. Andrej agrees with this distinction, stating that the digital realm allows for much faster adaptation.
I think if you're sticking the realm of bits, bits are like a million times easier than anything that touches the physical world.
Self-driving cars as an analogy for new technology challenges
Beyond the core technology, there are many other layers to consider, such as societal perception, legal frameworks, and insurance implications. Self-driving cars serve as a useful analogy for understanding these broader challenges. For instance, one must consider the equivalent of people putting a cone on a Waymo, which represents unexpected public interaction or interference. It's also important to think about the equivalent of a hidden teleoperating worker, highlighting the unseen human labor that often supports automated systems.
Are we overbuilding compute for AI?
The current massive build-out of AI compute raises the question of whether we are overbuilding, similar to historical precedents like the telecommunications bubble in the late 90s. Andrej Karpathy clarifies that while he may sound pessimistic, he is actually very optimistic about the technology and believes it is a tractable problem that will work.
His pessimism is a reaction to the hype he sees online, which he attributes to fundraising incentives and the general attention economy on the internet. However, he does not believe we are overbuilding compute. He points to miraculous technologies like OpenAI's Codex, which didn't even exist a year ago, and the massive demand seen for products like ChatGPT, as evidence that the new compute capacity will be fully utilized.
His main concern is not the amount of compute being built, but the consistently incorrect fast timelines for AI progress that he has heard from reputable people for 15 years. He stresses the importance of being grounded and properly calibrated about the technology's capabilities and timelines, especially given the geopolitical ramifications.
I've heard many, many times over the course of my 15 years in AI, where very reputable people keep getting this wrong all the time. And I think I want us to be properly calibrated. And I think some of this also, it does have geopolitical ramifications and things like that, some of these questions. And I think I don't want people to make mistakes on that sphere of things. So I do want us to be grounded in reality of what technology is and isn't.
Building a Starfleet Academy to empower humanity in the age of AI
Andrej Karpathy explains his shift from AI research to education. He feels there's a certain determinism to the work being done in major AI labs, and he doesn't believe he could uniquely improve it. His primary concern is that humanity will be disempowered and sidelined by AI advancements. He references dystopian futures depicted in films like Wall-E and Idiocracy as what he hopes to avoid.
His focus is not just on the technological marvels AI might create, but on ensuring humans thrive in this new era. He believes he can add more value by focusing on human empowerment, which he aims to achieve through education. His new project, Eureka, is an effort to build what he describes as a modern-day Starfleet Academy. The goal is to create an elite, up-to-date institution focused on technical knowledge and frontier technology, much like the fictional academy from Star Trek that trains people to build and pilot advanced spaceships.
Education is the technical process of building ramps to knowledge
Andrej Karpathy believes education will be fundamentally changed by AI, but we are still in the early stages. The current applications, like using an LLM to answer questions, feel like "slop" compared to the potential. The true goal is to replicate the experience of an excellent human tutor.
He shares his experience learning Korean with a one-on-one tutor to illustrate this high standard. A good tutor instantly understands a student's level of knowledge and can probe their specific "world model." They then serve up material that is perfectly challenging—not too hard and not too trivial. This creates an environment where the student's own capacity to learn is the only constraint. Current LLMs are not even close to providing this level of personalized interaction.
I felt like I was the only constraint to learning... I was always given the perfect information. I'm the only constraint. And I felt good because I'm the only impediment that exists. It's not that I can't find knowledge or that it's not properly explained... it's just my ability to memorize.
Because the capability isn't there yet, Andrej believes it's not the right time to build the ultimate AI tutor. He compares it to his work in AI consulting, where his most valuable advice was often telling companies *not* to use AI for a particular problem. For now, he is building a state-of-the-art course on AI that is more conventional but still leverages LLMs to help him develop materials much faster.
He views education as a difficult, technical process of "building ramps to knowledge." The goal is to create learning artifacts and paths that provide a high rate of understanding, or "eurekas per second." These ramps, like his 'nanochat' project, are simplified, full-stack examples that allow people to progress smoothly without getting stuck. The biggest bottleneck in creating such educational experiences in the future will be finding experts in each field who can convert their deep understanding into these effective learning ramps.
The evolving role of AI in education
The role of AI in education will likely evolve over time. Initially, the focus would be on hiring faculty to work hand-in-hand with AI and a team of people to build state-of-the-art courses. Faculty would be needed for the overall architecture and to ensure the course fits together properly. Over time, AI could begin to take on more specific roles, such as serving as Teaching Assistants. Andrej Karpathy explains how this might work.
Maybe some of the TAs can actually become AIs. You just take all the course materials and then I think you could serve a very good automated TA for the student when they have more basic questions.
Further in the future, it's possible that AI could handle most of the course design better than a human could, but that future is still some time away.
Post-AGI education will be like going to the gym
Andrej Karpathy envisions creating a new educational institution, which he likens to Starfleet Academy. He imagines a primary physical institution offering a state-of-the-art, immersive experience, supplemented by a more accessible digital offering for a global audience. While he plans to teach AI courses, he intends to hire faculty for other domains to ensure students receive the best possible instruction.
He distinguishes between education before and after AGI. Pre-AGI, education is primarily utilitarian, driven by the need to get a job and make money. Post-AGI, when work is largely automated, education will shift towards personal enrichment and enjoyment.
Pre-AGI education is useful. Post-AGI education is fun.
Karpathy draws an analogy to physical fitness. People go to the gym today not because their physical strength is needed for labor—machines handle that—but because it is fun, healthy, and desirable. Similarly, post-AGI, people will pursue knowledge for the sake of learning and self-improvement. The main barrier to learning today is difficulty, but he sees this as a solvable technical problem. With a perfect AI tutor, like the one that helped him learn Korean, learning anything could become trivial and enjoyable for everyone.
This accessibility to learning could unlock vast human potential, much like modern gym culture has enabled ordinary people to achieve physical feats, like running marathons, that were once rare. Karpathy believes that with the right tools, anyone could speak five languages or master an undergraduate curriculum. He feels that even today's geniuses have only scratched the surface of the human mind's capabilities. He is betting on the timelessness of human nature, where people have historically chosen to flourish physically and cognitively when given the opportunity, hoping to avoid futures depicted in films like Wall-E or Idiocracy.
AI tutors can solve the motivation problem in online learning
Andrej Karpathy loved school and learning, staying all the way through his PhD. He enjoys learning both for its own sake and because it is a form of empowerment that allows him to be useful and productive.
A key reason online courses have not yet enabled everyone to learn everything is that they are heavily dependent on motivation. It is easy to get stuck without clear starting points or on-ramps. An AI that functions like a good human tutor could be a huge unlock from a motivational perspective.
It feels bad to bounce off of learning material. You get a negative reward when you spend time on something that does not work out, or when you are bored because the content is too easy or too hard. When done properly, learning should feel good. Getting to that point is a technical problem to solve, likely through a collaboration between AI and humans, and eventually perhaps just AI.
Andrej Karpathy on the physics of teaching
Andrej Karpathy's approach to teaching is heavily influenced by his physics background. He believes physics is excellent for "booting up a brain" because it teaches valuable cognitive tools like building models and abstractions. It introduces the idea of using a first-order approximation to describe most of a system, and then adding subsequent terms for more detail. The classic physics joke, "assume there's a spherical cow," exemplifies this brilliant and generalizable way of thinking by simplifying a complex object to understand its fundamental properties.
I just feel like physicists have all the right cognitive tools to approach problem solving in the world. So I think because of that training, I always tried to find the first order terms or the second order terms of everything.
He applies this by finding the simplest possible thing that demonstrates a core concept. His "micrograd" repository, for example, is 100 lines of Python code that shows backpropagation, which is at the heart of all neural network training. He argues that everything else built on top of that—tensors, memory orchestration—is purely for efficiency. The core intellectual piece is in that simple code. Education, for him, is the intellectually interesting task of untangling a web of knowledge and laying it out in a logical ramp where each concept builds on the previous one.
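In that spirit, here is a heavily condensed sketch of the idea behind micrograd: a scalar Value object records how it was produced, so backpropagation can walk the graph in reverse and apply the chain rule at each node. It is an illustrative miniature written for this summary, not the repository's actual code.

```python
import math

class Value:
    """A scalar that remembers the operation that produced it, micrograd-style."""
    def __init__(self, data, children=(), op=""):
        self.data, self.grad = data, 0.0
        self._backward = lambda: None
        self._prev, self._op = set(children), op

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other), "+")
        def _backward():
            self.grad += out.grad            # d(out)/d(self) = 1
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other), "*")
        def _backward():
            self.grad += other.data * out.grad   # chain rule for products
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,), "tanh")
        def _backward():
            self.grad += (1 - t * t) * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topologically order the graph, then apply each node's local rule in reverse.
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for node in reversed(topo):
            node._backward()

# A two-input neuron: gradients tell us how the output moves with each weight.
x1, x2 = Value(2.0), Value(0.0)
w1, w2, b = Value(-3.0), Value(1.0), Value(0.5)
out = (x1 * w1 + x2 * w2 + b).tanh()
out.backward()
print(out.data, w1.grad, w2.grad)
```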
A key technique is to present the pain before presenting the solution. This motivates the learner. For example, a tutorial on the transformer model starts with a simple bigram lookup table. Each subsequent, more complex piece is introduced to solve a specific problem with the simpler model, showing why every component is relevant. This avoids just handing over a solution without context. Andrej believes it is essential to give the student a chance to solve the problem themselves first. This provides a better understanding of the problem space and a greater appreciation for the solution when it is revealed. He acknowledges that even he suffers from the "curse of knowledge," where experts take things for granted and struggle to put themselves in the shoes of a beginner.
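The starting point he describes, a bigram lookup table, fits in a few lines: count which character follows which, turn the counts into probabilities, and sample. Its obvious weakness, that it only ever looks one token back, is exactly the 'pain' that motivates attention and the rest of the transformer. A minimal count-based sketch, not drawn from any particular tutorial:

```python
import random
from collections import defaultdict

text = "hello world, hello there, hello again. "
counts = defaultdict(lambda: defaultdict(int))
for a, b in zip(text, text[1:]):
    counts[a][b] += 1                     # bigram table: P(next char | current char)

def sample(start="h", length=40):
    out = start
    random.seed(0)
    for _ in range(length):
        nxt = counts[out[-1]]
        chars, weights = zip(*nxt.items())
        out += random.choices(chars, weights=weights)[0]
    return out

print(sample())   # locally plausible, globally incoherent: only one character of context
```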
The value of conversational explanations in learning
Using tools like ChatGPT to ask basic questions about a complex paper can be very helpful. Andrej Karpathy notes that seeing these simple questions helps him understand the perspective of someone just starting out. This points to a broader issue in how complex ideas are communicated.
Often, the best explanation of a paper comes from an informal conversation, not the formal text. Andrej observes that how someone explains their work over lunch is almost always more understandable and even more accurate than their abstract, jargon-filled paper.
There's something about communicating one on one with a person which compels you to just say the thing.
He recalls an experience from his PhD days as a perfect example. After struggling with a research paper, he met the author at a conference. Over beers, the author explained the core concept perfectly in just three sentences. This raised the obvious question: why isn't that the abstract?
For those learning new subjects, Andrej suggests a couple of strategies. He finds learning "on demand" for a specific project, where there is an immediate reward, to be very effective. This contrasts with traditional "breadth-wise" schooling, where you are told to trust that the information will be useful later.
Another powerful technique is to explain what you're learning to other people. This process forces you to confront gaps in your own understanding. If you can't explain something clearly, you likely don't understand it fully yourself. This act of re-explaining helps solidify knowledge.
Resources
- Nick Lane's book (Book)
- InstructGPT (Paper)
- The Wall Street Journal (Publication)
- Quentin Pope's blog post (Blog post)
- AlphaGo (AI Program)
- Codex (AI Program)
- Idiocracy (Film)
- Star Trek (TV Show)
- Wall-E (Film)
- Scale (Book)