On May 19, 2026, Andrej Karpathy (OpenAI co-founder, former Tesla AI director, and the person who taught millions how LLMs work through nanoGPT) announced he's joining Anthropic to build a team using Claude to accelerate pre-training research. The move immediately drew Kevin Durant comparisons from AI Twitter, with one widely shared tweet calling it "KD joining the Warriors for people who know linear algebra." Karpathy will work under Nick Joseph on Anthropic's pre-training team, where the goal is to use AI to speed up the large-scale training runs behind frontier models, in effect pointing AI at its own development. The announcement came as Anthropic rides a $40 billion valuation and positions itself as the research lab betting hardest on AI safety and capability scaling simultaneously.
This article covers who Karpathy is, why the move matters, what pre-training acceleration means, and what it signals about the AI talent wars.
TL;DR
| Question | Answer |
|---|---|
| Who | Andrej Karpathy—OpenAI co-founder, Tesla AI lead (2017-2022), legendary educator (Stanford, YouTube, nanoGPT). |
| What | Joined Anthropic's pre-training team to use Claude to accelerate pre-training research. |
| When | Announced May 19, 2026. He appears to have started around the same time: an X user reported meeting him on "his second day." |
| Why | Wants to get back to frontier LLM R&D. Believes "next few years at the frontier will be especially formative." |
| Pre-training | The core large-scale training process that gives models foundational capabilities. Using AI to optimize AI training. |
| Team | Works under Nick Joseph (Anthropic pre-training lead). Building a new team focused on Claude-accelerated research. |
| Reaction | AI Twitter: "KD joining the Warriors" (superstar joining elite team). Warm welcome from Anthropic team. |
| Context | Anthropic at $40B valuation, pushing self-improving AI and Claude as research accelerator. |
Primary sources: Karpathy on X · TechCrunch coverage · The Decoder analysis
Who is Andrej Karpathy?
Andrej Karpathy is one of the most respected figures in AI, equally known for technical depth and teaching ability. His career spans a founding-team role at OpenAI, AI leadership at Tesla, and teaching that has reached millions.
OpenAI co-founder (2015-2017)
Karpathy was part of OpenAI's founding team in 2015, focusing on deep learning and computer vision. He worked alongside Ilya Sutskever, Greg Brockman, and Sam Altman during OpenAI's early research phase.
Tesla AI director (2017-2022)
Led Tesla's Autopilot and Full Self-Driving (FSD) AI programs. Under his leadership, Tesla:
- Deployed vision-only self-driving (no LiDAR)
- Built massive real-world datasets from fleet data
- Trained models on custom hardware (Dojo supercomputer)
He left Tesla in July 2022 after 5 years, citing burnout and wanting to return to technical work.
Educator and nanoGPT creator
Karpathy is legendary for making AI accessible:
- Stanford CS231n: His computer vision course became the gold standard
- YouTube: Deep dives like "The spelled-out intro to neural networks" have millions of views
- nanoGPT: A minimal, educational implementation of GPT that taught developers how LLMs actually work—37k+ GitHub stars
Brief OpenAI return (2023-2024)
Rejoined OpenAI in February 2023, working on ChatGPT and GPT-4 improvements. Left again in February 2024 to start Eureka Labs, an education-focused AI startup.
Eureka Labs (2024-2026)
Founded Eureka Labs to build AI tutors and education tools. The startup aimed to use LLMs to personalize learning. After roughly two years, he moved to Anthropic, a sign that frontier research pulled harder than the startup path.
What is pre-training and why accelerate it?
Pre-training is the phase where models learn from massive datasets (trillions of tokens) over months of training on thousands of GPUs. It's the foundation that gives models like Claude, GPT-4, and Gemini their core capabilities.
Why it's expensive:
- Compute: Months on 10,000+ H100 GPUs (~$500M per run for frontier models)
- Data: Curating high-quality training data at trillion-token scale
- Iteration: Each experiment takes weeks; wrong hyperparameters waste millions
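To make the compute line concrete, here is a back-of-the-envelope sketch of raw GPU rental cost for one run. The 90-day duration and the $2.50 per GPU-hour price are illustrative assumptions, not figures from Anthropic or any provider. Under these assumptions the rental bill alone lands well below the ~$500M quoted above, so a number that size would have to include larger clusters, longer runs, failed experiments, staff, or data costs.

```python
def training_run_cost(num_gpus: int, days: float, usd_per_gpu_hour: float) -> float:
    """Raw GPU rental cost of one run; ignores staff, storage, data, and failed runs."""
    gpu_hours = num_gpus * days * 24
    return gpu_hours * usd_per_gpu_hour


# Assumed inputs: 10,000 GPUs, a 90-day run, a hypothetical $2.50 per GPU-hour.
cost = training_run_cost(num_gpus=10_000, days=90, usd_per_gpu_hour=2.50)
print(f"${cost / 1e6:.0f}M")  # prints "$54M"
```

Because cost scales linearly in every input, doubling the cluster or the run length doubles the bill, which is why a wasted multi-week experiment is so expensive.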
Pre-training bottlenecks:
- Data quality: Filtering web scrapes for signal vs noise
- Architecture search: Finding optimal model designs
- Hyperparameter tuning: Learning rates, batch sizes, warmup schedules
- Training instability: Loss spikes, gradient explosions, plateau diagnosis
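As one illustration of the hyperparameters listed above, the sketch below implements a common learning-rate schedule: linear warmup followed by cosine decay. The function name and the specific values (peak rate 3e-4, 1,000 warmup steps, 10,000 total steps) are assumptions chosen for the example, not settings used by any lab.

```python
import math


def lr_at_step(step: int, peak_lr: float, warmup_steps: int,
               total_steps: int, min_lr: float = 0.0) -> float:
    """Linear warmup to peak_lr, then cosine decay down to min_lr."""
    if step < warmup_steps:
        # Ramp up from a small value to avoid unstable early updates.
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))


# Print the schedule at a few points of a hypothetical 10,000-step run.
for step in (0, 500, 1_000, 5_500, 10_000):
    print(step, lr_at_step(step, peak_lr=3e-4, warmup_steps=1_000, total_steps=10_000))
```

A warmup that is too short is one common source of the early loss spikes mentioned in the list, which is why the warmup length is usually tuned together with the peak rate.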
Why accelerate? Whoever can iterate faster and cheaper wins the AI race. If Anthropic can use Claude to:
- Filter training data automatically (flag low-quality samples)
- Suggest architecture improvements (test variations in simulation)
- Tune hyperparameters (search optimal configs)
- Debug training runs (diagnose loss curves, suggest fixes)
...they can train better models in less time while competitors burn compute on failed experiments.
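The data-filtering idea can be sketched in a few lines. In the version below, `quality_score` is a toy heuristic standing in for a model-based rating. How Anthropic actually scores data is not public, so the function, the threshold, and the sample documents are all hypothetical.

```python
def quality_score(text: str) -> float:
    """Toy score in [0, 1]; a stand-in for a model-based quality rating."""
    words = text.split()
    if len(words) < 5:
        return 0.0
    # Share of distinct words: penalizes repetitive spam.
    unique_ratio = len({w.lower() for w in words}) / len(words)
    # Share of letters and spaces: penalizes symbol noise.
    alpha_ratio = sum(ch.isalpha() or ch.isspace() for ch in text) / len(text)
    return unique_ratio * alpha_ratio


def filter_corpus(docs, threshold=0.6):
    """Keep only documents whose score meets the threshold."""
    return [d for d in docs if quality_score(d) >= threshold]


docs = [
    "Pre-training teaches a model general patterns from large text corpora.",
    "buy now buy now buy now buy now buy now",
    "$$$ ### @@@ !!! %%% ^^^ &&& ***",
]
kept = filter_corpus(docs)
print(kept)  # only the first document survives
```

Swapping the heuristic for an LLM call would leave `filter_corpus` unchanged. At trillion-token scale the hard part is the cost and consistency of the scorer, not the filtering loop itself.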
The self-improving loop: the current Claude helps train its successor → that successor helps train the next generation → compounding acceleration.
What Karpathy will work on
From TechCrunch:
"An Anthropic spokesperson told TechCrunch that Karpathy will start a team focused on using Claude to accelerate pre-training research. Pre-training is responsible for the large-scale training runs that give Claude its core knowledge and capabilities."
Specific areas (inferred from role):
- Data curation: Using Claude to score, filter, and curate training datasets
- Architecture search: Claude suggests model modifications, runs ablations
- Hyperparameter optimization: AI-driven search over training configs
- Training diagnostics: Claude analyzes loss curves, flags instabilities
- Synthetic data generation: Claude creates high-quality training examples
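For the hyperparameter-optimization item, a plain random search gives a feel for what "search over training configs" means. This is a minimal sketch: `toy_validation_loss` is a made-up function standing in for a short proxy training run, and the search ranges are assumptions. An AI-driven search would replace the random sampling with model-proposed configurations.

```python
import math
import random


def toy_validation_loss(lr: float, batch_size: int) -> float:
    """Hypothetical stand-in for a proxy run; lowest near lr=3e-4, batch_size=256."""
    lr_term = (math.log10(lr) - math.log10(3e-4)) ** 2
    batch_term = 0.1 * (math.log2(batch_size) - 8) ** 2
    return lr_term + batch_term


def random_search(num_trials: int, seed: int = 0):
    """Sample configs at random and return (best_loss, best_config)."""
    rng = random.Random(seed)
    best = None
    for _ in range(num_trials):
        config = {
            "lr": 10 ** rng.uniform(-5, -2),         # log-uniform learning rate
            "batch_size": 2 ** rng.randint(5, 11),   # 32 through 2048
        }
        loss = toy_validation_loss(**config)
        if best is None or loss < best[0]:
            best = (loss, config)
    return best


best_loss, best_config = random_search(num_trials=50)
print(best_config, round(best_loss, 4))
```

Sampling the learning rate on a log scale matters because useful values span several orders of magnitude. A uniform sample over the same range would almost never try the small values.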
Example workflow:
- Claude analyzes training run logs → identifies a plateau at step 50K
- Suggests: "Increase learning rate by 1.2×, add warmup for 5K steps"
- Runs a small-scale test → confirms improvement → applies the change to the main run
- Result: Faster convergence, lower cost
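The first step of that workflow, spotting a plateau in the logs, can be approximated without any AI at all. The sketch below compares the mean loss of two adjacent windows and flags the first point where the relative improvement falls under a threshold. The window size, the 1% threshold, and the synthetic loss curve are assumptions for illustration.

```python
def find_plateau(losses, window=5, min_rel_improvement=0.01):
    """Return the first index where the latest window's mean loss improved
    by less than min_rel_improvement over the previous window, else None."""
    for end in range(2 * window, len(losses) + 1):
        prev = sum(losses[end - 2 * window:end - window]) / window
        curr = sum(losses[end - window:end]) / window
        if prev > 0 and (prev - curr) / prev < min_rel_improvement:
            return end - 1
    return None


# Synthetic curve: loss falls steadily for 20 logged points, then stalls at 2.0.
losses = [4.0 - 0.1 * i for i in range(20)] + [2.0] * 10
plateau = find_plateau(losses)
print("plateau detected at log index:", plateau)
```

The detector fires a few points after the curve flattens because both windows need to fill with stalled values first. Deciding what to change in response, such as the learning rate or warmup, is where a model like Claude would come in.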
Why Karpathy? His Tesla experience is directly relevant:
- Built systems to curate billions of driving clips from fleet data
- Optimized training pipelines for real-world, noisy datasets
- Shipped production models with lives depending on correctness
Those skills map perfectly to pre-training at scale.
Why Anthropic over OpenAI?
Karpathy's statement:
"I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D."
Key factors:
1. Research focus over product
OpenAI is increasingly product-driven (ChatGPT, GPT-5, enterprise deals). Anthropic is still research-first, with less pressure to ship features weekly.
2. Using AI to improve AI
Anthropic's self-improving AI thesis—using Claude to accelerate Claude's development—is the meta-problem Karpathy finds most interesting.
3. Pre-training team autonomy
Working under Nick Joseph (Anthropic pre-training lead), Karpathy gets to build a new team with fresh mandates, not inherit legacy systems.
4. Timing
OpenAI is scaling GPT-5. Anthropic is rethinking training itself. For someone betting on "formative years," the research leverage is at Anthropic.
AI Twitter reactions
The KD comparison (most viral tweet):
@netcapgirl: "this is KD joining the warriors for people who know linear algebra"
Translation: Kevin Durant joining the already-dominant Golden State Warriors = superstar joining elite team. Anthropic already has:
- Dario Amodei (ex-OpenAI VP Research)
- Chris Olah (legendary interpretability researcher)
- Sam McCandlish, Tom Brown, Jared Kaplan (scaling laws pioneers)
- $40B valuation, $7B annual revenue run-rate
Adding Karpathy = stacked roster.
Warm welcome:
@ClaudeDevs (official Anthropic account): "Welcome to the team, Andrej!"
Second-day encounter:
@yasser_elsaid_: "Met @karpathy on his second day of work at anthropic. He did confirm they didn't ask him leetcode questions in the interview. Also he is super super nice irl."
What this signals about AI talent wars
1. Anthropic is winning top-tier talent
Founded by former OpenAI leaders Dario and Daniela Amodei and home to Chris Olah (interpretability legend), Anthropic adds Karpathy to what was already a research all-star team.
2. Pre-training acceleration is the new frontier
The fact that Anthropic is dedicating a top-tier hire to "using AI to accelerate AI training" signals this is where the next major gains come from: not just bigger models, but smarter training.
3. OpenAI faces brain drain risk
Karpathy has left OpenAI twice, and Ilya Sutskever left to start Safe Superintelligence Inc. With Karpathy now choosing Anthropic for his return to frontier research, OpenAI's product focus may be pushing researchers toward labs still prioritizing pure R&D.
4. Education remains his passion
Karpathy noted: "I remain deeply passionate about education and plan to resume my work on it in time." This suggests education is on hold rather than abandoned; it could resurface as blog posts, lectures, or open-sourced research techniques.
What happens next
Short term (2026):
- Karpathy builds team under Nick Joseph
- First experiments: using the current Claude models to curate training data for their successors
- Early wins: faster data filtering, better architecture search
Medium term (2027-2028):
- Self-improving loop kicks in: each Claude generation helps train the next
- Public papers on AI-accelerated pre-training techniques
- Anthropic's training costs drop 30-50% vs competitors
Long term (2029+):
- Fully automated AI research labs
- Models that design their own training curricula
- Human researchers shift to oversight + goal-setting
Karpathy's education return: Likely publishes "The spelled-out intro to pre-training" (nanoGPT-style tutorial) once the techniques mature.
Sources
- Andrej Karpathy on X: x.com/karpathy
- TechCrunch — Karpathy joins Anthropic: techcrunch.com
- The Decoder — Analysis: the-decoder.com
- Axios — Karpathy move: axios.com
- CNBC — Anthropic hires Karpathy: cnbc.com
Career moves, team structures, and research priorities evolve rapidly. Treat this as May 22, 2026 context. Verify current roles and projects at anthropic.com and karpathy.ai before citing.