Verifiability Frontier
The verifiability frontier is the boundary between work that current LLM training methods can automate quickly because outputs can be reliably checked, and work that remains jagged because rewards are hard to specify.
Key points
- Karpathy’s formulation: traditional computers automate what can be specified in code; current LLMs automate what can be verified [src-055].
- Frontier labs use reinforcement-learning environments with verification rewards, so models peak in domains such as math, code, and nearby tasks where correctness can be checked at scale [src-055].
- Capability is not only about verifiability; it also depends on what labs care about enough to include in the data and RL mix. Karpathy cites chess performance improving from GPT-3.5 to GPT-4 as an example of a change in data distribution changing a capability [src-055].
- Founders can look for valuable verifiable domains that labs have not prioritized, then build their own RL environments or fine-tuning loops around those tasks (a minimal reward-checker sketch follows this list) [src-055].
- If an application sits outside the model’s trained circuits, teams should expect more struggle and may need fine-tuning or domain-specific verification rather than assuming the base model will handle it [src-055].
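The sketch below illustrates the kind of verification reward these points describe: a task is paired with a cheap, automatic checker, and the reward is simply whether the model's output passes that check. This is a minimal illustration under assumed names (MathTask, extract_final_answer, verify_reward are hypothetical), not any lab's actual pipeline.

```python
# Minimal sketch of a verification reward for an RL environment.
# All names here are illustrative, not from any specific lab's stack.
from dataclasses import dataclass


@dataclass
class MathTask:
    prompt: str    # problem statement shown to the model
    expected: str  # ground-truth answer used only by the verifier


def extract_final_answer(completion: str) -> str:
    """Treat the last non-empty line of the completion as the model's answer."""
    lines = [line.strip() for line in completion.splitlines() if line.strip()]
    return lines[-1] if lines else ""


def verify_reward(task: MathTask, completion: str) -> float:
    """Binary reward: 1.0 if the model's final answer matches the reference.

    Verifiable domains (math, code with unit tests) admit this kind of cheap,
    scalable check; tasks without such a checker sit outside the frontier.
    """
    return 1.0 if extract_final_answer(completion) == task.expected.strip() else 0.0


if __name__ == "__main__":
    task = MathTask(prompt="What is 12 * 13?", expected="156")
    print(verify_reward(task, "12 * 13 = 156\n156"))   # 1.0
    print(verify_reward(task, "The answer is 155."))   # 0.0
```

The same shape generalizes to other verifiable domains: for code tasks the checker runs unit tests and rewards a pass, and a founder targeting an underserved vertical would swap in a domain-specific verifier in place of the exact-match comparison.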
Related concepts
- Jagged Intelligence
- Agentic Engineering
- Offline Evals to Online Experiments
- AI Product Experimentation
Source references
- [src-055] Sequoia Capital — “Andrej Karpathy: From Vibe Coding to Agentic Engineering” (2026-04-29)