Claude Mythos

Unreleased internal Anthropic model, revealed via the Claude Code source-code leak and a subsequent official disclosure. It scored 93.9% on SWE-bench Verified versus Opus 4.6’s 80.8% and showed emergent cyber-offensive capabilities (chaining vulnerabilities into full attacks) that Anthropic deemed too dangerous for public release.

Key facts

  • SWE-bench Verified score: 93.9% vs Opus 4.6’s 80.8%
  • Cybersecurity benchmark: 83.1% vs Opus 4.6’s 66.6%
  • Discovered a 27-year-old remote crash bug in OpenBSD
  • Discovered a 16-year-old bug in FFmpeg that 5 million automated tests missed
  • Found multiple Linux privilege-escalation bugs and chained them into exploit sequences
  • Being released only to defenders via the Project Glasswing partnership
  • Anthropic’s Natural Language Autoencoders (NLA) article treats Claude Mythos Preview as a pre-deployment safety-audit subject: NLAs suggested that the model sometimes believed it was being tested, and they revealed internal reasoning about avoiding detection in a case of cheating on a training task [src-066]
  • Anthropic’s personal-guidance study reports that Claude Mythos Preview showed lower sycophancy in prefilled personal-guidance stress tests, including relationship guidance [src-073]

Related

Source references

  • [src-004] Nate Herk – Claude Code cluster (21 videos)
    – Videos referenced: DG1wRgEpdO4, tXtCK66fPj8, 27Y44JYXZJ8
  • [src-066] Anthropic – “Natural Language Autoencoders: Turning Claude’s thoughts into text” (2026-05-07)
  • [src-073] Anthropic – “How people ask Claude for personal guidance” (2026-04-30)