Self-Checking Todo Loops
A Claude Code execution pattern in which the agent maintains an explicit todo list, runs verification steps after each meaningful change, reads the results, patches failures, and repeats until the checks confirm that the task is complete.
Key points
- The loop is not just “make a todo list”; the important part is attaching observable checks to the list items [src-011]
- Browser automation makes the loop stronger because Playwright can verify real UI behaviour rather than relying on code inspection alone [src-011]
- The agent should treat failed tests, screenshots, logs, or browser errors as feedback to update the implementation and the todo list [src-011]
- This pattern reduces premature completion claims, especially in frontend and browser-automation tasks [src-011]
- Anthropic’s scientific-computing workflow generalizes the same idea to research code: the agent needs a test oracle such as a reference implementation, quantifiable objective, or unit test suite to know whether it is improving [src-072]
- For long-running scientific tasks, the agent should expand tests while working so it does not overfit to a narrow parameter point and miss regressions elsewhere [src-072]
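The loop described in the key points can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Claude Code implements its todo list: `TodoItem`, `run_loop`, `label_check`, and `fix_label` are hypothetical names, and the check reads a toy dictionary where a real setup would run a test suite or a browser script.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical types for illustration; none of these names come from Claude Code.
Check = Callable[[], Tuple[bool, str]]  # returns (passed, feedback message)


@dataclass
class TodoItem:
    description: str
    check: Check                      # the observable check attached to this item
    done: bool = False
    feedback: List[str] = field(default_factory=list)


def run_loop(todos: List[TodoItem],
             patch: Callable[[TodoItem, str], None],
             max_rounds: int = 5) -> bool:
    """Re-run every check each round; report success only after a clean round."""
    for _ in range(max_rounds):
        for item in todos:
            passed, message = item.check()
            item.done = passed
            if not passed:
                item.feedback.append(message)  # record what the check reported
                patch(item, message)           # let the agent respond to it
        if all(item.done for item in todos):
            return True                        # every check passed this round
    return False                               # never claim completion unverified


# Toy stand-in for an implementation under test (assumption: a real check
# would run a test suite or a browser script instead of reading a dict).
state = {"button_label": "Sumbit"}


def label_check() -> Tuple[bool, str]:
    ok = state["button_label"] == "Submit"
    return ok, ("" if ok else f"expected 'Submit', got {state['button_label']!r}")


def fix_label(item: TodoItem, message: str) -> None:
    state["button_label"] = "Submit"


todos = [TodoItem("Submit button shows the correct label", label_check)]
finished = run_loop(todos, fix_label)
print(finished, todos[0].feedback)
```

Two design choices matter in this sketch. Every check is re-run in every round, so a patch for one item that breaks another is caught as a regression, and the loop returns `True` only after a round in which all checks passed, so a patch is never trusted without being verified.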
Related entities
- Claude Code — runtime that manages the todo loop
- Playwright CLI — verification layer for browser-facing work
Related concepts
- Browser Automation with Claude Code — common place to apply self-checking loops
- Agentic Workflows — broader pattern of observe, act, and iterate
- ReAct Loop (Reason + Act) — underlying reasoning-action loop
- Test Oracle Driven Agents
- Ralph Loop Orchestration