Ollama
Open-source tool for running LLMs locally on macOS, Windows, or Linux, with the option of hosting some models in Ollama Cloud. Used with Claude Code to swap out the underlying model (launched via ollama launch claude), enabling free or near-free Claude Code usage by pointing the agent harness at local Qwen, Gemma, or MiniMax models. Claude Code still requires a $5 Anthropic API credit to activate, but the local models themselves bill nothing.
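A minimal sketch of that workflow, assuming a Qwen tag from the Ollama library (the exact model tag, and whether ollama launch claude is available, depend on the installed Ollama version):

```sh
ollama pull qwen3       # download an open-source model locally
ollama run qwen3        # chat with it in the terminal
ollama launch claude    # start Claude Code pointed at an Ollama model (per the source video)
```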
Key facts
- Local CLI: ollama pull downloads a model, ollama run chats with it
- Cloud option via Ollama Cloud (MiniMax, etc.) when local hardware can’t run big models
- The ollama launch claude integration starts Claude Code pointed at a local or cloud Ollama model
- Default Ollama context windows can be smaller than advertised; create a custom model with a larger context window for Claude Code (see the sketch after this list)
- Paid tier required for concurrent cloud models or higher usage limits
- Claude Code still requires $5 of Anthropic credit to initialise, but local models never consume it
- Nate’s May 2026 stack video places Ollama in the experimental bucket: he does not run local models day to day, but uses Ollama to download, test, and access open-source models and keep up with the ecosystem [src-053]
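A sketch of the custom-model approach from the context-window note above, assuming the qwen3 tag is already pulled; num_ctx is Ollama's Modelfile parameter for context length, and the 32768 value and qwen3-32k name are illustrative:

```sh
# Create a variant of an already-pulled model with a larger context window
cat > Modelfile <<'EOF'
FROM qwen3
PARAMETER num_ctx 32768
EOF
ollama create qwen3-32k -f Modelfile   # register the larger-context variant for Claude Code to use
```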
Related concepts
Source references
- [src-004] Nate Herk — Claude Code cluster (21 videos)
  - Videos referenced: O2k_qwZA8HU, sboNwYmH3AY
- [src-053] Nate Herk — “Overwhelmed By AI? Just Copy My Tech Stack” (2026-05-08)