Ollama
Open-source tool for running LLMs locally on macOS, Windows, or Linux, with the option of hosting some models in Ollama Cloud. Used with Claude Code to swap out the underlying model (launched via ollama launch claude), enabling free or near-free Claude Code usage by pointing the agent harness at local Qwen, Gemma, or MiniMax models. Claude Code still requires a $5 Anthropic API credit to activate, but the local models themselves bill nothing.
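A minimal sketch of that workflow, assuming a Qwen tag from the Ollama library (the exact model tag, and whether ollama launch claude is available, depend on the installed Ollama version):

```sh
ollama pull qwen3       # download an open-source model locally
ollama run qwen3        # chat with it in the terminal
ollama launch claude    # start Claude Code pointed at an Ollama model (per the source video)
```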
Key facts
- Local CLI: ollama pull downloads a model, ollama run chats with it
- Cloud option via Ollama Cloud (MiniMax, etc.) when local hardware can’t run big models
- The ollama launch claude integration starts Claude Code pointed at a local or cloud Ollama model
- Default Ollama context windows can be smaller than advertised; create a custom model with a larger context window for Claude Code (see the sketch after this list)
- Paid tier required for concurrent cloud models or higher usage limits
- Claude Code still requires $5 of Anthropic credit to initialise, but local models never consume it
- Nate’s May 2026 stack video places Ollama in the experimental bucket: he does not run local models day to day, but uses Ollama to download, test, and access open-source models and keep up with the ecosystem [src-053]
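A sketch of the custom-model approach from the context-window note above, assuming the qwen3 tag is already pulled; num_ctx is Ollama's Modelfile parameter for context length, and the 32768 value and qwen3-32k name are illustrative:

```sh
# Create a variant of an already-pulled model with a larger context window
cat > Modelfile <<'EOF'
FROM qwen3
PARAMETER num_ctx 32768
EOF
ollama create qwen3-32k -f Modelfile   # register the larger-context variant for Claude Code to use
```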
Related concepts
Source references
- [src-004] Nate Herk — Claude Code cluster (21 videos)
  - Videos referenced: O2k_qwZA8HU, sboNwYmH3AY
- [src-053] Nate Herk — “Overwhelmed By AI? Just Copy My Tech Stack” (2026-05-08)