⚒ Anvil
The only AI coding assistant that doesn’t lock you in.
Your providers. Your credentials. Your data. Your cost.
brew install culpur/anvil/anvil
curl -fsSL https://anvilhub.culpur.net/install.sh | bash
irm https://anvilhub.culpur.net/install.ps1 | iex
anvil upgrade
SHA256 verified • out-of-band checksum manifest • refuses to install unverified binaries • shell completions included for bash, zsh, fish, PowerShell
Other AI coding assistants lock you into one vendor’s pipes: one provider, one pricing model, one set of rate limits. Your code, your data, and your costs all flow through infrastructure you don’t control.
Anvil is the inverse. Pick your provider. Use your own API keys, or run everything locally through Ollama. Switch freely mid-conversation. When one provider hits a rate limit, fail over to the next. When a provider does something you don’t like, leave.
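To make the failover idea concrete, here is a minimal sketch of a priority chain in Rust. It is an illustration under stated assumptions, not Anvil's actual code: `Provider`, `CallError`, `complete_with_failover`, and the two stand-in provider functions are hypothetical names invented for this example.

```rust
/// Hypothetical error type: only rate limiting triggers failover here.
#[derive(Debug, PartialEq)]
pub enum CallError {
    RateLimited,
    Other(String),
}

/// A provider is modelled as a name plus a function that answers a prompt.
pub struct Provider {
    pub name: &'static str,
    pub call: fn(&str) -> Result<String, CallError>,
}

/// Try providers in priority order and move on only when one is rate limited.
/// Any other error is returned immediately so real failures are not hidden.
pub fn complete_with_failover(
    chain: &[Provider],
    prompt: &str,
) -> Result<(&'static str, String), CallError> {
    let mut last = CallError::Other("empty provider chain".to_string());
    for p in chain {
        match (p.call)(prompt) {
            Ok(text) => return Ok((p.name, text)),
            Err(CallError::RateLimited) => last = CallError::RateLimited,
            Err(other) => return Err(other),
        }
    }
    Err(last)
}

/// Stand-in for a cloud provider that is currently throttling.
fn throttled(_prompt: &str) -> Result<String, CallError> {
    Err(CallError::RateLimited)
}

/// Stand-in for a local model that always answers.
fn echo(prompt: &str) -> Result<String, CallError> {
    Ok(format!("echo: {}", prompt))
}

/// Example chain: a throttled primary followed by a local fallback.
pub fn example_chain() -> Vec<Provider> {
    vec![
        Provider { name: "primary", call: throttled },
        Provider { name: "local", call: echo },
    ]
}
```

Calling `complete_with_failover(&example_chain(), "hi")` skips the throttled entry and returns the answer from the second one, along with the name of the provider that served it.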
Your Providers
Anthropic, OpenAI, Google, xAI, Ollama local, or Ollama Cloud. Configure priority chains. Automatic failover when one throttles. Never locked in.
Your Credentials
AES-256-GCM encrypted vault with Argon2id KDF. 21 credential types. Nothing touches disk unencrypted. Per-project scopes.
Your Data
Single binary. Zero telemetry. Local Ollama support. Run air-gapped. Your prompts and code never leave your machine unless you send them.
Your Cost
Per-provider budgets. Per-session tracking. Hard caps. See every token’s cost before you spend it. Zero-cost inference with Ollama.
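A hard cap can be sketched as a reserve-before-spend check. The example below is an assumption-laden illustration, not Anvil's implementation: `BudgetBook`, `estimate_micros`, and the micro-dollar unit are hypothetical choices made to keep the arithmetic exact.

```rust
use std::collections::HashMap;

/// Spend tracker for one provider, in micro-dollars to avoid float drift.
struct Budget {
    cap_micros: u64,
    spent_micros: u64,
}

#[derive(Debug, PartialEq)]
pub enum BudgetError {
    UnknownProvider,
    CapExceeded { remaining_micros: u64 },
}

/// Hypothetical per-provider budget table.
pub struct BudgetBook {
    budgets: HashMap<String, Budget>,
}

/// Estimate a request's cost before sending it.
/// Rounds up so the estimate never understates the spend.
pub fn estimate_micros(tokens: u64, micros_per_million_tokens: u64) -> u64 {
    (tokens * micros_per_million_tokens + 999_999) / 1_000_000
}

impl BudgetBook {
    pub fn new() -> Self {
        BudgetBook { budgets: HashMap::new() }
    }

    /// Set (or reset) the cap for a provider; spend starts again from zero.
    pub fn set_cap(&mut self, provider: &str, cap_micros: u64) {
        self.budgets
            .insert(provider.to_string(), Budget { cap_micros, spent_micros: 0 });
    }

    /// Reserve the cost up front. Refuses the request if it would cross the cap,
    /// otherwise records the spend and returns what is left.
    pub fn reserve(&mut self, provider: &str, cost_micros: u64) -> Result<u64, BudgetError> {
        let b = self
            .budgets
            .get_mut(provider)
            .ok_or(BudgetError::UnknownProvider)?;
        let remaining = b.cap_micros - b.spent_micros;
        if cost_micros > remaining {
            return Err(BudgetError::CapExceeded { remaining_micros: remaining });
        }
        b.spent_micros += cost_micros;
        Ok(remaining - cost_micros)
    }
}
```

With a price of 3,000,000 micro-dollars per million tokens, a 2,000-token request is estimated at 6,000 micro-dollars. Against a 10,000 cap the first reservation succeeds and the second is refused, because the check happens before any tokens are spent.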
Your Access
Type /remote-control and hand any session to any browser. 6-digit pairing. Full bidirectional control. Code from your phone.
Your Deployment
Run on your laptop. Run on a server. Share a session across devices. Nothing to install on the browser side. Your infrastructure, your rules.
Privacy-conscious developers
You don’t want every prompt going to a cloud API — and you can’t afford a $50K local-inference stack. Ollama plus Anvil is the answer.
Consultants & contractors
You juggle credentials across clients and need isolation between projects. Anvil’s vault + per-tab sandboxing solves it.
Open-source maintainers
You’re tired of single-provider lock-in and pricing changes. Own your keys, own your models, switch when you want.
Small teams
You want deployment choice. Cloud providers, local Ollama, or a mix. Per-tab policy lets each engineer pick what fits the task.
No other AI coding assistant does this. Type /remote-control in your terminal. Open the generated URL on your phone, your tablet, a colleague’s laptop. Enter the 6-digit code. Both sides have full control.
Full bidirectional control
Type messages, run commands, manage tabs from any device. Not a transcript — a live session.
Real-time streaming
See AI responses token-by-token in the browser. WebSocket relay with automatic reconnection.
Configure from the browser
Swap providers, change models, manage credentials — 17 config panels with full parity to the TUI.
Encrypted pairing
6-digit secure pairing code. Encrypted relay. Your session is yours.
v2.2.19 rewrites the first-run experience as a single in-TUI alt-screen wizard, adds an autonomous reflection loop, and ships an isolated sandbox-runner companion binary.
A single alt-screen owns the entire first-run experience in v2.2.19. The welcome card explains what the wizard does before it asks for anything. Nine modal steps follow with tight per-step descriptions, OAuth callbacks complete the moment your browser redirects (no extra keystroke), and the wizard hands off directly to the vault-unlock modal and main TUI: one alt-screen, one teardown, zero seams. Long-running turns now get an autonomous reflection loop that switches strategy when an agent is stuck. Hub-install detonation runs in a separate `anvil-sandbox-runner` companion binary, isolating package code from your vault, your live session, and any other sessions in flight. Every one of the 558 packages on AnvilHub now has a viewable source archive, and Documentation tabs are populated for the 547 packages that previously had a NULL README. Seven native binaries continue to ship: macOS ARM64, macOS Intel, Linux x86_64, Linux ARM64, Windows x86_64, FreeBSD x86_64, and NetBSD x86_64.
Per-tab runtime
Each tab owns its own Arc<Mutex<ConversationRuntime>> and spawns turns on a dedicated worker thread. The global lock is gone. Fire a prompt in tab 1, switch to tab 2, fire another — both stream concurrently against different providers, different models, different keys.
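The per-tab ownership model described above can be sketched with the standard library alone. This is a simplified illustration, not Anvil's source: the fields of `ConversationRuntime`, the `Tab` struct, and `spawn_turn` are hypothetical, and the "model call" is just a formatted string.

```rust
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Stand-in for the real runtime; these fields are assumptions.
pub struct ConversationRuntime {
    pub provider: String,
    pub transcript: Vec<String>,
}

/// Each tab owns its own runtime behind its own mutex. There is no global lock.
pub struct Tab {
    pub id: usize,
    pub runtime: Arc<Mutex<ConversationRuntime>>,
}

impl Tab {
    pub fn new(id: usize, provider: &str) -> Self {
        let runtime = ConversationRuntime {
            provider: provider.to_string(),
            transcript: Vec::new(),
        };
        Tab { id, runtime: Arc::new(Mutex::new(runtime)) }
    }

    /// Run one turn on a dedicated worker thread.
    /// Only this tab's mutex is locked, so other tabs keep running.
    pub fn spawn_turn(&self, prompt: String) -> JoinHandle<()> {
        let runtime = Arc::clone(&self.runtime);
        thread::spawn(move || {
            let mut rt = runtime.lock().expect("tab runtime mutex poisoned");
            let reply = format!("[{}] reply to: {}", rt.provider, prompt);
            rt.transcript.push(reply);
        })
    }
}

/// Fire a prompt in two tabs and wait for both turns to finish.
pub fn run_two_tabs() -> (Vec<String>, Vec<String>) {
    let t1 = Tab::new(1, "anthropic");
    let t2 = Tab::new(2, "ollama");
    let h1 = t1.spawn_turn("refactor".to_string());
    let h2 = t2.spawn_turn("explain".to_string());
    h1.join().expect("tab 1 worker panicked");
    h2.join().expect("tab 2 worker panicked");
    let a = t1.runtime.lock().unwrap().transcript.clone();
    let b = t2.runtime.lock().unwrap().transcript.clone();
    (a, b)
}
```

Because each worker clones only its own tab's `Arc`, the two turns never contend for the same lock, which is the property the paragraph above describes.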
Tab-scoped event routing
Every TuiEvent now carries a tab_id. Stream deltas, tool calls, permission prompts, and errors route to the correct tab without contention. The tab bar shows * for unread output and ⚠ for pending permissions, both updating live.
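Routing by `tab_id` might look roughly like the sketch below. Only `TuiEvent` and `tab_id` come from the text above. `Payload`, `TabView`, `Router`, `drain`, and the rule that the permission marker takes precedence over the unread marker are assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::sync::mpsc::Receiver;

/// Hypothetical event payloads.
pub enum Payload {
    StreamDelta(String),
    PermissionPrompt(String),
    Error(String),
}

/// Every event names the tab it belongs to.
pub struct TuiEvent {
    pub tab_id: usize,
    pub payload: Payload,
}

/// Per-tab display state (assumed shape).
#[derive(Default)]
pub struct TabView {
    pub output: String,
    pub unread: bool,
    pub pending_permission: bool,
}

pub struct Router {
    pub active_tab: usize,
    pub tabs: HashMap<usize, TabView>,
}

impl Router {
    pub fn new(active_tab: usize) -> Self {
        Router { active_tab, tabs: HashMap::new() }
    }

    /// Deliver one event to the tab named in it, and nowhere else.
    pub fn dispatch(&mut self, ev: TuiEvent) {
        let is_active = ev.tab_id == self.active_tab;
        let view = self.tabs.entry(ev.tab_id).or_default();
        match ev.payload {
            Payload::StreamDelta(text) | Payload::Error(text) => {
                view.output.push_str(&text);
                // Output that arrives in a background tab is flagged as unread.
                if !is_active {
                    view.unread = true;
                }
            }
            Payload::PermissionPrompt(_) => view.pending_permission = true,
        }
    }

    /// Tab-bar marker. Assumption: a pending permission outranks unread output.
    pub fn marker(&self, tab_id: usize) -> &'static str {
        match self.tabs.get(&tab_id) {
            Some(v) if v.pending_permission => "⚠",
            Some(v) if v.unread => "*",
            _ => "",
        }
    }
}

/// Drain everything workers have queued so far without blocking the UI.
pub fn drain(router: &mut Router, rx: &Receiver<TuiEvent>) {
    for ev in rx.try_iter() {
        router.dispatch(ev);
    }
}
```

All worker threads can share clones of one `Sender<TuiEvent>`; the UI thread calls `drain` on each frame and the `tab_id` decides where each event lands.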
Live TUI during inference
Ctrl+T (new tab), F2/F3 (switch), /ssh (open SSH form), and prompts from other tabs all respond immediately while another tab is mid-turn. wait_for_turn_end_for_tab exits early on any user action, so the interface never waits for the model.
Mutex discipline + deadlock fixes
The unified RedrawScheduler replaces scattered per-component draw logic and kills a class of race bugs. /quit no longer self-deadlocks via re-entrant lock acquisition in record_daily. 318 tests cover the threading model end-to-end.
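The class of bug behind the /quit deadlock is easy to reproduce in miniature: `std::sync::Mutex` is not re-entrant, so locking it again on a thread that already holds the guard never returns normally. The sketch below shows one common way to avoid that. Only the name `record_daily` comes from the text above; `App`, `Stats`, and `record_daily_locked` are hypothetical, and this is not a claim about how Anvil's fix is written.

```rust
use std::sync::Mutex;

/// Hypothetical shared state.
pub struct Stats {
    pub turns_today: u32,
}

pub struct App {
    pub stats: Mutex<Stats>,
}

impl App {
    pub fn new() -> Self {
        App { stats: Mutex::new(Stats { turns_today: 0 }) }
    }

    /// Helper that works on already-locked data, so it never locks on its own.
    fn record_daily_locked(stats: &mut Stats) {
        stats.turns_today += 1;
    }

    /// Entry point for callers that do not hold the lock.
    pub fn record_daily(&self) {
        let mut guard = self.stats.lock().expect("stats mutex poisoned");
        Self::record_daily_locked(&mut guard);
    }

    /// quit already holds the lock for its own bookkeeping, so it reuses the guard.
    /// Calling self.record_daily() here instead would lock the same mutex twice
    /// on one thread, which deadlocks or panics.
    pub fn quit(&self) -> u32 {
        let mut guard = self.stats.lock().expect("stats mutex poisoned");
        Self::record_daily_locked(&mut guard);
        guard.turns_today
    }
}
```

Splitting each operation into a locking wrapper and a `_locked` helper keeps the lock acquisition in exactly one place per call path.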
Every other AI coding tool shows you a one-line tool summary after it completes. Anvil v2.2.12 shows you the actual input the model sent the moment the call fires, in a bordered card you can expand to see the full JSON request and the full result. No more guessing what the agent is searching for, what it’s about to edit, or which file it just opened.
Tool-call cards
Glob, Grep, Read, Write, Edit, Bash, WebSearch, and every MCP tool render as a bordered card the moment the model invokes them. Pattern, path, command, search query — visible immediately, not after completion.
Ctrl+O to expand
Hit Ctrl+O on any tool card to expand the full input JSON and the full result. Inspect exactly what the model asked for, and exactly what came back. The complete loop, every time.
Streaming progress indicators
Cards animate in-place from active to done. Errors render with red borders. Long-running Bash commands show elapsed time. You never wonder whether the agent is making progress.
Scrollback correctness
PageUp/Up in HISTORICAL VIEW used to show only the first 1–4 characters of each assistant line (#, ##, **, -) because the pending-message line cache wasn’t invalidated on each token delta. Fixed. Your full conversation is now scrollable, every line intact.
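A stale line cache of this kind can be illustrated in a few lines. The sketch below is a hypothetical reconstruction of the pattern, not Anvil's code: `PendingMessage`, `push_delta`, and `lines` are invented names.

```rust
/// A still-streaming assistant message with a cached split into lines.
pub struct PendingMessage {
    text: String,
    cached_lines: Option<Vec<String>>,
}

impl PendingMessage {
    pub fn new() -> Self {
        PendingMessage { text: String::new(), cached_lines: None }
    }

    /// Append a streamed token and drop the cache so the next render re-splits.
    /// Leaving out the reset reproduces the bug class: the view keeps showing
    /// whatever the line looked like when it was first cached.
    pub fn push_delta(&mut self, delta: &str) {
        self.text.push_str(delta);
        self.cached_lines = None;
    }

    /// Return the message as lines, rebuilding the cache only when needed.
    pub fn lines(&mut self) -> &[String] {
        if self.cached_lines.is_none() {
            let lines: Vec<String> = self.text.lines().map(str::to_owned).collect();
            self.cached_lines = Some(lines);
        }
        self.cached_lines.as_deref().unwrap()
    }
}
```

If the first delta is `"## "` and the cache were never reset, scrollback would keep rendering just `## ` for that line, which matches the symptom described above.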
An AI session in one tab. A live SSH terminal to the machine you’re working on in the next tab. Tell the agent to draft a config change, watch it land in real time, paste the test command into the SSH tab. No context switch, no separate terminal app, no copy-paste between windows.
/ssh modal form
Type /ssh host and a modal opens with fields for host, port, user, auth method, key path, passphrase, and a save-as-alias field. Default key root is ~/.ssh. Ctrl+F opens a bare-name resolver that autocompletes your existing key files.
Vault-encrypted aliases
Saved connections go into the encrypted vault as HostCredential aliases — AES-256-GCM, Argon2id KDF. Recall any host by name. Passphrases stay encrypted at rest. Per-project scopes supported.
russh + vt100 rendering
Each SSH tab runs a real russh session with full vt100 terminal emulation. Ctrl+B prefix keys (tmux-style: digit to jump tabs, q to close). Resize is honored. Colors and box-drawing render correctly.
Session continuity
anvil --continue now honors the model saved in .meta.json, so Ollama sessions reconnect to Ollama instead of failing on default auth. On exit, the last line shows the session ID/name with the exact anvil --resume <name> command to paste. Never lose context.
2,379+
Workspace tests passing across every crate. Zero failures. Zero warnings. Verified on each v2.2.19 release-pipeline run, including a real-network wiremock integration test for Ctrl+C stream cancel and a cross-binary clippy lint preventing `println!` from corrupting the alt-screen.
~22–27 MB
Single static binary. No runtime, no Node, no Python, no install prereqs. Drop it on a server and run.
7 platforms
macOS ARM64, macOS Intel, Linux x86_64, Linux ARM64, Windows x86_64 — plus FreeBSD x86_64 and NetBSD x86_64 as of v2.2.16, all carried forward in v2.2.19. Every binary SHA256-verified and signed by the release pipeline.
300+ commits
Since v2.2.5 — the unified arc that became v2.2.13, v2.2.15, v2.2.16, and now v2.2.19. Every commit ships under a 3,069-test gate. Verified on each release-pipeline run.
Anthropic
claude-opus-4-7, claude-sonnet-4-6, claude-haiku-4-5 — full model family with OAuth and API key support.
OpenAI
GPT-5, o3, o4-mini — all current OpenAI models. Bring your own key.
Google Gemini
Gemini 2.5 Pro, Flash — via OpenAI-compatible endpoint.
Ollama (Local)
Run any model locally — Llama, Qwen, Mistral, DeepSeek, Phi. Zero cost, full privacy, air-gap ready. Hardware-aware tuning built in.
xAI Grok
Grok-3, Grok-3-mini — xAI’s latest models with competitive pricing.
Ollama Cloud
kimi-k2.6:cloud, gpt-oss:120b-cloud — routed through the local Ollama daemon via ed25519 device key. No separate account; your daemon handles auth.
AWS Bedrock
Anthropic, Meta, Mistral and Amazon models served through AWS — signs requests with your existing AWS_ACCESS_KEY_ID credentials.
GitHub Copilot
Reuse your Copilot subscription for GPT-4o, Claude and o-series models — authenticated via GITHUB_TOKEN.
Azure OpenAI
OpenAI frontier models through your Azure deployment — uses your AZURE_OPENAI_ENDPOINT and per-tenant deployment names.
Cursor
Cursor subscription routed at the API layer — bring your CURSOR_API_KEY and use it from the terminal.
Antigravity
Google Code Assist via OAuth — Gemini family routed through the Code Assist endpoint, no static API key required.
Alibaba DashScope
Qwen3, Qwen2.5 and DashScope-hosted models — OpenAI-compatible mode with your DASHSCOPE_API_KEY.
Groq
Llama, Qwen, Mixtral and Kimi at sub-second latencies on Groq LPU hardware — bring your own key.
Fireworks AI
Production inference for Llama, Qwen, DeepSeek and Mixtral — fast cold-starts, function calling, JSON mode.
Mistral AI
Mistral Large, Codestral and Devstral served direct from Mistral — bring your MISTRAL_API_KEY.
Perplexity
Sonar and online-search-augmented models with built-in web grounding — bring your PERPLEXITY_API_KEY.
DeepSeek
DeepSeek-V3 and DeepSeek-R1 served direct from DeepSeek — strong reasoning and coding at low per-token cost.
Together AI
Llama, Qwen, DeepSeek and 100+ open-weight models on Together’s inference fleet — bring your own key.
DeepInfra
OpenAI-compatible hosting for Llama, Qwen, Mixtral and other open-weight models — pay-per-token, no minimums.
Cerebras
Llama and Qwen on Cerebras wafer-scale silicon — the fastest tokens-per-second of any provider on the list.
NVIDIA NIM
NVIDIA Inference Microservices for Llama, Nemotron, Mixtral and Mistral — bring your NVIDIA_API_KEY.
HuggingFace
Inference API for thousands of open-weight models — authenticate with your HF_TOKEN.
Moonshot AI
Kimi K2 and Moonshot V1 served direct from Moonshot: long-context, Chinese-English bilingual frontier models.
Nebius
Nebius AI Studio — OpenAI-compatible serving for Llama, Qwen, DeepSeek and Mixtral on Nebius infrastructure.
Scaleway
Scaleway Generative APIs — Llama, Mistral and Qwen hosted in EU data centres.
STACKIT
STACKIT Model Serving — OpenAI-compatible endpoint for sovereign-EU model hosting on Schwarz Group infrastructure.
Baseten
Custom and open-weight model deployments on Baseten — OpenAI-compatible inference for production workloads.
Cortecs
Cortecs inference platform — OpenAI-compatible serving for open-weight LLMs with usage-based pricing.
302.AI
302.AI aggregator — one API key, hundreds of frontier and open-weight models behind an OpenAI-compatible facade.
Zai
GLM-4 and Kimi family served by Z.ai — recognised by the ‘kimi’ and ‘glm’ slugs in the model picker.
OpenRouter
One key for hundreds of models from every major vendor — pay-as-you-go routing with automatic fallback.
LM Studio
Run any model on your own machine via the LM Studio desktop app — OpenAI-compatible local server, no auth required.
Chutes
Chutes serverless GPU inference — OpenAI-compatible endpoint for open-weight LLMs at competitive token rates.
MiniMax
MiniMax abab and M1 frontier models: long-context, Chinese-English bilingual, served direct from MiniMax.
OpenCode
Community OpenCode inference endpoint — OpenAI-compatible serving for coding-tuned open-weight models.
OpenCode-Go
Community OpenCode-Go endpoint — mirrors the OpenCode shape on a separate inference fleet.
Type /provider list in Anvil for the full TAB-completable picker.
MCP integration
Full Model Context Protocol support. Auto-discover MCP tools at startup. /mcp list for live status. Plug in any MCP server.
Per-tab parallel inference
Each tab is an independent runtime — own model, own provider, own conversation. Fire prompts concurrently. Tab bar shows unread and pending-permission markers live.
SSH tabs
/ssh host opens a vt100 terminal tab via russh, right next to your AI sessions. Connections saved to the encrypted vault as named aliases.
90+ slash commands
Deep hierarchical autocomplete. /vault, /mcp, /fork, /share, /daily, /ollama — full palette for AI-assisted development.
37-widget status line
16 presets. Interactive visual editor in TUI and web. Drag-and-drop. Per-widget category colors. Build your perfect status bar.
7 agent types
Multi-agent orchestration for complex workflows. Spawn background agents, track progress, review output — with a reviewer-agent approval gate.
AnvilHub marketplace
Skills, plugins, agents, and themes. Install from the TUI or the web viewer. Free to use, free to contribute.
2.2.19 — Parallel and Transparent
Per-tab parallel inference. Tool-call cards with Ctrl+O expand. SSH tabs. Mid-turn TUI responsiveness. Session continuity fixes. Scrollback correctness. First-run wizard. anvil(1) manpage. 318 tests passing, ~22 MB binary.
Own your workflow
One binary. 35 providers. No lock-in. No telemetry. No account required. Free and open to use.
