OmniContext CLI is a terminal-native coding assistant that treats context as a first-class resource. Lean system prompts keep overhead low. Specialist delegation routes grunt work to cheaper models while keeping your main context clean. Zero telemetry means your code never leaves your machine. And it extends into VS Code, Office, the browser, and Figma.
$ npm install -g omni-context-cli && omx
╔═╗┌┬┐┌┐┌┬ ╔═╗┌─┐┌┐┌┌┬┐┌─┐─┐ ┬┌┬┐ ╔═╗╦ ╦
║ ║│││││││ ║ │ ││││ │ ├┤ ┌┴┬┘ │ ║ ║ ║
╚═╝┴ ┴┘└┘┴ ╚═╝└─┘┘└┘ ┴ └─┘┴ └─ ┴ ╚═╝╩═╝╩
OmniContext CLI. Tell Omx what you want to do.
Traditional assistants call basic tools one at a time, resending your entire context with every round. OmniContext CLI delegates multi-step operations to agentic sub-agents running on a cheaper model -- your expensive model stays focused on reasoning, not file I/O.
> "handleAuth"

glob("src/**/*.ts")
grep("handleAuth", ...)
read("src/middleware/auth.ts")
read("src/routes/login.ts")
read("src/services/auth.ts", 40-90)
pluck("handleAuth definition")

Each tool runs as an autonomous sub-agent on a cheaper model. It handles file I/O, error recovery, and retries internally -- keeping intermediate output out of your main context and your token bill down. Tip: start with glance and slice when exploring a codebase -- they're faster than hunting file by file.
Survey project architecture. Understands directory layout, key files, and how the codebase is organized.
Run shell commands with automatic error detection and retry. Handles build failures and install issues.
Edit files with surgical precision. Finds the right location, makes the change, and validates the result.
Write entire files from scratch with auto-validation. Handles formatting and structure automatically.
Find files matching complex criteria. Searches by name, content, or structure across your project.
Extract specific code segments from any file. Pulls functions, classes, or blocks you need.
Trace symbol references across your codebase. Finds every usage of a function, variable, or type.
Answer targeted code questions. Reads only the relevant parts to give you focused answers.
Research any topic via web search. Finds documentation, examples, and solutions from across the internet.
Preview multiple files at once with brief summaries. Quickly understand what you are working with.
Switch how OmniContext CLI behaves with a single command. Each preset changes the tools available, the system prompt, and the response style.
Your main model reasons, a cheaper agent model executes. Agentic tools keep the cheap model out of decisions. Fewer rounds, cleaner context, lower cost.
Research-first mode. Launches multiple web searches before answering. Great for current events, docs, and fact-checking.
Visual-first responses. Prioritizes image generation when the model supports it. Ideal for design exploration and mockups.
Personal assistant for app integrations. Controls browser tabs, Office documents, and Figma designs through natural language.
Basic tools with manual orchestration. Direct read, write, edit, and bash access. Full control, no abstraction.
Most tools funnel everything through a single API format and hope for the best. OmniContext CLI has a dedicated request builder and stream handler for each protocol. Prompt caching, extended thinking, and provider-specific features work exactly as the vendor intended -- no lossy translation layer in between.
Every API call resends your full conversation history. Fewer rounds means fewer cache reads. Cleaner context means fewer tokens written. Specialist mode cuts both -- and offloads the grunt work to a cheaper model.
Traditional tools need 5 rounds to find a function definition. Specialist mode does it in 1. That is 4 fewer full-context resends -- saving cache read costs on every skipped round.
Basic tools dump ~10KB of intermediate output into your conversation. Agentic tools return only the final result. Context editing automatically trims old tool payloads and thinking blocks, keeping growth in check even over long sessions.
Sub-agents run on a low-cost model (e.g. GLM-5) while your main model (e.g. Claude Opus 4.6) handles only planning and decisions. The expensive model never does file I/O.
The default 5-minute prompt cache expires if you pause to think. Switch to 1-hour in preferences for debugging, refactoring, or research -- it eliminates repeated cache rebuilds across a session.
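A preferences entry for the longer TTL might look like the following. This is a sketch only -- the file location and field names are assumptions, not OmniContext CLI's documented schema:

```json
{
  "promptCache": {
    "ttl": "1h"
  }
}
```

With the 1-hour TTL, a pause for code review or a coffee break no longer forces a full cache rewrite on the next request.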
| Metric | Traditional | Specialist | Saved |
|---|---|---|---|
| API rounds | 5 | 1 | -4 rounds |
| Cache read per round | ~20K tokens x 5 | ~20K tokens x 1 | -80K tokens |
| New context added | ~10KB | ~3KB | -70% |
| Cache write (new tokens) | ~2.5K tokens | ~1K tokens | -60% |
| Execution model | Opus 4.6 only | Opus 4.6 + GLM-5 | ~30% cheaper |
Based on a 20K-token conversation finding a function across a TypeScript project. Actual savings depend on project size and model pricing.
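The table's arithmetic can be sketched directly. The numbers below are the illustrative figures from the example above, not measured values:

```python
# Illustrative cost arithmetic for the example above
# (assumed numbers, not measurements).
context_tokens = 20_000        # conversation size re-read on every round
traditional_rounds = 5
specialist_rounds = 1

# Every API round re-reads the full cached context,
# so each skipped round skips one full cache read.
saved_rounds = traditional_rounds - specialist_rounds
saved_cache_reads = saved_rounds * context_tokens
print(saved_rounds)        # rounds avoided: 4
print(saved_cache_reads)   # cached tokens not re-read: 80000

# New tokens written into context per approach.
traditional_new, specialist_new = 2_500, 1_000
reduction = 1 - specialist_new / traditional_new
print(f"{reduction:.0%}")  # cache writes avoided: 60%
```

The same shape applies to any task: the savings scale with how large the conversation already is, since that whole history rides along on every round.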
OmniContext CLI ships with built-in provider presets. Pick one, paste your API key, and every model from that service is ready to use.
# List available providers
$ omx --list-providers
# Add all Zenmux models in one go
$ omx --add-provider zenmux --api-key zmx-...
Added: Zenmux Anthropic (Claude Sonnet 4)
Added: Zenmux Anthropic (Claude Haiku)
Added: Zenmux Gemini (Gemini 2.5 Flash)
Added: Zenmux OpenAI (GPT-4o)
...
# Remove a provider just as easily
$ omx --remove-provider zenmux
OmniContext CLI remembers your coding style, project patterns, and past mistakes across sessions. Key points are scored over time -- helpful insights stick around, irrelevant ones decay.
"This project uses TypeScript strict mode with path aliases configured in tsconfig"
"API routes follow REST conventions in src/routes/ with Zod validation"
"Uses Webpack for bundling" (decaying -- will be removed at -5)
Terminal is home base, but OmniContext CLI reaches into every tool you use. One AI, consistent context, zero context switching.
Full IDE integration with file context, diagnostics, and diff views. OmniContext CLI sees what you see in the editor.
GUI for the CLI. Acts as the local hub connecting Office, browser, and Figma extensions.
Sidebar on any webpage. Summarize, extract data, run scripts, and automate browser tasks.
AI panel inside Word, Excel, and PowerPoint. Create budgets, format docs, and design slides.
Inspect layouts, create shapes, modify nodes, and export assets through the chat panel.
Works as an external agent via Agent Client Protocol. Full tool access inside Zed's agent panel.
Browser UI with LaTeX, Mermaid diagrams, file attachments, and drag-and-drop support.
Run omx --serve and connect from your phone. Code reviews from the couch.
Custom agents, skills, slash commands, and MCP servers. Everything is a markdown file or JSON config.
Write a markdown file with a prompt template and tool permissions. It becomes a new agentic tool instantly. Add OMX-AGENTS.md for global agent instructions.
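As a sketch, a custom agent file might look like this. The frontmatter field names and file location are assumptions for illustration, not the documented schema:

```markdown
---
name: changelog-writer
description: Summarize recent changes for release notes
tools: [glob, pluck, write]   # hypothetical permission list
---
You summarize code changes into changelog entries.
Group entries by feature, fix, and breaking change.
```

The prompt body becomes the sub-agent's instructions, and the tool list bounds what it can touch.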
Teach OmniContext CLI domain-specific knowledge and workflows. Skills inject instructions into the current conversation.
Create shortcuts for common prompts. Type /review and your custom prompt fires with Handlebars templating.
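For illustration, a /review command could be a file like the one below, with Handlebars placeholders filled in at invocation time (the filename, location, and frontmatter are assumptions):

```markdown
---
command: review
---
Review {{file}} for correctness and style.
Pay particular attention to {{focus}}.
```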
Connect external tools and data sources via Model Context Protocol. Stdio and HTTP transports supported.
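A stdio server entry might look like the following sketch. The top-level key and field names are assumptions about OmniContext CLI's config format, though the referenced MCP server package is real:

```json
{
  "mcpServers": {
    "github": {
      "transport": "stdio",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"]
    }
  }
}
```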
Minimal, focused instructions and concise tool descriptions. Your tokens go toward actual work, not bloated framework overhead.
No usage tracking, no analytics, no data collection. Your code and conversations never leave your machine.
Automatically trims old tool call payloads and thinking blocks from your conversation history. Keeps token usage lean in long sessions.
Enable deeper reasoning for complex tasks. The model thinks step by step before responding, with configurable budget limits.
Already have a CLAUDE.md in your repo? OmniContext CLI reads it automatically, right alongside OMX.md. Zero-friction migration.
When context hits 80% capacity, the conversation is compacted, key memories are extracted, and a fresh session picks up where you left off.
Automatic cache control for Anthropic and Gemini. Custom TTL settings (5 min or 1 hour) keep frequently used context cached and costs down.
Drop an OMX.md in your repo root. Everyone on the team gets the same conventions and context. Also reads CLAUDE.md for easy migration.
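An OMX.md is just plain markdown instructions. A minimal illustrative example, echoing the kinds of conventions shown in the memory section above:

```markdown
# Project conventions
- TypeScript strict mode; path aliases live in tsconfig.json
- API routes go in src/routes/ and use Zod validation
- Run the test suite before proposing a commit
```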
One command. Zero config. Bring your own API key.
$ npm install -g omni-context-cli && omx