What is DeepSeek-TUI?

It is an MIT-licensed terminal coding agent (Rust binaries deepseek + deepseek-tui) built around DeepSeek V4 models—deepseek-v4-pro and deepseek-v4-flash—with streaming thinking blocks, tool use, git, web, sub-agents, MCP servers, session resume, and cost tracking. Repository: https://github.com/Hmbown/DeepSeek-TUI — explicitly not affiliated with DeepSeek Inc.

README documents npm install -g deepseek-tui (downloads prebuilt binaries), cargo install for both crates, Homebrew tap on macOS, Scoop on Windows, or GitHub Releases. First run prompts for a DeepSeek API key into ~/.deepseek/config.toml; deepseek doctor verifies setup.

With --model auto or /model auto, the app runs a small deepseek-v4-flash routing call (thinking off) to pick a concrete model (Flash vs Pro) and thinking level per turn before the real request, and never sends model:auto to the upstream API. Per the README, failures fall back to local heuristics.

Does it support MCP and skills?

Yes—MCP integration is documented in docs/MCP.md. Skills load from standard skill directories (.cursor/skills, .claude/skills, ~/.deepseek/skills, etc.) with SKILL.md manifests; /skill install can pull from GitHub without a backend service.

Can I use non-DeepSeek backends?

The README lists NVIDIA NIM, Fireworks, self-hosted SGLang, and vLLM as providers, selected via flags and environment variables such as VLLM_BASE_URL.

How does this relate to explainx.ai DeepSeek posts?

Our API and pricing articles describe deepseek-v4-* models and economics; DeepSeek-TUI is a concrete harness that consumes those APIs with approval gates, compaction, and IDE-adjacent workflows (including Zed ACP via deepseek serve --acp, early limitations noted upstream).

DeepSeek-TUI: terminal coding agent for DeepSeek V4 | explainx.ai Blog


DeepSeek-TUI (Hmbown) is a Rust terminal coding agent built around DeepSeek V4 (deepseek-v4-pro / deepseek-v4-flash): streaming thinking, tools (files, shell, git, web, MCP, sub-agents), session checkpoints, and cost telemetry including prefix-cache hints. The README disclaims affiliation with DeepSeek Inc.

It appeared on GitHub Trending in early May 2026; facts here come from the upstream README (v0.8.x era).

TL;DR

| Topic | Takeaway |
| --- | --- |
| Binaries | `deepseek` dispatcher → `deepseek-tui` (ratatui UI) |
| Models | V4 default; auto routes Flash vs Pro + thinking per turn |
| Modes | Plan (read-only explore), Agent (approval gates), YOLO (auto-approve) |
| Extras | MCP, skills from GitHub, HTTP/SSE `deepseek serve`, RLM batch helper, LSP diagnostics hooks |
| Install | `npm i -g deepseek-tui` or `cargo install` / Homebrew / Releases |
| License | MIT |

Why it matters next to "just use the API"

When DeepSeek released V4 Pro and Flash in early 2026, the raw API story was compelling: state-of-the-art coding and reasoning capabilities at a fraction of the cost of GPT-4 or Claude. But raw APIs leave critical questions unanswered:

How do you gate risky operations? An LLM with file-write and shell-exec tools can wreck a repository in seconds if it misunderstands context or hallucinates a destructive command. Production teams need approval workflows, not just API keys.

How do you survive context overflow? Even with million-token windows, long agent sessions accumulate transcripts that exceed limits. Naive implementations fail silently or lose critical state when compaction kicks in.

How do you track costs? DeepSeek's aggressive pricing (orders of magnitude cheaper than alternatives) only matters if you can attribute spend to specific tasks, understand cache-hit economics, and forecast bills before they arrive.

How do you maintain session continuity? Terminal windows close, SSH connections drop, laptops suspend. Stateless API wrappers start from scratch every time. Real workflows need durable task queues and resumable sessions.

DeepSeek-TUI addresses these operational gaps with harness patterns that experienced teams eventually build anyway:

Side-git snapshots for rollback without touching your repo .git—the agent can experiment freely while you maintain a clean undo path
Durable task queue across restarts—close your terminal, resume tomorrow, the context and pending tasks persist
Reasoning-effort cycling (Shift+Tab)—dynamically adjust how hard the model thinks based on task complexity and budget
1M-token awareness with compaction controls and cache-hit accounting—see what is getting dropped, control summarization, understand prompt-cache economics

That matches the scaffold story in our agent harness article—here aimed at DeepSeek APIs and compatible hosts like vLLM, SGLang, and NVIDIA NIM.

Installation: npm for convenience, cargo for source

The repository offers multiple install paths to meet users where they are:

npm (recommended for quick starts):

bash

npm install -g deepseek-tui

This downloads prebuilt binaries for macOS (x64/ARM), Linux (x64/ARM), and Windows (x64). The npm wrapper automatically selects the correct binary for your platform and places it in your PATH as deepseek and deepseek-tui.

Cargo (for Rust developers):

bash

cargo install deepseek deepseek-tui

Compiles from source, useful if you are on a platform without prebuilt binaries or want to modify the code. Requires Rust toolchain 1.70+.

Homebrew (macOS):

bash

brew tap Hmbown/deepseek-tui
brew install deepseek-tui

Scoop (Windows):

powershell

scoop bucket add deepseek-tui https://github.com/Hmbown/scoop-bucket
scoop install deepseek-tui

GitHub Releases: Download platform-specific binaries directly from the Releases page if package managers are not an option.

After installation, run deepseek doctor to verify setup. First launch prompts for a DeepSeek API key, stored in ~/.deepseek/config.toml. The config file supports multiple profiles for different API endpoints, keys, and default models.

Architecture: dispatcher pattern and terminal UI

DeepSeek-TUI uses a two-binary architecture:

deepseek (dispatcher): The main CLI entry point. Handles command parsing, configuration management, API credential loading, and dispatches to subcommands or the TUI. Think of it as the control plane.

deepseek-tui (terminal UI): The ratatui-based interactive interface. Renders streaming responses with syntax highlighting, shows thinking blocks in real time, manages approval dialogs, and handles keyboard shortcuts. This is the data plane where you spend your time.

The separation means you can use deepseek in headless scripts and CI pipelines (deepseek run --file task.md --mode yolo) while still having the rich TUI available for interactive work (deepseek-tui or just deepseek with no subcommand).

The TUI is built on ratatui, a modern terminal UI library for Rust that provides:

Declarative layouts with flexbox-style composition
Incremental rendering so streaming tokens appear instantly
Widget composition for panels, lists, syntax-highlighted code blocks
Event-driven keyboard/mouse input, handled through a terminal backend such as crossterm (ratatui itself only renders)

This is not an Electron app or a web view inside a terminal emulator: it writes native terminal control sequences, which keeps it responsive on remote servers and low-bandwidth connections.

Model routing: auto mode and thinking levels

Auto mode is DeepSeek-TUI's answer to "which model should I use for this task?"

When you set --model auto or /model auto in the TUI, the tool does not pass "auto" to the DeepSeek API (which does not support it). Instead:

The harness runs a small deepseek-v4-flash request (thinking disabled) with a routing prompt that analyzes the user task
The routing model decides whether the task needs Flash (fast, cheap, good for simple edits) or Pro (slow, expensive, better for complex reasoning)
The routing model also selects a thinking level (0-3 in DeepSeek's API schema)
The harness issues the real request with the concrete model and thinking setting
On failure, fallback heuristics kick in: use Pro for tasks mentioning "refactor" or "architecture", Flash for "fix typo" or "add comment"
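The two-phase flow above can be sketched in a few lines of Python. This is an illustration rather than the project's Rust code: `Route`, `heuristic_route`, and `pick_route` are hypothetical names, the `router` callable stands in for the Flash routing request, and the thinking levels chosen by the fallback are assumptions. Only the keyword examples come from the text.

```python
from dataclasses import dataclass
from typing import Callable, Optional

FLASH = "deepseek-v4-flash"
PRO = "deepseek-v4-pro"


@dataclass(frozen=True)
class Route:
    model: str
    thinking: int  # 0-3, following the levels described in this article


def heuristic_route(task: str) -> Route:
    """Local fallback using keyword rules (thinking levels are assumed)."""
    text = task.lower()
    if any(keyword in text for keyword in ("refactor", "architecture")):
        return Route(PRO, 2)
    # "fix typo", "add comment", and anything unrecognized go to Flash.
    return Route(FLASH, 0)


def pick_route(task: str, router: Optional[Callable[[str], Route]] = None) -> Route:
    """Ask the router first; fall back to heuristics if it fails or misbehaves."""
    if router is not None:
        try:
            route = router(task)
            # Never forward a non-concrete model such as "auto" upstream.
            if route.model in (FLASH, PRO) and 0 <= route.thinking <= 3:
                return route
        except Exception:
            pass  # network error, malformed routing reply, and so on
    return heuristic_route(task)
```

Validating the router's answer before using it is what keeps "auto" from ever reaching the upstream API in this sketch.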

This two-phase approach costs one extra Flash call per turn but can save significant money. Example: a simple "add logging to this function" task might cost $0.001 with Flash auto-routed, vs $0.015 with Pro always-on. Over hundreds of tasks, the routing overhead pays for itself many times over.

Thinking levels control how much internal reasoning the model exposes:

0: No thinking, just output (fastest, cheapest, works for straightforward tasks)
1: Brief thinking (model shows a few sentences of reasoning)
2: Moderate thinking (paragraph-scale internal monologue)
3: Deep thinking (multi-paragraph reasoning, useful for debugging complex logic)

You can cycle through levels with Shift+Tab during a session. Watch the token counters: thinking tokens count toward your bill and context window, so level 3 on a simple task wastes budget.

Operating modes: Plan, Agent, YOLO

Plan mode is read-only exploration:

The agent can read files, run git diff, search codebases, query documentation
No write operations allowed: no file edits, no git commit, no rm -rf
Useful for understanding a new codebase, debugging without risk of changes, or drafting a proposal

Use Plan mode when you want an AI assistant to explain what a project does, identify where a bug might be, or suggest an implementation approach—without touching anything.

Agent mode (default) adds approval gates:

The agent proposes edits, shell commands, git operations
You see a preview with syntax-highlighted diffs
You approve (y), reject (n), or edit the proposal before execution
Each action is logged to session history for audit and rollback

This is the sweet spot for most development work: the agent does the heavy lifting (write boilerplate, refactor functions, generate tests) while you maintain control over what actually runs.

YOLO mode removes the gates:

The agent executes all proposed actions automatically
Useful for batch tasks, CI/CD pipelines, trusted automation
Dangerous if the agent misunderstands requirements or hallucinates destructive commands

YOLO mode is "just use the API" with session management bolted on. Only use it when you trust the agent, the task is well-scoped, and rollback is easy (e.g. you are working in a disposable Docker container or a git branch you can delete).
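One way to picture the three modes is as a single permission check in front of every tool call. The sketch below is a simplified model, not the project's implementation: the tool names are hypothetical, and it assumes read-only tools never prompt, which the text does not confirm for Agent mode.

```python
from enum import Enum
from typing import Callable


class Mode(Enum):
    PLAN = "plan"
    AGENT = "agent"
    YOLO = "yolo"


# Hypothetical tool names, chosen for illustration only.
READ_ONLY_TOOLS = {"read_file", "git_diff", "search"}


def is_allowed(mode: Mode, tool: str, ask_user: Callable[[str], bool]) -> bool:
    """Decide whether a proposed tool call may run in the given mode."""
    if tool in READ_ONLY_TOOLS:
        return True  # assumption: reads are safe in every mode
    if mode is Mode.PLAN:
        return False  # Plan mode never writes or executes
    if mode is Mode.YOLO:
        return True  # YOLO mode auto-approves everything
    return ask_user(tool)  # Agent mode defers to the human
```

Here `ask_user` would show the diff preview and return the y/n answer; passing it in as a callable keeps the policy testable without a terminal.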

MCP integration: connecting to external tools

Model Context Protocol support is documented in docs/MCP.md in the repository. DeepSeek-TUI acts as an MCP client, connecting to MCP servers that provide tools and resources:

Standard MCP servers work out of the box:

@modelcontextprotocol/server-filesystem — read/write files with access controls
@modelcontextprotocol/server-postgres — query databases
@modelcontextprotocol/server-github — search issues, read PRs, post comments
@modelcontextprotocol/server-brave-search — web search
@modelcontextprotocol/server-puppeteer — browser automation

Configuration example (from docs):

toml

# ~/.deepseek/config.toml
[[mcp_servers]]
name = "filesystem"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/projects"]

[[mcp_servers]]
name = "postgres"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-postgres"]
env = { DATABASE_URL = "postgresql://localhost/mydb" }

When the TUI starts, it launches each MCP server as a subprocess and communicates via stdio. The agent sees MCP tools in its tool list alongside built-in capabilities like file-edit and shell-exec. Natural language requests route to the appropriate tool: "search GitHub issues for label:bug" triggers the GitHub MCP server, "query the users table" hits Postgres.
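For readers unfamiliar with the wire format, MCP's stdio transport carries JSON-RPC 2.0 messages, one per line. The Python sketch below builds two such messages. It is not taken from DeepSeek-TUI; the helper name and the `query` tool with its `sql` argument are illustrative assumptions, and a real client performs an `initialize` handshake before these calls.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # JSON-RPC request ids must be unique per connection


def jsonrpc_request(method: str, params: Optional[dict] = None) -> str:
    """Encode one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    # json.dumps emits no raw newlines, so each message stays on one line.
    return json.dumps(message) + "\n"


# Ask the server which tools it offers.
list_tools = jsonrpc_request("tools/list")

# Invoke one tool by name (tool name and arguments are illustrative).
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "query", "arguments": {"sql": "SELECT 1"}},
)
```

A client would write these lines to the server subprocess's stdin and read responses, matched by `id`, from its stdout.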

This architecture means you can extend DeepSeek-TUI with domain-specific tools (internal APIs, proprietary data sources, compliance checks) without forking the core codebase—just write an MCP server and configure it.

Skills: portable instructions from GitHub

The repository documents a skills system inspired by Cursor and Claude Code:

What are skills? Markdown files (SKILL.md) that contain structured instructions for common tasks. Example structure:

markdown

# Skill: Add API Endpoint

## Goal
Create a new REST API endpoint in our FastAPI application

## Steps
1. Define request/response models in `app/schemas.py`
2. Implement handler in `app/routers/`
3. Add tests in `tests/test_api.py`
4. Update OpenAPI docs

## Constraints
- Follow existing error handling patterns
- Include request validation
- Add rate limiting decorators

Loading skills: /skill install can pull from GitHub without a backend service:

snippet

/skill install username/repo/path/to/skill.md

DeepSeek-TUI downloads the skill, caches it locally in ~/.deepseek/skills/, and makes it available as a /skill run <name> command. When you invoke a skill, the TUI injects the skill instructions into the system prompt so the agent follows your team's patterns automatically.

Standard skill directories:

.cursor/skills/ (Cursor compatibility)
.claude/skills/ (Claude Code compatibility)
~/.deepseek/skills/ (user global)
./.deepseek/skills/ (project local)

The cross-compatible path means skills you write for DeepSeek-TUI work in Cursor and vice versa (subject to tool availability differences). This is the "portable agent instructions" layer our agent skills guide discusses.

Economics (verify live)

The README embeds DeepSeek per-1M cache hit/miss tables and notes time-limited discounts through 31 May 2026 UTC for Pro—reconcile with official pricing.

As of the README snapshot, approximate DeepSeek V4 pricing (per million tokens):

deepseek-v4-flash:

Input: $0.14/1M tokens
Output: $0.28/1M tokens
Cache hit: $0.014/1M tokens (90% discount on cached input)

deepseek-v4-pro (with temporary discount):

Input: $0.55/1M tokens (normally $2.19)
Output: $2.19/1M tokens (normally $8.75)
Cache hit: $0.055/1M tokens

Why cache hits matter: DeepSeek-TUI sends your codebase context, tool definitions, and system prompts in every request. With prompt caching, the API recognizes unchanged content and bills at cache-hit rates. On long sessions, 95%+ of your input tokens can be cache hits.

Example input-cost breakdown for a 1000-turn agent session (typical for implementing a medium feature), using the discounted Pro rates above and excluding output tokens:

Without caching: 500M input tokens × $0.55 per 1M = $275
With caching (95% hit rate): 25M cache-miss tokens × $0.55 per 1M + 475M cache-hit tokens × $0.055 per 1M = $13.75 + $26.13 = $39.88

The TUI shows cache-hit percentages in the status bar so you can see economics in real time. If cache hits drop unexpectedly, it usually means you are editing files the agent is reading, causing cache invalidation.
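The arithmetic behind that breakdown is easy to rerun with your own numbers. The Python helper below is a hypothetical sketch that covers input tokens only; the prices are the README-snapshot Pro rates quoted above and should be replaced with live pricing.

```python
def input_cost_usd(
    total_tokens: int,
    hit_rate: float,
    miss_price_per_m: float,
    hit_price_per_m: float,
) -> float:
    """Input-token cost in USD for a given cache-hit rate (0.0 to 1.0)."""
    hits = total_tokens * hit_rate
    misses = total_tokens - hits
    return (misses * miss_price_per_m + hits * hit_price_per_m) / 1_000_000


TOTAL_INPUT_TOKENS = 500_000_000  # the 1000-turn session from the example

no_cache = input_cost_usd(TOTAL_INPUT_TOKENS, 0.0, 0.55, 0.055)
with_cache = input_cost_usd(TOTAL_INPUT_TOKENS, 0.95, 0.55, 0.055)

print(f"no cache: ${no_cache:,.2f}")
print(f"95% cache hits: ${with_cache:,.2f}")
```

Lowering `hit_rate` shows how quickly the bill climbs when cache hits degrade, which is why the status-bar percentage is worth watching.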

Session management and persistence

Session snapshots: Every N turns (configurable), DeepSeek-TUI writes session state to ~/.deepseek/sessions/<id>.json:

Full message history
Tool call results
Pending tasks
File modification log
Cost tracking

You can /save manually to checkpoint before risky operations. If the TUI crashes or you kill the terminal, /resume <id> picks up where you left off. This is critical for long-running tasks: implementing a feature might span hours across multiple SSH disconnects.
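Checkpointing only helps if a crash mid-write cannot corrupt the file. The sketch below shows the usual write-then-rename pattern in Python. The on-disk schema and function names are hypothetical; the article specifies only the path pattern and the kinds of data stored, not how the tool writes them.

```python
import json
import os
import tempfile
from pathlib import Path


def save_session(directory: Path, session_id: str, state: dict) -> Path:
    """Write session state atomically to <directory>/<session_id>.json."""
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / f"{session_id}.json"
    # Write to a temp file in the same directory, then rename over the
    # target, so readers never observe a half-written checkpoint.
    fd, tmp_name = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle)
        handle.flush()
        os.fsync(handle.fileno())
    os.replace(tmp_name, target)
    return target


def load_session(directory: Path, session_id: str) -> dict:
    """Read a checkpoint back, as a resume command would need to."""
    path = directory / f"{session_id}.json"
    return json.loads(path.read_text(encoding="utf-8"))
```

The temp file lives in the same directory as the target because `os.replace` is only atomic within a single filesystem.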

Side-git snapshots: When YOLO mode or approved Agent actions modify files, DeepSeek-TUI can maintain a parallel git history in .deepseek-snapshots/:

Each file write triggers a git commit with the tool call as the message
Your main .git stays clean
Roll back with deepseek restore <snapshot-id> without polluting real git history

This separation means you can experiment aggressively (let the agent try three refactoring approaches) and only promote successful changes to your actual repository.
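Git itself supports this kind of parallel history through its `--git-dir` and `--work-tree` options, the same trick often used for tracking dotfiles. The Python sketch below only builds the commands. Whether DeepSeek-TUI does it this way is an assumption, and `snapshot_commands` is a hypothetical helper.

```python
from pathlib import Path


def snapshot_commands(work_tree: Path, snapshot_dir: Path, message: str) -> list[list[str]]:
    """Build git invocations that commit work_tree into a separate repository.

    Assumes snapshot_dir was created once with `git init --bare <snapshot_dir>`.
    If snapshot_dir sits inside the work tree, it should be excluded from
    staging; how the real tool handles that is not stated in the source.
    """
    base = ["git", f"--git-dir={snapshot_dir}", f"--work-tree={work_tree}"]
    return [
        base + ["add", "--all"],  # stage the current state of the tree
        base + ["commit", "--allow-empty", "-m", message],  # record it
    ]


# Each command list could be run with subprocess.run(cmd, check=True).
commands = snapshot_commands(
    Path("/repo"), Path("/repo/.deepseek-snapshots"), "write_file app.py"
)
```

Because every command names its own git directory explicitly, the project's real `.git` is never read or written.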

Alternative backends: vLLM, SGLang, NVIDIA NIM

DeepSeek-TUI is not hardcoded to DeepSeek's API. The README documents compatibility with:

vLLM: Self-hosted inference with DeepSeek weights downloaded from HuggingFace:

bash

export VLLM_BASE_URL=http://localhost:8000/v1
deepseek --model deepseek-v4-flash

SGLang: Faster inference engine for long context:

bash

export SGLANG_BASE_URL=http://localhost:7501/v1

NVIDIA NIM: Enterprise-grade deployment on NVIDIA hardware:

bash

deepseek --provider nvidia --model deepseek-v4-pro

Fireworks AI: Managed hosting with auto-scaling:

bash

deepseek --provider fireworks --api-key <key>

This flexibility matters for teams with:

Data residency requirements (run models on-prem)
Latency constraints (colocate inference with your VPC)
Cost optimization (prepaid GPU hours vs pay-per-token APIs)
Custom fine-tunes (host your domain-adapted DeepSeek variant)

The TUI abstracts provider differences: model selection, thinking modes, and tool calling work the same regardless of backend (subject to the provider actually supporting the features).
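Endpoint selection for the self-hosted cases can be sketched as a lookup over those environment variables. The Python below is a sketch, not the tool's implementation: the precedence rules are assumptions, as is the decision to leave hosted providers such as NVIDIA NIM and Fireworks out of scope. The variable names come from the examples above, and the default is DeepSeek's public API base URL.

```python
from typing import Mapping, Optional

DEFAULT_BASE_URL = "https://api.deepseek.com"

# Environment variables shown in the examples above.
ENV_BY_PROVIDER = {"vllm": "VLLM_BASE_URL", "sglang": "SGLANG_BASE_URL"}


def resolve_base_url(env: Mapping[str, str], provider: Optional[str] = None) -> str:
    """Pick an API base URL for self-hosted backends (precedence is assumed)."""
    if provider is None:
        names = list(ENV_BY_PROVIDER.values())  # first variable set wins
    elif provider in ENV_BY_PROVIDER:
        names = [ENV_BY_PROVIDER[provider]]
    else:
        raise ValueError(f"provider {provider!r} is not covered by this sketch")
    for name in names:
        value = env.get(name, "").strip()
        if value:
            return value.rstrip("/")
    return DEFAULT_BASE_URL
```

In a real program you would pass `os.environ` as `env`; taking it as a parameter keeps the function easy to test.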

Advanced features: RLM batch, LSP diagnostics, HTTP serve

RLM batch helper: Run many independent tasks in parallel:

bash

deepseek rlm --input tasks.jsonl --output results.jsonl --concurrency 10

Useful for dataset generation, bulk code migrations, or evaluations. Each line in tasks.jsonl is an independent conversation; results stream to output as they complete.
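The batch pattern described here (independent JSONL tasks, bounded concurrency, results emitted as they finish) looks roughly like the Python sketch below. It is a generic illustration rather than the `rlm` implementation: `handler` stands in for one model conversation, and the output record shape is made up.

```python
import json
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Iterable, Iterator


def run_batch(
    lines: Iterable[str],
    handler: Callable[[dict], dict],
    concurrency: int = 10,
) -> Iterator[str]:
    """Run one handler call per JSONL line; yield result lines as they finish."""
    tasks = [json.loads(line) for line in lines if line.strip()]
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = {pool.submit(handler, task): i for i, task in enumerate(tasks)}
        for future in as_completed(futures):
            index = futures[future]
            try:
                record = {"index": index, "ok": True, "result": future.result()}
            except Exception as exc:
                # One failed task must not abort the rest of the batch.
                record = {"index": index, "ok": False, "error": str(exc)}
            yield json.dumps(record)
```

Results arrive in completion order, so each record carries the index of its input line for later re-sorting.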

LSP diagnostics hooks: When editing code, DeepSeek-TUI can run language servers (typescript-language-server, rust-analyzer, pylsp) and inject compiler errors and warnings into the agent context:

snippet

File: app.ts
Error: Type 'string' is not assignable to type 'number'
  Line 42: const count: number = getUserInput();

The agent sees real type errors, not just your natural-language description. This dramatically improves fix accuracy for compiler-enforced constraints.
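Language servers report problems as structured `Diagnostic` objects with zero-based line numbers and numeric severities. Turning them into the text format shown above might look like the following Python sketch. The function is hypothetical, and the real hook's output format may differ.

```python
# Severity codes defined by the Language Server Protocol.
SEVERITY = {1: "Error", 2: "Warning", 3: "Information", 4: "Hint"}


def format_diagnostics(path: str, diagnostics: list[dict], source_lines: list[str]) -> str:
    """Render LSP diagnostics as plain text for the agent's context."""
    out = [f"File: {path}"]
    for diag in diagnostics:
        line = diag["range"]["start"]["line"]  # zero-based in LSP
        # Severity is optional in LSP; treating a missing one as an
        # error is this sketch's choice.
        label = SEVERITY.get(diag.get("severity", 1), "Error")
        out.append(f"{label}: {diag['message']}")
        if 0 <= line < len(source_lines):
            # Show the offending line with a one-based number for humans.
            out.append(f"  Line {line + 1}: {source_lines[line].strip()}")
    return "\n".join(out)
```

Including the source line alongside the message gives the model the exact text to change, not just a line number that may drift after edits.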

HTTP/SSE serve mode: deepseek serve exposes an HTTP API with Server-Sent Events for streaming:

bash

deepseek serve --port 8080 --acp

The --acp flag enables compatibility with Zed's ACP (Agent Client Protocol) for integration with the Zed editor. This mode is experimental per the README, but it opens the door to custom UIs, web dashboards, and editor plugins that consume DeepSeek-TUI as a backend service.
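A custom client consuming such a stream needs to parse the Server-Sent Events framing: `data:` lines accumulate until a blank line ends the event. The Python sketch below implements a minimal parser for that framing. It assumes nothing about DeepSeek-TUI's endpoints or payload schema, both of which should be taken from the upstream docs.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event.

    Minimal sketch: the event, id, and retry fields are ignored.
    """
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[5:]
            # The format strips one optional leading space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
    # An event without a terminating blank line is incomplete and dropped.
```

Feeding it the lines of a streaming HTTP response body yields one string per event, which the caller can then decode as JSON if the server sends JSON.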

Comparison with alternatives

vs raw API calls: DeepSeek-TUI adds session management, approval workflows, cost tracking, and compaction—essential for production use. Raw API calls are fine for one-off scripts; agent sessions need harness logic.

vs Cursor: Cursor is IDE-native with premium UX and tight editor integration. DeepSeek-TUI is terminal-native, works over SSH, and supports custom MCP servers. Cursor's Pro plan costs $20/month; DeepSeek-TUI itself is free, and you pay only for the API usage on your own key.

vs Aider: Aider (Python) focuses on git-aware code edits with simple approval prompts. DeepSeek-TUI (Rust) adds streaming thinking, MCP, skills, and multi-provider routing. Aider is lighter; DeepSeek-TUI is more feature-complete.

vs Claude Code: Claude Code is Anthropic's official CLI with deep integration into their ecosystem. DeepSeek-TUI targets DeepSeek models and compatible backends. If you are already on Claude, use Claude Code; if you want DeepSeek economics, use DeepSeek-TUI.

Choose based on your workflow: IDE vs terminal, Anthropic vs DeepSeek economics, feature priorities.

Sources

Repository: github.com/Hmbown/DeepSeek-TUI
npm: npmjs.com/package/deepseek-tui
DeepSeek pricing: api-docs.deepseek.com/quick_start/pricing

Releases, provider matrices, and ACP support evolve. Treat this as May 6, 2026 README context.
