What is context-mode?

It is a source-available MCP server (and, on supported hosts, a plugin with hooks) that intercepts high-volume tool results such as browser snapshots, bulk reads, and logs, and keeps the raw payloads out of the model transcript while preserving retrievable storage. It also records edits, git operations, tasks, and errors in a local SQLite store with full-text search, so sessions can survive compaction by retrieving only the relevant rows. Repository: https://github.com/mksglu/context-mode
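To make the interception idea concrete, here is a minimal sketch of the pattern: store the raw payload somewhere retrievable and hand the model only a short summary with a handle. Every name in it (`sandboxResult`, `fetchRaw`, `rawStore`) is hypothetical and chosen for illustration; it is not context-mode's API, and an in-memory `Map` stands in for the local store.

```javascript
// Hypothetical sketch of output sandboxing; none of these names come from context-mode.
const rawStore = new Map(); // stand-in for the local store that keeps raw payloads
let nextId = 1;

// Store the full payload and return only a compact summary plus a handle.
function sandboxResult(toolName, payload, previewChars = 80) {
  const id = `evt-${nextId++}`;
  rawStore.set(id, payload); // the raw payload stays retrievable by id
  return {
    id,
    tool: toolName,
    bytes: Buffer.byteLength(payload, 'utf8'),
    lines: payload.split('\n').length,
    preview: payload.slice(0, previewChars),
  };
}

// Fetch the full payload later, only if it is actually needed.
function fetchRaw(id) {
  return rawStore.get(id);
}

// Demo: a 5,000-line log becomes a summary object of a few hundred characters.
const log = Array.from({ length: 5000 }, (_, i) => `line ${i}: ok`).join('\n');
const summary = sandboxResult('read_logs', log);
console.log(JSON.stringify(summary));
```

Only the summary object would enter the transcript; the handle in `id` is what a later retrieval step would use to pull the original text back.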

What license applies?

The GitHub repository displays an Elastic License v2 (ELv2) badge. ELv2 is not MIT: read the license before redistributing the code or embedding it in commercial products, because its restrictions follow Elastic's source-available terms rather than those of a permissive open-source license.

Does "98% reduction" hold for my stack?

The README illustrates dramatic size drops using example payloads (e.g. large batches of tool output replaced with short summaries). Treat those as vendor demonstrations until you have run ctx-stats against your own tools and repositories.

How do I install on Claude Code?

Per the README: add the marketplace with `/plugin marketplace add mksglu/context-mode`, run `/plugin install context-mode@context-mode`, restart, then verify with `/context-mode:ctx-doctor`. An MCP-only path exists via `claude mcp add`, without full hook routing.

What is the "think in code" idea?

Instead of dragging dozens of file reads into context, the project encourages agents to emit small scripts (via sandbox tools like ctx_execute) that compute on disk and print only the aggregates the model needs—fewer tokens, less compaction damage.
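As a rough illustration of the same idea using only Node's standard library, the sketch below walks a directory, computes on disk, and prints one small JSON object instead of the file contents. It does not use `ctx_execute` or any context-mode API; `listFiles` and `summarize` are hypothetical helpers, and the demo runs against a throwaway temp directory so it works anywhere.

```javascript
// "Think in code" sketch with Node's standard library only (not context-mode's API).
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Recursively collect files that end with the given extension.
function listFiles(dir, ext) {
  const out = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      out.push(...listFiles(full, ext));
    } else if (entry.name.endsWith(ext)) {
      out.push(full);
    }
  }
  return out;
}

// Read every file on disk but return only two numbers.
function summarize(root, ext) {
  const files = listFiles(root, ext);
  let functions = 0;
  for (const f of files) {
    functions += (fs.readFileSync(f, 'utf8').match(/\bfunction\b/g) || []).length;
  }
  return { files: files.length, functions };
}

// Demo on a temporary directory that is removed afterwards.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'ctx-demo-'));
fs.mkdirSync(path.join(root, 'src'));
fs.writeFileSync(path.join(root, 'src', 'a.js'), 'function a() {}\nfunction b() {}\n');
fs.writeFileSync(path.join(root, 'src', 'b.js'), 'const c = () => 1;\nfunction d() {}\n');
const result = summarize(root, '.js');
console.log(JSON.stringify(result)); // prints {"files":2,"functions":3}
fs.rmSync(root, { recursive: true, force: true });
```

The model sees a single line of output regardless of how many files were read, which is the point of the pattern.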

How does this relate to explainx.ai tutorials?

It complements our MCP primer and context-engineering articles: same problem (tools + long transcripts), different implementation—here the boundary is an MCP server plus optional IDE hooks rather than only prompt discipline.


Heavy MCP tool returns (DOM snapshots, logs, long reads) can consume context faster than reasoning does. context-mode (by mksglu) packages a fix: sandbox bulky output, index session metadata for retrieval, and steer agents toward small on-disk scripts instead of pasting raw blobs into chat.

Quick answer: according to the README, context-mode can reduce agent context usage by up to 98% by sandboxing tool outputs and retrieving from a SQLite-backed store instead of keeping everything in the chat transcript. The README cites 14 platform families, including Claude Code, Cursor, and Windsurf.

Snapshot based on the README as of May 26, 2026; the repository is under active development.

TL;DR

| Topic | Takeaway |
| --- | --- |
| What | MCP server + optional hooks across many hosts (README cites 14 platform families) |
| Pain | Tool output floods the transcript; compaction drops working state |
| How | Sandbox tools (`ctx_execute`, batch/fetch/index/search), SQLite + FTS5 events, hooks where supported |
| Claude Code | `/plugin marketplace add mksglu/context-mode` → `/plugin install context-mode@context-mode` → `/context-mode:ctx-doctor` |
| License | Elastic License v2 (not MIT) |
| Light try | `claude mcp add context-mode -- npx -y context-mode` (MCP only, less routing) |
| Performance | Vendor claims 98% context reduction on typical coding sessions |

```sql
CREATE TABLE events (
  id INTEGER PRIMARY KEY,
  timestamp TEXT,
  type TEXT, -- 'tool', 'edit', 'error', 'task'
  summary TEXT,
  full_data BLOB,
  metadata JSON
);

CREATE VIRTUAL TABLE events_fts USING fts5(summary, metadata);
```
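To show what "retrieving only relevant rows" means in practice, here is a small in-memory stand-in for a full-text lookup over event summaries. It is deliberately not FTS5: it scores each event by how many query terms appear in its summary, which is a much cruder ranking than SQLite's. The sample events and the `search` function are invented for illustration.

```javascript
// In-memory stand-in for full-text retrieval over session events (NOT FTS5).
// The events below are made-up sample data shaped like the `events` table.
const events = [
  { id: 1, type: 'edit', summary: 'Updated auth middleware to check token expiry' },
  { id: 2, type: 'error', summary: 'TypeError in auth middleware: token is undefined' },
  { id: 3, type: 'task', summary: 'Write migration for orders table' },
  { id: 4, type: 'tool', summary: 'Ran test suite, 2 failures in orders module' },
];

// Score each row by the number of query terms present in its summary,
// drop rows with no match, and return the best matches first.
function search(rows, query, maxResults = 10) {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  return rows
    .map((row) => {
      const words = new Set(row.summary.toLowerCase().split(/\W+/));
      const score = terms.filter((t) => words.has(t)).length;
      return { ...row, score };
    })
    .filter((row) => row.score > 0)
    .sort((a, b) => b.score - a.score || a.id - b.id)
    .slice(0, maxResults);
}

const hits = search(events, 'auth token');
console.log(hits.map((h) => h.id)); // prints [ 1, 2 ]
```

After compaction, an agent would issue a query like this and re-read two short summaries rather than replaying the whole session.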

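```javascript
// Agent writes this script; only the printed JSON reaches the model.
// `ctx.glob` and `ctx.readFile` are illustrative helper names; check the README for the actual sandbox API.
const files = await ctx.glob('src/**/*.ts');
const counts = files.map((f) => {
  const content = ctx.readFile(f);
  // Matches the `function ` keyword only; arrow functions and methods are not counted.
  return { file: f, functions: (content.match(/function /g) || []).length };
});
console.log(JSON.stringify(counts));
```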

```json
{
  "storage": {
    "maxSizeMB": 1000,
    "retentionDays": 30,
    "compressOldSessions": true
  },
  "sandbox": {
    "allowedPaths": ["/home/user/projects"],
    "timeoutMs": 5000,
    "maxMemoryMB": 512
  },
  "retrieval": {
    "maxResults": 10,
    "minRelevanceScore": 0.3
  }
}
```
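How context-mode enforces `allowedPaths` is not described here, so the following is only a sketch of one common way to implement such an allow-list: resolve the target path, then check that it is the allowed root or a descendant of it. `isPathAllowed` is a hypothetical helper; it uses `path.posix` so the result is the same on any OS, and it does not resolve symlinks.

```javascript
// Hypothetical allow-list check; not taken from context-mode's source.
const path = require('node:path');

// Values copied from the example configuration.
const config = {
  sandbox: { allowedPaths: ['/home/user/projects'], timeoutMs: 5000, maxMemoryMB: 512 },
};

// True when `target` resolves to an allowed root or to something beneath one.
function isPathAllowed(target, allowedPaths) {
  const resolved = path.posix.resolve(target);
  return allowedPaths.some((root) => {
    const rel = path.posix.relative(path.posix.resolve(root), resolved);
    // An empty string means the root itself; a leading ".." means outside the root.
    return rel === '' || (rel !== '..' && !rel.startsWith('../'));
  });
}

const allowed = config.sandbox.allowedPaths;
console.log(isPathAllowed('/home/user/projects/app/src/index.ts', allowed)); // true
console.log(isPathAllowed('/home/user/projects/../secrets.txt', allowed)); // false
console.log(isPathAllowed('/home/user/projects-old/notes.md', allowed)); // false
```

Comparing with `path.relative` rather than a plain string prefix avoids the classic mistake where `/home/user/projects-old` passes a `startsWith('/home/user/projects')` test.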

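```javascript
// Example: find all TODOs in a project
// (`ctx.glob` and `ctx.readFile` are illustrative helper names, not a confirmed API)
const files = await ctx.glob('**/*.js');
const todos = [];
for (const f of files) {
  const lines = ctx.readFile(f).split('\n');
  lines.forEach((line, i) => {
    if (line.includes('TODO')) {
      todos.push({ file: f, line: i + 1, text: line.trim() });
    }
  });
}
// Print the result; a top-level `return` only works if the runner wraps the script in a function.
console.log(JSON.stringify(todos));
```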

Context usage reduction (vendor claims)

| Scenario | Without context-mode | With context-mode | Reduction |
| --- | --- | --- | --- |
| Read 50 files (500 lines each) | 375,000 tokens | 12,000 tokens | 97% |
| Scrape 10 web pages | 250,000 tokens | 8,000 tokens | 97% |
| Analyze git history (100 commits) | 180,000 tokens | 6,000 tokens | 97% |
| Debug session (200 turns) | 850,000 tokens | 45,000 tokens | 95% |
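The reduction column follows from the two token columns as (1 - after / before) × 100. The short check below recomputes it from the figures in the table above, which is also the calculation to run on your own before-and-after measurements. `reductionPercent` is just a helper name for this sketch.

```javascript
// Token counts copied from the table above.
const scenarios = [
  { name: 'Read 50 files (500 lines each)', before: 375000, after: 12000 },
  { name: 'Scrape 10 web pages', before: 250000, after: 8000 },
  { name: 'Analyze git history (100 commits)', before: 180000, after: 6000 },
  { name: 'Debug session (200 turns)', before: 850000, after: 45000 },
];

// Percentage of tokens removed relative to the baseline.
function reductionPercent(before, after) {
  return (1 - after / before) * 100;
}

for (const s of scenarios) {
  console.log(`${s.name}: ${reductionPercent(s.before, s.after).toFixed(1)}%`);
}
// Prints 96.8%, 96.8%, 96.7%, and 94.7%, which round to the 97%, 97%, 97%, and 95% shown.
```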

Latency impact (vendor claims)

| Operation | Overhead | Notes |
| --- | --- | --- |
| SQLite insert (event) | ~1 ms | Per tool call |
| FTS5 query | ~5-20 ms | Depends on DB size |
| Sandbox execution | ~50-200 ms | Cold start penalty |
| Retrieval (10 results) | ~10-30 ms | Including ranking |

Storage scaling (vendor claims)

| Session length | Events | DB size | Query time (p95) |
| --- | --- | --- | --- |
| 1 hour | 500 | 15 MB | 8 ms |
| 8 hours | 4,000 | 120 MB | 15 ms |
| 40 hours | 20,000 | 600 MB | 35 ms |
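The storage figures above imply a constant rate, which makes it easy to estimate when a session would reach the `maxSizeMB` value of 1000 from the example configuration. The sketch below does that arithmetic under two stated assumptions: growth stays linear, and 1 MB is treated as 1,000 KB. It ignores retention and compression settings, so treat the result as a rough upper bound rather than a prediction.

```javascript
// Rows copied from the storage scaling table; the cap comes from the example config.
const rows = [
  { hours: 1, events: 500, dbMB: 15 },
  { hours: 8, events: 4000, dbMB: 120 },
  { hours: 40, events: 20000, dbMB: 600 },
];
const MAX_SIZE_MB = 1000;

// Average storage per recorded event, in KB (assuming 1 MB = 1,000 KB).
function perEventKB(row) {
  return (row.dbMB * 1000) / row.events;
}

// Database growth per hour of session time, in MB.
function growthMBPerHour(row) {
  return row.dbMB / row.hours;
}

// Hours of session time before the size cap is reached, assuming linear growth.
function hoursUntilCap(maxSizeMB, mbPerHour) {
  return maxSizeMB / mbPerHour;
}

for (const r of rows) {
  console.log(`${r.hours} h: ${perEventKB(r)} KB/event, ${growthMBPerHour(r)} MB/hour`);
}
const hours = hoursUntilCap(MAX_SIZE_MB, growthMBPerHour(rows[0]));
console.log(`~${hours.toFixed(1)} hours to reach ${MAX_SIZE_MB} MB`); // ~66.7 hours
```

All three rows work out to 30 KB per event and 15 MB per hour, so the table is internally consistent with a linear model.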

context-mode: MCP sandboxing and session memory for agent context windows

TL;DR


The context overflow problem: why agents lose state

How context fills up in practice

What gets lost during compaction

Pillars (vendor framing)

1. Context saving — sandbox tool output

2. Session continuity — SQLite + FTS5 retrieval

3. Think in code — sandboxed execution

4. Output compression — training agents for brevity

Why explainx.ai readers should care

Installation and configuration

Quick start (MCP only)

Full installation (Claude Code)

Configuration file

Tools and capabilities

ctx_execute: sandboxed code execution

ctx_batch: parallel operations

ctx_fetch: HTTP requests without browser

ctx_index: semantic search over codebase

ctx_query: SQL over session history

Performance benchmarks (vendor claims)

Context usage reduction

Latency impact

Storage scaling

Sources
