DeepSeek published its V4 Preview Release on April 24, 2026: open weights, a 1M-token default context on official services, and two new API model IDs, deepseek-v4-pro and deepseek-v4-flash. This article is a field note for engineers: what changed, how to migrate, and where to read primary materials. It is not a substitute for the DeepSeek API docs.
Benchmark and “SOTA” claims below are as stated by DeepSeek in that post; treat them as marketing-facing positioning until you run your own evals on real workloads.
TL;DR
| Topic | Takeaway |
|---|---|
| New models | deepseek-v4-pro (larger, flagship) and deepseek-v4-flash (smaller, economical). |
| Context | 1M tokens is the default across official DeepSeek services per the announcement. |
| API shape | Same base_url; swap model string. OpenAI Chat Completions + Anthropic APIs supported. |
| Modes | Thinking and Non-Thinking—see Thinking Mode. |
| Weights | Hugging Face collection + tech report PDF. |
| Legacy IDs | deepseek-chat / deepseek-reasoner retire after 2026-07-24 15:59 UTC (currently mapped to V4-Flash). |
| Try in UI | chat.deepseek.com — Expert Mode / Instant Mode per the post. |

V4-Pro vs V4-Flash (vendor-reported)
| Dimension | DeepSeek-V4-Pro | DeepSeek-V4-Flash |
|---|---|---|
| Reported scale | 1.6T total params, 49B active | 284B total, 13B active |
| Positioning | Flagship reasoning + agentic coding | Fast, cost-effective, strong on simple agent work |
| Reasoning | DeepSeek claims open-model SOTA on agentic coding benchmarks and strong Math/STEM/Coding vs other open models | DeepSeek states reasoning is close to Pro, and on par with Pro for simple agent tasks |
GEO (generative engine optimization) note: When you summarize leaderboard tables, link the PDF report or Hugging Face cards instead of copying every number; citation-friendly pages get cited more often in generative answers.
Architecture: long context and sparse attention
The post highlights token-wise compression plus DSA (DeepSeek Sparse Attention) as structural contributions, and frames them as improving long-context efficiency (compute and memory). For engineering detail, start with the tech report and model cards on Hugging Face rather than second-hand summaries.
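To make the efficiency framing concrete, here is a toy top-k sparse-attention sketch in NumPy. It illustrates the generic idea behind sparse attention (each query mixes values from only k selected keys, so the softmax and value mix shrink from O(Tq·Tk) to O(Tq·k)); it is not DeepSeek's DSA, whose actual selection mechanism is specified in the tech report. The function name, shapes, and top-k selection rule are all illustrative assumptions.

```python
# Toy top-k sparse attention in NumPy. Generic illustration only,
# NOT DeepSeek's DSA algorithm (see the tech report for that).
import numpy as np

def topk_sparse_attention(q, k, v, top_k=4):
    """Each query attends to only its top_k highest-scoring keys.

    q: (Tq, d); k, v: (Tk, d). A real kernel would avoid materializing
    the full (Tq, Tk) score matrix (e.g., via a cheap indexer); we
    compute it here purely for clarity.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                                # (Tq, Tk)
    idx = np.argpartition(scores, -top_k, axis=-1)[:, -top_k:]   # top_k key ids per query
    picked = np.take_along_axis(scores, idx, axis=-1)            # (Tq, top_k)
    w = np.exp(picked - picked.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                           # softmax over selected keys only
    return np.einsum("qk,qkd->qd", w, v[idx])                    # mix only selected values

rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(8, 16)) for _ in range(3))
print(topk_sparse_attention(q, k, v, top_k=4).shape)  # (8, 16)
```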
If you are new to why context length matters for agents, our LLM context window guide walks through attention cost, KV cache, and product trade-offs—useful background when a vendor moves the default to 1M tokens.
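A back-of-envelope KV-cache calculation shows why a 1M-token default pushes vendors toward compression and sparsity. Every hyperparameter below (layer count, KV heads, head dim) is an assumed placeholder, not a DeepSeek-V4 figure:

```python
# KV-cache sizing at 1M context. All hyperparameters are ILLUSTRATIVE
# placeholders, not DeepSeek-V4 values.
layers, kv_heads, head_dim = 60, 8, 128   # assumed GQA-style config
seq_len, bytes_per = 1_000_000, 2         # 1M tokens, fp16/bf16

# 2x for keys and values, per token, per layer
kv_bytes = 2 * layers * kv_heads * head_dim * seq_len * bytes_per
print(f"{kv_bytes / 2**30:.0f} GiB per sequence")  # ~229 GiB
```

Even under these modest placeholder numbers, a single 1M-token sequence needs hundreds of GiB of KV cache in bf16, which is exactly the pressure that compression and sparse-attention techniques target.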
Agent integrations and “agentic coding”
DeepSeek states V4 is integrated with Claude Code, OpenClaw, and OpenCode, and that it already powers in-house agentic coding at DeepSeek. For portable agent instructions (skills, MCP, and progressive disclosure), see what are agent skills? on ExplainX—skills are complementary to whichever base model you route through your host.
API migration checklist
- Inventory hard-coded `model` strings (`deepseek-chat`, `deepseek-reasoner`, older aliases).
- Map each to `deepseek-v4-pro` or `deepseek-v4-flash` per latency and budget.
- Confirm Thinking / Non-Thinking behavior against the Thinking Mode docs.
- Set a calendar reminder for the 2026-07-24 15:59 UTC legacy retirement; DeepSeek is explicit that `deepseek-chat` and `deepseek-reasoner` will become inaccessible after that moment.
- Re-run integration tests: tool calling, JSON modes, and streaming paths differ across providers even when the HTTP surface looks "compatible."
Minimal pattern (illustrative only) — replace with your real client and base URL from DeepSeek’s Quick Start:
```json
{
  "model": "deepseek-v4-flash",
  "messages": [{ "role": "user", "content": "Ping: confirm V4 routing." }]
}
```
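The same call, sketched with the OpenAI Python SDK. The base URL and key handling are assumptions drawn from DeepSeek's current Quick Start conventions (an OpenAI-compatible endpoint at api.deepseek.com); verify both against the live docs before relying on them:

```python
# Minimal V4 smoke test via the OpenAI-compatible surface.
# ASSUMPTIONS: base_url reflects DeepSeek's documented endpoint today;
# confirm it (and your auth setup) against the live Quick Start.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",      # placeholder; load from env in real code
    base_url="https://api.deepseek.com",  # per the post, the base URL is unchanged
)

resp = client.chat.completions.create(
    model="deepseek-v4-flash",  # was: "deepseek-chat"
    messages=[{"role": "user", "content": "Ping: confirm V4 routing."}],
)
print(resp.choices[0].message.content)
```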
Official sources (bookmark these)
- Announcement: DeepSeek V4 Preview Release
- Weights hub: deepseek-v4 collection on Hugging Face
- Tech report: DeepSeek_V4.pdf
- Thinking modes: Thinking Mode guide
- Product UI: chat.deepseek.com
DeepSeek closes the post with a reminder to trust official channels for news—reasonable advice when frontier releases generate noisy third-party commentary.
Related ExplainX reading
- What are agent skills? A complete guide — how SKILL.md packages interact with coding agents.
- What is MCP? Model Context Protocol guide — tools and resources alongside model swaps.
- LLM context window explained (2026) — what 1M context implies in practice.
Parameter counts, benchmark rankings, and retirement dates are quoted from DeepSeek’s April 24, 2026 API news page; verify against live docs before production cutovers.