MinerU is an open-source document parsing engine from OpenDataLab (github.com/opendatalab/MinerU) that converts PDF, images, DOCX, PPTX, and XLSX into machine-readable Markdown and JSON for RAG, agents, and pre-training pipelines. The project originated during InternLM pre-training and is now at v3.4.0, with about 69.7k GitHub stars.

What changed in MinerU 3.4?

MinerU 3.4.0 was released June 18, 2026. The pipeline backend's OCR was upgraded to PP-OCRv6 (about an 11% accuracy gain on OmniDocBench v1.6), and OCR processing speed improved by roughly 100%, that is, about double the throughput. The model download logic now selects a source automatically based on the network environment and reuses the local cache before making remote downloads.

How accurate is MinerU?

End-to-end scores on OmniDocBench v1.6: pipeline backend 86.47; hybrid/VLM high effort 95.39; hybrid medium (the new default) 95.26; VLM HTTP client 95.30. Medium effort gives up 0.13 points against high effort in exchange for 35–220% speed gains, depending on platform.

Can MinerU run on CPU only?

Yes. Use the pipeline backend: mineru -p input.pdf -o output/ -b pipeline. Pipeline runs on pure CPU with min 16GB RAM (32GB recommended). Hybrid and VLM backends require GPU (min 8GB VRAM) or Apple Silicon for acceleration.

How do I install MinerU?

Run pip install uv && uv pip install -U "mineru[all]". The mineru[all] bundle includes all core features on Windows, Linux, and macOS. Docker deployment is supported on Linux and Windows WSL2. The basic CLI call is mineru -p input_path -o output_path.

What license does MinerU use?

Since v3.1.0 (April 2026), MinerU uses the MinerU Open Source License — based on Apache 2.0 with additional conditions — replacing the earlier AGPLv3. This lowers friction for commercial deployments while keeping the project open.

MinerU 3.4 Guide: PDF/Office to Markdown for RAG & Agents

If your RAG pipeline still treats PDFs as "extract text with PyPDF and hope," you are leaving layout, tables, formulas, and multi-column structure on the floor. MinerU — OpenDataLab's document parsing engine with ~69.7k GitHub stars — exists to fix that: turn complex PDFs and Office documents into LLM-ready Markdown and JSON with headings, tables, formulas, and images preserved.

Version 3.4.0 landed June 18, 2026 with a focused upgrade: PP-OCRv6 for the pipeline backend (~11% OCR accuracy gain on OmniDocBench v1.6), roughly 100% faster OCR processing, and smarter model download / cache reuse. For agent builders, MinerU is increasingly the default ingestion layer before chunking, embedding, and retrieval.

TL;DR

Detail	MinerU 3.4
Repo	github.com/opendatalab/MinerU
Docs	opendatalab.github.io/MinerU
Latest release	mineru-3.4.0 (June 2026)
Stars / forks	~69.7k / ~5.9k
Inputs	PDF, images, DOCX, PPTX, XLSX
Outputs	Markdown, JSON, multimodal formats
License	MinerU Open Source License (Apache 2.0–based)
Install	`uv pip install -U "mineru[all]"`
CLI	`mineru -p <input> -o <output>`
CPU path	`-b pipeline`

Why MinerU Matters for RAG and Agents

Document ingestion is the silent failure mode in most RAG systems. Chunk a badly parsed PDF and you get:

Tables split across chunks with no header context
Formulas rendered as garbage Unicode
Multi-column layouts read in wrong order
Headers, footers, and page numbers polluting embeddings

MinerU addresses parsing before chunking. It removes headers/footers/page numbers, preserves document structure (headings, lists, paragraphs), converts formulas to LaTeX, tables to HTML, extracts images with captions, and detects scanned PDFs for automatic OCR.
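Once the parse is clean, the chunker can lean on the structure it preserves. The sketch below is a minimal heading-aware splitter, assuming the parsed Markdown uses `#`-style headings and that each table arrives as an HTML block (as described above). The names `Chunk` and `chunk_by_heading` are illustrative and not part of MinerU.

```python
import re
from dataclasses import dataclass

# Matches ATX headings such as "## Results"; assumes the parsed Markdown uses them.
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


@dataclass
class Chunk:
    heading_path: list[str]  # e.g. ["Results", "Table 1"]
    text: str


def chunk_by_heading(markdown: str) -> list[Chunk]:
    """Split Markdown into one chunk per section, keeping the heading trail as context."""
    chunks: list[Chunk] = []
    path: list[tuple[int, str]] = []  # stack of (level, title) for the current section
    body: list[str] = []

    def flush() -> None:
        text = "\n".join(body).strip()
        if text:
            chunks.append(Chunk([title for _, title in path], text))
        body.clear()

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()  # close the previous section under its own heading trail
            level = len(match.group(1))
            while path and path[-1][0] >= level:
                path.pop()  # leave sibling and deeper sections
            path.append((level, match.group(2)))
        else:
            body.append(line)
    flush()
    return chunks
```

Because a section is never split internally, an HTML table stays in one chunk together with the heading trail that explains it. A real pipeline would also need a size cap per chunk and handling for `#` lines inside code fences, both of which this sketch leaves out.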

The project originated during InternLM pre-training — built to solve symbol conversion in scientific literature. That pedigree shows in formula and table handling, where generic text extractors fail.

June 2026 is a crowded moment for document AI: Baidu Unlimited-OCR targets one-shot long-horizon parsing, and Mistral OCR 4 offers managed API extraction with bounding boxes. MinerU's position is full-stack open ingestion with multiple backends, local deployment, and production routing (mineru-router), rather than a single-model demo.

Version 3.4: What Changed (June 18, 2026)

PP-OCRv6 upgrade

The pipeline backend's OCR model moved to PP-OCRv6, improving OCR accuracy by about 11% on OmniDocBench v1.6. Japanese, Traditional Chinese, English, and Latin were removed as separate OCR language options — those scenarios now route through the ch OCR model, simplifying configuration.

~100% OCR speed improvement

MinerU optimized the OCR inference and processing pipeline, roughly doubling OCR throughput — significant for batch document jobs and OCR-heavy scans.

Model download and cache

Automatic model source selection on first install based on network environment (HuggingFace, ModelScope, etc.)
Local cache priority — checks downloaded model files before remote requests
Reduces repeated downloads across dev/staging/prod environments

See OpenDataLab's Model Source Documentation for configuration details.
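The cache-first behavior described above can be pictured with a small sketch. This is not MinerU's implementation: `pick_source`, `resolve_model`, and the `download` and `is_reachable` callables are hypothetical names used only to show the order of checks, with the source names taken from the list above.

```python
from pathlib import Path
from typing import Callable, Iterable


def pick_source(candidates: Iterable[str], is_reachable: Callable[[str], bool]) -> str:
    """Return the first candidate source that the (hypothetical) probe can reach."""
    for source in candidates:
        if is_reachable(source):
            return source
    raise RuntimeError("no model source is reachable")


def resolve_model(
    name: str,
    cache_dir: Path,
    download: Callable[[str, Path], None],
) -> Path:
    """Return the local path of a model, downloading only on a cache miss."""
    target = cache_dir / name
    if target.exists():
        return target  # cache hit: no remote request at all
    cache_dir.mkdir(parents=True, exist_ok=True)
    download(name, target)  # hypothetical downloader that writes to `target`
    return target
```

Calling `resolve_model` twice for the same name triggers `download` only once, which is the property that saves repeated downloads across environments sharing a cache directory.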

Parsing Backends Compared

MinerU is not one model — it is an orchestration stack with backend selection:

Backend	Accuracy (OmniDocBench v1.6 E2E)	CPU	GPU	Best for
pipeline	86.47	✅	Optional	Homelab, CPU-only, batch OCR
hybrid medium (default)	95.26	❌	8GB+ VRAM	Daily production — speed/accuracy balance
hybrid high	95.39	❌	8GB+ VRAM	Max accuracy, image analysis
vlm / vlm-http-client	95.30	❌	2GB+ VRAM (client)	OpenAI-compatible remote servers

Hybrid medium (added in v3.3, now default) sacrifices only 0.13 accuracy points vs high while delivering 35–220% speed improvements by platform:

Platform	Text PDF speedup	OCR scenario speedup
Linux	~80%	~35%
Windows	~90%	~45%
macOS	~220%	~50%

Medium does not support image analysis inside documents — switch to effort=high when you need that.
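To make the trade-off concrete, here is a short arithmetic sketch using the figures from the tables above. It assumes that a speedup of X% means throughput rises by X%, so the same job takes 1 / (1 + X/100) of the original time. That reading is an assumption; the benchmark notes may define speedup differently.

```python
def time_fraction(speedup_percent: float) -> float:
    """Fraction of the original wall-clock time left after a throughput gain.

    Assumes "X% faster" means throughput is multiplied by (1 + X/100).
    """
    return 1.0 / (1.0 + speedup_percent / 100.0)


# Accuracy gap between high and medium effort (OmniDocBench v1.6 end-to-end).
high_score, medium_score = 95.39, 95.26
accuracy_gap = round(high_score - medium_score, 2)

# Text-PDF speedups per platform, taken from the table above.
text_pdf_speedup = {"Linux": 80, "Windows": 90, "macOS": 220}
remaining_time = {
    platform: time_fraction(percent) for platform, percent in text_pdf_speedup.items()
}

for platform, fraction in remaining_time.items():
    print(f"{platform}: about {fraction:.0%} of the high-effort run time")
```

Under that assumption, a text-heavy batch on macOS would finish in roughly a third of the high-effort time for a 0.13-point accuracy cost, while Linux and Windows land a little above half.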

VLM model: MinerU2.5-Pro

The primary VLM is MinerU2.5-Pro-2605-1.2B (v3.3+) with native multilingual OCR, image/chart parsing, truncated paragraph merging, and cross-page table merging. v3.1.0 added native PPTX and XLSX parsing alongside PDF, DOCX, and images.

Key Features

Multi-format input: PDF, PNG/JPG, DOCX, PPTX, XLSX
Layout-aware output: reading order for single/multi-column and complex layouts
Formula → LaTeX, table → HTML
OCR: 109 languages; auto-detect scanned/garbled PDFs
Outputs: NLP Markdown, multimodal Markdown, JSON by reading order, layout/span visualizations
Interfaces: CLI, FastAPI (mineru-api), Gradio WebUI, mineru-router for multi-GPU load balancing
Async tasks: POST /tasks for submit/status/result (v3.0+)
Long documents: sliding-window parsing + streaming disk writes — tens of thousands of pages without manual splitting
Thread-safe multi-threaded inference for high-concurrency production

Quick Start

Install

pip install --upgrade pip
pip install uv
uv pip install -U "mineru[all]"

mineru[all] is the recommended bundle for Windows, Linux, and macOS.

Parse a document (GPU path)

mineru -p document.pdf -o ./output

Parse on CPU only

mineru -p document.pdf -o ./output -b pipeline

Supports single files or directories. Outputs land in structured Markdown/JSON under the output path.
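For batch jobs it is often handy to drive the CLI from Python. The sketch below wraps the two commands shown above with `subprocess` and uses only the `-p`, `-o`, and `-b` flags that appear in this guide. The helper names `build_mineru_command` and `parse_document` are illustrative, and the code assumes the `mineru` executable is on your PATH.

```python
import subprocess
from pathlib import Path
from typing import Optional


def build_mineru_command(
    input_path: str, output_dir: str, backend: Optional[str] = None
) -> list[str]:
    """Build the argument list for the mineru CLI using only -p, -o, and -b."""
    cmd = ["mineru", "-p", str(input_path), "-o", str(output_dir)]
    if backend is not None:
        cmd += ["-b", backend]  # e.g. "pipeline" for the CPU-only path
    return cmd


def parse_document(
    input_path: str, output_dir: str, backend: Optional[str] = None
) -> subprocess.CompletedProcess:
    """Run the CLI and raise CalledProcessError if it exits with a non-zero status."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    return subprocess.run(
        build_mineru_command(input_path, output_dir, backend), check=True
    )
```

For example, `parse_document("document.pdf", "./output", backend="pipeline")` would run the CPU-only command shown above. Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.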

Docker

Docker deployment is documented for Linux and Windows WSL2 — macOS should use pip/uv install instead. See Docker deployment docs.

Production: mineru-router and Multi-GPU

mineru-router (v3.0+) provides unified entry deployment across multiple services and GPUs:

Interfaces fully compatible with mineru-api
Automatic task load balancing
Designed for high-concurrency, high-throughput parsing farms

Combined with thread-safe concurrent inference and streaming writes, MinerU 3.x targets enterprise document pipelines — not just one-off CLI conversions. That aligns with Liquid AI LFM2.5-230M's data-extraction positioning: parse at scale upstream, route structured chunks to small edge models downstream.
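As a mental model for task load balancing, the sketch below routes each new task to the worker with the fewest tasks in flight. This is a generic least-busy strategy chosen for illustration. The source does not say which algorithm mineru-router uses, and `Worker`, `LeastBusyRouter`, and the example URLs are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    url: str            # hypothetical address of one parsing service
    in_flight: int = 0  # tasks currently assigned to this worker


class LeastBusyRouter:
    """Assign each task to the worker with the fewest tasks in flight."""

    def __init__(self, workers: list[Worker]) -> None:
        if not workers:
            raise ValueError("at least one worker is required")
        self.workers = list(workers)

    def acquire(self) -> Worker:
        # min() returns the first worker on ties, so ordering is deterministic.
        worker = min(self.workers, key=lambda w: w.in_flight)
        worker.in_flight += 1
        return worker

    def release(self, worker: Worker) -> None:
        # Call when a task finishes so the worker becomes eligible again.
        worker.in_flight = max(0, worker.in_flight - 1)
```

A production router would also need health checks, timeouts, and thread-safe counters, which are omitted here to keep the idea visible.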

Hardware Requirements (Summary)

Requirement	pipeline	hybrid / vlm
OS	Linux (2019 or later distributions), Windows, macOS 14+	Same
Python	3.10–3.13 (Windows: 3.10–3.12)	Same
RAM	Min 16GB, rec 32GB+	Min 16GB
VRAM	4GB optional	Min 8GB (hybrid), 2GB (http client)
Disk	Min 20GB, SSD recommended	Min 2GB (+ models)

Pure CPU inference is pipeline-only. Apple Silicon supports GPU acceleration via MPS on supported backends.
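The table can be turned into a simple pre-flight check. The sketch below is one possible reading of it: the thresholds come from the table above, while the function name `choose_backend` and its return labels are illustrative. The labels mirror the table's column names and are not necessarily the exact values the `-b` flag accepts.

```python
def choose_backend(
    ram_gb: float, vram_gb: float = 0.0, apple_silicon: bool = False
) -> str:
    """Pick a backend family from the hardware summary table.

    Returns "hybrid" when GPU acceleration is available, else "pipeline".
    """
    if ram_gb < 16:
        raise ValueError("the table lists 16GB RAM as the minimum for every backend")
    if vram_gb >= 8 or apple_silicon:
        return "hybrid"  # 8GB+ VRAM, or Apple Silicon acceleration
    return "pipeline"    # the only pure-CPU path


print(choose_backend(ram_gb=32))                 # CPU-only workstation
print(choose_backend(ram_gb=16, vram_gb=12))     # single mid-range GPU
```

The check deliberately ignores the 2GB http-client case, since that depends on a remote server doing the heavy inference rather than on local hardware.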

MinerU vs Alternatives (June 2026)

Tool	Strength	Trade-off
MinerU 3.4	Full stack, multi-backend, Office formats, router	Heavy install, GPU for best accuracy
Unlimited-OCR	One-shot long PDFs, SGLang throughput	Vision-model path, different architecture
Mistral OCR 4	Managed API, bounding boxes, confidence	Not self-hosted weights
Generic PyPDF	Fast, trivial	No layout, tables, or formulas

For RAG specifically, parsed output quality directly affects chunking and retrieval strategy. MinerU's JSON-sorted-by-reading-order output is designed for downstream indexing — or wire parsed Markdown into a Langflow RAG pipeline for visual retriever tuning. At the extreme end of "hard documents," the Vesuvius Challenge applies a similar parse-then-verify loop to carbonized 2,000-year-old scrolls — with papyrologists, not chunkers, as the final gate.

License Evolution

v3.1.0 (April 2026) moved MinerU from AGPLv3 to the MinerU Open Source License — Apache 2.0–based with additional conditions. The change explicitly targets lower adoption friction for commercial deployments while keeping the codebase open.

v3.0 also removed dependencies on AGPLv3 models (doclayoutyolo, mfd_yolov8) and a CC-BY-NC-SA layoutreader — cleaning the license stack for enterprise use.

Online Demos (Try Before Deploy)

Demo	Notes
Official web app	Full features, login required
OpenDataLab	Same as official
ModelScope Gradio	Core parsing, no login
HuggingFace Gradio	Core parsing, no login

MinerU's own docs recommend trying online demos first — complex layouts, scans, and handwriting may still fall short of expectations.

Related ExplainX coverage

Post	Connection
Baidu Unlimited-OCR	Alternative long-horizon parsing approach
Mistral OCR 4	Managed document AI API comparison
RAG vs agentic RAG	What to do with parsed documents
Liquid AI LFM2.5-230M	Edge extraction after MinerU ingestion
arXiv AI-generated errors ban	Why grounded document pipelines matter
Vesuvius Challenge scroll read	Extreme document recovery — ML ink detection + human transcription

Summary

MinerU 3.4 reinforces its role as the default open-source document ingestion engine for LLM workflows: PP-OCRv6 accuracy, doubled OCR speed, smarter model caching, 95%+ hybrid parsing, full Office format support, and mineru-router for production scale.

The project's roughly 69.7k stars reflect years of iteration from InternLM's pre-training needs to today's agent/RAG stacks. If your agents read PDFs, MinerU is the layer to install before you embed a single chunk.

Last updated: June 26, 2026. Version details from github.com/opendatalab/MinerU release mineru-3.4.0 and project README.

