What is Ideogram 4.0?

Ideogram 4.0 is Ideogram's first open-weight text-to-image foundation model, released June 3, 2026. It is a 9.3B-parameter flow-matching DiT trained from scratch — not a fine-tune of FLUX or SDXL. It ships with native 2K resolution, best-in-class in-image text rendering, bounding-box layout control, and JSON-first prompting. Weights are on Hugging Face; hosted access is at ideogram.ai and via the Ideogram API.

How do I run Ideogram 4.0 locally?

Clone github.com/ideogram-oss/ideogram4, run pip install ., accept the license gate on Hugging Face (ideogram-ai/ideogram-4-nf4 or ideogram-4-fp8), authenticate with hf auth login, then run python run_inference.py --prompt "your prompt" --output out.png --quantization nf4 --magic-prompt-key "$IDEOGRAM_API_KEY". The NF4 checkpoint fits on a single 24GB CUDA GPU with Diffusers support.

How does the Ideogram 4.0 API work?

Get an API key at developer.ideogram.ai, then POST to https://api.ideogram.ai/v1/ideogram-v4/generate with an Api-Key header and a JSON body containing text_prompt or json_prompt. Pricing is per image with no subscription: Turbo $0.03, Default $0.06, Quality $0.10. The default rate limit is 10 in-flight requests.

Why does Ideogram 4.0 use JSON prompts?

Ideogram 4.0 was trained exclusively on structured JSON captions, not plain prose. JSON gives explicit control over style, color palette, spatial layout (bbox coordinates), and in-image text. Plain-text prompts still work via magic-prompt — a free LLM expansion step that converts casual prompts into the JSON schema the model expects.

How does Ideogram 4.0 compare to FLUX and GPT Image?

On Design Arena, Ideogram 4.0 is the top open-weight model and trails only proprietary GPT and Gemini models overall. In a ContraLabs blind typography evaluation it was picked as best 47.9% of the time, ahead of Gemini 3.1 Flash Image Preview (30.0%), FLUX.2 [max] (15.5%), and Grok Imagine 1.0 (15.0%). At 9.3B parameters it beats much larger open models on text-rendering benchmarks.

Can I use Ideogram 4.0 commercially?

Open weights ship under Ideogram's commercial license (see ideogram.ai/licensing). The NF4/FP8 Hugging Face checkpoints use a non-commercial license for the open release; enterprise and commercial tiers are listed at ideogram.ai/licensing. The hosted API is billed separately from the Ideogram app subscription.

Ideogram 4.0: Open Image Model — How to Run & API Guide (2026) | explainx.ai Blog

On June 3, 2026, Ideogram released 4.0 — its first open-weight frontier text-to-image model. The weights are on GitHub and Hugging Face. The hosted API is live at developer.ideogram.ai.

The headline is not just "another open diffusion model." Ideogram 4.0 closes the quality gap between proprietary frontier image models and the open ecosystem on the axes that matter for production design work: typography in scene, deterministic layout, and 2K photoreal output. CEO Mohammad Norouzi put it directly: "The hardest problems at the forefront of design generation — headline-grade typography, deterministic layout, branded layered output — need a foundation engineered for them."

This guide covers what shipped, how the architecture differs from unified multimodal stacks, and how to run Ideogram 4.0 — via API, CLI, and self-hosted inference.

Quick reference

Detail	Value
Release date	June 3, 2026
Parameters	9.3B
Architecture	Flow-matching DiT, single-stream, Qwen3-VL-8B text encoder
Max resolution	2048×2048 (multiples of 16, aspect ratios up to 6:1)
Open weights	ideogram-oss/ideogram4
Checkpoints	ideogram-4-nf4 (24GB GPU) · ideogram-4-fp8
API endpoint	`POST https://api.ideogram.ai/v1/ideogram-v4/generate`
API pricing	Turbo $0.03 · Default $0.06 · Quality $0.10 per image
Prompt format	JSON-first (plain text via magic-prompt expansion)
GitHub stars	2,100+ (as of June 2026)

Jump to the path you need:

What Ideogram 4.0 actually ships
Benchmarks and where it ranks
How to run via the Ideogram API
How to run locally (CLI)
JSON prompting and magic-prompt
Bounding-box layout and color palettes
API endpoints beyond generate
When to use API vs local vs the app

What Ideogram 4.0 ships today

Three capabilities anchor the release, per Ideogram's press release:

1. Text rendering at production fidelity

Ideogram has led on in-scene typography since its 2023 launch. Version 4.0 extends that with multilingual support, denser type at smaller scales, and reliable rendering of headlines, packaging copy, and signage. In a ContraLabs blind evaluation judged by ten professional designers, Ideogram 4.0 was picked as best 47.9% of the time — ahead of Gemini 3.1 Flash Image Preview (30.0%), FLUX.2 [max] (15.5%), and Grok Imagine 1.0 (15.0%).

2. Bounding-box layout control

You specify where a logo, headline, callout, or subject belongs on the canvas using normalized [y_min, x_min, y_max, x_max] coordinates on a 0–1000 grid. Layout is directed by the brief, not sampled and corrected afterward.

3. Photoreal output at 2K

Native support for resolutions from 256 to 2048 (multiples of 16), with aspect ratios up to 6:1. For highest quality locally, the README recommends --height 2048 --width 2048 --sampler-preset V4_QUALITY_48.
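The size limits above are easy to check before launching a long render. The following is a minimal sketch, not part of the Ideogram repo: `validate_resolution` and `snap_to_grid` are hypothetical helper names, and it assumes the 6:1 limit applies in both orientations (6:1 and 1:6).

```python
def validate_resolution(width: int, height: int) -> None:
    """Raise ValueError if (width, height) violates the limits quoted above."""
    for name, value in (("width", width), ("height", height)):
        if not 256 <= value <= 2048:
            raise ValueError(f"{name}={value} is outside the 256 to 2048 range")
        if value % 16 != 0:
            raise ValueError(f"{name}={value} is not a multiple of 16")
    # Assumption: the ratio is long side over short side, so 6:1 and 1:6 both pass.
    long_side, short_side = max(width, height), min(width, height)
    if long_side > 6 * short_side:
        raise ValueError(f"aspect ratio {width}:{height} exceeds 6:1")


def snap_to_grid(value: int) -> int:
    """Round to the nearest multiple of 16 (ties round up), then clamp to 256..2048."""
    snapped = (value + 8) // 16 * 16
    return max(256, min(2048, snapped))


if __name__ == "__main__":
    validate_resolution(2048, 2048)   # passes silently
    validate_resolution(1536, 256)    # exactly 6:1, passes
    print(snap_to_grid(1000))
```

Running the check up front turns an invalid `--height`/`--width` pair into an immediate error instead of a failure after the model has loaded.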

Layer-based roadmap

Most professional design work is not a single pixel layer. Ideogram 4.0 is the start of a generation stack:

Capability	Status
Transparent background cutouts	Available via Background Remover API
Editable text + movable image layers	Follow-up 4.0 release
Branded assets (typography, palette, logo fidelity)	Scheduled

Architecture: a specialized foundation, not a unified multimodal model

Ideogram 4.0 is a foundation model trained entirely from scratch — not a fine-tune or distillation of any existing checkpoint. Key architectural choices from the GitHub README:

Component	Detail
Backbone	34-layer single-stream Diffusion Transformer (DiT) — text and image tokens in one unified sequence
Text encoder	Qwen3-VL-8B-Instruct — hidden states from 13 intermediate layers concatenated
Training objective	Flow matching
Guidance	Dual-branch classifier-free guidance (independent positive/negative refinement)
Training data format	Structured JSON captions exclusively

The bet is explicit: unified multimodal models (GPT Image, Gemini) are strong generalists, but headline-grade typography, deterministic layout, and brand fidelity require a foundation engineered for design specifically. At 9.3B parameters, Ideogram 4.0 delivers the best text rendering of any open-weight release Ideogram benchmarked — ahead of Qwen-Image (20B), FLUX.2 [dev] (32B), and HunyuanImage 3.0 (80B MoE).

For a general primer on how diffusion image models work under the hood, see our diffusion explainer.

Benchmarks: where Ideogram 4.0 ranks

Benchmark	Result
Design Arena (overall)	Top open-weight model; trails only proprietary GPT and Gemini
Design Arena (open-weight only)	#1 by commanding margin
ContraLabs typography (1st-place win rate)	47.9%
ContraLabs "would use in client work"	3.55 / 5
LMArena text-to-image	Top open-weight lab, top-5 overall
7Bench (layout control)	Better than all closed-source models tested
Internal human-preference (design + photography)	#2 overall — behind only GPT Image 2 medium

The pattern is consistent: across these benchmarks, Ideogram 4.0 leads the open-weight field and sits near the frontier of design-oriented generation.

How to run Ideogram 4.0 via the API

The fastest path for production pipelines. No GPU required.

Step 1: Get an API key

Sign up at developer.ideogram.ai
Add payment method in the API Dashboard (billing is separate from the Ideogram app subscription)
Copy your Api-Key

Step 2: Generate your first image

Python:

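import requests

response = requests.post(
    "https://api.ideogram.ai/v1/ideogram-v4/generate",
    headers={"Api-Key": "<your-api-key>"},
    json={
        "text_prompt": "A poster for a summer design conference with bold sans-serif typography",
        "rendering_speed": "DEFAULT",
        "aspect_ratio": "ASPECT_16_9",
    },
    timeout=120,  # avoid hanging forever if the request stalls
)
response.raise_for_status()  # fail loudly on HTTP errors (bad key, rate limit)

# The first entry in "data" holds the generated image and its URL.
image = response.json()["data"][0]
print(image["url"])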

cURL:

curl -X POST https://api.ideogram.ai/v1/ideogram-v4/generate \
  -H "Api-Key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "text_prompt": "A poster for a summer design conference",
    "rendering_speed": "TURBO"
  }'

TypeScript:

const res = await fetch("https://api.ideogram.ai/v1/ideogram-v4/generate", {
  method: "POST",
  headers: {
    "Api-Key": "<your-api-key>",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    text_prompt: "A poster for a summer design conference",
    rendering_speed: "DEFAULT",
  }),
});

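// Fail loudly on HTTP errors before trying to read the body.
if (!res.ok) {
  throw new Error(`Ideogram API request failed: ${res.status}`);
}

const { data } = await res.json();
console.log(data[0].url);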

API pricing and speed tiers

Rendering speed	Price per image	Use case
TURBO	$0.03	Rapid prototyping, A/B testing
DEFAULT	$0.06	Daily production work
QUALITY	$0.10	Final delivery assets
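To budget a batch before sending it, the per-image prices in the table can be turned into a small estimator. This is a sketch only: `estimate_cost` is a hypothetical helper, and the prices are copied from the table above, so check them against the current pricing page.

```python
from decimal import Decimal

# Per-image prices in USD, copied from the table above.
PRICE_PER_IMAGE = {
    "TURBO": Decimal("0.03"),
    "DEFAULT": Decimal("0.06"),
    "QUALITY": Decimal("0.10"),
}


def estimate_cost(counts: dict[str, int]) -> Decimal:
    """Return the total USD cost for a mapping of rendering speed to image count."""
    total = Decimal("0")
    for speed, n in counts.items():
        if speed not in PRICE_PER_IMAGE:
            raise KeyError(f"unknown rendering speed: {speed!r}")
        if n < 0:
            raise ValueError(f"image count for {speed} must not be negative")
        total += PRICE_PER_IMAGE[speed] * n
    return total


if __name__ == "__main__":
    # 200 Turbo drafts plus 10 Quality finals.
    print(estimate_cost({"TURBO": 200, "QUALITY": 10}))
```

`Decimal` is used instead of `float` so that sums of cent-level prices stay exact.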

No subscription required. Default rate limit: 10 in-flight requests. For higher throughput, contact Ideogram.
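One way to stay under the in-flight limit is to cap concurrency on the client side. The sketch below uses an `asyncio.Semaphore`; `run_bounded` is a hypothetical helper, and `fake_generate` is a stand-in that only sleeps, so swap in a real async HTTP call to the generate endpoint.

```python
import asyncio

MAX_IN_FLIGHT = 10  # default limit quoted above


async def run_bounded(jobs, worker, limit: int = MAX_IN_FLIGHT):
    """Run worker(job) for every job with at most `limit` running at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job):
        # Each task waits here until one of the `limit` slots is free.
        async with semaphore:
            return await worker(job)

    # gather returns results in the same order as the input jobs.
    return await asyncio.gather(*(guarded(job) for job in jobs))


async def fake_generate(prompt: str) -> str:
    """Hypothetical stand-in for a real request to the generate endpoint."""
    await asyncio.sleep(0.01)
    return f"image for: {prompt}"


if __name__ == "__main__":
    prompts = [f"poster variant {i}" for i in range(25)]
    results = asyncio.run(run_bounded(prompts, fake_generate))
    print(len(results))
```

A client-side cap does not replace handling rate-limit responses from the server, but it keeps a large batch from tripping the limit in the first place.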

Important: Image URLs are ephemeral — download and store results in your own system immediately after generation.
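Because the URLs expire, a pipeline would typically copy each result to its own storage right away. The following standard-library sketch shows one way to do that; `save_image` is a hypothetical helper, and the destination path is up to you.

```python
import shutil
import urllib.request
from pathlib import Path


def save_image(url: str, dest: Path, timeout: float = 30.0) -> Path:
    """Stream the image at `url` to `dest` and return the path written."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Download to a temporary sibling file first.
    tmp = dest.with_name(dest.name + ".part")
    with urllib.request.urlopen(url, timeout=timeout) as response, open(tmp, "wb") as out:
        shutil.copyfileobj(response, out)
    # Rename into place so a failed download never leaves a truncated file at dest.
    tmp.replace(dest)
    return dest
```

Call it with the `url` field from the generate response, for example `save_image(image["url"], Path("renders/poster.png"))`.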

How to run Ideogram 4.0 locally (CLI)

Self-host when you need gradients, fine-tuning, or air-gapped inference.

Prerequisites

CUDA GPU with 24GB VRAM (NF4 checkpoint) or broader hardware (FP8)
Python 3.10+
Hugging Face account with accepted license gate

Step 1: Clone and install

git clone https://github.com/ideogram-oss/ideogram4.git
cd ideogram4
pip install .

For development, use editable mode: pip install -e .

Step 2: Accept the license gate and authenticate

Open ideogram-ai/ideogram-4-nf4 on Hugging Face
Click Agree and access repository
Authenticate:

hf auth login
# or: export HF_TOKEN="hf_..."

Without this step, downloads fail with 404 / GatedRepoError.

Step 3: Generate with plain-text prompt

Plain --prompt is expanded into structured JSON by magic-prompt — Ideogram's hosted LLM expansion, which is free and requires only an API key:

export IDEOGRAM_API_KEY="your_key_from_developer.ideogram.ai"

python run_inference.py \
  --prompt "a ginger cat wearing a tiny wizard hat reading a spellbook" \
  --output out.png \
  --quantization "nf4" \
  --magic-prompt-key "$IDEOGRAM_API_KEY"

Step 4: Max quality settings

For 2K output with the quality sampler preset:

python run_inference.py \
  --prompt "a campaign poster with clean sans-serif typography" \
  --output poster.png \
  --quantization "nf4" \
  --height 2048 \
  --width 2048 \
  --sampler-preset V4_QUALITY_48 \
  --magic-prompt-key "$IDEOGRAM_API_KEY"

Optional: safety screening with Hive

For production deployments, enable prompt and output moderation via Hive:

export HIVE_TEXT_MODERATION_KEY="..."
export HIVE_VISUAL_MODERATION_KEY="..."

python run_inference.py \
  --prompt "an isometric illustration of a tiny city floating in the clouds" \
  --output out.png \
  --quantization "nf4" \
  --magic-prompt-key "$IDEOGRAM_API_KEY" \
  --hive-text-key "$HIVE_TEXT_MODERATION_KEY" \
  --hive-visual-key "$HIVE_VISUAL_MODERATION_KEY"

Model checkpoints

Checkpoint	Quantization	Hardware	Diffusers
ideogram-4-nf4	NF4	CUDA (24GB)	Yes
ideogram-4-fp8	FP8	All	No

See docs/inference.md for sampler presets, parameter reference, and optimization tips.

JSON prompting: the format that matters

Ideogram 4.0 was trained exclusively on structured JSON captions. Plain text works — but JSON is the native language.

Why JSON-only training?

From the prompting guide:

We train exclusively on JSON so that training and inference share a single, common prompt format. The training captions themselves are deliberately extremely descriptive: each JSON exhaustively describes everything in the image.

Plain-text prompts create a train/inference mismatch. JSON mirrors the training distribution and unlocks full model quality.

The caption schema (three top-level fields)

{
  "high_level_description": "A clean business card layout for a tech startup.",
  "style_description": {
    "aesthetics": "minimal, professional, geometric",
    "lighting": "even, diffuse studio lighting",
    "medium": "graphic_design",
    "art_style": "flat vector design, generous whitespace, sans-serif typography",
    "color_palette": ["#FFFFFF", "#F0F0F0", "#333333", "#0066FF", "#00CC88"]
  },
  "compositional_deconstruction": {
    "background": "A solid off-white card surface with subtle paper texture.",
    "elements": [
      {
        "type": "text",
        "text": "ACME TECH",
        "desc": "Bold dark grey sans-serif company name across the upper third."
      },
      {
        "type": "text",
        "text": "[email protected]",
        "desc": "Small blue sans-serif contact email near the bottom."
      }
    ]
  }
}

Field	Required	Purpose
`high_level_description`	Strongly recommended	One- or two-sentence summary
`style_description`	Optional	Aesthetics, lighting, medium, color palette
`compositional_deconstruction`	Required	Background + spatial elements

Element types: "obj" for objects/subjects, "text" for in-image text (include a text field with the literal string to render).
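A quick structural check can catch malformed captions before they reach the model. This is a minimal sketch based only on the fields described above: `validate_caption` is a hypothetical helper, and it assumes "obj" and "text" are the only element types, which the prompting guide may extend.

```python
def validate_caption(caption: dict) -> list[str]:
    """Return a list of problems; an empty list means the caption looks well-formed."""
    problems = []
    if "high_level_description" not in caption:
        problems.append("high_level_description is missing (strongly recommended)")

    composition = caption.get("compositional_deconstruction")
    if not isinstance(composition, dict):
        problems.append("compositional_deconstruction is required")
        return problems

    for i, element in enumerate(composition.get("elements", [])):
        kind = element.get("type")
        # Assumption: only the two element types named above are valid.
        if kind not in ("obj", "text"):
            problems.append(f"elements[{i}]: unknown type {kind!r}")
        # Text elements must carry the literal string to render.
        if kind == "text" and not element.get("text"):
            problems.append(f"elements[{i}]: text element needs a 'text' string")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to fix a hand-written caption in a single pass.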

Magic-prompt: JSON without writing JSON

Don't want to hand-write captions? Magic-prompt expands plain text into full structured JSON before generation.

Three backends ship in the repo:

Config	Registry key	Backend
`Ideogram4MagicPromptV1`	`ideogram-4-v1`	Ideogram hosted API (free)
`ClaudeOpusMagicPromptV1`	`claude-opus-v1`	OpenRouter
`ClaudeSonnetMagicPromptV1`	`claude-sonnet-v1`	OpenRouter

The hosted ideogram-4-v1 backend is the default in run_inference.py and only needs IDEOGRAM_API_KEY. The magic-prompt system prompts are open source in src/ideogram4/magic_prompt_system_prompts/.

Via the API, two endpoints scaffold the JSON workflow:

Endpoint	Purpose
`POST /v1/ideogram-v4/magic-prompt`	Convert plain text → structured `json_prompt`
`POST /v1/ideogram-v4/describe`	Upload a reference image → structured JSON prompt (preserves bboxes optionally)

Practical workflow: Start with text_prompt for fast ideation. Migrate to json_prompt once layout precision, brand hex colors, or multi-line typography matter.
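The switch from `text_prompt` to `json_prompt` can be kept in one place with a small payload builder. This is a sketch under assumptions: `build_generate_payload` is a hypothetical helper, and it assumes the two prompt fields are mutually exclusive and that `json_prompt` is sent as a nested JSON object. Confirm both against the API reference.

```python
RENDERING_SPEEDS = {"TURBO", "DEFAULT", "QUALITY"}


def build_generate_payload(*, text_prompt=None, json_prompt=None, rendering_speed="DEFAULT"):
    """Build the request body for the generate endpoint from exactly one prompt kind."""
    # Assumption: the API expects one prompt field, not both.
    if (text_prompt is None) == (json_prompt is None):
        raise ValueError("pass exactly one of text_prompt or json_prompt")
    if rendering_speed not in RENDERING_SPEEDS:
        raise ValueError(f"unknown rendering speed: {rendering_speed!r}")

    payload = {"rendering_speed": rendering_speed}
    if text_prompt is not None:
        payload["text_prompt"] = text_prompt
    else:
        # Assumption: the structured caption is passed as a nested object.
        payload["json_prompt"] = json_prompt
    return payload
```

During ideation you might call `build_generate_payload(text_prompt="a retro travel poster", rendering_speed="TURBO")`, then pass a full caption dict as `json_prompt` once the layout is settled.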

Bounding-box layout and color palettes

Spatial control with bbox

Each element can include a bounding box in normalized 0–1000 coordinates (origin top-left):

{
  "type": "text",
  "bbox": [100, 200, 300, 800],
  "text": "SUMMER SALE",
  "desc": "Large bold red headline across the upper center of the poster."
}

Format: [y_min, x_min, y_max, x_max]. This is native to the model — no ControlNet pipeline required.
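
To make the coordinate convention concrete, the sketch below converts a normalized bbox into pixel coordinates. The helper `bbox_to_pixels` is hypothetical, and the 2048×2048 canvas is an assumed size for illustration; the article does not state exact output dimensions.

```python
def bbox_to_pixels(bbox, width, height):
    """Convert [y_min, x_min, y_max, x_max] in 0-1000 units to pixels.

    Returns (left, top, right, bottom), origin at the top-left corner.
    Hypothetical helper for illustration only.
    """
    y_min, x_min, y_max, x_max = bbox
    if not all(0 <= value <= 1000 for value in bbox):
        raise ValueError("bbox values must lie in the 0-1000 range")
    if y_min >= y_max or x_min >= x_max:
        raise ValueError("bbox must satisfy y_min < y_max and x_min < x_max")
    # Note the order swap: the model's format is y-first,
    # while most imaging libraries expect x-first.
    left = round(x_min * width / 1000)
    top = round(y_min * height / 1000)
    right = round(x_max * width / 1000)
    bottom = round(y_max * height / 1000)
    return left, top, right, bottom


# The "SUMMER SALE" headline box on an assumed 2048x2048 canvas.
print(bbox_to_pixels([100, 200, 300, 800], width=2048, height=2048))
# -> (410, 205, 1638, 614)
```

The y-first ordering is the easy thing to get wrong: the example box spans the middle 60% of the width but only a 20% band of the height.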

Color palette conditioning

Steer dominant colors with hex codes in style_description.color_palette:

"color_palette": ["#1B1B2F", "#162447", "#1F4068", "#E43F5A", "#F5F5F5"]

Rules from the prompting guide:

Up to 16 colors in style_description.color_palette
Up to 5 colors per element
Uppercase hex only — #RRGGBB form (not #fff or lowercase)
Include both highlight and shadow colors for controlled lighting
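
The first three rules are easy to check mechanically before sending a prompt. The sketch below is a hypothetical validator, not part of the Ideogram tooling; the limits of 16 and 5 and the uppercase `#RRGGBB` requirement come from the rules above.

```python
import re

# Uppercase six-digit hex only, per the rules above.
HEX_COLOR = re.compile(r"#[0-9A-F]{6}")
MAX_PALETTE_COLORS = 16  # limit for style_description.color_palette
MAX_ELEMENT_COLORS = 5   # limit per element


def validate_palette(colors, limit=MAX_PALETTE_COLORS):
    """Return a list of human-readable problems; empty means valid."""
    problems = []
    if len(colors) > limit:
        problems.append(f"{len(colors)} colors exceeds the limit of {limit}")
    for color in colors:
        if not HEX_COLOR.fullmatch(color):
            problems.append(f"{color!r} is not uppercase #RRGGBB")
    return problems


palette = ["#1B1B2F", "#162447", "#1F4068", "#E43F5A", "#F5F5F5"]
print(validate_palette(palette))            # no problems for the palette above
print(validate_palette(["#fff", "#e43f5a"]))  # both entries are rejected
```

Pass `limit=MAX_ELEMENT_COLORS` when checking the colors attached to a single element.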

On 7Bench, a layout-control benchmark, Ideogram 4.0 scored significantly better than all closed-source models tested. Native bbox and palette conditioning is the differentiator.

API endpoints beyond generate

The Ideogram API is not just text-to-image. Full capability list from ideogram.ai/api-learn:

Capability	Endpoint family	Notes
Generate	`/v1/ideogram-v4/generate`	Text or JSON prompt → image
Transparent backgrounds	v4 endpoints	Native alpha cutouts
Edit with prompt	v3 endpoints	Describe changes in plain language
Remix	v3 endpoints	Reimagine with `image_weight` control
Reframe	v3 endpoints	Extend to new aspect ratio
Remove background	v4 endpoints	Clean cutout in one call
Layerized text	v3 endpoints	Pull editable text layers
Custom models	Training + generate	Fine-tune on brand assets
Upscale	Upscale endpoint	Raise resolution for delivery
Magic-prompt	`/v1/ideogram-v4/magic-prompt`	Plain text → JSON caption
Describe	`/v1/ideogram-v4/describe`	Image → JSON caption

Ideogram 4.0 also supports MCP for agent workflows — useful if you're wiring image generation into coding agents or design automation pipelines. For agent harness concepts, see our Agent Harness guide.

When to use API vs local vs the app

Surface	Best for	Trade-off
Ideogram app	Hands-on creation, iteration, editing	Subscription credits; no programmatic access
API	Production pipelines, product integration, agents	Per-image cost; ephemeral URLs
Local (CLI)	Fine-tuning, research, air-gapped, unlimited generation	24GB GPU; magic-prompt still needs API key (free)
ComfyUI	Node-based visual workflows	Requires ComfyUI 0.24.0+ and `image_ideogram4_t2i.json` template

For most developers building image generation into a product, start with the API (Turbo at $0.03/image for prototyping). Move to local inference when you need custom fine-tunes, synthetic data pipelines, or on-premise deployment.
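
For a back-of-the-envelope budget, the per-image prices quoted in this article (Turbo $0.03, Default $0.06, Quality $0.10) can be dropped into a small estimator. This is an illustrative sketch with hypothetical names; it ignores retries, failed generations, and any future price changes.

```python
from decimal import Decimal

# Per-image prices in cents, as quoted in this article.
PRICE_CENTS = {"turbo": 3, "default": 6, "quality": 10}


def estimate_cost_usd(num_images, tier="turbo"):
    """Estimate API spend in USD for a batch of images at one tier."""
    if tier not in PRICE_CENTS:
        raise ValueError(f"unknown tier: {tier!r}")
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    # Integer cents avoid floating-point drift on large batches.
    return Decimal(PRICE_CENTS[tier] * num_images) / Decimal(100)


for tier in PRICE_CENTS:
    print(f"1,000 images at {tier}: ${estimate_cost_usd(1000, tier)}")
```

At these rates, 1,000 prototype images cost $30 on Turbo and $100 on Quality, which is a useful baseline when weighing the API against the cost of a 24GB GPU.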

For comparison with other 2026 image models, see our posts on ChatGPT Images 2.0 / gpt-image-2 and the diffusion fundamentals guide.

Enterprise and commercial licensing

Ideogram licenses the weights for commercial use separately from the open release. Key points from the press release:

Fine-tuning on brand data with weights, training data, and inference staying on customer infrastructure
Ideogram is headquartered in Toronto and San Francisco, and the press release states that the weights carry no embedded political alignment
Commercial license tiers at ideogram.ai/licensing
Enterprise inquiries → [email protected]

Note that the NF4/FP8 checkpoints on Hugging Face are released under a non-commercial license. For production use, go through the API or an enterprise license.

Summary

Ideogram 4.0 is the most significant open-weight image release of 2026 for anyone who ships visual assets — not hobbyists generating cats, but teams that need readable type, controlled layout, and 2K fidelity.

Three things to remember:

JSON is the native prompt format. Use magic-prompt for casual input; write JSON when layout and typography matter.
Three ways in: API for products, CLI for research/self-hosting, app for hands-on design.
It closes the open-vs-closed gap on design benchmarks while staying at 9.3B parameters — a fraction of FLUX.2 [dev]'s 32B.
