GPT Image model prompting

GPT Image prompting playbook

Operational patterns summarized from OpenAI’s GPT Image production cookbook guidance (gpt-image-2 era). Validate every change against your own evaluation harness before shipping.

Fidelity & sizing guardrails

Default new production flows to gpt-image-2 when your stack exposes it, especially for typography-heavy comps, composites, sensitive likeness edits, and shots where a rerun hurts more than the incremental spend. Canvases larger than roughly 2560×1440 can drift stylistically between runs; treat oversized outputs as exploratory until your regression checks stabilize.
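The sizing guardrail can be enforced before a request is sent. The sketch below is one possible reading, not a documented rule: it assumes "past ~2560×1440" means total pixel count, and the names `EXPLORATORY_PIXEL_THRESHOLD` and `classify_canvas` are hypothetical.

```python
# Assumption: the ~2560x1440 guideline is interpreted as a total pixel budget,
# not as separate per-side limits. Adjust if your evaluations say otherwise.
EXPLORATORY_PIXEL_THRESHOLD = 2560 * 1440


def classify_canvas(width: int, height: int) -> str:
    """Label a requested output size as 'production' or 'exploratory'."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width * height > EXPLORATORY_PIXEL_THRESHOLD:
        # Oversized canvases are routed to review instead of being shipped.
        return "exploratory"
    return "production"
```

A pipeline could call `classify_canvas` on every job and send "exploratory" results to human review rather than straight to publication.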

Prompt scaffolding

Establish the setting or background plate before the hero subject; this anchors the scene.
Name a few concrete materials and optics (paper grain, satin weave, sapphire edge highlights) rather than scattering adjectives.
Declare the deliverable type (UI mock sheet, billboard, infographic) so the level of polish matches the target.
Freeze invariants explicitly when iterating on edits: typography blocks, silhouette geometry, locked brand glyphs.
Quote literal in-image strings and give contrast and placement cues; raise the quality setting as text density rises.
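The scaffold above can be kept consistent by assembling prompts from named fields instead of free text. This is a minimal sketch under stated assumptions: the class `PromptScaffold`, its field names, the output wording, and the cap of three material cues are all hypothetical illustrations, not part of any OpenAI API.

```python
from dataclasses import dataclass, field

MAX_MATERIAL_CUES = 3  # hypothetical cap that enforces "sparingly"


@dataclass
class PromptScaffold:
    setting: str
    subject: str
    deliverable: str
    materials: list[str] = field(default_factory=list)
    invariants: list[str] = field(default_factory=list)
    # Maps each literal in-image string to its contrast/placement cue.
    literal_text: dict[str, str] = field(default_factory=dict)

    def render(self) -> str:
        if len(self.materials) > MAX_MATERIAL_CUES:
            raise ValueError("too many material cues; keep them sparse")
        # Setting comes first, then the hero subject, matching the list order.
        lines = [f"Setting: {self.setting}.", f"Subject: {self.subject}."]
        if self.materials:
            lines.append("Materials and optics: " + ", ".join(self.materials) + ".")
        lines.append(f"Deliverable: {self.deliverable}.")
        if self.invariants:
            lines.append("Keep unchanged: " + ", ".join(self.invariants) + ".")
        for text, cue in self.literal_text.items():
            # Literal strings are quoted so they are rendered verbatim.
            lines.append(f'Render the exact text "{text}" ({cue}).')
        return "\n".join(lines)
```

The rendered string is an ordinary prompt; pass it to whatever image generation call your stack uses.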

Generation archetypes you will reuse

Infographics & classroom diagrams: specify audience reading level, maximum label count, arrow grammar, and forbidden embellishments (mascots, comic halftone, etc.) when clarity matters.
Photoreal lifestyle & product: say “photorealistic” outright; pair it with believable imperfections (wear, dust, edge flecks) to avoid wax-skin failure modes.
UI mockups: describe shipped affordances, grid rhythm, and a generic device frame; ban “fantasy HUD” language if you need App Store plausibility.
Localized image translation: instruct “translate only copy; preserve composition, weight, line breaks except unavoidable reflow.”
Marketing comps & sequential art: treat prompts like creative briefs that state audience, cultural tone, palette bias, and the required tagline string exactly once.
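As one illustration of these archetypes, the infographic constraints can be checked in code before the prompt is built. This is a rough sketch; the function `infographic_prompt`, its parameters, and the sentence templates are assumptions made for the example.

```python
def infographic_prompt(
    topic: str,
    reading_level: str,
    labels: list[str],
    max_labels: int,
    arrow_rule: str,
    forbidden: list[str],
) -> str:
    """Build an infographic prompt and reject briefs that exceed the label cap."""
    if len(labels) > max_labels:
        raise ValueError(f"{len(labels)} labels exceeds the cap of {max_labels}")
    # Labels are quoted so each one is treated as literal in-image text.
    label_list = "; ".join(f'"{label}"' for label in labels)
    parts = [
        f"Infographic explaining {topic} for a {reading_level} reading level.",
        f"Use only these labels (maximum {max_labels}): {label_list}.",
        f"Arrows: {arrow_rule}.",
    ]
    if forbidden:
        parts.append("Do not include: " + ", ".join(forbidden) + ".")
    return " ".join(parts)
```

Failing fast on the label count keeps an overcrowded brief from reaching the model, where it would surface as an illegible diagram.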

Edit / composite hygiene

For image → image flows, pair every destructive instruction with a preservation manifest (“do not alter camera yaw, skin tone mapping, label kerning”). When compositing multiple references, index them: “Image 1 = environment plate; Image 2 = subject insert; lock lighting direction from Image 1.” Iterate with single-axis tweaks after a strong base pass.
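The pairing of a destructive instruction with a preservation manifest, plus indexed references, can be sketched as a small helper. The function `build_edit_prompt` and its output wording are hypothetical; only the "Image N = role" indexing and the "do not alter" manifest come from the guidance above.

```python
from typing import Sequence


def build_edit_prompt(
    change: str,
    preserve: Sequence[str],
    reference_roles: Sequence[str] = (),
) -> str:
    """Compose an edit prompt that always carries a preservation manifest."""
    if not preserve:
        raise ValueError("every edit instruction needs a preservation manifest")
    lines = []
    # Index references in upload order: Image 1, Image 2, ...
    for index, role in enumerate(reference_roles, start=1):
        lines.append(f"Image {index} = {role}.")
    lines.append(f"Change: {change}.")
    lines.append("Do not alter: " + ", ".join(preserve) + ".")
    return "\n".join(lines)
```

Calling it with a single `change` per iteration keeps each pass to one axis, which makes regressions easier to attribute.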

Legal & brand safety

Ask for original marks, refuse counterfeit trademarks, and keep sensitive likeness work human-in-the-loop. The model can infer real historical contexts from its own knowledge, so verify facts before publishing documentary-styled frames.
