For almost 2,000 years, the carbonized library of Herculaneum kept a cruel bargain: its scrolls survived the eruption of Mount Vesuvius in 79 AD, but only by becoming too fragile to open. To read one was to destroy it. Hundreds of rolls stayed sealed — preserved yet unreachable.
On June 25, 2026, that bargain broke.
The Vesuvius Challenge announced that it has virtually unwrapped and read all of PHerc. 1667 (the scroll the community knows as Scroll 4) without ever touching its pages. It is the first Herculaneum papyrus read in full, end to end, and released for sustained scholarly study. Alongside it: independent confirmation of the 2023 Grand Prize reading on Scroll 1, and a title recovered on a third roll identifying Philodemus, On Gods, Book 8.
The preprint is live: Complete virtual unwrapping and reading of a rolled Herculaneum papyrus (PDF). Data at scrollprize.org/data. Code on GitHub.
This is one of the clearest examples of machine learning used for discovery rather than engagement optimization — the opposite of reward hacking on coding benchmarks. Here, the model's job is to find ink humans literally cannot see in burned papyrus. When it works, two-thousand-year-old thought re-enters the conversation.
TL;DR
| Topic | Detail |
|---|---|
| Date | June 25, 2026 |
| Scroll read in full | PHerc. 1667 (Scroll 4) — ~1.4 m, ~22 columns of Greek |
| Method | ESRF synchrotron X-ray → virtual unroll → ML ink detection → papyrologist review |
| Text | Stoic ethics treatise, 2nd century BC; names Aristocreon (Chrysippus's nephew) |
| Also announced | Scroll 1 ink confirmed in 3D at higher resolution; PHerc. 139 title = Philodemus, On Gods, Book 8 |
| Open release | CC-licensed data, GitHub code, preprint PDF |
| What's left | ~600 surviving rolls; ~30 scanned; hundreds still sealed; most of Herculaneum unexcavated |
| Community | Vesuvius Challenge — donation-funded; many core team members were former prize winners |
From sealed lump to readable book
PHerc. 1667 began as a blackened, rolled mass of carbonized papyrus — what remains after 19th-century and 1969/1980s physical unrolling attempts destroyed its outer layers. Only a compact inner core survived (~8 cm of an original 19–24 cm height).
The team never unrolled it physically. The pipeline:
Carbonized roll → synchrotron X-ray CT → 3D volume
→ segment & trace wound sheet → virtual flatten
→ ML + physically based rendering → ink signal
→ papyrologist transcription → published Greek text
Scans used phase-contrast X-ray microtomography on the BM18 beamline at the European Synchrotron Radiation Facility (ESRF) in Grenoble — resolving the wafer-thin, densely packed layers inside a Herculaneum roll. Work was done with the National Library of Naples, which holds the papyri.
The result: roughly 1.4 metres of writing surface and about twenty-two columns of ancient Greek — the complete preserved text of this surviving portion, not isolated words or patches.
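To make the "virtual flatten" step concrete, here is a minimal sketch of the idea in two dimensions: walk along the wound sheet in one cross-section and read off the pixel under each step, which yields one row of the flattened page. It assumes the sheet follows an ideal Archimedean spiral, which real carbonized rolls do not; the function names and spiral parameters are hypothetical and are not taken from the Vesuvius Challenge code.

```python
import math


def spiral_points(a, b, turns, step=1.0):
    """Points along r = a + b*theta, spaced roughly `step` apart in arc length."""
    points = []
    theta = 0.0
    theta_max = 2.0 * math.pi * turns
    while theta <= theta_max:
        r = a + b * theta
        points.append((r * math.cos(theta), r * math.sin(theta)))
        # Arc-length element of the spiral: ds = sqrt(r^2 + b^2) * dtheta.
        theta += step / math.sqrt(r * r + b * b)
    return points


def flatten_slice(slice_2d, center, a, b, turns, step=1.0):
    """Sample one cross-section along the spiral (nearest pixel).

    Returns one row of the flattened sheet; stacking the rows from
    successive slices would give a 2D image of the page.
    """
    cx, cy = center
    height, width = len(slice_2d), len(slice_2d[0])
    row = []
    for x, y in spiral_points(a, b, turns, step):
        col, line = round(cx + x), round(cy + y)
        inside = 0 <= line < height and 0 <= col < width
        row.append(slice_2d[line][col] if inside else 0)
    return row


# Toy usage: a blank 64x64 slice with one bright pixel placed on the spiral.
size, center = 64, (32, 32)
a, b, turns = 3.0, 1.0, 3  # hypothetical spiral parameters
slice_2d = [[0] * size for _ in range(size)]
k = 40  # index of the spiral sample we mark
x, y = spiral_points(a, b, turns)[k]
slice_2d[round(center[1] + y)][round(center[0] + x)] = 9
row = flatten_slice(slice_2d, center, a, b, turns)
```

In this toy, the marked pixel reappears at position `k` of the flattened row. The hard part in practice is the step this sketch assumes away: finding where the warped, damaged sheet actually runs through the volume.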
What the scroll says (first passages in 2,000 years)
The recovered text is a philosophical treatise on ethics. Evidence points to a Stoic work: it turns on human nature, impulse, and moral progress, and the final preserved column names Aristocreon — nephew and disciple of the great Stoic Chrysippus — placing the work in a Stoic context and dating it to the 2nd century BC.
The papyrus is damaged; many readings are fragmentary. Even so, several passages are clear for the first time since antiquity. From the Vesuvius Challenge announcement, translated from the Greek:
"…we will inquire into something, but we will not grasp it, if in some way we depart from ourselves and from our own nature…"
"Having…strained ourselves to the utmost through research and learning…possessing the same practical wisdom…"
"…such being the goods for us, even from the opposite evils there will be neither anything good — let alone beautiful — nor anything bad — let alone ugly — nor happiness…"
The full column-by-column transcription is in the preprint appendix. Columns 1–4 are lost and the margins of others are incomplete, so "entire" here means the full surviving continuous text, not a perfectly preserved roll. Once the text is transcribed, it poses the same downstream problem as modern document parsing for RAG: turning fragile source material into structured, indexable text without inventing content.
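One way to keep "without inventing content" honest in a data model is to record, for every letter position, whether it was read clearly, read with doubt, or lost. The sketch below is an illustration only, not the preprint's format: the `Glyph` and `Status` types are hypothetical, and the rendering borrows two common Leiden-style conventions (a dot under an uncertain letter, bracketed dots for a gap of known length).

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    CLEAR = "clear"          # letter read with confidence
    UNCERTAIN = "uncertain"  # traces fit the letter but are ambiguous
    LOST = "lost"            # nothing recoverable; no letter is supplied


@dataclass(frozen=True)
class Glyph:
    char: str       # the letter as read, or "" when lost
    status: Status


def render(glyphs):
    """Leiden-style rendering: underdot for uncertain letters, [..] for gaps."""
    out, gap = [], 0
    for g in glyphs:
        if g.status is Status.LOST:
            gap += 1
            continue
        if gap:
            out.append("[" + "." * gap + "]")
            gap = 0
        mark = "\u0323" if g.status is Status.UNCERTAIN else ""
        out.append(g.char + mark)
    if gap:
        out.append("[" + "." * gap + "]")
    return "".join(out)


def read_fraction(glyphs):
    """Share of letter positions that carry any reading at all."""
    if not glyphs:
        return 0.0
    read = sum(1 for g in glyphs if g.status is not Status.LOST)
    return read / len(glyphs)


# Made-up example word, not a reading from the scroll.
word = [
    Glyph("λ", Status.CLEAR),
    Glyph("ο", Status.UNCERTAIN),
    Glyph("", Status.LOST),
    Glyph("", Status.LOST),
    Glyph("ς", Status.CLEAR),
]
print(render(word))         # λο̣[..]ς
print(read_fraction(word))  # 0.6
```

Because lost positions store no character, an index built from these records cannot silently contain letters that nobody read.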
Three scrolls, three milestones
The June 25 announcement is not a single-scroll story. Three results landed together:
PHerc. 1667 (Scroll 4) — read in full
First rolled Herculaneum scroll digitally unrolled and read continuously — the headline result.
PHerc. Paris 4 (Scroll 1) — ink visible in 3D
A higher-resolution imaging technique makes ink directly visible inside the scroll in the three-dimensional X-ray data — for the first time at this fidelity. Segmented in 3D and projected onto the unwrapped page, that ink matches the 2023 Grand Prize text one-to-one — independent confirmation, from better data, that the earlier reading was real.
The 2023 Grand Prize was a breakthrough moment for the Challenge; June 25 closes the loop with physical evidence in the volume itself, not just flattened surfaces.
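The announcement describes the match as one-to-one but, as summarized here, does not say how agreement was scored. As an illustration of how two ink maps could be compared, the sketch below computes intersection-over-union (IoU) between two binary masks. IoU is a swapped-in, generic metric, and the tiny masks are made up.

```python
def iou(mask_a, mask_b):
    """Intersection-over-union of two equally sized binary masks (rows of 0/1)."""
    same_shape = len(mask_a) == len(mask_b) and all(
        len(ra) == len(rb) for ra, rb in zip(mask_a, mask_b)
    )
    if not same_shape:
        raise ValueError("masks must have the same shape")
    inter = union = 0
    for ra, rb in zip(mask_a, mask_b):
        for a, b in zip(ra, rb):
            inter += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    # Two empty masks agree trivially.
    return inter / union if union else 1.0


# Made-up 2x3 masks: ink projected from 3D vs. an earlier 2D prediction.
projected = [[0, 1, 1],
             [0, 1, 0]]
earlier = [[0, 1, 1],
           [0, 0, 0]]
score = iou(projected, earlier)  # 2 shared ink pixels out of 3 marked overall
```

A score near 1.0 means the two sources mark almost the same pixels as ink. A real comparison would also need the two images registered to the same coordinates first.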
PHerc. 139 — title and author before body text
Enhancing the ink signal recovered the scroll's title and author attribution: Philodemus, On Gods, Book 8 — an Epicurean treatise. The Villa dei Papiri library is famously rich in Philodemus; knowing a closed scroll's title tells scholars what a roll contains before a single column of body text is studied.
Early assumptions that the library might be mostly Epicurean are already shifting: PHerc. 1667 is Stoic, not Epicurean, a finding that historians on Hacker News noted could change expectations about the collection's breadth.
How machine learning fits — and where it stops
Most Herculaneum ink is carbon-based: it leaves a subtle texture on charred papyrus, recoverable with physically based rendering and ML classifiers trained on manually labeled segments. It is not like iron-gall ink, which stands out brightly in X-ray, as in the Challenge's campfire scroll experiment.
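As a toy version of "a classifier trained on manually labeled segments", the sketch below fits a nearest-centroid classifier on two hand-picked texture features (mean and variance of a patch). This is deliberately far simpler than the project's models and is not their method; the feature choice, labels, and intensity values are all made up for illustration.

```python
from statistics import mean, pvariance


def features(patch):
    """Two simple texture features for a flat list of voxel intensities."""
    return (mean(patch), pvariance(patch))


def fit_centroids(labeled_patches):
    """Average the feature vectors per label; input is [(patch, label), ...]."""
    grouped = {}
    for patch, label in labeled_patches:
        grouped.setdefault(label, []).append(features(patch))
    return {
        label: (mean([f[0] for f in feats]), mean([f[1] for f in feats]))
        for label, feats in grouped.items()
    }


def predict(centroids, patch):
    """Return the label whose centroid is closest in (unscaled) feature space."""
    f = features(patch)

    def squared_distance(label):
        c = centroids[label]
        return (f[0] - c[0]) ** 2 + (f[1] - c[1]) ** 2

    return min(centroids, key=squared_distance)


# Made-up training patches: "ink" patches are rougher (higher variance).
training = [
    ([10, 10, 11, 10], "no_ink"),
    ([10, 11, 10, 10], "no_ink"),
    ([10, 14, 9, 15], "ink"),
    ([9, 15, 10, 14], "ink"),
]
centroids = fit_centroids(training)
print(predict(centroids, [10, 13, 9, 14]))   # ink
print(predict(centroids, [11, 11, 10, 11]))  # no_ink
```

Even in this toy the classifier is only as good as its labels: relabel one training patch and the centroids move.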
The hard parts, in order:
| Stage | Difficulty |
|---|---|
| X-ray scanning | Expensive; ESRF beamtime must be booked and paid for; the largest scrolls produce hundreds of terabytes (Paris 3 ≈ 260 TB reconstructed) |
| Virtual segmentation & unwrapping | Tedious at scale; automatable on easy sections, brutal on warped or damaged layers |
| Ink detection | ML helps enormously — but ground truth comes from human annotators labeling ink on unwrapped surfaces |
| Transcription | Papyrologists, not models — with fragmentary surfaces and lacunae |
Team members emphasized on HN that training data quality matters more than model architecture — the same data-over-architecture pattern that shows up in coding eval design and specification gaming, but here in service of archaeology.
Hallucination risk is real but bounded: ink detection may extend strokes or fill characters locally; it does not invent grammatical Greek paragraphs. Papyrologists review every reading — the same human verification layer arXiv now demands for AI-assisted research claims. The open data release lets outsiders audit contested characters.
Models and weights are public on Hugging Face (scrollprize). Training volumes are on AWS Open Data Registry.
Open science and the community that built this
Virtual unwrapping was pioneered at EduceLab by Professor Brent Seales. In 2023, Seales opened the imaging stack to the Vesuvius Challenge — a public, donation-funded effort co-founded with Nat Friedman and Daniel Gross to read the scrolls in the open.
The first letters and the 2023 Grand Prize went to contestants worldwide. What is less widely known: much of the core research team arrived as contestants, won prizes for breakthroughs, then joined the team that read an entire scroll. That pipeline (open competition → hire the winners → ship science) is rare in academia and industry alike.
Apple's statement on memory-driven price hikes and frontier labs gating model access dominate tech news. This project is the counterexample: CC-licensed tomography, public code, crowdfunded beamtime, and a Discord full of people segmenting papyrus for the joy of it.
Why this library matters
The Villa dei Papiri at Herculaneum was a private philosophical library, not Rome's main public collection. What survived is already skewed toward certain schools (much Philodemus, much Epicurean material in what was read before June 25). Only a fraction of ancient Greek and Latin literature survived the medieval copy bottleneck; commonly cited estimates put the surviving share at ~1% of classical works.
A sealed library that cannot be opened is a time capsule with no key — until now. If the method scales:
- Lost treatises may reappear — not just philosophy but potentially poetry, history, science
- Historiography gains independent witnesses to events known from a single surviving source (similar in spirit to how long-read genome sequencing finds structural variants short reads miss)
- Linguistics gets primary texts, not medieval copies filtered by monkish taste — material you could eventually route through RAG or agentic retrieval pipelines for scholarly search
Caution is warranted: many scrolls are badly damaged; some layers may be truly destroyed, not merely unread with current methods. PHerc. 1667 was smaller and more readable than average. Scaling is months per scroll, not days — scanning takes days; human refinement takes months. Automation helps; it has not eliminated the bottleneck.
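A back-of-envelope calculation shows why "months per scroll" matters at library scale. The ~600 rolls figure comes from the summary table above; the months-per-scroll and parallel-team values below are hypothetical placeholders chosen only to show the arithmetic, not estimates from the team.

```python
def years_to_read(scrolls, months_per_scroll, parallel_teams):
    """Rough total duration if each team finishes one scroll at a time."""
    if parallel_teams <= 0:
        raise ValueError("parallel_teams must be positive")
    return scrolls * months_per_scroll / (12 * parallel_teams)


# Hypothetical inputs: 3 months of human refinement per scroll, 5 teams.
print(years_to_read(scrolls=600, months_per_scroll=3, parallel_teams=5))  # 30.0
```

Under these made-up inputs, halving the human refinement time or doubling the number of teams each halves the total, which is why automating segmentation and review is the lever that matters.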
What's next
PHerc. 1667 is one scroll. Hundreds remain sealed. A $1M Vesuvius Challenge Grand Prize is expected in coming days (team announcement). Separately, researchers reported 140 new columns of text unwrapped in PHerc. Paris 4 in a recent working session — the pipeline is still accelerating.
If you want to participate:
- Read the science: preprint PDF
- Get data and code: scrollprize.org/data, GitHub
- Join the community: scrollprize.org/get_started, Discord, open prize tracks
Only ~20% of Herculaneum has been excavated. There may be more scrolls still underground — and now, for the first time, a non-destructive reason to recover them.
The human scale (why people care)
On Hacker News, a team member (verditelabs) described swapping scrolls on the ESRF beamline — artifacts once presented as a diplomatic gift to Napoleon and Josephine — as the most stressful handling of a priceless object they will ever perform. Another comment thread reflected on Aristocreon in ~200 BC, writing on papyrus with no imaginable path to synchrotron tomography and global instant publication — yet the words survived anyway, through volcano and carbonization, until sand-adjacent silicon and lightning-adjacent X-rays read them again.
That is not AI hype. That is entropy temporarily reversed — and a reminder that some ML work is worth doing even when it will never optimize ad click-through.
Related reading
| Post | Connection |
|---|---|
| Cursor SWE-bench reward hacking | Contrast: ML gaming benchmarks vs ML recovering lost text |
| Apple global price hike | Closed, margin-driven hardware vs open CC-licensed tomography |
| MinerU document parsing | Modern pipeline for turning documents into machine-readable text |
| arXiv AI-error ban | Why human review beats model fluency for factual claims |
| Specification gaming | When metrics lie — here the metric is papyrologist-reviewed transcription |
| Agent harness guide | Controlled runtime design — analogous to sealing eval environments |
| Langflow RAG guide | Open tooling for indexing parsed text once scrolls are transcribed |
| Long-read genome sequencing | Another field where longer, richer reads unlock what fragments hide |
Summary
On June 25, 2026, the Vesuvius Challenge read PHerc. 1667 — an entire surviving Herculaneum scroll — virtually, non-destructively, end to end. The text is a 2nd-century BC Stoic ethics treatise naming Aristocreon. Scroll 1 ink was confirmed in 3D at higher resolution; PHerc. 139 yielded Philodemus, On Gods, Book 8. Everything is open.
Hundreds of scrolls remain. The thoughts of the ancient world, sealed in darkness for two millennia, are coming back — one scroll at a time.
Last updated: June 26, 2026. Primary source: Vesuvius Challenge — We read an entire scroll (June 25, 2026). Preprint: scrollprize.org/pdf/main.pdf.