On June 12, 2026, Google Cloud published the Open Knowledge Format (OKF) v0.1—an open specification that turns the emergent "LLM wiki" pattern into a portable standard for AI agent knowledge.
Sam McVeety (Tech Lead, Data Analytics) and Amir Hormati (Tech Lead, BigQuery) announced on the Google Cloud Blog:
OKF is a vendor-neutral, agent- and human-friendly standard for representing the metadata, context, and curated knowledge that modern AI systems need.
No new runtime. No required SDK. Just markdown files with YAML frontmatter in a directory—shippable in git, readable on GitHub, consumable by any agent.
@Karpathy's LLM Wiki gist (5,000+ stars) predicted this shape; Google formalized the interoperability layer. See our complete LLM Wiki guide. @DataChaz called it Google formalizing "the power of the LLM Wiki."
This post explains OKF's structure, design principles, what Google shipped, and how it fits alongside CLAUDE.md and agent skills.
TL;DR
| Question | Answer |
|---|---|
| What | Open spec for agent knowledge as markdown + YAML files |
| Required field | type in frontmatter (only mandatory field) |
| Optional fields | title, description, resource, tags, timestamp |
| Links | Standard markdown cross-links → knowledge graph |
| Version | v0.1 (June 2026) |
| Spec repo | GoogleCloudPlatform/knowledge-catalog → okf/SPEC.md |
| Reference tools | BigQuery enrichment agent + static HTML visualizer |
| Samples | GA4 e-commerce, Stack Overflow, Bitcoin datasets |
The Problem OKF Solves
Foundation models improve faster than most organizations can assemble context. When an agent needs to answer "How do we compute weekly active users from our event stream?", the answer fragments across:
| Surface | Example content |
|---|---|
| Metadata catalogs | Table schemas (vendor-specific APIs) |
| Wikis / Notion | Runbooks, metric definitions |
| Code | Docstrings, notebook comments |
| People's heads | Join paths, deprecation notices |
Every agent builder re-solves the same context assembly problem. Every catalog vendor reinvents schemas. Knowledge stays locked to the surface that created it.
OKF's bet: the missing piece is a format, not another platform.
What an OKF Bundle Looks Like
An OKF bundle is a directory of concept documents. File path = concept identity:
sales/
├── index.md
├── datasets/
│   ├── index.md
│   └── orders_db.md
├── tables/
│   ├── index.md
│   ├── orders.md
│   └── customers.md
└── metrics/
    ├── index.md
    └── weekly_active_users.md
Each concept file:
---
type: BigQuery Table
title: Orders
description: One row per completed customer order.
resource: https://console.cloud.google.com/bigquery?p=acme&d=sales&t=orders
tags: [sales, revenue]
timestamp: 2026-05-28T14:30:00Z
---
# Schema
| Column | Type | Description |
|---------------|--------|------------------------------------------|
| `order_id` | STRING | Globally unique order identifier. |
| `customer_id` | STRING | FK to [customers](/tables/customers.md). |
# Joins
Joined with [customers](/tables/customers.md) on `customer_id`.
Markdown links turn the directory into a graph richer than filesystem hierarchy alone.
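To make the graph idea concrete, here is a minimal Python sketch that turns the markdown links in a bundle into an adjacency map. It is an illustration, not part of the OKF tooling: the function names `resolve_link` and `build_link_graph` are hypothetical, the bundle is passed in as an in-memory dict of path to text, and links starting with `/` are assumed to be relative to the bundle root, as the sample above suggests.

```python
import posixpath
import re
from typing import Dict, Set

# Matches inline markdown links and captures the target: [text](target)
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def resolve_link(source_path: str, target: str) -> str:
    """Resolve a link target to a bundle-relative path.

    Assumption: a leading "/" means "relative to the bundle root";
    anything else is relative to the linking document's directory.
    """
    target = target.split("#", 1)[0]  # drop any #fragment
    if target.startswith("/"):
        return posixpath.normpath(target.lstrip("/"))
    base = posixpath.dirname(source_path)
    return posixpath.normpath(posixpath.join(base, target))


def build_link_graph(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each document path to the set of bundle documents it links to."""
    graph: Dict[str, Set[str]] = {path: set() for path in docs}
    for path, text in docs.items():
        for target in LINK_RE.findall(text):
            is_external = "://" in target
            is_markdown = target.split("#", 1)[0].endswith(".md")
            if is_external or not is_markdown:
                continue
            resolved = resolve_link(path, target)
            if resolved in docs:  # ignore links that leave the bundle
                graph[path].add(resolved)
    return graph
```

Calling `build_link_graph` with the `orders.md` body above would yield an edge from `tables/orders.md` to `tables/customers.md`, an edge the directory layout alone does not express.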
Reserved files
| File | Purpose |
|---|---|
| index.md | Progressive disclosure as agents navigate |
| log.md | Chronological change history |
The full v0.1 spec—including conformance criteria—fits on one page.
Three Design Principles
1. Minimally opinionated
Only type is required. What types exist, what other fields to use, what body sections to include—producer's choice. OKF defines interoperability, not content models.
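A consumer-side check for the single required field can be very small. The sketch below is an assumption-laden illustration rather than the spec's conformance test: `split_frontmatter` and `has_required_type` are hypothetical names, and the parser handles only flat `key: value` lines, so a real consumer would likely hand the frontmatter to a full YAML library instead.

```python
from typing import Dict, Tuple


def split_frontmatter(text: str) -> Tuple[Dict[str, str], str]:
    """Split a concept document into (frontmatter fields, markdown body).

    Simplification: only flat `key: value` lines are understood and every
    value is kept as a raw string (lists such as tags are not parsed).
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block at all
    fields: Dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":  # closing delimiter
            return fields, "\n".join(lines[i + 1:])
        if ":" in line:
            # Split on the first colon only, so URLs and timestamps survive.
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return {}, text  # unterminated frontmatter: treat everything as body


def has_required_type(text: str) -> bool:
    """Return True when the document declares a non-empty `type` field."""
    fields, _ = split_frontmatter(text)
    return bool(fields.get("type"))
```

Everything beyond `type` is deliberately left unchecked here, which mirrors the principle: the format pins down the minimum and leaves content models to producers.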
2. Producer/consumer independence
| Producer | Consumer |
|---|---|
| Human-authored wiki | AI agent |
| BigQuery export pipeline | HTML visualizer |
| LLM-generated bundle | Search index |
Same format, swappable tooling at each end.
3. Format, not platform
No proprietary account or SDK required to read or write. Value scales with adoption breadth, not vendor lock-in.
OKF vs Existing Patterns
| Pattern | Scope | Interoperability |
|---|---|---|
| CLAUDE.md | Project agent memory | Claude ecosystem convention |
| AGENTS.md | Repo instructions (Codex, etc.) | Per-tool adoption |
| Obsidian vaults + agents | Personal/team wikis | Bespoke |
| Karpathy LLM wiki | Agent-maintained markdown | Pattern, not spec |
| Metadata catalogs | Schema registry APIs | Vendor-locked |
| OKF v0.1 | Org-wide knowledge graphs | Vendor-neutral spec |
OKF does not replace CLAUDE.md—it can contain the structured knowledge CLAUDE.md points at. Example: CLAUDE.md says "read /okf/sales/metrics/weekly_active_users.md before analytics tasks."
Compare with MCP: MCP gives agents live tool access, while OKF carries curated, static knowledge.
What Google Shipped
Reference producer: BigQuery enrichment agent
Walks a BigQuery dataset, drafts OKF concept docs for every table/view, then runs a second LLM pass to enrich with citations, schemas, and join paths from authoritative documentation.
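The first pass of such a producer can be pictured as a plain rendering step. The sketch below is not Google's agent: `render_concept` is a hypothetical helper, the table schema arrives as an ordinary list of dicts rather than through any BigQuery client call, and the LLM enrichment pass is left out entirely. String fields are quoted with `json.dumps`, relying on JSON strings being valid YAML double-quoted scalars.

```python
import json
from typing import Dict, List, Optional


def render_concept(
    type_: str,
    title: str,
    description: str,
    columns: List[Dict[str, str]],
    resource: Optional[str] = None,
) -> str:
    """Render one concept document: YAML frontmatter plus a schema table."""
    front = [
        "---",
        f"type: {json.dumps(type_)}",  # the one required field
        f"title: {json.dumps(title)}",
        f"description: {json.dumps(description)}",
    ]
    if resource:
        front.append(f"resource: {json.dumps(resource)}")
    front.append("---")

    rows = ["| Column | Type | Description |", "|---|---|---|"]
    for col in columns:
        desc = col.get("description", "")
        rows.append(f"| `{col['name']}` | {col['type']} | {desc} |")

    return "\n".join(front + ["# Schema"] + rows) + "\n"
```

Writing the returned string to `tables/orders.md` inside a bundle directory would give a document shaped like the concept file shown earlier, ready for a later enrichment step.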
Reference consumer: Static HTML visualizer
Turns any OKF bundle into an interactive graph view—single self-contained HTML file, no backend, no data leaves the page.
Sample bundles (on GitHub)
| Bundle | Domain |
|---|---|
| GA4 e-commerce | Analytics tables and metrics |
| Stack Overflow | Public dataset concepts |
| Bitcoin | Public blockchain datasets |
See our hands-on GA4 & Bitcoin sample bundle guide for starter SQL, dataset limitations, and how OKF maps raw tables to concept pages.
Cloud Knowledge Catalog integration
Google updated Knowledge Catalog to ingest OKF bundles and serve them to Google Cloud agents—enterprise path for teams already on GCP.
The Karpathy Connection
Andrej Karpathy's LLM wiki gist (5,000+ stars) argues:
LLMs don't get bored, don't forget to update a cross-reference, and can touch 15 files in one pass.
Humans abandon wikis when bookkeeping fails; agents excel at it. The pattern keeps reappearing—Obsidian + coding agents, AGENTS.md repos, index.md/log.md artifacts.
OKF's contribution: agree on what fields every document carries and what filenames mean, so your wiki and my wiki and a catalog export cooperate without translation.
Three-layer mapping (Karpathy → OKF)
| Karpathy layer | OKF equivalent | Notes |
|---|---|---|
| Raw sources (immutable) | External datasets, docs, APIs | OKF bundles are the compiled layer; producers ingest from raw |
| Wiki (LLM-maintained) | OKF bundle (*.md + frontmatter) | OKF adds required type, optional resource/tags/timestamp |
| Schema (CLAUDE.md) | Producer/consumer conventions + okf/SPEC.md | Org-wide spec replaces per-vault bespoke rules |
Both reserve index.md (catalog) and log.md (change history). See the full LLM Wiki guide for ingest/query/lint operations and the implementation ecosystem (AutoSci, memwiki, secure-llm-wiki, synthadoc, and others).
Format wars context (from the gist community)
| Concern | Karpathy default | Community extensions |
|---|---|---|
| Contradictions | Lint flags as defects | Humanities: preserve tension; typed edges (contradicts, extends) |
| Security | Trust all ingested sources | secure-llm-wiki: untrusted sources never become trusted wiki |
| Provenance | Implicit in summaries | LLM-Wiki-v3, Dense-Mem: claim-level source attribution |
| Multi-writer | Single agent maintainer | Append-only logs, partitioned files, grep-dedup by citation |
| Human authorship | LLM writes wiki | Socratic variant: user ideas promoted only after challenge |
OKF v0.1 addresses interoperability first; contradiction semantics, trust tiers, and typed relationships remain open design space.
Getting Started
Google's recommended path:
- Read the spec — okf/SPEC.md (short)
- Browse sample bundles — GA4, Stack Overflow, Bitcoin in repo
- Try the visualizer — open a bundle in the HTML tool
- Write a producer — export from your DB, docs site, or wiki
- Write a consumer — agent that reads OKF before tasks; search index; viewer
- Contribute — file issues, PRs; v0.1 designed for backward-compatible growth
For Claude Code users: OKF bundles mount like any markdown corpus—agents with file tools traverse index.md hierarchies naturally.
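As a rough picture of that traversal, the Python sketch below lists the files an agent might read on its way to one concept: each `index.md` from the bundle root downward, then the concept itself. The helper names `disclosure_path` and `read_context` are hypothetical, and the reading order is an assumption about how progressive disclosure could work, not behavior defined by the spec.

```python
from pathlib import Path, PurePosixPath
from typing import List


def disclosure_path(concept: str) -> List[str]:
    """List index.md files from the bundle root down, then the concept.

    Example: "tables/orders.md" gives
    ["index.md", "tables/index.md", "tables/orders.md"].
    """
    parts = PurePosixPath(concept).parts
    steps = [
        str(PurePosixPath(*parts[:depth], "index.md"))
        for depth in range(len(parts))
    ]
    if parts[-1] != "index.md":  # avoid listing an index file twice
        steps.append(concept)
    return steps


def read_context(bundle_root: Path, concept: str) -> str:
    """Concatenate whichever of those files exist, in reading order."""
    chunks = []
    for rel in disclosure_path(concept):
        path = bundle_root / rel
        if path.is_file():  # missing index files are simply skipped
            text = path.read_text(encoding="utf-8")
            chunks.append(f"<!-- {rel} -->\n{text}")
    return "\n\n".join(chunks)
```

An agent preparing for an analytics task could call `read_context(Path("sales"), "metrics/weekly_active_users.md")` and place the result in its prompt before touching any SQL.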
Open Questions (v0.1 Is a Starting Point)
Google explicitly calls v0.1 a starting point, not a finished standard:
- Contradiction handling — Two OKF docs disagree; no merge semantics yet
- Faceted search — Some practitioners want richer tagging than minimal spec
- Live vs static — OKF is file-based; stale docs are a process problem
- Name collision — Unrelated "OKF" supply-chain spec exists (OKF-SCIS); Google's OKF is data/analytics/agent knowledge
Community feedback will shape v0.2.
Who Should Care
| Audience | Why |
|---|---|
| Data teams | Export catalog knowledge agents can actually read |
| Agent builders | Stop writing a bespoke wiki parser for every project |
| Platform vendors | Produce/consume without lock-in |
| Enterprise AI | Version-controlled knowledge in git next to code |
| ExplainX readers | Complements MCP tools + CLAUDE.md memory patterns |
Summary
Open Knowledge Format (OKF) v0.1 is Google's bid to make agent knowledge portable: markdown files, YAML frontmatter, markdown links as graphs, one required type field, no mandatory SDK.
It formalizes what Karpathy, Obsidian users, and CLAUDE.md authors already discovered—LLMs work better with curated, linked, maintainable markdown libraries than with repeated document search.
The spec is the contribution. The BigQuery agent, visualizer, and sample bundles lower the cost of trying it. Whether OKF becomes a lingua franca depends on producers outside Google adopting it.
Related Reading
- OKF Sample Bundles: GA4 & Bitcoin BigQuery Guide
- Karpathy LLM Wiki: Complete Pattern Guide
- What is CLAUDE.md? Persistent Memory in Claude Code
- What Are Agent Skills? Complete Guide
- What is MCP? Model Context Protocol
- Files.md: Private Local-First Note-Taking
- Agent Markdown Files Complete Guide
- Design.md: Open Spec for AI Design Systems
OKF v0.1 structure and tooling cited from Google Cloud Blog and GoogleCloudPlatform/knowledge-catalog as of June 14, 2026.