Open Knowledge Format (OKF) Reviewed: What Is New in Google's AI Agent Knowledge Format, and What Is Verified?
For / Key Points
For: Engineers, data teams, and AI platform owners evaluating internal knowledge bases for agents or enterprise RAG.
Key Points:
- OKF is not a new hosted service. It is an exchange format for agent-readable knowledge, built from Markdown and YAML frontmatter.
- The new part is not the LLM Wiki idea itself, but the decision to standardize only the smallest interoperability surface.
- "Open", "vendor-neutral", and "secure" are still closer to design goals than proven ecosystem outcomes.
Google Cloud published Open Knowledge Format (OKF) on June 13, 2026.[1] It is a proposal for representing internal knowledge that AI agents need: table definitions, metrics, runbooks, APIs, and similar context. The shape is deliberately simple: one concept becomes one Markdown file with YAML frontmatter.
This is not a flashy new platform. It is a format. That dullness is exactly why it matters for teams whose agents rebuild the same context on every task.
This article asks one question: what does OKF make new, and which Google claims can be verified as of June 16, 2026?
What Problem OKF Is Trying to Solve
Smarter models still fail when the organizational context is scattered. Ask an agent how to compute weekly active users, and the answer rarely lives in one place. The table meaning may be in a metadata catalog, the metric logic in a wiki, the edge cases in code comments, and the real interpretation in a senior engineer's memory.
The agent must assemble context across catalog APIs, wiki pages, source code, and vendor-specific schemas. Google Cloud calls this the context-assembly problem.[1] The deeper issue is that knowledge is trapped inside the surfaces that created it.
OKF puts a common container around that knowledge. It does not ask every team to migrate into one database. It asks producers to expose knowledge as human- and agent-readable Markdown files that can move between tools.
What OKF Actually Is
An OKF bundle is a directory of Markdown files, one file per concept. A concept can be a table, dataset, metric, API, runbook, playbook, or abstract business idea. The concept ID is the file path without the .md suffix.[2]
sales/
├── index.md
├── datasets/
│ └── orders_db.md
├── tables/
│ ├── orders.md
│ └── customers.md
└── metrics/
└── weekly_active_users.md
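The path-to-ID rule can be sketched in a few lines of Python. This is a minimal illustration, not part of the OKF tooling: the function name `concept_id` is hypothetical, and the sketch assumes that index.md and log.md are navigation files rather than concepts and that IDs carry no leading slash.

```python
from pathlib import PurePosixPath
from typing import Optional

# Assumption: these reserved files are treated as navigation, not as concepts.
RESERVED_NAMES = {"index.md", "log.md"}


def concept_id(relative_path: str) -> Optional[str]:
    """Map a bundle-relative file path to a concept ID, or None if it is not a concept file."""
    path = PurePosixPath(relative_path)
    if path.suffix != ".md" or path.name in RESERVED_NAMES:
        return None
    # The concept ID is the path with the .md suffix removed.
    return path.with_suffix("").as_posix()


files = [
    "index.md",
    "datasets/orders_db.md",
    "tables/orders.md",
    "tables/customers.md",
    "metrics/weekly_active_users.md",
]
ids = [cid for f in files if (cid := concept_id(f)) is not None]
print(ids)
```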
Each file has YAML frontmatter for structured fields and a Markdown body for everything else. The only required field is type. Fields such as title, description, resource, tags, and timestamp are recommended, not mandatory.[2]
---
type: BigQuery Table
title: Orders
description: One row per completed customer order.
resource: https://console.cloud.google.com/bigquery?p=acme&d=sales&t=orders
tags: [sales, revenue]
timestamp: 2026-05-28T14:30:00Z
---
## Schema
| Column | Type | Description |
|-------------|--------|-----------------------------------|
| order_id | STRING | Globally unique order identifier. |
| customer_id | STRING | FK to [customers](/tables/customers.md). |
## Joins
Joined with [customers](/tables/customers.md) on `customer_id`.
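A consumer has to separate the frontmatter from the body and enforce the one required field. The sketch below does that with the standard library only. It is deliberately simplified: it reads flat `key: value` lines and keeps every value as a raw string, so a real consumer would more likely hand the frontmatter to a full YAML parser. The function name `split_frontmatter` is hypothetical.

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split an OKF-style file into (frontmatter fields, Markdown body).

    Simplification: values stay raw strings; nested YAML is not interpreted.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("file does not start with a frontmatter block")

    # Find the closing '---' delimiter after the opening one.
    end = next(
        (i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"),
        None,
    )
    if end is None:
        raise ValueError("frontmatter block is not terminated")

    fields: dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep and key.strip():
            fields[key.strip()] = value.strip()

    # 'type' is the only required field.
    if not fields.get("type"):
        raise ValueError("frontmatter must define 'type'")

    return fields, "\n".join(lines[end + 1:])


doc = """---
type: BigQuery Table
title: Orders
timestamp: 2026-05-28T14:30:00Z
---
## Schema
"""
fields, body = split_frontmatter(doc)
print(fields["type"], "|", body.strip())
```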
Concepts link to each other with ordinary Markdown links. That turns the directory into a graph of relationships, not just a tree of folders. index.md is reserved for progressive disclosure, while log.md is reserved for chronological updates.[2]
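To make the graph idea concrete, here is a rough sketch that extracts outgoing links from a concept body. It assumes links are bundle-root-absolute and end in .md, as in the sample file above. Relative links, external URLs, and reference-style links are ignored, and the regular expression is a simplification rather than a full Markdown parser. The names `linked_concepts` and `build_graph` are hypothetical.

```python
import re

# Matches inline Markdown links of the form [text](target).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def linked_concepts(body: str) -> set[str]:
    """Return concept IDs referenced by root-absolute .md links in a Markdown body."""
    targets = set()
    for target in LINK_RE.findall(body):
        target = target.split("#", 1)[0]  # drop any heading fragment
        if target.startswith("/") and target.endswith(".md"):
            targets.add(target[1:-3])  # strip the leading '/' and trailing '.md'
    return targets


def build_graph(bodies: dict[str, str]) -> dict[str, set[str]]:
    """Map each concept ID to the set of concept IDs it links to."""
    return {cid: linked_concepts(body) for cid, body in bodies.items()}


bodies = {
    "tables/orders": "Joined with [customers](/tables/customers.md) on `customer_id`.",
    "tables/customers": "See the [docs](https://example.com/docs) for details.",
}
print(build_graph(bodies))
```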
flowchart LR
subgraph P[Producer]
P1[Human-authored]
P2[Generated from catalogs]
P3[Synthesized by an LLM]
end
B[OKF bundle<br/>Markdown + YAML<br/>Git-managed]
subgraph C[Consumer]
C1[AI agent]
C2[Visualizer]
C3[Search index]
C4[Another LLM]
end
P1 --> B
P2 --> B
P3 --> B
B --> C1
B --> C2
B --> C3
B --> C4
The key point is producer and consumer independence. A bundle can be written by humans and read by agents. It can be generated from BigQuery and browsed in a static visualizer. The format becomes the contract, while tools at either end can change.
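On the producer side, emitting a concept file can be very small. The sketch below is one possible approach, not Google's enrichment agent: `render_concept` is a hypothetical helper, and it serializes each value with `json.dumps` because JSON strings and arrays are also valid YAML flow syntax, which avoids hand-written quoting rules.

```python
import json


def render_concept(fields: dict, body: str) -> str:
    """Render frontmatter fields and a Markdown body as one OKF-style file."""
    if not fields.get("type"):
        raise ValueError("'type' is the one required frontmatter field")

    lines = ["---"]
    for key, value in fields.items():
        # JSON scalars and arrays are valid YAML, so values stay safely quoted.
        lines.append(f"{key}: {json.dumps(value, ensure_ascii=False)}")
    lines.append("---")
    return "\n".join(lines) + "\n" + body.rstrip("\n") + "\n"


print(
    render_concept(
        {"type": "BigQuery Table", "title": "Orders", "tags": ["sales", "revenue"]},
        "## Schema\n",
    )
)
```

Any consumer that understands YAML frontmatter can read the result without knowing which tool produced it.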
Why Progressive Disclosure Matters
OKF is not designed around dumping the entire bundle into an LLM prompt. An agent can first read index.md, inspect titles and one-line descriptions, choose a likely concept, open that file, then follow links to related concepts.
The specification calls this progressive disclosure.[2] The description field works as a snippet for lists and search results, while the type and tags fields let a consumer filter before reading full bodies.
The limit is just as important. OKF does not define the runtime. The specification says how consumption agents should traverse the format, but storage, serving, query infrastructure, permissions, and search APIs are out of scope.[2]
OKF gives you the readable knowledge files. You still need the agent, indexer, search layer, and access controls that use them correctly.
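As an illustration of filtering before reading, the sketch below shortlists concepts from frontmatter alone, so an agent opens only the files that survive. The `Entry` class, the `shortlist` function, and the matching rules (substring match on type and description, exact match on tag) are assumptions made for this example. The specification does not prescribe them.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entry:
    """Lightweight view of one concept: frontmatter only, no body."""
    concept_id: str
    type: str
    description: str = ""
    tags: list[str] = field(default_factory=list)


def shortlist(
    entries: list[Entry],
    *,
    type_contains: Optional[str] = None,
    tag: Optional[str] = None,
    keyword: Optional[str] = None,
) -> list[str]:
    """Return concept IDs whose frontmatter passes every filter that was given."""
    result = []
    for entry in entries:
        if type_contains and type_contains.lower() not in entry.type.lower():
            continue
        if tag and tag not in entry.tags:
            continue
        if keyword and keyword.lower() not in entry.description.lower():
            continue
        result.append(entry.concept_id)
    return result


entries = [
    Entry("tables/orders", "BigQuery Table", "One row per completed customer order.", ["sales", "revenue"]),
    Entry("tables/customers", "BigQuery Table", "One row per customer.", ["sales"]),
    Entry("metrics/weekly_active_users", "Metric", "Distinct users active in a week.", ["engagement"]),
]
print(shortlist(entries, type_contains="table", tag="revenue"))
```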
Why Another Format Can Still Be Useful
OKF's diagnosis is sound. The interoperability bottleneck is often not a lack of agents or search tools. It is the absence of a shared representation for the knowledge those tools need.
The low-friction design is the point. Many metadata standards try to include vocabulary control, governance, lineage, quality contracts, and detailed schemas. That can be powerful, but it raises adoption cost.
OKF chooses the other side of the trade-off. It standardizes only the minimum surface needed for exchange. Taxonomies, body structure, and richer semantics remain producer-defined.
It also rides an existing wave. Andrej Karpathy's LLM Wiki gist framed a pattern where an LLM maintains a persistent Markdown wiki from raw sources.[3] AGENTS.md, CLAUDE.md, Obsidian vaults, and llms.txt-style conventions all point in a similar direction.
OKF did not invent Markdown knowledge bases. Its contribution is trying to make that practice portable.
What Is Actually Verified Today
As of June 16, 2026, the verified facts are still narrow. OKF v0.1 had been public for only a few days. The official Google Cloud post describes a BigQuery-based enrichment agent, a static HTML visualizer, and sample bundles for GA4, Stack Overflow, and Bitcoin datasets.[1]
"Vendor-neutral" is a design claim, not yet an ecosystem fact. The specification says OKF is not tied to a particular cloud, database, model, or agent framework.[1] The GitHub repository is also published under Apache 2.0.[4]
That is a good start. But neutrality becomes real only when multiple independent producers and consumers appear. For now, the careful statement is that OKF is designed to be vendor-neutral, not that the market has proven it.
"Minimally opinionated" cuts both ways. If the only required field is type, then BigQuery Table, table, and dataset_table can all be valid. The spec explicitly says type values are not centrally registered and that consumers should tolerate unknown values.[2]
This lowers adoption cost. It also means consumers may still need normalization logic for each producer. The context-assembly problem can reappear as type normalization and vocabulary mapping.
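The normalization burden might look like the sketch below. The alias table is invented for illustration and is not a registry from the specification. Each consumer would have to maintain its own mapping, which is exactly the cost described above. Unknown values fall through to a neutral bucket instead of raising, in line with the spec's tolerance requirement.

```python
import re

# Hypothetical alias table: producer-specific spellings mapped to a consumer's own vocabulary.
TYPE_ALIASES = {
    "bigquery table": "table",
    "table": "table",
    "dataset table": "table",
    "metric": "metric",
}


def normalize_type(raw: str) -> str:
    """Map a free-form type value to a canonical one, tolerating unknown values."""
    # Collapse whitespace, underscores, and hyphens so spelling variants compare equal.
    key = re.sub(r"[\s_\-]+", " ", raw).strip().lower()
    return TYPE_ALIASES.get(key, "unknown")


for value in ["BigQuery Table", "table", "dataset_table", "Runbook"]:
    print(value, "->", normalize_type(value))
```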
"Secure" is not a property of the format by itself. The OKF specification lists storage, serving, and query infrastructure as non-goals.[2] Permissions, secret redaction, approval workflows, freshness checks, and data quality remain separate layers.
Google's visualizer is a proof-of-concept consumer. Its implementation details do not create an organization-wide security model. Teams still need to decide who can create bundles, who reviews them, and which agents may read them.
Generated knowledge quality is a separate problem. The LLM Wiki pattern shows why agents are useful at cross-reference maintenance and repeated bookkeeping.[3] But LLMs can also write plausible join paths that do not exist.
Even if an enrichment agent adds citations, someone must decide who approves the output and when it has gone stale. A confident mistake stored in Markdown can be reused by both humans and agents. That is the operational risk.
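One small piece of that decision can be automated: flagging concepts whose timestamp is older than a review window. The sketch below is an assumption-heavy example. The 90-day window is arbitrary, `is_stale` is a hypothetical helper, and a timestamp without an offset is treated as UTC. It says nothing about whether the content is correct, only that nobody has touched it recently.

```python
from datetime import datetime, timedelta, timezone


def is_stale(timestamp: str, now: datetime, max_age_days: int = 90) -> bool:
    """Return True if an ISO 8601 timestamp is older than the review window."""
    # Older Python versions do not accept a trailing 'Z' in fromisoformat.
    changed = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    if changed.tzinfo is None:
        changed = changed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return now - changed > timedelta(days=max_age_days)


review_date = datetime(2026, 9, 1, tzinfo=timezone.utc)
print(is_stale("2026-05-28T14:30:00Z", review_date))
```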
OKF Complements Stricter Standards
OKF does not replace Avro, Protobuf, OpenAPI, or other strict domain schemas. The specification says OKF may reference those schemas, but does not subsume them.[2]
Its place becomes clearer when compared with nearby patterns.
| Pattern | What It Standardizes | Constraint Level | Relationship to OKF |
|---|---|---|---|
| OKF | Knowledge container, file structure, minimal metadata | Low. Only type is required | Main topic of this article |
| llms.txt | LLM-facing site guidance | Low | Coarser granularity and different use case |
| AGENTS.md / CLAUDE.md | Repo-local agent instructions | Low | Local convention, not an exchange format |
| OpenLineage / OpenMetadata | Lineage, catalog, metadata models | High | Stricter but not necessarily human-readable |
| Data contracts | Quality, ownership, SLA expectations | Medium to high | Quality assurance rather than knowledge exchange |
OKF sits on the lightweight side of the trade-off. It should not replace catalogs or data contracts. It is more naturally a companion format that lets those systems emit agent-readable knowledge bundles.
How to Treat OKF in Practice
The wrong conclusion is that OKF completes an AI knowledge platform. The better approach is to test it as a low-cost exchange format.
Good starting uses are specific.
- Manage metadata, metric definitions, and runbooks as Git-reviewed, agent-readable files.
- Keep RAG or agent input knowledge in a vendor-portable file format.
- Let agents follow links such as "this table joins to that table" or "this API is deprecated."
Bad expectations are just as specific.
- Do not expect OKF to solve permissions, masking, approvals, freshness, or data quality.
- Do not rebuild an entire enterprise catalog around v0.1.
- Do not trust LLM-generated descriptions or join explanations without human review.
The smallest entry point is additive. Keep your existing Markdown content. Add frontmatter with type, description, and, where useful, tags. That alone gives agents a better way to choose what to read.
One rule matters: do not let an LLM guess timestamp. Last meaningful change should come from Git history, file metadata, or another factual source. A guessed date turns metadata into a confident error.
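A factual timestamp can be derived mechanically. The sketch below asks Git for the last commit date of a file and falls back to the file's modification time when Git is unavailable or the file is untracked. The helper name `last_change_timestamp` is hypothetical, and whether the committer date counts as the "last meaningful change" is a policy choice, not something the format decides.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def last_change_timestamp(path: Path) -> str:
    """Return an ISO 8601 timestamp from Git history, else from file metadata."""
    path = path.resolve()
    try:
        # %cI is the committer date in strict ISO 8601 format.
        result = subprocess.run(
            ["git", "log", "-1", "--format=%cI", "--", path.name],
            cwd=path.parent,
            capture_output=True,
            text=True,
            check=False,
        )
        stamp = result.stdout.strip() if result.returncode == 0 else ""
    except OSError:
        stamp = ""  # git is not installed or could not be started

    if stamp:
        return stamp

    # Fallback: file modification time, expressed in UTC.
    mtime = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
    return mtime.isoformat(timespec="seconds")
```

Calling `last_change_timestamp(Path("tables/orders.md"))` from a script that writes frontmatter keeps the value tied to a record that can be audited.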
Summary
OKF's contribution is not the idea of putting knowledge in Markdown. Its contribution is the attempt to standardize the exchange surface for knowledge that AI agents read.
That is a sensible, low-cost bet. But as of June 16, 2026, "open", "vendor-neutral", and "secure" should be read as design goals, not proven outcomes. Start with a small bundle and measure whether agent answers and maintenance cost actually improve.
The signal to watch over the next six to twelve months is adoption outside Google. If independent producers and consumers appear, OKF could become a real standard. If not, it may remain a useful Markdown convention around Google Cloud's knowledge tooling.