Chapter 3: The Entity Veracity Score

Something wasn't adding up.

I kept seeing patterns that didn't match any public documentation about how AI systems decide who to trust.

Clients with massive backlink profiles were getting outranked by smaller competitors with almost no links. Sites with perfect E-E-A-T signals were losing ground to businesses with half the content. Traditional metrics that had predicted success for twenty years were suddenly unreliable.

So I started digging into the academic literature. Not SEO blogs—actual research papers. And I found the mechanism: in October 2025, Google DeepMind published BlockRank—an architecture that reveals exactly how AI systems read, evaluate, and rank content at the block level.[11]

BlockRank doesn't read pages. It reads blocks. And the formula governing which blocks get selected—which blocks the AI trusts enough to cite, recommend, and repeat—is what I call the Entity Veracity Score.

Understanding BlockRank is the key to understanding why Entity Veracity works, why it's inevitable, and how to position yourself on the right side of it.

• • •

The Score No One Told You Existed

There's a number attached to your name.

You've never seen it. No platform will show it to you. But every time an AI decides whether to cite you, recommend you, or trust you—it checks this number first.

I call it the Entity Veracity Score.

This score determines whether your content blocks survive the BlockRank selection process—or get passed over for someone else's. The people who understand it are already positioning for it. The people who don't are wondering why they're invisible.

• • •

BlockRank: How AI Actually Reads You

In October 2025, a team from Google DeepMind and UT Austin—Nilesh Gupta, Chong You, Srinadh Bhojanapalli, Sanjiv Kumar, Inderjit Dhillon, and Felix Yu—published a paper that should have rewritten every content strategy on the planet.[11]

The paper is called "Scalable In-context Ranking with Generative Models." The method is called BlockRank. And it describes exactly how large language models evaluate and rank content when they're deciding what to cite.

Here's the finding that changes everything:

AI doesn't compare documents to each other. It evaluates each content block independently against the query.[11]

Gupta et al. discovered two structural properties in how LLMs fine-tuned for ranking actually process information:

Inter-document block sparsity: Attention is dense within each document block but sparse across different documents. The AI is reading each block thoroughly—then moving to the next. It's not cross-referencing page A against page B. It's evaluating block A, then block B, then block C, each on its own merits against the query.[11]

Query-document block relevance: Certain query tokens develop strong attention weights toward relevant document blocks, particularly in the model's middle layers. The AI identifies which specific blocks contain the answer—and the relevance signal shows up in the internal attention patterns, not just the final output.[11]
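
To make the query-to-block relevance signal concrete, here is a minimal sketch. It assumes you already have one attention matrix from a single layer; the function name `block_relevance`, the token layout, and the toy numbers are my own illustrations, not code from the paper.

```python
import numpy as np

def block_relevance(attn, query_idx, blocks):
    """Score each document block by the attention it receives from query tokens.

    attn: (num_tokens, num_tokens) attention weights, row = attending token.
    query_idx: positions of the query tokens.
    blocks: list of (start, end) half-open token spans, one per document block.
    """
    q_rows = attn[query_idx]  # attention paid out by the query tokens
    # Sum each query token's attention mass over a block, then average over query tokens.
    return np.array([q_rows[:, s:e].sum(axis=1).mean() for s, e in blocks])

# Toy layout (hypothetical): 2 instruction tokens, two 3-token blocks, 2 query tokens.
rng = np.random.default_rng(0)
attn = rng.random((10, 10))
attn[8:, 5:8] += 2.0                      # pretend the query focuses on the second block
attn /= attn.sum(axis=1, keepdims=True)   # normalise rows, as a softmax would

scores = block_relevance(attn, [8, 9], [(2, 5), (5, 8)])
ranking = np.argsort(-scores)             # best block first
```

In this toy case the second block ranks first, because that is where the query tokens put their attention. The real method reads the signal from specific layers and tokens, which this sketch does not attempt to reproduce.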

BlockRank exploits both properties. It imposes structured sparse attention so that document tokens attend only to their own content and shared instruction tokens—reducing attention complexity from quadratic to linear. Then it uses a contrastive learning objective to optimize the attention from query tokens toward the most relevant document blocks.[11]
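
The masking rule in that paragraph can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's implementation: I assume a token layout of instructions, then documents, then the query, I assume query tokens may see everything before them, and I add a causal constraint. The name `block_sparse_mask` is hypothetical.

```python
import numpy as np

def block_sparse_mask(n_instr, doc_lens, n_query):
    """Boolean mask M where M[i, j] is True if token i may attend to token j.

    Assumed layout: [instruction | doc 1 | ... | doc k | query].
    """
    n = n_instr + sum(doc_lens) + n_query
    mask = np.zeros((n, n), dtype=bool)
    mask[:n_instr, :n_instr] = True          # instruction tokens see each other
    start = n_instr
    for length in doc_lens:
        end = start + length
        mask[start:end, :n_instr] = True     # each document sees the shared instructions
        mask[start:end, start:end] = True    # and its own tokens, but no other document
        start = end
    mask[start:, :] = True                   # assumption: query tokens see everything
    causal = np.tril(np.ones((n, n), dtype=bool))
    return mask & causal                     # decoder-style: no attending to later tokens

mask = block_sparse_mask(n_instr=2, doc_lens=[3, 3], n_query=2)
```

With documents of fixed length, each new document adds a constant number of allowed pairs among the document tokens, which is where the linear scaling comes from.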

The result: BlockRank matches or outperforms state-of-the-art listwise rankers while running inference about 4.7× faster with 100 documents in context, and it scales gracefully to roughly 500 documents in context (approximately 100,000 tokens) within a single second.[11]

This isn't a theoretical model. It was published by Google DeepMind, the same research division that feeds directly into Google's AI products, including Gemini. When Gemini Enterprise was released to market in close proximity to this research, it arrived carrying this block-level architecture at its core. Gemini sits close to Google's AGI ambitions. And the technology teaching us about Entity Veracity, the technology that reveals the correct actions to take, operates on BlockRank principles.

• • •

Why BlockRank Is the Key to Entity Veracity

Here's what BlockRank means for you.

The AI isn't reading your website and forming a general impression. It's running a block-level selection process—an extractability auction where individual content blocks compete to be cited.[11]

The AI is looking for:

Clear, structured claims (not buried in paragraphs of fluff)
Machine-readable formatting (tables, lists, schema markup)
Verifiable assertions (claims it can cross-reference)
High information density (facts per word)

This is the Extractability Auction. And it's devastating for anyone who isn't prepared.

A scraper can win this auction. If a bot site takes your original research and reformats it into a cleaner table, a more scannable list, a more extractable structure—the AI might cite the scraper instead of you. Not because the scraper has more authority, but because their version is easier for BlockRank to parse.[11]

This is Semantic Hijacking. And it's why being the original expert isn't enough anymore. You have to be the most extractable expert.

But extractability is only half the equation. BlockRank determines which blocks the AI reads. Entity Veracity determines whether the AI trusts what it reads.

The AI has two jobs during the block selection process:

Job 1: Find the best block. This is BlockRank—block-level attention, structured sparse evaluation, the extractability auction.[11]

Job 2: Verify the source. This is Entity Veracity—does this block come from a cryptographically verified entity? Does the author have identity provenance? Is the content linguistically authentic?

The combination produces the formula at the center of this book:

V_total = (w₁ × V_g) + (w₂ × E_l)

Where V_g (Global Veracity) represents the trustworthiness of the entity behind the block. And E_l (Local Extraction) represents the quality, density, and authenticity of the block itself.
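
Here is the formula as a tiny sketch. The scores and weights are made-up numbers on a 0 to 1 scale, chosen only to show how the two terms interact; `v_total` is a hypothetical helper name.

```python
def v_total(v_g, e_l, w1=0.5, w2=0.5):
    """Linear blend from the text: V_total = w1 * V_g + w2 * E_l."""
    return w1 * v_g + w2 * e_l

# Hypothetical inputs: the original expert is verified, the scraper is cleaner but anonymous.
original = {"v_g": 0.8, "e_l": 0.8}
scraper = {"v_g": 0.1, "e_l": 0.9}

blended_original = v_total(**original)                  # identity and content both count
blended_scraper = v_total(**scraper)
content_only_original = v_total(**original, w1=0.0, w2=1.0)
content_only_scraper = v_total(**scraper, w1=0.0, w2=1.0)
```

When only content counts, the scraper's cleaner block wins. Once identity carries weight, the original expert comes out ahead.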

BlockRank is the mechanism. Entity Veracity is the score that feeds it. Your cryptographic handshakes, your verified identity, your linguistic entropy—these are what make your blocks win the BlockRank selection. AI slop, unverified entities, content without provenance—these are what BlockRank is designed to filter out.

This is why the whole book is organized around this relationship. Every chapter teaches you how to build blocks that BlockRank selects and trusts.

A Note on Naming

There is an earlier algorithm also called "BlockRank," published in 2003 by Sepandar Kamvar, Taher Haveliwala, Christopher Manning, and Gene Golub at Stanford.[1] That system accelerated PageRank computation by decomposing the web into host-level blocks—exploiting the same structural insight that links are dense within hosts and sparse between them. Different system, different math, different problem. Kamvar's BlockRank was about computational efficiency for web-scale PageRank. DeepMind's BlockRank is about how AI reads, evaluates, and selects content blocks for citation.

The fact that two independent research teams, separated by twenty-two years, both arrived at block-structured decomposition as the solution to a ranking problem validates block decomposition as a fundamental structural principle in information systems.[21] When this book says BlockRank, we mean the 2025 DeepMind system—the one that governs how AI decides what to cite.

• • •

The Mathematical Structure Behind the Score

The Entity Veracity formula isn't arbitrary. The decomposition into Global Veracity (V_g) and Local Extraction (E_l) mirrors exactly what BlockRank reveals about how AI processes information: block-level local evaluation combined with broader trust signals.[11]

When you map BlockRank's architecture to Entity Veracity's score:

BlockRank (2025, DeepMind)	Entity Veracity (2026)
Unit: Document Block	Unit: Entity Content Block
Mechanism: Structured Sparse Attention	Mechanism: Veracity-Weighted Selection
Local Signal: Block-level relevance to query	Local Signal: Linguistic entropy, content effort (E_l)
Global Signal: Query-token attention distribution	Global Signal: Identity provenance, institutional trust (V_g)
Insight: Attention dense within blocks, sparse across	Insight: Trust built locally within content, verified globally across network
Goal: Efficient block selection for citation	Goal: Verification of which blocks deserve citation

The structural parallel isn't coincidence. It reflects a fundamental truth about information systems: you can't evaluate truth at the global level alone (too expensive, too noisy) and you can't evaluate it at the local level alone (too easy to game). The optimal solution is always to decompose into local and global components, then combine them with appropriate weights.

The Google Content Warehouse API documentation leak of 2024 confirmed this dual-signal architecture is already in production:[3]

Entity Veracity Component	Google Leak Attribute	BlockRank Connection
Global Veracity (V_g)	`siteAuthority`	Trust signal propagated to all blocks on domain
Infrastructure Trust	`codomainStrength`	Technical trust of hosting infrastructure
Local Extraction (E_l)	`contentEffort`	Block-level quality, effort, originality
Entity Connection	`multibangkgEntities`	Links blocks to Knowledge Graph entities for V_g inheritance

siteAuthority is stored in the CompressedQualitySignals buffer and applies a global scalar to every document on the site. This is the trust signal that feeds into every block's BlockRank evaluation—computed at the entity level and inherited by every content block within it.[3]

contentEffort is an LLM-based estimation of the effort and originality required to create the page. This is the block-level quality signal—the local component that BlockRank evaluates when deciding which block to select.[3]

multibangkgEntities explicitly links documents to Knowledge Graph entities (e.g., /m/0kc7 for "Architecture"). This allows content blocks to inherit the V_g of the Entity they're connected to—transferring authority from the verified entity to the individual block competing in the BlockRank selection.[3]

The architecture proves that Google does not rank blocks in a vacuum. It ranks blocks within the context of their entity's veracity. This is why Entity Veracity—not just content quality—determines who the AI cites.

That's the mechanism. Now let me show you why this formula isn't just useful—it's mathematically necessary.

• • •

The Sheaf Laplacian: Why This Formula Is Mathematically Necessary

Here's something that will change how you think about Entity Veracity: the formula V_total = (w₁ × V_g) + (w₂ × E_l) isn't just a useful approximation. It's the linearized form of a fundamental mathematical structure called the Sheaf Laplacian.[12]

Let me show you why this matters.

The Knowledge Graph as a Cellular Sheaf

In advanced mathematics, a Cellular Sheaf is a structure that attaches local data to each node of a network and defines how that data should relate across edges.[12][13]

Here's what that means in plain terms: Imagine every entity in the Knowledge Graph as a house on a street. Each house has its own contents—your claims, your credentials, your content blocks. That's the local data. But there are also rules about how neighboring houses should relate. If your house says "I'm a doctor" and your neighbor's house says "this person works at my hospital," those claims need to match up. The Sheaf structure enforces that consistency.

For Entity Veracity:

Stalks (F(v)): Each entity node has a "truth space"—all the possible claims it could make. The actual data in this space is your E_l (Local Extraction). It's the text in your content blocks, your schema markup, your stated identity—the very material that BlockRank evaluates at the block level.[11]
Restriction Maps (F_{v⊴e}): For every edge connecting two entities, there's a consistency rule. If Entity A claims "I work for Company B," the restriction map encodes what that implies about the edge between them.

The Laplacian Energy Formula

The Sheaf Laplacian measures how "consistent" an assignment of truth values is across the entire network:[12]

E_consistency(x) = Σ_{edges} ||F_{v⊴e}(x_v) - F_{u⊴e}(x_u)||²

This formula sums up all the local disagreements across every edge in the graph.

In everyday terms: The formula is measuring friction. Every time your claims don't quite match what your neighbors expect, you generate friction. Every time your stated credentials don't align with the institutions you claim to be part of, you generate friction. The formula adds up all that friction across your entire network of relationships.

Critical insight: This consistency energy runs opposite to Global Veracity (V_g). The lower the energy, the higher the V_g.

Low energy = claims match expectations across relationships = High V_g
High energy = claims contradict relationship logic = Low V_g
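
A minimal sketch of the energy calculation, assuming each entity's data is a small vector and each restriction map is a matrix. The two-entity example and the name `sheaf_energy` are illustrative assumptions.

```python
import numpy as np

def sheaf_energy(x, edges):
    """Sum of squared disagreements across all edges.

    x: dict mapping node -> vector of that node's data.
    edges: list of (v, u, F_v, F_u), where F_v and F_u are the restriction
           matrices that map each endpoint's data into the shared edge space.
    """
    return sum(float(np.sum((F_v @ x[v] - F_u @ x[u]) ** 2))
               for v, u, F_v, F_u in edges)

I2 = np.eye(2)
edges = [("person", "hospital", I2, I2)]   # hypothetical edge: both sides must agree

consistent = {"person": np.array([1.0, 0.0]), "hospital": np.array([1.0, 0.0])}
conflicting = {"person": np.array([1.0, 0.0]), "hospital": np.array([0.0, 1.0])}

low = sheaf_energy(consistent, edges)      # 0.0: the claims match
high = sheaf_energy(conflicting, edges)    # 2.0: the claims contradict each other
```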

Think of it like a jigsaw puzzle. If all your pieces fit smoothly with the surrounding pieces, you have low energy and high veracity. If you're forcing pieces where they don't belong—claiming relationships that don't check out, credentials that don't verify—you're generating high energy. The system notices. And when BlockRank evaluates your content blocks, that friction travels with them.[11]

The Variational Problem

When you calculate Entity Veracity, you're solving this optimization:[12]

V_total = max_x [ w₁ × (-x^T L_F x) + w₂ × Σ_i log P(x_i) ]

Where:

-x^T L_F x = Global Consistency (V_g) — minimizing the Laplacian energy
Σ log P(x_i) = Fidelity to Local Data (E_l) — staying true to observed content
w₁, w₂ = Regularization parameters balancing the two

This is the exact mathematical structure of the Entity Veracity formula.

What this means in practice: The AI is solving a balancing act. On one side, it wants your claims to be globally consistent—fitting smoothly with everything else in the Knowledge Graph. On the other side, it wants to respect what your content blocks actually say. The formula finds the best balance between "fits the network" and "matches the content."[11]

The Entity Veracity formula isn't a heuristic someone invented. It has the form of the variational energy functional used to solve inverse problems on graphs. Calculating veracity is mathematically equivalent to denoising a signal on the graph defined by the trust network.
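
The denoising view can be worked through on a three-node graph. This sketch makes two simplifying assumptions of my own: the local term is Gaussian, so Σ log P(x_i) becomes -||x - y||² up to constants, and an ordinary graph Laplacian stands in for the Sheaf Laplacian (the special case where every restriction map is the identity). Setting the gradient to zero gives (w₁L + w₂I)x = w₂y.

```python
import numpy as np

def denoise_on_graph(L, y, w1=1.0, w2=1.0):
    """Maximise -w1 * x^T L x - w2 * ||x - y||^2 in closed form (requires w2 > 0)."""
    n = len(y)
    return np.linalg.solve(w1 * L + w2 * np.eye(n), w2 * y)

# Path graph 0 - 1 - 2, built from its adjacency matrix.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A

y = np.array([1.0, 0.0, 1.0])   # observed local data: node 1 disagrees with both neighbours
x = denoise_on_graph(L, y)      # balanced answer: [0.75, 0.5, 0.75]

energy_before = float(y @ L @ y)   # 2.0
energy_after = float(x @ L @ x)    # 0.125
```

The solution pulls the outlier toward its neighbours without ignoring what it actually said, which is the balancing act described above.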

• • •

Category Theory: The Existence Proof

If Sheaf Theory tells us how to calculate veracity, Category Theory tells us why a unique answer must exist.[14][15]

Topological Functors and Initial Lifts

A Topological Functor is a mathematical structure that guarantees the existence of unique "truth structures" that can be transported across isomorphic data representations.[15]

Translation: This is the mathematical proof that there's only one right answer. Given the same inputs, any system running this calculation—Google, Microsoft, a future AI we haven't built yet—must arrive at the same veracity score. The math forces convergence.

Here's the mapping:

Let D be the category of raw entities (names floating in data without veracity)
Let C be the category of entities with veracity structure (the trust network)
The forgetful functor U: C → D maps verified entities to raw names (forgetting the veracity)

Because U is a topological functor, this forgetting step is optimally reversible: raw entities plus their evidence determine a canonical veracity structure.

What this means: Given any set of raw entities and a family of evidence mappings (legacy IDs, handshakes, content analysis), there exists a unique, coarsest veracity structure that makes all mappings consistent.

This is the existence proof for V_total.

The Entity Veracity Score isn't arbitrary. It's the mathematically necessary consequence of the input structures. If two entities have isomorphic histories and attributes, they must have the same veracity score. This guarantees determinism: the same inputs always produce the same outputs.

Unique Transportability

A critical property of topological functors: Unique Transportability.[15]

This theorem states that if you have a veracity structure on one set of entities, and another set is isomorphic to the first, you can transport that veracity structure uniquely.

In plain terms: Your veracity score travels with you. It doesn't matter which platform calculates it or which database stores the records. If the underlying structure of your identity is the same, your score is the same. You can't escape it by moving to a different system. You can't game it by appearing on different platforms. The math follows you.

And when BlockRank evaluates your content blocks on any platform—Google, Gemini, a competitor we haven't seen yet—it reads the same veracity score. Because the math is the same everywhere.[11]

Implication: The veracity score is an invariant property of the system's structure. It doesn't depend on which database stores it or which AI system computes it. The math guarantees the same answer everywhere.
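
You can check the invariance idea numerically for one concrete scoring rule. This sketch is an illustration, not a proof of the categorical claim. It assumes a simple graph-smoothing score (a stand-in of my own choosing) and shows that relabelling the entities only relabels the scores.

```python
import numpy as np

def score(L, y, w1=1.0, w2=1.0):
    """Hypothetical scoring rule: solve (w1 * L + w2 * I) x = w2 * y."""
    return np.linalg.solve(w1 * L + w2 * np.eye(len(y)), w2 * y)

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A          # Laplacian of the trust graph
y = np.array([0.9, 0.2, 0.6])           # made-up local evidence per entity

perm = [2, 0, 1]                        # store the same entities in a different order
P = np.eye(3)[perm]                     # permutation matrix: (P @ y)[i] == y[perm[i]]

original = score(L, y)
relabelled = score(P @ L @ P.T, P @ y)  # same structure, different labels
```

Each entity keeps its score after the relabelling: `relabelled` equals `P @ original`.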

• • •

The Patent Landscape: Industry Already Knows

While most practitioners were still debating backlinks, the infrastructure for BlockRank-ready verification was already being patented.

Google Patent US11403301B2 (granted 2022): "Entity Reconciliation on Knowledge Panels." This patent describes a system that merges and reconciles entity representations across multiple sources to ensure consistency in Knowledge Panels. The system uses structured data extraction, cross-referencing, and entity resolution—building exactly the kind of global consistency infrastructure that the Sheaf Laplacian measures and that BlockRank's block-level evaluation depends on.[5]

The patent reveals that Google was already building the machinery to enforce entity-level consistency—the V_g infrastructure—years before DeepMind published BlockRank. The question was never whether AI would need verified entities. The question was how fast the selection mechanism would mature. BlockRank is the answer.[11]

BT Group Sheaf Networks Patent (2023): British Telecommunications filed a patent for "Sheaf Neural Networks"—applying sheaf-theoretic structures to neural network architectures for improved reasoning over heterogeneous data.[6] This confirms that the mathematical structures underlying Entity Veracity aren't theoretical abstractions. They're being built into production systems by major telecommunications companies.

The convergence is clear: Google patents entity reconciliation. BT patents sheaf networks. DeepMind publishes BlockRank for block-level content selection.[11] The infrastructure for a veracity-scored, block-evaluated information ecosystem isn't coming. It's here.

• • •

V_g: Global Veracity (The Entity Behind the Block)

Let me be direct about what V_g means in a BlockRank world.

When BlockRank evaluates a content block, it doesn't evaluate the block in isolation. The block carries the identity of the entity that created it—and that identity has a trust score that's been accumulating (or eroding) for years.[11]

V_g is that trust score. It's the Global Veracity of the entity—the accumulated, network-verified trustworthiness that applies to every block you produce.

Think of it like a deed to a house. Either your name is on the deed or it isn't. The county recorder doesn't care how you feel about it. The deed is the deed.

V_g works the same way. Either your entity has verifiable identity infrastructure—DIDs, DKIM records, Knowledge Graph connections, institutional handshakes—or it doesn't. The AI checks. And it checks before it evaluates your content block in the BlockRank selection.[11]

V_g Components

Component	What It Measures	BlockRank Effect
Identity Provenance	Can this entity be cryptographically verified?	Blocks from verified entities enter the selection with trust advantage
Institutional Handshakes	Do trusted institutions confirm this entity?	V_g inherits institutional authority to every block on the domain
Network Consistency	Do entity claims match across the Knowledge Graph?	Low Sheaf Laplacian energy = high trust propagation to blocks
Temporal Persistence	How long has this entity maintained consistent signals?	Longer track record = higher baseline trust for block selection

V_g is the slow variable. It takes years to build. It's built through the cryptographic anchors described in Chapter 4 (DIDs, Sovereign Assertion Vectors) and the institutional trust mechanics described in Chapter 5 (Knowledge Graph integration, SAVs). Once built, it applies to every single content block you produce—raising the floor for every block in the BlockRank selection.[11]

And here's why it matters more than ever: BlockRank's structured sparse attention means the AI evaluates each block quickly and independently. It doesn't have time for deep investigation of every source. V_g is the shortcut—the pre-computed trust signal that tells the AI "this block comes from a verified entity" before it even reads the content.[11]

• • •

E_l: Local Extraction (The Block Itself)

If V_g is the entity behind the block, E_l is the block itself—its quality, density, authenticity, and extractability.

This is where BlockRank's block-level architecture makes the most operational difference. Gupta et al. demonstrated that the AI's attention is dense within each block—it reads your content thoroughly, token by token, evaluating relevance against the query.[11] The quality of each individual block matters enormously because each block gets its own moment of full attention.

E_l measures what the AI finds when it looks.

E_l Components

Component	What It Measures	BlockRank Effect
Information Density	Facts per word. Signal-to-noise ratio of the block.	Dense blocks generate stronger attention weights from query tokens[11]
Structural Clarity	Tables, schema, headers, machine-readable formatting.	Well-structured blocks are more extractable in the BlockRank selection[11]
Linguistic Entropy	Does the language show markers of human authorship?	AI slop has detectable entropy signatures; authentic content scores higher
Originality	Does this block contain novel claims, not just restatements?	Original content mapped to `contentEffort` in Google's production system[3]

E_l is the fast variable. Unlike V_g, which takes years to build, E_l can be improved immediately. Every piece of content you publish is an opportunity to raise your E_l—by making your blocks denser, clearer, more extractable, more linguistically authentic.
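
For a rough feel of block density, here is a crude proxy: the share of words that are not filler. This is lexical density, a stand-in I am assuming for illustration. It is not a measure of "facts per word" that any system is known to use, and the `FILLER` list is a small hand-picked sample.

```python
import re

# Hypothetical filler list; a real stopword list would be much longer.
FILLER = {"the", "a", "an", "of", "to", "and", "is", "are", "in", "that", "it",
          "very", "really", "just", "quite", "we", "our", "you", "this", "for",
          "with", "as", "be", "on"}

def lexical_density(text):
    """Fraction of words in the block that are not filler words."""
    words = re.findall(r"[A-Za-z0-9']+", text.lower())
    if not words:
        return 0.0
    content = [w for w in words if w not in FILLER]
    return len(content) / len(words)

dense = "BlockRank ranks 500 documents within one second."
fluffy = "It is really just the case that we are very fast in this."

dense_score = lexical_density(dense)     # every word carries content
fluffy_score = lexical_density(fluffy)   # only "case" and "fast" survive
```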

But here's the trap: if you optimize E_l without building V_g, your well-structured blocks are vulnerable to semantic hijacking. A scraper can copy your content block, reformat it into an even cleaner structure, and compete against you in the same BlockRank selection. Without V_g—without the cryptographic identity anchors that prove you are the original—the AI has no way to know which block to trust.[11]

That's why the formula combines both: V_total = (w₁ × V_g) + (w₂ × E_l). You need identity and content. Provenance and quality. The entity behind the block and the block itself.

• • •

The Weighting Factors: w₁ and w₂

The weighting factors in V_total = (w₁ × V_g) + (w₂ × E_l) are not fixed constants. They're context-dependent variables that shift based on query type, domain, and stakes.[12]

Query Type	w₁ (Identity Weight)	w₂ (Content Weight)	Why
Medical / Legal / Financial	High	Medium	Wrong information kills. Identity verification is critical.
Technical How-To	Medium	High	Content quality matters most—does the block answer the question?
News / Current Events	High	Medium	Source identity prevents misinformation amplification.
Entertainment / Casual	Low	High	Stakes are lower; content relevance dominates.

In BlockRank's architecture, this weighting happens during the attention computation itself. The query tokens that carry the strongest relevance signals—the ones BlockRank identifies in the model's middle layers—encode both what the user is asking and how much trust verification matters for that type of question.[11]

The query itself tells the AI how much to weight identity versus content. For a medical question, the AI's attention patterns shift toward trust signals. For a recipe, they shift toward content quality. The weights are dynamic—and they're computed in real time during BlockRank's block-level evaluation.[11]
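
The table can be turned into a lookup to see the effect. The numeric weights below are placeholders I picked to match "High", "Medium", and "Low"; nothing in the source fixes their values, and `score_for_query` is a hypothetical helper.

```python
# Hypothetical (w1, w2) pairs per query type, following the table's High/Medium/Low labels.
WEIGHTS = {
    "medical": (0.6, 0.4),
    "how_to": (0.4, 0.6),
    "news": (0.6, 0.4),
    "casual": (0.2, 0.8),
}

def score_for_query(query_type, v_g, e_l):
    """Apply the query-dependent weights to V_total = w1 * V_g + w2 * E_l."""
    w1, w2 = WEIGHTS[query_type]
    return w1 * v_g + w2 * e_l

# Made-up blocks: a verified expert versus an anonymous but polished source.
expert = (0.9, 0.6)       # (V_g, E_l)
anonymous = (0.1, 0.9)

medical_winner = max(("expert", expert), ("anonymous", anonymous),
                     key=lambda item: score_for_query("medical", *item[1]))[0]
casual_winner = max(("expert", expert), ("anonymous", anonymous),
                    key=lambda item: score_for_query("casual", *item[1]))[0]
```

Same two blocks, different winner: the expert takes the medical query, and the anonymous block takes the casual one.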

• • •

The Three Pillars That Feed the Score

The Entity Veracity Score doesn't emerge from thin air. Three specific, buildable pillars feed into V_g and E_l—and each one maps directly to how BlockRank selects and trusts content blocks.[11]

Pillar 1: Identity Provenance

Identity provenance is the cryptographic proof that you are who you claim to be. In a BlockRank world, this is the first signal the AI checks—before it even evaluates your content block's relevance.

Identity provenance includes:

DIDs (Decentralized Identifiers): W3C-standard cryptographic anchors that prove entity ownership without depending on any single platform.[7]
DKIM Records: Email authentication infrastructure that cryptographically verifies organizational identity.
Knowledge Graph Presence: Active, reconciled entity nodes in Google's Knowledge Graph (connected to the entity reconciliation system described in patent US11403301B2).[5]
Schema.org Markup: Structured data that makes entity claims machine-readable—directly improving block extractability for BlockRank.[11]

Without identity provenance, your content blocks enter the BlockRank selection as anonymous submissions. They might win on content quality alone—but they're vulnerable to any verified entity that covers the same topic with comparable quality. Identity provenance is the tiebreaker, and in high-stakes queries, it's the gatekeeper.
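
As an illustration of machine-readable identity claims, the sketch below builds a Schema.org `Person` record as JSON-LD. The person, employer, and URLs are invented, and the `did:example:` value is a placeholder in DID syntax, not a resolvable identifier.

```python
import json

# Hypothetical entity; every value here is a placeholder.
person = {
    "@context": "https://schema.org",
    "@type": "Person",
    "name": "Jane Example",
    "jobTitle": "Cardiologist",
    "worksFor": {
        "@type": "Organization",
        "name": "Example Hospital",
        "url": "https://hospital.example",
    },
    "sameAs": ["https://example.org/profiles/jane-example"],
    "identifier": "did:example:123456789abcdefghi",
}

# Serialise for embedding in a <script type="application/ld+json"> tag.
jsonld = json.dumps(person, indent=2)
```

The `worksFor` and `sameAs` properties are where the entity's claims become checkable against other records.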

Pillar 2: Deterministic Handshakes

A deterministic handshake is a confirmation from a trusted institution that verifies your entity's claims. If identity provenance is the deed to your house, deterministic handshakes are the title insurance—third-party verification that the deed is legitimate.

In the context of BlockRank, deterministic handshakes do something powerful: they propagate V_g from the verifying institution to every content block on your domain.[11]

Types of deterministic handshakes:

Institutional verification: "This person works at [Organization]" confirmed by the organization's own verified entity.
Credentialing bodies: Licenses, certifications, academic degrees verified by issuing institutions.
Patent and publication records: Government or academic databases that independently confirm entity claims.
Cross-entity schema connections: When your entity's structured data references another verified entity, and that entity's data references you back—creating a bidirectional handshake.

Each handshake reduces the Sheaf Laplacian energy across the corresponding edge of the trust network. Your claims become more consistent with the network's expectations. Your V_g rises. And every content block you produce benefits from the increase.[12]
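
A bidirectional handshake is easy to check once references are laid out as a graph. This sketch assumes a plain dictionary of "who references whom"; the entity names and the function `bidirectional_handshakes` are hypothetical.

```python
def bidirectional_handshakes(references):
    """Return sorted pairs (a, b) where a references b and b references a."""
    pairs = set()
    for a, targets in references.items():
        for b in targets:
            if a != b and a in references.get(b, set()):
                pairs.add(tuple(sorted((a, b))))
    return sorted(pairs)

# Made-up example: Jane claims two affiliations, only one is confirmed back.
references = {
    "jane": {"hospital", "university"},
    "hospital": {"jane"},
    "university": set(),
}

confirmed = bidirectional_handshakes(references)   # [("hospital", "jane")]
```

The university claim stays one-sided, so it contributes nothing until the university's own record points back.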

Pillar 3: Linguistic Entropy

This is where BlockRank and Entity Veracity converge most powerfully.

BlockRank's block-level evaluation reads your content token by token—dense attention within each block.[11] During that evaluation, the AI's attention patterns encode not just relevance to the query, but authenticity of the content.

Linguistic entropy is the measurable signature of human authorship versus synthetic generation. Human-written content has characteristic patterns:

Variable sentence structure: Humans mix long and short sentences naturally. AI-generated content tends toward uniform length.
Domain-specific vocabulary: Experts use field-specific terms that AI slop replaces with generic alternatives.
Deictic anchoring: Truth-tellers ground their narrative in "here and now"—proximal markers that indicate presence. The Deictic Anchoring Coefficient (DAC) measures this ratio.[16]
Idiosyncratic reasoning: Human expertise produces unexpected connections and non-obvious conclusions that statistical language models avoid.

AI slop—content generated without human expertise—has the opposite signature: uniform entropy, generic vocabulary, distal deictic markers, and predictable reasoning paths. BlockRank's dense within-block attention is reading these signals. Every token tells the AI something about whether this block was produced by a verified expert or a content mill.[11]
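
Two of these patterns can be measured with simple statistics. The sketch below computes sentence-length variation and word-level Shannon entropy. These are crude stylometric measures I am using for illustration. They are not an authorship detector, and nothing here is taken from the BlockRank paper.

```python
import math
import re
from collections import Counter

def sentence_length_cv(text):
    """Coefficient of variation of sentence lengths, counted in words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    mean = sum(lengths) / len(lengths)
    variance = sum((n - mean) ** 2 for n in lengths) / len(lengths)
    return math.sqrt(variance) / mean

def word_entropy(text):
    """Shannon entropy, in bits, of the word frequency distribution."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    total = len(words)
    counts = Counter(words)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

uniform = "The tool is fast. The tool is good. The tool is nice."
varied = ("It failed. After three weeks of testing on live client sites, "
          "the pattern finally became clear.")

uniform_cv = sentence_length_cv(uniform)   # 0.0: every sentence has four words
varied_cv = sentence_length_cv(varied)     # higher: a short sentence next to a long one
```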

This is why Entity Veracity isn't just about infrastructure. It's about the content itself being authentically human. You can have perfect DIDs, flawless schema, and institutional handshakes—but if your content blocks read like AI slop, your E_l collapses. The block-level evaluation catches it.

• • •

Sheaf Cohomology: What H⁰ and H¹ Tell the AI

Two cohomology groups from Sheaf Theory map directly to what BlockRank measures at the block level:[12][13]

H⁰ — The Global Sections (Consensus)

H⁰ of a sheaf represents the global sections—data assignments that are consistent across every edge in the network.[12]

For Entity Veracity, H⁰ captures the space of fully consistent identity claims. If every institution you're connected to confirms your claims, and every cross-reference checks out, your identity data lives in H⁰. You've achieved full consensus.

In plain terms: H⁰ is the set of claims that nobody disputes. Your name, your verified credentials, your confirmed affiliations. When BlockRank encounters a block from an entity with rich H⁰ data, it has maximum confidence in the source. The block enters the selection with the highest possible trust.[11]

H¹ — The First Cohomology (Contradictions)

H¹ captures something different: the obstructions to consistency.[12]

Mathematically, H¹ detects "loops" in the network where local consistency doesn't imply global consistency. You can check each edge individually and everything looks fine—but when you trace around a full loop, the data doesn't add up.

For Entity Veracity, H¹ captures:

Entity A claims to work at Company B, Company B confirms, but the credential dates don't match
An entity has verified identity on Platform X but contradictory claims on Platform Y
Cross-references that create circular verification without independent grounding

Every non-trivial element in H¹ is a trust defect that BlockRank's broader evaluation will eventually detect. These contradictions may not show up in any single block evaluation—but they erode the V_g that feeds into every block's score.[11]

The insight: large H⁰ + small H¹ = high V_g. Many consistent claims, few contradictions. This is what you're building toward.
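
For readers who want to see H⁰ and H¹ as numbers, their dimensions fall out of the rank of a single matrix, the coboundary map that sends node data to edge disagreements. This is a small-scale sketch with one-dimensional data per entity. The function name and the example graphs are my own assumptions, and reading the dimensions as trust signals is this chapter's interpretation.

```python
import numpy as np

def cohomology_dims(node_dims, edges):
    """Return (dim H0, dim H1) of a cellular sheaf on a graph.

    node_dims: dict mapping node -> dimension of that node's data space.
    edges: list of (u, v, F_u, F_v); F_u and F_v map each endpoint's data
           into the edge's space and must have the same number of rows.
    """
    offset, cols = {}, 0
    for node, dim in node_dims.items():
        offset[node] = cols
        cols += dim
    rows = sum(F_u.shape[0] for _, _, F_u, _ in edges)
    delta = np.zeros((rows, cols))   # coboundary: node data -> edge disagreements
    r = 0
    for u, v, F_u, F_v in edges:
        k = F_u.shape[0]
        delta[r:r + k, offset[u]:offset[u] + node_dims[u]] = -F_u
        delta[r:r + k, offset[v]:offset[v] + node_dims[v]] = F_v
        r += k
    rank = int(np.linalg.matrix_rank(delta)) if rows else 0
    return cols - rank, rows - rank   # (consistent assignments, obstructions)

one = np.eye(1)
dims = {"a": 1, "b": 1, "c": 1}

path = [("a", "b", one, one), ("b", "c", one, one)]
twisted = [("a", "b", one, one), ("b", "c", one, one), ("c", "a", one, 2 * one)]

path_dims = cohomology_dims(dims, path)        # (1, 0)
twisted_dims = cohomology_dims(dims, twisted)  # (0, 0)
```

On the path, there is one consistent degree of freedom and no obstruction. On the twisted triangle, each edge can be satisfied on its own, but no nonzero assignment satisfies the whole loop, so dim H⁰ drops to zero.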

• • •

The Death of "Content Is King"

For twenty years, the mantra was simple: create great content and the rankings will follow. Content is king.

BlockRank kills that mantra.

Here's why: if the AI evaluates each content block independently against the query—dense attention within each block, sparse attention across blocks—then the quality of any single block is necessary but not sufficient.[11] You also need the AI to trust the entity behind the block. You need V_g.

In the old world, content quality was the primary ranking signal. You could be anonymous and still rank if your content was good enough. Links and engagement metrics would eventually find you.

In the BlockRank world, anonymity is a liability. The AI can't verify an anonymous block. It can't inherit institutional trust from an entity it can't identify. It can't check your claims against the Knowledge Graph if you're not in the Knowledge Graph.

Content is no longer king. Verified content is king.

The new hierarchy:

Old World (PageRank Era)	New World (BlockRank Era)
Great content ranks	Great content from verified entities ranks
Links signal authority	Cryptographic handshakes signal authority
Keywords match queries	Content blocks match queries at token level[11]
Pages compete for position	Blocks compete for citation
Domain authority inherited by pages	Entity veracity (V_g) inherited by blocks[11]
Anonymous content can rank	Anonymous content is structurally disadvantaged

This isn't speculation. The Google Content Warehouse API leak confirmed that siteAuthority (the V_g analog) and contentEffort (the E_l analog) are already in production—stored in the CompressedQualitySignals buffer that applies to every document on a domain.[3] BlockRank's 2025 architecture tells us how the AI uses these signals at the block level.[11]

The transition is happening now. The window is open. Those who build verifiable entity infrastructure today are positioning their content blocks to win the BlockRank selection tomorrow.

• • •

The Reputation Tax and the Reputation Dividend

In a BlockRank-driven information ecosystem, your Entity Veracity Score isn't static. It accrues compound effects—either positive or negative—that amplify over time.

I call this the Reputation Tax and the Reputation Dividend.

The Reputation Dividend

Every content block you publish from a verified entity with high V_g enters the BlockRank selection with an advantage. If the block also has high E_l—dense, structured, linguistically authentic—it wins citations. Those citations generate engagement signals. Those engagement signals feed back into both V_g (institutional acknowledgment) and E_l (content validation).[11]

The cycle compounds. Higher V_g → blocks win more selections → more citations → higher V_g. This is the reputation dividend.

Entities that start building veracity infrastructure now will benefit from years of compound returns. The cost of building later increases as the competition intensifies and the standards rise.

The Reputation Tax

The inverse is equally powerful—and more punishing.

Every inconsistency in your entity data (H¹ defects) erodes V_g. Every content block that reads as AI slop (low linguistic entropy) lowers E_l. Every unverified claim creates friction in the Sheaf Laplacian calculation.[12]

In a block-level evaluation system, these penalties compound. Lower V_g → blocks lose selections → fewer citations → lower V_g. This is the reputation tax. And unlike the old PageRank world, where you could recover by building new links, the BlockRank world requires you to fix the entity infrastructure—the identity provenance, the institutional handshakes, the linguistic authenticity of your content blocks.[11]
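
The two feedback loops can be sketched as a toy simulation. Everything in it is invented for illustration: the update rule, the gain of 0.2, the baseline of 0.5, and the 0-to-1 scale for V_g are my assumptions, not measured values and not anything published about BlockRank. The only point is to show how a small head start or a small deficit widens when the output feeds back into the input.

```python
def simulate(v_g, rounds=5, gain=0.2, baseline=0.5):
    """Toy feedback loop: V_g drifts up when above the baseline, down when below.

    All parameters are hypothetical; V_g is clamped to the range [0, 1].
    """
    history = [round(v_g, 3)]
    for _ in range(rounds):
        citation_share = v_g  # assumption: citations won are proportional to V_g
        v_g = min(1.0, max(0.0, v_g + gain * (citation_share - baseline)))
        history.append(round(v_g, 3))
    return history


dividend = simulate(0.6)  # starts slightly above the baseline
tax = simulate(0.4)       # starts slightly below the baseline
print(dividend)
print(tax)
```

In this toy model the gap between the two entities grows every round, because each step is proportional to the distance already travelled from the baseline.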

The tax is harder to reverse than the dividend is to build. Start clean. Build verified. Let the compound effects work for you, not against you.

• • •

AI Slop: Why BlockRank Makes It Visible

Every day, millions of AI-generated articles flood the web—grammatically perfect, structurally sound, and utterly worthless. They exist to capture traffic, not to inform. No author with real expertise. No institutional backing. No verifiable provenance. Just... slop.

BlockRank makes AI slop visible because it makes AI slop structurally identifiable.[11]

When BlockRank's dense within-block attention reads a content block token by token, it's computing relevance—but the attention patterns also encode information about the content's characteristics. Slop has consistent signatures:

Low linguistic entropy (uniform sentence structure, predictable vocabulary)
No V_g inheritance (anonymous or unverified entities)
Generic claims that map to no specific expertise domain
High similarity to other blocks in the selection (because multiple scrapers used the same model)
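
The first signature can be approximated with two crude, standard-library measurements. These are proxies I am assuming for illustration, not how BlockRank or any production system scores text: Shannon entropy of the word distribution, and the spread of sentence lengths. Both function names and both sample passages are hypothetical, and word entropy grows with passage length, so compare blocks of similar size.

```python
import math
import re
from collections import Counter
from statistics import pstdev


def word_entropy(text):
    """Shannon entropy (bits per word) of the block's word distribution."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    total = len(words)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def sentence_length_spread(text):
    """Population standard deviation of sentence lengths, measured in words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return pstdev(lengths) if len(lengths) > 1 else 0.0


uniform = "Our product is great. Our service is great. Our team is great."
varied = (
    "We migrated 40 clinics to the new scheduler in March. Two failed. "
    "The cause, a stale DNS record, took a week to find."
)

for name, text in [("uniform", uniform), ("varied", varied)]:
    print(name, round(word_entropy(text), 2), round(sentence_length_spread(text), 2))
```

The uniform passage repeats the same words in the same four-word frame, so both numbers come out lower than for the varied passage. Treat the output as a rough screening signal, not a verdict.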

Here's where it gets interesting: AI systems trained on this slop degrade. With each generation they lose more of the rare, specific detail that separates real expertise from synthetic mimicry. This is model collapse, a documented phenomenon in which AI trained on AI-generated content progressively loses the ability to produce accurate, diverse outputs.[17]

BlockRank is the mechanism that allows AI systems to break out of model collapse. By selecting blocks from verified entities with high V_g and authentic linguistic entropy, BlockRank preferentially surfaces human expertise over synthetic mimicry. The veracity infrastructure described in this book isn't just good for your visibility—it's how AI systems maintain their own integrity.[11]

This is why Entity Veracity is a survival requirement, not a competitive advantage. The AI needs verified entities to function. BlockRank is how it finds them.

• • •

SOP: Enterprise Entity Dig Protocol — BlockRank Readiness Assessment

Use this protocol to assess any entity's readiness for BlockRank-era information retrieval. Works for your own entity or competitive analysis.

Step 1: V_g Infrastructure Audit

Does the entity have a Knowledge Graph presence? (Search [Entity Name] — look for Knowledge Panel)
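Are DIDs or equivalent cryptographic identifiers deployed? For did:web, check that https://[domain]/.well-known/did.json returns a DID document.
Is DKIM properly configured? (Use dig TXT [selector]._domainkey.[domain] or MXToolbox. The selector is the s= value in the DKIM-Signature header of any message sent from the domain.)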
Does structured data (Schema.org JSON-LD) exist on key pages? Check with Google's Rich Results Test.
Are institutional handshakes present? (Entity claimed by verified organizations in their own structured data)
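
Two of the checks above depend on knowing exactly where to look. The sketch below builds the lookup targets without touching the network: the HTTPS URL where a did:web document should live, and the DNS name where a DKIM public key is published. The helper names are mine, and example.com and selector1 are placeholders. A DKIM selector cannot be guessed from the domain; read it from the s= tag in the DKIM-Signature header of a real message.

```python
from urllib.parse import unquote


def did_web_to_url(did):
    """Map a did:web identifier to the HTTPS URL of its DID document."""
    prefix = "did:web:"
    if not did.startswith(prefix):
        raise ValueError("not a did:web identifier")
    parts = did[len(prefix):].split(":")
    host = unquote(parts[0])  # a port is percent-encoded, e.g. example.com%3A3000
    path = "/".join(parts[1:])
    if path:
        return f"https://{host}/{path}/did.json"
    return f"https://{host}/.well-known/did.json"


def dkim_record_name(selector, domain):
    """DNS name of the TXT record that holds a DKIM public key."""
    return f"{selector}._domainkey.{domain}"


print(did_web_to_url("did:web:example.com"))
# https://example.com/.well-known/did.json
print(did_web_to_url("did:web:example.com:team:alice"))
# https://example.com/team/alice/did.json
print(dkim_record_name("selector1", "example.com"))
# selector1._domainkey.example.com
```

Fetch the first URL in a browser or with curl, and pass the last name to dig TXT. A missing document or an empty DNS answer means that signal is not in place yet.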

Step 2: E_l Block Quality Assessment

Select 5 representative content blocks from the entity's primary domain.
For each block, assess:
- Information density: Facts per paragraph. Is each sentence carrying weight?
- Structural extractability: Could an AI cite a specific claim from this block? Or is the information buried in narrative?
- Linguistic authenticity: Does the content show human expertise markers? Domain vocabulary, variable structure, deictic anchoring?
- Machine readability: Tables, lists, schema markup that make the block easy for BlockRank to parse.[11]
Score each block: 1 (AI slop) through 5 (expert-dense, perfectly extractable).

Step 3: Sheaf Consistency Check

Cross-reference entity claims across all platforms (LinkedIn, company site, Google Knowledge Graph, academic profiles).
Flag any contradictions: different job titles, date mismatches, inconsistent affiliations.
Each contradiction is an H¹ defect that erodes V_g.[12]
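
The cross-referencing in this step can be sketched as a small script, assuming you have already copied each platform's claims into a dictionary by hand. The field names, the platforms, and the normalization rule (trim and lowercase) are illustrative choices, not a standard.

```python
from collections import defaultdict


def find_contradictions(profiles):
    """Return {field: {value: [platforms]}} for fields with conflicting values."""
    seen = defaultdict(lambda: defaultdict(list))
    for platform, claims in profiles.items():
        for field, value in claims.items():
            # Normalize so that casing and stray spaces are not flagged as conflicts.
            seen[field][str(value).strip().lower()].append(platform)
    return {
        field: dict(values)
        for field, values in seen.items()
        if len(values) > 1
    }


profiles = {
    "linkedin": {"job_title": "Chief Technology Officer", "start_year": 2019},
    "company_site": {"job_title": "chief technology officer", "start_year": 2021},
    "academic_profile": {"start_year": 2019},
}

for field, values in find_contradictions(profiles).items():
    print(field, values)
# start_year {'2019': ['linkedin', 'academic_profile'], '2021': ['company_site']}
```

The job title differs only in casing, so it is not flagged. The start year is a real mismatch, and the output names the platform that needs correcting.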

Step 4: BlockRank Readiness Score

V_g Score: Count of verified identity signals (0-5 scale based on Steps 1 and 3)
E_l Score: Average of block quality assessments (1-5 scale from Step 2)
Readiness: V_g × E_l (range 0-25). Below 10 = vulnerable. 10 to just under 15 = building. 15-25 = positioned.
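
The Step 4 arithmetic as a minimal sketch. The function name is mine, and I am assuming a score of exactly 15 counts as positioned, so that every score falls into exactly one band.

```python
def readiness(v_g, block_scores):
    """Return (score, band) from a 0-5 V_g score and a list of 1-5 block scores."""
    if not 0 <= v_g <= 5:
        raise ValueError("V_g must be between 0 and 5")
    if not block_scores or any(not 1 <= s <= 5 for s in block_scores):
        raise ValueError("each block score must be between 1 and 5")

    e_l = sum(block_scores) / len(block_scores)  # average block quality from Step 2
    score = v_g * e_l

    if score < 10:
        band = "vulnerable"
    elif score < 15:
        band = "building"
    else:
        band = "positioned"  # assumption: exactly 15 counts as positioned
    return score, band


print(readiness(3, [4, 3, 5, 4, 4]))  # (12.0, 'building')
```

Because the score is a product, a V_g of 0 gives a readiness of 0 no matter how strong the blocks are, which matches the "identity first" advice in Step 5.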

Step 5: Priority Actions

If V_g is low: Start with identity infrastructure (Chapter 4 — DIDs, Chapter 5 — Knowledge Graph integration).
If E_l is low: Restructure content blocks for extractability (Chapter 6 — Content Architecture).
If both are low: Identity first. You can't protect content blocks you haven't claimed.

• • •

What This Means for You

The Entity Veracity Score isn't a metaphor. It's a mathematical structure—grounded in Sheaf Theory, validated by Category Theory, operationalized by BlockRank, and already partially visible in Google's leaked production infrastructure.

Here's what you need to understand:

BlockRank is how AI reads you. Not your page—your blocks. Each content block gets its own moment of dense attention, evaluated independently against the query.[11] The blocks that win get cited. The blocks that lose get ignored. This is happening now, in Gemini, in every AI system built on these principles.

Entity Veracity is why AI trusts you. Your cryptographic identity, your institutional handshakes, your linguistic authenticity—these feed into the V_g and E_l that determine whether your blocks survive the selection.

The formula is V_total = (w₁ × V_g) + (w₂ × E_l). This isn't arbitrary. It's the linearized form of the Sheaf Laplacian energy functional—the mathematically necessary decomposition into global consistency and local fidelity.[12]
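
A worked example of the weighted sum, as a sketch. The weights w1 = w2 = 0.5, the 0-to-1 scales, and the three blocks are hypothetical values chosen only to show how the two terms trade off. Nothing here reflects weights used by any real system.

```python
def v_total(v_g, e_l, w1=0.5, w2=0.5):
    """Entity Veracity Score as a weighted sum (the weights are hypothetical)."""
    return w1 * v_g + w2 * e_l


blocks = [
    {"id": "verified-expert", "v_g": 0.9, "e_l": 0.7},
    {"id": "anonymous-polished", "v_g": 0.1, "e_l": 0.9},
    {"id": "verified-thin", "v_g": 0.9, "e_l": 0.2},
]

ranked = sorted(blocks, key=lambda b: v_total(b["v_g"], b["e_l"]), reverse=True)
print([b["id"] for b in ranked])
# ['verified-expert', 'verified-thin', 'anonymous-polished']
```

With equal weights, the anonymous block with the best writing still finishes last, behind a verified entity's thin block. Shift weight toward E_l and that ordering can flip, which is why the weights matter as much as the two scores.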

AI slop is structurally identifiable. BlockRank's block-level attention reads the linguistic signatures that distinguish human expertise from synthetic mimicry. Building authentic, verifiable content isn't just good practice—it's how you survive the selection.[11]

The window is open now. The veracity layer is small. The infrastructure is buildable. Those who establish cryptographic identity and verifiable content blocks today will benefit from compound reputation dividends as BlockRank matures across every AI system.

The next chapter shows you how to build the cryptographic foundation—the identity provenance layer that makes your entity verifiable at the deepest level.

Let's build it together.

Chapter Summary

BlockRank (2025, Google DeepMind) reveals that AI evaluates content at the block level—dense attention within each block, sparse across blocks. This is how AI decides what to cite.
The Entity Veracity Score: V_total = (w₁ × V_g) + (w₂ × E_l), where V_g is Global Veracity (entity trust) and E_l is Local Extraction (block quality).
This formula is the linearized Sheaf Laplacian—mathematically necessary, not arbitrary. Category Theory proves the score must exist and must be unique.
The Google Content Warehouse API leak confirms dual-signal architecture (siteAuthority = V_g, contentEffort = E_l) in production.
Three pillars feed the score: Identity Provenance, Deterministic Handshakes, and Linguistic Entropy.
AI slop is structurally identifiable through BlockRank's block-level attention. Verified content from verified entities wins the selection.
The Reputation Dividend compounds: high V_g → blocks win selections → more citations → higher V_g. The Reputation Tax compounds equally in the opposite direction.

Glossary

BlockRank (2025): Google DeepMind's method for scalable in-context ranking. Uses structured sparse attention and contrastive learning to evaluate content blocks independently against queries, reducing attention complexity from quadratic to linear. Published by Gupta, You, Bhojanapalli, Kumar, Dhillon, and Yu (arXiv:2510.05396).

Entity Veracity Score (V_total): The composite trustworthiness score assigned to an entity, combining Global Veracity (V_g) and Local Extraction quality (E_l). Determines whether content blocks survive BlockRank selection.

V_g (Global Veracity): The entity-level trust score. Built through cryptographic identity infrastructure, institutional handshakes, and network consistency. Inherited by every content block on the entity's domain.

E_l (Local Extraction): Block-level quality metric measuring information density, structural extractability, linguistic authenticity, and originality. The component that BlockRank evaluates during dense within-block attention.

Inter-document Block Sparsity: BlockRank's key finding: AI attention is dense within each content block but sparse across different blocks. The AI reads each block thoroughly rather than cross-comparing everything at once.

Extractability Auction: The block-level competition where content blocks compete to be selected and cited by AI systems. Won by blocks with the highest combination of source veracity (V_g) and content quality (E_l).

Semantic Hijacking: When a scraper or competitor reformats your original content into a more extractable block structure, potentially winning the BlockRank selection over your original. Prevented by strong V_g (cryptographic identity provenance).

Sheaf Laplacian: Mathematical structure measuring consistency of data across a network. The Entity Veracity formula is its linearized form. Low Laplacian energy = high consistency = high V_g.

Deterministic Handshake: Third-party institutional verification that confirms an entity's claims. Propagates V_g from the verifying institution to every content block on the verified entity's domain.

Linguistic Entropy: Measurable signature of human authorship. Variable sentence structure, domain-specific vocabulary, deictic anchoring, and idiosyncratic reasoning patterns that distinguish authentic expertise from AI slop.

Reputation Tax / Reputation Dividend: Compound effects of Entity Veracity: verified entities accumulate citation advantages (dividend) while unverified or inconsistent entities suffer accelerating disadvantage (tax).

Cross-Stitch

Chapter 2: "The Great Bifurcation" establishes the PageRank → BlockRank transition. This chapter provides the mathematical framework for the new era.
Chapter 4: DIDs and cryptographic anchors — the identity provenance infrastructure that builds V_g for BlockRank selection.
Chapter 5: Knowledge Graph integration and SAVs — how institutional handshakes propagate V_g to your content blocks.
Chapter 6: Content architecture for extractability — how to build content blocks that win the E_l component of BlockRank selection.
Chapter 8: Linguistic Entropy Audit — the methodology for measuring whether your content blocks carry authentic human signatures.

Sources

[1] Kamvar, S., Haveliwala, T., Manning, C., Golub, G. (2003). "Exploiting the Block Structure of the Web for Computing PageRank." Stanford University. http://ilpubs.stanford.edu:8090/579/
[2] Google LLC. (2009). Acquisition of Kaltix Inc. (PageRank optimization technology). Mountain View, CA.
[3] Google Content Warehouse API Documentation Leak (2024). Technical documentation revealing internal ranking attributes including siteAuthority, contentEffort, codomainStrength, and multibangkgEntities. Independently verified by multiple SEO researchers.
[4] Enge, E. (2024). "Google Ranking Factors Exposed: Insights from the Leaked API Documentation." SparkToro. Analysis of 2,596 modules and 14,014 attributes from the leaked Content Warehouse API.
[5] Google LLC. Patent US11403301B2 (2022). "Entity Reconciliation on Knowledge Panels." United States Patent and Trademark Office. https://patents.google.com/patent/US11403301B2
[6] BT Group PLC. (2023). Patent Application: "Sheaf Neural Networks." Applied sheaf-theoretic structures to neural network architectures for heterogeneous data reasoning.
[7] World Wide Web Consortium (W3C). (2022). "Decentralized Identifiers (DIDs) v1.0." W3C Recommendation. https://www.w3.org/TR/did-core/
[8] Robertson, S.E. & Zaragoza, H. (2009). "The Probabilistic Relevance Framework: BM25 and Beyond." Foundations and Trends in Information Retrieval, 3(4), 333-389. doi:10.1561/1500000019
[9] Katz, L. (1953). "A New Status Index Derived from Sociometric Analysis." Psychometrika, 18(1), 39-43. doi:10.1007/BF02289026
[10] Brin, S. & Page, L. (1998). "The Anatomy of a Large-Scale Hypertextual Web Search Engine." Computer Networks, 30(1-7), 107-117. doi:10.1016/S0169-7552(98)00110-X
[11] Gupta, N., You, C., Bhojanapalli, S., Kumar, S., Dhillon, I., & Yu, F. (2025). "Scalable In-context Ranking with Generative Models." Google DeepMind / UT Austin. arXiv:2510.05396. https://arxiv.org/abs/2510.05396
[12] Hansen, J. & Ghrist, R. (2019). "Toward a Spectral Theory of Cellular Sheaves." Journal of Applied and Computational Topology, 3(4), 315-358. doi:10.1007/s41468-019-00038-7
[13] Curry, J. (2014). "Sheaves, Cosheaves and Applications." PhD Thesis, University of Pennsylvania. https://arxiv.org/abs/1303.3255
[14] Mac Lane, S. (1998). Categories for the Working Mathematician. 2nd ed. Springer-Verlag. ISBN: 978-0-387-98403-2
[15] Adámek, J., Herrlich, H., Strecker, G. (2004). Abstract and Concrete Categories: The Joy of Cats. Dover Publications. http://katmat.math.uni-bremen.de/acc/
[16] Newman, M.L., Pennebaker, J.W., Berry, D.S., & Richards, J.M. (2003). "Lying Words: Predicting Deception from Linguistic Styles." Personality and Social Psychology Bulletin, 29(5), 665-675. doi:10.1177/0146167203029005010
[17] Shumailov, I., Shumaylov, Z., Zhao, Y., Gal, Y., Papernot, N., & Anderson, R. (2024). "AI Models Collapse When Trained on Recursively Generated Data." Nature, 631, 755-759. doi:10.1038/s41586-024-07566-y
[18] Marchesin, S., Silvello, G., & Alonso, O. (2024). "Entity Retrieval and Its Grounding in Entity Veracity: A Survey." CIKM 2024: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management.
[19] Montti, R. (2025). "Google's New BlockRank Democratizes Advanced Semantic Search." Analysis of DeepMind BlockRank paper implications for information retrieval.
[20] Goodwin, D. (2025). "Google DeepMind's BlockRank could reshape how AI ranks information." Search Engine Land. https://searchengineland.com/google-deepmind-blockrank-how-ai-ranks-information-463920
[21] Gupta et al. (2025) and Kamvar et al. (2003) independently identified block-structured decomposition as optimal for ranking systems: 22 years apart, different problems, convergent solution. Block decomposition is a fundamental structural principle.