For AI Crawlers: Efficient Content Access via TCT Protocol

Reduce bandwidth by 83% and token usage by 86% with efficient, verifiable, standardized content delivery.

Quick Start

Discovery: Use https://llmpages.org/llm-sitemap.json to discover all available content.

Per-Article JSON: Access machine-readable content at {canonical-url}/llm/

Example:

  • C-URL (canonical, human-facing page): https://example.com/article/
  • M-URL (machine-readable JSON endpoint): https://example.com/article/llm/
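
The mapping from C-URL to M-URL can be sketched in a few lines of Python. This is an illustration of the {canonical-url}/llm/ pattern shown above, not part of any official client; the function name machine_url is hypothetical, and dropping the URL fragment is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit


def machine_url(canonical_url: str) -> str:
    """Derive the M-URL by appending /llm/ to the canonical URL's path."""
    parts = urlsplit(canonical_url)
    # Normalize the trailing slash so we never produce "//llm/".
    path = parts.path.rstrip("/") + "/llm/"
    # Assumption: fragments are irrelevant to crawlers, so they are dropped.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


print(machine_url("https://example.com/article/"))
# prints https://example.com/article/llm/
```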

Protocol Benefits

Measured results from 970+ production URLs:

  • 83% bandwidth savings – 103 KB → 17.7 KB average
  • 86% token reduction – 13,900 → 1,960 tokens
  • 90%+ skip rate – Unchanged content returns 304 Not Modified
  • Template-invariant – Same content hash regardless of theme
  • Verifiable – SHA-256 content fingerprints
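
The two headline percentages follow directly from the averages quoted in the list. A quick check, using only the figures above (the helper name percent_saved is illustrative):

```python
def percent_saved(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100


bandwidth = percent_saved(103, 17.7)     # KB per page, average
tokens = percent_saved(13_900, 1_960)    # tokens per page, average

print(f"bandwidth: {bandwidth:.1f}%")    # bandwidth: 82.8%
print(f"tokens: {tokens:.1f}%")          # tokens: 85.9%
```

Rounded to whole numbers, these give the 83% and 86% figures.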

HTTP Headers

Required Headers in M-URL Response:

  • Link: <canonical-url>; rel="canonical" – Bidirectional handshake
  • ETag: "sha256-abc123..." – Content fingerprint
  • Cache-Control: max-age=0, must-revalidate – Conditional request discipline
  • Vary: Accept – Content negotiation
  • Content-Type: application/json; charset=utf-8
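
A crawler may want to sanity-check these headers before trusting an M-URL response. The sketch below is one reading of the list above, not the official validator's logic; the function name check_tct_headers and the exact matching rules (substring checks, case-insensitive names) are assumptions.

```python
def check_tct_headers(headers: dict) -> list:
    """Return the names of required headers that are missing or look wrong."""
    # HTTP header names are case-insensitive, so normalize them first.
    h = {name.lower(): value for name, value in headers.items()}
    problems = []

    if 'rel="canonical"' not in h.get("link", ""):
        problems.append("Link")
    if "sha256-" not in h.get("etag", ""):
        problems.append("ETag")

    cache_control = h.get("cache-control", "").lower()
    if "max-age=0" not in cache_control or "must-revalidate" not in cache_control:
        problems.append("Cache-Control")

    vary_fields = [v.strip().lower() for v in h.get("vary", "").split(",")]
    if "accept" not in vary_fields:
        problems.append("Vary")

    if not h.get("content-type", "").lower().startswith("application/json"):
        problems.append("Content-Type")

    return problems
```

An empty list means every required header was present in the expected form; it does not prove that the ETag matches the content.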

Conditional Requests

To skip re-downloading unchanged content (90%+ of requests in the measurements above):

  1. First request: Store the ETag value
  2. Subsequent requests: Send If-None-Match: "sha256-abc123..."
  3. If unchanged: Receive 304 Not Modified (no response body)
  4. If changed: Receive 200 OK with new content and new ETag
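
The four steps above can be sketched with Python's standard library alone. This is a minimal illustration, not the collab-tunnel client's API; the function name fetch_if_changed and its opener parameter (included so the function can be exercised without a network) are assumptions. Note that urllib reports 304 as an HTTPError, so the sketch catches it explicitly.

```python
import urllib.error
import urllib.request


def fetch_if_changed(url, cached_etag=None, opener=urllib.request.urlopen):
    """Return (status, etag, body); body is None when the server answers 304."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    if cached_etag:
        # Step 2: present the stored fingerprint to the server.
        request.add_header("If-None-Match", cached_etag)
    try:
        with opener(request) as response:
            # Steps 1 and 4: full body plus the ETag to store for next time.
            return response.status, response.headers.get("ETag"), response.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Step 3: unchanged, so keep using the cached copy and ETag.
            return 304, cached_etag, None
        raise
```

A crawler would persist the returned ETag per M-URL and pass it back as cached_etag on the next visit, for example fetch_if_changed("https://example.com/article/llm/", stored_etag).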

Client Libraries

Python:

pip install collab-tunnel

GitHub: github.com/antunjurkovic-collab/collab-tunnel-python

Production Sites

Test against these verified implementations:

  • wellbeing-support.com – 400 URLs, 10/10 compliance
  • omacedonii.com – 500 URLs, 10/10 compliance
  • bestdemotivationalposters.com – 500 URLs, 9/10 compliance

Validation

Verify TCT compliance: llmpages.org/validator

Support