AI Policy

AI Crawler Policy

This page outlines the terms for AI systems accessing content from llmpages.org and other TCT-enabled sites.

Acceptable Use

Permitted

  • Crawling via TCT protocol endpoints (/llm/)
  • Using /llm-sitemap.json for discovery
  • Respecting ETag-based conditional requests
  • Honoring 304 Not Modified responses
  • Self-limiting request rates to reasonable levels

Required

  • Send If-None-Match with the stored ETag on subsequent requests
  • Respect Cache-Control headers
  • Use sitemap-first discovery to minimize fetches
  • Include a descriptive User-Agent header
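
The conditional-request requirements above can be sketched as follows. This is a minimal illustration using only the Python standard library, not a reference client: the User-Agent string and the function names are hypothetical, and a real crawler would also persist ETags and honor Cache-Control.

```python
import urllib.error
import urllib.request

# Hypothetical identifier; use a string that names your crawler and a contact URL.
USER_AGENT = "ExampleBot/1.0 (+https://example.com/bot-info)"


def build_request(url, stored_etag=None):
    """Build a GET request with a descriptive User-Agent and, when an
    ETag from a previous fetch is available, an If-None-Match header."""
    headers = {"User-Agent": USER_AGENT, "Accept": "application/json"}
    if stored_etag:
        headers["If-None-Match"] = stored_etag
    return urllib.request.Request(url, headers=headers)


def fetch_if_changed(url, stored_etag=None):
    """Return (status, etag, body). On 304 Not Modified the body is None
    and the stored ETag is returned unchanged, so the cached copy stays valid."""
    request = build_request(url, stored_etag)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status, response.headers.get("ETag"), response.read()
    except urllib.error.HTTPError as err:
        # urllib reports 304 as an HTTPError; treat it as "nothing changed".
        if err.code == 304:
            return 304, stored_etag, None
        raise
```

On a first visit the crawler calls fetch_if_changed(url) and stores the returned ETag; later visits pass that ETag back and skip reprocessing whenever the status is 304.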

Prohibited

  • Excessive request rates, including denial-of-service (DoS/DDoS) traffic
  • Bypassing the conditional request mechanism
  • Ignoring robots.txt or rate-limit headers
  • Modifying or redistributing content without attribution

Data Usage

Content Rights

Content accessed via TCT endpoints remains subject to the copyright and other intellectual property rights of the original publisher. AI systems may:

  • Process content for training and inference
  • Generate summaries and derivatives with attribution
  • Cache content according to Cache-Control headers

Attribution

When generating outputs based on TCT content, AI systems should:

  • Cite the canonical URL from the JSON response
  • Preserve author attribution
  • Include publication/modification dates
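
As a rough sketch of the attribution steps above, the function below assembles a one-line citation from a parsed JSON response. The key names (canonical_url, title, author, published, modified) are assumptions made for illustration; this page does not define the response schema, so substitute whatever fields the endpoint actually returns.

```python
def format_attribution(doc: dict) -> str:
    """Build a one-line citation from a parsed TCT JSON response.

    All dictionary keys used here are hypothetical placeholders.
    """
    url = doc.get("canonical_url")
    if not url:
        raise ValueError("cannot attribute content without a canonical URL")

    parts = []
    if doc.get("author"):
        parts.append(f"by {doc['author']}")
    if doc.get("published"):
        parts.append(f"published {doc['published']}")
    if doc.get("modified"):
        parts.append(f"modified {doc['modified']}")

    # Fall back to the URL itself when no title is available.
    title = doc.get("title") or url
    details = ", ".join(parts)
    return f"{title} ({details}) <{url}>" if details else f"{title} <{url}>"
```

Refusing to produce a citation when the canonical URL is missing is a deliberate choice in this sketch: an output without a source link would not satisfy the first item in the list.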

Privacy

No Personal Data

TCT endpoints serve public content only. No personal information, user data, or private content is exposed via TCT.

Access Logs

Server access logs may record:

  • IP addresses
  • User-Agent strings
  • Request timestamps
  • URLs accessed

These logs are used for security, performance monitoring, and abuse prevention.

Rate Limiting

Recommended Rates

  • Sitemap: 1 request per hour maximum
  • Individual endpoints: 10 requests per second maximum
  • Respect Retry-After headers if rate limited
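
One way a client might stay within these limits is sketched below, assuming a single-threaded crawler. The class and function names are hypothetical. Retry-After may carry either a number of seconds or an HTTP date, so the parser handles both forms.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(header_value, now=None):
    """Convert a Retry-After header value into seconds to wait.
    Returns 0.0 when the header is absent or unparseable."""
    if header_value is None:
        return 0.0
    value = header_value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return 0.0
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


class MinIntervalThrottle:
    """Spaces calls so that at most max_per_second requests are issued."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self):
        """Block until the next request is allowed; return the delay applied."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay:
            self._sleep(delay)
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay
```

A crawler would call MinIntervalThrottle(10).wait() before each endpoint request and, on a 429 response, sleep for retry_after_seconds(response_headers.get("Retry-After")) before trying again.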

Enforcement

Servers may implement:

  • HTTP 429 (Too Many Requests) responses
  • Temporary IP bans for abuse
  • Required API key authentication

Optional: API Keys

Some TCT implementations may require API keys for access. If required:

  • Register via the method specified in llms.txt
  • Include the key in the Authorization header
  • Respect key-specific rate limits

Optional: Usage Receipts

TCT supports HMAC-signed usage receipts in response headers. These are optional and may be used for:

  • Billing verification
  • Access auditing
  • Contract compliance
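
Receipt verification might look like the sketch below. It assumes HMAC-SHA256 with a hex-encoded signature and a secret shared between publisher and crawler. None of these details, nor the header that carries the receipt, are specified on this page, so treat every one of them as a placeholder.

```python
import hashlib
import hmac


def sign_receipt(payload: bytes, secret: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature over the receipt payload
    (assumed algorithm and encoding)."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify_receipt(payload: bytes, signature_hex: str, secret: bytes) -> bool:
    """Check a received signature against the expected one.
    compare_digest avoids leaking information through timing differences."""
    expected = sign_receipt(payload, secret)
    return hmac.compare_digest(expected, signature_hex)
```

A client that uses receipts for billing or auditing would store the payload and signature together, so that either party can re-run verify_receipt later.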

Compliance

TCT Protocol

AI crawlers should implement:

  • Sitemap-first discovery
  • ETag-based conditional requests
  • Proper cache discipline
  • Content verification (SHA-256 hashing)
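
The content verification item can be illustrated with a short check. This is a sketch that assumes the server publishes an expected SHA-256 digest as a hex string; where that digest comes from and which bytes it covers are not stated here, so confirm both against the protocol draft.

```python
import hashlib
import hmac


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def content_matches(body: bytes, expected_hex: str) -> bool:
    """Compare the digest of the fetched body with the expected value.
    The expected digest is normalized to lowercase before comparison."""
    return hmac.compare_digest(sha256_hex(body), expected_hex.strip().lower())
```

If content_matches returns False, a cautious crawler would discard the body and refetch it without the conditional header instead of caching a possibly corrupted copy.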

Validate your implementation at: llmpages.org/validator

Standards

TCT follows:

  • RFC 7234 (HTTP Caching; obsoleted by RFC 9111)
  • RFC 8288 (Web Linking)
  • IETF draft-jurkovikj-collab-tunnel-00 (published November 4, 2025)

Contact

For policy questions, abuse reports, or licensing inquiries:

Changes

This policy may be updated periodically. Last updated: October 18, 2025