
RSL Crawler Authorization Protocol (CAP)

Version 1.0 Draft. Last updated: 2025-09-10.

Websites traditionally rely on web crawlers to correctly identify themselves via the User-Agent header and follow robots.txt rules. In practice, many crawlers spoof well-known agents or ignore the restrictions defined in robots.txt.

With RSL, websites can enforce stricter control over their content usage by blocking crawlers that have not obtained a license (free or paid) from an RSL License Server. When a crawler requests a page covered by an RSL license on your site, it must include a valid RSL License Token in the Authorization header using the new License scheme within the HTTP Authentication framework (RFC 7235).

The License scheme follows the semantics of the widely adopted OAuth 2.0 Bearer Token Usage specification (RFC 6750), but carries an RSL license token as the credential.

Example Code: Defining a Crawling License

Below is an RSL license file specifying that crawlers must first obtain a free license and associated <license_token> from the RSL license server at https://api.rslcollective.org:

xml
<rsl xmlns="https://rslstandard.org/rsl">
  <content url="/" server="https://api.rslcollective.org">
    <license>
      <permits type="usage">all</permits>
    </license>
  </content>
</rsl>
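As a rough illustration of how a crawler might read this file, the sketch below parses the example with Python's standard library and extracts the license server and permitted usage. The function name read_license and the returned dictionary layout are assumptions made for this example, not part of RSL.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the example license file above.
RSL_NS = {"rsl": "https://rslstandard.org/rsl"}

LICENSE_XML = """<rsl xmlns="https://rslstandard.org/rsl">
  <content url="/" server="https://api.rslcollective.org">
    <license>
      <permits type="usage">all</permits>
    </license>
  </content>
</rsl>"""


def read_license(xml_text):
    """Return one dict per <content> element: its URL, license server, and permits.

    Hypothetical helper for illustration; the dict layout is an assumption.
    """
    root = ET.fromstring(xml_text)
    entries = []
    for content in root.findall("rsl:content", RSL_NS):
        # Collect (type, value) pairs from every <permits> under <license>.
        permits = [
            (p.get("type"), (p.text or "").strip())
            for p in content.findall("rsl:license/rsl:permits", RSL_NS)
        ]
        entries.append({
            "url": content.get("url"),
            "server": content.get("server"),
            "permits": permits,
        })
    return entries


entries = read_license(LICENSE_XML)
```

Here entries[0]["server"] is the license server a crawler would contact to obtain its token.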

Example: Crawler Request with License Authentication

A licensed crawler authorizes requests by sending the <license_token> in the Authorization header:

http
GET /data HTTP/1.1
User-Agent: GPTBot
Authorization: License <license_token>
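On the crawler side, the same request could be assembled roughly as follows. This is a minimal sketch using Python's urllib; the URL, the token value, the user agent string, and the helper name build_licensed_request are placeholders chosen for this example. The request is only constructed, not sent.

```python
import urllib.request


def build_licensed_request(url, license_token, user_agent):
    """Build a GET request that carries the token under the License scheme.

    Hypothetical helper; the token is assumed to have been obtained
    from the license server beforehand.
    """
    request = urllib.request.Request(url, method="GET")
    request.add_header("User-Agent", user_agent)
    # Scheme name, a single space, then the token.
    request.add_header("Authorization", f"License {license_token}")
    return request


# Placeholder values for illustration only.
request = build_licensed_request(
    "https://example.com/data", "example-token", "ExampleBot/1.0"
)
```

Passing the request to urllib.request.urlopen would send it; a 401, 402, or 403 reply surfaces there as urllib.error.HTTPError.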

Successful Response

If the crawler presents a License Authorization header carrying a valid <license_token>, the server must respond with an HTTP 200 OK status code and include the requested content. The server must also include a Link header pointing to the governing license (see also Adding RSL to HTTP Headers).

Example: HTTP 200 OK Response

http
HTTP/1.1 200 OK
Link: <https://example.com/license.xml>; rel="license"; type="application/rsl+xml"
Content-Type: text/html; charset=UTF-8
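A client that wants to locate the governing license can pull it out of the Link header. The sketch below is a simplified parser that assumes the header looks like the example above (URL in angle brackets, parameters separated by semicolons, no commas inside quoted values); find_license_link is a hypothetical helper name, and a production client would want a full Link header parser.

```python
import re


def find_license_link(link_header):
    """Return the URL of the first link whose rel includes "license", or None.

    Simplified: does not handle commas or semicolons inside quoted values.
    """
    # Each link value looks like: <URL>; param="value"; param="value"
    for match in re.finditer(r'<([^>]*)>((?:\s*;\s*[^,;]+)*)', link_header):
        url, params = match.group(1), match.group(2)
        rel = re.search(r'rel\s*=\s*"?([^";]+)"?', params)
        # rel may hold several space-separated relation types.
        if rel and "license" in rel.group(1).split():
            return url
    return None


header = '<https://example.com/license.xml>; rel="license"; type="application/rsl+xml"'
license_url = find_license_link(header)
```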

Error Responses

If a crawler does not present a License Authorization header, or presents an invalid, expired, or revoked <license_token>, the server must respond with 401 Unauthorized or 402 Payment Required. The response must include a WWW-Authenticate: License header with error information and a Link header to the governing license. If the token is valid but the license terms do not allow the request, respond with 403 Forbidden (see next section).

Example: HTTP 401 Unauthorized Response

http
HTTP/1.1 401 Unauthorized
WWW-Authenticate: License error="invalid_request", error_description="Access to this resource requires a license"
Link: <https://example.com/license.xml>; rel="license"; type="application/rsl+xml"
Content-Type: text/plain; charset=UTF-8

Example: HTTP 402 Payment Required Response

http
HTTP/1.1 402 Payment Required
WWW-Authenticate: License error="invalid_request", error_description="Access to this resource requires a payment"
Link: <https://example.com/license.xml>; rel="license"; type="application/rsl+xml"
Content-Type: text/plain; charset=UTF-8

Handling Forbidden Requests (Insufficient Scope)

If a crawler presents a valid License Authorization header but the license terms do not permit the requested resource or use (e.g., URL not covered, disallowed usage type, geographic restriction), the server must respond with 403 Forbidden. The response must include a WWW-Authenticate header with error="insufficient_scope" and a Link header to the governing license.

Example: HTTP 403 Forbidden Response

http
HTTP/1.1 403 Forbidden
WWW-Authenticate: License error="insufficient_scope", error_description="Your license does not permit access to this resource"
Link: <https://example.com/license.xml>; rel="license"; type="application/rsl+xml"
Content-Type: text/plain; charset=UTF-8
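The status rules described so far can be summarized in one server-side decision function. This is a sketch under stated assumptions, not a reference implementation: the TokenState enum, the choose_response name, and the payment_required flag are invented for this example, and choosing 402 over 401 when the license must be paid for is an assumption based on the example error descriptions. Token validation itself is out of scope here.

```python
from enum import Enum


class TokenState(Enum):
    """Hypothetical outcome of validating the Authorization header."""
    MISSING = "missing"
    INVALID = "invalid"                  # malformed, expired, or revoked
    NOT_PERMITTED = "not_permitted"      # valid token, terms do not allow request
    VALID = "valid"


LICENSE_LINK = '<https://example.com/license.xml>; rel="license"; type="application/rsl+xml"'


def choose_response(state, payment_required=False):
    """Return (status_code, headers) for a request.

    Every response carries the Link header to the governing license.
    """
    headers = {"Link": LICENSE_LINK}
    if state is TokenState.VALID:
        return 200, headers
    if state is TokenState.NOT_PERMITTED:
        headers["WWW-Authenticate"] = (
            'License error="insufficient_scope", '
            'error_description="Your license does not permit access to this resource"'
        )
        return 403, headers
    # Missing or invalid token. Assumption: 402 when a paid license is needed.
    if payment_required:
        headers["WWW-Authenticate"] = (
            'License error="invalid_request", '
            'error_description="Access to this resource requires a payment"'
        )
        return 402, headers
    headers["WWW-Authenticate"] = (
        'License error="invalid_request", '
        'error_description="Access to this resource requires a license"'
    )
    return 401, headers


status, headers = choose_response(TokenState.NOT_PERMITTED)
```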

WWW-Authenticate Header Fields

| Field | Description |
| --- | --- |
| License (scheme) | Indicates the request MUST use the License authentication scheme. |
| error | One of: invalid_request, invalid_token, insufficient_scope. |
| error_description | Human-readable explanation of the failure. |
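A crawler can read these fields from the WWW-Authenticate header to decide how to react. The sketch below assumes a single challenge with double-quoted parameter values and no escaped quotes, as in the examples in this document; parse_www_authenticate is a hypothetical helper name.

```python
import re


def parse_www_authenticate(header_value):
    """Split a single-challenge header into (scheme, params).

    Simplified: assumes quoted values without escaped quotes.
    """
    # The scheme is everything before the first space.
    scheme, _, rest = header_value.strip().partition(" ")
    # Parameters are name="value" pairs separated by commas.
    params = dict(re.findall(r'(\w+)\s*=\s*"([^"]*)"', rest))
    return scheme, params


scheme, params = parse_www_authenticate(
    'License error="insufficient_scope", '
    'error_description="Your license does not permit access to this resource"'
)
```

After this call, scheme holds the authentication scheme and params["error"] holds the error code.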

Authentication Error Codes

| Error Code | Meaning |
| --- | --- |
| invalid_request | Malformed request (missing/duplicate/unsupported params). |
| invalid_token | License token missing, expired, revoked, or malformed. |
| insufficient_scope | License token valid, but license terms don’t allow this request. |