Set Up Automatic Robots.txt
Overview
Keep your robots.txt updated with the latest known AI scrapers, crawlers, and assistants automatically.
New agents are added frequently, so setting up an automatic robots.txt that continuously updates is more effective than maintaining one manually. You can do this using the API or the WordPress plugin.
We also recommend setting up agent analytics to check whether they're actually following your robots.txt rules.
1. Create a New Project
Sign up and create a new project for your website if you haven't already.
- Navigate to the Projects page
- Click the New Project button
- Enter your website details
- Click Create
2. Copy Your Access Token
- Click on your project
- Click Settings
- Copy your access token
3. Generate and Serve the Robots.txt
There are two ways to generate and serve a robots.txt from your website.
Option 1: Using the WordPress Plugin
Use this method for WordPress websites. Adding the plugin is quick and easy.
- Log in to your website's WordPress dashboard
- Click Plugins in the sidebar
- Search for "Dark Visitors" or download the plugin directly
- Click Install Now
- Click Activate
- Click Dark Visitors in the sidebar
- Paste your access token
- Select the agent types you want to block
- Click Save Changes
Option 2: Using the API
Make a request to the Robots.txts endpoint to generate a new robots.txt. Do this periodically (e.g. once per day), then cache and serve the result.
The Request
Endpoint | |
---|---|
URL | https://api.darkvisitors.com/robots-txts |
HTTP Method | POST |

Headers | |
---|---|
Authorization | A bearer token with your project's access token (e.g. Bearer 48d7dcbd-fc44-4b30-916b-2a5955c8ee42). |
Content-Type | Must be set to application/json. |

Body | |
---|---|
agent_types | An array of agent types to block. Agent types include AI Assistant, AI Data Scraper, AI Search Crawler, and Undocumented AI Agent. |
disallow | A string specifying which URLs are disallowed. Defaults to /, which disallows all URLs. |
The Response
The response body is a robots.txt in text/plain format. You can use this as is, or append additional lines to include things like sitemap directives. Cache and serve this as your website's robots.txt.
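For example, here's a minimal sketch of appending a sitemap directive before serving the result. It assumes the response body is already stored in a robotsTxt variable, and the sitemap URL is a placeholder:

// robotsTxt holds the text/plain body returned by the Robots.txts endpoint
// The sitemap URL is a placeholder; replace it with your own
const robotsTxtWithSitemap = robotsTxt.trimEnd() + "\n\nSitemap: https://www.example.com/sitemap.xml\n"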
Example
This cURL example generates a robots.txt that blocks all known AI data scrapers and undocumented AI agents from all URLs.
curl -X POST https://api.darkvisitors.com/robots-txts \
    -H "Authorization: Bearer ${ACCESS_TOKEN}" \
    -H "Content-Type: application/json" \
    -d '{
        "agent_types": [
            "AI Data Scraper",
            "Undocumented AI Agent"
        ],
        "disallow": "/"
    }'
Here's an example of how to use this in practice for a Node.js backend:
const response = await fetch("https://api.darkvisitors.com/robots-txts", {
    method: "POST",
    headers: {
        "Authorization": "Bearer " + ACCESS_TOKEN,
        "Content-Type": "application/json"
    },
    body: JSON.stringify({
        agent_types: [
            "AI Data Scraper",
            "Undocumented AI Agent"
        ],
        disallow: "/"
    })
})

// Cache and serve response.text() as your robots.txt
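Here's a minimal sketch of how the caching and serving step could look, assuming Node.js 18+ (for the built-in fetch) and the Express framework. The daily refresh interval, fallback content, route, and variable names are illustrative choices, not part of the Dark Visitors API:

const express = require("express")

const app = express()
const ACCESS_TOKEN = process.env.DARK_VISITORS_ACCESS_TOKEN

// Fallback served until the first successful refresh (allows everything)
let cachedRobotsTxt = "User-agent: *\nDisallow:"

async function refreshRobotsTxt() {
    try {
        const response = await fetch("https://api.darkvisitors.com/robots-txts", {
            method: "POST",
            headers: {
                "Authorization": "Bearer " + ACCESS_TOKEN,
                "Content-Type": "application/json"
            },
            body: JSON.stringify({
                agent_types: ["AI Data Scraper", "Undocumented AI Agent"],
                disallow: "/"
            })
        })
        if (response.ok) {
            cachedRobotsTxt = await response.text()
        }
    } catch (error) {
        // Keep serving the previous cached copy if the refresh fails
        console.error("Failed to refresh robots.txt:", error)
    }
}

// Refresh immediately on startup, then once per day
refreshRobotsTxt()
setInterval(refreshRobotsTxt, 24 * 60 * 60 * 1000)

// Serve the cached copy as the website's robots.txt
app.get("/robots.txt", (request, response) => {
    response.type("text/plain").send(cachedRobotsTxt)
})

app.listen(3000)

Refreshing in the background like this keeps the /robots.txt route fast to serve, and a failed refresh never takes the file offline because the previous cached copy remains in place.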
You can adapt these examples to call the API from any language.