Set Up Automatic Robots.txt

Overview

Automatically keep your robots.txt updated with the latest known AI scrapers, crawlers, and assistants.

New agents are added frequently, so setting up an automatic robots.txt that continuously updates is more effective than maintaining one manually. You can do this using the API or the WordPress plugin.

We also recommend setting up agent analytics to check whether they're actually following your robots.txt rules.

Enforce Your Robots.txt
The WordPress plugin can also block agents that try to ignore your robots.txt rules.

1. Create a New Project

Sign up and create a new project for your website if you haven't already.

2. Copy Your Access Token

Copy your project's access token. You'll use it in the next step to authenticate with the API.

3. Generate and Serve the Robots.txt

There are two ways to generate and serve a robots.txt from your website.

Option 1: Using the WordPress Plugin

Use this method for WordPress websites. Adding the plugin is quick and easy.

Option 2: Using the API

Make a request to the Robots.txts endpoint to generate a new robots.txt. Do this periodically (e.g. once per day), then cache and serve the result.

The Request

Endpoint
URL: https://api.darkvisitors.com/robots-txts
HTTP Method: POST

Headers
Authorization: A bearer token with your project's access token (e.g. Bearer 48d7dcbd-fc44-4b30-916b-2a5955c8ee42).
Content-Type: Must be set to application/json.

Body
agent_types: An array of agent types. Agent types include AI Assistant, AI Data Scraper, AI Search Crawler, and Undocumented AI Agent.
disallow: A string specifying which URLs are disallowed. Defaults to / to disallow all URLs.
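
Putting these together, the request body used in the examples below blocks AI data scrapers and undocumented AI agents from all URLs:

{
    "agent_types": ["AI Data Scraper", "Undocumented AI Agent"],
    "disallow": "/"
}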

The Response

The response body is a robots.txt in text/plain format. You can use this as is, or append additional lines to include things like sitemap directives. Cache and serve this as your website's robots.txt.
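
For instance, if you want to keep a sitemap directive in your robots.txt, you can append it to the returned text before caching it. Here's a minimal sketch in Node.js; the withSitemap helper name and the sitemap URL are placeholders for illustration:

// Append a Sitemap directive to the robots.txt text returned by the API
function withSitemap(robotsTxt, sitemapUrl) {
    return robotsTxt.trimEnd() + "\nSitemap: " + sitemapUrl + "\n"
}

// e.g. withSitemap(robotsTxt, "https://www.example.com/sitemap.xml")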

Example

This cURL example generates a robots.txt that blocks all known AI data scrapers and undocumented AI agents from all URLs.

curl -X POST https://api.darkvisitors.com/robots-txts \
-H "Authorization: Bearer ${ACCESS_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
    "agent_types": [
        "AI Data Scraper",
        "Undocumented AI Agent"
    ],
    "disallow": "/"
}'

Here's an example of how to use this in practice for a Node.js backend:

const response = await fetch("https://api.darkvisitors.com/robots-txts", {
    method: "POST",
    headers: {
        "Authorization": "Bearer " + ACCESS_TOKEN,
        "Content-Type": "application/json"
    },
    body: JSON.stringify({
        agent_types: [
            "AI Data Scraper",
            "Undocumented AI Agent"
        ],
        disallow: "/"
    })
})

// Cache and serve response.text() as your robots.txt
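
Here's a minimal sketch of what that caching and serving step might look like, using Node's built-in http module. It refreshes the robots.txt once per day and serves the cached copy at /robots.txt. The port, refresh interval, environment variable name, and fallback behavior are placeholder choices, not requirements of the API:

import http from "node:http"

const ACCESS_TOKEN = process.env.ACCESS_TOKEN // Your project's access token
const REFRESH_INTERVAL_MS = 24 * 60 * 60 * 1000 // Refresh once per day

let cachedRobotsTxt = "" // Holds the most recently fetched robots.txt

async function refreshRobotsTxt() {
    try {
        const response = await fetch("https://api.darkvisitors.com/robots-txts", {
            method: "POST",
            headers: {
                "Authorization": "Bearer " + ACCESS_TOKEN,
                "Content-Type": "application/json"
            },
            body: JSON.stringify({
                agent_types: ["AI Data Scraper", "Undocumented AI Agent"],
                disallow: "/"
            })
        })

        if (response.ok) {
            cachedRobotsTxt = await response.text()
        }
    } catch (error) {
        // Keep serving the previously cached copy if a refresh fails
        console.error("Failed to refresh robots.txt:", error)
    }
}

await refreshRobotsTxt()
setInterval(refreshRobotsTxt, REFRESH_INTERVAL_MS)

// Serve the cached robots.txt at /robots.txt
http.createServer((request, response) => {
    if (request.url === "/robots.txt") {
        response.writeHead(200, { "Content-Type": "text/plain" })
        response.end(cachedRobotsTxt)
    } else {
        response.writeHead(404)
        response.end()
    }
}).listen(3000)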

You can adapt these examples to call the API from any language.