Set Up Your Robots.txt

Overview

Keep your robots.txt up to date with the agent list automatically. The agent list is updated continuously, so generating your robots.txt with the WordPress plugin or the API is more effective than maintaining one by hand.

1. Create a New Project

Sign up and create a new project for your website if you haven't already.

2. Copy Your Access Token

Copy your project's access token. You'll use it as the bearer token when authenticating requests to the API.

3. Generate and Serve the Robots.txt

There are three ways to generate and serve a robots.txt from your website.

Option 1: Using the WordPress Plugin

Use this method for WordPress websites. Installing the plugin is quick and easy.

Option 2: Using the API

Make a request to the Robots.txts endpoint to generate a new robots.txt. Do this periodically (e.g. once per day), then cache and serve the result.

The Request

Endpoint
URL: https://api.darkvisitors.com/robots-txts
HTTP Method: POST

Headers

Authorization: A bearer token containing your project's access token (e.g. Bearer 48d7dcbd-fc44-4b30-916b-2a5955c8ee42).

Body

agent_types: An array of agent types to include in the robots.txt. The agent types are AI Assistant, AI Data Scraper, AI Search Crawler, and Undocumented AI Agent.

disallow: A string specifying which URLs are disallowed. Defaults to / to disallow all URLs.

The Response

The response body is a robots.txt in text/plain format. You can use this as is, or append additional lines to include things like sitemap directives. Serve this as your website's robots.txt.
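For example, here's a one-line sketch of appending a sitemap directive, assuming robotsTXT holds the response body from the example below (the sitemap URL is a placeholder):

// Hypothetical: append a sitemap directive before serving.
// The sitemap URL is a placeholder for your own.
const robotsTXTWithSitemap = robotsTXT + "\nSitemap: https://www.example.com/sitemap.xml\n"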

Example

This JavaScript example generates a robots.txt that blocks all known AI data scrapers and undocumented AI agents from all URLs.

const response = await fetch("https://api.darkvisitors.com/robots-txts", {
    method: "POST",
    headers: {
        "Authorization": "Bearer " + ACCESS_TOKEN,
        "Content-Type": "application/json"
    },
    body: JSON.stringify({
        agent_types: [
            "AI Data Scraper",
            "Undocumented AI Agent"
        ],
        disallow: "/"
    })
})

// The response body is the robots.txt itself
const robotsTXT = await response.text()

// Cache and serve this from your website's /robots.txt path
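To refresh the file periodically, regenerate it on an interval and serve the cached copy. Here's a minimal sketch using Node and Express; Express, the /robots.txt route, the fallback content, and the 24-hour interval are all assumptions, not requirements:

// Minimal caching sketch (Node 18+ for global fetch, Express assumed)
const express = require("express")

const ACCESS_TOKEN = process.env.DARK_VISITORS_ACCESS_TOKEN
const ONE_DAY_MS = 24 * 60 * 60 * 1000

// Permissive fallback served until the first fetch succeeds
let cachedRobotsTXT = "User-agent: *\nDisallow:\n"

async function refreshRobotsTXT() {
    try {
        const response = await fetch("https://api.darkvisitors.com/robots-txts", {
            method: "POST",
            headers: {
                "Authorization": "Bearer " + ACCESS_TOKEN,
                "Content-Type": "application/json"
            },
            body: JSON.stringify({
                agent_types: ["AI Data Scraper", "Undocumented AI Agent"],
                disallow: "/"
            })
        })
        if (response.ok) {
            cachedRobotsTXT = await response.text()
        }
    } catch (error) {
        // Keep serving the previous copy if the refresh fails
        console.error("Failed to refresh robots.txt:", error)
    }
}

refreshRobotsTXT()
setInterval(refreshRobotsTXT, ONE_DAY_MS)

const app = express()

app.get("/robots.txt", (req, res) => {
    res.type("text/plain").send(cachedRobotsTXT)
})

app.listen(3000)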

Option 3: Using the Boilerplate

The boilerplate below blocks all currently known AI data scrapers and undocumented AI agents. You can use it as a starting point and customize it manually using the agent list. If you sign up, you'll be notified when new agents are added. However, the WordPress plugin or the API is a more effective strategy, since the agent list is updated continuously.

# Dark Visitors Robots.txt

# AI Data Scraper
# https://darkvisitors.com/agents/bytespider

User-agent: Bytespider
Disallow: /

# AI Data Scraper
# https://darkvisitors.com/agents/ccbot

User-agent: CCBot
Disallow: /

# AI Data Scraper
# https://darkvisitors.com/agents/diffbot

User-agent: Diffbot
Disallow: /

# AI Data Scraper
# https://darkvisitors.com/agents/facebookbot

User-agent: FacebookBot
Disallow: /

# AI Data Scraper
# https://darkvisitors.com/agents/google-extended

User-agent: Google-Extended
Disallow: /

# AI Data Scraper
# https://darkvisitors.com/agents/gptbot

User-agent: GPTBot
Disallow: /

# AI Data Scraper
# https://darkvisitors.com/agents/omgili

User-agent: omgili
Disallow: /

# Undocumented AI Agent
# https://darkvisitors.com/agents/anthropic-ai

User-agent: anthropic-ai
Disallow: /

# Undocumented AI Agent
# https://darkvisitors.com/agents/claude-web

User-agent: Claude-Web
Disallow: /

# Undocumented AI Agent
# https://darkvisitors.com/agents/claudebot

User-agent: ClaudeBot
Disallow: /

# Undocumented AI Agent
# https://darkvisitors.com/agents/cohere-ai

User-agent: cohere-ai
Disallow: /