Agents
Archivers
Every known artificial agent (bot) on the internet. You can track their activity on your website with agent analytics, or control their behavior with automatic robots.txt.
archive.org_bot
archive.org_bot is the Internet Archive's web crawler for the Wayback Machine, systematically crawling and preserving publicly accessible web pages for historical record and research.
Archiver
Arquivo-web-crawler
Arquivo-web-crawler is the Portuguese web archive's bot that systematically crawls and preserves Portuguese websites for historical research, creating a comprehensive digital heritage of Portugal's web presence.
Archiver
Authory
Authory is an automated content archiving crawler that systematically searches for and backs up published articles, podcasts, and videos by journalists and content creators to create secure portfolios and prevent content loss.
Archiver
bnf.fr_bot
bnf.fr_bot is the official web crawler of the Bibliothèque nationale de France (BnF), systematically collecting and archiving digital content from French websites to preserve France's national documentary heritage.
Archiver
heritrix
heritrix is an archiver operated by the Internet Archive. If you think this is incorrect or can provide additional detail about its purpose, please let us know.
Archiver
ia_archiver
ia_archiver is an archiver operated by the Internet Archive. If you think this is incorrect or can provide additional detail about its purpose, please let us know.
Archiver
ia_archiver-web.archive.org
ia_archiver-web.archive.org is an archiver operated by the Internet Archive. If you think this is incorrect or can provide additional detail about its purpose, please let us know.
Archiver
IABot
IABot (InternetArchiveBot) is a Wikipedia bot operated by the Internet Archive that combats link rot by finding dead links in Wikipedia articles and adding archived versions from the Wayback Machine so that references stay accessible.
Archiver
Internet Archive
Internet Archive is an archiver operated by the Internet Archive. If you think this is incorrect or can provide additional detail about its purpose, please let us know.
Archiver
mirrorweb
mirrorweb is a web archiving crawler that captures websites in their entirety, including dynamic content and interactive elements, creating tamper-proof WARC archives for compliance.
Archiver
netarkivindsamling
netarkivindsamling is a web crawler operated by the Royal Danish Library that collects Danish internet content according to the Danish Legal Deposit Act for preservation and research purposes.
Archiver
Nicecrawler
Nicecrawler is an archiver operated by NiceCrawler. If you think this is incorrect or can provide additional detail about its purpose, please let us know.
Archiver
Turnitin
The Turnitin crawler gathers web content to build a database for plagiarism detection, which compares student papers against internet content to support academic integrity.
Archiver
XY-Archive-Compliance
XY-Archive-Compliance creates website archives to help customers meet regulatory and legal compliance requirements for data retention and historical website preservation.
Archiver
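As a rough illustration of controlling archivers with robots.txt, the sketch below builds a file that disallows a few of the agents listed above and leaves all other crawlers unrestricted. The function name `build_robots_txt` and the `ARCHIVERS` list are hypothetical, and it assumes that the names shown in this directory are also the user agent tokens the bots match in robots.txt, which this page does not confirm. Whether a given bot actually honors these rules is also not stated here.

```python
from typing import Iterable


def build_robots_txt(blocked_agents: Iterable[str], disallow_path: str = "/") -> str:
    """Return robots.txt text with one group per blocked user agent token."""
    groups = []
    for agent in blocked_agents:
        # Each group names one user agent and the path it should not crawl.
        groups.append(f"User-agent: {agent}\nDisallow: {disallow_path}")
    # Final catch-all group: an empty Disallow leaves other crawlers unrestricted.
    groups.append("User-agent: *\nDisallow:")
    # Blank lines separate the groups.
    return "\n\n".join(groups) + "\n"


# Assumption: these directory names double as the bots' robots.txt tokens.
ARCHIVERS = ["archive.org_bot", "ia_archiver", "heritrix"]

if __name__ == "__main__":
    print(build_robots_txt(ARCHIVERS))
```

Saving the printed text as `robots.txt` at the site root is the usual way to publish such rules. Passing a narrower `disallow_path` (for example `/private/`) would restrict only part of the site.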