Agents

Archive-It

Archive-It is a web archiving crawler operated by the Internet Archive that preserves copies of web pages for long-term digital preservation and historical record-keeping. Libraries, museums, universities, and other institutions commonly use it to create and maintain collections of archived web content.

Archiver

archive.org_bot

archive.org_bot is the Internet Archive's web crawler for the Wayback Machine, systematically crawling and preserving publicly accessible web pages for historical record and research.

Archiver

Arquivo-web-crawler

Arquivo-web-crawler is the Portuguese web archive's bot that systematically crawls and preserves Portuguese websites for historical research, creating a comprehensive digital heritage of Portugal's web presence.

Archiver

Authory

Authory is an automated content archiving crawler that systematically searches for and backs up published articles, podcasts, and videos by journalists and content creators to create secure portfolios and prevent content loss.

Archiver

bl.uk_lddc_bot

bl.uk_lddc_bot is operated by the British Library as part of its legal deposit web archiving program, which collects and preserves UK web content for the national archive. The bot crawls websites to fulfill the library's obligation under UK legal deposit legislation to archive online publications.

Archiver

bne.es_bot

bne.es_bot is an archiver. If you think this is incorrect or can provide additional detail about its purpose, please let us know.

Archiver

bnf.fr_bot

bnf.fr_bot is the official web crawler of the Bibliothèque nationale de France (BnF), systematically collecting and archiving digital content from French websites to preserve France's national documentary heritage.

Archiver

heritrix

heritrix is an archiver operated by the Internet Archive. If you think this is incorrect or can provide additional detail about its purpose, please let us know.

Archiver

ia_archiver

ia_archiver is a web crawler operated by Amazon that was associated with Alexa, a web traffic analysis service. The bot archived and analyzed web pages for Alexa's ranking services, but the service was retired by Amazon in May 2022.

Archiver

ia_archiver-web.archive.org

ia_archiver-web.archive.org is an archiver operated by the Internet Archive. If you think this is incorrect or can provide additional detail about its purpose, please let us know.

Archiver

IABot

IABot (InternetArchiveBot) is a Wikipedia bot operated by the Internet Archive that combats link rot by finding dead links in Wikipedia articles and adding archived versions from the Wayback Machine so that references remain accessible.

Archiver

Internet Archive

Internet Archive is an archiver operated by the Internet Archive. If you think this is incorrect or can provide additional detail about its purpose, please let us know.

Archiver

mirrorweb

mirrorweb is a web archiving crawler that captures websites in their entirety, including dynamic content and interactive elements, creating tamper-proof WARC archives for compliance purposes.

Archiver

netarkivindsamling

netarkivindsamling is a web crawler operated by the Royal Danish Library that collects Danish internet content under the Danish Legal Deposit Act for preservation and research purposes.

Archiver

Nicecrawler

Nicecrawler is an archiver operated by NiceCrawler. If you think this is incorrect or can provide additional detail about its purpose, please let us know.

Archiver

SmarshBot

SmarshBot is an archiving bot operated by Smarsh that crawls and captures web content for compliance management and regulatory recordkeeping purposes. Organizations use this service to maintain archives of web pages and online communications to meet legal and regulatory requirements.

Archiver

special_archiver

special_archiver is a web crawler operated by the Internet Archive as part of its Archive-It service, which creates and preserves collections of web content. The bot visits websites to capture and archive web pages for long-term preservation and historical research.

Archiver

Turnitin

The Turnitin crawler gathers web content to build a database for plagiarism detection services, which compare student papers against internet content to support academic integrity.

Archiver

XY-Archive-Compliance

XY-Archive-Compliance creates website archives to help customers meet regulatory and legal compliance requirements for data retention and historical website preservation.

Archiver

XY-Archive-Compliance-Archiver

XY-Archive-Compliance-Archiver is an archiving bot operated by XY Planning Network, a membership organization for fee-only financial advisors. The bot archives web content for compliance and recordkeeping purposes tied to regulatory requirements in the financial advisory industry.

Archiver

Archivers