AI-powered web crawlers have been likened to the cockroaches of the internet, relentlessly scouring websites for data without regard for ethical constraints. Open-source developers, who often lack the resources of large corporations, are particularly vulnerable to these aggressive bots. Now, some of them are taking matters into their own hands with innovative and often humorous countermeasures.
The Plight of Open-Source Projects
Niccolò Venerandi, a developer contributing to the Plasma Linux desktop and owner of LibreNews, highlights the disproportionate impact AI web crawlers have on free and open-source software (FOSS) projects. Unlike commercial platforms, FOSS projects operate with far fewer resources while keeping their infrastructure openly accessible. Many AI bots simply disregard the Robots Exclusion Protocol (robots.txt), the long-standing convention that tells crawlers which parts of a site they may visit.
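For context, the protocol itself is nothing more than a plain-text file served at the site root. The snippet below is a purely hypothetical example; the bot name and paths are illustrative and not drawn from any incident described here:

```text
# Hypothetical robots.txt -- the crawler name is made up for illustration
# Ask one specific bot to stay away from the entire site
User-agent: ExampleAIBot
Disallow: /

# All other crawlers: keep out of one path only
User-agent: *
Disallow: /private/
```

The catch is that the file is advisory only: nothing technically stops a crawler from fetching pages it has been asked to avoid.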
FOSS developer Xe Iaso detailed in a January blog post how AmazonBot relentlessly hammered a Git server, causing DDoS-level disruptions. Even though the bot was explicitly forbidden in the site's robots.txt file, it ignored the directive, masked its IP address, and pretended to be other users.
“It’s futile to block AI crawler bots because they lie, change user agents, use residential proxies, and more,” lamented Iaso. “They will scrape your site until it collapses.”
Enter Anubis: The Guardian Against AI Bots
Rather than surrender, Iaso devised a countermeasure: Anubis, a reverse proxy that requires visitors to complete a proof-of-work check before their requests reach the Git server. The check lets human-operated browsers through while filtering out AI bots.
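The general mechanism is easy to sketch. The Python snippet below is a minimal, hypothetical illustration of a hash-based proof-of-work challenge, not Anubis's actual code; the function names, difficulty value, and challenge string are all assumptions. The requester must find a nonce whose SHA-256 hash meets a difficulty target, which costs the requester CPU time but is trivial for the server to verify.

```python
import hashlib
import itertools

# Difficulty is a hypothetical knob: more leading zeros means more client work.
DIFFICULTY = 4  # leading zero hex digits required


def solve(challenge: str) -> int:
    """Client side: brute-force a nonce until the hash meets the target."""
    for nonce in itertools.count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith("0" * DIFFICULTY):
            return nonce


def verify(challenge: str, nonce: int) -> bool:
    """Server side: one cheap hash confirms the client did the work."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * DIFFICULTY)


if __name__ == "__main__":
    nonce = solve("example-challenge")  # hypothetical challenge string
    print(nonce, verify("example-challenge", nonce))
```

The asymmetry is the point: a single check is imperceptible to one human visitor, but it adds up quickly for a crawler firing off thousands of requests.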
The name Anubis, inspired by the Egyptian god of the afterlife, adds an ironic twist. According to mythology, Anubis weighed the hearts of the dead against a feather, condemning those with heavy hearts. Iaso mirrored this judgmental approach, ensuring that bots “fail the test” while humans proceed.
If a request passes the challenge, a whimsical anime-style Anubis illustration appears, confirming success. Within days of its release on GitHub, Anubis gained 2,000 stars, 20 contributors, and 39 forks—evidence of widespread frustration with AI crawlers.
The Growing Movement Against AI Crawlers
Iaso isn’t alone. Developers across the FOSS community have shared their struggles:
- Drew DeVault, founder of SourceHut, spends up to 100% of his workweek mitigating aggressive AI web crawlers.
- Jonathan Corbet, editor of LWN, reports DDoS-level slowdowns caused by AI scraper bots.
- Kevin Fenzi, Fedora Linux sysadmin, blocked all traffic from Brazil due to overwhelming bot activity.
- Some projects have even resorted to blocking entire countries to mitigate crawler abuse.
Fighting Back with Trickery
Beyond Anubis, developers are deploying other creative countermeasures. A Hacker News user suggested poisoning AI crawlers by filling blocked pages with misleading content, such as exaggerated health claims.
A more systematic approach comes from Nepenthes, a tool designed by an anonymous creator known as “Aaron.” Named after a carnivorous plant, Nepenthes traps AI bots in an infinite loop of fake content, frustrating their data-gathering efforts.
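The tarpit idea itself is straightforward to illustrate. The following is a toy sketch in Python, assuming nothing about Nepenthes's real implementation; the /trap/ path, word list, port, and two-second delay are all illustrative. Every URL under the trap returns procedurally generated filler text plus links to yet more generated pages, so a crawler that wanders in keeps finding "new" content indefinitely.

```python
import hashlib
import random
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

WORDS = ["lorem", "ipsum", "dolor", "sit", "amet", "consectetur", "adipiscing"]


class TarpitHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Seed randomness from the URL so each fake page is stable on revisit
        # but different from its neighbors.
        rng = random.Random(hashlib.sha256(self.path.encode()).hexdigest())
        paragraph = " ".join(rng.choices(WORDS, k=80))
        # Each page links to several more generated pages, so a crawler
        # following links never reaches the end of the "site".
        links = " ".join(
            f'<a href="/trap/{rng.randrange(10**9)}">continue</a>'
            for _ in range(5)
        )
        body = f"<html><body><p>{paragraph}</p>{links}</body></html>".encode()
        time.sleep(2)  # respond slowly to tie up the crawler's resources
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), TarpitHandler).serve_forever()
```

In practice a trap like this would typically sit behind a path that robots.txt already disallows, so compliant crawlers never encounter it and only the bots that ignore the rules get stuck.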
Meanwhile, Cloudflare introduced AI Labyrinth, a commercial solution designed to “confuse and waste the resources of AI crawlers.” By feeding bots irrelevant data, Cloudflare prevents them from extracting useful information.
The Ethical Debate
While tools like Anubis and Nepenthes deliver a measure of satisfying payback, some developers argue the real fix is a broader cultural shift. DeVault has publicly urged people to stop legitimizing AI-driven tools like LLMs and GitHub Copilot: “I am begging you to stop using them, stop talking about them, stop making new ones—just stop.”
Yet with AI’s relentless growth, developers have little choice but to defend their digital spaces, often with a mix of wit and ingenuity.