Cloudflare weaponizes AI against web crawlers

DailyAI | Exploring the World of Artificial Intelligence, March 23

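Cloudflare has launched a new trap called "AI Labyrinth" to counter data-hungry AI bots that ignore website permissions. The system serves realistic-looking pages containing irrelevant information and hidden links, leading bots into a bottomless pit of AI-generated nonsense. These decoy pages are completely invisible to human visitors, but they drain bots' resources and help Cloudflare identify and block malicious crawlers. The move responds to growing bot traffic, a considerable share of which is malicious, and aims to protect customers from intrusion by AI companies.

🕸️ Cloudflare's "AI Labyrinth" traps AI crawlers by generating fake pages that look real but contain unrelated content. These pages include hidden links that lead bots into a loop of AI-generated nonsense.

🔍 How the system works: when unauthorized crawling is detected, Cloudflare does not block the request but instead serves a series of enticing AI-generated pages. The content of these pages appears scientifically accurate but is unrelated to the content of the protected site.

🤖 The trap content is completely invisible to human visitors, but it helps Cloudflare improve its detection systems: analyzing how bots interact with the fake pages lets it identify and flag malicious bots. The trap content is pre-generated to improve performance and avoid wasting resources.

📈 The tool was launched in response to the striking growth of bot traffic on the internet. According to an Imperva report, bots accounted for 49.6% of web traffic in 2023, with malicious bots making up 32% of total traffic. AI crawlers send more than 50 billion requests to Cloudflare's network every day, consuming substantial resources.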

Cloudflare has unleashed a devious new trap for data-hungry AI bots that ignore website permissions – the “AI Labyrinth.”

The AI Labyrinth attempts to actively sabotage AI bots by serving realistic-looking pages filled with irrelevant information and hidden links that lead deeper into a rabbit hole of AI-generated nonsense.

“When we detect unauthorized crawling, rather than blocking the request, we will link to a series of AI-generated pages that are convincing enough to entice a crawler to traverse them,” Cloudflare revealed.

“But while real looking, this content is not actually the content of the site we’re protecting.”

Here’s exactly how the system works:

It generates convincing fake pages with scientifically accurate but irrelevant content
Hidden links within these pages lead to more fake content, creating endless loops
All trap content remains completely invisible to human visitors
Bot interactions with these fake pages help improve detection systems
Content is pre-generated rather than created on demand for better performance
Crawlers waste their own resources rather than Cloudflare’s
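The steps above can be illustrated with a minimal sketch. This is not Cloudflare's implementation: the `/maze/` URL prefix, the `DECOY_TEXTS` pool, the function names, and the use of `display:none` plus `rel="nofollow"` to hide links are all assumptions chosen for illustration. It shows decoy pages drawn from a pre-generated pool, each carrying hidden links to further decoys so that a crawler following them never reaches a dead end.

```python
from html import escape

# Hypothetical pool of pre-generated decoy texts. In the system described
# above these would be AI-generated ahead of time, not written per request.
DECOY_TEXTS = [
    "Basalt forms when lava cools quickly at the surface.",
    "Tidal ranges depend on coastline shape and lunar phase.",
    "Lichens are a symbiosis of fungi and algae.",
]


def decoy_links(page_id: int, fanout: int = 2) -> list[int]:
    """Pick the next decoy pages, wrapping around so the maze never ends."""
    n = len(DECOY_TEXTS)
    return [(page_id + step) % n for step in range(1, fanout + 1)]


def render_decoy(page_id: int) -> str:
    """Build one decoy page from the pool with links hidden from humans."""
    body = escape(DECOY_TEXTS[page_id % len(DECOY_TEXTS)])
    links = "".join(
        # Assumption: hiding via inline CSS; real systems may differ.
        f'<a href="/maze/{target}" rel="nofollow" style="display:none">more</a>'
        for target in decoy_links(page_id)
    )
    return f"<html><body><p>{body}</p>{links}</body></html>"
```

Because the pool is fixed and the links are computed by simple arithmetic, serving a decoy costs the host almost nothing, while a crawler that follows every link keeps fetching pages indefinitely.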

Tools like this are needed because bot traffic on the internet is growing at an alarming rate.

According to Imperva’s 2024 Threat Research report, bots generated 49.6% of web traffic in 2023, with malicious bots accounting for a whopping 32% of the total.

AI crawlers bombard Cloudflare’s network with more than 50 billion requests daily, nearly 1% of all web traffic it handles, consuming its resources in the process.
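To put the figures above in proportion, here is a small back-of-the-envelope calculation. The helper names are hypothetical, and the Imperva and Cloudflare numbers come from different measurements, so the two results should not be combined with each other.

```python
def malicious_share_of_bots(bot_pct: float, bad_bot_pct: float) -> float:
    """Fraction of bot traffic that is malicious, given shares of all traffic."""
    return bad_bot_pct / bot_pct


def implied_total_requests(part_requests: int, part_pct: int) -> int:
    """Total daily requests implied if part_requests is part_pct% of the total."""
    return part_requests * 100 // part_pct


# Imperva: 32% of all traffic is malicious bots, 49.6% is bots overall.
print(round(malicious_share_of_bots(49.6, 32.0), 3))  # 0.645

# Cloudflare: 50 billion AI-crawler requests is roughly 1% of its traffic.
print(implied_total_requests(50_000_000_000, 1))  # 5000000000000
```

In other words, roughly two thirds of bot traffic in the Imperva data is malicious, and the Cloudflare figure implies on the order of 5 trillion requests per day across its network.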

These numbers lend credibility to what many dismissed as the “dead internet theory” – an internet conspiracy claim that most online content and interaction is artificially generated.

Cloudflare is attempting to support its customers in the cat-and-mouse game between website owners and AI companies. The trap remains completely invisible to human visitors, so they shouldn’t be able to accidentally stumble into the maze.

As Cloudflare describes: “No real human would go four links deep into a maze of AI-generated nonsense. Any visitor that does is very likely to be a bot, so this gives us a brand-new tool to identify and fingerprint bad bots, which we add to our list of known bad actors.”
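The "four links deep" heuristic in that quote could be sketched as follows. This is an illustration under stated assumptions, not Cloudflare's detection logic: the `MazeTracker` class, the `/maze/` prefix, and the client identifier are hypothetical, and the sketch simplifies "depth" to the number of decoy pages a client has requested.

```python
from collections import defaultdict

MAZE_PREFIX = "/maze/"  # hypothetical URL prefix for decoy pages
DEPTH_THRESHOLD = 4     # "four links deep", taken from the quote above


class MazeTracker:
    """Count decoy-page requests per client and flag deep traversals."""

    def __init__(self) -> None:
        self.depth = defaultdict(int)  # client_id -> decoy pages requested
        self.flagged = set()           # clients treated as likely bots

    def observe(self, client_id: str, path: str) -> bool:
        """Record one request; return True if the client is flagged."""
        if path.startswith(MAZE_PREFIX):
            self.depth[client_id] += 1
            if self.depth[client_id] >= DEPTH_THRESHOLD:
                self.flagged.add(client_id)
        return client_id in self.flagged


tracker = MazeTracker()
for page in range(4):
    is_bot = tracker.observe("203.0.113.7", f"/maze/{page}")
print(is_bot)  # True: the fourth decoy request crosses the threshold
```

Since humans never see the hidden links, any client that accumulates decoy requests is a strong bot candidate, and its identifier can then feed a blocklist or fingerprinting step.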
