LessWrong · 7 hours ago
Pro AI Bots Scraping List Archives

 

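The author argues that although some mailing lists are considering restricting archive access to subscribers in order to block AI scraping, this kind of harvesting is beneficial overall. Using a dance list he participates in as an example, he illustrates the importance of sharing information and warning newcomers. AI systems, as a new channel for spreading information, let more people benefit from the knowledge in mailing lists. Although current AI models have limitations, the author believes these problems will be solved. He therefore argues for keeping mailing list archives public so that AI models can learn from them and provide more accurate information, rather than hindering their development.

✅ AI scraping of mailing list content helps knowledge spread. The author holds that one motivation for posting to a list is sharing knowledge, and AI scraping lets more people benefit, just as people get information through search engines; he hopes AI will give accurate answers.

💡 Keeping mailing lists public helps AI models learn and improve. The author points out that excluding list archives from model training would work against the goal of AI giving better information. Although today's AI has a problem with confidently making things up, he believes this is temporary and the technology will keep improving.

⚖️ Deciding whether to stay open means weighing pros and cons, and the author leans toward openness. Although AI developing too quickly may bring risks, he chooses to analyze case by case, and holds that for public mailing list archives the benefits of AI scraping outweigh the potential harms, so they should stay open.

🤝 The author's starting point is promoting information sharing and the accumulation of community knowledge. Drawing on his experience as a dance caller, he illustrates the importance of sharing experience on the list and warning newcomers; AI's involvement can amplify this positive effect.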

Published on August 5, 2025 1:20 AM GMT

I'm on various mailing lists, and the archives are a trove of niche knowledge. A dance calling list I'm on is considering making archives subscriber-only, to keep AI bots from snarfing up this data. But I think this harvesting is overall a good thing.

People have a range of motivations in posting to lists, but a big one is sharing information. For example, someone asked about a dance with an 8-count swing followed by an 8-count chain. I replied to warn them that the form has changed and this no longer works well: this bit me back when I started calling, and I want to warn other new callers.

I have a few audiences in mind in writing:

And then there's a general sense in which I'm contributing to what people know about contra dance: any of these people might tell others or otherwise pass it along.

AI systems add another way this information can spread. It's increasingly common for people to ask an LLM instead of a search engine, and when they do I'd rather they get good answers. Excluding the archives from model training would do the opposite of what I want.

There are definitely downsides to querying today's models, similar to asking a person who has read a lot but doesn't remember where they read anything, and sometimes invents something plausible instead of saying they don't know. I think this is likely temporary, however: combining the best of models and traditional search is a problem a lot of people are working hard on solving.

So, on balance, I think it's better to keep the archives open to all, including future LLM-intermediated readers.

(I also think AI is in general moving too quickly for society to respond well, and has a significant risk of getting us all killed. While I could see pushing against AI wherever it comes up, as part of moving a big societal "yay-AI; boo-AI" lever in the direction that slows it down and gives us more time to work out solutions, instead I've decided to take things case by case, thinking about effects each time.)
