Paul Graham: Essays, November 25, 2024
Filters that Fight Back

 

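This essay explores a new way to fight spam: automatically retrieving the links it contains. The author proposes a "punish" mode in which a mail filter that suspects a message is spam automatically retrieves every link in it, putting heavy load on the spammer's servers. The approach turns the spammer's huge sending volume into an attack on his own servers, raising the cost of sending spam and reducing its effectiveness. The essay also discusses how to avoid harming legitimate sites and how the approach would affect the spam ecosystem, for example by pushing spammers to include working unsubscribe links.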

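🤔 **Punishing spammers by auto-retrieving links:** The author suggests adding a "punish" mode to spam filters. When a message is suspected to be spam, the filter automatically retrieves every link in it, loading the spammer's servers and raising his costs.

🌐 **Turning spam's own nature against it:** The essay points out that the enormous volume of spam can be turned into an attack on the spammer's own servers, overwhelming them.

⚠️ **Avoiding harm to legitimate sites:** To avoid hitting legitimate sites, the essay suggests combining auto-retrieval with blacklists. Only sites on a blacklist would be crawled, and a site would be added only after humans had inspected it and confirmed that it is a spam site.

🔗 **May push spammers to add unsubscribe links:** The author argues that if auto-retrieving spam filters became widespread, spammers would be forced to include working unsubscribe links in their mails to protect their servers, which would improve things for recipients.

💻 **High-bandwidth users can go first:** The essay notes that high-volume auto-retrieval is only practical on high-bandwidth connections, but there are enough such users to cause spammers serious trouble.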

August 2003

We may be able to improve the accuracy of Bayesian spam filters by having them follow links to see what's waiting at the other end. Richard Jowsey of death2spam now does this in borderline cases, and reports that it works well.

Why only do it in borderline cases? And why only do it once?

As I mentioned in Will Filters Kill Spam?, following all the urls in a spam would have an amusing side-effect. If popular email clients did this in order to filter spam, the spammer's servers would take a serious pounding. The more I think about this, the better an idea it seems. This isn't just amusing; it would be hard to imagine a more perfectly targeted counterattack on spammers.

So I'd like to suggest an additional feature to those working on spam filters: a "punish" mode which, if turned on, would spider every url in a suspected spam n times, where n could be set by the user. [1]

As many people have noted, one of the problems with the current email system is that it's too passive. It does whatever you tell it. So far all the suggestions for fixing the problem seem to involve new protocols. This one wouldn't.

If widely used, auto-retrieving spam filters would make the email system rebound. The huge volume of the spam, which has so far worked in the spammer's favor, would now work against him, like a branch snapping back in his face. Auto-retrieving spam filters would drive the spammer's costs up, and his sales down: his bandwidth usage would go through the roof, and his servers would grind to a halt under the load, which would make them unavailable to the people who would have responded to the spam.

Pump out a million emails an hour, get a million hits an hour on your servers.

We would want to ensure that this is only done to suspected spams. As a rule, any url sent to millions of people is likely to be a spam url, so submitting every http request in every email would work fine nearly all the time. But there are a few cases where this isn't true: the urls at the bottom of mails sent from free email services like Yahoo Mail and Hotmail, for example.

To protect such sites, and to prevent abuse, auto-retrieval should be combined with blacklists of spamvertised sites. Only sites on a blacklist would get crawled, and sites would be blacklisted only after being inspected by humans. The lifetime of a spam must be several hours at least, so it should be easy to update such a list in time to interfere with a spam promoting a new site. [2]

High-volume auto-retrieval would only be practical for users on high-bandwidth connections, but there are enough of those to cause spammers serious trouble. Indeed, this solution neatly mirrors the problem. The problem with spam is that in order to reach a few gullible people the spammer sends mail to everyone. The non-gullible recipients are merely collateral damage. But the non-gullible majority won't stop getting spam until they can stop (or threaten to stop) the gullible from responding to it. Auto-retrieving spam filters offer them a way to do this.

Would that kill spam? Not quite. The biggest spammers could probably protect their servers against auto-retrieving filters. However, the easiest and cheapest way for them to do it would be to include working unsubscribe links in their mails. And this would be a necessity for smaller fry, and for "legitimate" sites that hired spammers to promote them. So if auto-retrieving filters became widespread, they'd become auto-unsubscribing filters.

In this scenario, spam would, like OS crashes, viruses, and popups, become one of those plagues that only afflict people who don't bother to use the right software.

Notes

[1] Auto-retrieving filters will have to follow redirects, and should in some cases (e.g. a page that just says "click here") follow more than one level of links. Make sure too that the http requests are indistinguishable from those of popular Web browsers, including the order and referrer.

If the response doesn't come back within x amount of time, default to some fairly high spam probability.

Instead of making n constant, it might be a good idea to make it a function of the number of spams that have been seen mentioning the site. This would add a further level of protection against abuse and accidents.

[2] The original version of this article used the term "whitelist" instead of "blacklist". Though they were to work like blacklists, I preferred to call them whitelists because it might make them less vulnerable to legal attack. This just seems to have confused readers, though.

There should probably be multiple blacklists. A single point of failure would be vulnerable both to attack and abuse.

Thanks to Brian Burton, Bill Yerazunis, Dan Giffin, Eric Raymond, and Richard Jowsey for reading drafts of this.
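
The decision logic behind the "punish" mode, the blacklist check, and the variable n from note [1] can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the essay's implementation. The names `extract_urls`, `retrievals_for` and `plan_retrievals`, the linear formula for n, the cap of 20, and the rule that a host qualifies if it appears on any one of several blacklists are all hypothetical choices. The sketch only plans retrievals and performs no network requests.

```python
import re
from urllib.parse import urlsplit

# Rough pattern for http(s) URLs in plain text (an assumption; real mail
# would need MIME and HTML parsing first).
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def extract_urls(body):
    """Return the distinct http(s) URLs in a message body, in order of appearance."""
    urls = []
    for match in URL_RE.findall(body):
        url = match.rstrip(".,;:!?)")  # drop trailing sentence punctuation
        if url not in urls:
            urls.append(url)
    return urls


def retrievals_for(spam_count, per_spam=1, cap=20):
    """Hypothetical choice of n: grows with the number of spams seen
    mentioning the site, up to a fixed cap."""
    return min(cap, per_spam * spam_count)


def plan_retrievals(body, blacklists, spam_counts):
    """Map each URL in a suspected spam to the number of times it would be
    retrieved. Only hosts that appear on at least one blacklist qualify
    (assumption: any single list is enough)."""
    plan = {}
    for url in extract_urls(body):
        host = (urlsplit(url).hostname or "").lower()
        if not any(host in blacklist for blacklist in blacklists):
            continue  # protects sites such as free email services
        n = retrievals_for(spam_counts.get(host, 0))
        if n > 0:
            plan[url] = n
    return plan


if __name__ == "__main__":
    body = "Buy now at http://pills.example/offer. Sent via http://mail.example/"
    blacklists = [{"pills.example"}, set()]
    print(plan_retrievals(body, blacklists, {"pills.example": 3}))
```

Tying n to the number of spams seen means a site mentioned in a single message is retrieved only once, which limits the damage if a site is blacklisted by mistake. A real filter would also need the redirect following and timeout handling described in note [1], which this sketch leaves out.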

