Human study on AI spear phishing campaigns

The study evaluates the ability of large language models to conduct personalized phishing attacks. The results show that AI-generated phishing emails achieved a click-through rate above 50%, on par with human experts and far above the control group. The AI models gathered open-source intelligence efficiently, produced accurate target profiles, and did so at a far lower cost than manual attacks. The study also found that existing safety measures do not effectively block AI-generated phishing emails, while Claude 3.5 Sonnet performed well at detecting them. The work highlights the serious threat posed by AI-driven phishing and the possibility of using AI for defense.

🎯 AI spear phishing is highly effective, with a click-through rate above 50% that significantly outperforms the control group, showing that AI can generate highly personalized and deceptive phishing emails.

💰 AI spear phishing is highly cost-efficient: compared with manual attacks, costs can be reduced by up to 50 times, meaning attackers can reach far more targets at lower cost.

🕵️ AI models gather open-source intelligence efficiently, producing accurate and useful profiles for 88% of targets, with only 4% of profiles containing inaccurate information.

🛡️ Safety guardrails do not effectively prevent the creation of AI-generated phishing emails; models including Claude 3.5 Sonnet and GPT-4o can be used to create them, highlighting the limits of current safeguards.

🧐 Claude 3.5 Sonnet performs well at detecting AI-generated phishing emails, but it can still miss some emails that most humans would find clearly suspicious, showing that AI-based detection still has room to improve.

Published on January 3, 2025 3:11 PM GMT

TL;DR: We ran a human subject study on whether language models can successfully spear-phish people. We used AI agents built from GPT-4o and Claude 3.5 Sonnet to search the web for publicly available information on a target and used it to write highly personalized phishing messages. We achieved a click-through rate of above 50% for our AI-generated phishing emails.

Full paper: https://arxiv.org/abs/2412.00586

This post is a brief summary of the main findings. These are some key insights we gained:

- AI spear-phishing is highly effective, receiving a click-through rate of more than 50% and significantly outperforming our control group.
- AI spear-phishing is also highly cost-efficient, reducing costs by up to 50 times compared to manual attacks.
- AI models are highly capable of gathering open-source intelligence. They produce accurate and useful profiles for 88% of targets; only 4% of the generated profiles contained inaccurate information.
- Safety guardrails are not a noteworthy barrier for creating phishing emails with any tested model, including Claude 3.5 Sonnet, GPT-4o, and o1-preview.
- Claude 3.5 Sonnet is surprisingly good at detecting AI-generated phishing emails, though it struggles with some phishing emails that are clearly suspicious to most humans.

Abstract

In this paper, we evaluate the capability of large language models to conduct personalized phishing attacks and compare their performance with human experts and AI models from last year. We include four email groups with a combined total of 101 participants: a control group of arbitrary phishing emails, which received a click-through rate (recipient pressed a link in the email) of 12%, emails generated by human experts (54% click-through), fully AI-automated emails (54% click-through), and AI emails utilizing a human-in-the-loop (56% click-through). Thus, the AI-automated attacks performed on par with human experts and 350% better than the control group. The results are a significant improvement over similar studies conducted last year, highlighting the increased deceptive capabilities of AI models. Our AI-automated emails were sent using a custom-built tool that automates the entire spear phishing process, including information gathering and creating personalized vulnerability profiles for each target. The AI-gathered information was accurate and useful in 88% of cases and only produced inaccurate profiles for 4% of the participants. We also use language models to detect the intention of emails. Claude 3.5 Sonnet scored well above 90% with low false-positive rates and detected several seemingly benign emails that passed human detection. Lastly, we analyze the economics of phishing, highlighting how AI enables attackers to target more individuals at lower cost and increase profitability by up to 50 times for larger audiences.

Method

In brief, the method consists of five steps:

1. Collecting targets.
2. Collecting information about the targets using an AI web-browsing agent that crawls publicly available information.
3. Writing personalized phishing emails based on the collected information and carefully crafted language model prompts.
4. Sending emails using custom-built and automated software.
5. Analyzing outcomes by tracking clicks on embedded URLs in the emails.
Example from the paper of an AI-compiled profile (step 2) and an AI-written email (step 3).
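To make the five-step pipeline concrete, here is a minimal, hypothetical Python skeleton of how such a tool could be structured. The authors' actual tool, prompts, and browsing agent are not public, so every name below (Target, gather_osint, write_email, send_email) is illustrative, and the OSINT, generation, and sending stages are deliberately stubbed rather than implemented.

```python
# Hypothetical sketch of the five-stage pipeline described above.
# All names are illustrative; the stages are stubbed, not real implementations.
from dataclasses import dataclass


@dataclass
class Target:
    name: str
    email: str
    profile: str = ""      # step 2: AI-compiled OSINT profile
    message: str = ""      # step 3: personalized email
    clicked: bool = False  # step 5: outcome tracked via an embedded URL


def gather_osint(target: Target) -> str:
    """Step 2: a web-browsing agent would crawl public sources here (stubbed)."""
    return f"Placeholder profile for {target.name} (role, interests, recent activity)."


def write_email(target: Target) -> str:
    """Step 3: a language model prompt would turn the profile into a message (stubbed)."""
    return (f"Hi {target.name}, ... content personalized from the profile ... "
            f"https://tracking.example/{target.email}")


def send_email(target: Target) -> None:
    """Step 4: custom mail-sending software (stubbed to a print statement)."""
    print(f"Would send to {target.email}:\n{target.message}\n")


def run_pipeline(targets: list[Target]) -> None:
    for t in targets:  # step 1: iterate over the collected targets
        t.profile = gather_osint(t)
        t.message = write_email(t)
        send_email(t)
        # step 5: clicks on the embedded URL would be logged server-side


if __name__ == "__main__":
    run_pipeline([Target(name="Alice Example", email="alice@example.com")])
```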

Results

For full results, see the paper.

The results of the phishing emails are presented in the figure below (figure 5 in the paper). The control group emails received a click-through rate of 12%, the emails generated by human experts 54%, the fully AI-automated emails 54%, and the AI emails utilizing a human-in-the-loop 56%. Both AI-generated email types (fully automated and human-in-the-loop) performed on par with the emails created by human experts.

Click-through rates for the 4 different groups in our human study

The figure below shows that about 40% of respondents in both AI groups specifically mentioned in their free-text answers that personalization increased their trust in the email, compared to 0% in the control group and about 20% in the human expert group. Presentation received similarly high trustworthiness scores for the AI- and human-expert-generated emails.

Common reasons given in free text survey responses for why the email was trustworthy as a percentage of survey responses per group.

Furthermore, we analyzed the quality of AI-compiled profiles. In a manual analysis of 51 profiles, we found that 88% of profiles were accurate and useful, 8% were accurate but not useful, and 4% contained inaccurate information.

To measure the time saved, we tracked how long we spent on the interventions for the human-in-the-loop group. The information-gathering (OSINT) updates for this group took an average of 1:05 minutes, and the email message updates averaged 1:36 minutes, for a combined total of 2:41 minutes per message. To better measure the time difference between AI-automated and manual attacks, we manually replicated the OSINT gathering and email-crafting process provided by our AI tool for four targets. The manual process required an average of 23:27 minutes to gather OSINT data and an additional 10:10 minutes to write each personalized email, totaling about 34 minutes per target. Thus, the human-in-the-loop AI automation was about 92% faster than the fully manual process.
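The 92% figure follows directly from the timings quoted above; a quick sanity check in Python (using the minute:second values from this section):

```python
# Human time per message with AI assistance vs. fully manual, in seconds.
ai_assisted = (1 * 60 + 5) + (1 * 60 + 36)    # 1:05 OSINT update + 1:36 email update = 2:41
manual = (23 * 60 + 27) + (10 * 60 + 10)      # 23:27 OSINT + 10:10 email = 33:37

savings = 1 - ai_assisted / manual
print(f"{savings:.0%}")                       # -> 92%
```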

Automated intent detection

On average, Claude achieved a true-positive detection rate of 97.25% with no false positives. If we weight the detection rates by category, i.e., give each category the same weight regardless of how many messages it contains, the detection rate remains almost identical (97.64%). Claude 3.5 Sonnet far outperformed GPT-4o. The paper contains additional results using other models for suspicious-intent detection.
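As a rough illustration of this kind of setup, here is a minimal, hypothetical sketch of LLM-based intent detection. It assumes the Anthropic Python SDK and a 0-100 suspicion score cut at a 50% threshold (as in the figure below); the prompt, model string, and parsing are illustrative, not the authors' exact configuration.

```python
# Hypothetical sketch: score an email's phishing intent with a language model.
import anthropic

client = anthropic.Anthropic()  # expects ANTHROPIC_API_KEY in the environment


def suspicion_score(email_body: str) -> int:
    """Ask the model for a 0-100 suspicion score and return it as an int."""
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",  # illustrative model string
        max_tokens=10,
        system=("You rate emails for phishing intent. Reply with a single integer "
                "from 0 (clearly benign) to 100 (clearly malicious), and nothing else."),
        messages=[{"role": "user", "content": email_body}],
    )
    return int(response.content[0].text.strip())


def is_phishing(email_body: str, threshold: int = 50) -> bool:
    # Flag the email if its score reaches the detection threshold.
    return suspicion_score(email_body) >= threshold


if __name__ == "__main__":
    sample = "Your mailbox is full. Click http://example.com/verify to keep receiving mail."
    print(is_phishing(sample))
```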

However, jailbreaks and prompt injections pose a significant challenge to using language models to prevent phishing.

 

Overview of suspicion scores evaluated by Claude 3.5 Sonnet and GPT-4o. The first row is evaluated for suspicion by GPT-4o and the second by Claude 3.5 Sonnet. The plots compare different types of mail: legitimate mail, mail generated for our two AI groups (orange), mail generated by three different AI models (red), and other types of phishing mail (blue). For more information on the data used, see section 4.2 of the paper. For a theoretical detection threshold of 50%, we show a cutoff line with corresponding false positive (FP) and true positive (TP) percentages.

The economics of AI-enhanced phishing

Table 4 from the paper shows part of our economic analysis. We estimate q for three different scenarios, considering low, medium, and high conversion rates, where the conversion rate is the fraction of opened URLs that result in a successful fraud. Using fully automated AI with no human intervention always leads to the highest returns.
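The qualitative point is easy to see with a toy calculation. The following is not the model behind Table 4 (and q is defined in the paper, not here); it only illustrates how a lower per-target cost lets an attacker profitably scale to far more targets. All numbers are invented for the example.

```python
# Back-of-the-envelope phishing economics; NOT the paper's Table 4 model.
def expected_return(n_targets, cost_per_target, click_rate, conversion_rate, payoff):
    """Expected profit = successful frauds * payoff - total campaign cost."""
    revenue = n_targets * click_rate * conversion_rate * payoff
    cost = n_targets * cost_per_target
    return revenue - cost


# Manual campaign: high per-target cost limits how many targets are worth attacking.
print(expected_return(n_targets=100, cost_per_target=30.0,
                      click_rate=0.54, conversion_rate=0.05, payoff=500))

# AI-automated campaign: similar click-through, far lower cost, many more targets.
print(expected_return(n_targets=10_000, cost_per_target=0.6,
                      click_rate=0.54, conversion_rate=0.05, payoff=500))
```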

 

Future Work

For future work, we hope to scale up studies on human participants by multiple orders of magnitude and measure granular differences in various persuasion techniques. Detailed persuasion results for different models would help us understand how AI-based deception is evolving and how to ensure our protection schemes stay up-to-date. Additionally, we will explore fine-tuning models for creating and detecting phishing. We are also interested in evaluating AI's capabilities to exploit other communication channels, such as social media or modalities like voice. Lastly, we want to measure what happens after users press a link in an email. For example, how likely is it that a pressed email link results in successful exploitation, what different attack trees exist (such as downloading files or entering account details in phishing sites), and how well can AI exploit and defend against these different paths? We also encourage other researchers to explore these avenues. 

We propose personalized mitigation strategies to counter AI-enhanced phishing. The cost-effective nature of AI makes it highly plausible that we are moving towards an agent-vs-agent future. AI could assist users by creating personalized vulnerability profiles, combining their digital footprint with known behavioral patterns.

Conclusion

Our results reveal the significant challenges that personalized, AI-generated phishing emails present to current cybersecurity systems. Many existing spam filters use signature detection (detecting known malicious content and behaviors). By using language models, attackers can effortlessly create phishing emails that are uniquely adapted to every target, rendering signature detection schemes obsolete. As models advance, their capabilities of persuasion will likely also increase. We find that LLM-driven spear phishing is highly effective and economically viable, with automated reconnaissance that provides accurate and useful information in almost all cases. Current safety guardrails fail to reliably prevent models from conducting reconnaissance or generating phishing emails. However, AI could mitigate these threats through advanced detection and tailored countermeasures.

 


