TechCrunch News 2024年11月07日
This Week in AI: It’s shockingly easy to make a Kamala Harris deepfake
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

本文探讨了生成式AI带来的虚假信息泛滥问题,以一个花费5美元、不到两分钟就制作出令人信服的哈里斯音频深度伪造的例子为例,说明了生成式AI技术的易用性和潜在危害。文章分析了深度伪造技术带来的挑战,如语音克隆工具的滥用、虚假信息的快速传播以及深度伪造检测的难度。同时,文章也提及了一些潜在的解决方案,如语音验证、水印技术和内容审核法律,但作者认为这些措施可能难以完全解决问题。最后,文章还简要介绍了AI在其他领域的应用,例如亚马逊无人机送货、OpenAI的机器人和消费硬件开发以及AI生成的电视剧集摘要等。

🤔 **生成式AI技术易于使用且成本低廉,导致虚假信息传播的风险大幅增加。**例如,作者仅用5美元和不到两分钟的时间就制作出了一个令人信服的哈里斯音频深度伪造,这突显了该技术的易用性以及其对社会造成的潜在威胁。

⚠️ **语音克隆工具的滥用可能导致虚假信息泛滥。**尽管一些平台采取了措施要求用户承诺不生成有害或非法内容,但仅依靠荣誉制度无法有效阻止恶意使用。

🌊 **深度伪造检测面临着持续的挑战,可能演变成一场永无止境的军备竞赛。**一些工具可能不会选择使用诸如水印等安全措施,或者会被恶意部署。

🛡️ **内容审核法律和水印技术等措施可能有助于遏制虚假信息传播,但效果有限。**作者认为,这些措施难以完全解决问题,我们应该保持高度的怀疑态度,尤其要警惕网络上的病毒式内容。

🇺🇸 **美国军方对生成式AI的应用持谨慎态度,担忧其安全性和法律合规性。**目前,只有美国陆军部署了生成式AI,其他军种则对商业模型的安全漏洞、情报数据共享的法律挑战以及模型在极端情况下的不可预测性表示担忧。

Hiya, folks, welcome to TechCrunch’s regular AI newsletter. If you want this in your inbox every Wednesday, sign up here.

It was shockingly easy to create a convincing Kamala Harris audio deepfake on Election Day. It cost me $5 and took less than two minutes, illustrating how cheap, ubiquitous generative AI has opened the floodgates to disinformation.

Creating a Harris deepfake wasn’t my original intent. I was playing around with Cartesia’s Voice Changer, a model that transforms your voice into a different voice while preserving the original’s prosody. That second voice can be a “clone” of another person’s — Cartesia will create a digital voice double from any 10-second recording.

So, I wondered, would Voice Changer transform my voice into Harris’? I paid $5 to unlock Cartesia’s voice cloning feature, created a clone of Harris’ voice using recent campaign speeches, and selected that clone as the output in Voice Changer.

It worked like a charm:

I’m confident that Cartesia didn’t exactly intend for its tools to be used in this way. To enable voice cloning, Cartesia requires that you check a box indicating that you won’t generate anything harmful or illegal and that you consent to your speech recordings being cloned.

But that’s just an honor system. Absent any real safeguards, there’s nothing preventing a person from creating as many “harmful or illegal” deepfakes as they wish.

That’s a problem, it goes without saying. So what’s the solution? Is there one? Cartesia can implement voice verification, as some other platforms have done. But by the time it does, chances are a new, unfettered voice cloning tool will have emerged.

I spoke about this very issue with experts at TC’s Disrupt conference last week. Some were supportive of the idea of invisible watermarks so that it’s easier to tell whether content has been AI-generated. Others pointed to content moderation laws such as the Online Safety Act in the U.K., which they argued might help stem the tide of disinformation.

Call me a pessimist, but I think those ships have sailed. We’re looking at, as CEO of the Center for Countering Digital Hate Imran Ahmed put it, a “perpetual bulls— machine.”

Disinformation is spreading at an alarming rate. Some high-profile examples from the past year include a bot network on X targeting U.S. federal elections and a voicemail deepfake of President Joe Biden discouraging New Hampshire residents from voting. But U.S. voters and tech-savvy people aren’t the targets of most of this content, according to True Media.org’s analysis, so we tend to underestimate its presence elsewhere.

The volume of AI-generated deepfakes grew 900% between 2019 and 2020, according to data from the World Economic Forum.

Meanwhile, there’s relatively few deepfake-targeting laws on the books. And deepfake detection is poised to become a never-ending arms race. Some tools inevitably won’t opt to use safety measures such as watermarking, or will be deployed with expressly malicious applications in mind.

Short of a sea change, I think the best we can do is be intensely skeptical of what’s out there — particularly viral content. It’s not as easy as it once was to tell truth from fiction online. But we’re still in control of what we share versus what we don’t. And that’s much more impactful than it might seem.

ChatGPT Search review: My colleague Max took OpenAI’s new search integration for ChatGPT, ChatGPT Search, for a spin. He found it to be impressive in some ways, but unreliable for short queries containing just a few words.

Amazon drones in Phoenix: A few months after ending its drone-based delivery program, Prime Air, in California, Amazon says that it’s begun making deliveries to select customers via drone in Phoenix, Arizona.

Ex-Meta AR lead joins OpenAI: The former head of Meta’s AR glasses efforts, including Orion, announced on Monday she’s joining OpenAI to lead robotics and consumer hardware. The news comes after OpenAI hired the co-founder of X (formerly Twitter) challenger Pebble.

Held back by compute: In a Reddit AMA, OpenAI CEO Sam Altman admitted that a lack of compute capacity is one major factor preventing the company from shipping products as often as it’d like.

AI-generated recaps: Amazon has launched “X-Ray Recaps,” a generative AI-powered feature that creates concise summaries of entire TV seasons, individual episodes, and even parts of episodes.

Anthropic hikes Haiku prices: Anthropic’s newest AI model has arrived: Claude 3.5 Haiku. But it’s pricier than the last generation, and unlike Anthropic’s other models, it can’t analyze images, graphs, or diagrams just yet.

Apple acquires Pixelmator: AI-powered image editor Pixelmator announced on Friday that it’s being acquired by Apple. The deal comes as Apple has grown more aggressive about integrating AI into its imaging apps.

An ‘agentic’ Alexa: Amazon CEO Andy Jassy last week hinted at an improved “agentic” version of the company’s Alexa assistant — one that could take actions on a user’s behalf. The revamped Alexa has reportedly faced delays and technical setbacks, and might not launch until sometime in 2025.

Pop-ups on the web can fool AI, too — not just grandparents.

In a new paper, researchers from Georgia Tech, the University of Hong Kong, and Stanford show that AI “agents” — AI models that can complete tasks — can be hijacked by “adversarial pop-ups” that instruct the models to do things like download malicious file extensions.

Image Credits:Zhang et al.

Some of these pop-ups are quite obviously traps to the human eye — but AI isn’t as discerning. The researchers say that the image- and text-analyzing models they tested failed to ignore pop-ups 86% of the time, and — as a result — were 47% less likely to complete tasks.

Basic defenses, like instructing the models to ignore the pop-ups, weren’t effective. “Deploying computer-use agents still suffers from significant risks,” the co-authors of the study wrote, “and more robust agent systems are needed to ensure safe agent workflow.”

Meta announced yesterday that it’s working with partners to make its Llama “open” AI models available for defense applications. Today, one of those partners, Scale AI, announced Defense Llama, a model built on top of Meta’s Llama 3 that’s “customized and fine-tuned to support American national security missions.”

Defense Llama, which is available in Scale’s Donavan chatbot platform for U.S. government customers, was optimized for planning military and intelligence operations, Scale says. Defense Llama can answer defense-related questions, for example like how an adversary might plan an attack against a U.S. military base.

So what makes Defense Llama different from stock Llama? Well, Scale says it was fine-tuned on content that might be relevant to military operations, like military doctrine and international humanitarian law, as well as the capabilities of various weapons and defense systems. It also isn’t restricted from answering questions about warfare, like a civilian chatbot might be:

Image Credits:Scale.ai

It’s not clear who might be inclined use it, though.

The U.S. military has been slow to adopt generative AI — and skeptical of its ROI. So far, the U.S. Army is the only branch of the U.S. armed forces with a generative AI deployment. Military officials have expressed concerns about security vulnerabilities in commercial models, as well as legal challenges associated with intelligence data sharing and models’ unpredictability when faced with edge cases.

Spawning AI, a startup creating tools to enable creators to opt out of generative AI training, has released an image dataset for training AI models that it claims is fully public domain.

Most generative AI models are trained on public web data, some of which may be copyrighted or under a restrictive license. OpenAI and many other AI vendors argue that fair-use doctrine shields them from copyright claims. But that hasn’t stopped data owners from filing lawsuits.

Spawning AI says its training dataset of 12.4 million image-caption pairs includes only content with “known provenance” and “labeled with clear, unambiguous rights” for AI training. Unlike some other datasets, it’s also available for download from a dedicated host, eliminating the need to web-scrape.

“Significantly, the public-domain status of the dataset is integral to these larger goals,” Spawning writes in a blog post. “Datasets that include copyrighted images will continue to rely on web-scraping because hosting the images would violate copyright.”

Spawning’s dataset, PD12M, and a version curated for “aesthetically pleasing” images, PD3M, can be found at this link.

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

生成式AI 深度伪造 虚假信息 AI伦理 语音克隆
相关文章