TechCrunch News, November 8, 2024
Led by a founder who sold a video startup to Apple, Panjaya uses deepfake techniques to bite into video dubbing

 

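Panjaya is a startup developing AI-powered video dubbing tools. Its product, BodyTalk, translates video content into multiple languages and uses AI to generate dubbed audio and lip movements closely synchronized with the original video, producing hyperrealistic dubbing. Panjaya’s technical founders come from deep learning work for the Israeli government. BodyTalk already serves customers including JFrog and TED and has shown notable results in B2B; TED, for example, reports a 115% increase in views for its dubbed videos. Panjaya aims to build “deep real” AI video dubbing and to prevent misuse by controlling access and, in the future, introducing measures such as watermarking, with the goal of reshaping the video translation and dubbing market.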

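😊 Panjaya has developed BodyTalk, an AI-powered video dubbing tool that translates video content into 29 languages and generates dubbed audio and lip movements closely synchronized with the original video for a hyperrealistic result.

🤔 Panjaya’s technical founders come from deep learning work for the Israeli government. Its core technology includes an AI-driven lip sync engine, along with translation and speech synthesis built on large language models.

🎬 BodyTalk is currently used in B2B settings, with customers including JFrog and TED. After TED adopted Panjaya’s dubbing tool, views of its dubbed videos rose 115% and completion rates doubled.

🛡️ Panjaya aims to build “deep real” AI video dubbing and controls access to its tools, among other measures, to keep the technology from being misused to create misinformation.

🌍 Panjaya plans to expand into more areas, such as sports, education, marketing, and healthcare, and to develop an API and real-time processing.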

There’s a big opportunity for generative AI in the world of translation, and a startup called Panjaya is taking the concept to the next level: a hyperrealistic, gen AI-based dubbing tool for videos that re-creates a person’s original voice speaking the new language, with the video and the speaker’s physical movements automatically modifying to match up naturally with the new speech patterns. 

After being in stealth for the last three years, the startup is unveiling BodyTalk, the first version of its product, alongside its first outside funding of $9.5 million. 

Panjaya is the brainchild of Hilik Shani and Ariel Shalom, two deep learning specialists who have spent the majority of their professional lives quietly working on deep learning technology for the Israeli government and are now respectively the startup’s general manager and CTO. They hung up their G-man hats in 2021 with the startup itch, and 1.5 years ago, they were joined by Guy Piekarz as CEO. 

Piekarz is not a founder at Panjaya, but he is a notable name to have onboard: Back in 2013, he sold a startup that he did found to Apple. Matcha, as the startup was called, was an early, buzzy player in streaming video discovery and recommendation, and it was acquired during the very early days of Apple’s TV and streaming strategy, when these were more rumors than actual products. Matcha was bootstrapped and sold for a song: $10 million to $15 million — modest considering the significant steer Apple eventually made into streamed media. 

Piekarz stayed with Apple for nearly a decade building Apple TV and then its sports vertical. Then, he was introduced to Panjaya through Viola Ventures, one of its backers (others include R-Squared Ventures, JFrog co-founder and CEO Shlomi Ben Haim, Chris Rice, Guy Schory, Ryan Floyd of Storm Ventures, Ali Behnam of Riviera Partners, and Oded Vardi).

“I had left Apple by then and was planning to do something completely different,” Piekarz said. “However, seeing a demo of the tech blew my mind, and the rest is history.”

BodyTalk is interesting for how it simultaneously brings several pieces of technology that play on different aspects of synthetic media into the frame. 

It starts with audio-based translation that currently can offer translations in 29 languages. The translation is then spoken in a voice that mimics the original speaker, which in turn is set to a version of the original video where the speaker’s lips and other movements get modified to fit the new words and phrasing. All this is created automatically on videos after users upload them to the platform, which also comes with a dashboard that includes further editing tools. Future plans include an API, as well as getting closer to real-time processing. (Right now, BodyTalk is “near real-time,” taking minutes to process videos, Piekarz said.) 

“We’re using best of breed where we need to,” Piekarz said of the company’s use of third-party large language models and other tools. “And we’re building our own AI models where the market doesn’t really have a solution.” 

An example of that is the company’s lip syncing, he continued. “Our whole lip sync engine is homegrown by our AI research team, because we haven’t found anything that gets to that level and quality of multiple speakers, angles, and all the business use cases we want to support.”

Its focus for the moment is just on B2B; clients include JFrog and the TED media organization. The company has plans to expand further in media, specifically in areas like sports, education, marketing, healthcare, and medicine.

The resulting translation videos are very uncanny, not unlike what you get with deepfakes, although Piekarz winces at that term, which has picked up negative connotations over the years that are the exact opposite of the market the startup is targeting.

“‘Deepfake’ is not something that we’re interested in,” he said. “We’re looking to avoid that whole name.” Instead, he said, think of Panjaya as part of the “deep real category.”

By aiming just for the B2B market, and controlling who gets to access its tools, the company is creating “guardrails” around the technology to protect from misuse, he added. He also thinks that longer term there will be more tools built, including watermarking, to help detect when any videos have been modified to create synthetic media, both legit and nefarious. “We definitely want to be a part of that and not allow misinformation,” he said.

There are a number of startups that compete with Panjaya in the wider area of AI-based translation for videos, including big names like Vimeo and Eleven Labs, as well as smaller players like Speechify and Synthesis. For all of them, building ways to improve how dubbing works feels a little like swimming against a strong tide. That is because captions have become a very standard part of how video is consumed these days. 

On TV, it’s for a litany of reasons like poor speakers, background noise in our busy lives, mumbling actors, limited production budgets, and more sound effects. CBS found in a poll of American TV viewers that more than half of them kept subtitles on “some (21%) or all (34%) of the time.” 

But some love captions just because they are entertaining to read, and there’s been a whole cult built around that. 

On social media and other apps, subtitles are simply baked into the experience. TikTok, as one example, started in November 2023 to turn on captioning by default on all videos. 

All the same, there remains a huge market internationally for dubbed content, and even if English is often thought of as the lingua franca of the internet, there is evidence from research groups like CSA that content delivered in native languages gets better engagement, especially in the B2B context. Panjaya’s pitch is that more natural native-language content could do even better.

Some of its customers appear to support that theory. TED says that Talks dubbed using Panjaya’s tooling have seen increased views of 115%, with completion rates doubling for those translated videos. 
