TechCrunch News, November 29, 2024
Linkup connects LLMs with premium content sources (legally)

 

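With the rise of AI chatbots such as ChatGPT and Perplexity, retrieving and citing real-time web information has become essential. Doing so improves the quality of AI model output and reduces misinformation. French startup Linkup has developed an API that helps developers fetch web content from premium sources and pass it to large language models (LLMs) to enrich their answers. However, the future of web scraping is uncertain, and copyright concerns are drawing growing attention. Linkup addresses this by building a marketplace between content publishers and AI developers that brokers content licensing deals. The platform aims to supply AI applications with high-quality web content while ensuring publishers are paid for it, with the goal of a sustainable AI ecosystem.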

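🤔**Copyright disputes over web scraping:** As AI models demand more web content, unauthorized scraping has triggered copyright disputes. Many content publishers object to how AI training data is sourced and have begun turning to the courts to protect their rights.

🤝**Linkup connects content publishers and AI developers:** Linkup aims to act as a bridge between content publishers and AI developers. Through content licensing, it helps AI developers obtain high-quality web content while generating revenue for publishers.

💰**Content licensing model:** Linkup signs content licensing agreements with publishers and integrates with their content management systems (CMS) so that it can fetch content without scraping. Linkup pays content partners based on how often its clients access their content.

💼**AI use cases:** Linkup wants to supply high-quality web content to a range of AI applications, such as internal sales tools and customer relationship management, improving model output by bringing in external information.

📈**What's next for Linkup:** Linkup has raised a €3 million seed round, its team is growing, and it plans to hire 10 employees over the next year as it expands its content licensing offering for AI.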

If you’ve used ChatGPT Search or Perplexity, you know that being able to search the web and get citations inline greatly improves these AI chatbots. Results are better when they involve timely information, and web search may reduce so-called hallucinations (i.e. when a generative AI outputs incorrect information).

That’s why French startup Linkup is building an API that lets developers access web content from premium, trusted sources and hand the results to a large language model (LLM) to enrich its answers. Many AI developers call this workflow Retrieval-Augmented Generation (or RAG).
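To make the RAG workflow concrete, here is a minimal Python sketch of the retrieve-then-prompt step. It is not Linkup's API: `search_premium_sources`, the `Source` fields, and the sample data are all hypothetical stand-ins so the example runs offline. In a real pipeline the retrieval function would call a content API and the resulting prompt would be sent to an LLM.

```python
from dataclasses import dataclass


@dataclass
class Source:
    title: str
    url: str
    snippet: str


def search_premium_sources(query: str) -> list[Source]:
    """Hypothetical stand-in for a content-search API call.

    A real integration would send `query` over the network; this
    version returns canned data so the sketch runs offline.
    """
    return [
        Source("Example market report", "https://example.com/report",
               "Sample snippet about the market."),
        Source("Example news story", "https://example.com/story",
               "Sample snippet about a company."),
    ]


def build_prompt(question: str, sources: list[Source]) -> str:
    """Number each retrieved snippet so the model can cite it inline as [n]."""
    context = "\n".join(
        f"[{i}] {s.title} ({s.url}): {s.snippet}"
        for i, s in enumerate(sources, start=1)
    )
    return (
        "Answer using only the sources below and cite them as [n].\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}"
    )


question = "What is new in the example market?"
prompt = build_prompt(question, search_premium_sources(question))
print(prompt)  # this string would then be passed to the LLM of your choice
```

Numbering the sources in the prompt is one common way to get inline citations back from the model; other prompt layouts would work as well.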

More importantly, the future of scraping bots is uncertain. Where no financial agreement exists between content publishers and the entities scraping their pages, these bots are lifting content from the open web without paying. Many people aren’t happy about that deal, which is increasing regulatory scrutiny around AI training.

There are also high-profile legal cases in play, such as the ongoing lawsuit between the New York Times and OpenAI, the maker of ChatGPT, so the situation around web scraping could change in the near future. That is why OpenAI has signed multi-year content licensing deals with major publishers such as AP, Axel Springer, Condé Nast, El País, the Financial Times, Le Monde, and others.

“We set up the company around the time when OpenAI was making deals with news sources… for training or inference purposes, to augment the answers from OpenAI models and their products. And we thought: ‘OK, this is great because we finally have AI companies that pay their sources,’” Linkup co-founder and CEO Philippe Mizrahi told TechCrunch, laying out what propelled the founders to set up a business to connect AI devs with content providers for — hopefully — their mutual benefit.

Currently, content publishers face a difficult decision over what to do about GenAI’s thirst for data. They can block web scrapers using the (non-legally binding) robots.txt file, which tells crawlers which parts of a website they are allowed to access. They can sue AI companies that they believe have breached their copyright. Alternatively, they could let bots index their content freely (er, YOLO?). Or they may be able to license content to AI devs to get some recompense for their intellectual property.
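For readers who haven't looked inside a robots.txt file, the short Python sketch below shows how a well-behaved crawler would interpret one, using the standard library's `urllib.robotparser`. The rules and the example.com URL are made up for illustration; `GPTBot` is the user agent OpenAI documents for its crawler, and `ExampleSearchBot` is a hypothetical name. As noted above, nothing forces a bot to honor these rules.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: block one AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse rules from memory, no network call

url = "https://example.com/news/story"
ai_allowed = parser.can_fetch("GPTBot", url)
other_allowed = parser.can_fetch("ExampleSearchBot", url)

print(f"GPTBot may fetch: {ai_allowed}")
print(f"ExampleSearchBot may fetch: {other_allowed}")
```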

But there are thousands of AI companies (or tech companies using AI) that don’t have the scale and reach of OpenAI. At the same time, what’s great about the web is that there’s a long tail of content publishers. That also means a small content publisher usually doesn’t have the financial resources to file a lawsuit, and that switching millions of websites from a scraping model to a licensing model will be difficult.

That’s why Linkup isn’t just a technical solution. It’s a marketplace: an intermediary between content publishers and companies that want to augment their LLM answers with web content.

Linkup signs content licensing deals with publishers and integrates with their CMS so that it can fetch content from publishers without any scraping. Linkup then pays content partners based on how often their content is accessed by Linkup clients.
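The article doesn't say how those payments are calculated. As a purely illustrative sketch, the Python below assumes the simplest possible scheme: a fixed pool of money split in proportion to how many times each publisher's content was accessed. The function name, the publisher names, and the pro-rata rule itself are assumptions, not Linkup's actual terms.

```python
from collections import Counter


def split_payouts(access_log: list[str], pool: float) -> dict[str, float]:
    """Split `pool` across publishers in proportion to their access counts.

    Assumption: one log entry per content access, keyed by publisher name.
    """
    counts = Counter(access_log)
    total = sum(counts.values())
    if total == 0:
        return {}  # nothing was accessed, so nothing is paid out
    return {publisher: pool * n / total for publisher, n in counts.items()}


# Hypothetical access log: publisher_a was accessed three times, publisher_b once.
log = ["publisher_a", "publisher_b", "publisher_a", "publisher_a"]
payouts = split_payouts(log, pool=100.0)
print(payouts)
```

A real contract could just as easily use a flat per-access fee or tiered rates; the pro-rata split is only here to show what "based on how often" might look like in practice.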

Linkup’s founding team. Image Credits: Linkup

“We’re really targeting applications that are implementing AI in their own products,” said Mizrahi. “So, the typical use case is that I create an AI application using a model from Mistral or OpenAI. I build my own pipeline, but I need to enrich this pipeline with external information.”

As a side note, while ChatGPT can browse the web, GPT models can’t. OpenAI provides both a massively popular application (ChatGPT) and LLMs that developers can use with an API (GPT). But web search is a ChatGPT feature.

“There’s an example I like, which is one of our customers… built an internal application for their sales people,” Mizrahi also told us. “On the one hand, they have listed all the advantages of their own products. And thanks to us, they get fresh, quality information on their prospects and put it into a Mistral LLM. And Mistral’s LLM is going to generate a sort of sales pitch for the sales reps, which they’ll have in front of them when they make the calls with the customer leads.”

At first, Linkup decided to focus on corporate and business information. In addition to news websites, the startup works with knowledge databases — think Statista, Xerfi or other resources in the same vein.

It isn’t the only startup working on bringing premium content to LLMs with licensing contracts behind the scenes. The most visible competitor is ScalePost, a startup that works with Perplexity to speed up its licensing deals with publishers.

Linkup raised a €3 million seed round ($3.2 million at current exchange rates) a few months ago from Axeleo Capital, Motier Ventures, Seedcamp, and a hundred business angels. There are around 10 people working for the startup right now, and it plans to hire another 10 staff over the next year.
