TechCrunch News, October 16, 2024
Lightmatter’s $400M round has AI hyperscalers hyped for photonic datacenters

 

Lightmatter has raised $400 million. Its optical interconnect layer lets hundreds of GPUs work synchronously, addressing the cost and complexity of training and running AI models. The rise of AI has expanded the datacenter industry, but high-performance computing still suffers from problems such as nodes sitting idle. Lightmatter uses photonic chips to build a fast interconnect layer; its products offer high bandwidth and low latency, market demand is enormous, this round sharply increased the company's valuation, and it is also developing new chip substrates.

💻 Lightmatter's optical interconnect layer lets hundreds of GPUs work synchronously, tackling the high cost and complexity datacenters face when training and running AI models and improving efficiency.

🚀 High-performance computing suffers from nodes idling while they wait for data; an interconnect layer turns CPUs and GPUs into one effective whole, and Lightmatter's interconnect is faster, giving it a major advantage.

🌐 Using large numbers of optical fibers and a purely optical interface, Lightmatter achieves high bandwidth: its photonic interconnect currently reaches 30 terabits and lets 1,024 GPUs work synchronously in specially designed racks.

💰 Market demand is enormous, with many large datacenter companies among Lightmatter's customers; this $400 million Series D round values the company at $4.4 billion, and it plans to develop new chip substrates.

Photonic computing startup Lightmatter has raised $400 million to blow one of modern datacenters’ bottlenecks wide open. The company’s optical interconnect layer allows hundreds of GPUs to work synchronously, streamlining the costly and complex job of training and running AI models.

The growth of AI and its correspondingly immense compute requirements have supercharged the datacenter industry, but it’s not as simple as plugging in another thousand GPUs. As high performance computing experts have known for years, it doesn’t matter how fast each node of your supercomputer is if those nodes are idle half the time waiting for data to come in.

The interconnect layer or layers are really what turn racks of CPUs and GPUs into effectively one giant machine — so it follows that the faster the interconnect, the faster the datacenter. And it is looking like Lightmatter builds the fastest interconnect layer by a long shot, by using the photonic chips it’s been developing since 2018.

“Hyperscalers know if they want a computer with a million nodes, they can’t do it with Cisco switches. Once you leave the rack, you go from high density interconnect to basically a cup on a string,” Nick Harris, CEO and founder of the company, told TechCrunch. (You can see a short talk he gave summarizing this issue here.)

The state of the art, he said, is NVLink and particularly the NVL72 platform, which puts 72 Nvidia Blackwell units wired together in a rack, capable of a maximum of 1.4 exaFLOPs at FP4 precision. But no rack is an island, and all that compute has to be squeezed out through 7 terabits of “scale up” networking. Sounds like a lot, and it is, but the inability to network these units faster to each other and to other racks is one of the main barriers to improving performance.
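The two NVL72 figures quoted above imply a per-GPU throughput that the article doesn't state directly; a quick back-of-envelope check (derived purely from the numbers above, not an official spec):

```python
# Back-of-envelope from the article's NVL72 figures. Both inputs are
# quoted in the article; the per-GPU value is derived, not official.
rack_exaflops_fp4 = 1.4   # max rack compute at FP4 precision
gpus_per_rack = 72        # Blackwell units wired together per rack

per_gpu_pflops = rack_exaflops_fp4 * 1_000 / gpus_per_rack
print(f"~{per_gpu_pflops:.1f} PFLOPs FP4 per GPU")
```

That works out to roughly 19–20 PFLOPs of FP4 compute per GPU, all of which has to share the rack's 7 terabits of scale-up networking.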

“For a million GPUs, you need multiple layers of switches, and that adds a huge latency burden,” said Harris. “You have to go from electrical to optical to electrical to optical… the amount of power you use and the amount of time you wait is huge. And it gets dramatically worse in bigger clusters.”

So what’s Lightmatter bringing to the table? Fiber. Lots and lots of fiber, routed through a purely optical interface. With up to 1.6 terabits per fiber (using multiple colors), and up to 256 fibers per chip… well, let’s just say that 72 GPUs at 7 terabits starts to sound positively quaint.
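The "positively quaint" comparison can be made concrete. A sketch using only the figures quoted above (per-fiber rate and fiber count are the article's numbers; the comparison against the NVL72's 7-terabit scale-up networking mirrors the article's own):

```python
# Aggregate optical bandwidth per chip implied by the article's figures.
tbps_per_fiber = 1.6     # up to 1.6 Tb/s per fiber, using multiple colors
fibers_per_chip = 256    # up to 256 fibers per chip

chip_tbps = tbps_per_fiber * fibers_per_chip  # headline per-chip bandwidth
nvl72_scaleup_tbps = 7                        # NVL72 rack's scale-up networking

print(f"{chip_tbps:.1f} Tb/s per chip, "
      f"~{chip_tbps / nvl72_scaleup_tbps:.0f}x the NVL72 scale-up figure")
```

That is roughly 410 terabits per chip at the quoted maximums, against 7 terabits for a whole 72-GPU rack.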

“Photonics is coming way faster than people thought — people have been struggling to get it working for years, but we’re there,” said Harris. “After seven years of absolutely murderous grind,” he added.

The photonic interconnect currently available from Lightmatter does 30 terabits, while the on-rack optical wiring is capable of letting 1,024 GPUs work synchronously in their own specially designed racks. In case you’re wondering, the two numbers don’t increase by similar factors because a lot of what would need to be networked to another rack can be done on-rack in a thousand-GPU cluster. (And anyway, 100 terabit is on its way.)


The market for this is huge, Harris pointed out, with every major datacenter company from Microsoft to Amazon to newer entrants like xAI and OpenAI showing an endless appetite for compute. “They’re linking together buildings! I wonder how long they can keep it up,” he said.

Many of these hyperscalers are already customers, though Harris wouldn’t name any. “Think of Lightmatter a little like a foundry, like TSMC,” he said. “We don’t pick favorites or attach our name to other people’s brands. We provide a roadmap and a platform for them — just helping grow the pie.”

But, he added coyly, “you don’t quadruple your valuation without leveraging this tech,” perhaps an allusion to OpenAI’s recent funding round valuing the company at $157 billion, but the remark could just as easily be about his own company.

This $400 million Series D round values the company at $4.4 billion, a similar multiple over its mid-2023 valuation, which “makes us by far the largest photonics company. So that’s cool!” said Harris. The round was led by T. Rowe Price Associates, with participation from existing investors Fidelity Management and Research Company and GV.

What’s next? In addition to interconnect, the company is developing new substrates for chips so that they can perform even more intimate, if you will, networking tasks using light.

Harris speculated that, apart from interconnect, power per chip is going to be the big differentiator going forward. “In ten years you’ll have wafer-scale chips from everybody — there’s just no other way to improve the performance per chip,” he said. Cerebras is of course already working on this, though whether they are able to capture the true value of that advance at this stage of the technology is an open question.

But with the chip industry coming up against a wall, Harris plans to be ready and waiting with the next step. “Ten years from now, interconnect is Moore’s Law,” he said.
