
What AI bubble? Groq rakes in $640M to grow inference cloud

Tan KW
Publish date: Tue, 06 Aug 2024, 07:22 AM

Even as at least some investors begin to question the return on investment of AI infrastructure and services, venture capitalists appear to be doubling down. On Monday, AI chip startup Groq - not to be confused with xAI's Grok chatbot - announced it had scored $640 million in Series D funding to bolster its inference cloud.

Founded in 2016, the Mountain View, California-based startup began its life as an AI chip slinger targeting high-throughput, low-cost inferencing as opposed to training. Since then, the company has transitioned into an AI infrastructure-as-a-service provider and walked away from selling hardware.

In total, Groq has raised more than $1 billion and now boasts a valuation of $2.8 billion, with its latest funding round led by the likes of BlackRock, Neuberger Berman, Type One Ventures, Cisco Investments, Global Brain, and Samsung Catalyst.

The firm's main claim to fame is that its chips can generate tokens faster, while using less energy, than GPU-based equipment. At the heart of all of this is Groq's Language Processing Unit (LPU), which approaches the problem of running LLMs a little differently.

As our sibling site The Next Platform previously explored, Groq's LPUs don't require gobs of pricey high-bandwidth memory or advanced packaging - both factors that have contributed to bottlenecks in the supply of AI infrastructure.

Instead, Groq's strategy is to stitch together hundreds of LPUs, each packed with on-die SRAM, using a fiber optic interconnect. Using a cluster of 576 LPUs, Groq claims it was able to achieve generation rates of more than 300 tokens per second on Meta's Llama 2 70B model, 10x that of an HGX H100 system with eight GPUs, while consuming a tenth of the power.
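Taken at face value, those two claims compound: ten times the throughput at a tenth of the power works out to roughly a hundredfold reduction in energy per generated token. Here's that arithmetic as a quick sketch; the absolute wattage is a placeholder for illustration only, not a figure Groq or Nvidia has published.

    # Back-of-the-envelope check on the compounded claim. Only the
    # ratios matter; the wattage below is a made-up placeholder.
    def joules_per_token(watts: float, tokens_per_sec: float) -> float:
        return watts / tokens_per_sec

    hgx_tps = 30.0               # baseline implied by "300 tok/s is 10x an HGX H100"
    hgx_watts = 10_000.0         # placeholder system power, for illustration only
    groq_tps = 300.0             # Groq's claimed Llama 2 70B generation rate
    groq_watts = hgx_watts / 10  # "a tenth of the power"

    improvement = (joules_per_token(hgx_watts, hgx_tps)
                   / joules_per_token(groq_watts, groq_tps))
    print(improvement)  # 100.0 -> ~100x fewer joules per token, if both claims hold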

Groq now intends to use its millions to expand headcount and bolster its inference cloud to support more customers. As it stands, Groq purports to have more than 360,000 developers building on GroqCloud, creating applications using openly available models.
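For a sense of what building on GroqCloud involves, below is a minimal sketch of a chat-completion request against its OpenAI-compatible REST endpoint. Treat the endpoint path, the model ID, and the GROQ_API_KEY environment variable as assumptions drawn from Groq's public documentation rather than anything specified here; current values live at console.groq.com.

    # Minimal sketch: querying an openly available model hosted on GroqCloud.
    import os
    import requests

    resp = requests.post(
        "https://api.groq.com/openai/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['GROQ_API_KEY']}"},
        json={
            "model": "llama3-70b-8192",  # example ID for an openly available Llama model
            "messages": [{"role": "user", "content": "Why does inference speed matter?"}],
        },
        timeout=30,
    )
    resp.raise_for_status()
    body = resp.json()
    print(body["choices"][0]["message"]["content"])
    # The usage block reports token counts, handy for sanity-checking
    # the tokens-per-second figures Groq advertises.
    print(body.get("usage", {}))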


"This funding will enable us to deploy more than 100,000 additional LPUs into GroqCloud," CEO Jonathan Ross said Monday.

"Training AI models is solved, now it's time to deploy these models so the world can use them. Having secured twice the funding sought, we now plan to significantly expand our talent density.

These won't, however, be Groq's next-gen LPUs. Instead, they'll be built using GlobalFoundries' 14nm process node and delivered by the end of Q1 2025. Nvidia's next-gen Blackwell GPUs, by comparison, are expected to arrive within the next 12 or so months, depending on how delayed they turn out to be.

Groq is said to be working on two new generations of LPUs, which, last we heard, would utilize Samsung's 4nm process tech and deliver somewhere between 15x and 20x higher power efficiency.

You can find a deeper dive on Groq's LPU strategy and performance claims on The Next Platform.

Venture capital continues to flow into AI startups

Groq isn't the only infrastructure vendor that's managed to capitalize on all the AI hype. In fact, $640 million is far from the largest chunk of change we've seen startups walk away with in recent memory.

As you may recall, back in May, GPU bit barn CoreWeave scored $1.1 billion in Series C funding, weeks before it managed to talk Blackstone, BlackRock, and others into a $7.5 billion loan using its GPUs as collateral.

Meanwhile, Lambda Labs, another GPU cloud operator, has used its cache of GPUs to secure a combined $820 million in fresh funding and debt financing since February, and it doesn't look like it's satisfied yet. Last month we learned Lambda was reportedly in talks with VCs for another $800 million in funding to support the deployment of yet more Nvidia GPUs.

While VC funding continues to flow into AI startups, it seems some on Wall Street are increasingly nervous about whether these multi-billion-dollar investments in AI infrastructure will ever pay off.

Still, that hasn't stopped ML upstarts, such as Cerebras, from pursuing an initial public offering (IPO). Last week the outfit, best known for its dinner-plate-sized accelerators aimed at model training, revealed it had confidentially filed for a public listing.

The size and price range of the IPO have yet to be determined. Cerebras' rather unusual approach to the problem of AI training has helped it win north of $900 million in commitments from the likes of G42.

Meanwhile, with the rather notable exception of Intel, which saw its profits plunge by $1.6 billion year-over-year in Q2 amid plans to lay off at least 15 percent of its workforce, chip vendors and the cloud providers reselling access to their accelerators have been among the biggest beneficiaries of the AI boom. Last week, AMD revealed its MI300X GPUs accounted for more than $1 billion of its datacenter sales.

However, it appears that the real litmus test for whether the AI hype train is about to derail won't come until the market leader Nvidia announces its earnings and outlook later this month. ®


https://www.theregister.com//2024/08/05/groq_ai_funding/
