Future Tech

Etched looks to challenge Nvidia with an ASIC purpose-built for transformer models

Tan KW
Publish date: Wed, 26 Jun 2024, 09:41 PM

Following ChatGPT's debut in late 2022, GPUs - Nvidia's in particular - have become synonymous with generative AI.

However, given the scale at which AI is now being deployed, some have questioned whether an application-specific approach to transformer models - the fundamental architecture on which large language and diffusion models are built - could offer greater performance and efficiency than existing accelerators.

This is the bet that AI infrastructure startup Etched is making with its first inference chip, dubbed Sohu. Unlike GPUs - which, despite their name, are very much general-purpose processors - Etched's first product is designed to do one thing and one thing only: serve up transformer models, like LLMs.

The part can't run convolutional neural networks, state space models, or any other kind of AI model - just transformers. By stripping out the flexibility associated with the current crop of accelerators and focusing not just on AI, but on one specific kind of model, Etched claims to achieve a 20x performance advantage over Nvidia's H100.

"If you are willing to specialize - if you're willing to make a bet on the architecture, essentially burn that transformer architecture into the silicon - you can get way more performance, like an order of magnitude more performance," COO Robert Wachen boasted in an interview with The Register.

And in terms of performance, the startup claims the chip will achieve 500,000 tokens per second running Llama 70B, and that a single eight-Sohu server will replace 160 H100s.
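Taken together, those figures do at least square with the 20x claim. Here's a quick sanity check - our arithmetic, not Etched's, and it reads the 500,000 tokens per second as a per-server number, which the claims as published don't strictly pin down:

    # Sanity check on Etched's claimed figures (our arithmetic, not the company's).
    server_tokens_per_sec = 500_000   # Llama 70B claim, read as per 8-chip server
    h100s_replaced = 160              # claimed equivalent H100 count
    sohu_chips_per_server = 8

    implied_h100_rate = server_tokens_per_sec / h100s_replaced   # tokens/sec per H100
    implied_speedup = h100s_replaced / sohu_chips_per_server     # per-chip advantage

    print(f"Implied H100 throughput: {implied_h100_rate:,.0f} tokens/sec")  # 3,125
    print(f"Implied per-chip speedup: {implied_speedup:.0f}x")              # 20x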

That's a bold claim for a chip that hasn't taped out - and, for the moment, only exists in emulation running on FPGAs. However, the idea that an ASIC can outperform a general-purpose processor like a GPU at a sufficiently narrow task shouldn't come as much of a surprise. In general, ASICs trade functionality and programmability for simplicity of design and blazing-fast throughput.

More important for the two-year-old startup, the prospect of a chip that can drive down the cost of AI inferencing has proven tantalizing enough to warrant a $120 million Series A funding round led by Primary Venture Partners and Positive Sum Ventures.

However, raw compute is only one of several factors impacting inference performance. As we've seen with Nvidia's H200 and AMD's MI325X, memory bandwidth and capacity appear to be the bottlenecks to beat.

In this respect, Etched's first chip doesn't look that competitive. Even if its performance claims are to be believed, with 144GB of HBM3 across six stacks, our estimates put its maximum bandwidth somewhere in the neighborhood of 4TB/sec. That puts it well behind the H200 and MI300X - not to mention the MI325X or Blackwell.
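For the curious, that estimate falls out of some simple arithmetic. The per-pin transfer rate below is our assumption - Etched hasn't published one - pegged to the H100's HBM3 speed:

    # Rough HBM3 bandwidth estimate for Sohu (stack count from Etched;
    # the per-pin rate is our assumption, similar to the H100's HBM3).
    stacks = 6                 # six HBM3 stacks, 144GB total
    bus_width_bits = 1024      # standard HBM3 interface width per stack
    pin_speed_gbps = 5.2       # assumed per-pin rate

    per_stack_gb_s = bus_width_bits * pin_speed_gbps / 8   # ~666 GB/s per stack
    total_tb_s = stacks * per_stack_gb_s / 1000            # aggregate

    print(f"Estimated aggregate bandwidth: {total_tb_s:.1f} TB/s")  # ~4.0 TB/s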

While inferencing may be memory bound at lower batch sizes, Wachen argues this won't necessarily be the case at the utilization levels Etched is targeting.

As we understand it, the main advantage of Etched's transformer-centric architecture is that it allows for batch sizes far beyond what's practical on modern GPUs. In the context of a chatbot, you can equate batch size to the number of concurrent queries the chip can process. That's particularly important for services like ChatGPT, Gemini, or Copilot, which serve thousands - if not millions - of requests every second.

"Instead of being able to run batch size 32 and then go to 64 and have high performance degradation, we can run batch sizes in the thousands without any performance degradation," claimed Wachen.

If true, that could give it an advantage over Nvidia's current crop of GPUs.
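To see why batch size matters so much here, consider a simple roofline-style model of the decode phase: generating a token means streaming the entire set of model weights from memory, and a bigger batch amortizes that fixed cost across more requests until the chip becomes compute bound. The sketch below is purely illustrative - the hardware numbers are placeholders, and it deliberately omits the per-request KV cache traffic that is a big part of why real GPUs lose efficiency as batches grow:

    # Illustrative roofline model of decode throughput vs batch size.
    # All hardware numbers are placeholders, not Sohu or H100 specs,
    # and KV cache reads and interconnect overhead are ignored.

    def decode_tokens_per_sec(batch,
                              weight_bytes=140e9,        # ~70B params in fp16
                              mem_bw=4e12,               # bytes/sec
                              peak_flops=1e15,           # FLOPs/sec
                              flops_per_token=2 * 70e9): # ~2 FLOPs per param
        # Each decode step streams the full weights once regardless of batch
        # size, so memory time is fixed while compute time scales linearly.
        mem_time = weight_bytes / mem_bw
        compute_time = batch * flops_per_token / peak_flops
        return batch / max(mem_time, compute_time)

    for batch in (1, 32, 64, 1024, 4096):
        print(f"batch {batch:>4}: {decode_tokens_per_sec(batch):>10,.0f} tokens/sec")

In this toy model, throughput climbs with batch size and then flatlines at the compute roof - Etched's pitch, in effect, is that its silicon can ride that curve to batch sizes where GPUs have already fallen off it.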

However, it's worth noting that Etched is still constrained by the available supply of HBM3. An eight-chip Sohu system is only going to have about 1.1TB on board. What's more, at larger batch sizes, more memory typically needs to be dedicated to the key-value (KV) cache - something that could limit how large a model a single Etched system is ultimately able to serve.
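To put rough numbers on that: Llama 2 70B's published architecture (80 layers, grouped-query attention with 8 KV heads of dimension 128) implies about 320KB of fp16 KV cache for every token held in flight. The batch and context figures below are our hypotheticals:

    # KV cache footprint for Llama 2 70B, per its published architecture.
    layers, kv_heads, head_dim = 80, 8, 128
    bytes_per_value = 2                    # fp16/bf16
    kv_per_token = 2 * layers * kv_heads * head_dim * bytes_per_value  # K and V

    batch, context = 1000, 2048            # hypothetical serving load
    cache_gb = kv_per_token * batch * context / 1e9
    weights_gb = 140                       # ~70B params at 2 bytes each

    print(f"KV cache per token: {kv_per_token / 1024:.0f} KiB")              # ~320 KiB
    print(f"Cache at batch {batch:,}, ctx {context:,}: {cache_gb:,.0f} GB")  # ~671 GB
    print(f"With weights: {cache_gb + weights_gb:,.0f} GB of ~1,100 GB")

At the thousands-strong batch sizes Wachen describes, the cache alone can rival the weights - and a longer context window or a bigger model quickly overruns the box.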

Etched's approach isn't without its downsides, either. ASICs are great for the one thing they're designed to do - but essentially worthless if you want to do something else.

This makes Etched's business proposition a bet that transformer models will not only be deployed and run at sufficient scale to justify a fixed-function accelerator, but also that the transformer architecture won't give way to different and more efficient approaches to machine learning down the line.

For the moment, Etched's main priority is bringing its first chips to market. So far, the company claims to have emulated a slice of the design on FPGAs, with the intent to tape out in the not-too-distant future. When? Wachen hesitated to say, but implied heavily that the first chips were less than two years away.

In the long term, the startup believes there will be sufficient demand for even more specialized ASICs tailored to the demands of specific models. We'll see. ®

 

https://www.theregister.com//2024/06/26/etched_asic_ai/
