Future Tech

All the datacenter roadmap updates Intel, AMD, Nvidia teased at Computex

Tan KW
Publish date: Wed, 05 Jun 2024, 10:22 AM

Computex At the annual Computex conference in Taipei this week, Intel, AMD, and Nvidia showed off their latest datacenter and AI kit, and offered a tantalizing glimpse of what's coming next on their respective roadmaps.

One of the most surprising updates came from Nvidia. Last year, we learned the GPU designer was accelerating its development cycle to support a yearly release cadence. On stage in Taipei, Nvidia CEO Jensen Huang gave us the best picture yet of the chip maker's plans, including the name of its next-gen GPU and systems architecture - Rubin.

While we often get caught up in the specifications and features of the chips themselves, whether they be the H100, GB200 Superchip, or its Blackwell Ultra sibling, it's important to remember these components aren't really standalone parts you can pick off a shelf. Nvidia's highest-end accelerators aren't PCIe cards; they're entire platforms.

You can't just buy a B100 or B200; they come in packs of eight as part of Nvidia's DGX or HGX platform. And so, Nvidia's roadmap encompasses not only the CPUs and GPUs, but also the system and cluster networking necessary to support deployment at scale.

Looking closer at Nvidia's roadmap, there aren't too many surprises. Nvidia announced everything coming in 2024 back at GTC in March. What is new to the Blackwell line is the Ultra GPU and Spectrum Ultra Ethernet switches due in 2025. We don't know much about this Blackwell Ultra at this point other than it'll feature - if we're reading this right - eight stacks of 12-high HBM3e memory, which should bring with it a sizable bump in bandwidth and capacity over previous incarnations.
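Nvidia hasn't confirmed die densities, but if we assume 24 Gbit HBM3e dies - a common density for 12-high stacks, and purely our assumption here - the capacity math works out like this:

```python
# Back-of-the-envelope capacity estimate for Blackwell Ultra's memory.
# ASSUMPTION (not confirmed by Nvidia): 24 Gbit HBM3e dies.
GBIT_PER_DIE = 24      # assumed die density in gigabits
DIES_PER_STACK = 12    # "12-high" stacks, per the roadmap
STACKS = 8             # eight stacks per GPU

gb_per_stack = GBIT_PER_DIE * DIES_PER_STACK / 8   # gigabits -> gigabytes
total_gb = gb_per_stack * STACKS
print(f"{gb_per_stack:.0f} GB per stack, {total_gb:.0f} GB total")
# 36 GB per stack, 288 GB total
```

If that assumption holds, Blackwell Ultra would land at 288 GB per GPU - a healthy step up from the 192 GB on the vanilla B200.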

The next evolution of Nvidia's GPU architecture won't come until 2026 with the debut of Rubin, which will apparently use HBM4 memory. 2026 will also see Nvidia retire its Grace CPU architecture in favor of a new one codenamed Vera.

Along with the new compute architectures, Nvidia plans to roll out 1.6 Tbps InfiniBand and Ethernet switches alongside matching ConnectX-9 SuperNICs. Meanwhile, its NVLink 6 switches will see their bandwidth double from 1.8 TBps today to 3.6 TBps.

We'll note that the roadmap Nvidia showed off during its Computex keynote is actually less aggressive than the one pitched to investors last year. That roadmap had 1.6 Tbps networking slated for release in 2025. As we reported at the time, there were a number of not inconsiderable challenges with pulling this off. Most notably, the PCIe bandwidth required to support networking that fast wouldn't be ready in time.

However, that isn't the end of Nvidia's roadmap, which now extends into 2027 with - you guessed it - a Rubin Ultra GPU which looks like it'll have a whopping 12 stacks of HBM4 memory. The Next Platform has more on Rubin and Nv's plans here.

AMD punches the accelerator on its AI roadmap

As we said, Nvidia intends to put out new GPUs each year. But while Nvidia isn't planning to unleash a new architecture every 12 months, AMD is going to try.

During her Computex keynote, CEO Lisa Su revealed AMD was moving to a yearly release cadence as the House of Zen looks to catch up with rival Nvidia.

Its Instinct MI300-series accelerators, based on its CDNA 3 graphics architecture, launched late last year and boast a not inconsiderable advantage over Nvidia's H100 and H200-series accelerators in floating-point performance, memory bandwidth, and capacity.

AMD intends to extend that lead with the introduction of the MI325X in Q4, which from all appearances is an HBM3e-boosted version of the regular MI300X with 50 percent more capacity. You can find our deep dive on the memory-optimized accelerator here.

However, it won't be long before AMD's CDNA 3 architecture is supplanted. The chip designer intends to bring more capable Instinct accelerators to market in 2025 with the launch of its CDNA 4 compute architecture.

There's a lot we still don't know about the upcoming CDNA 4 Instinct MI350 accelerator, though AMD has told us that chips based on CDNA 4 will feature 288 GB of HBM3e and move to a 3nm process node. The architecture will also add support for lower-precision 4-bit and 6-bit floating-point data types, bringing it into parity with Nvidia's Blackwell.

AMD's "CDNA next" architecture will follow a year later and bring with it "significant architectural upgrades," according to execs. However, beyond this, details are rather sparse.

Alongside its more aggressive GPU roadmap, AMD also teased its 5th-gen Epyc CPU family, codenamed Turin. Due out later this year, the server processor will pack up to 192 cores, AMD boasted - twice that of its 4th-gen Genoa parts and 50 percent more than its cloud-optimized Bergamo SKUs.
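Those multipliers are internally consistent with the previous generation's core counts - Genoa tops out at 96 cores and Bergamo at 128 - which a quick sanity check confirms:

```python
# Sanity-check the core-count claims for 5th-gen Epyc "Turin":
# twice Genoa's 96 cores, 50 percent more than Bergamo's 128.
genoa_cores = 96
bergamo_cores = 128
turin_cores = 192

assert turin_cores == 2 * genoa_cores          # "twice that of its 4th-gen Genoa parts"
assert turin_cores == int(1.5 * bergamo_cores)  # "50 percent more than ... Bergamo"
print("Turin core-count claims check out")
```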

Based on the fact that AMD proceeded to show performance figures for a 128-core part, we're going to go out on a limb and say the 192-core variant is probably a Bergamo successor.

As a reminder, this is the most recent roadmap we have for AMD's Epyc lineup. We expect we'll get more details on its 6th-gen parts closer to Turin's release later this year.

Intel's phased Xeon 6 rollout while Gaudi3 remains on track

On the topic of CPUs, at Computex this week Intel unveiled a 144 e-core datacenter CPU, the first of several Xeon 6 products it plans to roll out over the next few quarters.

Intel has been hyping its Sierra Forest e-core and Granite Rapids p-core Xeons going back to early 2023, and as it turns out the parts actually span two platforms: a smaller, lower-power 6700-series platform and a larger, 500W 6900-series platform.

The first of these arrived at Computex in the form of Intel's Xeon 6 6700E processors, which offer 64 to 144 of its power-sipping efficiency cores; employ a heterogeneous die architecture that splits I/O functionality off from compute; and are its first to use its in-house Intel 3 process tech.

That chip will be followed in Q3 by Intel's Xeon 6 6900P processor series, which will carry up to 128 performance cores and will boost its memory and PCIe capacity to 12 channels at up to 8,800 MT/s using MCR DIMMs, and 96 lanes of PCIe 5.0.

Finally, the remaining Xeon 6 processors in Intel's lineup are slated for release sometime in Q1 of 2025. These chips will include Intel's four- and eight-socket p-core parts, as well as its monster 288 e-core 6900E SKU, first touted at Intel Innovation last September.

And while not shown in the graphic, Intel is already working on its next-gen e-core Xeon, codenamed Clearwater Forest, which will be one of its first based on its much anticipated 18A process tech.

You can find a full breakdown on everything we know about Intel's Xeon 6 processors here.

Intel also took the time to talk up its latest AI accelerator, announced at its Vision event back in April. On stage, CEO Pat Gelsinger revealed that baseboards with eight of its Gaudi3 chips would cost you $125,000. You can find our coverage of Gelsinger's Computex keynote here.
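Intel only quoted a price for the full eight-chip baseboard, but if you divide it evenly - ignoring the cost of the board itself and the surrounding system, which Intel didn't break out - you get a rough per-accelerator figure:

```python
# Rough per-accelerator price implied by Intel's $125,000
# eight-chip Gaudi3 baseboard. ASSUMPTION: an even split, with
# baseboard and system costs not broken out separately.
baseboard_price = 125_000
chips_per_board = 8
print(f"${baseboard_price / chips_per_board:,.0f} per Gaudi3")
# $15,625 per Gaudi3
```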

As a reminder, Gaudi3 is already sampling to Intel's partners, with volume production expected to ramp in Q3 for air-cooled units and Q4 for liquid-cooled variants. However, it's worth noting that Intel's flagship AI accelerator is also the last of its kind.

Its replacement, codenamed Falcon Shores, is due out next year. For those who don't recall, Falcon Shores' development has been a bit of a roller coaster. The chip was originally envisioned as an APU - or, as Intel prefers, an XPU - which would combine CPUs and GPUs on a single package, similar to AMD's MI300A.

However, those plans were scrapped, with Falcon Shores being recast as a GPU that would combine Intel's Xe graphics tech with the Habana team's AI expertise. ®

PS: Oh look, it's the Nvidia CEO signing a woman's top with his autograph at Computex amid a throng of fans. Yup!

 

https://www.theregister.com//2024/06/05/chipmakers_computex_roadmaps/
