Future Tech

Want to save the planet from AI? Chuck in an FPGA and ditch the matrix

Tan KW
Publish date: Wed, 26 Jun 2024, 04:44 PM

Large language models can be made 50 times more energy efficient with alternative math and custom hardware, claim researchers at the University of California, Santa Cruz.

In a paper titled "Scalable MatMul-free Language Modeling," authors Rui-Jie Zhu, Yu Zhang, Ethan Sifferman, Tyler Sheaves, Yiqiao Wang, Dustin Richmond, Peng Zhou, and Jason Eshraghian describe how the energy appetite of artificial intelligence can be moderated by getting rid of matrix multiplication and adding a custom field-programmable gate array (FPGA).

AI - by which we mean predictive, hallucinating machine learning models - has been terrible for keeping Earth habitable because it uses so much energy, much of which comes from fossil fuel use. The operation of datacenters to provide AI services has increased Microsoft's CO2 emissions by 29.1 percent since 2020, and AI-powered Google searches each use 3.0 Wh, ten times more than traditional Google queries.

Earlier this year, a report from the International Energy Agency [PDF] projected that global datacenter power consumption will nearly double by 2026, rising from 460 TWh in 2022 to just over 800 TWh. The hunger for energy to power AI has even reinvigorated interest in nuclear power, because accelerating fossil fuel consumption for the sake of chatbots, bland marketing copy, and on-demand image generation has become politically fraught, if not a potential crime against humanity.

Jason Eshraghian, an assistant professor of electrical and computer engineering at the UC Santa Cruz Baskin School of Engineering and the paper’s lead author, told The Register that the research findings could provide a 50x energy savings with the help of custom FPGA hardware.

"I should note that our FPGA hardware was very unoptimized, too," said Eshraghian. "So there's still a lot of space for improvement."

The prototype is already impressive. A billion-parameter LLM can be run on the custom FPGA with just 13 watts, compared to 700 watts that would have been required using a GPU.

To achieve this, the US-based researchers had to do away with matrix multiplication, a linear algebra technique that is widely used in machine learning and is costly from a computational perspective. Instead of multiplying weights (parameters assigned to link neural network layers) stored as 16-bit floating point numbers, the computer scientists added and subtracted binary {0, 1} or ternary {-1, 0, 1} representations, thus demanding less of their hardware.
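To illustrate the arithmetic shift, here is a minimal sketch (not code from the paper's repository): when every weight is -1, 0, or 1, a dot product needs no multiplications at all, only additions, subtractions, and skips.

```python
# Minimal sketch: a dense layer whose weights are restricted to {-1, 0, 1},
# so every "multiplication" collapses into an add, a subtract, or a no-op.
import numpy as np

def ternary_linear(x, w_ternary):
    """x: input vector (float); w_ternary: matrix with entries in {-1, 0, 1}."""
    out = np.zeros(w_ternary.shape[0], dtype=x.dtype)
    for i, row in enumerate(w_ternary):
        acc = 0.0
        for xj, wij in zip(x, row):
            if wij == 1:        # add instead of multiply
                acc += xj
            elif wij == -1:     # subtract instead of multiply
                acc -= xj
            # wij == 0: contributes nothing, no work at all
        out[i] = acc
    return out

x = np.array([0.5, -1.2, 0.3, 2.0], dtype=np.float32)
w = np.array([[1, 0, -1, 1],
              [-1, 1, 0, 0]], dtype=np.int8)
print(ternary_linear(x, w))   # matches w @ x, with zero multiplications
```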

Other researchers over the past few years have explored alternative architectures for neural networks. One of these, BitNet, has shown promise as a way to reduce energy consumption through simpler math. As described in a paper released in February, representing neural network parameters (weights) as {-1, 0, 1} instead of using 16-bit floating point precision can provide high performance with much less computation.
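For reference, a ternary quantizer in the spirit of BitNet b1.58 can be sketched in a few lines: scale each weight tensor by its mean absolute value, then round and clip to {-1, 0, 1}. This follows the published BitNet recipe; the UCSC paper's exact scheme may differ.

```python
# Illustrative BitNet-b1.58-style ternary ("1.58-bit") weight quantization.
import numpy as np

def quantize_ternary(w, eps=1e-8):
    scale = np.mean(np.abs(w)) + eps           # per-tensor scaling factor
    w_q = np.clip(np.round(w / scale), -1, 1)  # entries now in {-1, 0, 1}
    return w_q.astype(np.int8), scale          # keep the scale to rescale outputs

w_fp16 = np.random.randn(4, 4).astype(np.float16)
w_q, scale = quantize_ternary(w_fp16.astype(np.float32))
print(w_q)      # ternary weights
print(scale)    # one float per tensor instead of 16 bits per weight
```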

The work of Eshraghian and his co-authors demonstrates what can be done with this architecture. Sample code has been published to GitHub.

Eshraghian said the use of "ternary weights replaces multiplication with addition and subtraction, which is computationally much cheaper in terms of memory usage and the energy of actual operations undertaken."

That's combined, he said, with the replacement of "self-attention," the backbone of transformer models, with an "overlay" approach.

"In self attention, every element of a matrix interacts with every single other element," he said. "In our approach, one element only interacts with one other element. By default, less computation leads to worse performance. We compensate for this by having a model that evolves over time."

Eshraghian explained that transformer-based LLMs take all text in one hit. "Our model takes each bit of text piece by piece, so our model is tracking where a particular word is situated in a broader context by accounting for time," he said.
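A rough sketch of that idea, assuming a simple gated, element-wise recurrence (an illustration of the concept, not the paper's actual token mixer): each channel of a running state is updated only from the matching channel of the incoming token, so per-token work scales with model width rather than with sequence length.

```python
# Element-wise recurrence: context is carried forward in time instead of
# being recomputed over all token pairs, as full self-attention would do.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elementwise_recurrence(tokens, f_gate, i_gate):
    """tokens: (seq_len, d) inputs; f_gate, i_gate: (seq_len, d) gates in [0, 1]."""
    d = tokens.shape[1]
    h = np.zeros(d, dtype=tokens.dtype)   # running state: one slot per channel
    outputs = []
    for x_t, f_t, i_t in zip(tokens, f_gate, i_gate):
        # each channel mixes only with itself: O(d) work per token,
        # versus O(seq_len * d) per token for all-pairs attention
        h = f_t * h + i_t * x_t
        outputs.append(h.copy())
    return np.stack(outputs)

T, d = 6, 8
x = np.random.randn(T, d).astype(np.float32)
f = sigmoid(np.random.randn(T, d)).astype(np.float32)
i = sigmoid(np.random.randn(T, d)).astype(np.float32)
print(elementwise_recurrence(x, f, i).shape)   # (6, 8)
```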

Reliance on ternary representation of data does hinder performance, Eshraghian acknowledged, but he and his co-authors found ways to offset that effect.

"Given the same number of computations, we're performing on par with Meta's open source LLM," he said. "However, our computations are ternary operations, and therefore, much cheaper (in terms of energy/power/latency). For a given amount of memory, we do far better."

Even without the custom FPGA hardware, this approach looks promising. The paper claims that by using fused kernels in the GPU implementation of ternary dense layers, training can be accelerated by 25.6 percent while memory consumption can be reduced by 61 percent compared to a GPU baseline.

"Furthermore, by employing lower-bit optimized CUDA kernels, inference speed is increased by 4.57 times, and memory usage is reduced by a factor of 10 when the model is scaled up to 13B parameters," the paper claims.

"This work goes beyond software-only implementations of lightweight models and shows how scalable, yet lightweight, language models can both reduce computational demands and energy use in the real-world." ®


https://www.theregister.com//2024/06/26/ai_model_fpga/
