MIT Tech Review AI December 20, 2024

Why logic-gate neural networks could make AI cheaper

Researchers are exploring neural networks built directly from logic gates, the basic hardware elements inside computer chips. The approach could make computer vision systems faster and far more energy efficient after training, but training remains costly and experts are cautious about whether it can scale.

A new approach to neural networks aims to move more of AI’s work out of software and into computer chip hardware. The idea is simple in principle but difficult in practice: build networks from logic gates, then run image-recognition tasks directly on those hardware-friendly structures.

The work, presented at the Neural Information Processing Systems (NeurIPS) conference, focuses on computer vision. Its promise is speed and energy efficiency. Its unresolved question is whether the technique can handle more realistic problems without losing too much performance.

Why AI Networks Are So Energy Hungry

Modern AI systems such as GPT-4 and Stable Diffusion are based on neural networks built from perceptrons. Perceptrons are simplified software models inspired by neurons in the brain. In large numbers, they can become very powerful.

That power has a cost. The source article notes that perceptron-based systems consume enormous amounts of energy, and points to Microsoft’s deal to reopen Three Mile Island to support its AI advancements.

Part of the inefficiency comes from translation. A perceptron network is a software abstraction, but it often runs on hardware such as GPUs. The network must first be expressed in a form the hardware can execute, and that translation costs time and energy.

Hardware-native neural networks take a different path. If the network is built directly from computer chip components, some of that translation overhead can disappear. In principle, this could make AI workloads faster, cheaper to run, and easier to place inside devices such as smartphones and personal computers.

How Logic-Gate Neural Networks Work

Felix Petersen, who did this work as a postdoctoral researcher at Stanford University, designed networks made from logic gates. Logic gates are among the basic building blocks of computer chips. Each is made from a few transistors, accepts two bits as inputs, and produces one bit as output.

Those bits are either 1s or 0s. A logic gate’s output depends on the rule created by its pattern of transistors. Like perceptrons, logic gates can be linked together into networks.
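As a rough illustration of gates linked into a network, the sketch below wires three gates into two layers that together compute XOR. The names `GATES`, `run_network`, and `XOR_NET` are hypothetical and chosen for this example; nothing here is taken from Petersen's implementation.

```python
# Each gate maps two input bits to one output bit.
GATES = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "NAND": lambda a, b: 1 - (a & b),
}

def run_network(layers, inputs):
    """Evaluate a layered gate network on a list of input bits.

    Each layer is a list of (gate_name, i, j) tuples: apply the named
    gate to values i and j produced by the previous layer.
    """
    values = list(inputs)
    for layer in layers:
        values = [GATES[name](values[i], values[j]) for name, i, j in layer]
    return values

# XOR(a, b) = AND(OR(a, b), NAND(a, b)), built as a two-layer network.
XOR_NET = [
    [("OR", 0, 1), ("NAND", 0, 1)],  # layer 1: two gates read the inputs
    [("AND", 0, 1)],                 # layer 2: one gate combines layer 1
]

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", run_network(XOR_NET, [a, b])[0])
```

The wiring (which gate reads which earlier outputs) is fixed here by hand; in a trained network, the choice of gate at each node is what gets learned.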

The advantage is efficiency at run time. Petersen said in his NeurIPS talk that logic-gate networks consume less energy than perceptron networks by a factor of hundreds of thousands. That is the core reason the idea has attracted attention: even if a logic-gate network is not the strongest possible model, it may be much cheaper to run.

Zhiru Zhang, a professor of electrical and computer engineering at Cornell University, described the opportunity in terms of edge machine learning. The logic-gate approach does not yet match traditional neural networks on image labeling, but if the performance gap can be narrowed, it could open up possibilities for AI that runs closer to users and devices.

The Training Problem

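The main complication is training. Backpropagation, the algorithm behind much of deep learning, depends on calculus: it adjusts a network by following gradients, which exist only for functions that change smoothly as their inputs change. Standard logic gates, however, work only with 0s and 1s, so they offer no meaningful answers for the values between those points.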

Petersen reached the approach through an interest in “differentiable relaxations,” methods that reshape certain mathematical problems into forms calculus can handle. As he put it, “It really started off as a mathematical and methodological curiosity.”

To train logic-gate networks, Petersen created functions that behave like logic gates for 0s and 1s, while also producing answers for intermediate values. That made it possible to use backpropagation during training. Afterward, the relaxed version could be converted back into something suitable for computer hardware.
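The article does not say which functions Petersen used. A minimal sketch, assuming one common choice (treating each input as a probability of being 1), shows what such a relaxation can look like: each function below matches its gate exactly on 0s and 1s, yet is smooth in between, so it has a usable gradient. The function names are hypothetical.

```python
def soft_and(a, b):
    # Equals AND on bits; for values in [0, 1] it is the product.
    return a * b

def soft_or(a, b):
    # Equals OR on bits.
    return a + b - a * b

def soft_xor(a, b):
    # Equals XOR on bits.
    return a + b - 2 * a * b

def slope_wrt_a(f, a, b, h=1e-6):
    # Central finite difference: how much f changes as input a changes.
    return (f(a + h, b) - f(a - h, b)) / (2 * h)

# On exact bits, the relaxed functions reproduce the hard gates.
for a in (0, 1):
    for b in (0, 1):
        assert soft_and(a, b) == (a & b)
        assert soft_or(a, b) == (a | b)
        assert soft_xor(a, b) == (a ^ b)

# Between 0 and 1 the output varies smoothly, so a gradient exists.
print(soft_and(0.5, 0.5))
print(slope_wrt_a(soft_and, 0.3, 0.8))  # close to b, since d(a*b)/da = b
```

Because the slope is nonzero at intermediate values, backpropagation has something to follow during training, which a hard 0-or-1 gate cannot provide.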

That workaround is powerful but expensive. A gate with two binary inputs has four possible input combinations, so there are 2^4 = 16 possible gates. Each node can become any one of them, and the system must track and adjust the 16 probabilities connected to those choices. Petersen said during his NeurIPS talk that training these networks takes hundreds of times longer than training conventional neural networks on GPUs.
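To make the bookkeeping concrete, here is a minimal sketch of a single trainable node. It assumes (the article does not specify this) that the 16 probabilities come from a softmax over 16 learnable scores, and that each gate is relaxed by interpolating its truth table. All names are hypothetical.

```python
import math
from itertools import product

# All 16 truth tables: outputs for inputs (0,0), (0,1), (1,0), (1,1).
TRUTH_TABLES = list(product((0, 1), repeat=4))

def relaxed_gate(table, a, b):
    # Interpolate the truth table; exact on bits, smooth in between.
    t00, t01, t10, t11 = table
    return (t00 * (1 - a) * (1 - b) + t01 * (1 - a) * b
            + t10 * a * (1 - b) + t11 * a * b)

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_node(scores, a, b):
    # Training-time output: probability-weighted mix of all 16 gates.
    probs = softmax(scores)
    return sum(p * relaxed_gate(t, a, b)
               for p, t in zip(probs, TRUTH_TABLES))

def harden(scores):
    # After training: keep only the single most probable gate.
    best = max(range(len(scores)), key=lambda k: scores[k])
    return TRUTH_TABLES[best]

# A node whose scores strongly favor XOR, truth table (0, 1, 1, 0).
scores = [0.0] * 16
scores[TRUTH_TABLES.index((0, 1, 1, 0))] = 5.0

print(soft_node(scores, 1, 0))   # close to 1, but not exactly
print(harden(scores))            # the discrete gate kept for hardware
```

Every node evaluates all 16 candidate gates on every training step, which gives a sense of why training is so much more costly than running the final, hardened network.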

That burden matters for research groups without access to huge computing resources. Petersen developed the networks with colleagues at Stanford University and the University of Konstanz, and he noted that the required GPU time makes the research tremendously hard.

Where The Approach Already Looks Useful

Once training is finished, the costs shift dramatically. Petersen compared the logic-gate networks with other ultra-efficient networks, including binary neural networks, which use simplified perceptrons that process only binary values.

On the CIFAR-10 data set, which includes 10 categories of low-resolution images ranging from “frog” to “truck,” the logic-gate networks performed as well as those other efficient methods. They did so with fewer than a tenth of the logic gates required by the other methods, and in less than a thousandth of the time.

The tests used FPGAs, programmable computer chips that can emulate many possible logic-gate layouts. Petersen also noted that using non-programmable ASIC chips could reduce costs further, because programmable chips require more components to support their flexibility.

This points to the clearest near-term appeal of the method:

fast image classification after training
lower energy use during operation
hardware that could eventually sit inside personal devices
less need to send some data between devices and servers

Why Experts Are Still Cautious

The idea is not yet a replacement for traditional neural networks. Farinaz Koushanfar, a professor of electrical and computer engineering at the University of California, San Diego, said she is not convinced the approach will perform well on more realistic problems. “It’s a cute idea, but I’m not sure how well it scales,” she said.

Her concern centers on approximation. The networks are trained through the relaxation strategy, not by directly training exact logic-gate behavior. That has not caused problems yet, but Koushanfar says it could become more serious as the networks grow.

Petersen is still aiming higher. He hopes to push logic-gate networks toward what he calls a “hardware foundation model,” a general-purpose vision network that could be mass-produced directly on chips. Such chips could be integrated into devices like personal phones and computers.

The goal is not to outperform the most capable traditional neural networks. Petersen acknowledges that logic-gate networks will never compete with them on performance. The target is a different trade-off: good enough results at the lowest possible cost.

“It won’t be the best model,” he says. “But it should be the cheapest.”