Ars Technica AI January 12, 2025

Why photonic AI chips could cut latency to picoseconds

MIT researchers built a photonic chip that can run a complete deep neural network, including both linear and non-linear operations. The prototype reached 410 picoseconds of latency, but it remains a small system with 132 parameters.

AI hardware usually begins by turning light into electrical information, then sending that data elsewhere for computation. MIT researchers are exploring a different path: keep the information in light and process the photons directly.

The result is a photonic AI chip designed around one main goal: lower latency. In the team's prototype, a complete deep neural network ran on a chip with a latency of 410 picoseconds.

Why latency is the main target

A standard digital camera used in a car for emergency braking has a perceptual latency of just over 20 milliseconds. That figure covers only the time needed to convert photons entering the camera into electrical charges with CMOS or CCD sensors. It does not include the additional time needed to move the data to an onboard computer or process it there.

That delay matters for systems that need to respond quickly. Saumil Bandyopadhyay, an MIT researcher, described the work as focused on applications where producing an answer quickly is the central requirement.

The team's photonic chip takes aim at the conversion step itself. Instead of sensing light, digitizing it, and then computing on the resulting electrical signals, the chip performs calculations with photons. For scale, one tick of a standard CPU clocked at 4 GHz lasts 250 picoseconds, so the chip ran its entire neural network in less than two clock cycles.
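As a rough check on the figures quoted above, the short Python sketch below puts them on a common scale. The three constants come from the article; the helper and variable names are illustrative only.

```python
# Figures quoted in the article.
CHIP_LATENCY_S = 410e-12    # 410 picoseconds for the full on-chip network
CAMERA_LATENCY_S = 20e-3    # roughly 20 milliseconds of photon-to-charge conversion
CPU_CLOCK_HZ = 4e9          # 4 GHz reference CPU clock


def clock_period_s(frequency_hz: float) -> float:
    """Duration of one clock tick, in seconds."""
    return 1.0 / frequency_hz


tick_s = clock_period_s(CPU_CLOCK_HZ)
# How many CPU ticks elapse during one pass through the photonic network.
ticks_per_inference = CHIP_LATENCY_S / tick_s
# How many chip-scale inferences fit inside the camera's conversion delay alone.
camera_vs_chip = CAMERA_LATENCY_S / CHIP_LATENCY_S

print(f"one 4 GHz tick:          {tick_s * 1e12:.0f} ps")
print(f"CPU ticks per inference: {ticks_per_inference:.2f}")
print(f"camera delay / chip:     {camera_vs_chip:,.0f}x")
```

The comparison that matters is the last one: the sensor's conversion delay is tens of millions of times longer than the chip's compute time, which is why the team targets the conversion step.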

The hard part is non-linear math

Neural networks are built from layers of computational units that act like neurons. Inputs move through these layers, where they are multiplied by weights, also called parameters. Each neuron computes a weighted sum of the previous layer's outputs and passes the result forward.
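A minimal Python sketch of that weighted-sum step follows. The layer size, weights, and inputs are made-up values chosen for easy arithmetic, not numbers from the MIT chip.

```python
def linear_layer(weights, inputs):
    """Return one weighted sum per neuron: y[i] = sum_j weights[i][j] * inputs[j]."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


# Hypothetical layer: 2 neurons, each fed by the same 3 inputs.
example_weights = [
    [0.5, -1.0, 2.0],   # weights of neuron 0
    [1.0, 0.0, -0.5],   # weights of neuron 1
]
example_inputs = [2.0, 1.0, 0.5]

example_output = linear_layer(example_weights, example_inputs)
print(example_output)  # [1.0, 1.75]
```

Each row of the weight table belongs to one neuron, so the whole layer is a matrix-vector product.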

That part resembles linear algebra, especially matrix multiplication. Photonics is well suited to this kind of work. Bandyopadhyay noted that photonics turned out to be particularly good at linear matrix operations, and a group at MIT led by Dirk Englund demonstrated a photonic chip doing matrix multiplication entirely with light in 2017.

But deep neural networks need more than matrix multiplication. They also depend on non-linear thresholding functions, which help the model represent relationships where outputs are not simply proportional to inputs. Repeating both the linear and non-linear steps across layers is part of what makes deep neural networks useful for complicated patterns in data.
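The need for the non-linear step can be seen in a small Python sketch. Without it, two stacked linear layers give the same result as one combined matrix, so depth adds nothing. The matrices are arbitrary, and the ReLU-style `threshold` function is a common textbook stand-in, not the function implemented on the chip.

```python
def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(a, b):
    """Multiply matrix a by matrix b."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def threshold(v):
    """ReLU-style thresholding: negative values are clipped to zero."""
    return [max(0.0, x) for x in v]


layer1 = [[1.0, -1.0], [-1.0, 1.0]]   # arbitrary example weights
layer2 = [[1.0, 1.0], [0.0, 1.0]]
x = [2.0, 3.0]

# Two linear layers in a row...
stacked_linear = matvec(layer2, matvec(layer1, x))
# ...give the same answer as a single pre-multiplied matrix.
single_matrix = matvec(matmul(layer2, layer1), x)
# Inserting a threshold between the layers changes the result.
with_threshold = matvec(layer2, threshold(matvec(layer1, x)))

print(stacked_linear)   # [0.0, 1.0]
print(single_matrix)    # [0.0, 1.0]
print(with_threshold)   # [1.0, 1.0]
```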

That is where earlier photonic approaches ran into trouble. The common workaround was to do the linear algebra optically, then send the non-linear work to external electronics. This required converting light into electrical signals, processing those signals, and converting the result back into light. For a technology built to reduce latency, that detour weakened the advantage.

How the MIT chip keeps computation on the device

Bandyopadhyay and his colleagues designed and built what the source describes as likely the world's first chip able to compute an entire deep neural network, including both linear and non-linear operations, using photons.

The process begins with an external laser and a modulator, which feed light into the chip through an optical fiber. This step converts electrical inputs into light before computation starts.

Inside the chip, light is fanned out into six channels and sent into a layer of six neurons. Those neurons perform linear matrix multiplication using Mach-Zehnder interferometers. In this system, the interferometers act as programmable beam splitters that mix two optical fields and produce two output optical fields. Applying a voltage controls how much the two inputs mix.

A single Mach-Zehnder interferometer performs a two-by-two matrix operation on a pair of optical signals. A rectangular array of these devices lets the system carry out larger matrix operations across all six optical channels.
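The two-by-two operation can be sketched in Python with the usual textbook model of a Mach-Zehnder interferometer: two 50:50 beam splitters with a tunable phase shifter between them and a second phase shifter at the input. This decomposition, the phase names `theta` and `phi`, and the assumption that the applied voltage sets those phases are illustrative; the article does not describe the device layout at this level.

```python
import cmath
import math


def matmul2(a, b):
    """Product of two 2x2 complex matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def beam_splitter():
    """Ideal, lossless 50:50 beam splitter."""
    s = 1 / math.sqrt(2)
    return [[s, 1j * s], [1j * s, s]]


def phase_shifter(phase):
    """Phase delay applied to the first of the two channels."""
    return [[cmath.exp(1j * phase), 0], [0, 1]]


def mzi(theta, phi=0.0):
    """Textbook MZI model: splitter, internal phase theta, splitter, input phase phi."""
    m = matmul2(beam_splitter(), phase_shifter(phi))
    m = matmul2(phase_shifter(theta), m)
    return matmul2(beam_splitter(), m)


def apply_mzi(m, fields):
    """Apply a 2x2 matrix to a pair of complex optical field amplitudes."""
    return [m[0][0] * fields[0] + m[0][1] * fields[1],
            m[1][0] * fields[0] + m[1][1] * fields[1]]


def total_power(fields):
    """Sum of optical powers across both channels."""
    return sum(abs(f) ** 2 for f in fields)


# Light enters on channel 0 only; sweep the internal phase.
for theta in (0.0, math.pi / 2, math.pi):
    out = apply_mzi(mzi(theta), [1.0, 0.0])
    print(f"theta={theta:.2f}  P0={abs(out[0]) ** 2:.2f}  P1={abs(out[1]) ** 2:.2f}")
```

In this model, `theta = 0` routes all the light to the other channel, `theta = pi` keeps it in the same channel, and values in between split it. Total power is conserved in every case, which is what makes an array of such devices suitable for building larger matrices.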

The non-linear part is handled by co-integrating electronics and optics. A tiny portion of the optical signal is sent to a photodiode, which measures the optical power. That measurement is then used to modulate the remaining photons moving through the device.
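The tap-and-modulate idea can be illustrated with a toy Python model. Everything numeric here is an assumption made for illustration: the 10 percent tap, the `gain` factor, and especially the saturating transmission curve, which stands in for whatever response the real modulator has.

```python
import math


def nonlinear_unit(field, tap_fraction=0.1, gain=1.0):
    """Toy model: tap some optical power, use it to modulate the remaining light.

    tap_fraction, gain, and the transmission curve are hypothetical values,
    not parameters reported for the MIT chip.
    """
    # Photodiode measures the power in the tapped-off portion of the signal.
    tapped_power = tap_fraction * abs(field) ** 2
    control = gain * tapped_power
    # Hypothetical modulator response: transmission drops as the control signal grows.
    transmission = 1.0 / (1.0 + control)
    # The rest of the light continues through the modulator.
    remaining_field = math.sqrt(1.0 - tap_fraction) * field
    return remaining_field * transmission


weak = nonlinear_unit(1.0)
strong = nonlinear_unit(2.0)
print(f"input 1.0 -> {weak:.3f}")
print(f"input 2.0 -> {strong:.3f}  (a linear device would give {2 * weak:.3f})")
```

Doubling the input does not double the output, which is the defining property of a non-linear element. The signal itself never leaves the optical path; only the small tapped portion is measured electronically.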

The complete chip had three layers of neurons for matrix multiplication, with two non-linear function units placed between them. In total, the neural network on the chip could work with 132 parameters.

Small models, specific uses

The 132-parameter scale shows both the promise and the limits of the prototype. The source notes that the GPT-4 large language model reportedly uses 1 trillion parameters. Against that comparison, the MIT chip is not competing with the largest AI systems.

Bandyopadhyay's team is instead aiming at smaller models where latency matters more than model size. The team is targeting AIs that work with up to 100,000 parameters. The argument is that useful systems do not need to start with large language models if the job is fast classification or sensing.

In the study, the smaller model implemented on the chip recognized spoken vowels, a benchmark used in research on AI-focused hardware. It achieved 92 percent accuracy, which was on par with neural networks running on standard computers.

The same direction could matter for autonomous navigation. Bandyopadhyay described systems that repeatedly classify lidar signals with very low latency, faster than human reflexes. His team believes chips like this could classify lidar data directly by feeding photons into photonic chips without first converting them into electrical signals.

The researchers also see a possible role in automotive vision systems that differ from today's camera-based designs. Instead of a standard camera pipeline, a large array of inputs could sample optical signals and send them directly to optical processors for machine-learning computation.

For now, the work is a modest beginning rather than a replacement for major AI accelerators. Its importance is narrower and more practical: it shows a way to combine linear and non-linear neural-network operations on a photonic chip, reducing the need to leave the optical domain just when speed matters most.