As machine learning models become larger and more complex, they require faster and more power-efficient hardware to perform calculations. Conventional digital computers struggle to keep up.
An analog optical neural network could perform the same tasks as a digital network, such as image classification or speech recognition, but since the calculations are performed using light instead of electrical signals, optical neural networks can run much faster while consuming less power.
However, these analog devices are prone to hardware errors that can make calculations less accurate. Microscopic imperfections in hardware components are one of the causes of these errors. In an optical neural network with many connected components, errors can quickly accumulate.
Even with error correction techniques, due to the fundamental properties of the devices that make up an optical neural network, some amount of error is inevitable. A network large enough to implement in the real world would be far too imprecise to be effective.
MIT researchers have overcome this hurdle and found a way to effectively scale an optical neural network. By adding a tiny hardware component to the optical switches that form the network's architecture, they can reduce even the uncorrectable errors that would otherwise accumulate in the device.
Their work could enable an ultra-fast, energy-efficient analog neural network that can operate with the same precision as a digital network. With this technique, as an optical circuit becomes larger, the amount of error in its calculations actually decreases.
“This is remarkable because it goes against the intuition of analog systems, where larger circuits are expected to have higher errors, so errors limit scalability. This paper allows us to address the scalability question of these systems with an unambiguous ‘yes,’” says lead author Ryan Hamerly, visiting scholar at MIT’s Research Laboratory of Electronics (RLE) and the Quantum Photonics Laboratory and principal investigator at NTT Research.
Hamerly’s co-authors are graduate student Saumil Bandyopadhyay and senior author Dirk Englund, associate professor in MIT’s Department of Electrical Engineering and Computer Science (EECS), head of the Quantum Photonics Laboratory, and member of RLE. The research is published today in Nature Communications.
Multiplying with light
An optical neural network is made up of many connected components that function like reprogrammable, tunable mirrors. These tunable mirrors are called Mach-Zehnder interferometers (MZIs). Neural network data is encoded in light, which is sent into the optical neural network from a laser.
A typical MZI contains two mirrors and two beam splitters. Light enters the top of an MZI, where it is split into two parts that interfere with each other before being recombined by the second beam splitter and then reflected down to the next MZI in the network. Researchers can take advantage of the interference of these optical signals to perform complex linear algebra operations, known as matrix multiplication, which is how neural networks process data.
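As a rough illustration of how interference turns into matrix arithmetic, the following Python sketch models a single idealized MZI as a 2x2 matrix acting on the light amplitudes in its two paths. It assumes a common textbook model (lossless 50:50 beam splitters plus two tunable phase shifters, here called theta and phi). The function names are illustrative and are not taken from the researchers' paper.

```python
import cmath
import math


def matmul(a, b):
    """Multiply two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def beam_splitter(angle=math.pi / 4):
    """Lossless beam splitter; angle = pi/4 is an ideal 50:50 split (assumed model)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 1j * s], [1j * s, c]]


def phase_shifter(phase):
    """Tunable phase shift applied to the top path only."""
    return [[cmath.exp(1j * phase), 0], [0, 1]]


def mzi(theta, phi):
    """Input phase phi, first splitter, internal phase theta, second splitter."""
    m = phase_shifter(phi)
    for stage in (beam_splitter(), phase_shifter(theta), beam_splitter()):
        m = matmul(stage, m)  # later stages multiply from the left
    return m


def output_powers(theta, phi=0.0):
    """Fractions of power leaving the [top, bottom] ports for light entering the top."""
    m = mzi(theta, phi)
    top_in = [1, 0]
    out = [m[0][0] * top_in[0] + m[0][1] * top_in[1],
           m[1][0] * top_in[0] + m[1][1] * top_in[1]]
    return [abs(x) ** 2 for x in out]
```

In this model, setting theta to 0 sends light that enters the top path entirely out of the bottom port, while setting it to pi sends it out of the top port. Intermediate values split the light between the two, which is what lets a mesh of such devices be programmed to apply a chosen matrix.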
But the errors that can occur in every MZI add up quickly as light travels from device to device. Some errors can be avoided by identifying them in advance and setting the MZIs so that earlier errors are canceled by later devices in the array.
“It’s a very simple algorithm if you know what the errors are. But those errors are notoriously difficult to determine because you only have access to the inputs and outputs of your chip,” says Hamerly. That difficulty led the team to examine whether it was possible to create error correction that works without calibration.
Hamerly and his collaborators previously demonstrated a mathematical technique that went further. They were able to successfully deduce the errors and properly tune the MZIs accordingly, but even that didn’t remove all the errors.
Due to the fundamental nature of an MZI, there are instances where it is impossible to tune a device so that all light flows out of the bottom port to the next MZI. If each device loses a fraction of light at every step and the array is very large, only a tiny bit of power will be left by the end.
“Even with error correction, there is a fundamental limit to the quality of a chip. MZIs are physically unable to achieve certain parameters that they must be configured for,” he says.
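To make this limit concrete, here is a small Python sketch under stated assumptions: each beam splitter is modeled as lossless but with a splitting angle that misses the ideal 50:50 value by a small error (alpha and beta, both hypothetical parameters), and the internal phase theta is swept to find the best achievable setting. The model and names are illustrative and are not the researchers' code.

```python
import cmath
import math


def bottom_power(theta, alpha, beta):
    """Fraction of top-input light leaving the bottom port of a two-splitter MZI.

    alpha and beta are assumed angle errors (radians) of the first and second
    beam splitter relative to an ideal 50:50 split at pi/4.
    """
    eta1 = math.pi / 4 + alpha
    eta2 = math.pi / 4 + beta
    c1, s1 = math.cos(eta1), math.sin(eta1)
    c2, s2 = math.cos(eta2), math.sin(eta2)
    # Two paths reach the bottom port; theta sets their relative phase.
    amplitude = s2 * c1 * cmath.exp(1j * theta) + c2 * s1
    return abs(amplitude) ** 2


def best_bottom_power(alpha, beta, steps=3600):
    """Sweep theta over a full turn and return the best achievable bottom-port power."""
    return max(bottom_power(2 * math.pi * k / steps, alpha, beta)
               for k in range(steps))


def power_after_stages(per_stage_fraction, n_stages):
    """Power left on the intended path if the same shortfall repeats at every stage."""
    return per_stage_fraction ** n_stages


if __name__ == "__main__":
    best = best_bottom_power(0.05, 0.05)  # illustrative error values
    print("best single-device routing:", best)
    for n in (10, 100, 1000):
        print(n, "stages:", power_after_stages(best, n))
```

With the assumed errors of 0.05 radian on each splitter, about 1 percent of the light cannot be steered to the bottom port no matter how theta is set. If that shortfall compounds over 100 devices, only a little over a third of the power stays on the intended path in this toy model.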
So the team developed a new type of MZI. The researchers added a beam splitter to the end of the device, calling it a 3-MZI because it has three beam splitters instead of two. Because of the way this additional beam splitter mixes the light, it becomes much easier for an MZI to reach the setting it needs to send all the light out through its bottom port.
Importantly, the additional beam splitter is only a few micrometers in size and is a passive component, so it does not require any extra wiring. Adding these beam splitters does not significantly change the size of the chip.
Bigger chip, fewer errors
When the researchers ran simulations to test their architecture, they found that it could eliminate much of the uncorrectable error that hinders accuracy. And as the optical neural network gets bigger, the amount of error in the device actually decreases, the opposite of what happens in a device with standard MZIs.
By using 3-MZIs, they could potentially create a device large enough for commercial uses with an error reduced by a factor of 20, Hamerly says.
The researchers also developed a variant of the MZI design specifically for correlated errors. These occur due to manufacturing imperfections – if a chip’s thickness is slightly off, the MZIs can all be off by about the same amount, so the errors are about the same. They found a way to modify the configuration of an MZI to make it robust to these types of errors. This technique also increased the bandwidth of the optical neural network so that it could operate three times faster.
Now that they have demonstrated these techniques using simulations, Hamerly and his collaborators plan to test these approaches on physical hardware and continue to work toward an optical neural network that they can effectively deploy in the real world.
This research is supported, in part, by a National Science Foundation Graduate Research Fellowship and the United States Air Force Office of Scientific Research.