Thursday, March 31, 2016

Neuromorphic computing: Moving beyond the von Neumann architecture

For decades, computer scientists have striven to build machines as complex and efficient as the human brain. China's Tianhe-2, the world's most powerful supercomputer (consisting of 200 refrigerator-sized units in an area the size of a basketball court), may compute four times faster and hold 10 times more data than the human brain, but it also sucks up enough electricity to power 10,000 homes. On the other hand, the human brain consumes less juice than a dim light bulb and fits nicely within our skull. So what if we could build computers that were more like brains? Turns out we're almost there.

Brains are not like digital computers

As the end of Moore's law seems closer than ever, researchers are exploring new approaches to computing that can move beyond the traditional von Neumann architecture, the architecture that powers pretty much every conventional computer and smart device available today. In the von Neumann architecture, a powerful logic core (the central processing unit, or CPU) operates sequentially on data fetched from memory. This concept is very powerful, and we have seen it scale to systems with 3,120,000 cores and 1.34 pebibytes of memory (more than a million GB) in the case of Tianhe-2. The memory holds both data and instructions, and the CPU can either fetch an instruction from memory or operate on data (e.g., perform arithmetic), but it can't do both at the same time. All instructions and data are represented as ones and zeroes, and every program we write is essentially executed sequentially. Sure, with multiple CPUs you can run multiple programs in parallel, and deep down instructions might be pipelined, but at the end of the day every machine instruction follows (a variant of) the same sequential recipe: fetch and decode an instruction, execute it, and update memory.
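
To make that sequential recipe concrete, here is a minimal sketch of the fetch-decode-execute cycle in Python. The three-instruction machine (LOAD, ADD, STORE) and its memory layout are invented purely for illustration; the point is that one loop does everything, one step at a time, over a single shared memory.

```python
# Toy von Neumann machine: one memory holds both instructions and data,
# and a single sequential loop fetches, decodes, and executes them.
# The tiny instruction set below is made up for illustration.

memory = [
    ("LOAD", 5),    # acc = memory[5]
    ("ADD", 6),     # acc += memory[6]
    ("STORE", 7),   # memory[7] = acc
    ("HALT", None),
    None,
    2, 3, 0,        # data lives in the same memory as the program
]

pc, acc = 0, 0                      # program counter and accumulator
while True:
    op, addr = memory[pc]           # fetch and decode one instruction
    pc += 1
    if op == "LOAD":
        acc = memory[addr]          # execute: read data from memory
    elif op == "ADD":
        acc += memory[addr]
    elif op == "STORE":
        memory[addr] = acc          # memory update
    elif op == "HALT":
        break

print(memory[7])  # -> 5: everything happened strictly one step at a time
```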

Biological brains, on the other hand, work completely differently. In fact, we're still not quite sure how they work (consciousness, anyone?). What we do know is that they distribute computation and memory among up to 100 billion relatively simple and noisy processing units (neurons), each highly interconnected with thousands of others through connections called synapses. Memory is not located in one particular place in the brain; it is instead a brain-wide process in which several different areas operate in concert with one another. In fact, 66% of all possible inter-areal connections actually exist in the brain! We can find hierarchies (e.g., the visual stream), modules, and functional specialization (e.g., an area that predominantly recognizes faces), but even this might be an oversimplification. Neurons communicate with each other through spikes (all-or-nothing digital events), but over time these spikes give rise to an analog signal (the neuronal firing rate), and the exact timing of individual spikes might play a role as well. You can't tell what a brain is doing by looking at a single neuron (population coding), and you can even kill off single neurons without affecting brain processing much (but that doesn't mean you should go get black-out drunk every weekend).
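
To give a flavor of how digital spikes can give rise to an analog rate, here is a minimal sketch of a leaky integrate-and-fire neuron, one of the simplest textbook spiking-neuron models. All parameter values are arbitrary and chosen only for illustration.

```python
import random

# Minimal leaky integrate-and-fire neuron: the membrane potential leaks back
# toward rest, integrates a noisy input drive, and emits an all-or-nothing
# spike whenever it crosses a threshold. Parameters are illustrative only.
dt, tau = 1.0, 20.0                 # time step and membrane time constant (ms)
v_rest, v_thresh, v_reset = 0.0, 1.0, 0.0

v = v_rest
spike_times = []
for t in range(1000):                         # simulate 1000 ms
    drive = 1.2 + 0.3 * random.random()       # noisy input, in threshold units
    v += (dt / tau) * (v_rest - v + drive)    # leak + integrate
    if v >= v_thresh:                         # threshold crossed: spike!
        spike_times.append(t)
        v = v_reset                           # reset and integrate again

# Each spike is a digital, all-or-nothing event; averaged over time the spikes
# define an analog quantity, the firing rate.
print(len(spike_times), "spikes in 1 s -> mean rate of about",
      len(spike_times), "Hz")
```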

Overall, brains are remarkably complex, fault-tolerant, and efficient. Brains help us survive, adapt, and predict, yet they consume no more than 13-15 watts of power. Per operation, that makes brains some 8-9 orders of magnitude more energy-efficient than digital computers!

So what if we could extract the computational principles that make brains so powerful, and build a computer based on them?

Building brains out of silicon

Researchers have long recognized the extraordinary energy stinginess of biological computing. The clearest early statement is a visionary 1990 paper by Carver Mead of the California Institute of Technology (Caltech), which established the term "neuromorphic": at the time, a rather futuristic-sounding vision of a "brain chip" that could mimic the brain's structure and processing ability in silicon, quickly learning and chewing on data as fast as it could be generated.

Since then, a number of neuromorphic computing platforms have popped up all over the place, based on a wide variety of architectures and visions, including custom ASICs, FPGAs, GPUs, ARM cores, and other non-standard processor designs. Their approaches vary, but the goal is the same: a revolution in computing based on chips that operate on the same principles as the brain.

Last week neuromorphic computing took another step forward with a workshop for users from academia, industry, and education interested in two European neuromorphic systems that have been years in development and are now coming online for broader use: the BrainScaleS system launching at the Kirchhoff Institute for Physics of Heidelberg University, and SpiNNaker, a complementary approach and similarly sized system at the University of Manchester. Both were developed with support from the European Commission's Future and Emerging Technologies programme (2005–2015) and are now part of the Human Brain Project. The complete workshop stream can be watched on YouTube.

BrainScaleS and SpiNNaker take different tacks to modeling neuron activity. One approach is to use wafer-scale analog very-large-scale integration (VLSI) circuits, like the chips developed by the BrainScaleS project. Each 20-cm-diameter silicon wafer contains 384 chips, each of which implements 128,000 synapses and up to 512 spiking neurons. This gives a total of around 200,000 neurons and 49 million synapses per wafer. These VLSI models operate considerably faster than their biological originals, allowing the emulated neural networks to evolve tens of thousands of times faster than real time. The leader of the BrainScaleS project, Prof. Dr. Karlheinz Meier (Heidelberg University), explains: "The BrainScaleS system goes beyond the paradigms of a Turing machine and the von Neumann architecture. It is neither executing a sequence of instructions nor is it constructed as a system of physically separated computing and memory units. It is rather a direct, silicon based image of the neuronal networks found in nature, realizing cells, connections and inter-cell communications by means of modern analogue and digital microelectronics."
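
A quick back-of-the-envelope check of the wafer figures quoted above, and of what acceleration buys you in practice. The 10,000x factor below is an assumed, representative value within the "tens of thousands" range mentioned in the text.

```python
# Sanity-check the BrainScaleS wafer numbers quoted above.
chips_per_wafer = 384
neurons_per_chip = 512          # "up to 512 spiking neurons"
synapses_per_chip = 128_000

print(chips_per_wafer * neurons_per_chip)    # 196,608 ~ "around 200,000 neurons"
print(chips_per_wafer * synapses_per_chip)   # 49,152,000 ~ "49 million synapses"

# What accelerated analog emulation means in practice: assuming a 10,000x
# speed-up factor (illustrative; the text says "tens of thousands"),
# a full day of biological time runs in well under a minute of wall-clock time.
speedup = 10_000
biological_seconds = 24 * 3600   # one day of "brain time"
print(biological_seconds / speedup, "wall-clock seconds")   # 8.64 s
```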

SpiNNaker's architecture, by contrast, closely links a very large number of small, fast, energy-efficient digital cores, giving the machine massive parallelism and resilience to the failure of individual components. With more than one million cores, and one thousand simulated neurons per core, SpiNNaker should be capable of simulating one billion neurons in real time. That equates to a little over one percent of the human brain's estimated 85 billion neurons.

Steve Furber, a professor at the University of Manchester and a co-designer of the ARM chip architecture, leads the SpiNNaker team. SpiNNaker is a contrived acronym derived from Spiking Neural Network Architecture. The machine consists of 57,600 identical 18-core processors, giving it 1,036,800 ARM968 cores in total. The die is fabricated by United Microelectronics Corporation (UMC) on a 130 nm CMOS process. Each System-in-Package (SiP) node has an on-board router to form links with its neighbors, as well as 128 Mbyte off-die SDRAM to hold synaptic weights.
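
The architectural trick that ties all those cores together is that spikes travel through the machine as tiny multicast packets: a packet carries little more than the identity of the neuron that fired, and routing tables fan it out to every core that hosts a target neuron. The sketch below is a toy Python illustration of that event-driven idea, not SpiNNaker's actual router or software; the core names and routing entries are invented.

```python
# Toy event-driven multicast spike routing, loosely inspired by SpiNNaker's
# scheme: a spike is just a small packet carrying the source neuron's ID, and
# a routing table delivers it to every core hosting a target neuron.
from collections import defaultdict

routing_table = {            # source neuron ID -> cores that must receive it
    0: {"core_1", "core_2"},
    1: {"core_2"},
    2: {"core_1", "core_3"},
}

inboxes = defaultdict(list)  # per-core queues of incoming spike packets

def fire(neuron_id, timestep):
    """Emit a spike packet and multicast it to all subscribed cores."""
    for core in routing_table.get(neuron_id, ()):
        inboxes[core].append((timestep, neuron_id))

# A few spikes arriving from different neurons:
fire(0, timestep=1)
fire(2, timestep=1)
fire(1, timestep=2)

for core, packets in sorted(inboxes.items()):
    print(core, packets)
# core_1 [(1, 0), (1, 2)]
# core_2 [(1, 0), (2, 1)]
# core_3 [(1, 2)]
```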

Most prominent in the U.S. is IBM Research's TrueNorth chip, envisioned by IBM Fellow and chief scientist for brain-inspired computing Dharmendra Modha. The TrueNorth chip, introduced in August 2014, is a neuromorphic CMOS chip consisting of 4,096 hardware cores, each simulating 256 programmable silicon "neurons", for a total of just over a million neurons. Each neuron in turn has 256 programmable synapses that convey signals between them, so the total number of programmable synapses is just over 268 million. In terms of basic building blocks, the chip's transistor count is 5.4 billion.
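
The headline figures follow directly from that core layout; a trivial check:

```python
# Where TrueNorth's headline numbers come from.
cores = 4096
neurons_per_core = 256
synapses_per_neuron = 256

neurons = cores * neurons_per_core
synapses = neurons * synapses_per_neuron
print(f"{neurons:,} neurons")    # 1,048,576 -> "just over a million"
print(f"{synapses:,} synapses")  # 268,435,456 -> "just over 268 million"
```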

Recently, Modha's group revealed the world's first 1 million-neuron evaluation platform for mobile applications (NS1e). This week, they took another step forward by demonstrating the first 16 million-neuron scale-out system (NS1e-16), assembled from 16 NS1e boards along with supporting periphery, including a host server, network router, power supervisors, and other components. The system possesses the equivalent of 16 million neurons and 4 billion synapses (about the size of a frog brain), yet consumes the energy equivalent of a tablet computer: a mere 2.5 watts of power.

Where to go from here

BrainScaleS, SpiNNaker, and TrueNorth are just three examples of many ongoing neuromorphic projects. Other projects to look out for include Stanford's Neurogrid, ETH Zurich/University of Zurich's VLSI-based Neuromorphic Cognitive Systems, efforts from smaller endeavors such as BrainCorp and BrainChip, and many more to come.

However, turning these platforms into commercial products or more general-purpose computing machines remains challenging. The problem seems to be that there really is no "general" purpose as of yet: all of these devices are built with a specific application in mind, resulting in their own hardware approach and programming environment. For example, some of these devices were built to model the brain as closely as possible (e.g., BrainScaleS, Neurogrid), whereas others care less about neurobiological fidelity and focus instead on solving computer-science challenges (e.g., TrueNorth). Adapting these platforms for a general audience will be key for future commercial success. And how do you program a neuromorphic supercomputer anyway?
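
For the two European systems, at least, part of an answer already exists: both BrainScaleS and SpiNNaker can be driven through PyNN, a simulator-independent Python API in which you describe populations of neurons and the projections between them, and leave the mapping onto the hardware to the backend. Below is a minimal, illustrative PyNN-style script, shown here against a generic software simulator backend; the population sizes, rates, and weights are arbitrary, and the exact backend module you import depends on the platform.

```python
# Minimal PyNN-style network description: populations and projections, with
# the chosen backend deciding how to map them onto the substrate.
# Sizes, rates, and weights are arbitrary, for illustration only.
import pyNN.nest as sim   # software simulator backend; swap in the
                          # platform-specific backend module for hardware

sim.setup(timestep=1.0)

stimulus = sim.Population(100, sim.SpikeSourcePoisson(rate=20.0))
excitatory = sim.Population(200, sim.IF_cond_exp())

sim.Projection(stimulus, excitatory,
               sim.FixedProbabilityConnector(0.1),
               synapse_type=sim.StaticSynapse(weight=0.01, delay=1.0),
               receptor_type="excitatory")

excitatory.record("spikes")
sim.run(1000.0)                     # 1 s of biological time

spiketrains = excitatory.get_data("spikes").segments[0].spiketrains
print(sum(len(st) for st in spiketrains), "spikes recorded")

sim.end()
```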

The singularity might not be around the corner just yet, but the field is making rapid progress towards a new generation of general-purpose brain-inspired computing. The first complete, commercially available TrueNorth system was just sold to Brian van Essen's team at Lawrence Livermore National Laboratory. Van Essen sees neurosynaptic chips playing two key roles in the high-performance computing world of the future: "First, because the chips are designed to be integrated with one another, and, essentially, operate like one large processor, computer scientists will be able to build massive computers by adding more processors—enabling them to take on very large computing tasks", he explains on the IBM Blog. "Second, I believe that many of the high-performance systems of the future will possess a variety of computing capabilities so they can take on complex tasks. A neurosynaptic system focusing on pattern recognition and deep learning could be a component of a larger system on the road to exascale supercomputers."