Prof. Jason Eshraghian Delivering Keynote at “Workshop on Synchronization and Timing Systems” – “The Brain Computes Using Time and so Should Neural Networks”

See the agenda here.

Abstract: How can “time” be harnessed to boost neural network performance? The brain is a marvel of computation and memory, processing vast amounts of sensory data with an efficiency that puts modern electronics to shame. Reducing the megawatts consumed by hyperscale datacenters to the mere 10 watts the brain requires demands a fundamental shift – leveraging time. This talk explores how temporal dynamics enhance neural network efficiency and performance, using the Matrix-Multiply-free language model as a case study: information is distributed across sequences, requiring the model to “learn to forget” in order to use its limited cache effectively. Ultimately, by embracing temporal strategies, we pave the way toward neuromorphic computing systems that are not only more efficient but also closer to the elegant and sustainable designs found in nature. This exploration marks a step forward in reducing energy demands while advancing the capabilities of artificial intelligence.
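The matmul-free idea can be illustrated with a toy sketch (hypothetical code, not the model's actual implementation): assuming ternary weights in {-1, 0, +1}, a dense layer's matrix multiply collapses into signed additions, with no multiplications required.

```python
import numpy as np

# Hypothetical sketch: with ternary weights in {-1, 0, +1}, a dense layer's
# matrix multiply reduces to signed additions -- no multiplications needed.
rng = np.random.default_rng(0)
x = rng.standard_normal(8)              # input activations
W = rng.integers(-1, 2, size=(4, 8))    # ternary weight matrix

# Reference result using an ordinary matrix multiply.
y_matmul = W @ x

# Multiply-free evaluation: add where w = +1, subtract where w = -1, skip w = 0.
y_addonly = np.array([x[row == 1].sum() - x[row == -1].sum() for row in W])

assert np.allclose(y_matmul, y_addonly)
```

Because the zero entries are skipped entirely, sparsity in the weights translates directly into skipped work, which is one reason this style of layer pairs naturally with event-driven hardware.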

“ON-OFF neuromorphic ISING machines using Fowler-Nordheim annealers” led by Zihao Chen, Zhili Xiao, and Shantanu Chakrabartty published in Nature Communications

Abstract: We introduce NeuroSA, a neuromorphic architecture specifically designed to ensure asymptotic convergence to the ground state of an Ising problem using a Fowler-Nordheim quantum-mechanical-tunneling-based threshold-annealing process. The core component of NeuroSA consists of a pair of asynchronous ON-OFF neurons, which effectively map classical simulated annealing (SA) dynamics onto a network of integrate-and-fire neurons. The threshold of each ON-OFF neuron pair is adaptively adjusted by an FN annealer, and the resulting spiking dynamics replicate the optimal escape mechanism and convergence of SA, particularly at low temperatures. To validate the effectiveness of our neuromorphic Ising machine, we systematically solved benchmark combinatorial optimization problems such as MAX-CUT and Max Independent Set. Across multiple runs, NeuroSA consistently generates distributions of solutions that are concentrated around state-of-the-art results (within 99%) or surpass the current state-of-the-art solutions for Max Independent Set benchmarks. Furthermore, NeuroSA achieves these superior distributions without any graph-specific hyperparameter tuning. For practical illustration, we present results from an implementation of NeuroSA on the SpiNNaker2 platform, highlighting the feasibility of mapping our proposed architecture onto a standard neuromorphic accelerator platform.
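For intuition, the classical simulated-annealing dynamics that NeuroSA maps onto ON-OFF neurons can be sketched on a toy MAX-CUT instance. This is an illustrative baseline only, not the NeuroSA algorithm: spins flip one at a time, and worsening moves are accepted with a probability that shrinks as the temperature cools.

```python
import math
import random

# Toy MAX-CUT instance: a 4-cycle plus one chord (optimum cut = 4 edges).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
n = 4

def cut_size(spins):
    # Number of edges whose endpoints fall on opposite sides of the cut.
    return sum(1 for u, v in edges if spins[u] != spins[v])

def anneal(steps=2000, t0=2.0, seed=0):
    rng = random.Random(seed)
    spins = [rng.choice([-1, 1]) for _ in range(n)]
    best = cut_size(spins)
    for step in range(steps):
        t = max(t0 * (1 - step / steps), 1e-3)   # linear cooling schedule
        i = rng.randrange(n)
        # Cut change from flipping spin i: each incident edge toggles its state.
        delta = sum((1 if spins[u] == spins[v] else -1)
                    for u, v in edges if i in (u, v))
        # Always accept improvements; accept worsening moves with prob e^(delta/t).
        if delta >= 0 or rng.random() < math.exp(delta / t):
            spins[i] = -spins[i]
            best = max(best, cut_size(spins))
    return best
```

NeuroSA's contribution is to realize these accept/reject dynamics with asynchronous spiking neurons whose thresholds are annealed by an FN device, rather than with an explicit temperature loop as above.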

NeuroBench published in Nature Communications

The multi-institutional, large-scale project led by Jason Yik (Harvard), Vijay Janapa Reddi (Harvard), and Charlotte Frenkel (TU Delft) has been published in Nature Communications.

Abstract: Neuromorphic computing shows promise for advancing computing efficiency and capabilities of AI applications using brain-inspired principles. However, the neuromorphic research field currently lacks standardized benchmarks, making it difficult to accurately measure technological advancements, compare performance with conventional methods, and identify promising future research directions. This article presents NeuroBench, a benchmark framework for neuromorphic algorithms and systems, which is collaboratively designed from an open community of researchers across industry and academia. NeuroBench introduces a common set of tools and a systematic methodology for inclusive benchmark measurement, delivering an objective reference framework for quantifying neuromorphic approaches in both hardware-independent and hardware-dependent settings. For the latest project updates, visit the project website (neurobench.ai).


Prof. Jason Eshraghian Delivering Plenary Talk at IEEE MCSoC: “Large-Scale Neuromorphic Computing on Heterogeneous Systems”

In the realm of large-scale model training, the efficiency bottleneck often stems from the intensive data communication required between GPUs. Drawing inspiration from the brain’s remarkable efficiency, this talk explores neuromorphic computing’s potential to mitigate this bottleneck. As chip designers increasingly turn to advanced packaging technologies and chiplets, the models running on these heterogeneous platforms must evolve accordingly. Spiking neural networks, inspired by the brain’s method of encoding information over time and its use of fine-grained sparsity for information transfer, are well positioned to exploit the benefits, and work within the limitations, of heterogeneous hardware systems. This talk will delve into strategies for integrating spiking neural networks into large-scale models, and how neuromorphic computing, alongside the use of chiplets, can surpass the current capabilities of GPUs, paving the way for the next generation of AI systems.

“Autonomous Driving with Spiking Neural Networks” by Ph.D. Candidate Rui-Jie Zhu Accepted in NeurIPS 2024

Abstract: Autonomous driving demands an integrated approach that encompasses perception, prediction, and planning, all while operating under strict energy constraints to enhance scalability and environmental sustainability. We present Spiking Autonomous Driving (SAD), the first unified Spiking Neural Network (SNN) to address the energy challenges faced by autonomous driving systems through its event-driven and energy-efficient nature. SAD is trained end-to-end and consists of three main modules: perception, which processes inputs from multi-view cameras to construct a spatiotemporal bird’s eye view; prediction, which utilizes a novel dual-pathway with spiking neurons to forecast future states; and planning, which generates safe trajectories considering predicted occupancy, traffic rules, and ride comfort. Evaluated on the nuScenes dataset, SAD achieves competitive performance in perception, prediction, and planning tasks, while drawing upon the energy efficiency of SNNs. This work highlights the potential of neuromorphic computing to be applied to energy-efficient autonomous driving, a critical step toward sustainable and safety-critical automotive technology. Our code is available at https://github.com/ridgerchu/SAD.
Link: https://arxiv.org/abs/2405.19687

“Reducing Data Bottlenecks in Distributed, Heterogeneous Neural Networks” by Undergraduate Researcher Ruhai Lin Accepted at IEEE MCSoC-2024

Abstract: The rapid advancement of embedded multicore and many-core systems has revolutionized computing, enabling the development of high-performance, energy-efficient solutions for a wide range of applications. As models scale up in size, data movement is increasingly the bottleneck to performance. This data movement can occur between processor and memory, or between cores and chips. This paper investigates the impact of bottleneck size, in terms of inter-chip data traffic, on the performance of deep learning models in embedded multicore and many-core systems. We conduct a systematic analysis of the relationship between bottleneck size, computational resource utilization, and model accuracy. We apply a hardware-software co-design methodology in which data bottlenecks are replaced with extremely narrow layers to reduce the amount of data traffic. In effect, time-multiplexing of signals is replaced by learnable embeddings that reduce the demands on chip IOs. Our experiments on the CIFAR100 dataset demonstrate that classification accuracy generally decreases as the bottleneck ratio increases, with shallower models experiencing a more significant drop than deeper models. Hardware-side evaluation reveals that higher bottleneck ratios lead to substantial reductions in data transfer volume across the layers of the neural network. Through this research, we can determine the trade-off between data transfer volume and model performance, enabling the identification of a balanced point that achieves good performance while minimizing data transfer volume. This characteristic allows for the development of efficient models …
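The co-design idea can be sketched in a few lines (hypothetical shapes and variable names, a minimal illustration rather than the paper's implementation): a learned narrow projection sits at the chip boundary, so only a small embedding, instead of the full activation tensor, crosses the chip IOs.

```python
import numpy as np

# Hypothetical sketch: insert an extremely narrow "bottleneck" layer at a
# chip boundary so only a small learnable embedding crosses the chip IOs.
rng = np.random.default_rng(0)

batch, width, bottleneck = 32, 512, 16              # bottleneck ratio 512/16 = 32x
acts = rng.standard_normal((batch, width))          # activations on chip A
W_down = rng.standard_normal((width, bottleneck))   # learned projection (chip A)
W_up = rng.standard_normal((bottleneck, width))     # learned expansion (chip B)

sent = acts @ W_down        # only this tensor crosses the chip boundary
restored = sent @ W_up      # chip B re-expands to the working width

# Data-transfer volume shrinks by the bottleneck ratio.
print(acts.size // sent.size)   # -> 32
```

The projection and expansion weights are trained with the rest of the network, so the model learns an embedding that preserves task-relevant information despite the reduced IO budget; the accuracy-versus-ratio trade-off above quantifies what that compression costs.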