New Paper: “Side-channel attack analysis on in-memory computing architectures” led by Ph.D. candidate Ziyu Wang from the Lu Group, published in IEEE Transactions on Emerging Topics in Computing

“Side-channel attack analysis on in-memory computing architectures” has been published, led by Ziyu Wang and Prof. Wei Lu (Lu Group), along with collaborators Fan-Hsuan Meng and Yongmo Park.

Abstract—In-memory computing (IMC) systems have great potential for accelerating data-intensive tasks such as deep neural networks (DNNs). As DNN models are generally highly proprietary, the neural network architectures become valuable targets for attacks. In IMC systems, since the whole model is mapped on chip and reads of the weight memory can be restricted, the system acts as a “black box” for customers. However, the localized and stationary weight and data patterns may subject IMC systems to other attacks. In this paper, we propose a side-channel attack methodology on IMC architectures. We show that it is possible to extract model architectural information from power trace measurements without any prior knowledge of the neural network. We first develop a simulation framework that can emulate the dynamic power traces of the IMC macros. We then perform side-channel attacks to extract information such as the stored layer type, layer sequence, output channel/feature size and convolution kernel size from power traces of the IMC macros. Based on the extracted information, full networks can potentially be reconstructed without any knowledge of the neural network. Finally, we discuss potential countermeasures for building IMC systems that offer resistance to these model extraction attacks.
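To give a flavor of what “extracting architectural information from power traces” means, here is a minimal toy sketch in Python. It is not the simulation framework or attack pipeline from the paper: the synthetic trace, the threshold-based segmentation, and every name in it are hypothetical, and only the general idea (per-layer compute shows up as distinct activity intervals in the macro's power trace) comes from the abstract.

```python
# Illustrative sketch only: a toy example of the general idea behind a
# power-trace side-channel attack, NOT the paper's simulation framework
# or attack methodology. All values and names here are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "power trace": three layers with distinct activity levels,
# separated by short idle gaps (stand-in for measured IMC macro power).
layers = [(2000, 1.0), (1500, 0.6), (500, 0.9)]       # (duration, mean power)
segments = [0.02 * rng.standard_normal(200)]           # leading idle period
for duration, level in layers:
    segments.append(level + 0.05 * rng.standard_normal(duration))
    segments.append(0.02 * rng.standard_normal(200))   # idle gap after layer
trace = np.concatenate(segments)

# Crude segmentation: mark samples above an activity threshold, then find
# contiguous active regions. Each region is treated as one layer's compute.
active = trace > 0.3
edges = np.flatnonzero(np.diff(active.astype(int)))
starts, ends = edges[::2] + 1, edges[1::2] + 1

for i, (s, e) in enumerate(zip(starts, ends)):
    dur = e - s
    mean_p = trace[s:e].mean()
    # Duration and mean power are rough proxies for layer size / layer type;
    # the paper extracts richer features (kernel size, channel count, etc.).
    print(f"layer {i}: duration={dur} samples, mean power={mean_p:.2f}")
```

In this toy setting the recovered interval lengths and power levels already hint at the layer sequence and relative layer sizes; the paper goes much further and infers layer type, output channel/feature size and convolution kernel size from real dynamic power traces.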

[Figure: side-channel attack (SCA) overview]

New Preprint: “SpikeGPT: Generative Pre-Trained Language Model with Spiking Neural Networks” led by incoming Ph.D. candidate Ruijie Zhu

We have released SpikeGPT, led by Ruijie Zhu and Qihang Zhao: the largest-scale SNN trained via backpropagation to date, and (to the best of our knowledge) the first spike-based generative language model.

[Figure: SpikeGPT architecture]

Abstract: As large language models continue to scale in size, so do the computational resources required to run them. Spiking neural networks (SNNs) have emerged as an energy-efficient approach to deep learning that leverages sparse and event-driven activations to reduce the computational overhead associated with model inference. While they have become competitive with non-spiking models on many computer vision tasks, SNNs have also proven more challenging to train. As a result, their performance lags behind modern deep learning, and the effectiveness of SNNs in language generation has yet to be demonstrated. In this paper, we successfully implement ‘SpikeGPT’, a generative language model with pure binary, event-driven spiking activation units. We train three model variants with 45M, 125M and 260M parameters. To the best of our knowledge, this is 4x larger than any functional backprop-trained SNN to date. We achieve this by modifying the transformer block to replace multi-head self-attention, reducing the computational complexity from quadratic to linear in sequence length. Input tokens are instead streamed in sequentially to our attention mechanism (as with typical SNNs). Our preliminary experiments show that SpikeGPT remains competitive with non-spiking models on the tested benchmarks, while consuming 5x less energy when processed on neuromorphic hardware that can leverage sparse, event-driven activations. Our code implementation is available at https://github.com/ridgerchu/SpikeGPT.
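The “pure binary, event-driven spiking activation units” mentioned above are what make backprop training of SNNs tricky, since a hard threshold has zero gradient almost everywhere. Below is a minimal PyTorch sketch of the standard surrogate-gradient trick that makes such units trainable. It is only an illustration of the general mechanism, not the actual SpikeGPT code (see the repository linked below); the fast-sigmoid surrogate, the threshold of 0.5, and the toy layer are assumptions for the example.

```python
# Minimal sketch of a binary spiking activation with a surrogate gradient,
# the standard trick for training SNNs with backprop. This only illustrates
# the mechanism; the real implementation is at
# https://github.com/ridgerchu/SpikeGPT.
import torch

class SpikeFn(torch.autograd.Function):
    """Heaviside step in the forward pass (binary, event-driven output),
    smooth surrogate derivative in the backward pass."""

    @staticmethod
    def forward(ctx, membrane_potential):
        ctx.save_for_backward(membrane_potential)
        return (membrane_potential > 0).float()          # 0/1 spikes

    @staticmethod
    def backward(ctx, grad_output):
        (u,) = ctx.saved_tensors
        # Surrogate gradient: derivative of a fast sigmoid around the threshold.
        surrogate = 1.0 / (1.0 + 10.0 * u.abs()) ** 2
        return grad_output * surrogate

spike = SpikeFn.apply

# Toy usage: a linear layer whose outputs are binary spikes, trained end to end.
x = torch.randn(8, 16)
layer = torch.nn.Linear(16, 4)
out = spike(layer(x) - 0.5)        # 0.5 is a hypothetical firing threshold
loss = out.sum()
loss.backward()                    # gradients flow through the surrogate
print(out.shape, layer.weight.grad.shape)
```

Because the forward pass emits only 0/1 events, downstream matrix multiplies reduce to sparse accumulations, which is where the energy savings on neuromorphic hardware come from.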

Preprint: https://arxiv.org/abs/2302.13939

Code: https://github.com/ridgerchu/SpikeGPT