PhD Student Binh Nguyen presents at IEEE NER 2025

Binh presented the accepted paper, “Accelerating Neuromorphic Deep Brain Stimulation Optimization through Knowledge Distillation and Enforced Sparsity” at IEEE NER 2025 in San Diego, CA.

Abstract:
Closed-loop Deep Brain Stimulation (DBS) systems hold immense promise for treating motor symptoms in Parkinson’s disease (PD) with greater adaptability and efficiency than traditional open-loop approaches. Spiking Neural Networks (SNNs) are particularly well-suited for implementing the control logic in these systems due to their inherent energy efficiency. However, training SNNs, especially using computationally intensive methods like Reinforcement Learning (RL), presents a significant bottleneck, often requiring extensive time and resources. To address this, we introduce a Knowledge Distillation (KD) framework specifically designed to train SNN-based DBS controllers. We leverage a pre-trained, high-performance Deep Spiking Q-Network (DSQN) as a ‘teacher’ to rapidly guide the training of ‘student’ SNNs. Our KD approach incorporates a tunable sparsity-enforcing mechanism, allowing us to generate student networks that exhibit varying degrees of sparse, bioinspired activity. We demonstrate that this KD framework achieves a dramatic reduction in training time compared to the initial RL process. Furthermore, we conduct a comprehensive analysis of the trade-offs between network sparsity, controller performance, and the resulting DBS parameters. Our findings support KD as a powerful and practical methodology for developing efficient, sparse, and biologically plausible SNN controllers, significantly accelerating the design and in silico validation of advanced neuromodulation systems.
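The abstract's core idea, distilling a teacher's Q-values into a student SNN while penalizing spike activity, can be sketched as a simple combined loss. This is a minimal illustration, not the paper's actual implementation; the function name `kd_sparsity_loss` and the weight `beta` are hypothetical.

```python
# Illustrative sketch (not the paper's code): a distillation loss that
# matches the student's Q-values to a pre-trained teacher's, plus a
# tunable L1-style penalty on spike activity to enforce sparsity.

def kd_sparsity_loss(student_q, teacher_q, student_spikes, beta=0.1):
    """Distillation MSE between Q-values plus a mean-firing-rate penalty.

    student_q, teacher_q: lists of Q-values (one per candidate action)
    student_spikes: list of 0/1 spike indicators from the student SNN
    beta: sparsity weight; larger beta pushes toward sparser activity
    """
    mse = sum((s - t) ** 2 for s, t in zip(student_q, teacher_q)) / len(student_q)
    activity = sum(student_spikes) / len(student_spikes)  # mean firing rate
    return mse + beta * activity
```

Sweeping `beta` is one way such a mechanism could trade controller fidelity against the sparse, bio-inspired activity the abstract describes.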

Undergraduate student Skye Gunasekaran publishes a paper titled, “A predictive approach to enhance time-series forecasting” in Nature Communications

A predictive approach to enhance time-series forecasting
Skye Gunasekaran, Assel Kembay, Hugo Ladret, Rui-Jie Zhu, Laurent Perrinet, Omid Kavehei & Jason Eshraghian
Nature Communications volume 16, Article number: 8645 (2025)

Abstract

Accurate time-series forecasting is crucial in various scientific and industrial domains, yet deep learning models often struggle to capture long-term dependencies and adapt to data distribution shifts over time. We introduce Future-Guided Learning, an approach that enhances time-series event forecasting through a dynamic feedback mechanism inspired by predictive coding. Our method involves two models: a detection model that analyzes future data to identify critical events and a forecasting model that predicts these events based on current data. When discrepancies occur between the forecasting and detection models, a larger update is applied to the forecasting model, effectively minimizing surprise and allowing the forecasting model to dynamically adjust its parameters. We validate our approach on a variety of tasks, demonstrating a 44.8% increase in AUC-ROC for seizure prediction using EEG data, and a 23.4% reduction in MSE for forecasting in nonlinear dynamical systems (outlier excluded). By incorporating a predictive feedback mechanism, Future-Guided Learning advances how deep learning is applied to time-series forecasting.
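The feedback mechanism the abstract describes, a larger update when the forecasting model disagrees with the detection model, can be sketched as a discrepancy-scaled gradient step. This is a hypothetical illustration of the principle; `fgl_update` and its parameters are not names from the paper.

```python
# Illustrative sketch (not the paper's code): scale the forecasting
# model's parameter update by its disagreement ("surprise") with the
# detection model, so larger discrepancies trigger larger corrections.

def fgl_update(params, grads, discrepancy, base_lr=0.01, gain=1.0):
    """One gradient step whose effective learning rate grows with the
    discrepancy between forecaster and detector outputs.

    params, grads: parameter values and their gradients (same length)
    discrepancy: nonnegative measure of forecaster/detector disagreement
    """
    lr = base_lr * (1.0 + gain * discrepancy)
    return [p - lr * g for p, g in zip(params, grads)]
```

With `discrepancy = 0` this reduces to ordinary gradient descent; as disagreement grows, the step size grows, which is one simple way to realize the "minimizing surprise" behavior described above.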

FGL framework

Undergraduate alumnus Dustin Wang and PhD Students Rui-Jie Zhu and Taylor Kergan submit a preprint titled, “A Systematic Analysis of Hybrid Linear Attention”

A Systematic Analysis of Hybrid Linear Attention

Abstract:

Transformers face quadratic complexity and memory issues with long sequences, prompting the adoption of linear attention mechanisms using fixed-size hidden states. However, linear models often suffer from limited recall performance, leading to hybrid architectures that combine linear and full attention layers. Despite extensive hybrid architecture research, the choice of linear attention component has not been deeply explored. We systematically evaluate various linear attention models across generations, from vector recurrences to advanced gating mechanisms, both standalone and hybridized. To enable this comprehensive analysis, we trained and open-sourced 72 models: 36 at 340M parameters (20B tokens) and 36 at 1.3B parameters (100B tokens), covering six linear attention variants across five hybridization ratios. Benchmarking on standard language modeling and recall tasks reveals that superior standalone linear models do not necessarily excel in hybrids. While language modeling remains stable across linear-to-full attention ratios, recall significantly improves with increased full attention layers, particularly below a 3:1 ratio. Our study highlights selective gating, hierarchical recurrence, and controlled forgetting as critical for effective hybrid models. We recommend architectures such as HGRN-2 or GatedDeltaNet with a linear-to-full ratio between 3:1 and 6:1 to achieve Transformer-level recall efficiently. Our models are open-sourced at this https URL.
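The fixed-size hidden state and "controlled forgetting" that the abstract highlights can be sketched as a single step of a gated linear-attention recurrence. This is a generic textbook-style sketch, not any specific model from the paper; the scalar gate `g` stands in for the richer gating mechanisms the study compares.

```python
# Illustrative sketch (not the paper's code): one step of a gated
# linear-attention recurrence with a fixed-size matrix state S,
#   S_t = g * S_{t-1} + outer(k_t, v_t);   o_t = q_t @ S_t
# The constant-size state replaces full attention's growing KV cache;
# the gate g controls forgetting (g = 1 means pure accumulation).

def gated_linear_attention_step(S, q, k, v, g):
    """Update state S with the outer product of key k and value v,
    decayed by gate g, then read it out with query q."""
    d_k, d_v = len(k), len(v)
    S = [[g * S[i][j] + k[i] * v[j] for j in range(d_v)] for i in range(d_k)]
    o = [sum(q[i] * S[i][j] for i in range(d_k)) for j in range(d_v)]
    return S, o
```

Because `S` never grows with sequence length, per-token cost stays constant, which is exactly the property that motivates hybridizing such layers with a handful of full attention layers for recall.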

Three 'generations' of linear-attention state updates.

PhD Student Rui-Jie Zhu and fellow NCG lab members submit a preprint titled, “A Survey on Latent Reasoning”

A Survey on Latent Reasoning

Abstract:
Large Language Models (LLMs) have demonstrated impressive reasoning capabilities, especially when guided by explicit chain-of-thought (CoT) reasoning that verbalizes intermediate steps. While CoT improves both interpretability and accuracy, its dependence on natural language reasoning limits the model’s expressive bandwidth. Latent reasoning tackles this bottleneck by performing multi-step inference entirely in the model’s continuous hidden state, eliminating token-level supervision. To advance latent reasoning research, this survey provides a comprehensive overview of the emerging field of latent reasoning. We begin by examining the foundational role of neural network layers as the computational substrate for reasoning, highlighting how hierarchical representations support complex transformations. Next, we explore diverse latent reasoning methodologies, including activation-based recurrence, hidden state propagation, and fine-tuning strategies that compress or internalize explicit reasoning traces. Finally, we discuss advanced paradigms such as infinite-depth latent reasoning via masked diffusion models, which enable globally consistent and reversible reasoning processes. By unifying these perspectives, we aim to clarify the conceptual landscape of latent reasoning and chart future directions for research at the frontier of LLM cognition. An associated GitHub repository collecting the latest papers and repos is available at: this https URL.
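One family the survey covers, activation-based recurrence, amounts to reapplying a network block to the hidden state several times instead of emitting intermediate tokens. The sketch below is a deliberately minimal, hypothetical illustration of that idea, not code from any surveyed system.

```python
# Illustrative sketch (not from the survey): activation-based latent
# reasoning reuses the same block on the continuous hidden state for
# several steps, performing multi-step inference with no token output.

def latent_recurrence(h, block, steps=4):
    """Apply `block` to hidden state `h` for `steps` iterations.

    h: hidden state (here, a list of floats)
    block: callable mapping a hidden state to a new hidden state
    steps: latent reasoning depth (no tokens are produced in between)
    """
    for _ in range(steps):
        h = block(h)
    return h
```

Each pass refines the state in continuous space; only the final state is decoded, which is why the abstract describes latent reasoning as sidestepping the expressive bandwidth limits of verbalized chain-of-thought.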