Optimizing Chlorination in Water Distribution Systems via Surrogate-assisted Neuroevolution

This paper proposes a surrogate-assisted neuroevolution framework that combines NEAT and NSGA-II to optimize multi-objective chlorine injection strategies in complex water distribution systems, demonstrating superior performance over standard reinforcement learning methods while leveraging a neural network surrogate to bypass the computational costs of traditional hydraulic simulators.

Rivaaj Monsia, Daniel Young, Olivier Francon, Risto Miikkulainen · 2026-04-14 · eess
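The core idea of surrogate assistance is generic: fit a cheap model on a small set of expensive simulator evaluations, screen many candidates with it, and spend the simulator budget only on the most promising ones. The sketch below is an illustration of that pattern only, not the paper's NEAT/NSGA-II pipeline; the objective function and the nearest-neighbour surrogate are hypothetical stand-ins.

```python
import numpy as np

# Hypothetical stand-in for an expensive hydraulic simulation
# (here: a simple quadratic "chlorine residual error").
def expensive_eval(x):
    return float(np.sum((x - 0.5) ** 2))

rng = np.random.default_rng(0)

# 1. Evaluate a small seed set with the expensive simulator.
seed = rng.random((20, 4))
seed_scores = np.array([expensive_eval(x) for x in seed])

# 2. Fit a cheap surrogate on those evaluations
#    (nearest-neighbour lookup, chosen purely for brevity).
def surrogate(x):
    d = np.linalg.norm(seed - x, axis=1)
    return seed_scores[np.argmin(d)]

# 3. Screen many candidates with the surrogate; run the expensive
#    simulator only on the shortlist it ranks best.
candidates = rng.random((200, 4))
pred = np.array([surrogate(x) for x in candidates])
shortlist = candidates[np.argsort(pred)[:5]]
true_scores = [expensive_eval(x) for x in shortlist]
best = min(true_scores)
```

Only 25 expensive evaluations are spent in total, versus 200 if every candidate were simulated directly.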

Isomorphic Functionalities between Ant Colony and Ensemble Learning: Part III -- Gradient Descent, Neural Plasticity, and the Emergence of Deep Intelligence

This paper completes a trilogy by proving that the fundamental mechanisms of deep learning, including stochastic gradient descent and neural plasticity, are mathematically isomorphic to the generational dynamics and adaptive behaviors of ant colonies, thereby suggesting a unified theory of learning that transcends biological and artificial substrates.

Ernest Fokoué, Gregory Babbitt, Yuval Levental · 2026-04-14 · cs.LG

Wolkowicz-Styan Upper Bound on the Hessian Eigenspectrum for Cross-Entropy Loss in Nonlinear Smooth Neural Networks

This paper derives a closed-form upper bound for the maximum eigenvalue of the Hessian matrix in nonlinear smooth multilayer neural networks with cross-entropy loss, utilizing the Wolkowicz-Styan bound to analytically characterize loss sharpness without relying on numerical eigenspectrum computations.

Yuto Omae, Kazuki Sakai, Yohei Kakimoto, Makoto Sasaki, Yusuke Sakai, Hirotaka Takahashi · 2026-04-14 · cs.LG
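The classical Wolkowicz-Styan bound states that for a real symmetric n×n matrix with eigenvalue mean m = tr(A)/n and eigenvalue standard deviation s = sqrt(tr(A²)/n − m²), the largest eigenvalue satisfies λ_max ≤ m + s·√(n−1), using only traces. A minimal numeric check (on a random PSD matrix standing in for a Hessian, not the paper's network setting):

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((6, 6))
A = B @ B.T  # symmetric PSD test matrix (Hessian stand-in)

n = A.shape[0]
m = np.trace(A) / n                       # mean of the eigenvalues
s = np.sqrt(np.trace(A @ A) / n - m**2)   # their standard deviation

ws_upper = m + s * np.sqrt(n - 1)         # Wolkowicz-Styan upper bound
lam_max = np.linalg.eigvalsh(A)[-1]       # true largest eigenvalue
```

Both quantities come from traces alone, which is why such bounds let one characterize loss sharpness without computing the full eigenspectrum.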

Heterogeneous Connectivity in Sparse Networks: Fan-in Profiles, Gradient Hierarchy, and Topological Equilibria

This paper demonstrates that while heterogeneous fan-in profiles in static sparse networks offer no accuracy advantage over uniform random connectivity due to arbitrary hub placement, initializing dynamic sparse training (RigL) with lognormal profiles matching the equilibrium fan-in distribution significantly improves performance by allowing optimization to refine weights rather than rearrange topology.

Nikodem Tomczak · 2026-04-14 · cs.LG
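Initializing a sparse layer with a heterogeneous fan-in profile is straightforward to sketch: draw per-neuron fan-ins from a lognormal distribution, then build a binary connectivity mask. The distribution parameters below are illustrative, not the equilibrium profile the paper matches.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 100, 50

# Draw heterogeneous fan-ins from a lognormal profile
# (parameters are arbitrary here), clipped to the valid range.
fan_in = rng.lognormal(mean=2.0, sigma=0.5, size=n_out)
fan_in = np.clip(fan_in.round().astype(int), 1, n_in)

# Build a sparse binary mask: each output neuron i receives
# exactly fan_in[i] randomly chosen input connections.
mask = np.zeros((n_out, n_in), dtype=bool)
for i, k in enumerate(fan_in):
    mask[i, rng.choice(n_in, size=k, replace=False)] = True

density = mask.sum() / mask.size
```

A dynamic sparse trainer such as RigL would then prune and regrow connections within this mask during training; the paper's point is that starting near the equilibrium profile spares it that topological rearrangement.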

Frugal Knowledge Graph Construction with Local LLMs: A Zero-Shot Pipeline, Self-Consistency and Wisdom of Artificial Crowds

This paper presents a reproducible, zero-shot pipeline for constructing and querying knowledge graphs using local LLMs on consumer hardware, demonstrating that while self-consistency and multi-model diversity significantly enhance multi-hop reasoning performance, strong model consensus can paradoxically indicate collective hallucination, ultimately achieving competitive results with a minimal carbon footprint.

Pierre Jourlin (LIA) · 2026-04-14 · cs.AI
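Self-consistency in its simplest form is a majority vote over repeated samples; reporting the agreement level alongside the answer is worthwhile precisely because, as the paper observes, strong consensus can still be a shared hallucination. A minimal sketch (the sampled answers are hypothetical):

```python
from collections import Counter

def self_consistent_answer(samples):
    """Majority vote over sampled answers, plus the agreement ratio.
    High agreement does not guarantee correctness: all samples
    may share the same error."""
    counts = Counter(samples)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(samples)

# Hypothetical answers from repeated sampling of one or more local LLMs.
samples = ["Paris", "Paris", "Lyon", "Paris", "Paris"]
answer, agreement = self_consistent_answer(samples)
```

The "wisdom of artificial crowds" variant pools samples from several distinct models instead of one, so disagreement carries more signal.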

PaceLLM: Brain-Inspired Large Language Models for Long-Context Understanding

PaceLLM is a brain-inspired large language model that enhances long-context understanding and mitigates information decay by integrating a Persistent Activity mechanism for dynamic state retrieval and Cortical Expert clustering for semantic reorganization, achieving significant performance gains on long-context benchmarks without requiring structural overhauls.

Kangcong Li, Peng Ye, Chongjun Tu, Lin Zhang, Chunfeng Song, Jiamin Wu, Tao Yang, Qihao Zheng, Tao Chen · 2026-04-13 · q-bio

A Little Rank Goes a Long Way: Random Scaffolds with LoRA Adapters Are All You Need

The paper introduces LottaLoRA, a training paradigm demonstrating that low-rank LoRA adapters trained on frozen, randomly initialized backbones can achieve near-full performance across diverse architectures while training only a tiny fraction of parameters, revealing that task-specific information occupies a significantly smaller subspace than the full model size suggests.

Hananel Hazan, Yanbo Zhang, Benedikt Hartl, Michael Levin · 2026-04-13 · cs.LG
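The LoRA mechanism underlying this result is compact: the frozen weight W is left untouched and a trainable low-rank product B·A is added on top, so the effective weight is W + B·A and only A and B receive gradients. A minimal numpy sketch of that structure (dimensions and initializations are illustrative; this is not the paper's LottaLoRA training recipe):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 64, 4

W = rng.standard_normal((d_out, d_in))      # frozen, randomly initialized
A = 0.01 * rng.standard_normal((r, d_in))   # trainable down-projection
B_mat = np.zeros((d_out, r))                # trainable up-projection, zero init

def lora_forward(x):
    # Effective weight is W + B @ A; only A and B_mat would be trained.
    return x @ (W + B_mat @ A).T

x = rng.standard_normal((2, d_in))
y = lora_forward(x)

trainable = A.size + B_mat.size
fraction = trainable / (W.size + trainable)
```

With rank r = 4 against a 64×64 backbone, the adapter holds about 11% of the parameters; at realistic model widths the fraction is far smaller, which is the "tiny fraction" the abstract refers to.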

Memory Wall is not gone: A Critical Outlook on Memory Architecture in Digital Neuromorphic Computing

This paper argues that despite the distributed architectures of digital neuromorphic processors designed to overcome the von Neumann bottleneck, the high area and energy costs of on-chip memory systems have created a new "memory wall" that threatens their competitiveness in edge applications, necessitating a re-evaluation of memory organization for future research.

Amirreza Yousefzadeh, Sameed Sohail, Ana Lucia Varbanescu · 2026-04-13 · cs

Hierarchical Kernel Transformer: Multi-Scale Attention with an Information-Theoretic Approximation Analysis

The Hierarchical Kernel Transformer (HKT) introduces a multi-scale attention mechanism with trainable causal downsampling that achieves consistent performance gains over standard attention baselines across diverse tasks while maintaining a bounded computational overhead of approximately 1.31x and providing rigorous theoretical guarantees on kernel properties, attention decomposition, and approximation error.

Giansalvo Cirrincione · 2026-04-13 · stat
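A coarse scale in multi-scale attention can be obtained by downsampling keys and values along the sequence axis before the attention product. The sketch below uses plain average pooling and omits causality and trainable downsampling weights, so it illustrates only the general multi-scale idea, not HKT's kernel construction:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def downsampled_attention(Q, K, V, stride=2):
    # Coarse scale: average-pool keys/values along the sequence
    # axis, then attend over the shortened sequence.
    T, d = K.shape
    T2 = T // stride
    Kc = K[:T2 * stride].reshape(T2, stride, d).mean(axis=1)
    Vc = V[:T2 * stride].reshape(T2, stride, d).mean(axis=1)
    attn = softmax(Q @ Kc.T / np.sqrt(d))
    return attn @ Vc

rng = np.random.default_rng(0)
Q = rng.standard_normal((8, 16))
K = rng.standard_normal((8, 16))
V = rng.standard_normal((8, 16))
out = downsampled_attention(Q, K, V)
```

Attending over T/stride pooled positions instead of T is where the bounded overhead of combining a coarse scale with standard attention comes from.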

Ge²mS-T: Multi-Dimensional Grouping for Ultra-High Energy Efficiency in Spiking Transformer

The paper proposes Ge²mS-T, a novel Spiking Vision Transformer architecture that employs multi-dimensional grouped computation and a Grouped-Exponential-Coding-based IF model to simultaneously optimize memory overhead, learning capability, and energy efficiency, overcoming the limitations of existing ANN-SNN conversion and STBP methods.

Zecheng Hao, Shenghao Xie, Kang Chen, Wenxuan Liu, Zhaofei Yu, Tiejun Huang · 2026-04-13 · cs.AI
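The IF (integrate-and-fire) model at the base of such architectures is simple: accumulate input into a membrane potential, emit a spike when it crosses a threshold, and reset by subtraction. The sketch below shows only this generic IF dynamic, not the paper's Grouped-Exponential-Coding variant:

```python
def if_neuron(inputs, threshold=1.0):
    """Generic integrate-and-fire neuron: accumulate input, spike and
    reset by subtraction when the potential crosses the threshold."""
    v, spikes = 0.0, []
    for x in inputs:
        v += x
        if v >= threshold:
            spikes.append(1)
            v -= threshold  # soft reset preserves residual charge
        else:
            spikes.append(0)
    return spikes

spikes = if_neuron([0.4, 0.4, 0.4, 0.4, 0.4])  # -> [0, 0, 1, 0, 1]
```

Because computation happens only on spike events, the firing rate directly controls energy use, which is why spike coding schemes are the main lever for efficiency in SNN transformers.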