cs.LG papers | Gist.Science

Memorization capacity of deep ReLU neural networks characterized by width and depth

This paper establishes the optimal trade-off between width and depth for deep ReLU neural networks to memorize $N$ separated data points, proving that the product of the squared width and squared depth must scale as $\Theta(N\log(\delta^{-1}))$ .

Xin Yang, Yunfei Yang2026-03-11🤖 cs.LG

MM-algorithms for traditional and convex NMF with Tweedie and Negative Binomial cost functions and empirical evaluation

This paper presents a unified framework for traditional and convex Non-negative Matrix Factorization (NMF) under Negative Binomial and Tweedie distributions, deriving novel multiplicative update rules via Majorize-Minimization and demonstrating through empirical evaluation that appropriate noise model selection and convex formulations significantly improve feature recovery in overdispersed data.

Elisabeth Sommer James, Asger Hobolth, Marta Pelizzola2026-03-11🤖 cs.LG

Learning the Hierarchical Organization in Brain Network for Brain Disorder Diagnosis

The paper proposes BrainHO, a novel framework that learns intrinsic hierarchical brain network dependencies from fMRI data using a hierarchical attention mechanism and orthogonality constraints, thereby achieving state-of-the-art diagnosis performance and uncovering interpretable biomarkers for brain disorders without relying on predefined sub-network labels.

Jingfeng Tang, Peng Cao, Guangqi Wen, Jinzhu Yang, Xiaoli Liu, Osmar R. Zaiane2026-03-11🤖 cs.LG

Multi-DNN Inference of Sparse Models on Edge SoCs

This paper introduces SparseLoom, a system that employs model stitching to recombine subgraphs from sparse models without re-training, thereby significantly improving throughput, reducing memory overhead, and lowering Service Level Objective violation rates for multi-DNN inference on edge SoCs compared to state-of-the-art systems.

Jiawei Luo, Di Wu, Simon Dobson, Blesson Varghese2026-03-11🤖 cs.LG

Evolution of Photonic Quantum Machine Learning under Noise

This review systematically analyzes noise sources in photonic quantum machine learning, examining their impact on algorithm performance and exploring characterization and mitigation strategies to guide the development of robust, scalable systems.

A. M. A. S. D. Alagiyawanna, Asoka Karunananda2026-03-11⚛️ quant-ph

Well Log-Guided Synthesis of Subsurface Images from Sparse Petrography Data Using cGANs

This paper presents a conditional Generative Adversarial Network (cGAN) framework that synthesizes realistic, continuous pore-scale images of carbonate rock formations by conditioning on well log-derived porosity values, effectively bridging gaps between sparse petrography samples to enhance reservoir characterization for energy transition applications.

Ali Sadeghkhani, A. Assadi, B. Bennett, A. Rabbani2026-03-11🤖 cs.LG

FreqCycle: A Multi-Scale Time-Frequency Analysis Method for Time Series Forecasting

FreqCycle is a novel multi-scale time-frequency analysis framework that improves time series forecasting by combining a Filter-Enhanced Cycle module for low-frequency patterns and a Segmented Frequency-domain module for mid-to-high frequencies, further extended to MFreqCycle to decouple coupled multi-periodicity, thereby achieving state-of-the-art accuracy with efficient inference.

Boya Zhang, Shuaijie Yin, Huiwen Zhu, Xing He2026-03-11🤖 cs.LG

No evaluation without fair representation : Impact of label and selection bias on the evaluation, performance and mitigation of classification models

This paper empirically analyzes the distinct impacts of label and selection bias on classification model evaluation and performance using a new framework for introducing controlled bias, revealing that fairness-accuracy trade-offs disappear when models are evaluated on unbiased data and demonstrating that the effectiveness of mitigation methods depends on the specific bias type present.

Magali Legast, Toon Calders, François Fouss2026-03-11🤖 cs.LG

GNNs for Time Series Anomaly Detection: An Open-Source Framework and a Critical Evaluation

This paper introduces an open-source framework for Graph Neural Network-based Time Series Anomaly Detection to enable reproducible experimentation and critical evaluation, demonstrating that GNNs enhance both detection performance and interpretability while highlighting the need for standardized metrics and thresholding strategies.

Federico Bello, Gonzalo Chiarlone, Marcelo Fiori, Gastón García González, Federico Larroca2026-03-11🤖 cs.AI

EsoLang-Bench: Evaluating Genuine Reasoning in Large Language Models via Esoteric Programming Languages

The paper introduces EsoLang-Bench, a novel benchmark utilizing esoteric programming languages to expose the limitations of large language models' genuine reasoning capabilities by revealing a dramatic performance gap between their high scores on standard benchmarks and near-zero accuracy on tasks requiring the acquisition of new languages through documentation and experimentation rather than memorization.

Aman Sharma, Paras Chopra2026-03-11🤖 cs.AI

On Catastrophic Forgetting in Low-Rank Decomposition-Based Parameter-Efficient Fine-Tuning

This paper empirically demonstrates that catastrophic forgetting in low-rank decomposition-based parameter-efficient fine-tuning is primarily driven by update subspace geometry, revealing that tensor-based and structurally aligned methods outperform traditional shared matrix approaches in sequential learning scenarios.

Muhammad Ahmad, Jingjing Zheng, Yankai Cao2026-03-11🤖 cs.LG

ActiveUltraFeedback: Efficient Preference Data Generation using Active Learning

The paper introduces ActiveUltraFeedback, an efficient active learning pipeline that leverages uncertainty estimates and novel selection strategies like Double Reverse Thompson Sampling to generate high-quality preference data, enabling Large Language Models to achieve superior alignment performance with as little as one-sixth of the annotated data required by static baselines.

Davit Melikidze, Marian Schneider, Jessica Lam, Martin Wertich, Ido Hakimi, Barna Pásztor, Andreas Krause2026-03-11🤖 cs.AI

Physics-informed neural operator for predictive parametric phase-field modelling

This paper introduces PF-PINO, a physics-informed neural operator framework that embeds phase-field governing equation residuals into the training loss to significantly improve the accuracy, generalization, and long-term stability of predictive parametric phase-field modelling compared to conventional methods.

Nanxi Chen, Airong Chen, Rujin Ma2026-03-11🔬 cond-mat.mtrl-sci

Mousse: Rectifying the Geometry of Muon with Curvature-Aware Preconditioning

Mousse is a novel optimizer that improves upon the Muon algorithm by integrating Shampoo's Kronecker-factored preconditioning to adaptively handle the heavy-tailed curvature of deep neural networks, thereby achieving faster training convergence with negligible computational overhead.

Yechen Zhang, Shuhao Xing, Junhao Huang, Kai Lv, Yunhua Zhou, Xipeng Qiu, Qipeng Guo, Kai Chen2026-03-11🤖 cs.AI

A Multi-Prototype-Guided Federated Knowledge Distillation Approach in AI-RAN Enabled Multi-Access Edge Computing System

This paper proposes a Multi-Prototype-Guided Federated Knowledge Distillation (MP-FedKD) approach for AI-RAN enabled Multi-Access Edge Computing systems, which addresses non-IID data challenges and mitigates information loss from single-prototype averaging by integrating self-knowledge distillation, a conditional hierarchical agglomerative clustering strategy, and a novel loss function to outperform state-of-the-art baselines in accuracy and error metrics.

Luyao Zou, Hayoung Oh, Chu Myaet Thwal, Apurba Adhikary, Seohyeon Hong, Zhu Han2026-03-11🤖 cs.LG

Upper Generalization Bounds for Neural Oscillators

This paper derives upper PAC generalization bounds for neural oscillators based on second-order ODEs and MLPs, demonstrating that their estimation errors grow polynomially with model size and time while showing that constraining MLP Lipschitz constants via regularization enhances generalization performance in modeling nonlinear structural systems.

Zifeng Huang, Konstantin M. Zuev, Yong Xia, Michael Beer2026-03-11🤖 cs.LG

Global universality via discrete-time signatures

This paper establishes global universal approximation theorems for path-dependent functionals on spaces of piecewise linear paths using linear functionals of discrete-time signatures, demonstrating their applicability to Brownian motion-driven systems such as random and stochastic ordinary differential equations.

Mihriban Ceylan, David J. Prömel2026-03-11🤖 cs.LG

What is Missing? Explaining Neurons Activated by Absent Concepts

This paper identifies that deep neural networks frequently encode the absence of concepts to drive neuron activation—a phenomenon largely overlooked by standard explainable AI methods—and proposes simple extensions to attribution and feature visualization techniques to effectively reveal and leverage these "missing" concepts for better model interpretation and debiasing.

Robin Hesse, Simone Schaub-Meyer, Janina Hesse, Bernt Schiele, Stefan Roth2026-03-11🤖 cs.LG

A Hybrid Quantum-Classical Framework for Financial Volatility Forecasting Based on Quantum Circuit Born Machines

This paper proposes and validates a hybrid quantum-classical framework that integrates a Long Short-Term Memory (LSTM) network with a Quantum Circuit Born Machine (QCBM) to significantly improve financial volatility forecasting accuracy on high-frequency stock market data compared to traditional classical models.

Yixiong Chen2026-03-11⚛️ quant-ph

Exploiting Label-Aware Channel Scoring for Adaptive Channel Pruning in Split Learning

This paper proposes ACP-SL, an adaptive channel pruning scheme for Split Learning that utilizes a label-aware channel importance scoring module to compress smashed data, thereby significantly reducing communication overhead while improving test accuracy and training efficiency.

Jialei Tan, Zheng Lin, Xiangming Cai, Ruoxi Zhu, Zihan Fang, Pingping Chen, Wei Ni2026-03-11🤖 cs.AI

← Previous Next →