From ARIMA to Attention: Power Load Forecasting Using Temporal Deep Learning
This paper empirically demonstrates that a Transformer model using self-attention outperforms a traditional ARIMA baseline and recurrent neural network approaches (LSTM, BiLSTM) in short-term power load forecasting on PJM data. The Transformer achieves a mean absolute percentage error (MAPE) of 3.8%, the lowest among the compared models, highlighting the effectiveness of attention-based architectures for capturing complex temporal dependencies.
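To make the two building blocks named above concrete, the sketch below implements scaled dot-product self-attention (the core of the Transformer) over a window of hourly load embeddings, together with the MAPE metric used for evaluation. This is a minimal NumPy illustration under assumed shapes and made-up values, not the paper's model or data.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V."""
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)
    # Numerically stable softmax over the key dimension
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(np.abs((actual - forecast) / actual))

# Hypothetical window: 24 hourly time steps, each an 8-dim embedding
rng = np.random.default_rng(0)
window = rng.normal(size=(24, 8))
out, w = scaled_dot_product_attention(window, window, window)
assert out.shape == (24, 8)
assert np.allclose(w.sum(axis=-1), 1.0)  # each row of weights is a distribution

# MAPE on made-up actual/forecast load values (MW)
print(round(mape([100, 200, 400], [96, 208, 412]), 2))  # → 3.67
```

In self-attention each time step attends to every other step in the window, which is what lets the model weigh distant hours (e.g., the same hour yesterday) directly rather than through a recurrent state.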