cs.AI papers | Gist.Science

Sparse Variational Student-t Processes for Heavy-tailed Modeling

This paper introduces Sparse Variational Student-t Processes (SVTP), a scalable framework that extends sparse inducing point methods to Student-t processes via novel inference algorithms and natural gradient optimization, achieving superior robustness to outliers and heavy-tailed data with significantly faster convergence and lower prediction error compared to sparse Gaussian processes on large datasets.

Jian Xu, Delu Zeng, John Paisley2026-03-11🤖 cs.AI

Robust Training of Neural Networks at Arbitrary Precision and Sparsity

This paper introduces a unified framework that models quantization and sparsification as additive noise to derive a principled, noise-corrective gradient path, enabling the stable training of neural networks at arbitrary low precisions and sparsity levels without relying on heuristic estimators like the Straight-Through Estimator.

Chengxi Ye, Grace Chu, Yanfeng Liu, Yichi Zhang, Lukasz Lew, Li Zhang, Mark Sandler, Andrew Howard2026-03-11🤖 cs.AI

DRUPI: Dataset Reduction Using Privileged Information

The paper introduces DRUPI (Dataset Condensation using Privileged Information), a framework that enhances dataset condensation by synthesizing auxiliary privileged information, such as feature or attention labels, alongside reduced data to significantly improve model training performance across various benchmarks.

Shaobo Wang, Youxin Jiang, Tianle Niu, Yantai Yang, Ruiji Zhang, Shuhao Hu, Shuaiyu Zhang, Chenghao Sun, Weiya Li, Conghui He, Xuming Hu, Linfeng Zhang2026-03-11🤖 cs.AI

LAYOUTDREAMER: Physics-guided Layout for Text-to-3D Compositional Scene Generation

LayoutDreamer is a novel framework that leverages 3D Gaussian Splatting and physics-guided scene graphs to generate high-quality, physically plausible, and controllable text-to-3D compositional scenes, achieving state-of-the-art performance in multi-object generation.

Yang Zhou, Zongjin He, Qixuan Li + 1 more2026-03-11🤖 cs.AI

Astromer 2

This paper introduces Astromer 2, an enhanced foundational model for light curve analysis that leverages self-supervised pre-training on 1.5 million MACHO survey observations and weighted per-sample embeddings to significantly outperform its predecessor and prior models in classification tasks, particularly when trained with limited labeled data.

Cristobal Donoso-Oliva, Ignacio Becker, Pavlos Protopapas + 3 more2026-03-11🔭 astro-ph

On the Impact of the Utility in Semivalue-based Data Valuation

This paper addresses the sensitivity of semivalue-based data valuation to utility selection by introducing a "spatial signature" framework that embeds data points into a lower-dimensional space, enabling a practical methodology to quantify and ensure the robustness of valuation results against utility variations.

Mélissa Tamine, Benjamin Heymann, Maxime Vono, Patrick Loiseau2026-03-11🤖 cs.AI

MKE-Coder: Multi-Axial Knowledge with Evidence Verification in ICD Coding for Chinese EMRs

This paper presents MKE-Coder, a novel framework that improves automatic ICD coding for Chinese electronic medical records by integrating multi-axial disease knowledge with a clinical evidence verification module to address challenges in information extraction and code validity.

Xinxin You, Xien Liu, Xue Yang, Ziyi Wang, Ji Wu2026-03-11🤖 cs.AI

LLM-Advisor: An LLM Benchmark for Cost-efficient Path Planning across Multiple Terrains

The paper introduces LLM-Advisor, a prompt-based framework that leverages large language models as non-decisive post-processing advisors to significantly improve the cost efficiency of path planning across diverse terrains without modifying underlying planners, while addressing hallucination risks and demonstrating superior performance over zero-shot LLM approaches.

Ling Xiao, Toshihiko Yamasaki2026-03-11🤖 cs.AI

HyConEx: Hypernetwork classifier with counterfactual explanations for tabular data

The paper introduces HyConEx, a novel deep hypernetwork-based classifier for tabular data that uniquely integrates prediction and explanation by simultaneously generating class labels and local counterfactual examples to interpret model decisions.

Patryk Marszałek, Kamil Ksi\k{a}\.zek, Oleksii Furman, Ulvi Movsum-zada, Przemysław Spurek, Marek Smieja2026-03-11🤖 cs.AI

Logic Explanation of AI Classifiers by Categorical Explaining Functors

This paper proposes a theoretically grounded approach using category theory to introduce an "explaining functor" that structurally preserves logical entailment, thereby ensuring the consistency and fidelity of logic-based explanations for opaque AI models and overcoming the limitations of current heuristic post-hoc methods.

Stefano Fioravanti, Francesco Giannini, Paolo Frazzetto + 2 more2026-03-11🤖 cs.AI

GateLens: A Reasoning-Enhanced LLM Agent for Automotive Software Release Analytics

GateLens is a reasoning-enhanced LLM agent that utilizes Relational Algebra as a formal intermediate representation to bridge the gap between natural language and executable code, enabling fast, transparent, and highly accurate analysis of complex tabular data in automotive software release analytics without requiring few-shot examples or complex agent orchestration.

Arsham Gholamzadeh Khoee, Shuai Wang, Robert Feldt, Dhasarathy Parthasarathy, Yinan Yu2026-03-11🤖 cs.AI

A Consequentialist Critique of Binary Classification Evaluation: Theory, Practice, and Tools

This paper critiques the prevalent reliance on fixed-threshold metrics in machine learning evaluation by advocating for a consequentialist framework that prioritizes proper scoring rules like the Brier score, supported by a new decision-theoretic mapping, a practical Python package called `briertools`, and a clipped Brier score variant to bridge the gap between theoretical utility and current practices.

Gerardo Flores, Abigail Schiff, Alyssa H. Smith, Julia A Fukuyama, Ashia C. Wilson2026-03-11🤖 cs.AI

MCP Bridge: A Lightweight, LLM-Agnostic RESTful Proxy for Model Context Protocol Servers

This paper introduces MCP Bridge, a lightweight, LLM-agnostic RESTful proxy that enables Model Context Protocol servers to run in resource-constrained environments with enhanced security, while also presenting a fine-tuned Qwen3 model that achieves state-of-the-art performance on the MCPToolBench++ benchmark through advanced reinforcement learning techniques.

Arash Ahmadi, Sarah Sharif, Yaser M. Banad2026-03-11🤖 cs.AI

Stepwise Guided Policy Optimization: Coloring your Incorrect Reasoning in GRPO

This paper introduces Stepwise Guided Policy Optimization (SGPO), a framework that enhances Group Relative Policy Optimization (GRPO) by utilizing a step-wise judge model to provide learning signals from all-negative sample groups, thereby enabling large language models to learn from incorrect reasoning and improving performance across various reasoning benchmarks.

Peter Chen, Xiaopeng Li, Ziniu Li, Xi Chen, Tianyi Lin2026-03-11🤖 cs.AI

Let's Verify Math Questions Step by Step

This paper introduces MathQ-Verify, a novel five-stage pipeline that rigorously filters ill-posed or under-specified mathematical questions through format validation, formalization, contradiction detection, and completeness checks, achieving state-of-the-art performance in curating reliable datasets for training large language models.

Chengyu Shen, Zhen Hao Wong, Runming He, Hao Liang, Meiyi Qiang, Zimo Meng, Zhengyang Zhao, Bohan Zeng, Zhengzhou Zhu, Bin Cui, Wentao Zhang2026-03-11🤖 cs.AI

UltraEdit: Training-, Subject-, and Memory-Free Lifelong Editing in Language Models

The paper introduces UltraEdit, a training-, subject-, and memory-free approach for lifelong language model editing that achieves unprecedented scalability and efficiency by computing parameter shifts in a single step, enabling 7B models to be edited on consumer GPUs with over 2 million updates while outperforming existing methods in speed, memory usage, and accuracy.

Xiaojie Gu, Ziying Huang, Jia-Chen Gu, Kai Zhang2026-03-11🤖 cs.AI

SATURN: SAT-based Reinforcement Learning to Unleash LLMs Reasoning

The paper introduces Saturn, a reinforcement learning framework that leverages Boolean Satisfiability (SAT) problems to overcome scalability, verifiability, and difficulty control limitations in training large language models, resulting in significant reasoning improvements across SAT, math, and programming benchmarks.

Huanyu Liu, Ge Li, Jia Li, Hao Zhu, Kechi Zhang, Yihong Dong2026-03-11🤖 cs.AI

Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities

The paper introduces Daily-Omni, a comprehensive audio-visual benchmark with 1,197 questions designed to evaluate cross-modal temporal reasoning, revealing that current multimodal large language models still struggle with alignment-critical tasks despite strong unimodal performance.

Ziwei Zhou, Rui Wang, Zuxuan Wu, Yu-Gang Jiang2026-03-11🤖 cs.AI

Embodied AI with Foundation Models for Mobile Service Robots: A Systematic Review

This paper presents the first systematic review of integrating foundation models into mobile service robotics, analyzing how these technologies address core challenges in perception and control, enabling applications in domestic and healthcare settings while discussing ethical implications and outlining future directions for safe, scalable, and trustworthy deployment.

Matthew Lisondra, Beno Benhabib, Goldie Nejat2026-03-11💬 cs.CL

Rating Quality of Diverse Time Series Data by Meta-learning from LLM Judgment

The paper proposes TSRating, a novel meta-learning framework that leverages LLMs to generate quality comparisons across diverse domains and trains an efficient TSRater model to accurately and adaptively evaluate time series data quality without requiring extensive hypergradient computations.

Shunyu Wu, Dan Li, Wenjie Feng, Haozheng Ye, Jian Lou, See-Kiong Ng2026-03-11🤖 cs.AI

← Previous Next →