cs.LG papers | Gist.Science

Hospitality-VQA: Decision-Oriented Informativeness Evaluation for Vision-Language Models

This paper introduces a formal framework for "informativeness" and a corresponding hospitality-specific VQA dataset to evaluate Vision-Language Models, revealing that while current models struggle with decision-oriented reasoning, their performance significantly improves with modest domain-specific finetuning.

Jeongwoo Lee, Baek Duhyeong, Eungyeol Han, Soyeon Shin, Gukin han, Seungduk Kim, Jaehyun Jeon, Taewoo Jeong2026-03-10🤖 cs.LG

Toward Unified Multimodal Representation Learning for Autonomous Driving

This paper proposes a Contrastive Tensor Pre-training (CTP) framework that replaces traditional pairwise similarity alignment with a joint tensor-based approach to unify multiple modalities in a single embedding space, thereby enhancing scene understanding and end-to-end performance in autonomous driving.

Ximeng Tao, Dimitar Filev, Gaurav Pandey2026-03-10🤖 cs.LG

Reject, Resample, Repeat: Understanding Parallel Reasoning in Language Model Inference

This paper introduces a particle filtering framework to rigorously analyze the accuracy-cost tradeoffs of parallel inference methods in large language models, establishing theoretical guarantees and identifying fundamental limits while demonstrating that sampling error alone does not fully predict final model accuracy.

Noah Golowich, Fan Chen, Dhruv Rohatgi, Raghav Singhal, Carles Domingo-Enrich, Dylan J. Foster, Akshay Krishnamurthy2026-03-10🤖 cs.LG

VLM-SubtleBench: How Far Are VLMs from Human-Level Subtle Comparative Reasoning?

This paper introduces VLM-SubtleBench, a comprehensive benchmark spanning ten fine-grained difference types across diverse domains like industrial, medical, and aerial imagery, to evaluate and reveal the significant performance gaps between current vision-language models and humans in subtle comparative reasoning tasks.

Minkyu Kim, Sangheon Lee, Dongmin Park2026-03-10🤖 cs.LG

Designing probabilistic AI monsoon forecasts to inform agricultural decision-making

This paper presents a decision-theory framework and a blended AI-statistical forecasting system that successfully delivered skillful, tailored monsoon onset predictions to 38 million Indian farmers in 2025, enabling better agricultural decision-making under uncertainty.

Colin Aitken, Rajat Masiwal, Adam Marchakitus, Katherine Kowal, Mayank Gupta, Tyler Yang, Amir Jina, Pedram Hassanzadeh, William R. Boos, Michael Kremer2026-03-10🤖 cs.LG

SMGI: A Structural Theory of General Artificial Intelligence

This paper introduces SMGI, a structural theory of general artificial intelligence that formalizes learning as the controlled evolution of a typed meta-model, unifying diverse existing approaches under a rigorous framework defined by structural closure, dynamical stability, bounded capacity, and evaluative invariance.

Aomar Osmani2026-03-10🤖 cs.LG

LeJOT-AutoML: LLM-Driven Feature Engineering for Job Execution Time Prediction in Databricks Cost Optimization

LeJOT-AutoML is an LLM-driven AutoML framework that automates feature engineering for Databricks job execution time prediction by synthesizing runtime-derived features from diverse artifacts, reducing the development cycle from weeks to minutes and achieving 19.01% cost savings through improved orchestration.

Lizhi Ma, Yi-Xiang Hu, Yihui Ren, Feng Wu, Xiang-Yang Li2026-03-10🤖 cs.LG

Revisiting Unknowns: Towards Effective and Efficient Open-Set Active Learning

This paper introduces E $^2$ OAL, a unified and detector-free framework for open-set active learning that leverages labeled unknowns through label-guided clustering and a Dirichlet-calibrated auxiliary head to achieve superior accuracy, efficiency, and query precision compared to existing state-of-the-art methods.

Chen-Chen Zong, Yu-Qi Chi, Xie-Yang Wang, Yan Cui, Sheng-Jun Huang2026-03-10🤖 cs.LG

Bayesian Transformer for Probabilistic Load Forecasting in Smart Grids

This paper proposes a Bayesian Transformer framework that integrates Monte Carlo Dropout, variational feed-forward layers, and stochastic attention into a PatchTST backbone to deliver well-calibrated, state-of-the-art probabilistic load forecasts with robust uncertainty estimates under extreme weather distributional shifts across multiple global power grids.

Sajib Debnath, Md. Uzzal Mia2026-03-10🤖 cs.LG

NaviDriveVLM: Decoupling High-Level Reasoning and Motion Planning for Autonomous Driving

NaviDriveVLM proposes a decoupled framework that separates high-level reasoning and motion planning using a large-scale Navigator and a lightweight Driver, achieving superior end-to-end performance on the nuScenes benchmark while reducing training costs and enhancing interpretability.

Ximeng Tao, Pardis Taghavi, Dimitar Filev, Reza Langari, Gaurav Pandey2026-03-10🤖 cs.LG

DyQ-VLA: Temporal-Dynamic-Aware Quantization for Embodied Vision-Language-Action Models

DyQ-VLA is a dynamic quantization framework for Embodied Vision-Language-Action models that leverages real-time kinematic proxies to adaptively switch and allocate bit-widths, significantly reducing memory footprint and improving inference speed while maintaining near-original performance.

Zihao Zheng, Hangyu Cao, Sicheng Tian, Jiayu Chen, Maoliang Li, Xinhao Sun, Hailong Zou, Zhaobo Zhang, Xuanzhe Liu, Donggang Cao, Hong Mei, Xiang Chen2026-03-10🤖 cs.LG

Rel-MOSS: Towards Imbalanced Relational Deep Learning on Relational Databases

This paper introduces Rel-MOSS, a novel relation-centric deep learning framework that addresses the critical issue of class imbalance in relational databases by employing a relation-wise gating controller and a relation-guided minority synthesizer to enhance the representation and over-sampling of minority entities, thereby significantly outperforming existing methods in entity classification tasks.

Jun Yin, Peng Huo, Bangguo Zhu, Hao Yan, Senzhang Wang, Shirui Pan, Chengqi Zhang2026-03-10🤖 cs.LG

Robust Transfer Learning with Side Information

This paper proposes a robust transfer learning framework that leverages side information to construct estimate-centered uncertainty sets, thereby reducing the conservatism of standard distributionally robust optimization and improving policy performance under environmental shifts with limited target data.

Akram S. Awad, Shihab Ahmed, Yue Wang, George K. Atia2026-03-10🤖 cs.LG

Semantic Risk Scoring of Aggregated Metrics: An AI-Driven Approach for Healthcare Data Governance

This paper proposes an AI-driven framework that leverages CodeBERT embeddings and XGBoost to perform static, explainable risk scoring of SQL-based metric definitions, enabling healthcare institutions to proactively identify and mitigate privacy risks in aggregated data before deployment while ensuring regulatory compliance and auditability.

Mohammed Omer Shakeel Ahmed2026-03-10🤖 cs.LG

ELLMob: Event-Driven Human Mobility Generation with Self-Aligned LLM Framework

This paper introduces ELLMob, a self-aligned Large Language Model framework that leverages Fuzzy-Trace Theory to reconcile habitual patterns with event constraints, addressing the lack of event-annotated datasets and significantly improving the generation of human mobility trajectories during major societal events like typhoons, pandemics, and the Olympics.

Yusong Wang, Chuang Yang, Jiawei Wang, Xiaohang Xu, Jiayi Xu, Dongyuan Li, Chuan Xiao, Renhe Jiang2026-03-10🤖 cs.LG

RL unknotter, hard unknots and unknotting number

This paper introduces a reinforcement learning pipeline that successfully simplifies complex knot diagrams and determines the unknotting number of the composite knot $4_1\#9_{10}$ as three, confirming a recently established upper bound.

Anne Dranowski, Yura Kabkov, Daniel Tubbenhauer2026-03-10🤖 cs.LG

PSTNet: Physically-Structured Turbulence Network

This paper introduces PSTNet, a lightweight, physics-structured neural network with only 552 parameters that embeds atmospheric turbulence scaling laws directly into its architecture to enable accurate, real-time turbulence estimation on resource-constrained aircraft guidance systems where traditional models fail.

Boris Kriuk, Fedor Kriuk2026-03-10🤖 cs.LG

Local Constrained Bayesian Optimization

This paper introduces Local Constrained Bayesian Optimization (LCBO), a novel framework that overcomes the curse of dimensionality in high-dimensional constrained problems by alternating between local descent and uncertainty-driven exploration, achieving polynomial convergence rates and outperforming state-of-the-art methods on benchmarks up to 100 dimensions.

Jing Jingzhe, Fan Zheyi, Szu Hui Ng, Qingpei Hu2026-03-10🤖 cs.LG

Scaling Machine Learning Interatomic Potentials with Mixtures of Experts

This paper introduces Mixture-of-Experts (MoE) and Mixture-of-Linear-Experts (MoLE) architectures for Machine Learning Interatomic Potentials, demonstrating that element-wise routing with shared nonlinear experts achieves state-of-the-art accuracy across multiple benchmarks while revealing chemically interpretable specialization aligned with periodic-table trends.

Yuzhi Liu, Duo Zhang, Anyang Peng, Weinan E, Linfeng Zhang, Han Wang2026-03-10🤖 cs.LG

$OneMillion-Bench: How Far are Language Agents from Human Experts?

The paper introduces $OneMillion-Bench, a novel benchmark comprising 400 expert-curated tasks across five professional domains designed to rigorously evaluate the reliability, reasoning depth, and practical readiness of language agents in complex, real-world scenarios that existing benchmarks fail to address.

Qianyu Yang, Yang Liu, Jiaqi Li, Jun Bai, Hao Chen, Kaiyuan Chen, Tiliang Duan, Jiayun Dong, Xiaobo Hu, Zixia Jia, Yang Liu, Tao Peng, Yixin Ren, Ran Tian, Zaiyuan Wang, Yanglihong Xiao, Gang Yao, Lingyue Yin, Ge Zhang, Chun Zhang, Jianpeng Jiao, Zilong Zheng, Yuan Gong2026-03-10🤖 cs.LG

← Previous Next →