cs.AI papers | Gist.Science

ResearchEnvBench: Benchmarking Agents on Environment Synthesis for Research Code Execution

The paper introduces ResearchEnvBench, a new benchmark designed to evaluate autonomous agents' ability to synthesize complex execution environments for research code, revealing significant current limitations in dependency resolution and version management.

Yubang Wang, Chenxi Zhang, Bowen Chen, Zezheng Huai, Zihao Dai, Xinchi Chen, Yuxin Wang, Yining Zheng, Jingjing Gong, Xipeng Qiu | 2026-03-10 | cs

ViroGym: Realistic Large-Scale Benchmarks for Evaluating Viral Proteins

The paper introduces ViroGym, a comprehensive benchmark comprising extensive deep mutational scanning data and real-world tasks to evaluate protein language models for predicting viral variant effects and guiding rational antigen selection for vaccine development.

Yichen Zhou, Jonathan Golob, Amir Karimi, Stefan Bauer, Patrick Schwab | 2026-03-10 | cs

Heterogeneous Decentralized Diffusion Models

This paper introduces an efficient framework for heterogeneous decentralized diffusion models that enables experts to train with mixed objectives (DDPM and Flow Matching) and reduced resource requirements, achieving a 16x decrease in compute and 14x reduction in data compared to prior approaches while improving image quality and diversity.

Zhiying Jiang, Raihan Seraj, Marcos Villagra, Bidhan Roy | 2026-03-10 | cs.LG

Improved Constrained Generation by Bridging Pretrained Generative Models

This paper proposes a framework that fine-tunes pretrained generative models to directly sample within complex, structured feasible regions, achieving a novel balance between strict constraint satisfaction and high-quality sample realism for safety-critical applications like robotics and autonomous driving.

Xiaoxuan Liang, Saeid Naderiparizi, Yunpeng Liu, Berend Zwartsenberg, Frank Wood | 2026-03-10 | cs.LG

Stabilizing Reinforcement Learning for Diffusion Language Models

This paper identifies that applying Group Relative Policy Optimization (GRPO) to diffusion language models causes reward collapse due to noisy importance ratio estimates and formulation mismatches, and proposes StableDRL, a reformulated algorithm featuring unconditional clipping and self-normalization to stabilize training and prevent policy drift.

Jianyuan Zhong, Kaibo Wang, Ding Ding, Zijin Feng, Haoli Bai, Yang Xiang, Jiacheng Sun, Qiang Xu | 2026-03-10 | cs.LG

Enhancing Instruction Following of LLMs via Activation Steering with Dynamic Rejection

The paper introduces DIRECTER, a novel activation steering method that dynamically modulates steering strength through a plausibility-guided decoding loop and layer sensitivity analysis to enhance LLM instruction-following accuracy while preventing the oversteering that typically degrades text quality.

Minjae Kang, Jaehyung Kim | 2026-03-10 | cs.LG

ButterflyViT: 354× Expert Compression for Edge Vision Transformers

ButterflyViT introduces a geometric parameterization method that treats Mixture of Experts as rotations of a shared quantized substrate, achieving a 354× memory reduction for Vision Transformers on edge devices while maintaining accuracy through spatial smoothness regularization.

Aryan Karmore | 2026-03-10 | cs

Property-driven Protein Inverse Folding With Multi-Objective Preference Alignment

This paper introduces ProtAlign, a multi-objective preference alignment framework that fine-tunes pretrained inverse folding models to simultaneously optimize diverse developability properties like solubility and thermostability while preserving structural designability, resulting in the enhanced MoMPNN model for practical protein sequence design.

Xiaoyang Hou, Junqi Liu, Chence Shi, Xin Liu, Zhi Yang, Jian Tang | 2026-03-10 | cs.LG

Robotic Foundation Models for Industrial Control: A Comprehensive Survey and Readiness Assessment Framework

This paper surveys the landscape of robotic foundation models, identifies eleven key industrial implications to establish a 149-criteria assessment framework, and evaluates 324 models to reveal that current industrial readiness is limited and uneven, necessitating a shift from isolated benchmark successes to systematic integration of safety, real-time performance, and robust system deployment.

David Kube, Simon Hadwiger, Tobias Meisen | 2026-03-10 | cs

XMACNet: An Explainable Lightweight Attention based CNN with Multi Modal Fusion for Chili Disease Classification

This paper introduces XMACNet, an explainable, lightweight CNN that combines self-attention mechanisms with multi-modal fusion of RGB images and vegetation indices to achieve high-accuracy chili disease classification suitable for edge deployment.

Tapon Kumer Ray, Rajkumar Y, Shalini R, Srigayathri K, Jayashree S, Lokeswari P | 2026-03-10 | cs

Learning Unbiased Cluster Descriptors for Interpretable Imbalanced Concept Drift Detection

This paper proposes ICD3, an interpretable and robust approach for detecting concept drift in imbalanced streaming data by employing multi-distribution-granular search to identify small concepts and training independent One-Cluster Classifiers for each, thereby overcoming the masking effect of dominant large clusters.

Yiqun Zhang, Zhanpei Huang, Mingjie Zhao, Chuyao Zhang, Yang Lu, Yuzhu Ji, Fangqing Gu, An Zeng | 2026-03-10 | cs.LG

Enhancing SHAP Explainability for Diagnostic and Prognostic ML Models in Alzheimer Disease

This paper proposes and validates a multi-level explainability framework demonstrating that SHAP explanations for Alzheimer's disease diagnostic and prognostic models are robust, stable, and consistent across different disease stages and prediction tasks, thereby enhancing their reliability for clinical adoption.

Pablo Guillén, Enrique Frias-Martinez | 2026-03-10 | cs.LG

Gradient-based Nested Co-Design of Aerodynamic Shape and Control for Winged Robots

This paper introduces a general-purpose, gradient-based nested co-design framework that jointly optimizes the aerodynamic shape and motion planner of winged robots using neural surrogate models for complex flow conditions, demonstrating superior performance and efficiency over evolutionary baselines in tasks like perching and short landing.

Daniele Affinita, Mingda Xu, Benoît Valentin Gherardi, Pascal Fua | 2026-03-10 | cs

Diversity-Aware Adaptive Collocation for Physics-Informed Neural Networks via Sparse QUBO Optimization and Hybrid Coresets

This paper proposes a diversity-aware adaptive collocation method for Physics-Informed Neural Networks that formulates point selection as a sparse QUBO optimization problem on a kNN graph to efficiently construct hybrid coreset subsets, thereby reducing training redundancy and overhead while improving accuracy on PDEs with shock formation.

Hadi Salloum, Maximilian Mifsud Bonici, Sinan Ibrahim, Pavel Osinenko, Alexei Kornaev | 2026-03-10 | cs.LG

Failure Detection in Chemical Processes using Symbolic Machine Learning: A Case Study on Ethylene Oxidation

This paper demonstrates that symbolic machine learning can effectively predict failures in chemical processes, such as ethylene oxidation, by generating interpretable, rule-based models that outperform traditional black-box methods while addressing the scarcity of real-world failure data through simulator-generated examples.

Julien Amblard, Niklas Groll, Matthew Tait, Mark Law, Gürkan Sin, Alessandra Russo | 2026-03-10 | cs.LG

HGT-Scheduler: Deep Reinforcement Learning for the Job Shop Scheduling Problem via Heterogeneous Graph Transformers

This paper proposes HGT-Scheduler, a deep reinforcement learning framework that utilizes Heterogeneous Graph Transformers to explicitly model the distinct edge semantics of the Job Shop Scheduling Problem, thereby outperforming homogeneous graph baselines on benchmark instances by capturing type-specific relational patterns through edge-type-dependent attention mechanisms.

Bulent Soykan | 2026-03-10 | cs.LG

SpatialMAGIC: A Hybrid Framework Integrating Graph Diffusion and Spatial Attention for Spatial Transcriptomics Imputation

SpatialMAGIC is a hybrid framework that integrates graph diffusion and transformer-based spatial self-attention to effectively impute sparse and noisy spatial transcriptomics data, thereby enhancing clustering accuracy, improving gene detection, and preserving biological interpretability across multiple high-resolution platforms.

Sayeem Bin Zaman, Fahim Hafiz, Riasat Azim | 2026-03-10 | cs.LG

xaitimesynth: A Python Package for Evaluating Attribution Methods for Time Series with Synthetic Ground Truth

The paper introduces xaitimesynth, an open-source Python package that streamlines the evaluation of time series attribution methods by providing a reusable infrastructure for generating synthetic datasets with known ground truth masks and calculating standard localization metrics.

Gregor Baer | 2026-03-10 | cs.LG

Physics-Informed Diffusion Model for Generating Synthetic Extreme Rare Weather Events Data

To address the critical data scarcity of extreme rare weather events that hinders robust machine learning models, this paper proposes a physics-informed diffusion model based on Context-UNet that generates physically consistent, multi-spectral synthetic satellite imagery conditioned on key atmospheric parameters, thereby effectively mitigating extreme class imbalance and enhancing operational weather detection algorithms.

Marawan Yakout, Tannistha Maiti, Monira Majhabeen, Tarry Singh | 2026-03-10 | cs.LG

Optimistic Policy Regularization

This paper introduces Optimistic Policy Regularization (OPR), a lightweight mechanism that preserves and reinforces historically successful trajectories to prevent premature convergence, thereby significantly improving sample efficiency and final performance in deep reinforcement learning across Atari and cyber-defense benchmarks.

Mai Pham, Vikrant Vaze, Peter Chin | 2026-03-10 | cs.LG
