cs.AI papers | Gist.Science

Interpretable Motion-Attentive Maps: Spatio-Temporally Localizing Concepts in Video Diffusion Transformers

This paper introduces GramCol and a motion-feature selection algorithm to generate Interpretable Motion-Attentive Maps (IMAPs) that effectively localize both motion and non-motion concepts in Video Diffusion Transformers without requiring gradient calculations or parameter updates.

Youngjun Jun, Seil Kang, Woojung Han, Seong Jae Hwang | Tue, 10 Ma | 🤖 cs.LG

Why Adam Can Beat SGD: Second-Moment Normalization Yields Sharper Tails

This paper provides the first theoretical proof that Adam's second-moment normalization yields significantly sharper high-probability convergence guarantees ($\delta^{-1/2}$ dependence) compared to SGD ($\delta^{-1}$ dependence) under the classical bounded variance model, thereby explaining its empirical superiority.

Ruinan Jin, Yingbin Liang, Shaofeng Zou | Tue, 10 Ma | 🤖 cs.LG
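For readers unfamiliar with the term, the "second-moment normalization" mentioned in the summary above can be illustrated with the textbook Adam update next to plain SGD. This is a minimal scalar sketch of the standard update rules, not the paper's analysis; the function names, the `state` dictionary, and the hyperparameter defaults are illustrative assumptions.

```python
import math

def sgd_step(param, grad, lr=0.1):
    """Plain SGD: the step size scales linearly with the gradient."""
    return param - lr * grad

def new_state():
    """Fresh Adam state for a single scalar parameter (hypothetical helper)."""
    return {"t": 0, "m": 0.0, "v": 0.0}

def adam_step(param, grad, state, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Textbook Adam step for one scalar parameter."""
    state["t"] += 1
    # First moment: exponential moving average of the gradient.
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    # Second moment: exponential moving average of the squared gradient.
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    # Bias correction for the zero-initialized averages.
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    # Dividing by sqrt(v_hat) is the second-moment normalization.
    return param - lr * m_hat / (math.sqrt(v_hat) + eps)
```

On the first step the normalized Adam update has magnitude close to `lr` whether the gradient is 1000 or 0.001, while the SGD step grows in proportion to the gradient. This scale-insensitivity is the kind of behavior the summary's tail bounds concern, though the sketch itself proves nothing about convergence.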

Information Routing in Atomistic Foundation Models: How Task Alignment and Equivariance Shape Linear Disentanglement

This paper introduces Compositional Probe Decomposition (CPD) to demonstrate that linear disentanglement of geometric and compositional information in atomistic foundation models is primarily driven by task alignment rather than architecture, revealing a significant performance gradient where models trained on specific properties like HOMO-LUMO gaps outperform energy-trained models and exhibit symmetry-dependent information routing.

Joshua Steier | Tue, 10 Ma | 🤖 cs.LG

No Memorization, No Detection: Output Distribution-Based Contamination Detection in Small Language Models

This paper demonstrates that Contamination Detection via output Distribution (CDD) is largely ineffective for small language models (70M–410M parameters) because it fails to detect verbatim memorization, whereas probability-based methods like perplexity and Min-k% Prob consistently outperform it across various benchmarks.

Omer Sela (Tel Aviv University) | Tue, 10 Ma | 💬 cs.CL
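The two probability-based scores named in the summary above can be sketched as follows, assuming per-token log-probabilities have already been obtained from a model. This follows the commonly used definitions of perplexity and Min-k% Prob (the mean log-probability of the k% least likely tokens); the function names and the default `k` are illustrative and are not taken from the paper.

```python
import math

def perplexity(token_logprobs):
    """Exponentiated negative mean log-probability of a sequence."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def min_k_percent_prob(token_logprobs, k=0.2):
    """Mean log-probability over the k fraction of least likely tokens.

    Higher (less negative) scores suggest the text is more familiar to
    the model, which is used as a signal of possible contamination.
    """
    n = max(1, int(len(token_logprobs) * k))  # keep at least one token
    lowest = sorted(token_logprobs)[:n]       # the n smallest log-probs
    return sum(lowest) / n
```

Both scores need only the model's token probabilities for the candidate text, which is why they can be computed without sampling multiple outputs from the model.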

Isotonic Layer: A Universal Framework for Generic Recommendation Debiasing

This paper introduces the Isotonic Layer, a novel differentiable framework that integrates piecewise linear fitting and learnable embeddings into neural architectures to enforce global monotonicity, thereby enabling granular, context-aware debiasing and improved calibration for large-scale recommendation systems.

Hailing Cheng, Yafang Yang, Hemeng Tao, Fengyu Zhang | Tue, 10 Ma | 🤖 cs.LG

ARC-AGI-2 Technical Report

This paper presents a transformer-based system that significantly advances ARC performance by integrating a compact task encoding, symmetry-based data augmentation, test-time LoRA adaptation, and multi-perspective decoding to enable efficient neural inference and human-level generalization from few examples.

Wallyson Lemes de Oliveira, Mekhron Bobokhonov, Matteo Caorsi, Aldo Podestà, Gabriele Beltramo, Luca Crosato, Matteo Bonotto, Federica Cecchetto, Hadrien Espic, Dan Titus Salajan, Stefan Taga, Luca Pana, Joe Carthy | Tue, 10 Ma | 💬 cs.CL

A Coin Flip for Safety: LLM Judges Fail to Reliably Measure Adversarial Robustness

This paper demonstrates that current LLM-as-a-Judge frameworks fail to reliably measure adversarial robustness due to unaccounted distribution shifts that degrade performance to near-random levels, often leading to inflated attack success rates, and proposes new benchmarks to address these evaluation flaws.

Leo Schwinn, Moritz Ladenburger, Tim Beyer, Mehrnaz Mofakhami, Gauthier Gidel, Stephan Günnemann | Tue, 10 Ma | 💬 cs.CL

Distributionally Robust Geometric Joint Chance-Constrained Optimization: Neurodynamic Approaches

This paper introduces a two-time scale neurodynamic duplex approach utilizing projection equations to solve distributionally robust geometric joint chance-constrained optimization problems with unknown distributions, demonstrating convergence to the global optimum through neural networks in applications such as shape optimization and telecommunications.

Ange Valli (L2S), Siham Tassouli (OPTIM), Abdel Lisser (L2S) | Tue, 10 Ma | 🔢 math

Building the ethical AI framework of the future: from philosophy to practice

This paper proposes an ethics-by-design control architecture that operationalizes AI governance across the entire lifecycle by embedding philosophical reasoning into a triple-gate enforcement structure (Metric, Governance, and Eco) with measurable triggers and audit trails, thereby translating normative commitments into testable controls compatible with existing MLOps pipelines and major regulatory frameworks like the EU AI Act and NIST RMF.

Jasper Kyle Catapang | Tue, 10 Ma | 💻 cs

FuzzingRL: Reinforcement Fuzz-Testing for Revealing VLM Failures

The paper introduces FuzzingRL, a framework that combines vision-language fuzzing with adversarial reinforcement fine-tuning to automatically generate diverse, challenging queries that systematically expose and degrade the performance of Vision Language Models.

Jiajun Xu, Jiageng Mao, Ang Qi, Weiduo Yuan, Alexander Romanus, Helen Xia, Vitor Campagnolo Guizilini, Yue Wang | Tue, 10 Ma | 🤖 cs.LG

Scale Dependent Data Duplication

This paper demonstrates that data duplication is scale-dependent: as model capability and corpus size increase, semantically equivalent documents behave like exact duplicates, producing aligned gradients and accelerating semantic collisions. This leads to rapidly increasing training losses for larger models and necessitates new scaling laws to predict performance accurately.

Joshua Kazdan, Noam Levi, Rylan Schaeffer, Jessica Chudnovsky, Abhay Puri, Bo He, Mehmet Donmez, Sanmi Koyejo, David Donoho | Tue, 10 Ma | 🤖 cs.LG

Multi-Agent DRL for V2X Resource Allocation: Disentangling Challenges and Benchmarking Solutions

This paper addresses the lack of systematic evaluation in Multi-Agent Deep Reinforcement Learning for C-V2X resource allocation by introducing a disentangled benchmark suite of interference games and diverse datasets to isolate specific challenges, ultimately identifying policy robustness and generalization across vehicular topologies as the primary hurdle and demonstrating the superiority of actor-critic methods over value-based approaches.

Siyuan Wang, Lei Lei, Pranav Maheshwari, Sam Bellefeuille, Kan Zheng, Dusit Niyato | Tue, 10 Ma | 🤖 cs.LG

Scaling Strategy, Not Compute: A Stand-Alone, Open-Source StarCraft II Benchmark for Accessible Reinforcement Learning Research

To address the complexity gap between StarCraft II's full game and its mini-games, this paper introduces the Two-Bridge Map Suite, an open-source, lightweight benchmark that isolates tactical navigation and combat skills to enable accessible reinforcement learning research under realistic compute budgets.

Sourav Panda, Shreyash Kale, Tanmay Ambadkar, Abhinav Verma, Jonathan Dodge | Tue, 10 Ma | 🤖 cs.LG

Consensus is Not Verification: Why Crowd Wisdom Strategies Fail for LLM Truthfulness

The paper demonstrates that, unlike in domains with external verifiers, scaling inference compute through crowd wisdom strategies fails to improve LLM truthfulness in unverified settings: correlated model errors and the inability to distinguish social prediction from truth verification cause aggregation to reinforce shared misconceptions rather than identify correct answers.

Yegor Denisov-Blanch, Joshua Kazdan, Jessica Chudnovsky, Rylan Schaeffer, Sheng Guan, Soji Adeshina, Sanmi Koyejo | Tue, 10 Ma | 🤖 cs.LG

OptiRoulette Optimizer: A New Stochastic Meta-Optimizer for up to 5.3x Faster Convergence

The paper introduces OptiRoulette, a stochastic meta-optimizer that dynamically selects update rules from a pool during training, demonstrating significantly faster convergence and higher test accuracy across multiple image-classification benchmarks compared to a standard AdamW baseline.

Stamatis Mastromichalakis | Tue, 10 Ma | 🤖 cs.LG

Annealed Co-Generation: Disentangling Variables via Progressive Pairwise Modeling

This paper proposes Annealed Co-Generation (ACG), a framework that replaces high-dimensional joint diffusion modeling with a low-dimensional, pairwise approach coupled through a three-stage annealing process to achieve efficient and consistent multivariate co-generation for scientific applications like flow-field completion and antibody generation.

Hantao Zhang, Jieke Wu, Mingda Xu, Xiao Hu, Yingxuan You, Pascal Fua | Tue, 10 Ma | 🤖 cs.LG

RACER: Risk-Aware Calibrated Efficient Routing for Large Language Models

RACER is a novel, model-agnostic routing framework that addresses misrouting risks in multi-LLM systems by formulating the problem as $\alpha$-VOR to generate calibrated, variable-sized model sets with rigorous distribution-free risk control, thereby enhancing downstream accuracy and cost-performance trade-offs.

Sai Hao, Hao Zeng, Hongxin Wei, Bingyi Jing | Tue, 10 Ma | 🤖 cs.LG

Evo: Autoregressive-Diffusion Large Language Models with Evolving Balance

The paper introduces Evo, a novel large language model that unifies autoregressive and diffusion-based generation within a continuous evolutionary latent framework, enabling adaptive balancing of planning and refinement to achieve state-of-the-art performance across diverse benchmarks while maintaining fast inference speeds.

Junde Wu, Minhao Hu, Jiayuan Zhu, Yuyuan Liu, Tianyi Zhang, Kang Li, Jingkun Chen, Jiazhen Pan, Min Xu, Yueming Jin | Tue, 10 Ma | 🤖 cs.LG

Distilling and Adapting: A Topology-Aware Framework for Zero-Shot Interaction Prediction in Multiplex Biological Networks

This paper proposes a novel topology-aware framework that leverages domain-specific foundation models, a graph tokenizer for multiplex connectivity, and knowledge distillation to achieve robust zero-shot interaction prediction in multiplex biological networks, outperforming state-of-the-art methods.

Alana Deng, Sugitha Janarthanan, Yan Sun, Zihao Jing, Pingzhao Hu | Tue, 10 Ma | 🤖 cs.LG

Not all tokens are needed (NAT): token efficient reinforcement learning

The paper introduces NAT (Not All Tokens Are Needed), a token-efficient reinforcement learning framework that utilizes unbiased partial-token gradient estimation via Horvitz-Thompson reweighting to achieve full-sequence performance with significantly reduced compute and memory costs by updating policies on only a subset of generated tokens.

Hejian Sang, Yuanda Xu, Zhengze Zhou, Ran He, Zhipeng Wang | Tue, 10 Ma | 🤖 cs.LG
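The Horvitz-Thompson reweighting mentioned in the summary above is a general survey-sampling idea: keep each item with a known probability and divide the kept items by that probability, so the subsampled sum is unbiased for the full sum. The sketch below illustrates it on plain numbers standing in for per-token contributions. The independent per-token sampling scheme, the function names, and the example values are assumptions for illustration and are not the paper's algorithm.

```python
import random
from itertools import product

def sample_mask(probs, rng):
    """Keep item i independently with inclusion probability probs[i]."""
    return [rng.random() < p for p in probs]

def ht_estimate(values, probs, mask):
    """Horvitz-Thompson estimate of sum(values) from the kept items only."""
    return sum(v / p for v, p, keep in zip(values, probs, mask) if keep)

def expected_ht(values, probs):
    """Exact expectation of the estimator, by enumerating every mask."""
    total = 0.0
    for mask in product([False, True], repeat=len(values)):
        # Probability of this particular keep/drop pattern.
        weight = 1.0
        for keep, p in zip(mask, probs):
            weight *= p if keep else (1 - p)
        total += weight * ht_estimate(values, probs, mask)
    return total

# Hypothetical per-token contributions and inclusion probabilities.
values = [0.5, -1.0, 2.0, 0.25]
probs = [0.5, 0.25, 0.8, 1.0]
one_draw = ht_estimate(values, probs, sample_mask(probs, random.Random(0)))
```

A single draw such as `one_draw` is noisy, but `expected_ht(values, probs)` matches `sum(values)` up to floating-point error, which is the unbiasedness property that makes updating on a token subset statistically sound.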
