cs.LG papers | Gist.Science

B-DENSE: Branching For Dense Ensemble Network Supervision Efficiency

The paper proposes B-DENSE, a novel distillation framework that leverages multi-branch trajectory alignment to enforce dense intermediate supervision, thereby overcoming the structural information loss and discretization errors of existing methods to achieve superior image generation quality with reduced inference latency.

Cherish Puniani, Tushar Kumar, Arnav Bendre, Gaurav Kumar, Shree SinghiWed, 11 Ma🤖 cs.AI

Infusion: Shaping Model Behavior by Editing Training Data via Influence Functions

The paper introduces "Infusion," a framework that leverages scalable influence-function approximations to compute subtle perturbations in training data, demonstrating that modifying as little as 0.2% of a dataset can effectively and transferably shape model behavior across vision and language domains.

J Rosser, Robert Kirk, Edward Grefenstette, Jakob Foerster, Laura RuisWed, 11 Ma🤖 cs.AI

Automating Forecasting Question Generation and Resolution for AI Evaluation

This paper presents an automated system using LLM-powered web research agents to generate and resolve diverse, real-world forecasting questions at scale, demonstrating high-quality question creation and resolution rates that surpass human-curated platforms while effectively evaluating and improving AI forecasting performance.

Nikos I. Bosse, Peter Mühlbacher, Jack Wildman, Lawrence Phillips, Dan SchwarzWed, 11 Ma🤖 cs.AI

An AI-powered Bayesian Generative Modeling Approach for Arbitrary Conditional Inference

This paper introduces Bayesian Generative Modeling (BGM), a unified framework that leverages a stochastic iterative Bayesian updating algorithm to learn a single generative model capable of performing arbitrary conditional inference with principled uncertainty quantification, without requiring retraining for different conditioning structures.

Qiao Liu, Wing Hung WongWed, 11 Ma🤖 cs.AI

EMFusion: Conditional Diffusion Framework for Trustworthy Frequency Selective EMF Forecasting in Wireless Networks

This paper introduces EMFusion, a conditional multivariate diffusion-based framework that leverages a residual U-Net with cross-attention and imputation-based sampling to provide accurate, uncertainty-quantified, frequency-selective electromagnetic field forecasts for wireless network planning, significantly outperforming existing baseline models.

Zijiang Yan, Yixiang Huang, Jianhua Pei, Hina Tabassum, Luca ChiaraviglioWed, 11 Ma🤖 cs.AI

Enhancing Retrieval-Augmented Generation with Entity Linking for Educational Platforms

This paper introduces ELERAG, an enhanced Retrieval-Augmented Generation system that integrates Wikidata-based Entity Linking and a hybrid re-ranking strategy to significantly improve factual accuracy in Italian educational question-answering, particularly outperforming standard methods in domain-specific contexts while demonstrating the importance of domain-adapted strategies.

Francesco Granata, Francesco Poggi, Misael MongiovìWed, 11 Ma🤖 cs.AI

Periodic Asynchrony: An On-Policy Approach for Accelerating LLM Reinforcement Learning

This paper proposes "Periodic Asynchrony," a framework that accelerates LLM reinforcement learning by decoupling inference and training into a provably on-policy asynchronous pipeline, achieving a 3- to 5-fold throughput improvement on NPU platforms without sacrificing accuracy.

Jian LuWed, 11 Ma🤖 cs.AI

TSFM in-context learning for time-series classification of bearing-health status

This paper introduces a novel in-context learning approach using Time-Series Foundation Models (TSFMs) to classify bearing health status from vibration data without fine-tuning, enabling scalable, zero-shot maintenance solutions across varying operational conditions.

Michel Tokic, Slobodan Djukanovic, Anja von Beuningen, Cheng FengWed, 11 Ma🤖 cs.AI

Lightweight Time Series Data Valuation on Time Series Foundation Models via In-Context Finetuning

This paper proposes LTSV, a lightweight and efficient method for valuing time series data in foundation models by leveraging in-context finetuning and temporal block aggregation to overcome the computational bottlenecks and temporal dependency limitations of traditional approaches.

Shunyu Wu, Tianyue Li, Yixuan Leng, Jingyi Suo, Jian Lou, Dan Li, See-Kiong NgWed, 11 Ma🤖 cs.AI

Structured Matrix Scaling for Multi-Class Calibration

This paper proposes a structured matrix scaling approach for multi-class calibration that leverages theoretical insights from logistic regression, combined with structured regularization and robust optimization, to effectively manage the bias-variance tradeoff and achieve substantial performance gains over existing methods while providing an open-source implementation.

Eugène Berta, David Holzmüller, Michael I. Jordan, Francis BachWed, 11 Ma🤖 cs.AI

GraphKeeper: Graph Domain-Incremental Learning via Knowledge Disentanglement and Preservation

The paper proposes GraphKeeper, a novel framework for Graph Domain-Incremental Learning that addresses catastrophic forgetting through knowledge disentanglement and deviation-free preservation, achieving state-of-the-art performance across multiple graph domains while remaining compatible with various graph foundation models.

Zihao Guo, Qingyun Sun, Ziwei Zhang, Haonan Yuan, Huiping Zhuang, Xingcheng Fu, Jianxin LiWed, 11 Ma🤖 cs.AI

From Spatial to Actions: Grounding Vision-Language-Action Model in Spatial Foundation Priors

FALCON addresses the spatial reasoning limitations of existing 2D-based vision-language-action models by leveraging spatial foundation models to inject rich 3D geometric priors directly into the action head, achieving state-of-the-art performance across diverse simulation and real-world tasks without requiring architectural changes or specialized sensors.

Zhengshen Zhang, Hao Li, Yalun Dai, Zhengbang Zhu, Lei Zhou, Chenchen Liu, Dong Wang, Francis E. H. Tay, Sijin Chen, Ziwei Liu, Yuxiao Liu, Xinghang Li, Pan ZhouWed, 11 Ma🤖 cs.AI

RL-100: Performant Robotic Manipulation with Real-World Reinforcement Learning

RL-100 is a unified real-world reinforcement learning framework that combines diffusion visuomotor policies with a clipped PPO objective and consistency distillation to achieve 100% success across 1,000 diverse robotic manipulation trials, matching or surpassing human experts while demonstrating robust zero-shot generalization and continuous deployment in dynamic environments.

Kun Lei, Huanyu Li, Dongjie Yu, Zhenyu Wei, Lingxiao Guo, Zhennan Jiang, Ziyu Wang, Shiyu Liang, Huazhe XuWed, 11 Ma🤖 cs.AI

REAP the Experts: Why Pruning Prevails for One-Shot MoE compression

This paper introduces REAP, a novel expert pruning method that outperforms existing merging techniques for compressing Mixture-of-Experts models in generative tasks by leveraging router gate-values and activation norms to minimize reconstruction error, achieving near-lossless compression even at 50% expert reduction.

Mike Lasby, Ivan Lazarevich, Nish Sinnadurai, Sean Lie, Yani Ioannou, Vithursan ThangarasaWed, 11 Ma🤖 cs.AI

RECODE: Reasoning Through Code Generation for Visual Question Answering

The paper introduces RECODE, an agentic framework that enhances visual question answering by reverse-engineering structured visuals into executable code through iterative generation and selection, thereby transforming ambiguous perceptual tasks into verifiable symbolic reasoning problems that significantly outperform existing methods.

Junhong Shen, Mu Cai, Bo Hu, Ameet Talwalkar, David A Ross, Cordelia Schmid, Alireza FathiWed, 11 Ma🤖 cs.AI

Latent Speech-Text Transformer

The Latent Speech-Text Transformer (LST) improves the efficiency and performance of auto-regressive speech-text models by aggregating speech tokens into latent patches, which aligns sequence granularity with text, reduces computational costs, and achieves significant accuracy gains across speech and text benchmarks.

Yen-Ju Lu, Yashesh Gaur, Wei Zhou, Benjamin Muller, Jesus Villalba, Najim Dehak, Luke Zettlemoyer, Gargi Ghosh, Mike Lewis, Srinivasan Iyer, Duc LeWed, 11 Ma🤖 cs.AI

Singing Syllabi with Virtual Avatars: Enhancing Student Engagement Through AI-Generated Music and Digital Embodiment

This paper proposes and evaluates a novel educational approach that uses AI-generated singing and virtual avatars to transform traditional text-based syllabi into engaging audiovisual performances, demonstrating that this method significantly improves student awareness and recall of critical course information.

Xinxing WuWed, 11 Ma🤖 cs.AI

Latent Policy Steering with Embodiment-Agnostic Pretrained World Models

This paper introduces Latent Policy Steering (LPS), a method that leverages embodiment-agnostic optical flow to pretrain a World Model on diverse datasets, which is then fine-tuned with limited target-embodiment demonstrations to steer and significantly improve visuomotor policies in low-data regimes.

Yiqi Wang, Mrinal Verghese, Jeff SchneiderWed, 11 Ma🤖 cs.AI

ConLID: Supervised Contrastive Learning for Low-Resource Language Identification

The paper proposes ConLID, a supervised contrastive learning approach that learns domain-invariant representations to significantly improve language identification performance for low-resource languages on out-of-domain data while maintaining accuracy for high-resource languages.

Negar Foroutan, Jakhongir Saydaliev, Ye Eun Kim, Antoine BosselutWed, 11 Ma🤖 cs.AI

Towards Robust Real-World Multivariate Time Series Forecasting: A Unified Framework for Dependency, Asynchrony, and Missingness

The paper introduces ChannelTokenFormer, a unified Transformer-based framework that simultaneously addresses the challenges of complex inter-channel dependencies, asynchronous sampling, and missing values to achieve robust real-world multivariate time series forecasting.

Jinkwan Jang, Hyungjin Park, Jinmyeong Choi, Taesup KimWed, 11 Ma🤖 cs.AI

← Previous Next →