cs.AI papers | Gist.Science

An Extreme Multi-label Text Classification (XMTC) Library Dataset: What if we took "Use of Practical AI in Digital Libraries" seriously?

This paper introduces a large bilingual (English/German) corpus of catalog records annotated with the Integrated Authority File (GND) and a machine-actionable GND taxonomy to enable ontology-aware multi-label classification and agent-assisted cataloging, aiming to develop transparent, authority-anchored AI tools that enhance the efficiency and scalability of subject indexing in digital libraries.

Jennifer D'Souza, Sameer Sadruddin, Maximilian Kähler, Andrea Salfinger, Luca Zaccagna, Francesca Incitti, Lauro Snidaro, Osma Suominen | 2026-03-12 | 💬 cs.CL

Continuous Diffusion Transformers for Designing Synthetic Regulatory Elements

This paper introduces a parameter-efficient Diffusion Transformer (DiT) with a 2D CNN encoder that generates high-quality, cell-type-specific synthetic regulatory DNA sequences with significantly faster convergence, reduced memorization, and enhanced regulatory activity compared to existing U-Net-based models.

Jonathan Liu, Kia Ghods | 2026-03-12 | 🧬 q-bio

Dynamics-Predictive Sampling for Active RL Finetuning of Large Reasoning Models

This paper introduces Dynamics-Predictive Sampling (DPS), a method that models prompt solving progress as a dynamical system to predict and select informative training samples via online Bayesian inference, thereby significantly reducing the computational overhead of extensive rollouts while accelerating and improving the reinforcement learning finetuning of large reasoning models.

Yixiu Mao, Yun Qu, Qi Wang, Heming Zou, Xiangyang Ji | 2026-03-12 | 🤖 cs.LG

A Hybrid Knowledge-Grounded Framework for Safety and Traceability in Prescription Verification

This paper introduces PharmGraph-Auditor, a hybrid framework that combines a Virtual Knowledge Graph-based pharmaceutical knowledge base with a Chain of Verification reasoning paradigm to enable reliable, evidence-grounded, and traceable prescription auditing by transforming Large Language Models into transparent reasoning engines.

Yichi Zhu, Kan Ling, Xu Liu, Hengrun Zhang, Huiqun Yu, Guisheng Fan | 2026-03-12 | 🤖 cs.AI

LookaheadKV: Fast and Accurate KV Cache Eviction by Glimpsing into the Future without Generation

LookaheadKV is a lightweight KV cache eviction framework that achieves fast and accurate long-context inference by using parameter-efficient modules to predict future token importance without the computational overhead of explicit draft generation, thereby outperforming existing methods in both accuracy and speed.

Jinwoo Ahn, Ingyu Seong, Akhil Kedia, Junhan Kim, Hyemi Jang, Kangwook Lee, Yongkweon Jeon | 2026-03-12 | 🤖 cs.LG

When Fine-Tuning Fails and when it Generalises: Role of Data Diversity and Mixed Training in LLM-based TTS

This paper demonstrates that LoRA fine-tuning of compact LLM backbones significantly enhances voice cloning performance in terms of perceptual quality, speaker fidelity, and signal-to-noise ratio, provided the training data possesses sufficient acoustic diversity.

Anupam Purwar, Aditya Choudhary | 2026-03-12 | 🤖 cs.AI

Historical Consensus: Preventing Posterior Collapse via Iterative Selection of Gaussian Mixture Priors

This paper introduces Historical Consensus Training, an iterative method that eliminates posterior collapse in Variational Autoencoders by progressively refining Gaussian Mixture Model priors to create a stable parameter barrier that prevents the degeneration of latent variables, achieving robust representations without relying on specific architectural constraints or hyperparameter tuning.

Zegu Zhang, Jian Zhang | 2026-03-12 | 🤖 cs.LG

Safe RLHF Beyond Expectation: Stochastic Dominance for Universal Spectral Risk Control

This paper proposes Risk-sensitive Alignment via Dominance (RAD), a novel Safe RLHF framework that replaces traditional expected cost constraints with First-Order Stochastic Dominance constraints within an Optimal Transport framework to universally control spectral risk measures, thereby achieving superior robustness against tail risks and out-of-distribution failures while maintaining helpfulness.

Yaswanth Chittepu, Ativ Joshi, Rajarshi Bhattacharjee, Scott Niekum | 2026-03-12 | 🤖 cs.LG

Contact Coverage-Guided Exploration for General-Purpose Dexterous Manipulation

This paper proposes Contact Coverage-Guided Exploration (CCGE), a general-purpose exploration method that leverages contact state counters and energy-based rewards to guide dexterous hands in discovering diverse contact patterns, thereby significantly improving training efficiency and real-world transferability across complex manipulation tasks.

Zixuan Liu, Ruoyi Qiao, Chenrui Tie, Xuanwei Liu, Yunfan Lou, Chongkai Gao, Zhixuan Xu, Lin Shao | 2026-03-12 | 🤖 cs.AI

GroundCount: Grounding Vision-Language Models with Object Detection for Mitigating Counting Hallucinations

GroundCount proposes a framework that augments Vision-Language Models with explicit spatial grounding from object detection models to significantly mitigate counting hallucinations, demonstrating that structured prompt-based integration outperforms feature-level fusion and yields consistent accuracy improvements across most architectures.

Boyuan Chen, Minghao Shao, Siddharth Garg, Ramesh Karri, Muhammad Shafique | 2026-03-12 | 🤖 cs.AI

Artificial Intelligence as a Catalyst for Innovation in Software Engineering

This paper argues that integrating Artificial Intelligence, particularly through Machine Learning and Natural Language Processing, acts as a catalyst for innovation in software engineering by automating tedious tasks and enhancing Agile practices to better manage evolving requirements while maintaining quality and speed.

Carlos Alberto Fernández-y-Fernández, Jorge R. Aguilar-Cisneros | 2026-03-12 | 🤖 cs.AI

RCTs & Human Uplift Studies: Methodological Challenges and Practical Solutions for Frontier AI Evaluation

This paper synthesizes findings from interviews with 16 experts to identify methodological challenges in applying randomized controlled trials to evaluate frontier AI's impact on human performance and proposes practical solutions to address validity issues in high-stakes decision-making.

Patricia Paskov, Kevin Wei, Shen Zhou Hong, Dan Bateyko, Xavier Roberts-Gaal, Carson Ezell, Gailius Praninskas, Valerie Chen, Umang Bhatt, Ella Guest | 2026-03-12 | 🤖 cs.AI

Does AI See like Art Historians? Interpreting How Vision Language Models Recognize Artistic Style

Through an interdisciplinary collaboration between computer scientists and art historians, this paper employs latent-space decomposition and quantitative analysis to reveal that Vision Language Models predict artistic styles using concepts that are largely coherent and relevant to human experts, often aligning with art historical reasoning even when utilizing formally interpreted features.

Marvin Limpijankit, Milad Alshomary, Yassin Oulad Daoud, Amith Ananthram, Tim Trombley, Elias Stengel-Eskin, Mohit Bansal, Noam M. Elcott, Kathleen McKeown | 2026-03-12 | 🤖 cs.AI

Instruction set for the representation of graphs

This paper introduces IsalGraph, a novel method that encodes any finite simple graph into a compact, valid nine-character instruction string using a virtual machine, enabling efficient canonical representation and demonstrating strong correlation between string edit distance and graph edit distance for applications in similarity search and language modeling.

Ezequiel Lopez-Rubio, Mario Pascual-Gonzalez | 2026-03-12 | 💬 cs.CL

V2M-Zero: Zero-Pair Time-Aligned Video-to-Music Generation

V2M-Zero introduces a zero-pair video-to-music generation framework that achieves superior temporal synchronization and semantic alignment by leveraging shared intra-modal temporal structures via event curves, eliminating the need for paired training data or cross-modal supervision.

Yan-Bo Lin, Jonah Casebeer, Long Mai, Aniruddha Mahapatra, Gedas Bertasius, Nicholas J. Bryan | 2026-03-12 | 🤖 cs.AI

Neural Field Thermal Tomography: A Differentiable Physics Framework for Non-Destructive Evaluation

The paper introduces Neural Field Thermal Tomography (NeFTY), a differentiable physics framework that parameterizes 3D material diffusivity as a continuous neural field optimized via a rigorous numerical solver to achieve high-resolution, quantitative reconstruction of subsurface defects from transient surface temperature measurements, overcoming the limitations of traditional 1D approximations and soft-constrained PINNs.

Tao Zhong, Yixun Hu, Dongzhe Zheng, Aditya Sood, Christine Allen-Blanchette | 2026-03-12 | 🔬 cond-mat.mtrl-sci

LiTo: Surface Light Field Tokenization

LiTo introduces a unified 3D latent representation that tokenizes surface light fields from RGB-depth images to jointly model geometry and view-dependent appearance, enabling high-fidelity 3D object generation with realistic specular effects and consistent lighting.

Jen-Hao Rick Chang, Xiaoming Zhao, Dorian Chan, Oncel Tuzel | 2026-03-12 | 🤖 cs.AI

COMIC: Agentic Sketch Comedy Generation

The paper presents COMIC, a fully automated AI system that generates high-quality, diverse comedic sketch videos by employing a multi-agent framework with specialized roles and LLM-based critics trained on YouTube data to iteratively refine content toward professional standards.

Susung Hong, Brian Curless, Ira Kemelmacher-Shlizerman, Steve Seitz | 2026-03-12 | 💬 cs.CL

SDR-GAIN: A High Real-Time Occluded Pedestrian Pose Completion Method for Autonomous Driving

This paper proposes SDR-GAIN, a novel real-time framework that utilizes self-supervised adversarial learning on keypoint coordinate distributions to accurately reconstruct occluded pedestrian poses for autonomous driving, outperforming existing methods in both accuracy and inference speed.

Honghao Fu, Yongli Gu, Yidong Yan + 3 more | 2026-03-11 | 🤖 cs.AI

A Temporal-Spectral Fusion Transformer with Subject-Specific Adapter for Enhancing RSVP-BCI Decoding

This paper proposes TSformer-SA, a novel framework that integrates a temporal-spectral fusion transformer with subject-specific adapters and cross-view consistency learning to significantly enhance RSVP-BCI decoding performance while minimizing the training data and preparation time required for new subjects.

Xujin Li, Wei Wei, Shuang Qiu + 1 more | 2026-03-11 | 🤖 cs.AI
