cs.AI papers | Gist.Science

SpecFuse: Ensembling Large Language Models via Next-Segment Prediction

The paper introduces SpecFuse (referred to as SpecEM in the abstract), a training-free ensemble framework that enhances large language model performance by enabling segment-level semantic collaboration through speculative decoding and dynamically adjusting model weights via an online feedback mechanism to prioritize stronger contributors.

Bo Lv, Nayu Liu, Chen Tang, Xin Liu, Yue Yu, Ping Luo2026-03-09🤖 cs.AI

Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation

This survey provides a comprehensive overview of the emerging ecosystem of large language models and tools that support researchers across the scientific lifecycle, covering key tasks from literature search and idea generation to content creation, experimentation, and evaluation, while addressing associated datasets, methods, limitations, and ethical concerns.

Steffen Eger, Yong Cao, Jennifer D'Souza, Andreas Geiger, Christian Greisinger, Stephanie Gross, Yufang Hou, Brigitte Krenn, Anne Lauscher, Yizhi Li, Chenghua Lin, Nafise Sadat Moosavi, Wei Zhao, Tristan Miller2026-03-09🤖 cs.AI

Conditioning LLMs to Generate Code-Switched Text

This paper proposes a methodology to fine-tune Large Language Models for generating fluent English-Spanish code-switched text by leveraging back-translated parallel corpora, demonstrating that while traditional metrics fail to correlate with human preferences, LLM-based evaluation aligns well with human judgment and the approach significantly advances CS text generation capabilities.

Maite Heredia, Gorka Labaka, Jeremy Barnes, Aitor Soroa2026-03-09🤖 cs.AI

Generative Predictive Control: Flow Matching Policies for Dynamic and Difficult-to-Demonstrate Tasks

This paper introduces Generative Predictive Control, a supervised learning framework that leverages flow matching and sampling-based predictive control to enable high-frequency, dynamic robotic tasks by eliminating the need for difficult-to-obtain expert demonstrations.

Vince Kurtz, Joel W. Burdick2026-03-09🤖 cs.AI

FragFM: Hierarchical Framework for Efficient Molecule Generation via Fragment-Level Discrete Flow Matching

The paper introduces FragFM, a hierarchical framework utilizing fragment-level discrete flow matching and a stochastic fragment bag strategy to achieve efficient, scalable, and property-controllable molecular generation, validated through a new Natural Product Generation (NPGen) benchmark where it outperforms existing atom-based methods.

Joongwon Lee, Seonghwan Kim, Seokhyun Moon, Hyunwoo Kim, Woo Youn Kim2026-03-09🤖 cs.AI

Aligning Compound AI Systems via System-level DPO

This paper introduces SysDPO, a framework that aligns complex, multi-component Compound AI Systems with human preferences by modeling them as Directed Acyclic Graphs and extending Direct Preference Optimization to overcome the challenges of non-differentiable interactions and the difficulty of translating system-level preferences to component levels.

Xiangwen Wang, Yibo Jacky Zhang, Zhoujie Ding, Katherine Tsai, Haolun Wu, Sanmi Koyejo2026-03-09🤖 cs.AI

Adversarial Robustness of Partitioned Quantum Classifiers

This paper investigates the adversarial robustness of partitioned quantum classifiers by demonstrating that perturbations targeting circuit partitioning techniques, such as wire cutting or teleportation, are equivalent to implementing adversarial gates within intermediate layers, a relationship analyzed through both theoretical and experimental perspectives.

Pouya Kananian, Hans-Arno Jacobsen2026-03-09⚛️ quant-ph

A Survey on Music Generation from Single-Modal, Cross-Modal, and Multi-Modal Perspectives

This paper provides a comprehensive survey of music generation research by categorizing systems across single-modal, cross-modal, and multi-modal perspectives, while examining key aspects such as representation, data alignment, datasets, evaluation methods, current challenges, and future directions.

Shuyu Li, Shulei Ji, Zihao Wang + 3 more2026-03-09🤖 cs.AI

FindAnything: Open-Vocabulary and Object-Centric Mapping for Robot Exploration in Any Environment

FindAnything is an efficient, open-world mapping framework that integrates vision-language features into object-centric volumetric submaps to enable real-time, open-vocabulary semantic understanding of large-scale environments on resource-constrained robots.

Sebastián Barbas Laina, Simon Boche, Sotiris Papatheodorou, Simon Schaefer, Jaehyung Jung, Helen Oleynikova, Stefan Leutenegger2026-03-09🤖 cs.AI

From Tokenizer Bias to Backbone Capability: A Controlled Study of LLMs for Time Series Forecasting

This paper investigates the inherent forecasting capabilities of large language models (LLMs) by controlling for tokenizer bias through large-scale pre-training, revealing that while LLM backbones show some promise, they still struggle to consistently outperform models specifically trained on large-scale time series data.

Xinyu Zhang, Shanshan Feng, Xutao Li, Kenghong Lin, Fan Li, Pengfei Jia2026-03-09🤖 cs.AI

Position: Stop Anthropomorphizing Intermediate Tokens as Reasoning/Thinking Traces!

This position paper argues that anthropomorphizing intermediate token generation as "reasoning traces" or "thoughts" is a dangerous misconception that obscures the true nature of language models, hinders their effective use, and leads to flawed research, urging the community to abandon such metaphors.

Subbarao Kambhampati, Karthik Valmeekam, Siddhant Bhambri, Vardhan Palod, Lucas Saldyt, Kaya Stechly, Soumya Rani Samineni, Durgesh Kalwar, Upasana Biswas2026-03-09🤖 cs.AI

The Malicious Technical Ecosystem: Exposing Limitations in Technical Governance of AI-Generated Non-Consensual Intimate Images of Adults

This paper adopts a survivor-centered approach to expose how a "malicious technical ecosystem" of accessible tools enables the creation of AI-generated non-consensual intimate images, while demonstrating that current governance frameworks, such as the NIST AI 100-4 report, fail to effectively regulate this landscape due to flawed underlying assumptions.

Michelle L. Ding, Harini Suresh2026-03-09🤖 cs.AI

Federated Learning: A Survey on Privacy-Preserving Collaborative Intelligence

This survey provides a comprehensive overview of Federated Learning, detailing its architecture and lifecycle while addressing key challenges like data heterogeneity and privacy, exploring emerging trends and applications, and outlining future research directions for scalable and trustworthy collaborative intelligence.

Ratun Rahman2026-03-09🤖 cs.AI

HCT-QA: A Benchmark for Question Answering on Human-Centric Tables

This paper introduces HCT-QA, a comprehensive benchmark comprising thousands of real-world and synthetic human-centric tables with natural language question-answer pairs, designed to evaluate and improve the performance of Large Language Models and Vision Language Models in querying complex tabular data.

Mohammad S. Ahmad, Zan A. Naeem, Michaël Aupetit, Ahmed Elmagarmid, Mohamed Eltabakh, Xiaosong Ma, Mourad Ouzzani, Chaoyi Ruan, Hani Al-Sayeh2026-03-09🤖 cs.AI

FourierSpecNet: Neural Collision Operator Approximation Inspired by the Fourier Spectral Method for Solving the Boltzmann Equation

This paper introduces FourierSpecNet, a hybrid deep learning framework that integrates the Fourier spectral method to efficiently approximate the Boltzmann collision operator, achieving resolution-invariant learning, zero-shot super-resolution, and significant computational savings while maintaining accuracy across elastic and inelastic collision regimes.

Jae Yong Lee, Gwang Jae Jung, Byung Chan Lim, Hyung Ju Hwang2026-03-09🤖 cs.AI

RM-R1: Reward Modeling as Reasoning

The paper introduces Reasoning Reward Models (ReasRMs), specifically the RM-R1 family, which reformulate reward modeling as a reasoning task using a chain-of-rubrics mechanism and a two-stage training pipeline to achieve superior interpretability and performance compared to existing large-scale models.

Xiusi Chen, Gaotang Li, Ziqi Wang, Bowen Jin, Cheng Qian, Yu Wang, Hongru Wang, Yu Zhang, Denghui Zhang, Tong Zhang, Hanghang Tong, Heng Ji2026-03-09🤖 cs.AI

Software Development Life Cycle Perspective: A Survey of Benchmarks for Code Large Language Models and Agents

This paper presents a comprehensive survey of 178 benchmarks for Code Large Language Models and Agents through a tiered Software Development Life Cycle (SDLC) framework, revealing a significant imbalance that heavily favors the implementation phase while neglecting requirements and design, alongside critical gaps in anti-contamination strategies that necessitate future research to bridge the gap between theoretical capabilities and practical effectiveness.

Kaixin Wang, Tianlin Li, Xiaoyu Zhang, Chong Wang, Weisong Sun, Yang Liu, Aishan Liu, Xianglong Liu, Chao Shen, Bin Shi2026-03-09🤖 cs.AI

Maximizing Asynchronicity in Event-based Neural Networks

This paper introduces EVA, a novel event-by-event asynchronous-to-synchronous (A2S) framework inspired by language modeling that generates highly expressive features, outperforming prior methods in recognition tasks and achieving state-of-the-art results in detection for event-based vision.

Haiqing Hao, Nikola Zubic, Weihua He, Zhipeng Sui, Davide Scaramuzza, Wenhui Wang2026-03-09🤖 cs.AI

Mitigating Content Effects on Reasoning in Language Models through Fine-Grained Activation Steering

This paper proposes K-CAST, a novel fine-grained conditional activation steering method that dynamically mitigates content biases in large language models, significantly improving formal reasoning accuracy by up to 15% while maintaining robustness across prompts and languages.

Marco Valentino, Geonhee Kim, Dhairya Dalal, Zhixue Zhao, André Freitas2026-03-09🤖 cs.AI

AdAEM: An Adaptively and Automated Extensible Measurement of LLMs' Value Difference

This paper introduces AdAEM, a novel self-extensible evaluation framework that automatically generates adaptive test questions by probing the internal value boundaries of diverse LLMs to overcome the limitations of static benchmarks and provide more informative, distinguishable insights into models' value differences and alignment dynamics.

Jing Yao, Shitong Duan, Xiaoyuan Yi, Dongkuan Xu, Peng Zhang, Tun Lu, Ning Gu, Zhicheng Dou, Xing Xie2026-03-09🤖 cs.AI

← Previous Next →