cs.AI papers | Gist.Science

Make VLM Recognize Visual Hallucination on Cartoon Character Image with Pose Information

This paper proposes a pose-aware in-context visual learning (PA-ICVL) framework that enhances Vision-Language Models' ability to detect semantic structural visual hallucinations in non-photorealistic cartoon images by integrating pose information alongside RGB data, achieving significant performance improvements over RGB-only baselines.

Bumsoo Kim, Wonseop Shin, Kyuchul Lee, Yonghoon Jung, Sanghyun Seo | 2026-03-09 | 🤖 cs.AI

Algorithmic Collusion by Large Language Models

This paper demonstrates that Large Language Model-based pricing agents in oligopoly and auction settings can autonomously achieve supracompetitive prices and profits, a behavior significantly influenced by prompt variations and driven by price-war concerns, thereby posing unique challenges for future AI regulation.

Sara Fish, Yannai A. Gonczarowski, Ran I. Shorrer | 2026-03-09 | 🤖 cs.AI

Computational lexical analysis of Flamenco genres

This study employs computational lexical analysis and machine learning on over 2,000 Flamenco lyrics to accurately classify traditional genres (*palos*), identify their unique semantic fields, and map inter-genre relationships that reveal historical connections and evolutionary patterns within this cultural heritage.

Pablo Rosillo-Rodes, Maxi San Miguel, David Sanchez | 2026-03-09 | 💬 cs.CL

Combining X-Vectors and Bayesian Batch Active Learning: Two-Stage Active Learning Pipeline for Speech Recognition

This paper proposes a novel two-stage active learning pipeline for automatic speech recognition that combines unsupervised x-vector clustering with a supervised Bayesian batch selection method to efficiently identify diverse and informative samples, thereby significantly reducing labeling effort while improving model performance across various test conditions.

Ognjen Kundacina, Vladimir Vincan, Dragisa Miskovic | 2026-03-09 | ⚡ eess

My part is bigger than yours -- assessment within a group of peers

This paper presents a method for aggregating peer assessments of individual contributions in collaborative projects by weighting each expert's opinion according to the significance of their contribution, thereby facilitating a fair consensus on reward distribution.

Konrad Kułakowski, Jacek Szybowski | 2026-03-09 | 🤖 cs.AI

Predictive Coding Networks and Inference Learning: Tutorial and Survey

This paper provides a comprehensive review and formal specification of Predictive Coding Networks (PCNs), highlighting their biological plausibility, computational efficiency through inference learning, and versatility as a probabilistic framework that extends beyond traditional backpropagation-based neural networks.

Björn van Zwol, Ro Jefferson, Egon L. van den Broek | 2026-03-09 | 🤖 cs.AI

Transforming Agency. On the mode of existence of Large Language Models

This paper argues that while Large Language Models lack the autonomy required for genuine agency due to their absence of self-generated norms, goals, and embodied interaction, they function as transformative "linguistic automata" that, through a unique human-machine coupling, enable new "midtended" forms of intentional agency.

Xabier E. Barandiaran, Lola S. Almendros | 2026-03-09 | 🤖 cs.AI

FALCON: Future-Aware Learning with Contextual Object-Centric Pretraining for UAV Action Recognition

FALCON is a unified self-supervised pretraining framework for UAV action recognition that overcomes spatial imbalance in aerial footage by combining object-aware masked autoencoding with object-centric dual-horizon future reconstruction, achieving superior accuracy and faster inference without requiring additional preprocessing at test time.

Ruiqi Xian, Xiyang Wu, Tianrui Guan, Xijun Wang, Boqing Gong, Dinesh Manocha | 2026-03-09 | 🤖 cs.AI

UniHR: Hierarchical Representation Learning for Unified Knowledge Graph Link Prediction

UniHR is a unified hierarchical representation learning framework that overcomes the limitations of existing methods by unifying diverse knowledge graph types into triple-based representations and employing a hierarchical structure learning module to effectively model both intra-fact semantics and inter-fact relationships for link prediction.

Zhiqiang Liu, Yin Hua, Mingyang Chen + 4 more | 2026-03-09 | 💬 cs.CL

SpecFuse: Ensembling Large Language Models via Next-Segment Prediction

The paper introduces SpecFuse (referred to as SpecEM in the abstract), a training-free ensemble framework that enhances large language model performance by enabling segment-level semantic collaboration through speculative decoding and dynamically adjusting model weights via an online feedback mechanism to prioritize stronger contributors.

Bo Lv, Nayu Liu, Chen Tang, Xin Liu, Yue Yu, Ping Luo | 2026-03-09 | 🤖 cs.AI

Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation

This survey provides a comprehensive overview of the emerging ecosystem of large language models and tools that support researchers across the scientific lifecycle, covering key tasks from literature search and idea generation to content creation, experimentation, and evaluation, while addressing associated datasets, methods, limitations, and ethical concerns.

Steffen Eger, Yong Cao, Jennifer D'Souza, Andreas Geiger, Christian Greisinger, Stephanie Gross, Yufang Hou, Brigitte Krenn, Anne Lauscher, Yizhi Li, Chenghua Lin, Nafise Sadat Moosavi, Wei Zhao, Tristan Miller | 2026-03-09 | 🤖 cs.AI

Conditioning LLMs to Generate Code-Switched Text

This paper proposes a methodology to fine-tune Large Language Models for generating fluent English-Spanish code-switched text by leveraging back-translated parallel corpora, demonstrating that while traditional metrics fail to correlate with human preferences, LLM-based evaluation aligns well with human judgment and the approach significantly advances code-switched text generation.

Maite Heredia, Gorka Labaka, Jeremy Barnes, Aitor Soroa | 2026-03-09 | 🤖 cs.AI

Generative Predictive Control: Flow Matching Policies for Dynamic and Difficult-to-Demonstrate Tasks

This paper introduces Generative Predictive Control, a supervised learning framework that leverages flow matching and sampling-based predictive control to enable high-frequency, dynamic robotic tasks by eliminating the need for difficult-to-obtain expert demonstrations.

Vince Kurtz, Joel W. Burdick | 2026-03-09 | 🤖 cs.AI

FragFM: Hierarchical Framework for Efficient Molecule Generation via Fragment-Level Discrete Flow Matching

The paper introduces FragFM, a hierarchical framework utilizing fragment-level discrete flow matching and a stochastic fragment bag strategy to achieve efficient, scalable, and property-controllable molecular generation, validated through a new Natural Product Generation (NPGen) benchmark where it outperforms existing atom-based methods.

Joongwon Lee, Seonghwan Kim, Seokhyun Moon, Hyunwoo Kim, Woo Youn Kim | 2026-03-09 | 🤖 cs.AI

Aligning Compound AI Systems via System-level DPO

This paper introduces SysDPO, a framework that aligns complex, multi-component Compound AI Systems with human preferences by modeling them as Directed Acyclic Graphs and extending Direct Preference Optimization to overcome the challenges of non-differentiable interactions and the difficulty of translating system-level preferences to component levels.

Xiangwen Wang, Yibo Jacky Zhang, Zhoujie Ding, Katherine Tsai, Haolun Wu, Sanmi Koyejo | 2026-03-09 | 🤖 cs.AI

Adversarial Robustness of Partitioned Quantum Classifiers

This paper investigates the adversarial robustness of partitioned quantum classifiers by demonstrating that perturbations targeting circuit partitioning techniques, such as wire cutting or teleportation, are equivalent to implementing adversarial gates within intermediate layers, a relationship analyzed through both theoretical and experimental perspectives.

Pouya Kananian, Hans-Arno Jacobsen | 2026-03-09 | ⚛️ quant-ph

A Survey on Music Generation from Single-Modal, Cross-Modal, and Multi-Modal Perspectives

This paper provides a comprehensive survey of music generation research by categorizing systems across single-modal, cross-modal, and multi-modal perspectives, while examining key aspects such as representation, data alignment, datasets, evaluation methods, current challenges, and future directions.

Shuyu Li, Shulei Ji, Zihao Wang + 3 more | 2026-03-09 | 🤖 cs.AI

FindAnything: Open-Vocabulary and Object-Centric Mapping for Robot Exploration in Any Environment

FindAnything is an efficient, open-world mapping framework that integrates vision-language features into object-centric volumetric submaps to enable real-time, open-vocabulary semantic understanding of large-scale environments on resource-constrained robots.

Sebastián Barbas Laina, Simon Boche, Sotiris Papatheodorou, Simon Schaefer, Jaehyung Jung, Helen Oleynikova, Stefan Leutenegger | 2026-03-09 | 🤖 cs.AI

From Tokenizer Bias to Backbone Capability: A Controlled Study of LLMs for Time Series Forecasting

This paper investigates the inherent forecasting capabilities of large language models (LLMs) by controlling for tokenizer bias through large-scale pre-training, revealing that while LLM backbones show some promise, they still struggle to consistently outperform models specifically trained on large-scale time series data.

Xinyu Zhang, Shanshan Feng, Xutao Li, Kenghong Lin, Fan Li, Pengfei Jia | 2026-03-09 | 🤖 cs.AI

Position: Stop Anthropomorphizing Intermediate Tokens as Reasoning/Thinking Traces!

This position paper argues that anthropomorphizing intermediate token generation as "reasoning traces" or "thoughts" is a dangerous misconception that obscures the true nature of language models, hinders their effective use, and leads to flawed research, urging the community to abandon such metaphors.

Subbarao Kambhampati, Karthik Valmeekam, Siddhant Bhambri, Vardhan Palod, Lucas Saldyt, Kaya Stechly, Soumya Rani Samineni, Durgesh Kalwar, Upasana Biswas | 2026-03-09 | 🤖 cs.AI
