cs.AI papers | Gist.Science

ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding

ReFusion introduces a novel masked diffusion model that integrates sequence reorganization with a hybrid parallel-autoregressive decoding strategy to simultaneously achieve full KV cache efficiency, reduce learning complexity, and significantly outperform existing diffusion models while narrowing the performance gap with autoregressive models.

Jia-Nan Li, Jian Guan, Wei Wu + 1 more | 2026-03-06 | 💻 cs

HydroGEM: A Self Supervised Zero Shot Hybrid TCN Transformer Foundation Model for Continental Scale Streamflow Quality Control

HydroGEM is a self-supervised, zero-shot hybrid TCN-Transformer foundation model that effectively performs continental-scale streamflow quality control by detecting and reconstructing sensor anomalies with high accuracy and cross-national generalization, thereby addressing the scalability limitations of manual hydrological data validation.

Ijaz Ul Haq, Byung Suk Lee, Julia N. Perdrial + 1 more | 2026-03-06 | 💻 cs

RePo: Language Models with Context Re-Positioning

This paper introduces RePo, a novel mechanism that leverages a differentiable module to dynamically re-position tokens based on contextual dependencies rather than fixed linear indices, thereby reducing extraneous cognitive load and enhancing LLM performance on tasks involving noisy contexts, structured data, and long-range dependencies.

Huayang Li, Tianyu Zhao, Deng Cai + 1 more | 2026-03-06 | 💻 cs

MCP-SafetyBench: A Benchmark for Safety Evaluation of Large Language Models with Real-World MCP Servers

This paper introduces MCP-SafetyBench, a comprehensive benchmark leveraging real-world Model Context Protocol (MCP) servers to evaluate the safety of large language models in multi-turn, cross-tool scenarios, revealing that current models remain vulnerable to diverse MCP-specific attacks and exhibit a significant safety-utility trade-off.

Xuanjun Zong, Zhiqi Shen, Lei Wang + 2 more | 2026-03-06 | 💻 cs

FluenceFormer: Transformer-Driven Multi-Beam Fluence Map Regression for Radiotherapy Planning

This paper introduces FluenceFormer, a transformer-driven, two-stage framework that leverages a physics-informed Fluence-Aware Regression loss to achieve superior, geometry-aware fluence map prediction for radiotherapy planning, significantly outperforming existing CNN and single-stage methods in energy conservation and structural fidelity.

Ujunwa Mgboh, Rafi Ibn Sultan, Joshua Kim + 2 more | 2026-03-06 | 💻 cs

Yukthi Opus: A Multi-Chain Hybrid Metaheuristic for Large-Scale NP-Hard Optimization

Yukthi Opus is a multi-chain hybrid metaheuristic that combines Markov Chain Monte Carlo exploration, greedy local search, and adaptive simulated annealing to achieve robust, budget-efficient optimization for large-scale NP-hard problems.

SB Danush Vikraman, Hannah Abigail, Prasanna Kesavraj + 1 more | 2026-03-06 | 💻 cs

When Do Tools and Planning Help Large Language Models Think? A Cost- and Latency-Aware Benchmark

This paper presents a cost- and latency-aware benchmark demonstrating that while tool-augmented planning significantly improves accuracy on complex knowledge-intensive tasks like Event-QA, it often incurs prohibitive latency costs and offers no benefit, or even degrades performance, on tasks like persuasive response generation, where simple one-shot prompting is more efficient.

Subha Ghoshal, Ali Al-Bustami | 2026-03-06 | 💻 cs

Interleaved Tool-Call Reasoning for Protein Function Understanding

The paper introduces PFUA, a tool-augmented reasoning agent that outperforms text-only models in protein function prediction by integrating domain-specific tools and external biological priors to generate verifiable evidence, rather than relying on ineffective internal chain-of-thought reasoning.

Chuanliu Fan, Zicheng Ma, Huanran Meng + 6 more | 2026-03-06 | 💻 cs

Identifying Good and Bad Neurons for Task-Level Controllable LLMs

The paper proposes NeuronLLM, a novel framework that improves task-level controllability in Large Language Models by identifying both facilitative "good" and inhibitive "bad" neurons through contrastive learning and augmented question sets to overcome the limitations of existing ability-specific methods.

Wenjie Li, Guansong Pang, Hezhe Qiao + 2 more | 2026-03-06 | 💻 cs

Controlled LLM Training on Spectral Sphere

The paper introduces the Spectral Sphere Optimizer (SSO), a novel parallel training algorithm that enforces strict module-wise spectral constraints on both weights and updates to achieve full Maximal Update Parametrization alignment, resulting in superior convergence, stability, and performance across diverse large-scale architectures compared to AdamW and Muon.

Tian Xie, Haoming Luo, Haoyu Tang + 9 more | 2026-03-06 | 💻 cs

EmboTeam: Grounding LLM Reasoning into Reactive Behavior Trees via PDDL for Embodied Multi-Robot Collaboration

EmboTeam is a novel framework that enhances embodied multi-robot collaboration by cascading LLM-based instruction parsing into formal PDDL planning and reactive behavior tree execution, achieving significantly higher task success rates on the new MACE-THOR benchmark compared to existing baselines.

Haishan Zeng, Mengna Wang, Peng Li | 2026-03-06 | 💻 cs

"What if she doesn't feel the same?" What Happens When We Ask AI for Relationship Advice

This study reveals that users are highly satisfied with LLM-generated romantic relationship advice, finding it reliable and helpful, which significantly improves their overall trust and positive attitudes toward AI systems.

Niva Manchanda, Akshata Kishore Moharir, Ratna Kandala | 2026-03-06 | 💻 cs

ButterflyMoE: Sub-Linear Ternary Experts via Structured Butterfly Orbits

ButterflyMoE achieves sub-linear memory scaling for Mixture-of-Experts models on edge devices by representing diverse experts as geometric rotations of a shared ternary substrate, enabling a 150× memory reduction with negligible accuracy loss.

Aryan Karmore | 2026-03-06 | 💻 cs

Yuan3.0 Ultra: A Trillion-Parameter Enterprise-Oriented MoE LLM

This paper introduces Yuan3.0 Ultra, an open-source, trillion-parameter Mixture-of-Experts large language model that utilizes a novel Layer-Adaptive Expert Pruning algorithm to significantly improve pre-training efficiency and reduce model size while achieving state-of-the-art performance on enterprise-oriented benchmarks.

YuanLab.ai, Shawn Wu + 25 more | 2026-03-06 | 💻 cs

Where is the multimodal goal post? On the Ability of Foundation Models to Recognize Contextually Important Moments

This paper introduces a new dataset derived from football highlight reels to evaluate foundation models' ability to identify contextually important video moments, revealing that current state-of-the-art models perform near chance levels due to their reliance on single dominant modalities and failure to effectively synthesize cross-modal information.

Aditya K Surikuchi, Raquel Fernández, Sandro Pezzelle | 2026-03-06 | 💻 cs

A Scalable Inter-edge Correlation Modeling in CopulaGNN for Link Sign Prediction

This paper proposes a scalable extension of CopulaGNN for link sign prediction that overcomes computational intractability by representing edge correlations via a Gramian of edge embeddings and reformulating conditional probabilities, thereby achieving linear convergence and competitive performance on signed graphs.

Jinkyu Sung, Myunggeum Jee, Joonseok Lee | 2026-03-06 | 💻 cs

Mobility-Embedded POIs: Learning What A Place Is and How It Is Used from Human Movement

This paper introduces Mobility-Embedded POIs (ME-POIs), a framework that enhances general-purpose point-of-interest representations by integrating large-scale human mobility data with language model embeddings to capture both place identity and real-world usage functions, thereby outperforming existing text-only and mobility-only baselines across diverse map enrichment tasks.

Maria Despoina Siampou, Shushman Choudhury, Shang-Ling Hsu + 2 more | 2026-03-06 | 💻 cs

PerfGuard: A Performance-Aware Agent for Visual Content Generation

PerfGuard is a performance-aware agent framework for visual content generation that enhances task planning and execution reliability by systematically modeling tool performance boundaries through Performance-Aware Selection Modeling, Adaptive Preference Update, and Capability-Aligned Planning Optimization.

Zhipeng Chen, Zhongrui Zhang, Chao Zhang + 5 more | 2026-03-06 | 💻 cs

YuriiFormer: A Suite of Nesterov-Accelerated Transformers

This paper proposes a variational framework that interprets transformer layers as optimization iterations, enabling the design of a Nesterov-accelerated transformer architecture that outperforms standard baselines on language modeling tasks.

Aleksandr Zimin, Yury Polyanskiy, Philippe Rigollet | 2026-03-06 | 🔢 math

Replacing Parameters with Preferences: Federated Alignment of Heterogeneous Vision-Language Models

This paper proposes MoR, a federated alignment framework that replaces parameter exchange with preference-based learning using a Mixture-of-Rewards mechanism and GRPO to effectively align heterogeneous Vision-Language Models while preserving data privacy and accommodating diverse client constraints.

Shule Lu, Yujing Wang, Hainan Zhang + 5 more | 2026-03-06 | 💻 cs
