BACE-RUL: A Bi-directional Adversarial Network with Covariate Encoding for Machine Remaining Useful Life Prediction

This paper introduces BACE-RUL, a bi-directional adversarial network with covariate encoding that predicts machine Remaining Useful Life from current sensor measurements alone, avoiding the reliance on prior knowledge and temporal mining that limits earlier approaches, and reports superior performance over state-of-the-art methods on real-world datasets.

Zekai Zhang, Dan Li, Shunyu Wu + 4 more · 2026-03-06 · 💻 cs

Safety Mirage: How Spurious Correlations Undermine VLM Safety Fine-Tuning and Can Be Mitigated by Machine Unlearning

This paper identifies the "safety mirage" in Vision-Language Models, where supervised fine-tuning creates spurious correlations that leave models vulnerable to simple attacks and prone to over-refusal, and proposes machine unlearning as a superior alignment strategy that significantly reduces attack success rates and unnecessary rejections while preserving general capabilities.

Yiwei Chen, Yuguang Yao, Yihua Zhang + 3 more · 2026-03-06 · 💻 cs

Assessing the Impact of Code Changes on the Fault Localizability of Large Language Models

This paper introduces a large-scale, mutation-based evaluation framework to assess the robustness of Large Language Models in fault localization, revealing that their reasoning is often brittle and reliant on syntactic cues rather than deep semantic understanding, as evidenced by a 78% failure rate when subjected to semantic-preserving code changes.

Sabaat Haroon, Ahmad Faraz Khan, Ahmad Humayun + 5 more · 2026-03-06 · 💻 cs

TianQuan-S2S: A Subseasonal-to-Seasonal Global Weather Model via Incorporate Climatology State

The paper introduces TianQuan-S2S, a novel global subseasonal-to-seasonal weather forecasting model that integrates climatological states into patch embeddings and utilizes an uncertainty-augmented Transformer to overcome the limitations of over-smoothing and inadequate climate representation, thereby outperforming both traditional numerical methods and advanced data-driven models in deterministic and ensemble forecasting.

Guowen Li, Xintong Liu, Yang Liu + 11 more · 2026-03-06 · 💻 cs

Differentially Private and Scalable Estimation of the Network Principal Component

This paper proposes a novel, instance-specific Differentially Private framework based on the Propose-Test-Release mechanism that enables scalable and accurate estimation of network principal components on large real-world graphs, achieving a 180-fold runtime improvement over existing baselines while also providing the first DP solution for the Densest-k-subgraph problem.

Alireza Khayatian, Anil Vullikanti, Aritra Konar · 2026-03-06 · 💻 cs

VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use

VTool-R1 is a novel framework that leverages reinforcement learning to train vision-language models to generate multimodal chains of thought by strategically interleaving text with intermediate visual reasoning steps using Python-based editing tools, thereby enhancing performance on structured visual tasks without requiring process-based supervision.

Mingyuan Wu, Jingcheng Yang, Jize Jiang + 6 more · 2026-03-06 · 💻 cs

Continuous Chain of Thought Enables Parallel Exploration and Reasoning

This paper introduces Continuous Chain of Thought (CoT2), a framework that replaces discrete token sampling with continuously-valued tokens to enable parallel exploration of multiple reasoning traces, offering theoretical guarantees for solving combinatorial problems and demonstrating improved performance through novel supervision and policy optimization strategies.

Halil Alperen Gozeten, M. Emrullah Ildiz, Xuechen Zhang + 3 more · 2026-03-06 · 💻 cs

SealQA: Raising the Bar for Reasoning in Search-Augmented Language Models

The paper introduces SealQA, a new benchmark comprising three challenging flavors (Seal-0, Seal-Hard, and LongSeal) designed to evaluate search-augmented language models on fact-seeking tasks with noisy or conflicting web results, revealing that even frontier models struggle significantly with reasoning accuracy, robustness to noise, and long-context document retrieval.

Thinh Pham, Nguyen Nguyen, Pratibha Zunjare + 3 more · 2026-03-06 · 💻 cs

HSG-12M: A Large-Scale Benchmark of Spatial Multigraphs from the Energy Spectra of Non-Hermitian Crystals

This paper introduces Poly2Graph, an automated pipeline for generating HSG-12M, a pioneering 16.7-million-scale dataset of spatial multigraphs derived from non-Hermitian crystal energy spectra, which bridges condensed matter physics and geometry-aware graph learning by preserving vital geometric information often discarded in existing benchmarks.

Xianquan Yan, Hakan Akgün, Kenji Kawaguchi + 2 more · 2026-03-06 · 🔬 cond-mat.mes-hall