Reference-guided Policy Optimization for Molecular Optimization via LLM Reasoning

This paper introduces Reference-guided Policy Optimization (RePO), a novel framework that combines reinforcement learning with verifiable rewards and supervised reference guidance to effectively balance exploration and exploitation in molecular optimization tasks where only single-reference data is available, thereby outperforming existing SFT and RLVR baselines.

Xuan Li, Zhanke Zhou, Zongze Li, Jiangchao Yao, Yu Rong, Lu Zhang, Bo Han2026-03-09🤖 cs.AI

The World Won't Stay Still: Programmable Evolution for Agent Benchmarks

This paper introduces ProEvolve, a graph-based framework that enables programmable and scalable evolution of agent environments through graph transformations, addressing the limitations of static benchmarks by better evaluating agents' adaptability to real-world dynamics.

Guangrui Li, Yaochen Xie, Yi Liu, Ziwei Dong, Xingyuan Pan, Tianqi Zheng, Jason Choi, Michael J. Morais, Binit Jha, Shaunak Mishra, Bingrou Zhou, Chen Luo, Monica Xiao Cheng, Dawn Song2026-03-09🤖 cs.AI

CORE-Seg: Reasoning-Driven Segmentation for Complex Lesions via Reinforcement Learning

This paper introduces CORE-Seg, a reinforcement learning-driven framework that integrates a Semantic-Guided Prompt Adapter with a progressive SFT-to-GRPO training strategy to bridge the gap between visual segmentation and cognitive reasoning for complex medical lesions, achieving state-of-the-art performance on the newly proposed ComLesion-14K Chain-of-Thought benchmark.

Yuxin Xie, Yuming Chen, Yishan Yang, Yi Zhou, Tao Zhou, Zhen Zhao, Jiacheng Liu, Huazhu Fu2026-03-09🤖 cs.AI

DeepFact: Co-Evolving Benchmarks and Agents for Deep Research Factuality

This paper introduces DeepFact, a framework that addresses the brittleness of static factuality benchmarks for deep research reports by proposing an Evolving Benchmarking via Audit-then-Score (AtS) methodology, which significantly improves expert verification accuracy and enables the development of a high-performing document-level verification agent.

Yukun Huang, Leonardo F. R. Ribeiro, Momchil Hardalov, Bhuwan Dhingra, Markus Dreyer, Venkatesh Saligrama2026-03-09🤖 cs.AI

Stock Market Prediction Using Node Transformer Architecture Integrated with BERT Sentiment Analysis

This paper proposes an integrated framework combining a node transformer architecture with BERT-based sentiment analysis to model stock market graphs and social media sentiment, demonstrating superior forecasting accuracy (0.80% MAPE) and directional precision compared to traditional ARIMA and LSTM models across 20 S&P 500 stocks from 1982 to 2025.

Mohammad Al Ridhawi, Mahtab Haj Ali, Hussein Al Osman2026-03-09🤖 cs.AI

BlackMirror: Black-Box Backdoor Detection for Text-to-Image Models via Instruction-Response Deviation

This paper introduces BlackMirror, a novel black-box, training-free framework that detects backdoored text-to-image models by identifying and verifying the stability of partial semantic deviations between instructions and generated images, overcoming the limitations of existing image-similarity-based methods against diverse backdoor attacks.

Feiran Li, Qianqian Xu, Shilong Bao, Zhiyong Yang, Xilin Zhao, Xiaochun Cao, Qingming Huang2026-03-09🤖 cs.AI

Addressing the Ecological Fallacy in Larger LMs with Human Context

This paper demonstrates that addressing the ecological fallacy by modeling an author's language context through a specific task called HuLM, particularly during fine-tuning (HuFT) or continued pre-training, significantly improves the performance of an 8B Llama model across multiple downstream tasks compared to standard training methods.

Nikita Soni, Dhruv Vijay Kunjadiya, Pratham Piyush Shah, Dikshya Mohanty, H. Andrew Schwartz, Niranjan Balasubramanian2026-03-09🤖 cs.AI

Who We Are, Where We Are: Mental Health at the Intersection of Person, Situation, and Large Language Models

This paper proposes an interpretable modeling approach that integrates person-level psychological traits with situational context features derived from social media data to predict dynamic mental well-being, demonstrating that theory-driven methods offer competitive performance and greater human-understandable insights compared to standard language model embeddings.

Nikita Soni, August Håkan Nilsson, Syeda Mahwish, Vasudha Varadarajan, H. Andrew Schwartz, Ryan L. Boyd2026-03-09🤖 cs.AI

Skeleton-to-Image Encoding: Enabling Skeleton Representation Learning via Vision-Pretrained Models

This paper introduces Skeleton-to-Image Encoding (S2I), a novel method that transforms heterogeneous 3D skeleton sequences into standardized image-like formats to leverage powerful vision-pretrained models for effective self-supervised skeleton representation learning and cross-modal action recognition.

Siyuan Yang, Jun Liu, Hao Cheng, Chong Wang, Shijian Lu, Hedvig Kjellstrom, Weisi Lin, Alex C. Kot2026-03-09🤖 cs.AI

Technical Report: Automated Optical Inspection of Surgical Instruments

This technical report details a collaboration with industry leaders in Pakistan's Sialkot surgical cluster to develop an Automated Optical Inspection system using deep learning models (YOLOv8, ResNet-152, and EfficientNet-b4) on a new dataset of 4,414 images to detect manufacturing defects in surgical instruments, thereby enhancing patient safety and manufacturing quality.

Zunaira Shafqat, Atif Aftab Ahmed Jilani, Qurrat Ul Ain2026-03-09🤖 cs.AI