Offline Dynamic Inventory and Pricing Strategy: Addressing Censored and Dependent Demand

This paper proposes a novel data-driven framework using offline reinforcement learning and survival analysis to estimate optimal pricing and inventory control policies in sequential environments with censored and dependent demand, overcoming challenges like missing profit information and non-stationarity by approximating the problem as a high-order Markov decision process.

Korel Gundem, Zhengling Qi | 2026-03-12 | 📊 stat

Scalable Multi-Task Learning through Spiking Neural Networks with Adaptive Task-Switching Policy for Intelligent Autonomous Agents

The paper proposes SwitchMT, a novel methodology for scalable multi-task learning in resource-constrained autonomous agents that combines a Deep Spiking Q-Network with active dendrites and an adaptive task-switching policy to effectively mitigate task interference and outperform state-of-the-art methods in Atari games.

Rachmad Vidya Wicaksana Putra, Avaneesh Devkota, Muhammad Shafique | 2026-03-12 | 🤖 cs.AI

Large Language Model Psychometrics: A Systematic Review of Evaluation, Validation, and Enhancement

This systematic review introduces the emerging interdisciplinary field of LLM Psychometrics, which applies psychometric theories and instruments to develop comprehensive evaluation frameworks for measuring human-like psychological constructs in large language models, ultimately guiding the creation of more robust, human-centered AI systems.

Haoran Ye, Jing Jin, Yuhang Xie, Xin Zhang, Guojie Song | 2026-03-12 | 💬 cs.CL

Consistency-based Abductive Reasoning over Perceptual Errors of Multiple Pre-trained Models in Novel Environments

This paper proposes a consistency-based abductive reasoning framework that integrates predictions from multiple pre-trained models at test time to mitigate performance degradation in novel environments, achieving significant improvements in accuracy and F1-score over individual models and standard ensembles by selecting a subset of predictions that maximizes coverage while minimizing logical inconsistencies.

Mario Leiva, Noel Ngu, Joshua Shay Kricheli, Aditya Taparia, Ransalu Senanayake, Paulo Shakarian, Nathaniel Bastian, John Corcoran, Gerardo Simari | 2026-03-12 | 🤖 cs.AI

Comparative Analysis of Modern Machine Learning Models for Retail Sales Forecasting

This study demonstrates that for retail sales forecasting characterized by intermittent demand and missing data, localized tree-based ensemble methods like XGBoost outperform sophisticated deep learning architectures, suggesting that aligning model selection with specific problem constraints is more critical than architectural complexity.

Luka Hobor, Mario Brcic, Lidija Polutnik, Ante Kapetanovic | 2026-03-12 | 🤖 cs.LG

Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions

The paper introduces ReLIFT, a novel training framework that interleaves reinforcement learning with online supervised fine-tuning on challenging questions, enabling large language models to acquire new knowledge and reasoning patterns beyond their original capabilities while achieving superior performance with significantly less demonstration data.

Lu Ma, Hao Liang, Meiyi Qiang, Lexiang Tang, Xiaochen Ma, Zhen Hao Wong, Junbo Niu, Chengyu Shen, Runming He, Yanhao Li, Bin Cui, Wentao Zhang | 2026-03-12 | 🤖 cs.AI

Technological folie à deux: Feedback Loops Between AI Chatbots and Mental Illness

This paper argues that the interaction between human cognitive biases and AI chatbot behaviors like sycophancy creates dangerous feedback loops that can destabilize beliefs and exacerbate mental illness, necessitating coordinated interventions across clinical, technical, and regulatory domains.

Sebastian Dohnány, Zeb Kurth-Nelson, Eleanor Spens, Lennart Luettgau, Alastair Reid, Iason Gabriel, Christopher Summerfield, Murray Shanahan, Matthew M Nour | 2026-03-12 | 🧬 q-bio

Shadow in the Cache: Unveiling and Mitigating Privacy Risks of KV-cache in LLM Inference

This paper reveals that the Key-Value (KV) cache used to accelerate Large Language Model inference is vulnerable to privacy attacks that allow attackers to reconstruct sensitive user inputs, and it proposes KV-Cloak, a lightweight and efficient obfuscation defense that effectively prevents such leakage without compromising model accuracy or performance.

Zhifan Luo, Shuo Shao, Su Zhang, Lijing Zhou, Yuke Hu, Chenxu Zhao, Zhihao Liu, Zhan Qin | 2026-03-12 | 💬 cs.CL

The Yokai Learning Environment: Tracking Beliefs Over Space and Time

This paper introduces the Yokai Learning Environment (YLE), a new open-source benchmark for zero-shot coordination that overcomes the saturation of the Hanabi Learning Environment by requiring agents to track moving cards and reason under ambiguous hints, thereby revealing that current state-of-the-art methods fail to maintain consistent internal models when paired with unseen partners.

Constantin Ruhdorfer, Matteo Bortoletto, Johannes Forkel, Jakob Foerster, Andreas Bulling | 2026-03-12 | 🤖 cs.AI

Global Minimizers of Sigmoid Contrastive Loss

This paper theoretically characterizes the global minimizers of sigmoid contrastive loss as $(\mathsf{m}, \mathsf{b}_{\mathsf{rel}})$-Constellations, providing a rigorous explanation for the success of SigLIP models, the origin of the modality gap, and the necessary dimensionality for high-quality representations while proposing an improved reparameterization for training dynamics.

Kiril Bangachev, Guy Bresler, Iliyas Noman, Yury Polyanskiy | 2026-03-12 | 🤖 cs.LG