How to make the most of your masked language model for protein engineering

This paper introduces a flexible stochastic beam search sampling method for masked language models that optimizes protein properties by evaluating entire-sequence neighborhoods, demonstrating through extensive in silico and in vitro antibody engineering experiments that the choice of sampling strategy is at least as critical as the model itself.

Calvin McCarter, Nick Bhattacharya, Sebastian W. Ober, Hunter Elliott · 2026-03-12 · 🧬 q-bio
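The idea of scoring a beam of candidate sequence edits proposed by a masked model can be illustrated with a toy sketch. This is not the paper's method: the `propose` function stands in for a protein masked LM's per-position token scores, and `fitness` is a hypothetical property oracle (here just counting one residue type); both names and the Gumbel-style noise scale are assumptions.

```python
import random

# Toy alphabet standing in for amino acids; a real setup would use a
# protein masked LM for `propose` and a trained property model for `fitness`.
ALPHABET = "ACDE"

def propose(seq, pos):
    """Stand-in for masked-LM scores over tokens at a masked position."""
    return {tok: 1.0 + (tok == seq[pos]) for tok in ALPHABET}

def fitness(seq):
    """Hypothetical property oracle: here, just the count of 'A' residues."""
    return seq.count("A")

def stochastic_beam_search(seq, beam_width=4, steps=5, seed=0):
    """Keep a beam of sequences; stochastically expand single-site edits."""
    rng = random.Random(seed)
    beam = [seq]
    for _ in range(steps):
        candidates = set(beam)            # keeping the beam makes fitness monotone
        for s in beam:
            pos = rng.randrange(len(s))   # mask one random position
            scores = propose(s, pos)
            # Noise-perturbed scores give a stochastic top-k over tokens
            perturbed = {t: v + rng.gauss(0, 0.5) for t, v in scores.items()}
            for tok, _ in sorted(perturbed.items(), key=lambda kv: -kv[1])[:2]:
                candidates.add(s[:pos] + tok + s[pos + 1:])
        beam = sorted(candidates, key=fitness, reverse=True)[:beam_width]
    return beam[0]

best = stochastic_beam_search("CDEC")
```

Because the current beam is always retained among the candidates, the best fitness in the beam can only improve or stay flat across steps, which is the property that makes beam-style samplers attractive over independent resampling.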

Data-Driven Integration Kernels for Interpretable Nonlocal Operator Learning

This paper introduces a data-driven integration kernel framework that enhances the interpretability and efficiency of nonlocal operator learning in climate modeling by separating nonlocal information aggregation via learnable weighting functions from local nonlinear prediction, thereby achieving competitive performance with fewer parameters and clearer physical insights.

Savannah L. Ferretti, Jerry Lin, Sara Shamekh, Jane W. Baldwin, Michael S. Pritchard, Tom Beucler · 2026-03-12 · 🤖 cs.LG
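The separation described above can be sketched in miniature: a weighting function aggregates a nonlocal column of inputs into a single feature, and a separate local nonlinear map produces the prediction. This is only an illustrative sketch under assumptions, not the paper's framework; the Gaussian `kernel_weights` stands in for a learnable integration kernel, and `local_predictor` stands in for the local network.

```python
import math

def kernel_weights(distances, length_scale=2.0):
    """Stand-in for a learnable integration kernel: a fixed, normalized
    Gaussian weighting over distances (trainable in the real framework)."""
    w = [math.exp(-(d / length_scale) ** 2) for d in distances]
    total = sum(w)
    return [wi / total for wi in w]

def nonlocal_feature(column, center):
    """Aggregate a column of values into one nonlocal feature at `center`."""
    dists = [abs(i - center) for i in range(len(column))]
    weights = kernel_weights(dists)
    return sum(w * v for w, v in zip(weights, column))

def local_predictor(x):
    """Stand-in for the local nonlinear predictor (a fixed tanh unit)."""
    return math.tanh(1.5 * x - 0.2)

column = [0.1, 0.4, 0.9, 0.4, 0.1]          # e.g. a toy vertical profile
feat = nonlocal_feature(column, center=2)    # nonlocal aggregation step
pred = local_predictor(feat)                 # local nonlinear prediction step
```

The appeal of this factorization is interpretability: the learned weights can be inspected directly as "how far away, and how strongly, does the model look," independently of the local nonlinearity.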

Federated Active Learning Under Extreme Non-IID and Global Class Imbalance

This paper introduces FairFAL, an adaptive federated active learning framework that leverages lightweight prediction discrepancy and prototype-guided pseudo-labeling to dynamically select between global and local query models, effectively addressing the challenges of extreme non-IID data and global class imbalance to achieve superior performance over state-of-the-art methods.

Chen-Chen Zong, Sheng-Jun Huang · 2026-03-12 · 🤖 cs.LG

On The Complexity of Best-Arm Identification in Non-Stationary Linear Bandits

This paper addresses the fixed-budget best-arm identification problem in non-stationary linear bandits by establishing a tighter, arm-set-dependent lower bound on error probability and proposing the Adjacent-BAI algorithm, which utilizes an Adjacent-optimal design to achieve minimax-optimal performance that fully leverages the geometric structure of the arm set.

Leo Maynard-Zhang, Zhihan Xiong, Kevin Jamieson, Maryam Fazel · 2026-03-12 · 📊 stat

Causal Concept Graphs in LLM Latent Space for Stepwise Reasoning

This paper introduces Causal Concept Graphs (CCG), a framework that combines task-conditioned sparse autoencoders with differentiable structure learning to map causal dependencies between interpretable latent features in LLMs, demonstrating through the Causal Fidelity Score that graph-guided interventions significantly enhance stepwise reasoning performance compared to existing tracing and random baselines.

Md Muntaqim Meherab, Noor Islam S. Mohammad, Faiza Feroz · 2026-03-12 · 🤖 cs.LG

Effective Dataset Distillation for Spatio-Temporal Forecasting with Bi-dimensional Compression

The paper introduces STemDist, the first dataset distillation method designed for spatio-temporal forecasting that simultaneously compresses both spatial and temporal dimensions through a hybrid cluster-level and subset-based approach, achieving significantly faster training, reduced memory usage, and lower prediction errors compared to existing methods.

Taehyung Kwon, Yeonje Choi, Yeongho Kim, Kijung Shin · 2026-03-12 · 🤖 cs.LG

Domain-Adaptive Health Indicator Learning with Degradation-Stage Synchronized Sampling and Cross-Domain Autoencoder

This paper proposes a domain-adaptive framework featuring degradation-stage synchronized batch sampling and a cross-domain aligned fusion large autoencoder to overcome distribution mismatches and temporal dependency limitations in health indicator learning, achieving significant performance improvements on industrial datasets.

Jungho Choo, Hanbyeol Park, Gawon Lee, Yunkyung Park, Hyerim Bae · 2026-03-12 · 🤖 cs.LG

The Curse and Blessing of Mean Bias in FP4-Quantized LLM Training

This paper identifies a coherent rank-one mean bias as the primary cause of numerical instability in low-bit LLM training and demonstrates that simply subtracting this mean restores stability and performance in FP4 quantization, offering a hardware-efficient alternative to complex spectral methods.

Hengjie Cao, Zhendong Huang, Mengyi Chen, Yifeng Yang, Fanqi Yu, Ruijun Huang, Fang Dong, Xin Zhang, Jixian Zhou, Anrui Chen, Mingzhi Dong, Yujiang Wang, Jinlong Hou, Qin Lv, Yuan Cheng, Tun Lu, Fan Yang, Li Shang · 2026-03-12 · 🤖 cs.LG
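The mean-subtraction idea is simple enough to demonstrate on a toy vector: when values share a large coherent mean, scaling to the FP4 (E2M1) grid wastes the format's range on that mean, and removing it first sharply reduces round-trip error. This is a minimal sketch of the general idea, not the paper's training procedure; the scaling convention (max value mapped to FP4's largest level) is an assumption.

```python
# FP4 (E2M1) representable magnitudes, per the standard E2M1 value grid.
FP4_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
LEVELS = sorted({s * m for m in FP4_MAGNITUDES for s in (-1.0, 1.0)})

def quantize_fp4(x, scale):
    """Round x/scale to the nearest FP4 level, then rescale."""
    return min(LEVELS, key=lambda q: abs(q - x / scale)) * scale

def fp4_roundtrip(vec, subtract_mean):
    """Quantize a vector, optionally removing its mean component first."""
    mean = sum(vec) / len(vec) if subtract_mean else 0.0
    centered = [v - mean for v in vec]
    scale = max(abs(v) for v in centered) / 6.0  # map max |value| to FP4's top level
    return [quantize_fp4(v, scale) + mean for v in centered]

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# A vector whose large shared mean drowns out the per-element variation.
vec = [10.0 + d for d in (-0.3, -0.1, 0.0, 0.1, 0.2, 0.3)]
err_raw = mse(vec, fp4_roundtrip(vec, subtract_mean=False))
err_centered = mse(vec, fp4_roundtrip(vec, subtract_mean=True))
```

Without centering, every element of `vec` lands on the same FP4 level (the mean dominates the scale), so the fine structure is lost; subtracting the mean first spends the FP4 levels on the residuals instead.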

Unlearning the Unpromptable: Prompt-free Instance Unlearning in Diffusion Models

This paper introduces a prompt-free instance unlearning method for diffusion models that effectively removes specific, unpromptable undesired outputs—such as individual faces or culturally inaccurate depictions—while preserving the model's overall integrity through a novel approach combining image editing, timestep-aware weighting, and gradient surgery.

Kyungryeol Lee, Kyeonghyun Lee, Seongmin Hong, Byung Hyun Lee, Se Young Chun · 2026-03-12 · 🤖 cs.LG

Spatio-Temporal Forecasting of Retaining Wall Deformation: Mitigating Error Accumulation via Multi-Resolution ConvLSTM Stacking Ensemble

This study introduces a multi-resolution ConvLSTM stacking ensemble framework that effectively mitigates error accumulation and enhances the accuracy of long-horizon retaining wall deformation forecasting by integrating models trained on diverse temporal input resolutions.

Jihoon Kim (Department of Civil and Environmental Engineering, Hongik University, Seoul, Republic of Korea), Heejung Youn (Department of Civil and Environmental Engineering, Hongik University, Seoul, Republic of Korea) · 2026-03-12 · 🤖 cs.LG