Template-assisted Contrastive Learning of Task-oriented Dialogue Sentence Embeddings

This paper introduces Template-aware Dialogue Sentence Embedding (TaDSE), a novel self-supervised contrastive learning method that leverages easily obtainable token-level template information to generate high-quality sentence embeddings for task-oriented dialogues, achieving significant performance improvements over state-of-the-art methods on five benchmark datasets.

Minsik Oh, Jiwei Li, Guoyin Wang · 2026-04-14 · cs.CL
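The objective family TaDSE builds on is in-batch contrastive learning of sentence embeddings. Below is a minimal NumPy sketch of a generic SimCSE-style InfoNCE loss; the template-conditioning that defines TaDSE itself is not reproduced here, and the function name and temperature value are illustrative only.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.05):
    """Generic in-batch contrastive (InfoNCE) loss.

    anchors, positives: (batch, dim) arrays; row i of `positives` is the
    positive pair for row i of `anchors`; every other row in the batch
    serves as an in-batch negative.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    sims = a @ p.T / temperature                      # (batch, batch) logits
    logits = sims - sims.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))               # diagonal = positive pairs
```

Training pushes each anchor toward its own positive (the diagonal) and away from the other sentences in the batch.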

SCITUNE: Aligning Large Language Models with Human-Curated Scientific Multimodal Instructions

The paper introduces SciTune, a framework that aligns large language models with human-curated scientific multimodal instructions, resulting in a model (LLaMA-SciTune) that significantly outperforms state-of-the-art systems on scientific visual and language benchmarks, even surpassing human performance in certain categories.

Sameera Horawalavithana, Sai Munikoti, Ian Stewart, Henry Kvinge, Karl Pazdernik · 2026-04-14 · cs.CL

CROP: Conservative Reward for Model-based Offline Policy Optimization

This paper proposes CROP, a model-based offline reinforcement learning algorithm that introduces a conservative reward estimator to mitigate distribution shift and overestimation by minimizing both estimation error and the rewards of random actions, achieving competitive performance through a streamlined objective.

Hao Li, Xiao-Hu Zhou, Shu-Hai Li, Mei-Jiang Gui, Xiao-Liang Xie, Shi-Qi Liu, Shuang-Yi Wang, Zhen-Qiu Feng, Zeng-Guang Hou · 2026-04-14 · cs.LG
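A minimal sketch of the idea stated in the summary: fit a reward model that minimizes estimation error on logged data while also pushing down the predicted rewards of randomly sampled actions. This uses a plain linear model with gradient descent; all names, hyperparameters, and the specific penalty form here are illustrative assumptions, not CROP's actual algorithm.

```python
import numpy as np

def fit_conservative_reward(features, rewards, random_features,
                            beta=0.5, lr=0.1, steps=500):
    """features: (N, d) state-action features with observed `rewards` (N,);
    random_features: (M, d) features of randomly sampled (out-of-data) actions;
    beta: weight of the conservatism penalty on random-action rewards."""
    w = np.zeros(features.shape[1])
    for _ in range(steps):
        err = features @ w - rewards                  # estimation error term
        grad = features.T @ err / len(rewards)        # gradient of 0.5 * MSE
        grad += beta * random_features.mean(axis=0)   # gradient of beta * mean reward
        w -= lr * grad
    return w
```

With `beta = 0` this reduces to ordinary least-squares reward regression; increasing `beta` lowers predicted rewards on random actions, which is the conservatism that discourages the policy from exploiting out-of-distribution overestimates.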

Deep Optimizer States: Towards Scalable Training of Transformer Models Using Interleaved Offloading

This paper introduces "Deep Optimizer States," a novel technique that dynamically interleaves CPU and GPU computations by splitting model subgroups based on a performance model to exploit memory utilization fluctuations, thereby overcoming memory bottlenecks and achieving 2.5× faster training iterations compared to state-of-the-art offloading approaches.

Avinash Maurya, Jie Ye, M. Mustafa Rafique, Franck Cappello, Bogdan Nicolae · 2026-04-14 · cs.LG

The Phantom of PCIe: Constraining Generative Artificial Intelligences for Practical Peripherals Trace Synthesizing

This paper introduces Phantom, a framework that combines generative AI with a novel PCIe-specific constraint filter to eliminate hallucinations and synthesize high-fidelity, protocol-compliant Transaction Layer Packet (TLP) traces for practical device simulation.

Zhibai Huang, Chen Chen, James Yen, Yihan Shen, Yongchen Xie, Zhixiang Wei, Kailiang Xu, Yun Wang, Fangxin Liu, Tao Song, Mingyuan Xia, Zhengwei Qi · 2026-04-14 · cs.LG
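The generate-then-filter pattern the summary describes can be sketched generically: a generative model proposes candidate packets and a protocol-constraint filter discards any that violate the rules, so only compliant samples reach the trace. The toy "TLP" below is just a dict with one length field; real PCIe TLP constraints are far richer, and this sketch is an assumption-laden illustration of the pattern, not Phantom's filter.

```python
def synthesize_trace(generate, is_protocol_compliant, n):
    """Draw from `generate()` until `n` protocol-compliant samples are kept.

    generate: zero-argument callable producing one candidate packet.
    is_protocol_compliant: predicate rejecting hallucinated/invalid packets.
    """
    trace = []
    while len(trace) < n:
        candidate = generate()
        if is_protocol_compliant(candidate):
            trace.append(candidate)
    return trace
```

Rejection filtering like this guarantees every emitted packet satisfies the constraints, at the cost of extra generator calls when the model's hit rate is low.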

WebLLM: A High-Performance In-Browser LLM Inference Engine

The paper introduces WebLLM, an open-source JavaScript framework that leverages WebGPU and WebAssembly to enable high-performance, privacy-preserving large language model inference entirely within web browsers, achieving up to 80% of native device performance.

Charlie F. Ruan, Yucheng Qin, Akaash R. Parthasarathy, Xun Zhou, Ruihang Lai, Hongyi Jin, Yixin Dong, Bohan Hou, Meng-Shiun Yu, Yiyan Zhai, Sudeep Agarwal, Hangrui Cao, Siyuan Feng, Tianqi Chen · 2026-04-14 · cs.LG

A Multiparty Homomorphic Encryption Approach to Confidential Federated Kaplan Meier Survival Analysis

This paper introduces a privacy-preserving federated framework using threshold CKKS homomorphic encryption to enable multi-institutional Kaplan-Meier survival analysis, achieving results that closely match a centralized analysis while preventing the reconstruction of sensitive individual records through encrypted aggregation and threshold decryption.

Narasimha Raghavan Veeraragavan, Svetlana Boudko, Jan Franz Nygård · 2026-04-14 · stat
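For reference, the statistic the encrypted protocol computes is the standard Kaplan-Meier estimator: at each event time, multiply the running survival probability by (1 − deaths / at-risk). The plain, centralized sketch below shows only that estimator, not the paper's federated or encrypted machinery.

```python
def kaplan_meier(times, events):
    """Plain (unencrypted, centralized) Kaplan-Meier estimator.

    times: observed durations; events: 1 = event occurred, 0 = censored.
    Returns a list of (time, survival_probability) curve steps.
    """
    n_at_risk = len(times)
    order = sorted(range(len(times)), key=lambda i: times[i])
    survival, curve = 1.0, []
    i = 0
    while i < len(order):
        t = times[order[i]]
        deaths = withdrawn = 0
        while i < len(order) and times[order[i]] == t:   # group ties at time t
            if events[order[i]]:
                deaths += 1
            else:
                withdrawn += 1
            i += 1
        if deaths:                                        # curve steps only at events
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= deaths + withdrawn                   # censored leave the risk set
    return curve
```

In the federated setting, the per-time death and at-risk counts would be aggregated across institutions under encryption rather than computed from pooled raw records.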

Influencing Humans to Conform to Preference Models for RLHF

This paper demonstrates that human preference data quality for Reinforcement Learning from Human Feedback (RLHF) can be significantly improved by designing specific interventions—such as visualizing underlying model quantities, training users on the model, and modifying elicitation questions—to align human expression with the algorithm's preference model assumptions without altering their underlying reward functions.

Stephane Hatgis-Kessell, W. Bradley Knox, Serena Booth, Peter Stone · 2026-04-14 · cs.LG
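The preference model most RLHF pipelines commonly assume is Bradley-Terry: the probability that a human prefers option A over option B is a logistic function of the difference in their returns. The interventions described above aim to make human choices better match this assumption; the sketch below is just the model itself, with a hypothetical function name.

```python
import math

def bradley_terry_preference(return_a, return_b):
    """P(A preferred over B) under the Bradley-Terry / logistic model:
    sigmoid of the return difference."""
    return 1.0 / (1.0 + math.exp(return_b - return_a))
```

Equal returns give indifference (probability 0.5), and the two orderings' probabilities always sum to 1, which is exactly the consistency that real human annotators often violate.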

ExPath: Targeted Pathway Inference for Biological Knowledge Bases via Graph Learning and Explanation

ExPath is a novel subgraph inference framework that integrates experimental molecular data and biological foundation models to identify biologically meaningful targeted pathways in knowledge bases, significantly outperforming existing explainers in fidelity and pathway length preservation.

Rikuto Kotoge, Ziwei Yang, Zheng Chen, Yushun Dong, Yasuko Matsubara, Jimeng Sun, Yasushi Sakurai · 2026-04-14 · cs.LG

If an LLM Were a Character, Would It Know Its Own Story? Evaluating Lifelong Learning in LLMs

This paper introduces LIFESTATE-BENCH, a novel benchmark utilizing narrative datasets like Hamlet to evaluate lifelong learning in large language models, revealing that while non-parametric methods outperform parametric ones in managing stateful interactions, all models still struggle with catastrophic forgetting over extended engagements.

Siqi Fan, Xiusheng Huang, Yiqun Yao, Xuezhi Fang, Kang Liu, Peng Han, Shuo Shang, Aixin Sun, Yequan Wang · 2026-04-14 · cs.CL