cs.AI papers | Gist.Science

An Updated Assessment of Reinforcement Learning for Macro Placement

This paper presents an updated and rigorous assessment of Google's deep reinforcement learning approach (Circuit Training) for macro placement by introducing stronger baselines, new sub-10nm benchmarks, and commercial-grade evaluations to address reproducibility challenges and identify remaining open questions regarding scalability and pre-training methodologies.

Chung-Kuan Cheng, Andrew B. Kahng, Sayak Kundu, Yucheng Wang, Zhiang Wang | 2026-03-12 | 🤖 cs.LG

Mindstorms in Natural Language-Based Societies of Mind

This paper proposes Natural Language-Based Societies of Mind (NLSOMs), a modular framework where large multimodal neural networks communicate via natural language to solve complex AI tasks more effectively than single models, while also exploring the emerging social, economic, and structural challenges of scaling these heterogeneous societies to include billions of agents.

Mingchen Zhuge, Haozhe Liu, Francesco Faccio, Dylan R. Ashley, Róbert Csordás, Anand Gopalakrishnan, Abdullah Hamdi, Hasan Abed Al Kader Hammoud, Vincent Herrmann, Kazuki Irie, Louis Kirsch, Bing Li, Guohao Li, Shuming Liu, Jinjie Mai, Piotr Piękos, Aditya Ramesh, Imanol Schlag, Weimin Shi, Aleksandar Stanic, Wenyi Wang, Yuhui Wang, Mengmeng Xu, Deng-Ping Fan, Bernard Ghanem, Jürgen Schmidhuber | 2026-03-12 | 💬 cs.CL

Large Language Models for Travel Behavior Prediction

This study demonstrates that large language models, utilized through zero-shot prompting or as embedding generators for supervised learning, offer a flexible and data-efficient alternative to traditional numerical models for predicting travel behavior.

Baichuan Mo, Hanyong Xu, Ruoyun Ma, Jung-Hoon Cho, Dingyi Zhuang, Xiaotong Guo, Jinhua Zhao | 2026-03-12 | 💬 cs.CL

Optimal Transport Aggregation for Distributed Mixture-of-Experts

This paper proposes an optimal transport-based aggregation framework that efficiently combines locally trained Mixture-of-Experts models into a global estimator with a single communication step, achieving performance comparable to centralized training while significantly reducing computational and communication costs.

Faïcel Chamroukhi, Nhat Thien Pham | 2026-03-12 | 📊 stat

Personalizing explanations of AI-driven hints to users' characteristics: an empirical evaluation

This paper presents an empirical study demonstrating that personalizing AI-driven hint explanations in an Intelligent Tutoring System for students with low Need for Cognition and Conscientiousness significantly increases their engagement, understanding, and learning outcomes.

Vedant Bahel, Harshinee Sriram, Cristina Conati | 2026-03-12 | 🤖 cs.AI

Toward Adaptive Large Language Models Structured Pruning via Hybrid-grained Weight Importance Assessment

This paper introduces HyWIA, a novel adaptive structured pruning method for Large Language Models that leverages an attention mechanism to hybridize fine-grained and coarse-grained weight importance assessments, thereby significantly outperforming existing approaches in accuracy retention across various benchmarks.

Jun Liu, Zhenglun Kong, Pu Zhao + 9 more | 2026-03-12 | 💬 cs.CL

Modelling Language using Large Language Models

This paper argues that large language models serve as valuable scientific models of public languages as external social entities, defending their utility against claims of lacking linguistic insight and proposing a framework to interpret their internal mechanisms as model construals.

Jumbly Grindrod | 2026-03-12 | 💬 cs.CL

Explainable machine learning for predicting shellfish toxicity in the Adriatic Sea using long-term monitoring data of HABs

This study utilizes a 28-year dataset and explainable machine learning techniques, specifically a Random Forest model, to identify key toxic phytoplankton species and environmental factors driving diarrhetic shellfish poisoning in the Gulf of Trieste, thereby enhancing early warning systems for sustainable aquaculture.

Martin Marzidovšek, Janja Francé, Vid Podpečan + 3 more | 2026-03-12 | 🤖 cs.AI

Synthesizing Interpretable Control Policies through Large Language Model Guided Search

This paper proposes a novel method that leverages Large Language Models to evolve interpretable control policies represented as standard Python programs, offering a transparent and verifiable alternative to black-box neural network controllers for dynamical systems.

Carlo Bosio, Mark W. Mueller | 2026-03-12 | ⚡ eess

EoRA: Fine-tuning-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation

The paper introduces EoRA, a fine-tuning-free method that utilizes eigenspace low-rank approximation and an optimized CUDA kernel to significantly recover the accuracy of compressed LLMs while offering flexible trade-offs between performance and computational overhead.

Shih-Yang Liu, Maksim Khadkevich, Nai Chit Fung, Charbel Sakr, Chao-Han Huck Yang, Chien-Yi Wang, Saurav Muralidharan, Hongxu Yin, Kwang-Ting Cheng, Jan Kautz, Yu-Chiang Frank Wang, Pavlo Molchanov, Min-Hung Chen | 2026-03-12 | 💬 cs.CL

Token Cleaning: Fine-Grained Data Selection for LLM Supervised Fine-Tuning

This paper proposes a generic token cleaning pipeline for supervised fine-tuning of large language models that filters out uninformative tokens based on their influence on model updates, thereby improving downstream performance by prioritizing data quality over quantity at the token level.

Jinlong Pang, Na Di, Zhaowei Zhu, Jiaheng Wei, Hao Cheng, Chen Qian, Yang Liu | 2026-03-12 | 💬 cs.CL

Boosting Cross-problem Generalization in Diffusion-Based Neural Combinatorial Solver via Inference Time Adaptation

This paper proposes DIFU-Ada, a training-free inference-time adaptation framework that enables diffusion-based neural combinatorial solvers to achieve zero-shot cross-problem and cross-scale generalization, as demonstrated by a TSP-trained model successfully solving variants like PCTSP and OP.

Haoyu Lei, Kaiwen Zhou, Yinchuan Li, Zhitang Chen, Farzan Farnia | 2026-03-12 | 🤖 cs.LG

Talking like Piping and Instrumentation Diagrams (P&IDs)

This paper proposes a methodology that enables natural language interaction with Piping and Instrumentation Diagrams (P&IDs) by converting DEXPI data into labeled property graphs and integrating them with Large Language Models via graph-based retrieval-augmented generation to enhance data retrieval, reduce hallucinations, and support engineering tasks.

Achmad Anggawirya Alimin, Dominik P. Goldstein, Lukas Schulze Balhorn + 1 more | 2026-03-12 | 🤖 cs.AI

SCAM: A Real-World Typographic Robustness Evaluation for Multimodal Foundation Models

This paper introduces SCAM, the largest and most diverse real-world dataset of typographic attack images, to evaluate and demonstrate the significant vulnerability of state-of-the-art multimodal foundation models to such attacks while providing empirical insights into how model architecture and training data influence robustness.

Justus Westerhoff, Erblina Purelku, Jakob Hackstein + 4 more | 2026-03-12 | 🤖 cs.AI

Offline Dynamic Inventory and Pricing Strategy: Addressing Censored and Dependent Demand

This paper proposes a novel data-driven framework using offline reinforcement learning and survival analysis to estimate optimal pricing and inventory control policies in sequential environments with censored and dependent demand, overcoming challenges like missing profit information and non-stationarity by approximating the problem as a high-order Markov decision process.

Korel Gundem, Zhengling Qi | 2026-03-12 | 📊 stat

Scalable Multi-Task Learning through Spiking Neural Networks with Adaptive Task-Switching Policy for Intelligent Autonomous Agents

The paper proposes SwitchMT, a novel methodology for scalable multi-task learning in resource-constrained autonomous agents that combines a Deep Spiking Q-Network with active dendrites and an adaptive task-switching policy to effectively mitigate task interference and outperform state-of-the-art methods in Atari games.

Rachmad Vidya Wicaksana Putra, Avaneesh Devkota, Muhammad Shafique | 2026-03-12 | 🤖 cs.AI

Large Language Model Psychometrics: A Systematic Review of Evaluation, Validation, and Enhancement

This systematic review introduces the emerging interdisciplinary field of LLM Psychometrics, which applies psychometric theories and instruments to develop comprehensive evaluation frameworks for measuring human-like psychological constructs in large language models, ultimately guiding the creation of more robust, human-centered AI systems.

Haoran Ye, Jing Jin, Yuhang Xie, Xin Zhang, Guojie Song | 2026-03-12 | 💬 cs.CL

REI-Bench: Can Embodied Agents Understand Vague Human Instructions in Task Planning?

This paper introduces REI-Bench, the first benchmark for evaluating robot task planning under vague referring expressions, revealing that such vagueness significantly degrades performance and demonstrating that a task-oriented context cognition approach effectively mitigates this issue to improve accessibility for non-expert users.

Chenxi Jiang, Chuhao Zhou, Jianfei Yang | 2026-03-12 | 💬 cs.CL

Training with Pseudo-Code for Instruction Following

This paper proposes a training-time approach that fine-tunes Large Language Models using instruction-tuning data augmented with pseudo-code representations of natural language instructions, resulting in significant improvements in instruction-following reliability and overall reasoning performance across multiple benchmarks.

Prince Kumar, Rudra Murthy, Riyaz Bhat, Danish Contractor | 2026-03-12 | 💬 cs.CL

LLLMs: A Data-Driven Survey of Evolving Research on Limitations of Large Language Models

This paper presents a data-driven survey of 14,648 studies from 2022 to early 2025, revealing that research on the limitations of large language models (LLLMs) has surged to over 30% of all LLM-related work, with reasoning, generalization, and hallucination being the most prominent areas of focus.

Aida Kostikova, Zhipin Wang, Deidamea Bajri, Ole Pütz, Benjamin Paaßen, Steffen Eger | 2026-03-12 | 💬 cs.CL
