cs.CL papers | Gist.Science

Adaptive Social Learning via Mode Policy Optimization for Language Agents

This paper proposes the Adaptive Social Learning (ASL) framework, featuring the Adaptive Mode Policy Optimization (AMPO) algorithm, to enable language agents to dynamically switch between intuitive and deliberative reasoning modes based on context, thereby achieving superior task performance and token efficiency compared to existing baselines such as GPT-4o and GRPO.

Minzheng Wang, Yongbin Li, Haobo Wang + 6 more | 2026-03-04 | 🤖 cs.AI

Talk to Your Slides: High-Efficiency Slide Editing via Language-Driven Structured Data Manipulation

This paper introduces "Talk to Your Slides," a high-efficiency slide editing agent that leverages language-driven structured data manipulation instead of visual perception to achieve faster, more accurate, and cost-effective text-centric and formatting modifications compared to Multimodal LLM-based GUI agents, supported by the newly proposed TSBench benchmark.

Kyudan Jung, Hojun Cho, Jooyeol Yun + 3 more | 2026-03-04 | 💬 cs.CL

Efficient Agent Training for Computer Use

The paper introduces PC Agent-E, an efficient training framework that synthesizes diverse action decisions using Claude 3.7 Sonnet to augment a small set of 312 human trajectories, resulting in a model that significantly outperforms both human-only training and direct distillation on the WindowsAgentArena-V2 benchmark.

Yanheng He, Jiahe Jin, Pengfei Liu | 2026-03-04 | 🤖 cs.AI

REFLEX: Metacognitive Reasoning for Reflective Zero-Shot Robotic Planning with Large Language Models

This paper introduces REFLEX, a metacognitive framework that empowers LLM-driven robotic agents to decompose skills, reflect on failures, and creatively synthesize novel solutions, thereby significantly enhancing their zero-shot and few-shot performance in complex multi-robot collaboration tasks.

Wenjie Lin, Jin Wei-Kocsis, Jiansong Zhang + 4 more | 2026-03-04 | 💬 cs.CL

BitBypass: A New Direction in Jailbreaking Aligned Large Language Models with Bitstream Camouflage

This paper introduces BitBypass, a novel black-box jailbreak attack that exploits hyphen-separated bitstream camouflage to bypass the safety alignment of state-of-the-art large language models, demonstrating superior stealth and success rates compared to existing adversarial methods.

Kalyan Nakka, Nitesh Saxena | 2026-03-04 | 💬 cs.CL

DiaBlo: Diagonal Blocks Are Sufficient For Finetuning

DiaBlo is a parameter-efficient fine-tuning method that updates only the diagonal blocks of model weight matrices, offering a simple, stable, and theoretically grounded alternative to LoRA that achieves competitive performance with comparable memory efficiency and training speed.

Selcuk Gurses, Aozhong Zhang, Yanxia Deng + 5 more | 2026-03-04 | 🤖 cs.AI
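
The DiaBlo summary above says only that the method updates the diagonal blocks of weight matrices. As a rough illustration of that idea, and not the authors' implementation, the sketch below builds a mask that selects equal-sized diagonal blocks of a square matrix and applies an update only inside them. The function names, the square-matrix restriction, and the equal block size are all assumptions made for this example.

```python
import numpy as np


def diagonal_block_mask(dim: int, num_blocks: int) -> np.ndarray:
    """Return a boolean mask that is True only inside the diagonal blocks.

    Assumes a square dim x dim matrix split into equal-sized blocks
    (a simplification chosen for this sketch).
    """
    if dim % num_blocks != 0:
        raise ValueError("dim must be divisible by num_blocks")
    size = dim // num_blocks
    mask = np.zeros((dim, dim), dtype=bool)
    for b in range(num_blocks):
        start = b * size
        mask[start:start + size, start:start + size] = True
    return mask


def apply_block_update(weight: np.ndarray, update: np.ndarray,
                       num_blocks: int) -> np.ndarray:
    """Add `update` to `weight`, but only within the diagonal blocks."""
    mask = diagonal_block_mask(weight.shape[0], num_blocks)
    # Entries outside the diagonal blocks receive a zero update.
    return weight + np.where(mask, update, 0.0)


if __name__ == "__main__":
    w = np.zeros((4, 4))
    delta = np.ones((4, 4))
    print(apply_block_update(w, delta, num_blocks=2))
```

With a 4 x 4 matrix and two blocks, only the two 2 x 2 blocks on the diagonal change, so 8 of the 16 entries are trainable in this toy setting.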

Go-Browse: Training Web Agents with Structured Exploration

The paper introduces Go-Browse, a method that uses structured graph-based exploration to automatically collect a large-scale dataset of web agent trajectories, which, when used to fine-tune a 7B language model, achieves state-of-the-art performance on the WebArena benchmark.

Apurva Gandhi, Graham Neubig | 2026-03-04 | 💬 cs.CL

HSSBench: Benchmarking Humanities and Social Sciences Ability for Multimodal Large Language Models

This paper introduces HSSBench, a comprehensive multilingual benchmark featuring over 13,000 samples generated through a novel expert-agent collaboration pipeline, designed to evaluate and address the current limitations of Multimodal Large Language Models in handling the interdisciplinary and abstract reasoning tasks characteristic of the Humanities and Social Sciences.

Zhaolu Kang, Junhao Gong, Jiaxu Yan + 15 more | 2026-03-04 | 🤖 cs.AI

MedXIAOHE: A Comprehensive Recipe for Building Medical MLLMs

MedXIAOHE is a state-of-the-art medical vision-language foundation model that leverages an entity-aware continual pretraining framework, reinforcement learning, and tool-augmented agentic training to achieve superior diagnostic reasoning, reliability, and performance across diverse medical benchmarks.

Baorong Shi, Bo Cui, Boyuan Jiang + 17 more | 2026-03-04 | ⚡ eess

A Zipf-preserving, long-range correlated surrogate for written language and other symbolic sequences

This paper introduces a novel surrogate model that simultaneously preserves both the empirical symbol frequency distributions (such as Zipf's law) and the long-range correlation structures of symbolic sequences like language and DNA by mapping fractional Gaussian noise onto the original histogram, thereby enabling the disentanglement of structural features and the testing of scaling law origins.

Marcelo A. Montemurro, Mirko Degli Esposti | 2026-03-04 | 🧬 q-bio
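
The surrogate summary above describes mapping fractional Gaussian noise onto the original histogram. One plausible reading, sketched here under stated assumptions rather than taken from the paper, is a rank remapping: generate correlated noise, then place the most frequent symbols at the positions where the noise is largest. The Cholesky-based noise generator, the frequency-rank ordering of symbols, and all function names are assumptions for illustration; the only property the sketch guarantees is that symbol counts are preserved exactly.

```python
from collections import Counter

import numpy as np


def fractional_gaussian_noise(n: int, hurst: float,
                              rng: np.random.Generator) -> np.ndarray:
    """Sample fGn of length n via Cholesky factorisation (fine for small n)."""
    k = np.arange(n)
    # Standard fGn autocovariance at lag k for Hurst exponent H.
    gamma = 0.5 * (np.abs(k + 1) ** (2 * hurst)
                   - 2 * np.abs(k) ** (2 * hurst)
                   + np.abs(k - 1) ** (2 * hurst))
    cov = gamma[np.abs(k[:, None] - k[None, :])]
    return np.linalg.cholesky(cov) @ rng.standard_normal(n)


def rank_mapped_surrogate(symbols: list, hurst: float, seed: int = 0) -> list:
    """Reorder `symbols` so their positions follow the ranks of an fGn sample.

    Assumption: symbols are ordered by frequency rank, most frequent first,
    and assigned to positions from the highest noise value to the lowest.
    """
    rng = np.random.default_rng(seed)
    counts = Counter(symbols)
    ordered = [s for s, c in counts.most_common() for _ in range(c)]
    noise = fractional_gaussian_noise(len(symbols), hurst, rng)
    positions = np.argsort(-noise)  # indices from largest to smallest noise
    surrogate = [None] * len(symbols)
    for pos, sym in zip(positions, ordered):
        surrogate[pos] = sym
    return surrogate


if __name__ == "__main__":
    text = "the cat sat on the mat and the dog sat on the log".split()
    print(rank_mapped_surrogate(text, hurst=0.8))
```

Because the surrogate is a permutation of the input, its frequency distribution matches the original by construction, while the ordering inherits correlations from the noise.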

FeynTune: Large Language Models for High-Energy Theory

This paper introduces FeynTune, a suite of 20 specialized Large Language Models fine-tuned on High-Energy Physics arXiv abstracts that outperform both their base model and leading commercial LLMs in theoretical physics tasks, offering valuable insights for developing domain-specific AI in the field.

Paul Richmond, Prarit Agarwal, Borun Chowdhury + 2 more | 2026-03-02 | ⚛️ hep-th

When ChatGPT is gone: Creativity reverts and homogeneity persists

This study reveals that while ChatGPT temporarily boosts human creative performance, its use ultimately leads to a reversion to baseline creativity and a persistent homogenization of content, challenging the notion that generative AI enhances long-term human creativity.

Qinghan Liu, Yiyong Zhou, Jihao Huang + 1 more | 2024-01-11 | 💬 cs.CL

Anticipating Safety Issues in E2E Conversational AI: Framework and Tooling

This paper addresses the safety challenges of end-to-end conversational AI by surveying the problem landscape, proposing a value-sensitive design framework for release decisions, and providing a suite of tools to help researchers mitigate potential harms.

Emily Dinan, Gavin Abercrombie, A. Stevie Bergman + 4 more | 2021-07-07 | 💬 cs.CL

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

The paper introduces BERT, a novel bidirectional language representation model that leverages pre-training on unlabeled text to achieve state-of-the-art performance across a wide range of natural language processing tasks with minimal fine-tuning.

Jacob Devlin, Ming-Wei Chang, Kenton Lee + 1 more | 2018-10-11 | 💬 cs.CL

Attention Is All You Need

This paper introduces the Transformer, a novel neural network architecture that relies entirely on attention mechanisms while eliminating recurrence and convolutions, demonstrating superior translation quality, faster training times, and strong generalization to other tasks compared to existing state-of-the-art models.

Ashish Vaswani, Noam Shazeer, Niki Parmar + 5 more | 2017-06-12 | 💬 cs.CL

Efficient Estimation of Word Representations in Vector Space

This paper introduces two novel, computationally efficient model architectures for learning high-quality continuous word vector representations from massive datasets, which achieve state-of-the-art performance in measuring syntactic and semantic word similarities at a fraction of the previous computational cost.

Tomas Mikolov, Kai Chen, Greg Corrado + 1 more | 2013-01-16 | 💬 cs.CL
