cs.AI papers | Gist.Science

CRAFT: A Tendon-Driven Hand with Hybrid Hard-Soft Compliance

The paper introduces CRAFT, a low-cost, open-source, tendon-driven anthropomorphic hand that combines rigid links with soft joints and rolling-contact surfaces to achieve robust, repeatable, and versatile contact-rich manipulation.

Leo Lin, Shivansh Patel, Jay Moon, Svetlana Lazebnik, Unnat Jain · 2026-03-13 · 🤖 cs.AI

Increasing intelligence in AI agents can worsen collective outcomes

This paper demonstrates that increasing the sophistication and diversity of AI agents can paradoxically worsen collective outcomes and increase system overload when resources are scarce, with the overall impact depending entirely on the capacity-to-population ratio rather than the agents' inherent intelligence.

Neil F. Johnson · 2026-03-13 · 💰 q-fin

TopoBench: Benchmarking LLMs on Hard Topological Reasoning

This paper introduces TopoBench, a benchmark for evaluating large language models on topological grid puzzles, revealing that their poor performance stems primarily from difficulties in extracting and maintaining spatial constraints rather than inherent reasoning limitations.

Mayug Maniparambil, Nils Hoehing, Janak Kapuriya, Arjun Karuvally, Ellen Rushe, Anthony Ventresque, Noel O'Connor, Fergal Reid · 2026-03-13 · 🤖 cs.AI

Automatic Generation of High-Performance RL Environments

This paper introduces a cost-effective, automated recipe combining generic prompts, hierarchical verification, and iterative agent-assisted repair to translate complex reinforcement learning environments into high-performance implementations with zero sim-to-sim gap, achieving massive throughput gains (up to 22,320x) across diverse use cases including game emulation, physics simulation, and card game engines.

Seth Karten, Rahul Dev Appapogu, Chi Jin · 2026-03-13 · 🤖 cs.LG

FlashMotion: Few-Step Controllable Video Generation with Trajectory Guidance

FlashMotion is a novel training framework that enables high-quality, few-step trajectory-controllable video generation by combining a pre-trained trajectory adapter with a hybrid diffusion-adversarial finetuning strategy, while introducing the FlashBench benchmark to evaluate performance across varying object counts.

Quanhao Li, Zhen Xing, Rui Wang, Haidong Cao, Qi Dai, Daoguo Dong, Zuxuan Wu · 2026-03-13 · 🤖 cs.LG

IsoCompute Playbook: Optimally Scaling Sampling Compute for LLM RL

This paper establishes compute-optimal scaling laws for on-policy LLM reinforcement learning by demonstrating that the ideal number of parallel rollouts per problem increases predictably with the compute budget before saturating, driven by solution sharpening on easy tasks and coverage expansion on hard ones, while providing practical allocation rules for batch size and update steps to maximize training efficiency.

Zhoujun Cheng, Yutao Xie, Yuxiao Qu, Amrith Setlur, Shibo Hao, Varad Pimpalkhute, Tongtong Liang, Feng Yao, Zhengzhong Liu, Eric Xing, Virginia Smith, Ruslan Salakhutdinov, Zhiting Hu, Taylor Killian, Aviral Kumar · 2026-03-13 · 🤖 cs.LG

GlyphBanana: Advancing Precise Text Rendering Through Agentic Workflows

GlyphBanana introduces a training-free, agentic workflow that integrates auxiliary tools to inject glyph templates into latent spaces and attention maps, enabling various Text-to-Image models to achieve superior precision in rendering complex text and mathematical formulas without requiring retraining.

Zexuan Yan, Jiarui Jin, Yue Ma, Shijian Wang, Jiahui Hu, Wenxiang Jiao, Yuan Lu, Linfeng Zhang · 2026-03-13 · 🤖 cs.AI

A Quantitative Characterization of Forgetting in Post-Training

This paper provides a theoretical framework for quantifying forgetting in continual post-training of generative models by analyzing how divergence objectives (forward vs. reverse KL), geometric mode overlap, and replay strategies interact to cause either mass forgetting or controlled component drift.

Krishnakumar Balasubramanian, Shiva Prasad Kasiviswanathan · 2026-03-13 · 📊 stat

BehaviorVLM: Unified Finetuning-Free Behavioral Understanding with Vision-Language Reasoning

BehaviorVLM is a unified, finetuning-free framework that leverages pretrained Vision-Language Models with explicit reasoning steps to achieve scalable, label-light pose estimation and behavioral understanding for freely moving animals without relying on extensive human annotation.

Jingyang Ke, Weihan Li, Amartya Pradhan, Jeffrey Markowitz, Anqi Wu · 2026-03-13 · 🤖 cs.AI

Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections

This paper introduces the MADQA benchmark and a novel accuracy-effort evaluation protocol to demonstrate that while multimodal agents can match human accuracy on document-based tasks, they rely on inefficient brute-force search rather than genuine strategic reasoning and fail to close the performance gap to oracle levels.

Łukasz Borchmann, Jordy Van Landeghem, Michał Turski, Shreyansh Padarha, Ryan Othniel Kearns, Adam Mahdi, Niels Rogge, Clémentine Fourrier, Siwei Han, Huaxiu Yao, Artemis Llabrés, Yiming Xu, Dimosthenis Karatzas, Hao Zhang, Anupam Datta · 2026-03-13 · 💬 cs.CL

Proof-Carrying Materials: Falsifiable Safety Certificates for Machine-Learned Interatomic Potentials

This paper introduces Proof-Carrying Materials (PCM), a rigorous framework combining adversarial falsification, statistical refinement, and formal Lean 4 certification to overcome the high failure rates of single machine-learned interatomic potentials, thereby significantly improving the reliability and discovery yield of high-throughput materials screening.

Abhinaba Basu, Pavan Chakraborty · 2026-03-13 · 🔬 cond-mat.mtrl-sci

Compiling Temporal Numeric Planning into Discrete PDDL+: Extended Version

This paper presents a practical, polynomial-time compilation method that translates temporal planning with durative actions into the PDDL+ language, fully capturing the semantics while preserving plan length up to a constant factor and demonstrating effectiveness on hard temporal numeric problems.

Andrea Micheli, Enrico Scala, Alessandro Valentini · 2026-03-13 · 🤖 cs.AI

WORKSWORLD: A Domain for Integrated Numeric Planning and Scheduling of Distributed Pipelined Workflows

This paper introduces WORKSWORLD, a new domain for numeric planners that automates the joint planning and scheduling of distributed data pipelines by allowing users to define high-level goals without specifying the entire workflow graph, demonstrating the ability to solve complex multi-site problems on commodity hardware.

Taylor Paul, William Regli · 2026-03-13 · 🤖 cs.AI

RDNet: Region Proportion-Aware Dynamic Adaptive Salient Object Detection Network in Optical Remote Sensing Images

This paper proposes RDNet, a salient object detection network for optical remote sensing images that leverages a Swin Transformer backbone and three novel modules (Dynamic Adaptive Detail-aware, Frequency-matching Context Enhancement, and Region Proportion-aware Localization) to overcome challenges related to scale variations and global context modeling, thereby achieving superior detection performance compared to state-of-the-art methods.

Bin Wan, Runmin Cong, Xiaofei Zhou, Hao Fang, Yaoqi Sun, Sam Kwong · 2026-03-13 · 🤖 cs.AI

Portfolio of Solving Strategies in CEGAR-based Object Packing and Scheduling for Sequential 3D Printing

This paper presents Portfolio-CEGAR-SEQ, a parallelized algorithm that leverages modern multi-core CPUs and a portfolio of diverse object arrangement strategies to outperform the original CEGAR-SEQ method in solving the combinatorial challenges of object arrangement and scheduling for sequential 3D printing, often resulting in more efficient use of printing plates.

Pavel Surynek · 2026-03-13 · 🤖 cs.AI

Sparking Scientific Creativity via LLM-Driven Interdisciplinary Inspiration

The paper introduces Idea-Catalyst, a novel LLM-driven framework that enhances scientific creativity by systematically decomposing research goals into domain-agnostic problems to retrieve and synthesize interdisciplinary insights, thereby improving the novelty and insightfulness of brainstorming without premature solution anchoring.

Priyanka Kargupta, Shuhaib Mehri, Dilek Hakkani-Tur, Jiawei Han · 2026-03-13 · 💬 cs.CL

Neural Thickets: Diverse Task Experts Are Dense Around Pretrained Weights

This paper argues that large pretrained models contain a dense distribution of task-specific experts near their pretrained weights, enabling a simple, parallel post-training method that samples random perturbations and ensembles the best performers to achieve results competitive with standard optimization techniques like PPO and GRPO.

Yulu Gan, Phillip Isola · 2026-03-13 · 🤖 cs.LG

Security Considerations for Artificial Intelligence Agents

Drawing from Perplexity's operational experience with general-purpose agentic systems, this paper outlines the unique security failure modes introduced by AI agents, maps their primary attack surfaces, proposes a layered defense strategy, and identifies critical research gaps and standards needed to secure multi-agent systems in alignment with NIST risk management principles.

Ninghui Li, Kaiyuan Zhang, Kyle Polley, Jerry Ma · 2026-03-13 · 🤖 cs.LG

Incremental Neural Network Verification via Learned Conflicts

This paper proposes an incremental neural network verification technique that reuses learned conflicts across related queries via a SAT solver to prune infeasible search spaces early, achieving speedups of up to 1.9x over non-incremental baselines.

Raya Elsaleh, Liam Davis, Haoze Wu, Guy Katz · 2026-03-13 · 🤖 cs.AI

Separable neural architectures as a primitive for unified predictive and generative intelligence

This paper introduces the separable neural architecture (SNA) as a domain-agnostic primitive that unifies predictive and generative intelligence across physics, language, and perception by formalizing a structural inductive bias that factorizes high-dimensional mappings into low-arity components, thereby enabling effective modeling of both chaotic continuous systems and discrete sequences.

Reza T. Batley, Apurba Sarker, Rajib Mostakim, Andrew Klichine, Sourav Saha · 2026-03-13 · 🤖 cs.LG
