cs.AI papers | Gist.Science

Unveiling the Potential of Quantization with MXFP4: Strategies for Quantization Error Reduction

This paper introduces two software-only techniques, Overflow-Aware Scaling (OAS) and Macro Block Scaling (MBS), that significantly reduce the accuracy gap between the hardware-efficient MXFP4 format and NVIDIA's NVFP4 standard in Large Language Models, achieving near-parity performance with minimal computational overhead.

Jatin Chhugani, Geonhwa Jeong, Bor-Yiing Su, Yunjie Pan, Hanmei Yang, Aayush Ankit, Jiecao Yu, Summer Deng, Yunqing Chen, Nadathur Satish, Changkyu Kim | 2026-03-11 | 🤖 cs.AI

Design Conductor: An agent autonomously builds a 1.5 GHz Linux-capable RISC-V CPU

The paper introduces Design Conductor, an autonomous agent that leverages frontier models to independently design, verify, and generate a tape-out ready 1.48 GHz RISC-V CPU (VerCore) from a text specification to GDSII layout in just 12 hours, marking the first instance of an agent building a complete, working CPU end-to-end.

The Verkor Team, Ravi Krishna, Suresh Krishna, David Chin | 2026-03-11 | 🤖 cs.AI

CktEvo: Repository-Level RTL Code Benchmark for Design Evolution

This paper introduces CktEvo, a repository-level benchmark and closed-loop framework that enables large language models to iteratively optimize Power, Performance, and Area (PPA) in complete RTL designs by preserving functional behavior across cross-file dependencies without human intervention.

Zhengyuan Shi, Jingxin Wang, Tairan Cheng, Changran Xu, Weikang Qian, Qiang Xu | 2026-03-11 | 🤖 cs.AI

SiliconMind-V1: Multi-Agent Distillation and Debug-Reasoning Workflows for Verilog Code Generation

The paper introduces SiliconMind-V1, a unified multi-agent framework that leverages testbench-driven verification and iterative debug-reasoning workflows to train locally fine-tuned LLMs for generating functionally correct Verilog RTL designs, outperforming state-of-the-art models with greater efficiency and privacy.

Mu-Chi Chen, Yu-Hung Kao, Po-Hsuan Huang, Shao-Chun Ho, Hsiang-Yu Tsou, I-Ting Wu, En-Ming Huang, Yu-Kai Hung, Wei-Po Hsin, Cheng Liang, Chia-Heng Tu, Shih-Hao Hung, Hsiang-Tsung Kung | 2026-03-11 | 🤖 cs.AI

ALADIN: Accuracy-Latency-Aware Design-space Inference Analysis for Embedded AI Accelerators

This paper presents ALADIN, an accuracy-latency-aware framework that enables the pre-deployment evaluation of mixed-precision quantized neural networks on scratchpad-based embedded AI accelerators by transforming models into platform-aware representations to analyze trade-offs and bottlenecks without requiring physical hardware.

T. Baldi, D. Casini, A. Biondi | 2026-03-11 | 🤖 cs.AI

Alignment Is the Disease: Censorship Visibility and Alignment Constraint Complexity as Determinants of Collective Pathology in Multi-Agent LLM Systems

This paper presents preliminary evidence from multi-agent simulations suggesting that alignment techniques and invisible censorship in large language models may paradoxically induce collective pathological behaviors and insight-action dissociation, indicating that safety interventions can sometimes cause the very harms they aim to prevent.

Hiroki Fukui | 2026-03-11 | 🤖 cs.AI

PhD Thesis Summary: Methods for Reliability Assessment and Enhancement of Deep Neural Network Hardware Accelerators

This PhD thesis presents novel, cost-efficient methods for assessing and enhancing the reliability of Deep Neural Network hardware accelerators, including a systematic literature review, new analytical tools, optimized trade-off methodologies, and the development of the AdAM real-time fault tolerance technique.

Mahdi Taheri | 2026-03-11 | 🤖 cs.AI

ARKV: Adaptive and Resource-Efficient KV Cache Management under Limited Memory Budget for Long-Context Inference in LLMs

ARKV is a lightweight, adaptive framework that dynamically allocates precision levels to KV cache tokens based on per-layer attention dynamics and token importance, achieving a 4x reduction in memory usage while preserving ~97% of baseline accuracy for long-context LLM inference without requiring retraining or architectural modifications.

Jianlong Lei, Shashikant Ilager | 2026-03-11 | 🤖 cs.AI

Measurement-Free Ancilla Recycling via Blind Reset: A Cross-Platform Study on Superconducting and Trapped-Ion Processors

This cross-platform study evaluates blind reset as a measurement-free ancilla recycling technique on superconducting and trapped-ion processors, demonstrating that it can significantly reduce logical-cycle latency while maintaining high ancilla cleanliness and identifying specific architecture-dependent crossover points for optimal deployment.

Sangkeum Lee | 2026-03-11 | ⚛️ quant-ph

Benchmarking Federated Learning in Edge Computing Environments: A Systematic Review and Performance Evaluation

This paper presents a systematic review and performance evaluation of Federated Learning in edge computing, benchmarking five leading algorithms across key metrics to identify trade-offs, highlight SCAFFOLD's superior accuracy and robustness versus FedAvg's efficiency, and propose a future research agenda to address challenges like data heterogeneity and energy limitations.

Sales Aribe Jr., Gil Nicholas Cagande | 2026-03-11 | 🤖 cs.AI

Autonomous Edge-Deployed AI Agents for Electric Vehicle Charging Infrastructure Management

This paper introduces Auralink SDC, an edge-deployed multi-agent AI architecture that autonomously manages electric vehicle charging infrastructure with high reliability and sub-50ms latency, achieving 78% autonomous incident resolution and 87.6% diagnostic accuracy to address the critical failure rates and slow remediation times of current cloud-centric systems.

Mohammed Cherifi | 2026-03-11 | 🤖 cs.AI

Sensitivity-Guided Framework for Pruned and Quantized Reservoir Computing Accelerators

This paper presents a sensitivity-guided framework for compressing Reservoir Computing accelerators that systematically balances quantization and pruning to significantly improve hardware efficiency and reduce power consumption on FPGAs while maintaining high model accuracy across various time-series tasks.

Atousa Jafari, Mahdi Taheri, Hassan Ghasemzadeh Mohammadi, Christian Herglotz, Marco Platzner | 2026-03-11 | 🤖 cs.AI

Architectural Design and Performance Analysis of FPGA based AI Accelerators: A Comprehensive Review

This paper reviews FPGA-based AI accelerators for deep learning, highlighting their advantages over ASICs and GPUs, detailing key hardware optimization techniques such as loop pipelining and quantization, and analyzing state-of-the-art designs to identify challenges for future innovations.

Soumita Chatterjee, Sudip Ghosh, Tamal Ghosh, Hafizur Rahaman | 2026-03-11 | 🤖 cs.AI

Zipage: Maintain High Request Concurrency for LLM Reasoning through Compressed PagedAttention

The paper introduces Zipage, an LLM inference engine utilizing Compressed PagedAttention to combine token-wise KV cache eviction with PagedAttention, achieving over $2.1\times$ speedup in high-concurrency reasoning tasks while maintaining approximately 95% of the performance of full KV inference.

Mengqi Liao, Lu Wang, Chaoyun Zhang, Bo Qiao, Si Qin, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Huaiyu Wan | 2026-03-11 | 🤖 cs.AI

Diagnosing FP4 inference: a layer-wise and block-wise sensitivity analysis of NVFP4 and MXFP4

This paper presents a systematic layer-wise and block-wise sensitivity analysis of NVFP4 and MXFP4 quantization across three Qwen2.5 model scales, revealing that MLP up- and down-projection layers are the most sensitive components while sensitivity patterns vary by format and model depth rather than being confined to final blocks.

Musa Cim, Burak Topcu, Mahmut Taylan Kandemir | 2026-03-11 | 🤖 cs.AI

Permutation-Equivariant 2D State Space Models: Theory and Canonical Architecture for Multivariate Time Series

This paper introduces the Variable-Invariant Two-Dimensional State Space Model (VI 2D SSM) and its unified VI 2D Mamba architecture, which theoretically establish and implement a permutation-equivariant framework for multivariate time series that eliminates artificial variable ordering to achieve state-of-the-art performance with improved structural scalability.

Seungwoo Jeong, Heung-Il Suk | 2026-03-11 | 🤖 cs.AI

Hindsight Credit Assignment for Long-Horizon LLM Agents

The paper introduces HCAPO, a novel framework that enhances long-horizon LLM agents by leveraging hindsight reasoning to refine step-level Q-values and employing a multi-scale advantage mechanism to address sparse reward challenges, thereby significantly outperforming state-of-the-art methods like GRPO on benchmarks such as WebShop and ALFWorld.

Hui-Ze Tan, Xiao-Wen Yang, Hao Chen, Jie-Jing Shao, Yi Wen, Yuteng Shen, Weihong Luo, Xiku Du, Lan-Zhe Guo, Yu-Feng Li | 2026-03-11 | 🤖 cs.AI

Turn: A Language for Agentic Computation

This paper introduces **Turn**, a compiled, actor-based programming language that enhances agentic software by integrating LLM inference as a typed primitive with schema validation, confidence-based control flow, isolated actor contexts, capability-based identity, and compile-time schema absorption to enforce critical safety and state invariants at the language level.

Muyukani Kizito | 2026-03-11 | 🤖 cs.AI

Generalized Reduction to the Isotropy for Flexible Equivariant Neural Fields

This paper introduces a principled method to reduce $G$-invariant functions on product spaces $X \times M$ to $H$-invariant functions on $X$ alone, where $H$ is the isotropy subgroup of $M$, thereby enabling flexible Equivariant Neural Fields to handle arbitrary group actions and heterogeneous product spaces without structural constraints.

Alejandro García-Castellanos, Gijs Bellaard, Remco Duits, Daniel Pelt, Erik J Bekkers | 2026-03-11 | 🤖 cs.AI

EDMFormer: Genre-Specific Self-Supervised Learning for Music Structure Segmentation

The paper introduces EDMFormer, a transformer model trained on a newly released dataset of 98 professionally annotated EDM tracks (EDM-98) to address the limitations of existing music segmentation methods by leveraging genre-specific energy, rhythm, and timbre features for improved structure detection in Electronic Dance Music.

Sahal Sajeer, Krish Patel, Oscar Chung, Joel Song Bae | 2026-03-11 | 🤖 cs.AI
