The Malicious Technical Ecosystem: Exposing Limitations in Technical Governance of AI-Generated Non-Consensual Intimate Images of Adults

This paper adopts a survivor-centered approach to expose how a "malicious technical ecosystem" of accessible tools enables the creation of AI-generated non-consensual intimate images, while demonstrating that current governance frameworks, such as the NIST AI 100-4 report, fail to effectively regulate this landscape due to flawed underlying assumptions.

Michelle L. Ding, Harini Suresh | 2026-03-09 | 🤖 cs.AI

HCT-QA: A Benchmark for Question Answering on Human-Centric Tables

This paper introduces HCT-QA, a comprehensive benchmark comprising thousands of real-world and synthetic human-centric tables with natural language question-answer pairs, designed to evaluate and improve the performance of Large Language Models and Vision Language Models in querying complex tabular data.

Mohammad S. Ahmad, Zan A. Naeem, Michaël Aupetit, Ahmed Elmagarmid, Mohamed Eltabakh, Xiaosong Ma, Mourad Ouzzani, Chaoyi Ruan, Hani Al-Sayeh | 2026-03-09 | 🤖 cs.AI

FourierSpecNet: Neural Collision Operator Approximation Inspired by the Fourier Spectral Method for Solving the Boltzmann Equation

This paper introduces FourierSpecNet, a hybrid deep learning framework that integrates the Fourier spectral method to efficiently approximate the Boltzmann collision operator, achieving resolution-invariant learning, zero-shot super-resolution, and significant computational savings while maintaining accuracy across elastic and inelastic collision regimes.

Jae Yong Lee, Gwang Jae Jung, Byung Chan Lim, Hyung Ju Hwang | 2026-03-09 | 🤖 cs.AI

Software Development Life Cycle Perspective: A Survey of Benchmarks for Code Large Language Models and Agents

This paper surveys 178 benchmarks for Code Large Language Models and Agents through a tiered Software Development Life Cycle (SDLC) framework. It finds a significant imbalance that heavily favors the implementation phase while neglecting requirements and design, along with critical shortcomings in anti-contamination strategies, and calls for future research to bridge the gap between theoretical capabilities and practical effectiveness.

Kaixin Wang, Tianlin Li, Xiaoyu Zhang, Chong Wang, Weisong Sun, Yang Liu, Aishan Liu, Xianglong Liu, Chao Shen, Bin Shi | 2026-03-09 | 🤖 cs.AI

AdAEM: An Adaptively and Automated Extensible Measurement of LLMs' Value Difference

This paper introduces AdAEM, a novel self-extensible evaluation framework that automatically generates adaptive test questions by probing the internal value boundaries of diverse LLMs to overcome the limitations of static benchmarks and provide more informative, distinguishable insights into models' value differences and alignment dynamics.

Jing Yao, Shitong Duan, Xiaoyuan Yi, Dongkuan Xu, Peng Zhang, Tun Lu, Ning Gu, Zhicheng Dou, Xing Xie | 2026-03-09 | 🤖 cs.AI

ESGenius: Benchmarking LLMs on Environmental, Social, and Governance (ESG) and Sustainability Knowledge

The paper introduces ESGenius, the first comprehensive benchmark comprising a curated corpus of authoritative ESG documents and a rigorously validated question-answer dataset, which reveals that while large language models exhibit moderate zero-shot performance in sustainability domains, their accuracy significantly improves when grounded in retrieval-augmented generation (RAG) using the provided source materials.

Chaoyue He, Xin Zhou, Yi Wu + 9 more | 2026-03-09 | 💬 cs.CL

KramaBench: A Benchmark for AI Systems on Data-to-Insight Pipelines over Data Lakes

The paper introduces KramaBench, a comprehensive benchmark featuring 104 real-world data-to-insight challenges across diverse domains, which reveals that current AI systems struggle to orchestrate end-to-end data pipelines over data lakes, achieving a maximum of only 55% accuracy despite strong performance in isolated tasks.

Eugenie Lai, Gerardo Vitagliano, Ziyu Zhang, Om Chabra, Sivaprasad Sudhir, Anna Zeng, Anton A. Zabreyko, Chenning Li, Ferdi Kossmann, Jialin Ding, Jun Chen, Markos Markakis, Matthew Russo, Weiyang Wang, Ziniu Wu, Michael J. Cafarella, Lei Cao, Samuel Madden, Tim Kraska | 2026-03-09 | 🤖 cs.AI

Discerning What Matters: A Multi-Dimensional Assessment of Moral Competence in LLMs

This paper critiques existing evaluations of LLM moral competence for over-relying on simplified scenarios and proposes a novel five-dimensional framework. Under this framework, models often outperform humans in structured tasks but significantly underperform when required to discern moral relevance from noisy information, suggesting that current assessments substantially overestimate their moral reasoning capabilities.

Daniel Kilov, Caroline Hendy, Secil Yanik Guyot, Aaron J. Snoswell, Seth Lazar | 2026-03-09 | 🤖 cs.AI

ContextBench: Modifying Contexts for Targeted Latent Activation

This paper introduces ContextBench, a benchmark for evaluating methods that generate fluent inputs to trigger specific latent features in language models, and demonstrates that enhanced Evolutionary Prompt Optimization variants achieve state-of-the-art performance in balancing elicitation strength with linguistic fluency.

Robert Graham, Edward Stevinson, Leo Richter, Alexander Chia, Joseph Miller, Joseph Isaac Bloom | 2026-03-09 | 🤖 cs.AI

Iterative Quantum Feature Maps

The paper proposes Iterative Quantum Feature Maps (IQFMs), a hybrid quantum-classical framework that constructs deep architectures by iteratively connecting shallow, noise-resilient quantum feature maps with classically computed weights to mitigate hardware limitations and achieve performance comparable to classical neural networks without optimizing variational quantum parameters.

Nasa Matsumoto, Quoc Hoan Tran, Koki Chinzei, Yasuhiro Endo, Hirotaka Oshima | 2026-03-09 | ⚛️ quant-ph

A Multi-Agent System Enables Versatile Information Extraction from the Chemical Literature

This paper presents a multimodal large language model-based multi-agent system that significantly outperforms existing state-of-the-art methods in automatically extracting structured chemical information from diverse and complex literature graphics, thereby advancing AI-driven chemical research.

Yufan Chen, Ching Ting Leung, Bowen Yu, Jianwei Sun, Yong Huang, Linyan Li, Hao Chen, Hanyu Gao | 2026-03-09 | 🤖 cs.AI