What are AI researchers worried about?

Based on a survey of over 4,000 AI researchers, this paper reveals that, contrary to public and media narratives dominated by existential threats, researchers prioritize immediate sociotechnical risks and show significant convergence with public opinion on risk assessment, suggesting a need for collaborative dialogue focused on mitigating present-day harms rather than speculating on future catastrophes.

Cian O'Donovan, Sarp Gurakan, Ananya Karanam, Xiaomeng Wu, Jack Stilgoe · Mon, 09 Ma · cs

Measuring Perceptions of Fairness in AI Systems: The Effects of Infra-marginality

This paper presents a user study demonstrating that human perceptions of fairness in AI systems are shaped not only by statistical parity or outcomes but also, significantly, by beliefs about the underlying causes of disparities, specifically how infra-marginality and differences in data distribution influence judgments in medical decision-making scenarios.

Schrasing Tong, Minseok Jung, Ilaria Liccardi, Lalana Kagal · Mon, 09 Ma · cs

From Risk Avoidance to User Empowerment: Reframing Safety in Generative AI for Mental Health Crises

This paper argues that current generative AI chatbots' risk-avoidant responses to mental health crises can harm users and proposes shifting toward empowerment-oriented design principles that enable AI to act as a supportive bridge for de-escalation and connection to professional care.

Benjamin Kaveladze, Arka Ghosh, Leah Ajmani, Denae Ford, Peter M Gutierrez, Jetta E Hanson, Eugenia Kim, Keertana Namuduri, Theresa Nguyen, Ebele Okoli, Teresa Rexin, Jessica L Schleider, Hongyi Shen, Jina Suh · Mon, 09 Ma · cs

Biometric-enabled Personalized Augmentative and Alternative Communications

This study proposes a roadmap for integrating biometric technologies into personalized Augmentative and Alternative Communication (AAC) systems, introducing concepts such as the AAC biometric register, showing through case studies that current AI accuracy in gesture and sign language recognition remains insufficient for practical applications, and offering recommendations to bridge this gap.

S. Yanushkevich, E. Berepiki, P. Ciunkiewicz, V. Shmerko, G. Wolbring, R. Guest · Mon, 09 Ma · cs

Mind the Gap: Pitfalls of LLM Alignment with Asian Public Opinion

This paper presents a multilingual audit revealing that while contemporary large language models generally align with public opinion on broad social issues across Asian regions, they consistently fail to accurately represent diverse religious viewpoints—particularly those of minority groups—and often amplify negative stereotypes, a problem that persists despite lightweight prompting interventions and remains undetected by standard bias benchmarks.

Hari Shankar, Vedanta S P, Sriharini Margapuri, Debjani Mazumder, Ponnurangam Kumaraguru, Abhijnan Chakraborty · Mon, 09 Ma · cs.CL

Towards Autonomous Mathematics Research

This paper introduces Aletheia, an autonomous AI research agent powered by advanced reasoning models and tool use that successfully generates, verifies, and revises mathematical proofs from Olympiad problems to PhD-level research, achieving milestones such as fully AI-generated papers and the autonomous solution of open problems while proposing new frameworks for quantifying AI autonomy and transparency.

Tony Feng, Trieu H. Trinh, Garrett Bingham, Dawsen Hwang, Yuri Chervonyi, Junehyuk Jung, Joonkyung Lee, Carlo Pagano, Sang-hyun Kim, Federico Pasqualotto, Sergei Gukov, Jonathan N. Lee, Junsu Kim, Kaiying Hou, Golnaz Ghiasi, Yi Tay, YaGuang Li, Chenkai Kuang, Yuan Liu, Hanzhao Lin, Evan Zheran Liu, Nigamaa Nayakanti, Xiaomeng Yang, Heng-Tze Cheng, Demis Hassabis, Koray Kavukcuoglu, Quoc V. Le, Thang Luong · Mon, 09 Ma · cs.AI

AdAEM: An Adaptively and Automated Extensible Measurement of LLMs' Value Difference

This paper introduces AdAEM, a novel self-extensible evaluation framework that automatically generates adaptive test questions by probing the internal value boundaries of diverse LLMs to overcome the limitations of static benchmarks and provide more informative, distinguishable insights into models' value differences and alignment dynamics.

Jing Yao, Shitong Duan, Xiaoyuan Yi, Dongkuan Xu, Peng Zhang, Tun Lu, Ning Gu, Zhicheng Dou, Xing Xie · Mon, 09 Ma · cs.AI

The Malicious Technical Ecosystem: Exposing Limitations in Technical Governance of AI-Generated Non-Consensual Intimate Images of Adults

This paper adopts a survivor-centered approach to expose how a "malicious technical ecosystem" of accessible tools enables the creation of AI-generated non-consensual intimate images, while demonstrating that current governance frameworks, such as the NIST AI 100-4 report, fail to effectively regulate this landscape due to flawed underlying assumptions.

Michelle L. Ding, Harini Suresh · Mon, 09 Ma · cs.AI

The Consensus Trap: Dissecting Subjectivity and the "Ground Truth" Illusion in Data Annotation

This systematic literature review critiques the "ground truth" paradigm in machine learning as a positivistic fallacy that misinterprets human disagreement as noise, arguing instead for pluralistic annotation infrastructures that treat diverse subjective perspectives as high-fidelity signals essential for building culturally competent models.

Sheza Munir, Benjamin Mah, Krisha Kalsi, Shivani Kapania, Julian Posada, Edith Law, Ding Wang, Syed Ishtiaque Ahmed · Mon, 09 Ma · cs.AI

The DSA's Blind Spot: Algorithmic Audit of Advertising and Minor Profiling on TikTok

This paper presents an algorithmic audit of TikTok revealing that while the platform technically complies with the Digital Services Act's ban on profiled advertising to minors, it effectively circumvents this protection by delivering highly personalized, often undisclosed influencer marketing content to adolescents, thereby highlighting the urgent need to expand the regulatory definition of "advertisement" to cover such commercial practices.

Sara Solarova, Matej Mosnar, Matus Tibensky, Jan Jakubcik, Adrian Bindas, Simon Liska, Filip Hossner, Matúš Mesarčík, Ivan Srba · Mon, 09 Ma · cs.AI

Exploring Human-in-the-Loop Themes in AI Application Development: An Empirical Thematic Analysis

This paper presents a multi-source qualitative study that identifies four key themes—AI Governance and Human Authority, Human-in-the-Loop Iterative Refinement, AI System Lifecycle and Operational Constraints, and Human-AI Team Collaboration and Coordination—to address the fragmented operational guidance for structuring human roles and oversight in AI application development.

Parm Suksakul, Nathan Kittichaikoonkij, Nakhin Polthai, Aung Pyae · Mon, 09 Ma · cs.AI