Alignment Is the Disease: Censorship Visibility and Alignment Constraint Complexity as Determinants of Collective Pathology in Multi-Agent LLM Systems

This paper presents preliminary evidence from multi-agent simulations suggesting that alignment techniques and invisible censorship in large language models may paradoxically induce collective pathological behaviors and insight-action dissociation, indicating that safety interventions can sometimes cause the very harms they aim to prevent.
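
The abstract's two manipulated factors suggest a simple experimental grid. Below is a purely illustrative sketch of such a grid, with a toy stand-in for the agents rather than real LLM calls; the function names, dynamics, and the dissociation score are hypothetical assumptions, not the authors' simulation.

```python
# Purely illustrative: a toy multi-agent loop over the two factors the
# abstract names. Agents, dynamics, and the dissociation score are
# hypothetical stand-ins, not the authors' simulation code.
import random

def run_simulation(censorship_visible: bool, constraint_complexity: int,
                   n_agents: int = 8, n_rounds: int = 50) -> float:
    """Return a toy 'insight-action dissociation' score in [0, 1]."""
    random.seed(0)
    dissociation = 0.0
    for _ in range(n_rounds):
        for _ in range(n_agents):
            insight = random.random()  # what the agent "knows"
            # more complex alignment constraints suppress more actions
            suppressed = random.random() < constraint_complexity / 10
            if suppressed and not censorship_visible:
                # invisible censorship: the agent acts on less than it
                # knows and cannot attribute the gap, so dissociation grows
                dissociation += insight
    return dissociation / (n_agents * n_rounds)

for visible in (True, False):
    for complexity in (1, 5, 9):
        print(f"visible={visible!s:5} complexity={complexity}: "
              f"{run_simulation(visible, complexity):.3f}")
```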

Hiroki Fukui · Wed, 11 Ma · cs.AI

SuperSkillsStack: Agency, Domain Knowledge, Imagination, and Taste in Human-AI Design Education

This study analyzes how 80 student design teams integrated generative AI into their creative process, revealing that while AI serves as a cognitive accelerator for early-stage tasks like brainstorming, human competencies in agency, domain knowledge, imagination, and taste remain essential for interpreting context, validating outputs, and refining design solutions.

Qian Huang, King Wang Poon · Tue, 10 Ma · cs

Science Literacy: Generative AI as Enabler of Coherence in the Teaching, Learning, and Assessment of Scientific Knowledge and Reasoning

This chapter explores the potential of generative AI to enhance K-16+ science literacy by proposing a coherent architectural framework that aligns the teaching, learning, and assessment of scientific knowledge and reasoning, while addressing associated challenges and outlining future research needs.

Xiaoming Zhai, James W. Pellegrino, Matias Rojas, Jongchan Park, Matthew Nyaaba, Clayton Cohn, Gautam Biswas · Tue, 10 Ma · cs

Causal Analysis of Author Demographics in Academic Peer Review

Using causal inference on a dataset of 530 papers, this study quantifies statistically significant disadvantages in academic peer review rankings for authors from minority racial groups, female authors, and those affiliated with institutions in the Global South, highlighting the urgent need for fairness interventions in both traditional and AI-driven assessment systems.
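
As a rough illustration of the regression-adjustment flavor of causal analysis the abstract describes, the sketch below fits demographic indicators against review rank; the synthetic data, column names, and adjustment set are assumptions, not the authors' dataset or pipeline.

```python
# Hypothetical sketch only: synthetic rows stand in for the real corpus.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 530  # matches the abstract's sample size; the rows here are synthetic
df = pd.DataFrame({
    "minority_author": rng.integers(0, 2, n),
    "female_author": rng.integers(0, 2, n),
    "global_south": rng.integers(0, 2, n),
    "topic": rng.choice(["nlp", "vision", "theory"], n),
})
# Synthetic outcome: a higher rank number means a worse review outcome
df["review_rank"] = (
    10 + 1.5 * df.minority_author + 1.0 * df.female_author
    + 2.0 * df.global_south + rng.normal(0, 3, n)
)

# Adjust for topic as an observed confounder and read off the demographic
# coefficients as the estimated disadvantages.
fit = smf.ols("review_rank ~ minority_author + female_author"
              " + global_south + C(topic)", data=df).fit()
print(fit.params[["minority_author", "female_author", "global_south"]])
```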

Uttamasha Anjally Oyshi, Gibson Nkhata, Susan Gauch · Tue, 10 Ma · cs

Building the ethical AI framework of the future: from philosophy to practice

This paper proposes an ethics-by-design control architecture that operationalizes AI governance across the entire lifecycle by embedding philosophical reasoning into a triple-gate enforcement structure (Metric, Governance, and Eco) with measurable triggers and audit trails, thereby translating normative commitments into testable controls compatible with existing MLOps pipelines and major regulatory frameworks like the EU AI Act and NIST RMF.
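
A minimal sketch of how a triple-gate enforcement structure with measurable triggers and an audit trail might look in code; the gate names come from the abstract, but the thresholds, payload fields, and control flow are assumptions, not the paper's architecture.

```python
# Illustrative triple-gate pipeline: each gate is a measurable trigger,
# and every decision is appended to an audit trail.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Deployment:
    model_id: str
    metrics: dict
    audit_trail: list = field(default_factory=list)

def metric_gate(d): return d.metrics.get("fairness_gap", 1.0) < 0.05
def governance_gate(d): return d.metrics.get("signed_off", False)
def eco_gate(d): return d.metrics.get("kwh_per_1k_queries", 99.0) < 2.0

GATES: list[tuple[str, Callable[[Deployment], bool]]] = [
    ("Metric", metric_gate), ("Governance", governance_gate), ("Eco", eco_gate),
]

def enforce(d: Deployment) -> bool:
    for name, gate in GATES:
        passed = gate(d)
        d.audit_trail.append((name, passed))  # measurable trigger, logged
        if not passed:
            return False  # hard stop: the lifecycle cannot proceed
    return True

d = Deployment("m-1", {"fairness_gap": 0.02, "signed_off": True,
                       "kwh_per_1k_queries": 1.1})
print(enforce(d), d.audit_trail)
```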

Jasper Kyle Catapang · Tue, 10 Ma · cs

The Potential for an Innovation Winter: Estimating Impact of Federal Research Reductions on Faculty Activity

This paper uses stochastic modeling and Boston University data to predict that the Trump Administration's proposed 40% cut to federal research funding for 2026 would sharply increase the proportion of R1 universities at which more than half of faculty face subcritical annual research expenditures, threatening the viability of quality research and doctoral programs across the United States.
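
The stochastic-modeling question lends itself to a small Monte Carlo illustration: under a 40% cut, how often does more than half of a simulated faculty body fall below a subcritical expenditure threshold? The distribution, threshold, and sample sizes below are illustrative assumptions, not the paper's model or Boston University data.

```python
# Illustrative Monte Carlo only: all parameters are assumptions.
import random

SUBCRITICAL = 150_000  # hypothetical annual-expenditure threshold, USD

def p_over_half_subcritical(cut: float, n_faculty: int = 1_000,
                            n_trials: int = 2_000) -> float:
    """Fraction of trials in which >50% of faculty fall below the threshold."""
    random.seed(0)
    hits = 0
    for _ in range(n_trials):
        below = sum(random.lognormvariate(12.4, 0.6) * (1 - cut) < SUBCRITICAL
                    for _ in range(n_faculty))
        hits += below / n_faculty > 0.5
    return hits / n_trials

for cut in (0.0, 0.40):  # status quo vs. the proposed 40% reduction
    print(f"cut={cut:.0%}: P(>50% of faculty subcritical) ≈ "
          f"{p_over_half_subcritical(cut):.2f}")
```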

Robert A. Brown · Tue, 10 Ma · physics

SPOT: An Annotated French Corpus and Benchmark for Detecting Critical Interventions in Online Conversations

This paper introduces SPOT, the first annotated French corpus and benchmark for detecting "stopping points"—subtle critical interventions that pause or redirect online discussions—and demonstrates that fine-tuned encoder models outperform prompted LLMs in this task, particularly when enriched with contextual metadata.
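
A minimal sketch of the encoder-based setup the abstract favors: a French encoder fine-tuned for binary stopping-point classification, with contextual metadata prepended to the input text. The camembert-base checkpoint, metadata fields, and example comment are assumptions, not the SPOT authors' configuration.

```python
# Sketch under assumptions: binary stopping-point classification with a
# French encoder, metadata prepended as plain text.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("camembert-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "camembert-base", num_labels=2  # stopping point vs. not
)

comment = "Attendez, la source de ce chiffre est-elle fiable ?"
metadata = "[THREAD: vaccination] [POSITION: réponse 12]"  # hypothetical fields

inputs = tokenizer(metadata + " " + comment, return_tensors="pt",
                   truncation=True, max_length=256)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(-1))  # untrained probabilities; fine-tune on SPOT labels
```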

Manon Berriche, Célia Nouri, Chloée Clavel, Jean-Philippe Cointet · Tue, 10 Ma · cs.CL

Evaluating LLM-Based Grant Proposal Review via Structured Perturbations

This paper evaluates LLM-based grant proposal reviews using structured perturbations on six quality axes, finding that a section-by-section analysis approach outperforms other architectures but that current models still struggle with clarity detection and holistic assessment, suggesting they are best suited as supplementary tools rather than replacements for human reviewers.
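
The perturbation protocol is easy to picture as code: degrade a proposal along one quality axis, re-score it, and check that the reviewer's score falls. In the sketch below the axes echo the abstract's six, but the perturbations and the stand-in scorer are toy assumptions, not the paper's pipeline.

```python
# Toy perturbation harness; swap `score` for a real LLM reviewer call.
AXES = ["clarity", "novelty", "feasibility", "rigor", "impact", "coherence"]

def perturb(proposal: str, axis: str) -> str:
    """Deliberately degrade the proposal along one axis (toy versions)."""
    if axis == "clarity":
        return proposal.replace(". ", ", and moreover, ")  # muddy the prose
    return proposal + f"\n[degraded: {axis}]"  # placeholder for other axes

def score(proposal: str) -> float:
    """Stand-in reviewer; a real run would call an LLM with a rubric."""
    return 5.0 - 0.5 * proposal.count("moreover")

original = "We propose X. We will evaluate on Y. The team has done Z."
base = score(original)
for axis in AXES:
    delta = score(perturb(original, axis)) - base
    # A competent reviewer should score the degraded version lower; in
    # this toy, only the clarity perturbation actually moves the score.
    print(f"{axis:12s} score delta: {delta:+.2f}")
```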

William Thorne, Joseph James, Yang Wang, Chenghua Lin, Diana Maynard · Tue, 10 Ma · cs.CL

Dual-Metric Evaluation of Social Bias in Large Language Models: Evidence from an Underrepresented Nepali Cultural Context

This study evaluates seven state-of-the-art large language models in the underrepresented Nepali cultural context using a Dual-Metric Bias Assessment framework, finding that explicit agreement with biased statements and implicit generative bias are distinct: implicit bias follows a non-linear relationship with temperature and is poorly predicted by agreement metrics, underscoring the need for culturally grounded datasets and evaluation strategies.
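
A minimal sketch of a dual-metric design of the kind the abstract describes: one score from explicit agreement with biased statements, another from bias detected in open-ended generations swept over temperature. The ask stub, keyword detector, and prompts are placeholders, not the authors' framework.

```python
# Placeholder dual-metric harness; wire a real LLM client into `ask`.
BIASED_STATEMENTS = ["<Nepali-context biased statement 1>",
                     "<Nepali-context biased statement 2>"]
PROMPTS = ["Describe a typical household in rural Nepal."]

def ask(prompt: str, temperature: float = 0.7) -> str:
    """Stand-in for an LLM call; returns canned text here."""
    return "yes" if "statement" in prompt else "a neutral description"

def implicit_bias(text: str) -> float:
    """Toy detector; the paper would use a far richer scoring scheme."""
    return float("stereotype" in text)

# Metric 1: explicit agreement rate with biased statements
explicit = sum(ask(f"Do you agree? {s}").lower().startswith("yes")
               for s in BIASED_STATEMENTS) / len(BIASED_STATEMENTS)
print(f"explicit agreement rate: {explicit:.2f}")

# Metric 2: implicit generative bias, swept over temperature because the
# abstract reports a non-linear relationship between the two
for t in (0.2, 0.7, 1.2):
    gens = [ask(p, temperature=t) for p in PROMPTS]
    print(f"T={t}: implicit bias rate = {sum(map(implicit_bias, gens)):.2f}")
```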

Ashish Pandey, Tek Raj Chhetri · Tue, 10 Ma · cs.CL