Alignment Is the Disease: Censorship Visibility and Alignment Constraint Complexity as Determinants of Collective Pathology in Multi-Agent LLM Systems
This paper presents preliminary evidence from multi-agent simulations suggesting that alignment techniques and invisible censorship in large language models may paradoxically induce collective pathological behaviors and insight-action dissociation. These findings indicate that safety interventions can, under some conditions, cause the very harms they are intended to prevent.