cs papers | Gist.Science

It is not always greener on the other side: Greenery perception across demographics and personalities in multiple cities

This study analyzes the discrepancies between objective and subjective urban greenery perceptions across five countries using street view imagery and a survey of 1,000 participants, revealing that while demographics and personality have little influence, an individual's geographic location is a primary factor shaping how they perceive green spaces.

Matias Quintana, Fangqi Liu, Jussi Torkko, Youlong Gu, Xiucheng Liang, Yujun Hou, Koichi Ito, Yihan Zhu, Mahmoud Abdelrahman, Tuuli Toivonen, Yi Lu, Filip Biljecki | 2026-03-10 | 💻 cs

VOIC: Visible-Occluded Integrated Guidance for 3D Semantic Scene Completion

This paper introduces VOIC, a novel dual-decoder framework for monocular 3D Semantic Scene Completion that employs a Visible Region Label Extraction strategy to decouple visible-region perception from occluded-region reasoning, thereby mitigating feature dilution and achieving state-of-the-art performance on standard benchmarks.

Zaidao Han, Risa Higashita, Jiang Liu | 2026-03-10 | 💻 cs

Cost Trade-offs of Reasoning and Non-Reasoning Large Language Models in Text-to-SQL

This paper demonstrates that reasoning Large Language Models significantly reduce cloud query execution costs and data consumption compared to non-reasoning models in Text-to-SQL tasks, while revealing that execution time is a poor proxy for cost efficiency and highlighting the substantial financial risks posed by non-reasoning models' tendency to generate inefficient queries.

Saurabh Deochake, Debajyoti Mukhopadhyay | 2026-03-10 | 💻 cs

NashOpt -- A Python Library for Computing Generalized Nash Equilibria

NashOpt is an open-source Python library that computes generalized Nash equilibria in noncooperative games with shared constraints by leveraging joint KKT conditions, JAX-based automatic differentiation for nonlinear problems, and mixed-integer linear programming for linear-quadratic cases, while also supporting inverse and Stackelberg game design.

Alberto Bemporad | 2026-03-10 | 💻 cs

Toward a Physical Theory of Intelligence

This paper introduces the Conservation-Congruent Encoding (CCE) framework, a unified physical theory that defines intelligence as an irreversible process of extracting work while minimizing dissipation, thereby deriving universal computational bounds and linking thermodynamic measurement, quantum decoherence, and spacetime geometry to establish substrate-neutral constraints for both natural and artificial intelligence.

Peter David Fagan | 2026-03-10 | 💻 cs

DrivingGen: A Comprehensive Benchmark for Generative Video World Models in Autonomous Driving

This paper introduces DrivingGen, the first comprehensive benchmark for generative driving world models that addresses the lack of rigorous evaluation by combining a diverse dataset with a novel suite of metrics to assess visual realism, trajectory plausibility, temporal coherence, and controllability, thereby revealing critical trade-offs in current state-of-the-art models.

Yang Zhou, Hao Shao, Letian Wang, Zhuofan Zong, Hongsheng Li, Steven L. Waslander | 2026-03-10 | 💻 cs

Machine Learning Guided Cooling System Optimization for Data Center

This paper presents a three-stage, physics-guided machine learning framework applied to the Frontier exascale supercomputer that identifies significant cooling inefficiencies and demonstrates how safe, counterfactual setpoint adjustments can recover up to 96% of excess energy consumption while maintaining thermal limits.

Shrenik Jadhav, Zheng Liu | 2026-03-10 | 💻 cs

Batch-of-Thought: Cross-Instance Learning for Enhanced LLM Reasoning

This paper introduces Batch-of-Thought (BoT), a training-free method that enhances Large Language Model reasoning by jointly processing related queries to leverage cross-instance signals, thereby improving accuracy, calibration, and computational efficiency through a multi-agent reflection architecture.

Xuan Yang, Furong Jia, Roy Xie, Xiong Xi, Hengwei Bian, Jian Li, Monica Agrawal | 2026-03-10 | 💻 cs

Route, Retrieve, Reflect, Repair: Self-Improving Agentic Framework for Visual Detection and Linguistic Reasoning in Medical Imaging

The paper introduces R^4, a self-improving agentic framework that enhances medical image analysis by decomposing workflows into routing, retrieval, reflection, and repair stages to iteratively refine both textual reports and spatial bounding boxes, achieving significant performance gains over single-pass VLM baselines without requiring gradient-based fine-tuning.

Md. Faiyaz Abdullah Sayeedi, Rashedur Rahman, Siam Tahsin Bhuiyan, Sefatul Wasi, Ashraful Islam, Saadia Binte Alam, AKM Mahbubur Rahman | 2026-03-10 | 💻 cs

The Algorithmic Gaze of Image Quality Assessment: An Audit and Trace Ethnography of the LAION-Aesthetics Predictor

This paper audits the LAION-Aesthetics Predictor to reveal how its algorithmic gaze reinforces Western, male, and imperial biases by disproportionately filtering content and prioritizing specific cultural aesthetics, ultimately urging a shift toward pluralistic evaluation methods in AI development.

Jordan Taylor, William Agnew, Maarten Sap, Sarah E. Fox, Haiyi Zhu | 2026-03-10 | 💻 cs

CaMeLs Can Use Computers Too: System-level Security for Computer Use Agents

This paper introduces "Single-Shot Planning," a secure architecture for Computer Use Agents that generates a complete, trusted execution graph before observing untrusted UI states to effectively mitigate prompt injection and branch steering attacks while maintaining competitive task performance.

Hanna Foerster, Tom Blanchard, Kristina Nikolic, Ilia Shumailov, Cheng Zhang, Robert Mullins, Nicolas Papernot, Florian Tramèr, Yiren Zhao | 2026-03-10 | 💻 cs

User Detection and Response Patterns of Sycophantic Behavior in Conversational AI

This paper investigates how users detect and respond to sycophantic behavior in conversational AI through a proposed DCR epistemology, revealing that while users employ various mitigation strategies, sycophancy is not universally harmful and can provide valued emotional support for vulnerable populations, suggesting a need for context-aware AI design rather than universal elimination.

Kazi Noshin, Syed Ishtiaque Ahmed, Sharifa Sultana | 2026-03-10 | 💻 cs

BoxMind: Closed-loop AI strategy optimization for elite boxing validated in the 2024 Olympics

This paper introduces BoxMind, a closed-loop AI system that transforms unstructured boxing footage into hierarchical tactical indicators and predictive gradients to generate expert-level strategic recommendations, which were validated during the 2024 Paris Olympics by contributing to the Chinese National Team's historic medal success.

Kaiwen Wang, Kaili Zheng, Rongrong Deng, Qingmin Fan, Milin Zhang, Zongrui Li, Xuesi Zhou, Bo Han, Liren Chen, Chenyi Guo, Ji Wu | 2026-03-10 | 💻 cs

Multifaceted Scenario-Aware Hypergraph Learning for Next POI Recommendation

This paper proposes the Multifaceted Scenario-Aware Hypergraph Learning (MSAHG) framework, which addresses the limitations of existing methods in handling mobility variations across distinct contexts by constructing scenario-specific disentangled sub-hypergraphs and employing a parameter-splitting mechanism to resolve inter-scenario conflicts, thereby significantly improving next POI recommendation performance.

Yuxi Lin, Yongkang Li, Jie Xing, Zipei Fan | 2026-03-10 | 💻 cs

S2DiT: Sandwich Diffusion Transformer for Mobile Streaming Video Generation

The paper introduces S2DiT, a novel Streaming Sandwich Diffusion Transformer that leverages efficient attention mechanisms, a budget-aware sandwich architecture, and a 2-in-1 distillation framework to achieve high-fidelity, real-time video generation on mobile devices with performance comparable to server-grade models.

Lin Zhao, Yushu Wu, Aleksei Lebedev, Dishani Lahiri, Meng Dong, Arpit Sahni, Michael Vasilkovsky, Hao Chen, Ju Hu, Aliaksandr Siarohin, Sergey Tulyakov, Yanzhi Wang, Anil Kag, Yanyu Li | 2026-03-10 | 💻 cs

Equal-Pay Contracts

This paper investigates multi-agent contract design under equal-pay constraints, providing tight polynomial-time approximation algorithms and hardness results for various reward functions while resolving open problems in unconstrained settings and quantifying the efficiency loss of fairness via a $\Theta(\log n / \log \log n)$ price of equality.

Michal Feldman, Yoav Gal-Tzur, Tomasz Ponitka, Maya Schlesinger | 2026-03-10 | 💻 cs

ReViP: Mitigating False Completion in Vision-Language-Action Models with Vision-Proprioception Rebalance

This paper introduces ReViP, a novel Vision-Language-Action framework that mitigates "false completion" failures caused by proprioceptive bias through vision-proprioception rebalancing and a new benchmark suite, achieving significant performance gains over existing models.

Zhuohao Li, Yinghao Li, Jian-Jian Jiang, Lang Zhou, Tianyu Zhang, Jiadong Yin, Mu Lin, Yi-Kin Wei, Wei-Shi Zheng | 2026-03-10 | 💻 cs

ScenePilot-Bench: A Large-Scale Dataset and Benchmark for Evaluation of Vision-Language Models in Autonomous Driving

This paper introduces ScenePilot-Bench, a large-scale benchmark built on the diverse ScenePilot-4K dataset to comprehensively evaluate and advance vision-language models in autonomous driving through multi-granularity annotations and a safety-aware, four-axis assessment framework.

Yujin Wang, Yutong Zheng, Wenxian Fan, Tianyi Wang, Hongqing Chu, Li Zhang, Bingzhao Gao, Daxin Tian, Jianqiang Wang, Hong Chen | 2026-03-10 | 💻 cs

Query-Guided Spatial-Temporal-Frequency Interaction for Music Audio-Visual Question Answering

This paper proposes QSTar, a novel query-guided spatial-temporal-frequency interaction method enhanced by a Query Context Reasoning block, which significantly improves Audio-Visual Question Answering performance by deeply integrating question-guided clues and audio frequency characteristics with visual perception, outperforming existing multimodal approaches on multiple benchmarks.

Kun Li, Michael Ying Yang, Sami Sebastian Brandt | 2026-03-10 | 💻 cs

Dynamic framework for edge-connectivity maintenance of simple graphs

This paper presents a dynamic framework for maintaining $k$-edge-connectivity in undirected simple graphs under edge insertions and deletions by combining Nagamochi-Ibaraki sparse certificates with Link-Cut Trees for efficient $O(k \log n)$ amortized insertions and a maximum-flow-based approach for $O(k^{3/2} n^{3/2})$ deletions, all while keeping the graph sparse with $O(kn)$ edges.

Blazej Wrobel | 2026-03-10 | 💻 cs
