cs papers | Gist.Science

Agentified Assessment of Logical Reasoning Agents

This paper introduces an agentified assessment framework that utilizes an assessor agent to ensure reproducible and robust evaluation of logical reasoning systems, demonstrating its effectiveness by benchmarking an auto-formalization agent that achieves 86.70% accuracy on a solver-verified FOLIO dataset, significantly outperforming a chain-of-thought baseline.

Zhiyu Ni, Yifeng Xiao, Zheng Liang | 2026-03-10 | 💻 cs

Required-edge Cycle Cover Problem: an ASP-Completeness Framework for Graph Problems and Puzzles

This paper introduces the Required-edge Cycle Cover Problem (RCCP) and a corresponding flow model to establish an ASP-completeness framework that resolves open complexity questions for Constraint Graph Satisfiability and Kakuro while proving the ASP-completeness of several other pencil-and-paper puzzles.

Kosuke Susukita, Junichi Teruyama | 2026-03-10 | 💻 cs

Sharing is caring: Attestable and Trusted Workflows out of Distrustful Components

This paper presents Mica, a confidential computing architecture built on Arm CCA that decouples confidentiality from trust by enabling tenants to explicitly define, restrict, and attest communication paths between distrustful TEE components, thereby preventing sensitive data leakage without significantly expanding the trusted computing base.

Amir Al Sadi, Sina Abdollahi, Adrien Ghosn, Hamed Haddadi, Marios Kogias | 2026-03-10 | 💻 cs

LDP-Slicing: Local Differential Privacy for Images via Randomized Bit-Plane Slicing

This paper introduces LDP-Slicing, a lightweight, training-free framework that overcomes the utility limitations of applying Local Differential Privacy to high-dimensional images by decomposing pixel values into binary bit-planes and integrating perceptual obfuscation and optimized budget allocation to achieve rigorous privacy with high downstream task performance.

Yuanming Cao, Chengqi Li, Wenbo He | 2026-03-10 | 💻 cs

RISCBench: Benchmarking RISC-V Orchestration Efficiency in FPGA and FPGA-Like Computing Engines

This paper introduces RISCBench, a benchmark suite and methodology that quantifies orchestration efficiency in heterogeneous RISC-V systems using a new Sustained Instantaneous Throughput (SIT) metric to address the limitations of conventional performance indicators in FPGA and accelerator-class platforms.

Dave Ojika, Projjal Gupta, Preethi Budi + 2 more | 2026-03-10 | 💻 cs

Converting Binary Floating-Point Numbers to Shortest Decimal Strings: An Experimental Review

This paper presents an empirical review comparing binary floating-point to decimal string conversion algorithms, highlighting that modern techniques like Schubfach and Dragonbox offer significant speedups over legacy methods like Dragon4, though many implementations still fail to consistently generate the shortest possible decimal strings.

Jaël Champagne Gareau, Daniel Lemire | 2026-03-10 | 💻 cs

AI-Powered Multi-Stakeholder Ecosystems for Global Development: A Design Research Study on the GSI D-Hub Proof-of-Concept Platform

This design-science study introduces and validates the GSI D-Hub, an AI-powered platform that leverages explainable algorithms and synthetic data to enhance transparency, trust, and decision-making within multi-stakeholder global development ecosystems.

Muzakkiruddin Ahmed Mohammed, Adeeba Tarannum, Eileen Devereux Dailey + 3 more | 2026-03-10 | 💻 cs

Evaluating AI-Enabled deception vulnerability amongst Sub-Saharan-Africa migrants

This study evaluates the vulnerability of Sub-Saharan African migrants to AI-enabled deception, finding that prior exposure to targeting is the strongest predictor of risk, while confidence in identifying AI content and high verification effort serve as significant protective factors.

Deborah Oluwasanya | 2026-03-10 | 💻 cs

Building the ethical AI framework of the future: from philosophy to practice

This paper proposes an ethics-by-design control architecture that operationalizes AI governance across the entire lifecycle by embedding philosophical reasoning into a triple-gate enforcement structure (Metric, Governance, and Eco) with measurable triggers and audit trails, thereby translating normative commitments into testable controls compatible with existing MLOps pipelines and major regulatory frameworks like the EU AI Act and NIST RMF.

Jasper Kyle Catapang | 2026-03-10 | 💻 cs

Causal Analysis of Author Demographics in Academic Peer Review

Using causal inference on a dataset of 530 papers, this study quantifies statistically significant disadvantages in academic peer review rankings for authors from minority racial groups, female authors, and those affiliated with institutions in the Global South, highlighting the urgent need for fairness interventions in both traditional and AI-driven assessment systems.

Uttamasha Anjally Oyshi, Gibson Nkhata, Susan Gauch | 2026-03-10 | 💻 cs

Performance Comparison of IBN orchestration using LLM and SLMs

This paper proposes a stateful, hierarchical multi-agent framework for 5G/6G Intent-Based Networking orchestration that leverages both Large and Small Language Models, demonstrating that while both achieve similar translation accuracy, Small Language Models improve the overall lifecycle completion speed by 20%.

Wai Lwin Phone, Brahim El Boudani, Tasos Dagiuklas, Saptarshi Ghosh | 2026-03-10 | 💻 cs

ObjChangeVR: Object State Change Reasoning from Continuous Egocentric Views in VR Environments

This paper introduces ObjChangeVR, a framework and accompanying dataset for object state change reasoning in virtual reality; it uses viewpoint-aware retrieval and cross-view reasoning to address the challenge of detecting background changes that occur without direct interaction.

Shiyi Ding, Shaoen Wu, Ying Chen | 2026-03-10 | 💻 cs

Margin-Consistent Deep Subtyping of Invasive Lung Adenocarcinoma via Perturbation Fidelity in Whole-Slide Image Analysis

This paper proposes a margin-consistent deep subtyping framework for invasive lung adenocarcinoma that integrates attention-weighted aggregation, contrastive regularization, and a novel Perturbation Fidelity scoring mechanism to achieve robust, high-accuracy classification across multiple architectures and demonstrate cross-institutional generalizability on whole-slide images.

Meghdad Sabouri Rad, Junze (Vincent) Huang, Mohammad Mehdi Hosseini, Rakesh Choudhary, Saverio J. Carello, Ola El-Zammar, Michel R. Nasr, Bardia Rodd | 2026-03-10 | 💻 cs

PaLMR: Towards Faithful Visual Reasoning via Multimodal Process Alignment

PaLMR is a novel framework that enhances the faithfulness of multimodal large language models by aligning both the reasoning process and outcomes through a perception-aligned data layer and a hierarchical reward fusion scheme, thereby significantly reducing visual hallucinations while achieving state-of-the-art performance on key benchmarks.

Yantao Li, Qiang Hui, Chenyang Yan, Kanzhi Cheng, Fang Zhao, Chao Tan, Huanling Gao, Jianbing Zhang, Kai Wang, Xinyu Dai, Shiguo Lian | 2026-03-10 | 💻 cs

Digital Twin-Enabled Mobility-Aware Cooperative Caching in Vehicular Edge Computing

This paper proposes a Digital Twin-enabled framework (DAPR) that integrates asynchronous federated learning, a GRU-VAE prediction model, and deep reinforcement learning to optimize client selection and content request prediction, thereby significantly improving cache hit ratios and reducing transmission latency in vehicular edge computing systems.

Jiahao Zeng, Zhenkui Shi, Chunpei Li, Mengkai Yan, Hongliang Zhang, Sihan Chen, Xiantao Hu, Xianxian Li | 2026-03-10 | 💻 cs

A Parameter-efficient Convolutional Approach for Weed Detection in Multispectral Aerial Imagery

This paper introduces FCBNet, a parameter-efficient convolutional model featuring a frozen ConvNeXt backbone and a Feature Correction Block that achieves superior weed segmentation accuracy (over 85% mIoU) and computational efficiency across RGB and multispectral aerial imagery compared to existing state-of-the-art models.

Leo Thomas Ramos, Angel D. Sappa | 2026-03-10 | 💻 cs

GameVerse: Can Vision-Language Models Learn from Video-based Reflection?

The paper introduces GameVerse, a comprehensive benchmark featuring a novel reflect-and-retry paradigm and a hierarchical taxonomy across 15 games, demonstrating that Vision-Language Models can effectively improve their gameplay policies through video-based reflection by combining failure trajectories with expert tutorials.

Kuan Zhang, Dongchen Liu, Qiyue Zhao, Jinkun Hou, Xinran Zhang, Qinlei Xie, Miao Liu, Yiming Li | 2026-03-10 | 💻 cs

ASMIL: Attention-Stabilized Multiple Instance Learning for Whole Slide Imaging

The paper introduces ASMIL, a unified framework that addresses unstable attention dynamics, overfitting, and over-concentrated attention in attention-based multiple instance learning for whole slide imaging by employing an anchor model with a normalized sigmoid function and token random dropping, resulting in significant performance improvements over state-of-the-art methods.

Linfeng Ye, Shayan Mohajer Hamidi, Zhixiang Chi, Guang Li, Mert Pilanci, Takahiro Ogawa, Miki Haseyama, Konstantinos N. Plataniotis | 2026-03-10 | 💻 cs

Science Literacy: Generative AI as Enabler of Coherence in the Teaching, Learning, and Assessment of Scientific Knowledge and Reasoning

This chapter explores the potential of generative AI to enhance K-16+ science literacy by proposing a coherent architectural framework that aligns the teaching, learning, and assessment of scientific knowledge and reasoning, while addressing associated challenges and outlining future research needs.

Xiaoming Zhai, James W. Pellegrino, Matias Rojas, Jongchan Park, Matthew Nyaaba, Clayton Cohn, Gautam Biswas | 2026-03-10 | 💻 cs

Graph-of-Mark: Promote Spatial Reasoning in Multimodal Language Models with Graph-Based Visual Prompting

The paper proposes Graph-of-Mark (GoM), a novel pixel-level visual prompting technique that overlays scene graphs onto images to capture object relationships, thereby significantly enhancing the spatial reasoning and zero-shot performance of multimodal language models.

Giacomo Frisoni, Lorenzo Molfetta, Mattia Buzzoni, Gianluca Moro | 2026-03-10 | 💻 cs
