Mozart: Modularized and Efficient MoE Training on 3.5D Wafer-Scale Chiplet Architectures
This paper introduces Mozart, an algorithm-hardware co-design framework for efficiently training large-scale Mixture-of-Experts (MoE) language models. Mozart leverages 3.5D wafer-scale chiplet architectures, together with specialized expert-allocation and scheduling strategies, to overcome the communication and memory bottlenecks of MoE training.
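The abstract only names expert allocation as one of Mozart's levers and does not spell out the mechanism. As a self-contained toy sketch (the sizes, the synthetic router, and the greedy heuristic below are all hypothetical assumptions, not Mozart's actual algorithm), the following Python illustrates the underlying idea: placing frequently co-activated experts on the same chiplet reduces inter-chiplet dispatch traffic compared with a naive round-robin layout.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes -- illustrative only, not taken from the paper.
NUM_EXPERTS = 16    # experts in one MoE layer
NUM_CHIPLETS = 4    # compute chiplets on the wafer
EXPERTS_PER_TOPIC = NUM_EXPERTS // NUM_CHIPLETS
NUM_TOKENS = 1024
TOP_K = 2           # each token is routed to its top-2 experts

# Synthetic, deliberately skewed router output: each token has a "topic"
# and picks TOP_K experts from that topic's group, so some experts
# co-activate far more often than others.
topics = rng.integers(0, NUM_CHIPLETS, size=NUM_TOKENS)
routes = np.array([
    rng.choice(np.arange(EXPERTS_PER_TOPIC * t, EXPERTS_PER_TOPIC * (t + 1)),
               size=TOP_K, replace=False)
    for t in topics
])

# Co-activation counts: how often experts i and j serve the same token.
coact = np.zeros((NUM_EXPERTS, NUM_EXPERTS), dtype=np.int64)
for a, b in routes:
    coact[a, b] += 1
    coact[b, a] += 1

def greedy_allocate(coact, num_chiplets):
    """Place experts on chiplets greedily: each expert joins the chiplet
    whose current residents co-activate with it most, under an even
    capacity cap so compute load stays balanced. Ties prefer the least
    loaded chiplet so unrelated experts seed fresh chiplets."""
    num_experts = coact.shape[0]
    cap = num_experts // num_chiplets
    members = [[] for _ in range(num_chiplets)]
    placement = np.full(num_experts, -1, dtype=int)
    for e in np.argsort(-coact.sum(axis=1)):   # hottest experts first
        best, best_key = -1, (-1, 0)
        for c in range(num_chiplets):
            if len(members[c]) >= cap:
                continue
            key = (int(coact[e, members[c]].sum()), -len(members[c]))
            if key > best_key:
                best, best_key = c, key
        members[best].append(e)
        placement[e] = best
    return placement

def cross_chiplet_tokens(routes, placement):
    """Tokens whose TOP_K experts live on different chiplets each pay an
    inter-chiplet dispatch; count them as a proxy for network traffic."""
    return int(np.sum(placement[routes[:, 0]] != placement[routes[:, 1]]))

round_robin = np.arange(NUM_EXPERTS) % NUM_CHIPLETS
clustered = greedy_allocate(coact, NUM_CHIPLETS)
print("cross-chiplet tokens, round-robin placement:       ",
      cross_chiplet_tokens(routes, round_robin))
print("cross-chiplet tokens, co-activation-aware placement:",
      cross_chiplet_tokens(routes, clustered))
```

On this artificially skewed workload the co-activation-aware layout eliminates cross-chiplet dispatches entirely, while round-robin pays one for every token; real routers are noisier, but minimizing this kind of inter-chiplet traffic metric is the goal any expert-allocation strategy on a chiplet architecture would pursue.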