cs.AI Papers | Gist.Science

Resource-constrained Amazons chess decision framework integrating large language models and graph attention

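This paper proposes a lightweight hybrid framework that combines a graph attention mechanism with GPT-4o-mini for settings with limited computing resources, and shows that it outperforms existing baselines and its teacher model at the Game of the Amazons (Amazons chess), even with noisy data.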

Tianhao Qian, Zhuoxuan Li, Jinde Cao, Xinli Shi, Hanjie Liu, Leszek Rutkowski | 2026-03-12 | cs.AI

IH-Challenge: A Training Dataset to Improve Instruction Hierarchy on Frontier LLMs

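This paper introduces IH-Challenge, a reinforcement learning dataset designed to improve the instruction hierarchy (IH) robustness of frontier LLMs, and reports that training on it improves GPT-5-Mini's instruction hierarchy stability by more than 10% while preserving both safety and helpfulness.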

Chuan Guo, Juan Felipe Ceron Uribe, Sicheng Zhu, Christopher A. Choquette-Choo, Steph Lin, Nikhil Kandpal, Milad Nasr, Rai (Michael Pokorny), Sam Toyer, Miles Wang, Yaodong Yu, Alex Beutel, Kai Xiao | 2026-03-12 | cs.AI

cs.AI

Resource-constrained Amazons chess decision framework integrating large language models and graph attention

IH-Challenge: A Training Dataset to Improve Instruction Hierarchy on Frontier LLMs

UAV-MARL: Multi-Agent Reinforcement Learning for Time-Critical and Dynamic Medical Supply Delivery

Prompting with the human-touch: evaluating model-sensitivity of foundation models for musculoskeletal CT segmentation

SCORE: Replacing Layer Stacking with Contractive Recurrent Depth

Towards Cognitive Defect Analysis in Active Infrared Thermography with Vision-Text Cues

Adaptive RAN Slicing Control via Reward-Free Self-Finetuning Agents

CUAAudit: Meta-Evaluation of Vision-Language Models as Auditors of Autonomous Computer-Use Agents

Does LLM Alignment Really Need Diversity? An Empirical Study of Adapting RLVR Methods for Moral Reasoning

Gradient Flow Drifting: Generative Modeling via Wasserstein Gradient Flows of KDE-Approximated Divergences

Recover to Predict: Progressive Retrospective Learning for Variable-Length Trajectory Prediction

Trajectory-Informed Memory Generation for Self-Improving Agent Systems

Reinforcement Learning with Conditional Expectation Reward

Detecting and Eliminating Neural Network Backdoors Through Active Paths with Application to Intrusion Detection

Interleaving Scheduling and Motion Planning with Incremental Learning of Symbolic Space-Time Motion Abstractions

Are Video Reasoning Models Ready to Go Outside?

FAME: Formal Abstract Minimal Explanation for Neural Networks

Emulating Clinician Cognition via Self-Evolving Deep Clinical Research

A Platform-Agnostic Multimodal Digital Human Modelling Framework: Neurophysiological Sensing in Game-Based Interaction

Contract And Conquer: How to Provably Compute Adversarial Examples for a Black-Box Model?