cs.LG Papers | Gist.Science

Learning Adaptive LLM Decoding

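This paper proposes a lightweight decoding adapter that, instead of relying on fixed sampling hyperparameters, uses reinforcement learning to select the sampling strategy dynamically according to the inference-time compute budget, and it reports significantly higher accuracy for a fixed budget on math and coding benchmarks.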

Chloe H. Su, Zhe Ye, Samuel Tenka, Aidan Yang, Soonho Kong, Udaya Ghai | 2026-03-11 | cs.LG

Verifying Good Regulator Conditions for Hypergraph Observers: Natural Gradient Learning from Causal Invariance via Established Theorems

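Building on Wolfram's hypergraph physics and Vanchurin's neural-network cosmology, this paper proves that persistent observers on causally invariant hypergraphs satisfy the Conant-Ashby good regulator theorem and that natural gradient descent is the unique learning rule. From this it shows that, depending on the convergence model, an observer can exist simultaneously in different Vanchurin regimes along the eigendirections of the Fisher metric tensor.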

Max Zhuravlev | 2026-03-11 | cs.LG


cs.LG

Learning Adaptive LLM Decoding

Verifying Good Regulator Conditions for Hypergraph Observers: Natural Gradient Learning from Causal Invariance via Established Theorems

Exclusive Self Attention

PPO-Based Hybrid Optimization for RIS-Assisted Semantic Vehicular Edge Computing

Not All News Is Equal: Topic- and Event-Conditional Sentiment from Finetuned LLMs for Aluminum Price Forecasting

Latent World Models for Automated Driving: A Unified Taxonomy, Evaluation Framework, and Open Challenges

Overcoming Valid Action Suppression in Unmasked Policy Gradient Algorithms

Probabilistic Hysteresis Factor Prediction for Electric Vehicle Batteries with Graphite Anodes Containing Silicon

Decoupling Reasoning and Confidence: Resurrecting Calibration in Reinforcement Learning from Verifiable Rewards

Causally Sufficient and Necessary Feature Expansion for Class-Incremental Learning

RubiCap: Rubric-Guided Reinforcement Learning for Dense Image Captioning

Wrong Code, Right Structure: Learning Netlist Representations from Imperfect LLM-Generated RTL

GIAT: A Geologically-Informed Attention Transformer for Lithology Identification

Better Bounds for the Distributed Experts Problem

Differentiable Stochastic Traffic Dynamics: Physics-Informed Generative Modelling in Transportation

Latent-DARM: Bridging Discrete Diffusion And Autoregressive Models For Reasoning

The Costs of Reproducibility in Music Separation Research: a Replication of Band-Split RNN

$P^2$ GNN: Two Prototype Sets to boost GNN Performance

The Reasoning Trap -- Logical Reasoning as a Mechanistic Pathway to Situational Awareness

The Radio-Frequency Transformer for Signal Separation
