cs.LG papers | Gist.Science

NerVE: Nonlinear Eigenspectrum Dynamics in LLM Feed-Forward Networks

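This paper proposes NerVE, a unified eigenspectrum analysis framework that uses four metrics, including spectral entropy and participation ratio, to characterize the high-dimensional dynamics of information flow in the feed-forward networks of large language models. By relating these metrics to model generalization and to design choices, it aims to enable architecture and optimizer optimization that does not rely on trial and error.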

Nandan Kumar Jha, Brandon Reagen | 2026-03-10 | cs.LG

Swimba: Switch Mamba Model Scales State Space Models

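This paper proposes Switch Mamba (Swimba), which introduces expert specialization into state space models (SSMs) without increasing their computational cost. It shows, both theoretically and empirically, that a design that mixes experts in parameter space can expand SSM capacity while keeping the cost of the recurrent computation fixed.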

Zhixu Du, Krishna Teja Chitty-Venkata, Murali Emani, Venkatram Vishwanath, Hai Helen Li, Yiran Chen | 2026-03-10 | cs.LG

Physics-Consistent Neural Networks for Learning Deformation and Director Fields in Microstructured Media with Loss-Based Validation Criteria

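To analyze the mechanical behavior of microstructured media within Cosserat elasticity theory, this paper develops physics-consistent neural networks that represent the deformation and director fields independently and satisfy frame invariance. By incorporating stability conditions such as quasiconvexity and the Legendre-Hadamard inequality as loss functions, it proposes a new computational method for validating the physical plausibility of energy-minimizing solutions.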

Milad Shirani, Pete H. Gueldner, Murat Khidoyatov, Jeremy L. Warren, Federica Ninno | 2026-03-10 | cs.LG

cs.LG

NerVE: Nonlinear Eigenspectrum Dynamics in LLM Feed-Forward Networks

Swimba: Switch Mamba Model Scales State Space Models

Physics-Consistent Neural Networks for Learning Deformation and Director Fields in Microstructured Media with Loss-Based Validation Criteria

Joint MDPs and Reinforcement Learning in Coupled-Dynamics Environments

How Private Are DNA Embeddings? Inverting Foundation Model Representations of Genomic Sequences

Not All Neighbors Matter: Understanding the Impact of Graph Sparsification on GNN Pipelines

Post-Training with Policy Gradients: Optimality and the Base Model Barrier

Chart-RL: Generalized Chart Comprehension via Reinforcement Learning with Verifiable Rewards

Learning Quadruped Walking from Seconds of Demonstration

A SISA-based Machine Unlearning Framework for Power Transformer Inter-Turn Short-Circuit Fault Localization

Topology-Aware Reinforcement Learning over Graphs for Resilient Power Distribution Networks

Conditional Unbalanced Optimal Transport Maps: An Outlier-Robust Framework for Conditional Generative Modeling

NePPO: Near-Potential Policy Optimization for General-Sum Multi-Agent Reinforcement Learning

Diffusion Controller: Framework, Algorithms and Parameterization

Masked Unfairness: Hiding Causality within Zero ATE

Adaptive Discovery of Interpretable Audio Attributes with Multimodal LLMs for Low-Resource Classification

Combinatorial Allocation Bandits with Nonlinear Arm Utility

Can Safety Emerge from Weak Supervision? A Systematic Analysis of Small Language Models

TEA-Time: Transporting Effects Across Time

RESCHED: Rethinking Flexible Job Shop Scheduling from a Transformer-based Architecture with Simplified States