Papers in cs.LG | Gist.Science

Entropic Confinement and Mode Connectivity in Overparameterized Neural Networks

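This paper explains that entropic barriers, which arise from the interplay between curvature variations in the loss landscape and noise in the optimization process, resolve the apparent paradox that minima are connected by low-loss paths while optimization dynamics remain confined to a particular basin.

Luca Di Carlo, Chase Goddard, David J. Schwab · 2026-03-13 · 📊 stat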

CTIGuardian: A Few-Shot Framework for Mitigating Privacy Leakage in Fine-Tuned LLMs

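Using the cyber threat intelligence (CTI) domain as a case study, this paper proposes CTIGuardian, a "privacy alignment" framework that uses a small number of examples, without retraining, to prevent leakage of sensitive information from fine-tuned large language models (LLMs), and it demonstrates a better balance of privacy and utility than existing NER-based methods.

Shashie Dilhara Batan Arachchige, Benjamin Zi Hao Zhao, Hassan Jameel Asghar + 2 more · 2026-03-13 · 🤖 cs.LG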

Deep Eigenspace Network for Parametric Non-self-adjoint Eigenvalue Problems

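To solve eigenvalue problems for non-self-adjoint operators efficiently, this paper proposes the Deep Eigenspace Network (DEN), which learns eigenspaces rather than individual eigenfunctions, and applies it to the Steklov eigenvalue problem, demonstrating theoretical convergence and numerical effectiveness.

H. Li, J. Sun, Z. Zhang · 2026-03-13 · 🤖 cs.LG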

Provably Finding a Hidden Dense Submatrix among Many Planted Dense Submatrices via Convex Programming

Whereas prior work assumed a single dense subgraph, this paper gives sufficient conditions under which convex programming solves the dense submatrix problem in polynomial time even when many dense subgraphs coexist, as is common in real networks, and it validates those conditions experimentally.

Valentine Olanubi (University of Alabama, Department of Mathematics), Phineas Agar (University of Alabama, Department of Mathematics), Brendan Ames (University of Southampton, School of Mathematical Sciences) · 2026-03-13 · 🤖 cs.LG

A Learnable Wavelet Transformer for Long-Short Equity Trading and Risk-Adjusted Return Optimization

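To address the noise and non-stationarity of financial time series and to optimize risk-adjusted returns, this paper proposes WaveLSFormer, a new Transformer model that directly generates market-neutral long/short portfolios through learnable wavelet-based multi-scale decomposition and risk-aware regularization, and it demonstrates the model's advantages.

Shuozhe Li, Du Cheng, Leqi Liu · 2026-03-13 · 💰 q-fin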

Text-only adaptation in LLM-based ASR through text denoising

This paper proposes a lightweight method for adapting large language model (LLM)-based speech recognition systems using only target-domain text data. It relies on a text denoising task so that adaptation does not damage the existing alignment, and it demonstrates better performance than existing state-of-the-art methods.

Andrés Carofilis, Sergio Burdisso, Esaú Villatoro-Tello, Shashi Kumar, Kadri Hacioglu, Srikanth Madikeri, Pradeep Rangappa, Manjunath K E, Petr Motlicek, Shankar Venkatesan, Andreas Stolcke · 2026-03-13 · ⚡ eess

WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning

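Moving beyond existing LLM approaches that have focused on vertical (depth) scaling, this paper proposes WideSeek-R1, which uses multi-agent reinforcement learning to enable parallel execution and scalable orchestration. It shows that a 4B-parameter model achieves performance comparable to a 671B single-agent model and handles broad information-seeking tasks effectively.

Zelai Xu, Zhexuan Xu, Ruize Zhang, Chunyang Zhu, Shi Yu, Weilin Liu, Quanlu Zhang, Wenbo Ding, Chao Yu, Yu Wang · 2026-03-13 · 🤖 cs.AI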

Kernel-based optimization of measurement operators for quantum reservoir computers

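This paper exploits the fixed feature map of a quantum reservoir computer to derive optimal measurement operators within a kernel ridge regression framework, minimizing prediction error, and it proposes a training strategy that is more efficient than existing methods when the number of qubits is large.

Markus Gross, Hans-Martin Rieser · 2026-03-13 · ⚛️ quant-ph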

From Classical to Quantum: Extending Prometheus for Unsupervised Discovery of Phase Transitions in Three Dimensions and Quantum Systems

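This paper extends the Prometheus framework from two-dimensional classical systems to three-dimensional classical and quantum many-body systems. It shows that, without supervised learning, the framework precisely detects the critical temperature and critical exponents of the three-dimensional Ising model and successfully discovers quantum phase transitions and exotic criticality in disordered systems.

Brandon Yee, Wilson Collins, Maximilian Rutkowski · 2026-03-13 · 🔬 cond-mat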

RAT+: Train Dense, Infer Sparse -- Recurrence Augmented Attention for Dilated Inference

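This paper proposes RAT+, a new architecture that lets a pretrained dense-attention model switch flexibly to sparse (dilated) patterns at inference time without retraining, while preserving long-range connectivity and minimizing accuracy loss.

Xiuying Wei, Caglar Gulcehre · 2026-03-13 · 🤖 cs.LG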

[b]=[d]-[t]+[p]: Self-supervised Speech Models Discover Phonological Vector Arithmetic

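Through a study covering 96 languages, this paper demonstrates that self-supervised speech models encode phonological features as linear vectors, which makes arithmetic between phonemes possible (e.g., [d]-[t]+[p]=[b]).

Kwanghee Choi, Eunjung Yeo, Cheol Jun Cho, David Harwath, David R. Mortensen · 2026-03-13 · ⚡ eess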

De novo molecular structure elucidation from mass spectra via flow matching

This paper introduces MSFlow, a two-stage encoder-decoder flow-matching generative model for elucidating molecular structure directly from mass spectra. It reports up to a 14-fold improvement over existing state-of-the-art methods, translating up to 45% of spectra into accurate molecular representations.

Ghaith Mqawass (TUM School of Life Sciences Weihenstephan, Technical University of Munich, Germany, Machine Learning and Computational Sciences, Pfizer Research & Development, Berlin, Germany), Tuan Le (Machine Learning and Computational Sciences, Pfizer Research & Development, Berlin, Germany), Fabian Theis (TUM School of Life Sciences Weihenstephan, Technical University of Munich, Germany, TUM School of Computation, Information and Technology, Technical University of Munich, Germany, Institute of Computational Biology, Helmholtz Center Munich, Germany), Djork-Arné Clevert (Machine Learning and Computational Sciences, Pfizer Research & Development, Berlin, Germany) · 2026-03-13 · 🤖 cs.LG

Controllable Exploration in Hybrid-Policy RLVR for Multi-Modal Reasoning

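This paper proposes CalibRL, a hybrid-policy RLVR framework with controllable exploration that combines expert knowledge with rarity-based weighting to prevent entropy collapse and policy degradation in reinforcement learning for multimodal large language models, while keeping exploration and exploitation in balance.

Zhuoxu Huang, Mengxi Jia, Hao Sun, Xuelong Li, Jungong Han · 2026-03-13 · 🤖 cs.LG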

ECHOSAT: Estimating Canopy Height Over Space And Time

Using multi-sensor satellite data and a self-supervised growth loss function, this paper develops ECHOSAT, a global, temporally consistent canopy height map at 10 m resolution. It aims to overcome the limitations of existing static maps and to support carbon monitoring and disturbance assessment.

Jan Pauls, Karsten Schrödter, Sven Ligensa, Martin Schwartz, Berkant Turan, Max Zimmer, Sassan Saatchi, Sebastian Pokutta, Philippe Ciais, Fabian Gieseke · 2026-03-13 · 🤖 cs.LG

Unsupervised Discovery of Intermediate Phase Order in the Frustrated $J_1$ - $J_2$ Heisenberg Model via Prometheus Framework

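Using the Prometheus framework, this paper applies unsupervised learning to reduced density matrices (RDMs), which encode local quantum correlations, and successfully discovers the intermediate phase transition of the $J_1$ - $J_2$ Heisenberg model even in large systems where the full wavefunction is inaccessible, pointing to a scalable path forward.

Brandon Yee, Wilson Collins, Maximilian Rutkowski · 2026-03-13 · ⚛️ quant-ph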

Extending Sequence Length is Not All You Need: Effective Integration of Multimodal Signals for Gene Expression Prediction

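Rather than extending DNA sequence length, this paper proposes Prism, a framework that effectively integrates multimodal epigenetic signals near the target gene to reduce the confounding effect of background chromatin patterns, showing that state-of-the-art gene expression prediction can be achieved with short sequences.

Zhao Yang, Yi Duan, Jiwei Zhu, Ying Ba, Chuan Cao, Bing Su · 2026-03-13 · 🧬 q-bio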
