VISTA: Vision-Language Inference for Training-Free Stock Time-Series Analysis

The paper introduces VISTA, a novel training-free framework that leverages Vision-Language Models to predict stock prices by jointly analyzing textual data and line charts through zero-shot prompting, achieving significant performance improvements over traditional statistical and text-only baselines.

Tina Khezresmaeilzadeh, Parsa Razmara, Seyedarmin Azizi, Mohammad Erfan Sadeghi, Erfan Baghaei Potraghloo · Tue, 10 Ma… · cs.LG

ViTaPEs: Visuotactile Position Encodings for Cross-Modal Alignment in Multimodal Transformers

The paper introduces ViTaPEs, a transformer-based architecture that employs a novel two-stage positional encoding strategy to effectively fuse visual and tactile modalities, achieving state-of-the-art performance and zero-shot generalization across diverse recognition and robotic grasping tasks without relying on pre-trained vision-language models.

Fotios Lygerakis, Ozan Özdenizci, Elmar Rückert · Tue, 10 Ma… · cs.LG

MMTU: A Massive Multi-Task Table Understanding and Reasoning Benchmark

This paper introduces MMTU, a large-scale benchmark comprising over 28,000 questions across 25 real-world expert-level table tasks, designed to comprehensively evaluate and reveal the significant limitations of current frontier models in understanding, reasoning, and manipulating structured tabular data.

Junjie Xing, Yeye He, Mengyu Zhou, Haoyu Dong, Shi Han, Lingjiao Chen, Dongmei Zhang, Surajit Chaudhuri, H. V. Jagadish · Tue, 10 Ma… · cs.LG

EROICA: Online Performance Troubleshooting for Large-scale Model Training

This paper presents EROICA, the first online troubleshooting system deployed on production-scale GPU clusters (~100,000 GPUs). It diagnoses complex hardware and software performance issues in large-scale model training through fine-grained profiling and differential observability, with minimal impact on the training jobs themselves.

Yu Guan, Zhiyu Yin, Haoyu Chen, Sheng Cheng, Chaojie Yang, Kun Qian, Tianyin Xu, Pengcheng Zhang, Yang Zhang, Hanyu Zhao, Yong Li, Wei Lin, Dennis Cai, Ennan Zhai · Tue, 10 Ma… · cs.LG

BemaGANv2: Discriminator Combination Strategies for GAN-based Vocoders in Long-Term Audio Generation

BemaGANv2 is a GAN-based vocoder that enhances long-term audio generation for Text-to-Music and Text-to-Audio applications. It integrates Anti-aliased Multi-Periodicity composition modules into the generator and systematically evaluates novel discriminator combination strategies, including the Multi-Envelope Discriminator, to achieve high-fidelity and temporally coherent results.

Taesoo Park, Mungwi Jeong, Mingyu Park, Narae Kim, Junyoung Kim, Mujung Kim, Jisang Yoo, Hoyun Lee, Sanghoon Kim, Soonchul Kwon · Tue, 10 Ma… · cs.LG

Efficient Algorithms for Logistic Contextual Slate Bandits with Bandit Feedback

This paper introduces two efficient algorithms, Slate-GLM-OFU and Slate-GLM-TS, for the Logistic Contextual Slate Bandit problem. By combining local planning with global learning, they achieve Õ(√T) regret with N^O(1) per-round computational complexity, and outperform prior methods on both synthetic benchmarks and practical language-model applications.

Tanmay Goyal, Gaurav Sinha · Tue, 10 Ma… · cs.LG
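As a rough illustration of the generalized-linear Thompson Sampling primitive that Slate-GLM-TS builds on (not the paper's slate algorithm — the local-planning/global-learning decomposition is omitted), here is a minimal single-action logistic bandit with a Laplace-approximate Gaussian posterior. All names, dimensions, and the online Newton-style update are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def run_glm_ts(T=500, seed=0):
    """Thompson Sampling for a toy logistic (GLM) bandit.

    Maintains a Gaussian (Laplace-approximate) posterior over the unknown
    parameter, samples from it each round, and plays the arm that is best
    under the sample. Illustrative sketch, not the paper's Slate-GLM-TS.
    """
    rng = np.random.default_rng(seed)
    d = 3
    theta_star = np.array([1.0, -0.5, 0.25])   # unknown true parameter
    arms = rng.normal(size=(10, d))            # fixed arm feature vectors
    mean = np.zeros(d)                         # posterior mean
    prec = np.eye(d)                           # posterior precision
    for _ in range(T):
        theta = rng.multivariate_normal(mean, np.linalg.inv(prec))
        a = arms[np.argmax(arms @ theta)]      # greedy w.r.t. the sampled parameter
        r = float(rng.random() < sigmoid(a @ theta_star))  # Bernoulli reward
        p = sigmoid(a @ mean)
        prec += p * (1 - p) * np.outer(a, a)   # online Newton-style posterior update
        mean = mean + np.linalg.solve(prec, (r - p) * a)
    return mean, prec

mean, prec = run_glm_ts()
```

The slate setting additionally requires choosing N slots jointly, which is where the paper's planning/learning split keeps the per-round cost polynomial in N.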

Sharpness-Aware Machine Unlearning

This paper characterizes how Sharpness-Aware Minimization (SAM) alters generalization during machine unlearning, showing that it abandons its denoising properties when fitting forget signals. Building on this, the authors propose "Sharp MinMax", a method that splits the model to simultaneously learn retain signals via SAM and unlearn forget signals via sharpness maximization, achieving superior unlearning performance, reduced feature entanglement, and enhanced privacy.

Haoran Tang, Rajiv Khanna · Tue, 10 Ma… · cs.LG
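For context, the SAM primitive the summary refers to first ascends to the sharpest nearby point w + eps and then descends using the gradient taken there; sharpness maximization on forget data would flip the sign of that descent objective. Below is a minimal sketch of one SAM step on a toy quadratic loss L(w) = 0.5 · wᵀAw; rho and lr are illustrative choices, and this is not the paper's Sharp MinMax implementation.

```python
import numpy as np

def grad(w, A):
    # Gradient of the toy quadratic loss L(w) = 0.5 * w @ A @ w
    return A @ w

def sam_step(w, A, rho=0.05, lr=0.1):
    """One Sharpness-Aware Minimization step (illustrative sketch)."""
    g = grad(w, A)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # ascend to the sharp neighbor
    return w - lr * grad(w + eps, A)             # descend with the perturbed gradient

A = np.diag([1.0, 2.0])
w = np.array([1.0, 1.0])
for _ in range(200):
    w = sam_step(w, A)
```

Because the descent direction is evaluated at the worst-case perturbed point, the iterates are biased toward flat minima, which is the denoising behavior the paper shows SAM gives up when it is forced to fit forget signals.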

DemoDiffusion: One-Shot Human Imitation using pre-trained Diffusion Policy

DemoDiffusion is a one-shot imitation learning method that enables robots to perform diverse manipulation tasks from a single human demonstration. It uses kinematic retargeting to derive a rough trajectory from the demonstration, then refines that trajectory with a pre-trained diffusion policy to keep it aligned with plausible robot actions, achieving significantly higher success rates than baseline approaches without requiring task-specific training or paired data.

Sungjae Park, Homanga Bharadhwaj, Shubham Tulsiani · Tue, 10 Ma… · cs.LG