Stock Market Prediction Using Node Transformer Architecture Integrated with BERT Sentiment Analysis

This paper proposes an integrated framework combining a node transformer architecture with BERT-based sentiment analysis to model stock market graphs and social media sentiment, demonstrating superior forecasting accuracy (0.80% MAPE) and directional precision compared to traditional ARIMA and LSTM models across 20 S&P 500 stocks from 1982 to 2025.

Mohammad Al Ridhawi, Mahtab Haj Ali, Hussein Al Osman · 2026-03-09 · cs.AI
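The 0.80% MAPE figure reported above is a mean absolute percentage error. As a quick reference, here is a minimal sketch of how MAPE is computed on a price series (the numbers are toy values, not the paper's data):

```python
import numpy as np

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)

# Toy example: predictions within roughly 1% of the true closing prices.
prices = [100.0, 102.0, 101.0, 105.0]
preds  = [ 99.0, 102.5, 101.5, 104.0]
print(round(mape(prices, preds), 2))  # → 0.73
```

A sub-1% MAPE, as claimed in the abstract, means predictions deviate from actual prices by less than 1% on average.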

Addressing the Ecological Fallacy in Larger LMs with Human Context

This paper demonstrates that addressing the ecological fallacy by modeling an author's language context through a specific task called HuLM, particularly during fine-tuning (HuFT) or continued pre-training, significantly improves the performance of an 8B Llama model across multiple downstream tasks compared to standard training methods.

Nikita Soni, Dhruv Vijay Kunjadiya, Pratham Piyush Shah, Dikshya Mohanty, H. Andrew Schwartz, Niranjan Balasubramanian · 2026-03-09 · cs.AI

Who We Are, Where We Are: Mental Health at the Intersection of Person, Situation, and Large Language Models

This paper proposes an interpretable modeling approach that integrates person-level psychological traits with situational context features derived from social media data to predict dynamic mental well-being, demonstrating that theory-driven methods offer competitive performance and greater human-understandable insights compared to standard language model embeddings.

Nikita Soni, August Håkan Nilsson, Syeda Mahwish, Vasudha Varadarajan, H. Andrew Schwartz, Ryan L. Boyd · 2026-03-09 · cs.AI

Preventing Learning Stagnation in PPO by Scaling to 1 Million Parallel Environments

This paper identifies that learning stagnation in PPO arises from poor sample-based loss estimates due to excessive step sizes relative to gradient noise, proposing that scaling to over one million parallel environments effectively mitigates this issue and enables monotonic performance improvements up to one trillion transitions.

Michael Beukman, Khimya Khetarpal, Zeyu Zheng, Will Dabney, Jakob Foerster, Michael Dennis, Clare Lyle · 2026-03-09 · cs.LG
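The claim that massive parallelism fixes poor sample-based loss estimates rests on a standard statistical fact: the noise in a batch-mean gradient estimate shrinks as 1/√(batch size). A toy Monte Carlo sketch of that scaling (the Gaussian noise model and all constants are illustrative assumptions, not the paper's RL setup):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: the true policy gradient is 1.0, but each environment's
# per-sample estimate is corrupted by heavy noise (std = 5.0).
true_grad, noise_std, trials = 1.0, 5.0, 500

def estimate_noise(n_envs):
    """Std of the batch-mean gradient estimate across many trials."""
    samples = true_grad + noise_std * rng.standard_normal((trials, n_envs))
    return float(samples.mean(axis=1).std())

for n in (64, 1024, 16384):
    print(f"{n:>5} parallel envs -> gradient-estimate std {estimate_noise(n):.4f}")
```

Each 16× increase in the number of environments cuts the gradient-estimate noise by about 4×, which is the mechanism by which a fixed step size stops being "excessive relative to gradient noise" as parallelism grows.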

Agnostic learning in (almost) optimal time via Gaussian surface area

This paper improves the known bounds for agnostic learning of concept classes with bounded Gaussian surface area by demonstrating that a polynomial degree of $\tilde{O}(\Gamma^2 / \varepsilon^2)$ suffices for $\varepsilon$-approximation, thereby yielding near-optimal complexity for learning polynomial threshold functions in the statistical query model.

Lucas Pesenti, Lucas Slot, Manuel Wiedmer · 2026-03-09 · cs.LG

Improved high-dimensional estimation with Langevin dynamics and stochastic weight averaging

This paper demonstrates that Langevin dynamics combined with stochastic weight averaging can achieve optimal sample complexity of $n \gtrsim d^{k^\star/2}$ for recovering a hidden direction in high-dimensional settings like tensor PCA and single-index models, effectively emulating landscape smoothing without explicit regularization.

Stanley Wei, Alex Damian, Jason D. Lee · 2026-03-09 · cs.LG
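The core recipe, Langevin dynamics plus averaging of the iterates, can be illustrated on a toy problem. The sketch below assumes only the generic mechanism (noisy gradient steps followed by tail averaging) and runs on a 1-D nonconvex objective, not the paper's high-dimensional tensor-PCA or single-index settings:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy nonconvex objective f(x) = (x^2 - 1)^2, with minima at x = +/-1.
grad = lambda x: 4 * x * (x**2 - 1)

x, eta, temp = 2.0, 0.01, 0.01   # step size and (small) noise temperature
iterates = []
for _ in range(5000):
    # Langevin step: gradient descent plus injected Gaussian noise.
    x = x - eta * grad(x) + np.sqrt(2 * eta * temp) * rng.standard_normal()
    iterates.append(x)

# Stochastic weight averaging: average the tail of the trajectory.
x_swa = float(np.mean(iterates[2500:]))
print(f"final iterate: {x:+.3f}, averaged iterate: {x_swa:+.3f}")
```

The individual iterates jitter around the minimum because of the injected noise; averaging the tail of the trajectory cancels that jitter, which is the sense in which averaging "emulates landscape smoothing without explicit regularization."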

DQE: A Semantic-Aware Evaluation Metric for Time Series Anomaly Detection

This paper proposes DQE, a novel semantic-aware evaluation metric for time series anomaly detection that addresses existing limitations in bias, consistency, and false alarm penalization by introducing a semantic-based partitioning strategy and aggregating scores across the full threshold spectrum to provide more stable, discriminative, and interpretable assessments.

Yuewei Li, Dalin Zhang, Huan Li, Xinyi Gong, Hongjun Chu, Zhaohui Song · 2026-03-09 · cs.LG
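The idea of aggregating scores across the full threshold spectrum, rather than committing to a single operating threshold, can be sketched generically. The function below averages point-wise F1 over a sweep of thresholds; it is only an illustration of that aggregation idea, not the DQE metric itself (which additionally uses a semantic-based partitioning strategy not reproduced here):

```python
import numpy as np

def threshold_spectrum_score(scores, labels, n_thresholds=50):
    """Average point-wise F1 over thresholds spanning the score range.
    A generic threshold-spectrum aggregation, not the paper's DQE."""
    scores, labels = np.asarray(scores), np.asarray(labels)
    f1s = []
    for t in np.linspace(scores.min(), scores.max(), n_thresholds, endpoint=False):
        pred = scores > t
        tp = np.sum(pred & (labels == 1))
        fp = np.sum(pred & (labels == 0))
        fn = np.sum(~pred & (labels == 1))
        f1s.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return float(np.mean(f1s))

labels = np.array([0, 0, 0, 1, 1, 0, 0, 0])
good = np.array([.1, .2, .1, .9, .8, .2, .1, .1])  # scores track the anomaly
bad  = np.array([.5, .5, .5, .5, .5, .5, .5, .5])  # uninformative scores
print(threshold_spectrum_score(good, labels),
      threshold_spectrum_score(bad, labels))  # informative >> uninformative
```

Because the score integrates over every threshold, a detector only scores well if it separates anomalies from normal points robustly, not just at one cherry-picked cutoff; this is the stability property the abstract attributes to aggregating across the threshold spectrum.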