This section covers research at the intersection of physics and data analysis, a fast-moving field in which statistical methods are used to extract patterns from large, complex datasets. The work ranges from tracking particle collisions to modeling cosmic structures, turning raw measurements into insight about physical systems.

Gist.Science monitors every new preprint in this category as it appears on arXiv, ensuring you never miss a breakthrough. We process each entry to provide both plain-language overviews for general understanding and detailed technical summaries for experts, bridging the gap between dense research and clear comprehension.

Below are the latest papers in physics and data analysis, organized for easy reading and discovery.

Selectivity- and Activity-Aware Catalyst Descriptors for CO₂ Hydrogenation on Alloy Nanocatalysts using Machine-Learned Force Fields

This study introduces a facet-resolved adsorption energy distribution framework utilizing machine-learned force fields to analyze 1.4 million adsorption sites across diverse alloy surfaces, thereby identifying specific compositions and orientations that optimize both activity and methanol selectivity for CO₂ hydrogenation.

Prajwal Pisal, Ondřej Krejčí, Patrick Rinke · 2026-05-11 · 🔬 cond-mat.mtrl-sci

An information-matching approach to optimal experimental design and active learning

This contribution presents a scalable information-matching approach, based on convex optimization, that leverages the Fisher information matrix to select minimal, high-quality training data for accurately predicting quantities of interest, thereby addressing data scarcity and parameter non-identifiability in diverse scientific modeling and active learning applications.

Yonatan Kurniawan, Tracianne B. Neilsen, Benjamin L. Francis, Alex M. Stankovic, Mingjian Wen, Ilia Nikiforov, Ellad B. Tadmor, Vasily V. Bulatov, Vincenzo Lordi, Mark K. Transtrum · 2026-05-08 · 🔬 physics.app-ph

Bayesian leave-one-out cross-validation for astrophysical model comparison using gravitational-wave background data

This study employs Bayesian leave-one-out cross-validation on pulsar-timing-array data to compare four models of supermassive-black-hole-binary evolution, finding that while current evidence does not decisively favor any single model over others, the data support ultralight-dark-matter-induced low-frequency suppression without yet distinguishing it from generic environmental hardening scenarios.

Shreyas Tiruvaskar, Chris Gordon · 2026-05-08 · 🔭 astro-ph

Partial Effective Information Decomposition for Synergistic Causality

This paper introduces Partial Effective Information Decomposition (PEID), a novel interventionist framework that uniquely decomposes multivariate causal influences into unique and synergistic components under maximum-entropy interventions, thereby enabling the characterization of synergistic causation, downward causation, and interpretable causal structures in complex systems.

Mingzhe Yang, Shuo Wang, Jiang Zhang · 2026-05-06 · 📊 stat

Toward a Scientific Discovery Engine for Weather and Climate Data: A Visual Analytics Workbench for Embedding-Based Exploration

This paper presents an open-source visual analytics workbench that enables scientists to interpret, validate, and explore embedding-based representations of large-scale weather and climate data by linking latent-space search results back to their physical origins and metadata, thereby facilitating a discovery workflow for identifying and retrieving analog events like tropical cyclones.

Nihanth W. Cherukuru, Matt Rehme, Kirsten J. Mayer, David John Gagne, John Schreck, John Clyne, Charlie Becker · 2026-05-05 · 🔬 physics

Testing General Relativity Through Gravitational Wave Classification: A Convolutional Neural Network Framework

This paper introduces a machine learning framework that utilizes convolutional neural networks trained on response function observables to significantly enhance the classification of gravitational wave signals for testing general relativity, achieving a 33-fold improvement in sensitivity over standard waveform inputs and successfully detecting deviations in massive gravity theories.

Lavinia Heisenberg, Shayan Hemmatyar, Hector Villarrubia-Rojo · 2026-05-05 · ⚛️ gr-qc

Smart Ensemble Learning Framework for Predicting Groundwater Heavy Metal Pollution

This study proposes a robust predictive framework for groundwater heavy metal pollution in the Densu Basin that integrates Gaussian copula transformations with nested cross-validated ensemble machine learning to overcome the limitations of conventional methods and accurately model the skewed Heavy Metal Pollution Index.

T. Ansah-Narh, G. Y. Afrifa, J. B. Tandoh, K. Asare, M. Addi, K. E. Yorke, D. M. A. Akpoley, K. Aidoo, S. K. Fosuhene · 2026-05-04 · 🤖 cs.LG