stat papers | Gist.Science

Variable selection in linear mixed model meta-regression with suspected interaction effects -- How can tree-based methods help?

This paper evaluates the effectiveness of tree-based methods, particularly stability-selected random effects trees, as robust complementary tools for detecting interaction effects in linear mixed model meta-regression, demonstrating their superiority over traditional linear methods when interactions are nonlinear and their growing competitiveness as the number of studies increases.

Jan-Bernd Igelmann, Paula Lorenz, Markus Pauly | Mon, 09 Ma | stat

Large Wave Direction Data Modeling Using Wrapped Spatial Gaussian Markov Random Fields

This paper proposes a wrapped Gaussian Markov random field (WGMRF) model to address the computational limitations of existing wrapped Gaussian process methods for large-scale, high-resolution spatial directional data, demonstrating its superior scalability and predictive performance through simulations and an application to wave direction data from the 2004 Indian Ocean Tsunami.

Arnab Hazra | Mon, 09 Ma | stat

Optimizing Complex Health Intervention Packages through the Learn-As-you-GO (LAGO) Design

This paper introduces the Learn-As-you-GO (LAGO) design, an adaptive methodology that iteratively optimizes complex, multi-component health interventions during a trial to ensure effectiveness and minimize costs, demonstrating its potential to prevent trial failures through examples from the BetterBirth study and ongoing HIV and non-communicable disease research.

Donna Spiegelman (Center on Methods for Implementation and Prevention Science; Department of Biostatistics, Yale University), Dong Roman Xu (Southern Medical University Institute for Global Health), Ante Bing (Department of Mathematics and Statistics, Boston University), Guangyu Tong (Section of Cardiovascular Medicine, Department of Internal Medicine, Yale University), Mona Abdo (Center on Methods for Implementation and Prevention Science; Department of Biostatistics, Yale University), Jingyu Cui (Center on Methods for Implementation and Prevention Science; Department of Biostatistics, Yale University), Charles Goss (Center for Biostatistics and Data Science, Washington University School of Medicine), John Baptist Kiggundu (Infectious Diseases Research Collaboration), Chris T. Longenecker (Division of Cardiology and Department of Global Health, University of Washington), LaRon Nelson (Yale School of Nursing, Yale University), Drew Cameron (Department of Health Policy and Management, Yale University), Fred Semitala (Infectious Diseases Research Collaboration; Department of Medicine, Makerere University; Makerere University Joint AIDS Program), Xin Zhou (Center on Methods for Implementation and Prevention Science; Department of Biostatistics, Yale University), Judith J. Lok (Department of Mathematics and Statistics, Boston University) | Mon, 09 Ma | stat

Clustering-Based Outcome Models for Clinical Studies: A Scoping Review

This scoping review systematically examines and categorizes clustering-based outcome models for clinical studies into informed and agnostic approaches, highlighting their utility in handling high-dimensional data and heterogeneous populations for applications such as risk stratification, rare disease research, and subgroup-specific treatment effect estimation.

Johannes Vilsmeier, Fabian Eibensteiner, Franz König, Francois Mercier, Robin Ristl, Nigel Stallard, Marc Vandemeulebroecke, Sarah Zohar, Martin Posch | Mon, 09 Ma | stat

Simultaneously accounting for winner's curse and sample structure in Mendelian randomization: bivariate rerandomized inverse variance weighted estimator

This paper proposes the bivariate rerandomized inverse variance weighted (BRIVW) estimator, a novel method that simultaneously corrects for winner's curse and sample structure in two-sample Mendelian randomization by modeling the joint distribution of genetic associations to provide more accurate and consistent causal effect estimates.

Xin Liu, Ping Yin, Peng Wang | Mon, 09 Ma | stat

On parameter estimation for the truncated skew-normal distribution

This paper proposes a stable and accurate grid-based method of moments (GRID-MOM) for estimating parameters of the truncated skew-normal distribution by decoupling the shape parameter from location and scale estimates, thereby overcoming numerical instability in existing approaches.

Kwangok Seo, Seul Lee, Johan Lim | Mon, 09 Ma | stat

Modeling Animal Communication Using Multivariate Hawkes Processes with Additive Excitation and Multiplicative Inhibition

This paper proposes a novel multivariate Hawkes process framework combining additive excitation and multiplicative inhibition to effectively model animal acoustic communication, which is validated through simulations and applied to reveal distinct interaction patterns in meerkat and baleen whale datasets.

Bokgyeong Kang, Erin M. Schliep, Alan E. Gelfand, Ariana Strandburg-Peshkin, Robert S. Schick | Mon, 09 Ma | stat

Two Localization Strategies for Sequential MCMC Data Assimilation with Applications to Nonlinear Non-Gaussian Geophysical Models

This paper introduces and evaluates two localization strategies for a sequential Markov Chain Monte Carlo data assimilation framework, demonstrating their ability to efficiently handle high-dimensional, nonlinear, and non-Gaussian geophysical models while avoiding weight degeneracy and outperforming traditional ensemble Kalman filters in scenarios with heavy-tailed observation noise.

Hamza Ruzayqat, Hristo G. Chipilski, Omar Knio | Mon, 09 Ma | stat

Robust Estimation of Location in Matrix Manifolds Using the Projected Frobenius Median

This paper proposes a computationally efficient and robust method for estimating the location of data on various matrix manifolds by computing the Frobenius median in an ambient Euclidean space and projecting it onto the manifold, while establishing its theoretical properties and demonstrating its effectiveness through simulations and real-world earthquake data.

Houren Hong, Kassel Liam Hingee, Janice L. Scealy, Andrew T. A. Wood | Mon, 09 Ma | stat
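The two-step idea in the summary above (take a median in the ambient Euclidean space, then project onto the manifold) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the manifold is taken to be the unit sphere (the simplest Stiefel manifold), the ambient median is computed with the Weiszfeld iteration, and the function names `frobenius_median` and `project_to_sphere` are hypothetical.

```python
import math

def frobenius_median(points, iters=200, eps=1e-12):
    """Geometric median in the ambient Euclidean space (Weiszfeld iteration).

    Assumption: Weiszfeld is used here for illustration only; the summary
    does not say how the ambient median is computed.
    """
    dim = len(points[0])
    # Start from the coordinate-wise mean.
    est = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    for _ in range(iters):
        # Inverse-distance weights; eps guards against division by zero.
        weights = [1.0 / max(math.dist(p, est), eps) for p in points]
        total = sum(weights)
        est = [sum(w * p[i] for w, p in zip(weights, points)) / total
               for i in range(dim)]
    return est

def project_to_sphere(v):
    """Closest point on the unit sphere: rescale to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("projection is undefined at the origin")
    return [x / norm for x in v]

# Five directions clustered near the north pole, plus two outliers.
z = math.sqrt(1.0 - 0.1 ** 2)
data = [
    (0.1, 0.0, z), (-0.1, 0.0, z), (0.0, 0.1, z), (0.0, -0.1, z),
    (0.0, 0.0, 1.0),
    (1.0, 0.0, 0.0), (1.0, 0.0, 0.0),
]
location = project_to_sphere(frobenius_median(data))
print(location)
```

In this toy data set the projected median should stay close to the cluster near the pole, whereas a projected mean would be pulled noticeably toward the two outliers. For other matrix manifolds only the projection step would change.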

Preoperative Decline and Postoperative Recovery of Wearable-Derived Physical Activity Over a Four-Year Perioperative Period in Total Knee and Hip Arthroplasty: Evidence from the All of Us Research Program

This longitudinal study of 238 All of Us participants utilizing four years of Fitbit data reveals that total knee and hip arthroplasty patients experience progressive preoperative activity declines followed by a staged postoperative recovery pattern, with higher immediate preoperative activity levels significantly predicting a greater likelihood of returning to habitual physical activity.

Yuezhou Zhang, Amos Folarin, Callum Stewart, Hyunju Kim, Rongrong Zhong, Shaoxiong Sun, Richard JB Dobson | Mon, 09 Ma | stat

Change Point Detection for Cell Populations Measured via Flow Cytometry

This paper proposes a latent space Gaussian mixture-of-experts model with a group-fused LASSO penalty to detect abrupt environmental change points in single-cell flow cytometry data of marine phytoplankton, successfully identifying a transition zone between two marine provinces.

Yik Lun Kei, Qi Wang, Paul Parker, Francois Ribalet, Sangwon Hyun | Mon, 09 Ma | stat

Two-stage Adaptive Design Cluster Randomised Trials

This paper proposes a two-stage adaptive design framework for cluster randomised trials that combines test approaches with Pareto optimality to enable early stopping, sample size re-estimation, and trial re-design, thereby addressing high uncertainty in correlation parameters and balancing multi-dimensional cost-efficiency objectives.

Samuel I. Watson, James Martin | Mon, 09 Ma | stat

Prediction-Oriented Transfer Learning for Survival Analysis

This paper proposes a novel transfer learning framework for survival analysis that enhances prediction accuracy in data-scarce target studies by transferring predictive knowledge from source studies using flexible semiparametric transformation models and an EM algorithm, without requiring access to individual-level source data or assuming shared model parameters.

Yu Gu, Donglin Zeng, D. Y. Lin | Fri, 13 Ma | stat

Multivariate Functional Principal Component Analysis for Mixed-Type mHealth Data: An Application to Mood Disorders

This paper proposes a semiparametric multivariate functional principal component analysis method for mixed-type mHealth data, which effectively identifies interpretable time-of-day patterns across diverse health domains to stratify mood disorder subtypes.

Debangan Dey, Rahul Ghosal, Kathleen Merikangas, Vadim Zipunnikov | Fri, 13 Ma | stat

Finite-Sample Decision Instability in Threshold-Based Process Capability Approval

This study reveals that process capability decisions based on fixed thresholds (e.g., $C_{pk} \geq 1.33$) using moderate sample sizes inherently carry significant instability and boundary risk, as the probability of acceptance converges to 0.5 when the true capability equals the threshold, a finding supported by asymptotic theory, simulations, and empirical data from 880 manufacturing dimensions.

Fei Jiang, Lei Yang | Fri, 13 Ma | stat
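The boundary effect described in the summary above can be illustrated with a small Monte Carlo sketch. It is not taken from the paper: it assumes normally distributed measurements, an off-center process so that a single specification limit determines $C_{pk}$, and the usual plug-in estimator $\hat{C}_{pk} = \min(\mathrm{USL} - \bar{x},\ \bar{x} - \mathrm{LSL}) / (3s)$. The function names and default settings are hypothetical.

```python
import math
import random

def cpk_hat(sample, lsl, usl):
    """Plug-in Cpk estimate from the sample mean and standard deviation."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    return min(usl - mean, mean - lsl) / (3.0 * sd)

def acceptance_rate(n, true_cpk=1.33, threshold=1.33, reps=2000, seed=0):
    """Fraction of simulated samples whose estimated Cpk meets the threshold.

    Assumptions (illustrative only): normal data with mu = 0 and sigma = 1;
    the upper limit is placed so that it alone fixes the true Cpk, and the
    lower limit is far enough away that it never binds in practice.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0
    usl = mu + 3.0 * true_cpk * sigma
    lsl = mu - 10.0 * sigma
    accepted = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        if cpk_hat(sample, lsl, usl) >= threshold:
            accepted += 1
    return accepted / reps

for n in (30, 100):
    print(n, acceptance_rate(n))
```

When the true capability sits exactly on the threshold, the estimate lands on either side of it about equally often, so the approval decision behaves much like a coin flip regardless of sample size. Raising `true_cpk` well above `threshold` should push the acceptance rate toward 1.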

Worst-case low-rank approximations

This paper introduces a unified framework called wcPCA for developing worst-case optimal low-rank approximations across heterogeneous domains, extending the approach to new estimators and matrix completion while proving theoretical guarantees and demonstrating improved robustness in real-world applications with minimal loss to average performance.

Anya Fries, Markus Reichstein, David Blei, Jonas Peters | Fri, 13 Ma | stat

Outrigger local polynomial regression

This paper introduces the "outrigger" local polynomial estimator, a distributionally adaptive method that achieves minimax optimality across various conditional error distributions without requiring structural assumptions like independence or symmetry, while guaranteeing performance at least as good as standard estimators under Gaussian errors.

Elliot H. Young, Rajen D. Shah, Richard J. Samworth | Fri, 13 Ma | stat

RIE-Greedy: Regularization-Induced Exploration for Contextual Bandits

This paper proposes RIE-Greedy, a contextual bandit algorithm that leverages the inherent stochasticity of cross-validation-based regularization in black-box models to induce effective exploration, theoretically matching Thompson Sampling in simple cases and empirically outperforming existing methods in large-scale applications.

Tong Li, Thiago de Queiroz Casanova, Eric M. Schwartz, Victor Kostyuk, Dehan Kong, Joseph J. Williams | Fri, 13 Ma | stat

A Statistically Reliable Optimization Framework for Bandit Experiments in Scientific Discovery

This paper presents a unified optimization framework that resolves the statistical validity and power challenges of applying multi-armed bandits in scientific discovery by introducing corrected hypothesis testing methods and a novel objective function to balance cumulative reward with statistical efficiency.

Tong Li, Travis Mandel, Goldie Phillips, Anna Rafferty, Eric M. Schwartz, Dehan Kong, Joseph J. Williams | Fri, 13 Ma | stat

Continuous-time modeling and bootstrap for Schnieper's reserving

This paper proposes a continuous-time stochastic framework for Schnieper's reserving model that utilizes a Poisson measure for claim arrivals and Brownian motion for cost fluctuations, enabling a robust bootstrap method to estimate the full predictive distribution of reserves while naturally ensuring non-negativity and accounting for asymmetry.

Nicolas Baradel | Fri, 13 Ma | stat
