Autocorrelation effects in a stochastic-process model for decision making via time series
This study uses a stochastic-process model to show that the optimal autocorrelation of the time-series signals used to solve multi-armed bandit problems depends on the reward environment. Negative autocorrelation is advantageous in reward-rich settings and positive autocorrelation in reward-poor ones, whereas performance is independent of autocorrelation when the sum of the winning probabilities equals one.
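
The stated dependence can be made concrete with a small sketch. The code below is illustrative only and is not the study's model: it assumes a two-armed bandit, assumes that "reward-rich" and "reward-poor" mean a sum of winning probabilities above or below one (the summary only fixes the boundary case), and uses an AR(1) process as a stand-in for a time series with tunable lag-1 autocorrelation. All function names and the "balanced" label are hypothetical.

```python
import random


def ar1_series(n, rho, seed=0):
    """Zero-mean, unit-variance AR(1) series: x[t] = rho * x[t-1] + noise.

    An assumed stand-in for a signal with lag-1 autocorrelation rho;
    the study's actual signal source is not specified here.
    """
    rng = random.Random(seed)
    noise_std = (1.0 - rho ** 2) ** 0.5  # keeps the stationary variance at 1
    x = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        x.append(rho * x[-1] + rng.gauss(0.0, noise_std))
    return x


def lag1_autocorrelation(x):
    """Sample lag-1 autocorrelation coefficient of a sequence."""
    mean = sum(x) / len(x)
    num = sum((a - mean) * (b - mean) for a, b in zip(x, x[1:]))
    den = sum((a - mean) ** 2 for a in x)
    return num / den


def classify_environment(p_a, p_b, tol=1e-12):
    """Label a two-armed environment by the sum of its winning probabilities.

    Assumption: sum > 1 is "reward-rich", sum < 1 is "reward-poor".
    """
    total = p_a + p_b
    if abs(total - 1.0) <= tol:
        return "balanced"
    return "reward-rich" if total > 1.0 else "reward-poor"


def favored_autocorrelation(p_a, p_b):
    """Sign of autocorrelation the summary reports as advantageous."""
    return {
        "reward-rich": "negative",
        "reward-poor": "positive",
        "balanced": "no dependence",
    }[classify_environment(p_a, p_b)]
```

For example, `favored_autocorrelation(0.9, 0.7)` returns `"negative"` under these assumptions, and `ar1_series(5000, -0.8)` produces a series whose sample lag-1 autocorrelation is negative. The sketch only encodes the reported qualitative result; it does not reproduce the decision-making dynamics or the performance analysis.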