Learning Risk Preferences in Markov Decision Processes: an Application to the Fourth Down Decision in the National Football League

This paper applies an inverse optimization framework to NFL play-by-play data to demonstrate that coaches' historically conservative fourth-down decisions are consistent with optimizing low quantiles of future value. The inferred risk preferences have become more risk-tolerant over time and vary with field position.

Nathan Sandholtz, Lucas Wu, Martin Puterman + 1 more | 2026-03-06 | math