How Well Do Multimodal Models Reason on ECG Signals?
This paper introduces a reproducible, scalable framework for evaluating how multimodal models reason on ECG signals. It decomposes reasoning into two components, each verified separately: "Perception", verified via code generation, and "Deduction", verified via retrieval against clinical criteria. This decomposition addresses the limitations of existing evaluation methods, which are either manual or superficial.
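To make the two-part decomposition concrete, the following is a minimal sketch and not the paper's implementation. It assumes a model's reasoning trace has already been split into a measurable claim (a heart rate) and a diagnostic claim. All names here (`CRITERIA`, `heart_rate_bpm`, `check_perception`, `retrieve_criterion`, `check_deduction`) are hypothetical. The perception check is hand-written, whereas the framework described above would generate such code, and the word-overlap lookup is only a stand-in for a real retrieval step.

```python
from statistics import mean

# Hypothetical, illustrative criteria store. A real system would retrieve
# from a curated clinical knowledge base rather than a two-entry dict.
CRITERIA = {
    "sinus tachycardia": "heart rate above 100 bpm with regular rhythm",
    "sinus bradycardia": "heart rate below 60 bpm with regular rhythm",
}


def heart_rate_bpm(r_peak_times_s):
    """Mean heart rate computed from R-peak timestamps given in seconds."""
    intervals = [b - a for a, b in zip(r_peak_times_s, r_peak_times_s[1:])]
    return 60.0 / mean(intervals)


def check_perception(claimed_bpm, r_peak_times_s, tolerance_bpm=5.0):
    """Perception: does the claimed measurement agree with the signal?"""
    return abs(heart_rate_bpm(r_peak_times_s) - claimed_bpm) <= tolerance_bpm


def retrieve_criterion(finding):
    """Return the diagnosis whose criterion shares the most words with the finding."""
    words = set(finding.lower().split())
    return max(CRITERIA, key=lambda name: len(words & set(CRITERIA[name].split())))


def check_deduction(finding, claimed_diagnosis):
    """Deduction: is the claimed diagnosis the one the retrieved criterion supports?"""
    return retrieve_criterion(finding) == claimed_diagnosis.lower()


# Assumed example: R peaks every 0.5 s, so the true rate is 120 bpm.
r_peaks = [0.0, 0.5, 1.0, 1.5, 2.0]
perception_ok = check_perception(120, r_peaks)
deduction_ok = check_deduction("heart rate above 100 bpm", "Sinus tachycardia")
```

Keeping the two checks separate means a model that reads the signal correctly but draws the wrong clinical conclusion (or the reverse) can be scored on each component independently.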