Patch Validation in Automated Vulnerability Repair
This paper introduces PVBench, a benchmark showing that over 40% of patches generated by automated vulnerability repair systems are falsely validated as correct: validation accepts them, yet they fail critical "PoC+" tests, which encode developer intentions, root cause locations, and specific coding conventions.
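To make the notion of a "falsely validated" patch concrete, the following minimal sketch counts patches that an initial validation step accepts but that fail the PoC+ tests. The record type `PatchResult`, its fields, the function `false_validation_rate`, and the sample data are all hypothetical illustrations, not part of PVBench. Taking the accepted patches as the denominator is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class PatchResult:
    """Hypothetical record of how one generated patch fared."""
    patch_id: str
    accepted_by_validation: bool  # assumed: outcome of the initial validation
    passes_poc_plus: bool         # assumed: outcome of the PoC+ tests


def false_validation_rate(results: Iterable[PatchResult]) -> float:
    """Fraction of accepted patches that nevertheless fail PoC+ tests."""
    accepted = [r for r in results if r.accepted_by_validation]
    if not accepted:
        return 0.0  # nothing was accepted, so nothing was falsely validated
    falsely_validated = [r for r in accepted if not r.passes_poc_plus]
    return len(falsely_validated) / len(accepted)


# Toy data for illustration only; these are not results from the paper.
sample = [
    PatchResult("p1", accepted_by_validation=True, passes_poc_plus=True),
    PatchResult("p2", accepted_by_validation=True, passes_poc_plus=False),
    PatchResult("p3", accepted_by_validation=True, passes_poc_plus=False),
    PatchResult("p4", accepted_by_validation=True, passes_poc_plus=True),
    PatchResult("p5", accepted_by_validation=False, passes_poc_plus=False),
]

rate = false_validation_rate(sample)
print(f"falsely validated: {rate:.0%}")
```

Rejected patches (such as `p5` here) are excluded because they were never reported as correct in the first place.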