Revisiting Referring Expression Comprehension Evaluation in the Era of Large Multimodal Models
Published in BEAM (CVPR Workshop), 2025
This study identifies limitations in existing referring expression comprehension (REC) benchmarks and introduces Ref-L4, a comprehensive benchmark with more diverse objects, longer expressions, and a larger vocabulary, to better evaluate modern REC models.
Recommended citation: Chen, J., Wei, F., Zhao, J., Song, S., Wu, B., Peng, Z., Chan, S.G., & Zhang, H. (2024). Revisiting Referring Expression Comprehension Evaluation in the Era of Large Multimodal Models. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 513-524.
