
We introduce a point-to-region (P2R) loss to address challenges in semi-supervised point-based crowd counting. Conventional point-to-point (P2P) supervision cannot propagate pseudo-label confidence to background pixels, which over-activates feature maps and causes non-pedestrian regions to be misinterpreted as foreground. We identify this issue experimentally and propose a point-specific activation map (PSAM) to visualize activation patterns; it reveals excessive activation around foreground pixels during semi-supervised training.
To resolve this, our P2R scheme replaces pixel-level matching with region-level supervision. Instead of matching individual points, P2R segments a local region around each pseudo-label, so that the pseudo-label's confidence score propagates to every pixel within that region. This removes the need for the Hungarian algorithm required by P2P, which substantially reduces computational cost. Because pseudo-label confidence is shared across each matched region, P2R also enables reliable supervision of background pixels.
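The region-level idea can be illustrated with a minimal sketch. This is not the published implementation: the function names, the fixed `radius`, the nearest-point region rule, the choice of the pixel at each pseudo-label as the only positive target, and the zero weight outside all regions are assumptions made purely for illustration. It shows how one pseudo-label's confidence can be shared by every pixel in its region and then used to weight a per-pixel loss, without any one-to-one matching step.

```python
import math

def region_confidence_map(height, width, pseudo_points, radius=1.0, outside_weight=0.0):
    """Share each pseudo-label's confidence with all pixels in its local region.

    pseudo_points: list of (row, col, confidence) tuples (hypothetical format).
    A pixel belongs to its nearest pseudo-label if it lies within `radius`
    (an assumed region rule); all other pixels get `outside_weight`.
    """
    weights = [[outside_weight] * width for _ in range(height)]
    region_ids = [[-1] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            best_idx, best_dist = -1, float("inf")
            for idx, (pr, pc, _conf) in enumerate(pseudo_points):
                dist = math.hypot(r - pr, c - pc)
                if dist < best_dist:
                    best_idx, best_dist = idx, dist
            if best_idx >= 0 and best_dist <= radius:
                region_ids[r][c] = best_idx
                weights[r][c] = pseudo_points[best_idx][2]
    return weights, region_ids

def point_targets(height, width, pseudo_points):
    """Assumed targets: 1.0 at the pixel holding a pseudo-label, 0.0 elsewhere."""
    targets = [[0.0] * width for _ in range(height)]
    for pr, pc, _conf in pseudo_points:
        r, c = int(round(pr)), int(round(pc))
        if 0 <= r < height and 0 <= c < width:
            targets[r][c] = 1.0
    return targets

def weighted_bce(pred, targets, weights, eps=1e-7):
    """Confidence-weighted binary cross-entropy, normalised by the total weight."""
    total, norm = 0.0, 0.0
    for p_row, t_row, w_row in zip(pred, targets, weights):
        for p, t, w in zip(p_row, t_row, w_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -w * (t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
            norm += w
    return total / norm if norm > 0 else 0.0

# Toy example: one pseudo-label at the centre of a 5x5 map with confidence 0.8.
points = [(2, 2, 0.8)]
weights, region_ids = region_confidence_map(5, 5, points, radius=1.0)
targets = point_targets(5, 5, points)
pred = [[0.5] * 5 for _ in range(5)]
loss = weighted_bce(pred, targets, weights)
```

In this toy setup the centre pixel and its four direct neighbours form the region and all inherit the weight 0.8, so the background pixels next to the pseudo-label are supervised with the same confidence as the foreground pixel. Each pixel needs only a nearest-point lookup, which is the kind of step that stands in for the one-to-one assignment used in P2P.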
We demonstrate P2R's superiority in semi-supervised counting and unsupervised domain adaptation. Using only 5% labeled data, P2R exceeds the performance of methods that use 10% labeled data. It achieves state-of-the-art results on benchmarks such as ShanghaiTech and UCF-QNRF, with a 68× speedup over P2P. Ablation studies confirm that P2R resolves the over-activation observed with PSAM, enabling stable training with unlabeled data.
Selected Publications
Point-to-Region Loss for Semi-Supervised Point-Based Crowd Counting.
In: IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), June 2025 (highlight).
