About
Welcome to the Video, Image, and Sound Analysis Lab (VISAL) at the City University of Hong Kong! The lab is directed by Prof. Antoni Chan in the Department of Computer Science.
Our main research activities include:
- Computer Vision, Surveillance
- Machine Learning, Pattern Recognition
- Computer Audition, Music Information Retrieval
- Eye Gaze Analysis
For more information about our current research, please visit the projects and publications pages.
Opportunities for graduate students and research assistants – if you are interested in joining the lab, please check this information.
Latest News [more]
- [Mar 30, 2022]
Our project “Automatic Wide-area Crowd Surveillance Using Multiple Cameras” received the Silver Medal at Inventions Geneva Evaluation Days (IGED) 2022!
- [Aug 16, 2021]
Congratulations to Qingzhong on defending his thesis!
- [Aug 12, 2021]
Congratulations to Jia on defending his thesis!
- [Jul 1, 2021]
Dr. Chan was promoted to Professor!
Recent Publications [more]
- Calibration-free Multi-view Crowd Counting.
In: European Conference on Computer Vision (ECCV), Tel Aviv, to appear Oct 2022.
- Understanding the role of eye movement consistency in face recognition and autism through integrating deep neural networks and hidden Markov models.
npj Science of Learning, to appear 2022.
- Asymptotic Optimality for Active Learning Processes.
In: Uncertainty in Artificial Intelligence (UAI), to appear Aug 2022.
- Bits-Ensemble: Towards Light-Weight Robust Deep Ensemble by Bits-Sharing.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD) (accepted to CASES 2022), to appear 2022.
- PRIMAL-GMM: PaRametrIc MAnifold Learning of Gaussian Mixture Models.
IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI), 44(6):3197-3211, June 2022 (online 2021). [code]
- Crowd Counting in the Frequency Domain.
In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2022.
- Understanding children’s attention to dental caries through eye-tracking.
Caries Research, 56(2):129-137, June 2022.
- Wide-Area Crowd Counting: Multi-View Fusion Networks for Counting in Large Scenes.
International Journal of Computer Vision (IJCV), 130(8):1938-1960, May 2022.
- Eye movement analysis of children's attention for midline diastema.
Scientific Reports, 12:7462, May 2022.
- Understanding children's attention to traumatic dental injuries using eye-tracking.
Dental Traumatology, to appear 2022.
Recent Project Pages [more]
We propose a novel crowd counting framework built upon an external Momentum Template, termed C2MoT, which encodes domain-specific information via an external template representation.
- "Dynamic Momentum Adaptation for Zero-Shot Cross-Domain Crowd Counting." In: ACM Multimedia (MM), Oct 2021.
We improve the distinctiveness of image captions using a Group-based Distinctive Captioning Model (GdisCap), which compares each image with the other images in a group of similar images and highlights the uniqueness of each image.
- "Group-based Distinctive Image Captioning with Memory Attention." In: ACM Multimedia (MM), Oct 2021 (oral). [supplemental]
We propose a novel tree-structured variational Bayesian method that learns individual models and group models simultaneously by treating the group models as the parents of the individual models. Each individual model is learned from observations and regularized by its parent; conversely, each parent model is optimized to best represent its children.
- "Hierarchical Learning of Hidden Markov Models with Clustering Regularization." In: 37th Conference on Uncertainty in Artificial Intelligence (UAI), Jul 2021.
To reduce human experts’ workload and improve observation accuracy, we develop a practical system that automatically detects Chinese White Dolphins in the wild.
- "Chinese White Dolphin Detection in the Wild." In: ACM Multimedia Asia (MMAsia), Gold Coast, Australia, Dec 2021.
We analyze eye movement data on stimuli with different feature layouts. Through co-clustering HMMs, we discover common strategies for each stimulus and cluster subjects with similar strategies.
- "Eye Movement analysis with Hidden Markov Models (EMHMM) with co-clustering." Behavior Research Methods, 53:2473-2486, April 2021.
Recent Datasets and Code [more]
Dolphin-14k: Chinese White Dolphin detection dataset
A dataset consisting of Chinese White Dolphin (CWD) and distractors for detection tasks.
- Files: Google Drive, Readme
- Project page
- If you use this dataset please cite:
Chinese White Dolphin Detection in the Wild.
In: ACM Multimedia Asia (MMAsia), Gold Coast, Australia, Dec 2021.
Crowd counting: Zero-shot cross-domain counting
Code for zero-shot cross-domain crowd counting with dynamic momentum adaptation (C2MoT).
- Files: github
- Project page
- If you use this toolbox please cite:
Dynamic Momentum Adaptation for Zero-Shot Cross-Domain Crowd Counting.
In: ACM Multimedia (MM), Oct 2021.
CVCS: Cross-View Cross-Scene Multi-View Crowd Counting Dataset
Synthetic dataset for cross-view cross-scene multi-view counting. The dataset contains 31 scenes, each with about 100 camera views. For each scene, we capture 100 multi-view images of crowds.
- Files: Google Drive
- Project page
- If you use this dataset please cite:
Cross-View Cross-Scene Multi-View Crowd Counting.
In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 557-567, Jun 2021.
Crowd counting: Generalized loss function
Generalized loss function for crowd counting.
- Files: github
- Project page
- If you use this toolbox please cite:
A Generalized Loss Function for Crowd Counting and Localization.
In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Jun 2021.
Fine-Grained Crowd Counting Dataset
Dataset for fine-grained crowd counting, which differentiates a crowd into categories based on the low-level behavior attributes of the individuals (e.g. standing/sitting or violent behavior) and then counts the number of people in each category.
- Files: dataset (1.2GB), code
- Project page
- If you use this dataset please cite: