About

Welcome to the Video, Image, and Sound Analysis Lab (VISAL) at the City University of Hong Kong! The lab is directed by Prof. Antoni Chan in the Department of Computer Science.

Our main research activities include:

  • Computer Vision, Surveillance
  • Machine Learning, Pattern Recognition
  • Computer Audition, Music Information Retrieval
  • Eye Gaze Analysis

For more information about our current research, please visit the projects and publications pages.

Opportunities for graduate students and research assistants – if you are interested in joining the lab, please check this information.

Latest News [more]

  • [Jan 19, 2023]

    Congratulations to Xueying for defending her thesis!

  • [Dec 9, 2022]

    Congratulations to Ziquan for defending his thesis!

  • [Nov 30, 2022]

    Call for Papers: Special Issue on “Applications of artificial intelligence, computer vision, physics and econometrics modelling methods in pedestrian traffic modelling and crowd safety” in Transportation Research Part C: Emerging Technologies. Deadline: April 30, 2023.

  • [Mar 30, 2022]

    Our project “Automatic Wide-area Crowd Surveillance Using Multiple Cameras” received the Silver Medal at Inventions Geneva Evaluation Days (IGED) 2022!

Recent Publications [more]

  • ODAM: Gradient-based Instance-Specific Visual Explanations for Object Detection.
    Chenyang Zhao and Antoni B. Chan,
    In: Intl. Conf. on Learning Representations (ICLR), Rwanda, to appear May 2023.
  • Bayes-MIL: A New Probabilistic Perspective on Attention-based Multiple Instance Learning for Whole Slide Images.
    Yufei Cui, Ziquan Liu, Xiangyu Liu, Xue Liu, Cong Wang, Tei-Wei Kuo, Chun Jason Xue, and Antoni B. Chan,
    In: Intl. Conf. on Learning Representations (ICLR), Rwanda, to appear May 2023.
  • Variational Nested Dropout.
    Yufei Cui, Yu Mao, Ziquan Liu, Qiao Li, Antoni B. Chan, Xue Liu, Tei-Wei Kuo, and Chun Jason Xue,
    IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI), to appear 2023.
  • Optimal planning of municipal-scale distributed rooftop photovoltaic systems with maximized solar energy generation under constraints in high-density cities.
    Haoshan Ren, Zhenjun Ma, Antoni B. Chan, and Yongjun Sun,
    Energy, 263(Part A):125686, Jan 2023.
  • Improved Fine-Tuning by Better Leveraging Pre-Training Data.
    Ziquan Liu, Yi Xu, Yuanhong Xu, Qi Qian, Hao Li, Xiangyang Ji, Antoni B. Chan, and Rong Jin,
    In: Neural Information Processing Systems (NeurIPS), to appear 2022.
  • An Empirical Study on Distribution Shift Robustness From the Perspective of Pre-Training and Data Augmentation.
    Ziquan Liu, Yi Xu, Yuanhong Xu, Qi Qian, Hao Li, Rong Jin, Xiangyang Ji, and Antoni B. Chan,
    In: NeurIPS 2022 Workshop on Distribution Shifts: Connecting Methods and Applications (DistShift), to appear 2022.
  • Precise Augmentation and Counting of Helicobacter Pylori in Histology Image.
    Yufei Cui, Yixin Chen, Zhifeng Shuai, Fang Peng, Yanbo Lv, Luoning Zheng, Xue Liu, Antoni B. Chan, Tei-Wei Kuo, and Chun Jason Xue,
    In: NeurIPS 2022 Workshop on Medical Imaging meets NeurIPS (MedNeurIPS), to appear 2022.
  • Boosting Adversarial Robustness From The Perspective of Effective Margin Regularization.
    Ziquan Liu and Antoni B. Chan,
    In: British Machine Vision Conference, to appear 2022.
  • Scale-Prior Deformable Convolution for Exemplar-Guided Class-Agnostic Counting.
    Wei Lin, Kunlin Yang, Xinzhu Ma, Junyu Gao, Lingbo Liu, Shinan Liu, Jun Hou, Shuai Yi, and Antoni B. Chan,
    In: British Machine Vision Conference, to appear 2022.
  • Bits-Ensemble: Towards Light-Weight Robust Deep Ensemble by Bits-Sharing.
    Yufei Cui, Shangyu Wu, Qiao Li, Antoni B. Chan, Tei-Wei Kuo, and Chun Jason Xue,
    IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 41(11):4397-4408, Nov 2022 (CASES 2022).

Recent Project Pages [more]

Calibration-free Multi-view Crowd Counting

We propose a calibration-free multi-view crowd counting (CF-MVCC) method, which obtains the scene-level count as a weighted summation over the predicted density maps from the camera views, without needing camera calibration parameters.
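
As a rough illustration of the weighted summation described above, the following minimal sketch combines per-view density maps into one scene-level count. It is not the CF-MVCC implementation: the function name `scene_count`, the toy arrays, and the fixed weight maps are all hypothetical, and in the actual method the weights would come from the model rather than being set by hand.

```python
import numpy as np


def scene_count(density_maps, weight_maps):
    """Return a scene-level count as a weighted sum over per-view density maps.

    density_maps: list of 2-D arrays, one predicted density map per camera view.
    weight_maps:  list of 2-D arrays with matching shapes (hypothetical weights;
                  here they are supplied by hand purely for illustration).
    """
    total = 0.0
    for density, weight in zip(density_maps, weight_maps):
        if density.shape != weight.shape:
            raise ValueError("each weight map must match its density map's shape")
        # Each pixel contributes its density scaled by its weight.
        total += float(np.sum(weight * density))
    return total


# Toy example (assumed data): two views that both see the same 4 people.
view_a = np.ones((2, 2))        # density sums to 4
view_b = np.ones((2, 2))        # density sums to 4
half = np.full((2, 2), 0.5)     # equal weights so the overlap is not double counted

count = scene_count([view_a, view_b], [half, half])
print(count)
```

With equal weights of 0.5 on two fully overlapping views, the combined count matches the single-view count instead of doubling it, which is the purpose of weighting the views.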

Single-Frame-Based Deep View Synchronization for Unsynchronized Multicamera Surveillance

We propose a synchronization model that operates in conjunction with existing DNN-based multi-view models to allow them to work on unsynchronized data.

Modeling Eye Movements by Integrating Deep Neural Networks and Hidden Markov Models

We model eye movements on faces by integrating deep neural networks and hidden Markov models (DNN+HMM).

Crowd Counting in the Frequency Domain

We derive loss functions in the frequency domain for training density map regression for crowd counting.

Dynamic Momentum Adaptation for Zero-Shot Cross-Domain Crowd Counting

We propose a novel Crowd Counting framework built upon an external Momentum Template, termed C2MoT, which enables the encoding of domain-specific information via an external template representation.

Recent Datasets and Code [more]

Modeling Eye Movements with Deep Neural Networks and Hidden Markov Models (DNN+HMM)

This is the toolbox for modeling eye movements and feature learning with deep neural networks and hidden Markov models (DNN+HMM).

Dolphin-14k: Chinese White Dolphin detection dataset

A dataset consisting of Chinese White Dolphins (CWD) and distractors for detection tasks.

Crowd counting: Zero-shot cross-domain counting

Zero-shot cross-domain crowd counting.

CVCS: Cross-View Cross-Scene Multi-View Crowd Counting Dataset

Synthetic dataset for cross-view cross-scene multi-view counting. The dataset contains 31 scenes, each with about 100 camera views. For each scene, we capture 100 multi-view images of crowds.

Crowd counting: Generalized loss function

Generalized loss function for crowd counting.