In this project, we develop a hit detection algorithm for generic monocular badminton videos, agnostic to camera angle and player skill level. We propose using badminton domain features (court, pose, and shuttlecock coordinates) as input to a GRU-based model. These coordinates are extracted from each frame, and a sequence of coordinates from 14 consecutive frames is fed into the GRU model to predict one of three classes: 'no hit', 'hit by near player', or 'hit by far player'. During training, a 14-frame sequence is labeled as a hit if a hit occurs in its last six frames.
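The windowing and labeling rule above can be sketched as follows. This is a minimal illustration; the function name, class ids, and constants are ours, not the repository's actual API:

```python
# Sliding-window labeling: a 14-frame window is assigned a hit class
# if a hit annotation falls within its last 6 frames.
# Class ids are illustrative: 0 = no hit, 1 = near player, 2 = far player.
NO_HIT, NEAR_HIT, FAR_HIT = 0, 1, 2
WINDOW, TAIL = 14, 6

def label_window(frame_labels, start):
    """frame_labels[i] is the per-frame class; the window covers frames
    start .. start + WINDOW - 1. Returns the window-level class."""
    tail = frame_labels[start + WINDOW - TAIL : start + WINDOW]
    for lab in tail:
        if lab != NO_HIT:
            return lab
    return NO_HIT

# Example: a near-player hit at frame 12 falls in the last 6 frames
# (frames 8..13) of the window starting at frame 0, so that window is
# labeled NEAR_HIT; the window starting at frame 7 has its tail at
# frames 15..20, so it sees no hit.
labels = [NO_HIT] * 20
labels[12] = NEAR_HIT
assert label_window(labels, 0) == NEAR_HIT
assert label_window(labels, 7) == NO_HIT
```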
- We find that our proposed method, despite being trained only on professional broadcast singles videos, generalizes to some extent to amateur videos taken from different camera angles, doubles games, and players of different skill levels. This validates the robustness of badminton domain features (in coordinate form) over generic RGB features.
- We provide annotated datasets of professional singles, amateur singles and amateur doubles matches here.
- We provide manual annotation tools under the datasets/ directory to facilitate annotation of custom datasets.
- We provide a semi-automatic annotation pipeline in the annotation_pipeline/ directory.
Below, we show a few demos of our proposed hit detection algorithm on videos with different camera angles, players of different skill levels, and both singles and doubles play.
We prepare three sets of annotated matches: professional singles, amateur singles, and amateur doubles. They are available at this Google Drive link and should be placed under the datasets/ directory.
We provide ground-truth annotations of shuttlecock coordinates, hit detections, and player bounding boxes. We also provide manual annotation tools under datasets/.
We compare our proposed method against two baselines: a ResNet image classifier and a rule-based method that compares the second derivative of the shuttlecock's x and y coordinates against an empirically determined threshold. We use mAP as the evaluation metric (see the report for details).
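The rule-based baseline can be sketched as below. The threshold value and function names are illustrative, and the actual implementation lives in hit_detection/rule_baseline.ipynb:

```python
def second_derivative(coords):
    """Discrete second derivative (frame-to-frame acceleration) of a 1-D
    coordinate sequence; the two endpoint frames get no value."""
    return [coords[i + 1] - 2 * coords[i] + coords[i - 1]
            for i in range(1, len(coords) - 1)]

def detect_hits(xs, ys, threshold=5.0):
    """Flag frame i as a hit when the shuttlecock's acceleration magnitude
    (|d2x| + |d2y|) exceeds an empirically chosen threshold."""
    ax, ay = second_derivative(xs), second_derivative(ys)
    return [i + 1 for i, (a, b) in enumerate(zip(ax, ay))
            if abs(a) + abs(b) > threshold]

# A shuttlecock moving right at constant speed, then sharply reversing:
xs = [0, 10, 20, 30, 20, 10, 0]
ys = [0, 0, 0, 0, 0, 0, 0]
print(detect_hits(xs, ys))  # → [3]: the direction reversal at frame 3
```

The baseline is attractive for its simplicity, but as the demos show, it breaks down whenever the shuttlecock track is occluded or noisy, since spurious coordinate jumps also produce large second derivatives.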
Check out these videos for a demonstration of various qualities:
- Near and far player hits can be confused when the camera angle is too wide
- Robustness to occluded shuttlecock/poses:
  - Shuttlecock occluded: the rule-based baseline fails but the domain method works
  - Pose occluded: ResNet fails but the domain method works
- Correctly predicts no hit even when a player begins a hit motion but stops upon realising the shuttlecock is going out of court (pro/test_match1/1_05_02.mp4, domain method)
- Rally videos and their ground-truth annotations are provided under datasets/ in pro.zip, am_singles.zip, am_doubles.zip, or can be downloaded here
- Manual annotation tools are provided under datasets/ in the scripts label_tool_bbox.py, label_toolV2.py
- The semi-automatic annotation pipeline for pose and shuttlecock coordinates is in annotation_pipeline/
- Notebooks for organising datasets into input features and observing dataset statistics can be found in annotation_pipeline/organise_input_features.ipynb and annotation_pipeline/dataset_stats.ipynb
- Notebooks for training the proposed domain based algorithm and ResNet are found in domain_rnn.ipynb and ResNet_baseline.ipynb respectively. They take in input features from the directory input_features/, which can be downloaded here
- Notebooks for processing classification probabilities from the proposed algorithms can be found in hit_detection/process_pred_probs.ipynb. Sample classification probabilities can be found here
- The notebook for rule-based baseline method can be found in hit_detection/rule_baseline.ipynb
- Pretrained weights for the proposed domain method can be found in mm_weights/, or downloaded here
- Pretrained weights for ResNet can be found in resnet_data, or downloaded here
A small GPU is required to run the semi-automatic annotation pipeline and to train the proposed GRU network and ResNet. The computational load is fairly light; see the training notebooks for details.
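The core of the proposed model can be sketched as a small PyTorch module. The feature dimension and hidden size here are assumptions for illustration; see domain_rnn.ipynb for the actual configuration:

```python
import torch
import torch.nn as nn

class HitDetector(nn.Module):
    """GRU classifier over per-frame domain features (court, pose and
    shuttlecock coordinates). feat_dim and hidden are illustrative values,
    not the repository's actual hyperparameters."""
    def __init__(self, feat_dim=78, hidden=64, n_classes=3):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):          # x: (batch, 14, feat_dim)
        _, h = self.gru(x)         # h: (1, batch, hidden) — last hidden state
        return self.head(h[-1])    # logits: (batch, n_classes)

model = HitDetector()
logits = model(torch.randn(2, 14, 78))
print(logits.shape)  # torch.Size([2, 3])
```

Only the final hidden state feeds the classification head, matching the labeling rule: the window's class is determined by what happens near its end.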
The following references were immensely useful for this project.
- MonoTrack: Shuttle trajectory reconstruction from monocular badminton video, for using badminton domain features for hit event detection
- TrackNetV2: Efficient Shuttlecock Tracking Network, for tracking the shuttlecock with deep learning, and for providing the TrackNetV2 dataset, which formed the basis of our professional dataset.
The full details are documented in the pdf report.
The full set of code can be found here
- Domain Adaptation to improve generalisation.
- Multimodal feature learning, possibly combining audio and RGB features with domain coordinates.
- Larger and more varied training dataset.
- Extension to other aspects of badminton video analysis, including stroke classification, strategy analysis etc.