Coder Social home page Coder Social logo

hoclor / cosaduv-contextual-saliency-for-detecting-anomalies-in-uav-video Goto Github PK

View Code? Open in Web Editor NEW
4.0 2.0 0.0 20.49 MB

Repository for my Master's Thesis Project: "Contextual Saliency for Detecting Anomalies within Unmanned Aerial Vehicle (UAV) Video"

License: MIT License

Jupyter Notebook 14.34% Python 85.44% Shell 0.22%
computer-science computer-vision saliency-detection uav-images machine-learning cnn

cosaduv-contextual-saliency-for-detecting-anomalies-in-uav-video's Introduction

Temporal and Non-Temporal Contextual Saliency Analysis for Generalized Wide-Area Search within Unmanned Aerial Vehicle (UAV) Video

Created and tested using Python 3.5.3, PyTorch 0.4.0, and OpenCV 3.4.3

Repository for my Master's Project on the topic of "Contextual Saliency for Detecting Anomalies within Unmanned Aerial Vehicle (UAV) Video". The thesis report is available here.

This work was also published in The Visual Computer under the title "Temporal and Non-Temporal Contextual Saliency Analysis for Generalized Wide-Area Search within Unmanned Aerial Vehicle (UAV) Video".

Architectures

Original DSCLRCN Architecture

The DSCLRCN architecture (original authors (and original image source), partial re-implementation in PyTorch used for this project) was used as a baseline. Our proposed CoSADUV (aka TeCS) architecture is shown in the figure above, with changes from the DSCLRCN architecture shown with a grey background.

The architecture was modified by replacing the "Conv+Softmax" with a Convolutional-LSTM layer with kernel size 3x3 and a Sigmoid activation function. Additionally, several loss functions other than NSSLoss were investigated (see our paper for more information). Architectures with the convolutional LSTM (CoSADUV) and without it (CoSADUV_NoTemporal, using a normal conv layer instead) are available.

Abstract

"Unmanned Aerial Vehicles (UAV) can be used to great effect for the purposes of surveillance or search and rescue operations. UAV enable search and rescue teams to cover large areas more efficiently and in less time. However, using UAV for this purpose involves the creation of large amounts of data (typically video) which must be analysed before any potential findings can be uncovered and actions taken. This is a slow and expensive process which can result in significant delays to the response time after a target is seen by the UAV. To solve this problem we propose a deep model using a visual saliency approach to automatically analyse and detect anomalies in UAV video. Our Contextual Saliency for Anomaly Detection in UAV Video (CoSADUV) model is based on the state-of-the-art in visual saliency detection using deep convolutional neural networks and considers local and scene context, with novel additions in utilizing temporal information through a convolutional LSTM layer and modifications to the base model. Our model achieves promising results with the addition of the temporal implementation producing significantly improved results compared to the state-of-the-art in saliency detection. However, due to limitations in the dataset used the model fails to generalize well to other data, failing to beat the state-of-the-art in anomaly detection in UAV footage. The approach taken shows promise with the modifications made yielding significant performance improvements and is worthy of future investigation. The lack of a publicly available dataset for anomaly detection in UAV video poses a significant roadblock to any deep learning approach to this task, however despite this our paper shows that leveraging temporal information for this task, which the state-of-the-art does not currently do, can lead to improved performance."

Instructions to run the model

  1. First follow all instructions in the READMEs inside model/ and model/Dataset/ to download and prepare the pretrained models used and to format the dataset correctly. (This may require the use of the scripts in data_preprocessing/)
  2. Open and have a look through either main_ncc.py or notebook_main.ipynb (or both). The IPython notebook provides a more interactive interface, while main_ncc.py can be run with minimal input required.
  3. Set the hyperparameters and settings near the top of the file. Most important is to set the correct dataset directory name (if different) and mean image filename.
  4. Run through the file to train the model. If using main_ncc.py, the model will automatically be run through testing once training is completed.

Example Results

Example

Click the video above to play it. The video shows the performane of the model on a sequence from the UAV123 dataset (This was also used to train the model). It presents four streams simultaneously:

  • top-left is the input sequence
  • top-right is the ground-truth data (as provided in the UAV123 dataset)
  • bottom-left is the prediction of the best non-temporal model, and
  • bottom-right is the prediction of the best temporal model.

Reference

If you make use of this work in any way, please reference the following:

@article{Gokstorp_2021,
 author = {Simon G. E. Gökstorp and Toby P. Breckon},
 title = {Temporal and non-temporal contextual saliency analysis for generalized wide-area search within unmanned aerial vehicle ({UAV}) video},
 journal = {The Visual Computer},
 volume = {38},
 number = {6},
 pages = {2033--2040},
 year = 2021,
 month = {sep},
 publisher = {Springer Science and Business Media {LLC}},
 doi = {10.1007/s00371-021-02264-6},
 url = {https://doi.org/10.1007/s00371-021-02264-6},
}

If you have any questions/issues/ideas, feel free to open an issue here or contact me!

Acknowledgements

This project was completed under the supervision of Professor Toby Breckon (Durham University).

cosaduv-contextual-saliency-for-detecting-anomalies-in-uav-video's People

Contributors

hoclor avatar

Stargazers

 avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.