License: BSD 3-Clause "New" or "Revised" License

VGGR (Video Game Genre Recognition)

Have you ever seen gameplay footage and wondered what kind of video game it is from? No? Well, wonder no more.

VGGR is a Deep-Learning Image Classification project, answering questions nobody is asking.



(new repo due to a glitch? in the old one)

Requirements

  1. Install Python 3.10 or newer.

  2. Clone the repository with

    git clone https://github.com/m4cit/VGGR.git
    

    or download the latest source code.

  3. Download the latest train, test, and validation image zip-files from the releases page.

  4. Unzip the train, test, and validation img files inside their respective folders located in ./data/.

  5. Install PyTorch.

    5.1 Either with CUDA

    • Windows:
      pip3 install torch==2.2.2 torchvision==0.17.2 --index-url https://download.pytorch.org/whl/cu121
      
    • Linux:
      pip3 install torch==2.2.2 torchvision==0.17.2
      

    5.2 Or without CUDA

    • Windows:
      pip3 install torch==2.2.2 torchvision==0.17.2
      
    • Linux:
      pip3 install torch==2.2.2 torchvision==0.17.2 --index-url https://download.pytorch.org/whl/cpu
      
  6. Navigate to the VGGR main directory.

    cd VGGR
    
  7. Install dependencies.

    pip install -r requirements.txt
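
After installation, a quick sanity check can confirm that the interpreter and the PyTorch packages are in place. The following is a minimal sketch, not part of the repository; the helper name `check_environment` is hypothetical, and it only tests whether the packages can be found, not whether CUDA works.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 10), packages=("torch", "torchvision")):
    """Return a mapping of check name -> True/False (hypothetical helper)."""
    report = {"python>=%d.%d" % min_python: sys.version_info >= min_python}
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for check, ok in check_environment().items():
        print(f"{'ok' if ok else 'MISSING':8} {check}")
```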
    

Note: The provided train dataset does not contain augmentations.

Genres

Available

  • Football / Soccer
  • First Person Shooter (FPS)
  • 2D Platformer
  • Racing

In the Works

  • Real-time Strategy (RTS)

Games

Train Set

  • FIFA 06
  • Call of Duty Black Ops
  • Call of Duty Modern Warfare 3
  • DuckTales Remastered
  • Project CARS

Test Set

  • PES 2012
  • FIFA 10
  • Counter Strike 1.6
  • Counter Strike 2
  • Ori and the Blind Forest
  • Dirt 3

Validation Set

  • Left 4 Dead 2
  • Oddworld Abe's Oddysee
  • FlatOut 2

Usage

Commands

--demo | Demo predictions with the test set

--augment | Data Augmentation

--train | Train mode

--predict | Predict / inference mode

--input (-i) | File input for predict mode (image URL or local image path)

--model (-m) | Model selection

  • cnn_v1 (default)
  • cnn_v2
  • cnn_v3

--device (-d) | Device selection

  • cpu (default)
  • cuda
  • ipu
  • xpu
  • mkldnn
  • opengl
  • opencl
  • ideep
  • hip
  • ve
  • fpga
  • ort
  • xla
  • lazy
  • vulkan
  • mps
  • meta
  • hpu
  • mtia
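
The options above could be wired up with `argparse` roughly as follows. This is a sketch under assumptions, not the actual contents of VGGR.py: the function name `build_parser` is hypothetical, and treating the four modes as mutually exclusive is a guess about the intended behavior.

```python
import argparse

MODELS = ("cnn_v1", "cnn_v2", "cnn_v3")
DEVICES = (
    "cpu", "cuda", "ipu", "xpu", "mkldnn", "opengl", "opencl", "ideep", "hip",
    "ve", "fpga", "ort", "xla", "lazy", "vulkan", "mps", "meta", "hpu", "mtia",
)


def build_parser():
    """Build a parser mirroring the documented flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="VGGR.py")
    # Assumption: exactly one mode is chosen per run.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--demo", action="store_true", help="demo predictions with the test set")
    mode.add_argument("--augment", action="store_true", help="data augmentation")
    mode.add_argument("--train", action="store_true", help="train mode")
    mode.add_argument("--predict", action="store_true", help="predict / inference mode")
    parser.add_argument("-i", "--input", help="image URL or local image path (predict mode)")
    parser.add_argument("-m", "--model", choices=MODELS, default="cnn_v1")
    parser.add_argument("-d", "--device", choices=DEVICES, default="cpu")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--demo"])
    print(args.model, args.device)
```

Restricting `--model` and `--device` with `choices` makes a mistyped value fail immediately with a usage message instead of later during model loading.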

Examples

Demo with Test Set

python VGGR.py --demo

or

python VGGR.py --demo -m cnn_v1 -d cpu

or

python VGGR.py --demo --model cnn_v1 --device cpu

Predict with Custom Input

python VGGR.py --predict -i path/to/img.png

or

python VGGR.py --predict -i https://website/img.png

or

python VGGR.py --predict -i path/to/img.png -m cnn_v1 -d cpu
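
Since `--input` accepts either a link or a local path, the two cases have to be told apart somewhere. One way to do that with the standard library is sketched below; the helper name `is_remote` is hypothetical, and the README does not state how VGGR.py actually makes this distinction.

```python
from urllib.parse import urlparse


def is_remote(source: str) -> bool:
    """Return True if `source` looks like an http(s) URL (hypothetical helper)."""
    # Checking the scheme explicitly keeps Windows paths such as
    # "C:\\images\\img.png" (parsed scheme "c") classified as local.
    return urlparse(source).scheme in ("http", "https")


if __name__ == "__main__":
    for src in ("https://website/img.png", "path/to/img.png"):
        print(src, "->", "download" if is_remote(src) else "open locally")
```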

Training

python VGGR.py --train -m cnn_v1 -d cpu

Delete the existing model to train from scratch.

Results

The --demo mode creates HTML files with the predictions and the corresponding images inside the results folder.

Performance

There are three Convolutional Neural Network (CNN) models available:

  1. cnn_v1 | F-score of 75 %
  2. cnn_v2 | F-score of 58.33 %
  3. cnn_v3 | F-score of 64.58 %
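
For readers unfamiliar with the metric, the sketch below computes a macro-averaged F1 score from true and predicted labels. The README does not say which F-score variant or averaging was used, so macro averaging is an assumption here, and the function name `macro_f1` and the sample labels are purely illustrative.

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1 over all labels seen in either list (illustrative)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    truth = ["fps", "fps", "racing", "racing"]
    preds = ["fps", "racing", "racing", "racing"]
    print(f"{macro_f1(truth, preds):.4f}")
```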

cnn_v1 --demo result examples

Data

Most of the images are from my own gameplay footage. The PES 2012 and FIFA 10 images are from videos by No Commentary Gameplays, and the FIFA 95 images are from a video by 10min Gameplay (YouTube).

The dataset used to train the models also contained augmented images, which are not included in the provided zip-file.

Augmentation

To augment the train data with jittering, inversion, and five-part cropping, copy the metadata of the images to be augmented into the augment.csv file located in ./data/train/metadata/.

Then run

python VGGR.py --augment

The metadata of the resulting images is then appended to the metadata.csv file.
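
To illustrate what five-part cropping produces, the sketch below computes the crop boxes for the four corners plus the center, which is the layout used by `torchvision.transforms.FiveCrop`. Whether VGGR uses that exact layout, and which crop size it uses, is not stated in this README; the function name `five_crop_boxes` and the 640x360 crop size are assumptions.

```python
def five_crop_boxes(width, height, crop_w, crop_h):
    """Return (left, top, right, bottom) boxes: four corners, then center."""
    if crop_w > width or crop_h > height:
        raise ValueError("crop size must not exceed the image size")
    left = round((width - crop_w) / 2)
    top = round((height - crop_h) / 2)
    return [
        (0, 0, crop_w, crop_h),                             # top-left
        (width - crop_w, 0, width, crop_h),                 # top-right
        (0, height - crop_h, crop_w, height),               # bottom-left
        (width - crop_w, height - crop_h, width, height),   # bottom-right
        (left, top, left + crop_w, top + crop_h),           # center
    ]


if __name__ == "__main__":
    # Assumed crop size: a quarter of a 1280x720 frame.
    for box in five_crop_boxes(1280, 720, 640, 360):
        print(box)
```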

Preprocessing

All images are originally 2560x1440 (1440p) and are resized to 1280x720 (720p) before training, validation, and inference. 4:3 images are stretched to 16:9 to avoid black bars.
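
The effect of stretching instead of letterboxing can be quantified with a little arithmetic. The sketch below computes the horizontal and vertical scale factors for a resize to 1280x720 and the resulting aspect distortion; the function name `stretch_factors` and the 1024x768 example are illustrative and not taken from the repository.

```python
from fractions import Fraction

TARGET = (1280, 720)  # width, height used before training and inference


def stretch_factors(width, height, target=TARGET):
    """Return (scale_x, scale_y, distortion) for a plain resize to `target`."""
    scale_x = Fraction(target[0], width)
    scale_y = Fraction(target[1], height)
    # distortion == 1 means the aspect ratio is preserved.
    return scale_x, scale_y, scale_x / scale_y


if __name__ == "__main__":
    print(stretch_factors(2560, 1440))  # 16:9 source, uniform downscale
    print(stretch_factors(1024, 768))   # 4:3 source, widened relative to height
```

A 16:9 source is scaled by one half on both axes, while a 4:3 source ends up 4/3 wider relative to its height, so objects in those frames appear horizontally stretched.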

Libraries
