kaggle-planet-understanding-the-amazon-from-space's Introduction

This repository contains the code for the Kaggle competition Planet: Understanding the Amazon from Space. The solution took 3rd place in the competition.

Requirements:

Python >= 3.4, Keras 1.2.1, Theano 0.9.2, TensorFlow, XGBoost 0.6

How to run:

Execute the following scripts one by one, in this order:

  • python a11_find_neighbours.py
  • python a30_create_keras_models.py
  • python a30_create_keras_models_land.py
  • python a30_create_keras_models_weather.py
  • python a30_create_keras_models_single_class.py
  • python a31_create_cnn_features_basic.py
  • python a31_create_cnn_features_land.py
  • python a31_create_cnn_features_weather.py
  • python a32_create_cnn_features_single_class.py
  • python a32_find_neighbours_features.py
  • python a42_gbm_blender.py
  • python a42_keras_blender.py
  • python a50_ensemble_from_cache_v1.py
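The script sequence above can also be driven from a small wrapper. The following is a minimal sketch, not part of the repository: the names `PIPELINE` and `run_pipeline` are hypothetical, and it assumes the scripts are launched from the repository folder with the same interpreter that runs the wrapper.

```python
import subprocess
import sys

# Script order copied from the list above.
PIPELINE = [
    "a11_find_neighbours.py",
    "a30_create_keras_models.py",
    "a30_create_keras_models_land.py",
    "a30_create_keras_models_weather.py",
    "a30_create_keras_models_single_class.py",
    "a31_create_cnn_features_basic.py",
    "a31_create_cnn_features_land.py",
    "a31_create_cnn_features_weather.py",
    "a32_create_cnn_features_single_class.py",
    "a32_find_neighbours_features.py",
    "a42_gbm_blender.py",
    "a42_keras_blender.py",
    "a50_ensemble_from_cache_v1.py",
]


def run_pipeline(scripts, repo_dir=".", dry_run=False):
    """Run each script in order and stop at the first failure.

    Returns the list of commands, so a dry run can be inspected
    before a long job is started.
    """
    commands = []
    for name in scripts:
        cmd = [sys.executable, name]
        commands.append(cmd)
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so later stages never run on incomplete outputs.
            subprocess.run(cmd, cwd=repo_dir, check=True)
    return commands


# Dry run: only print what would be executed.
for cmd in run_pipeline(PIPELINE, dry_run=True):
    print(" ".join(cmd))
```

Stopping at the first failure matters here because the later stages read files written by the earlier ones. To actually run the pipeline, call `run_pipeline(PIPELINE)` without `dry_run`.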

Notes:

  1. Recreating all CNN models from scratch on a single GPU takes a long time (around a month). Training can be parallelized by running different CNN models on separate GPUs. The final model weights are ~50 GB in size. Message me if you need these weights.
  2. Creating the neighbour features takes around a day.
  3. Due to the high degree of parallelization, CNN models trained on a GPU can differ slightly even when they were trained with the same code.
  4. A few more details about the solution are available on the Kaggle forum.

Directory structure:

  • -- input - input data as provided on Kaggle
  • -- Kaggle-Planet-Understanding-the-Amazon-from-Space - all the Python code (this repo)
  • -- models - all models generated by the neural nets are stored in this folder
  • -- weights - files with weights for pretrained models. Link: Download
  • -- modified_data - intermediate files for the neighbour analysis
  • -- features - all raw features generated by the neural nets are stored in this folder. Precalculated features are available. Link: Download
  • -- cache - arrays with predictions from the XGBoost and Keras blenders
  • -- subm - final predictions (in the Kaggle submission file format)
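Before starting a long run, it can help to confirm that this layout exists. The sketch below is an illustration only: it assumes that all folders sit side by side under one common root, and the helper names `missing_dirs` and `create_output_dirs` are hypothetical. Check the paths used in the scripts before relying on it.

```python
from pathlib import Path

# Folder names taken from the list above.
EXPECTED_DIRS = [
    "input",
    "Kaggle-Planet-Understanding-the-Amazon-from-Space",
    "models",
    "weights",
    "modified_data",
    "features",
    "cache",
    "subm",
]

# Folders that the scripts write into; the others (input data, this repo,
# pretrained weights) have to be supplied by the user.
OUTPUT_DIRS = ["models", "modified_data", "features", "cache", "subm"]


def missing_dirs(root):
    """Return the expected folders that do not exist under root."""
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


def create_output_dirs(root):
    """Create the output folders under root if they are absent."""
    root = Path(root)
    for name in OUTPUT_DIRS:
        (root / name).mkdir(parents=True, exist_ok=True)
```

For example, `missing_dirs("..")` called from inside the repository folder lists whatever still has to be downloaded or created.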

Dataflow

kaggle-planet-understanding-the-amazon-from-space's Issues

Samples per epoch

Thank you so much.
Can I ask why you chose around 10% of the train data for each epoch?

Optimising for f2beta_loss

Why didn't you use f2beta_loss for the loss function for the CNNs? You used it for the Keras 2nd level model so I wonder why not the others?

Many thanks
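For readers unfamiliar with the metric behind this question, here is a generic sketch of the F-beta score with beta = 2, together with a "soft" variant that accepts probabilities and can serve as a loss. It is not the repository's `f2beta_loss`: the function names, the `eps` smoothing term and the per-sample averaging are assumptions made for illustration. The formula itself is the standard one, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R).

```python
def fbeta_score(y_true, y_pred, beta=2.0, eps=1e-7):
    """Mean per-sample F-beta score.

    y_true and y_pred are lists of rows (one row per sample, one column
    per label). y_pred may hold hard 0/1 decisions or probabilities;
    with probabilities the result is a "soft" score.
    """
    b2 = beta ** 2
    scores = []
    for t_row, p_row in zip(y_true, y_pred):
        tp = sum(t * p for t, p in zip(t_row, p_row))  # (soft) true positives
        precision = tp / (sum(p_row) + eps)
        recall = tp / (sum(t_row) + eps)
        scores.append((1 + b2) * precision * recall / (b2 * precision + recall + eps))
    return sum(scores) / len(scores)


def soft_fbeta_loss(y_true, y_prob, beta=2.0):
    """Loss that decreases as the soft F-beta score increases."""
    return 1.0 - fbeta_score(y_true, y_prob, beta=beta)


# One sample with two true labels, only one of them predicted:
# precision = 1, recall = 0.5, so F2 = 5 * 0.5 / (4 + 0.5) = 5/9.
print(round(fbeta_score([[1, 1, 0, 0]], [[1, 0, 0, 0]]), 4))
```

With beta = 2 recall is weighted more heavily than precision, so a missed label costs more than a spurious one.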
