minerva-ml / open-solution-avito-demand-prediction

Open solution to the Avito Demand Prediction Challenge

Home Page: https://www.kaggle.com/c/avito-demand-prediction

License: MIT License

Python 42.68% Jupyter Notebook 57.32%
data-science machine-learning deep-learning kaggle python data-science-learning nlp neptune competition python3

open-solution-avito-demand-prediction's People

Contributors

dependabot[bot], gitter-badger, jakubczakon, kamil-kaczmarek, kant

open-solution-avito-demand-prediction's Issues

Timestamp Features

Explore/add timestamp features:

  • day
  • day of the week
  • month (two values only)

Then explore/add group-by features with times.

UPDATE (from @Leoniak713):

  • features are extracted
  • group-by accepts these features
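
The features listed above can be sketched with pandas roughly as follows. This is only an illustration, not the repository's implementation: the column names `activation_date` and `price`, and the helper names `add_timestamp_features` and `add_groupby_mean`, are assumptions chosen for the example.

```python
import pandas as pd


def add_timestamp_features(df, date_col="activation_date"):
    """Return a copy of df with day, day-of-week and month columns added."""
    out = df.copy()
    dates = pd.to_datetime(out[date_col])
    out["day"] = dates.dt.day
    out["day_of_week"] = dates.dt.dayofweek  # Monday is 0, Sunday is 6
    out["month"] = dates.dt.month
    return out


def add_groupby_mean(df, keys, target, name):
    """Add the mean of `target` within each group defined by `keys`."""
    out = df.copy()
    out[name] = out.groupby(keys)[target].transform("mean")
    return out


# Tiny hypothetical sample, only to show the shape of the result.
sample = pd.DataFrame({
    "activation_date": ["2017-03-15", "2017-03-15", "2017-04-01"],
    "price": [100.0, 300.0, 50.0],
})
features = add_timestamp_features(sample)
features = add_groupby_mean(features, ["day_of_week"], "price", "price_mean_by_dow")
```

Using `transform` keeps the result aligned with the original rows, so the group-by statistic can be used directly as a per-row feature.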

A lot of missing values

How to handle missing values?
Currently, missing values are replaced with 0 and a new binary feature (NaN / not NaN) is added.
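
A minimal sketch of that approach, assuming the data lives in a pandas DataFrame. The helper name `fill_and_flag_missing` and the `_is_nan` suffix are made up for this example and are not taken from the repository.

```python
import pandas as pd


def fill_and_flag_missing(df, columns, fill_value=0):
    """Add a binary is-NaN indicator per column, then fill the gaps."""
    out = df.copy()
    for col in columns:
        # The indicator must be computed before filling, or the information is lost.
        out[col + "_is_nan"] = out[col].isna().astype(int)
        out[col] = out[col].fillna(fill_value)
    return out


# Hypothetical sample with one missing price.
sample = pd.DataFrame({"price": [10.0, float("nan"), 30.0]})
filled = fill_and_flag_missing(sample, ["price"])
```

The indicator column lets a model distinguish a genuine 0 from a value that was missing.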

train model on 'category name' and 'parent category name'

Train model on images, with targets (multi-output model):

  • category name
  • parent category name

Features are two vectors:

  • probability distribution over category name (softmax)
  • probability distribution over parent category name (softmax)
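
How the two softmax outputs could be combined into one feature vector is sketched below with NumPy. The logits are hypothetical stand-ins for the outputs of the two heads of an image model; the model itself is not shown, and the function names are assumptions for this example.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def category_features(category_logits, parent_logits):
    """Concatenate both probability distributions into one feature vector per image."""
    return np.concatenate(
        [softmax(category_logits), softmax(parent_logits)], axis=-1
    )


# One image, three hypothetical categories and two hypothetical parent categories.
category_logits = np.array([[2.0, 1.0, 0.1]])
parent_logits = np.array([[0.5, 0.5]])
features = category_features(category_logits, parent_logits)
```

Each image then contributes as many feature columns as there are categories plus parent categories.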

Repetitive code

Is there any reason to repeat the same code twice? Why is is_missing defined under the train_mode flag? You could declare it outside the train_mode branch and return it on its own when the validation step is not needed...

    if train_mode:
        is_missing = Step(name='is_missing',
                          transformer=fe.IsMissing(**config.is_missing),
                          input_data=['input'],
                          adapter={'X': ([('input', 'X')])},
                          cache_dirpath=config.env.cache_dirpath, **kwargs)

        is_missing_valid = Step(name='is_missing_valid',
                                transformer=is_missing,
                                input_data=['input'],
                                adapter={'X': ([('input', 'X_valid')])},
                                cache_dirpath=config.env.cache_dirpath, **kwargs)

        return is_missing, is_missing_valid

    else:
        is_missing = Step(name='is_missing',
                          transformer=fe.IsMissing(**config.is_missing),
                          input_data=['input'],
                          adapter={'X': ([('input', 'X')])},
                          cache_dirpath=config.env.cache_dirpath, **kwargs)

        return is_missing
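
One possible way to remove the duplication is sketched below. The `Step` and `IsMissing` classes here are minimal hypothetical stand-ins so the example runs on its own; in the project they would be the real pipeline classes, and `build_is_missing` is a name invented for this sketch.

```python
class Step:
    """Hypothetical stand-in for the pipeline Step class."""

    def __init__(self, name, transformer, input_data, adapter, cache_dirpath, **kwargs):
        self.name = name
        self.transformer = transformer
        self.input_data = input_data
        self.adapter = adapter
        self.cache_dirpath = cache_dirpath


class IsMissing:
    """Hypothetical stand-in for fe.IsMissing."""

    def __init__(self, **params):
        self.params = params


def build_is_missing(is_missing_params, cache_dirpath, train_mode, **kwargs):
    # The training step is identical in both modes, so build it once.
    is_missing = Step(name='is_missing',
                      transformer=IsMissing(**is_missing_params),
                      input_data=['input'],
                      adapter={'X': ([('input', 'X')])},
                      cache_dirpath=cache_dirpath, **kwargs)
    if not train_mode:
        return is_missing

    # Only the validation step depends on train_mode.
    is_missing_valid = Step(name='is_missing_valid',
                            transformer=is_missing,
                            input_data=['input'],
                            adapter={'X': ([('input', 'X_valid')])},
                            cache_dirpath=cache_dirpath, **kwargs)
    return is_missing, is_missing_valid
```

The return type still differs between modes, as in the original snippet, so callers do not need to change.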

Explore solution_1 results

Explore the best/worst results and the most important features from solution_1 in notebooks/devbook.ipynb.

make submit, run master

Initialize your participation in the challenge:

  • make submit,
  • run code from branch master,
  • make submit calculated from this code,
  • review discussion, add some issues.
