proj-ml's Introduction

Machine learning - Kaggle competition

The goal of the competition is to obtain the lowest RMSE when predicting salaries for data jobs with machine learning.

Kaggle

Competition rules

  • Predict the salary for data jobs with machine learning.
  • Data:
    • Salaries_data.csv (Working data)
    • Testeo.csv (Predicting data)
    • Muestra.csv (Example for Kaggle)
  • Predictions are uploaded to Kaggle, which calculates the RMSE as the score.
  • The lowest score wins the competition.
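
For reference, the scoring metric can be sketched in a few lines of Python. This is a minimal illustration of the standard RMSE definition, not Kaggle's own scoring code; the function name `rmse` and the sample salaries are made up for the example.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    # Square each residual, average the squares, then take the square root.
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical salaries in USD, invented for illustration only.
actual = [100_000, 150_000, 80_000]
predicted = [110_000, 140_000, 80_000]
score = rmse(actual, predicted)
print(f"RMSE: {score:.2f}")
```

Because the errors are squared before averaging, a few large misses on high salaries weigh more heavily on the score than many small ones.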

ETL

  • Salary and salary_currency: both columns were dropped from the table, because salary_in_usd (the column to predict) is derived from them.

  • Company location: grouped into 'US' or 'Not US'. There were too many distinct values to work with or to obtain a useful correlation.

  • Employee residence: grouped into 'US' or 'Not US', for the same reason.

  • Employment type: grouped into 'FT' or 'Not FT', for the same reason.

  • Job title: restricted to 'Data Scientist', 'Data Analyst', 'Machine Learning Engineer', 'Research Scientist' and 'Other'.

  • Categorical to numeric: with 'get_dummies', I transformed the categorical columns into numeric columns so the models could work with them.

Correlation values were improved with these changes.
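
The ETL steps above could look roughly like the following pandas sketch. It is an illustration under assumptions, not the project's actual code: the column names (`job_title`, `salary_currency`, `employee_residence`, `company_location`, `employment_type`), the `clean` helper and the four sample rows are all hypothetical stand-ins for Salaries_data.csv.

```python
import pandas as pd

# Tiny hypothetical sample; the real data lives in Salaries_data.csv.
df = pd.DataFrame({
    "job_title": ["Data Scientist", "Data Engineer", "Data Analyst", "Research Scientist"],
    "salary": [100000, 70000, 60000, 120000],
    "salary_currency": ["USD", "EUR", "GBP", "USD"],
    "salary_in_usd": [100000, 80000, 75000, 120000],
    "employee_residence": ["US", "DE", "GB", "US"],
    "company_location": ["US", "DE", "US", "CA"],
    "employment_type": ["FT", "PT", "FT", "CT"],
})

KEPT_TITLES = [
    "Data Scientist",
    "Data Analyst",
    "Machine Learning Engineer",
    "Research Scientist",
]


def clean(frame):
    """Apply the grouping and encoding steps described in the ETL list."""
    # Drop the columns that the target (salary_in_usd) is derived from.
    out = frame.drop(columns=["salary", "salary_currency"])
    # Collapse the location columns to 'US' / 'Not US'.
    for col in ["company_location", "employee_residence"]:
        out[col] = out[col].where(out[col] == "US", "Not US")
    # Collapse employment type to 'FT' / 'Not FT'.
    out["employment_type"] = out["employment_type"].where(
        out["employment_type"] == "FT", "Not FT"
    )
    # Keep the four named job titles and map everything else to 'Other'.
    out["job_title"] = out["job_title"].where(
        out["job_title"].isin(KEPT_TITLES), "Other"
    )
    # One-hot encode the remaining categorical columns.
    categorical = ["company_location", "employee_residence", "employment_type", "job_title"]
    return pd.get_dummies(out, columns=categorical, dtype=int)


features = clean(df)
print(features.columns.tolist())
```

The same `clean` function would need to be applied to Testeo.csv before predicting, so that training and prediction data share the same columns.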

Predictions

Each model was trained separately, and the following RMSE values were obtained:

  • Random forest: 57288.29736536037
  • ExtraTreeRegressor: 54998.40405193934
  • LinearRegression: 55869.57667978832
  • Lasso: 55866.50186298033
  • Ridge: 55786.59417475289
  • ElasticNet: 60719.37461777011
  • SVR: 74151.47731646513
  • LGBMRegressor: 56461.887229577005
  • XGBR: 60513.10436434398
  • CTR: 59449.57853642741
  • GBR: 56630.53859375318
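
A comparison like the one above can be produced with a simple loop over candidate models. The sketch below is only an assumption about how such a comparison might be set up: it uses synthetic data from scikit-learn's `make_regression` instead of the competition data, a held-out split that the original write-up does not describe, and only four of the listed models.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cleaned salary features and target.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "LinearRegression": LinearRegression(),
    "Ridge": Ridge(),
    "Lasso": Lasso(),
    "RandomForestRegressor": RandomForestRegressor(n_estimators=50, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    # RMSE is the square root of the mean squared error.
    scores[name] = float(np.sqrt(mean_squared_error(y_test, predictions)))

# Print the models from best (lowest RMSE) to worst.
for name, value in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: {value:.2f}")
```

Scoring on rows the model has not seen during fitting gives a closer estimate of the Kaggle score than scoring on the training rows themselves.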

Final result

🥉 Third place in my class!
