Coder Social home page Coder Social logo

lab-comparing-regression-models's Introduction

logo_ironhack_blue 7

Lab | Comparing regression models

For this lab, we will be using the same dataset we used in the previous labs. Load the cleaned categorical and numerical dataframes that you saved at the end of Monday's labs.

Data Analysis Process

Remember the process:

  • Case Study
  • Get data
  • Cleaning/Wrangling/EDA
  • Processing Data
  • Modeling
  • Validation
  • Reporting

Instructions

Concatenate Numerical and Categorical dataframes into one dataframe called data.

  1. In this final lab, we will model our data. Import sklearn train_test_split and separate the data.

  2. Separate X_train and X_test into numerical and categorical (X_train_cat , X_train_num , X_test_cat , X_test_num)

  3. Use X_train_num to fit scalers. Transform BOTH X_train_num and X_test_num.

  4. Encode the categorical variables X_train_cat and X_test_cat (See the hint below for encoding categorical data!!!)

  5. Since the model will only accept numerical data, check and make sure that every column is numerical, if some are not, change it using encoding.

***********************************************

Hint for Categorical Variables

You should deal with the categorical variables as shown below (for ordinal encoding, dummy code has been provided as well):

Encoder Type Column
One hot state
Ordinal coverage
Ordinal employmentstatus
Ordinal location code
One hot marital status
One hot policy type
One hot policy
One hot renew offercustomer_df
One hot sales channel
One hot vehicle class
Ordinal vehicle size
Ordinal education
One hot response
One hot gender

Dummy code

data["coverage"] = data["coverage"].map({"Basic" : 0, "Extended" : 1, "Premium" : 2})

given that column "coverage" in the dataframe "data" has three categories:

"basic", "extended", and "premium" values are to be represented in the same order.

******************************************************

  1. Try a simple linear regression with all the data to see whether we are getting good results.

  2. Great! Now define a function that takes a list of models and train (and tests) them so we can try a lot of them without repeating code.

  3. Use the function to check LinearRegressor and KNeighborsRegressor.

  4. You can check also the MLPRegressor for this task!

  5. Check and discuss the results.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.