lixx21 / transportation_emission_prediction


Predict emission from transportation using machine learning

Jupyter Notebook 97.49% Python 2.51%
emission machine-learning regression-models supervised-learning

transportation_emission_prediction's Introduction

Overview

      Emissions are one of the most pressing concerns of this era. High emissions have many effects that we can feel, such as air pollution and climate change. As emissions from human activities increase, they build up in the atmosphere and warm the climate, leading to many other changes around the world: in the atmosphere, on land, and in the oceans (Climate Change Indicator). This can be a big problem for us as humans.

      Transportation is one of the factors that can produce a lot of emissions, so this notebook predicts emissions from transportation to help reduce high-emission vehicles.

Exploratory Data Analysis

  1. Use data.info() to see information about each column; the dataset has 73585 rows and 12 columns.
  2. Use data.isnull().sum() to check for null or missing values in the dataset.
  3. Because we only need a few columns (Engine Size, Cylinders, Fuel Type, Fuel Consumption City, Fuel Consumption Highway (Hwy), and CO2 Emissions(g/km)), we remove the rest using data = data.drop([columns], axis=1).

image
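The three steps above can be sketched as follows. This is a minimal illustration on a tiny made-up DataFrame, not the notebook's actual code: the exact column names, the sample values, and the `unused_columns` list are assumptions.

```python
import pandas as pd

# Tiny stand-in for the real dataset; column names and values are assumptions.
data = pd.DataFrame({
    "Make": ["ACURA", "BMW", "FORD", "HONDA"],
    "Engine Size(L)": [2.0, 3.0, 5.0, 1.5],
    "Cylinders": [4, 6, 8, 4],
    "Fuel Type": ["Z", "Z", "X", None],
    "Fuel Consumption City (L/100 km)": [9.9, 12.1, 15.6, 7.4],
    "Fuel Consumption Hwy (L/100 km)": [6.7, 8.3, 10.2, 5.9],
    "CO2 Emissions(g/km)": [196, 244, 305, 156],
})

data.info()                    # prints column dtypes and non-null counts
missing = data.isnull().sum()  # number of missing values per column
print(missing)

# Drop the columns that are not needed as features or label.
unused_columns = ["Make"]      # hypothetical list of columns to remove
data = data.drop(unused_columns, axis=1)
print(data.shape)
```

Passing the column names as a list to `drop` with `axis=1` removes them in one call and returns a new DataFrame.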

  4. So that users can see what each fuel type means, we replace the letter codes with the actual fuel type names

      from: image

      to: image
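One way to do this replacement is with a dictionary passed to `Series.replace`. The mapping below is an assumption for illustration (the README only names regular gasoline and natural gas), so it should be checked against the dataset's own documentation.

```python
import pandas as pd

# Assumed code-to-name mapping; verify it against the dataset's documentation.
FUEL_TYPE_NAMES = {
    "X": "Regular Gasoline",
    "Z": "Premium Gasoline",
    "D": "Diesel",
    "E": "Ethanol (E85)",
    "N": "Natural Gas",
}

# Made-up sample column standing in for the real data.
data = pd.DataFrame({"Fuel Type": ["X", "Z", "D", "E", "N", "X"]})

# Replace every letter code with its readable name.
data["Fuel Type"] = data["Fuel Type"].replace(FUEL_TYPE_NAMES)
print(data["Fuel Type"].tolist())
```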

  5. Visualize the total count of each fuel type

image

      Then we know that regular gasoline is the fuel type used by the most vehicles, and natural gas is the fuel type used by the fewest.

  6. Then we plot the correlation between the numerical columns using scatter plots
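The counts behind such a chart can be obtained with `value_counts`. The sketch below uses made-up rows, so the numbers are not the notebook's results; only the technique is the point.

```python
import pandas as pd

# Made-up sample; the real dataset has many more rows.
data = pd.DataFrame({
    "Fuel Type": ["Regular Gasoline"] * 3
                 + ["Premium Gasoline"] * 2
                 + ["Natural Gas"],
})

fuel_counts = data["Fuel Type"].value_counts()  # rows per fuel type
print(fuel_counts)

most_common = fuel_counts.idxmax()   # fuel type with the most vehicles
least_common = fuel_counts.idxmin()  # fuel type with the fewest vehicles
print(most_common, least_common)
```

If matplotlib is installed, `fuel_counts.plot(kind="bar")` draws the same counts as a bar chart.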

image
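As a numeric complement to the scatter plots, the pairwise Pearson correlation can be computed with `DataFrame.corr`. This is an addition for illustration rather than something the notebook is known to do, and the sample values are made up.

```python
import pandas as pd

# Made-up numeric sample; column names are assumptions.
data = pd.DataFrame({
    "Engine Size(L)": [1.5, 2.0, 3.0, 3.5, 5.0],
    "Cylinders": [4, 4, 6, 6, 8],
    "CO2 Emissions(g/km)": [156, 196, 244, 260, 305],
})

# Keep only numeric columns, then compute pairwise Pearson correlation.
numeric = data.select_dtypes(include="number")
corr = numeric.corr()

# Correlation of every column with the label, strongest first.
print(corr["CO2 Emissions(g/km)"].sort_values(ascending=False))
```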

Data Preprocessing

  1. Because the fuel type column contains non-numerical values, we need to encode it into numerical values using pd.get_dummies()

image
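A minimal sketch of this encoding step, on made-up rows with assumed column names:

```python
import pandas as pd

# Made-up sample with one numeric and one categorical column.
data = pd.DataFrame({
    "Engine Size(L)": [2.0, 3.0, 5.0],
    "Fuel Type": ["Regular Gasoline", "Diesel", "Regular Gasoline"],
})

# One-hot encode the categorical column; the original column is replaced
# by one indicator column per category.
encoded = pd.get_dummies(data, columns=["Fuel Type"])
print(encoded.columns.tolist())
```

Each category becomes its own indicator column named `Fuel Type_<category>`, so the models only see numeric inputs.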

  2. Define the x and y values: x contains all the independent variables, and y contains the label (the dependent variable).
  3. Split x and y into x_train, x_test, y_train, and y_test using train_test_split. In this case I use 80% for train_size and 20% for test_size.
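The definition of x and y and the 80/20 split might look like the sketch below. The sample data and the `random_state` value are assumptions added so the example is reproducible; the README does not say whether a seed was used.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Made-up sample of 10 rows; column names are assumptions.
data = pd.DataFrame({
    "Engine Size(L)": [1.5, 2.0, 2.4, 3.0, 3.5, 3.6, 4.0, 5.0, 5.3, 6.2],
    "Cylinders": [4, 4, 4, 6, 6, 6, 8, 8, 8, 8],
    "CO2 Emissions(g/km)": [156, 196, 210, 244, 260, 265, 280, 305, 320, 350],
})

x = data.drop("CO2 Emissions(g/km)", axis=1)  # independent variables
y = data["CO2 Emissions(g/km)"]               # label (dependent variable)

# 80% of the rows for training, 20% for testing.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, train_size=0.8, test_size=0.2, random_state=42  # seed is an assumption
)
print(len(x_train), len(x_test))
```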

Modeling

      For modeling I use 4 models:

  1. Linear Regression with estimator LinearRegression(fit_intercept=False, n_jobs=30)
  2. Ridge Regression with estimator Ridge(alpha=2.0, solver='svd')
  3. Random Forest Regression with estimator RandomForestRegressor(max_depth=50, max_features=None, min_samples_split=8)
  4. Neural Network with layers like this:

image
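The three scikit-learn estimators listed above can be fitted as in the sketch below. The estimator arguments are copied from the list; the synthetic training data and the `random_state` on the random forest are assumptions for illustration. The neural network is left out because its layers are only shown in the image.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge

# Synthetic stand-in for x_train / y_train (an assumption, not the real data).
rng = np.random.default_rng(0)
x_train = rng.uniform(1.0, 6.0, size=(200, 3))
y_train = (40.0 * x_train[:, 0] + 10.0 * x_train[:, 1]
           + 5.0 * x_train[:, 2] + rng.normal(0.0, 1.0, size=200))

models = {
    "Linear Regression": LinearRegression(fit_intercept=False, n_jobs=30),
    "Ridge Regression": Ridge(alpha=2.0, solver="svd"),
    "Random Forest Regression": RandomForestRegressor(
        max_depth=50, max_features=None, min_samples_split=8,
        random_state=0,  # seed added here for reproducibility (assumption)
    ),
}

# Fit every model on the same training data.
for name, model in models.items():
    model.fit(x_train, y_train)
    print(name, model.predict(x_train[:1]))
```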

Result

      For the results, I got the accuracy and MAE for each model like this:

image

      We can see that Random Forest Regression has the best MAE and accuracy. I also saved my models in pickle format, and in js format for the TensorFlow deep learning model.
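A sketch of how the evaluation and the pickle step could be done for one model. It assumes that "accuracy" refers to the R² value returned by scikit-learn's `score` method, which is a guess since the README does not define it; the synthetic data and the seed are also assumptions.

```python
import pickle

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

# Synthetic stand-in data (an assumption, not the real dataset).
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 6.0, size=(200, 3))
y = 40.0 * x[:, 0] + 10.0 * x[:, 1] + rng.normal(0.0, 1.0, size=200)
x_train, x_test = x[:160], x[160:]
y_train, y_test = y[:160], y[160:]

model = RandomForestRegressor(
    max_depth=50, max_features=None, min_samples_split=8, random_state=0
)
model.fit(x_train, y_train)

predictions = model.predict(x_test)
mae = mean_absolute_error(y_test, predictions)  # mean absolute error
r2 = model.score(x_test, y_test)                # R^2 on the test set
print(f"MAE: {mae:.2f}, R^2: {r2:.3f}")

# Serialize the fitted model with pickle and restore it again.
serialized = pickle.dumps(model)
restored = pickle.loads(serialized)
```

To persist the model on disk instead of in memory, the same object can be written with `pickle.dump(model, file)` inside a `with open(path, "wb") as file:` block.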
