nycTaxi

nycTaxi_rf.ipynb contains my work for the New York City Taxi Fare Prediction Kaggle playground competition. I use a Random Forest Regressor to predict the fare of a New York City yellow taxi ride from data that would be available at the start of the ride. A Random Forest is a supervised ensemble learning method that bags together many decision trees, averaging their predictions to correct for the tendency of individual trees to overfit.
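
To make the bagging idea concrete, here is a minimal sketch on synthetic data (not the competition data, and not code from the notebook). It compares a single fully grown decision tree with a random forest on a held-out set. The dataset, sizes, seeds, and the `rmse` helper are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression problem (illustrative assumption, not taxi data).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(2000, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=2000)

X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# A single fully grown tree memorises the training rows, noise included.
tree = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)

# A forest averages many trees, each fit on a bootstrap sample of the rows.
forest = RandomForestRegressor(n_estimators=50, random_state=0, n_jobs=-1)
forest.fit(X_train, y_train)


def rmse(model, X, y):
    """Root mean squared error of `model` on (X, y)."""
    return float(np.sqrt(np.mean((model.predict(X) - y) ** 2)))


tree_train_rmse = rmse(tree, X_train, y_train)
tree_rmse = rmse(tree, X_valid, y_valid)
forest_rmse = rmse(forest, X_valid, y_valid)
print(f"tree: train {tree_train_rmse:.3f}, valid {tree_rmse:.3f}")
print(f"forest: valid {forest_rmse:.3f}")
```

The single tree reaches near-zero training error but does worse on the validation set, which is the overfitting that averaging over bootstrapped trees is meant to reduce.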

Data: You can find the data for this project on the competition's Data tab. When running this code, I had train.csv and test.csv in a subdirectory of my working directory called data. If you use a different file structure or change the file names, make sure to update the PATH variable in the notebook and modify the read_csv calls accordingly.
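
As a rough sketch of that layout, the loading step might look like the following. The `load_data` helper and its `nrows` argument are hypothetical, not taken from the notebook, and the `pickup_datetime` column name is assumed from the competition's data files.

```python
from pathlib import Path

import pandas as pd

# Assumed layout: data/train.csv and data/test.csv under the working directory.
PATH = Path("data")


def load_data(path=PATH, nrows=None):
    """Read train.csv and test.csv from `path`.

    `nrows` limits how many training rows are read, which helps when
    experimenting on a large training file.
    """
    train = pd.read_csv(
        path / "train.csv", nrows=nrows, parse_dates=["pickup_datetime"]
    )
    test = pd.read_csv(path / "test.csv", parse_dates=["pickup_datetime"])
    return train, test
```

With the files in place, `train, test = load_data(nrows=1_000_000)` would read a subset of the training rows. If your files live elsewhere, pass a different `path` or change `PATH`.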

While the end goal is to predict the target variable (fare_amount), this notebook pays particular attention to a key strength of the Random Forest: interpretation. I go through the following general steps:

  1. Loading and inspecting the data, cleaning up nonsensical data as needed.
  2. Some basic feature engineering using information about the pickup and dropoff locations.
  3. Building a quick and dirty Random Forest baseline.
  4. Single trees, bagging, and randomness.
  5. Datetime feature engineering.
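
The two feature engineering steps (2 and 5) can be sketched as follows. This does not reproduce the notebook's actual features: the haversine distance and the particular datetime parts are assumptions about what features of this kind typically look like, and the function names and the sample ride are hypothetical. It uses only the standard library and works on one ride at a time; with pandas you would vectorize the same arithmetic over whole columns.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def datetime_features(ts):
    """Split a pickup timestamp into parts a tree can split on."""
    return {
        "year": ts.year,
        "month": ts.month,
        "day_of_week": ts.weekday(),  # Monday == 0
        "hour": ts.hour,
        "is_weekend": ts.weekday() >= 5,
    }


# Hypothetical ride: Times Square to JFK airport (approximate coordinates).
distance = haversine_km(40.7580, -73.9855, 40.6413, -73.7781)
features = datetime_features(datetime(2015, 1, 27, 13, 8, 24))
print(f"{distance:.1f} km", features)
```

Splitting the timestamp into separate integer columns matters for tree-based models, since a tree can only split on the values it is given: an hour column lets it isolate rush hours directly, which a raw timestamp does not.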

To come:

  1. More on hyperparameter tuning.
  2. More feature engineering (maybe): weather, traffic speeds by time, important locations, boroughs, public holidays.
  3. Interpretation: feature importance, confidence intervals.
  4. Modular code with separate modules for different functionality that I define in the notebook.
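
As a preview of the interpretation item, here is a minimal sketch of both ideas using scikit-learn on synthetic data. The feature names and the made-up fare formula are assumptions for illustration, not results from the competition data. The spread of per-tree predictions is only a rough indicator of how much the trees disagree, not a calibrated confidence interval.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 1000

# Synthetic stand-ins for engineered features (hypothetical names).
feature_names = ["distance_km", "hour", "noise"]
X = np.column_stack([
    rng.uniform(0, 20, n),
    rng.integers(0, 24, n),
    rng.normal(size=n),
])
# Made-up fare that depends on distance only, plus random noise.
y = 2.5 + 2.0 * X[:, 0] + rng.normal(scale=1.0, size=n)

forest = RandomForestRegressor(
    n_estimators=40, min_samples_leaf=3, random_state=0, n_jobs=-1
)
forest.fit(X, y)

# Impurity-based importances: one value per feature, summing to 1.
importances = dict(zip(feature_names, forest.feature_importances_))

# Predict the first five rows with every tree separately. The mean across
# trees is the forest's prediction; the standard deviation shows how much
# the individual trees disagree on each row.
per_tree = np.stack([tree.predict(X[:5]) for tree in forest.estimators_])
pred_mean = per_tree.mean(axis=0)
pred_std = per_tree.std(axis=0)

print(importances)
print(pred_mean.round(2), pred_std.round(2))
```

Rows where `pred_std` is large relative to the others are ones the forest is less sure about, which is often a useful pointer to unusual rides or sparse regions of the data.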

If you have any questions, suggestions for how this notebook can be improved, or ideas for cool things I can do with Random Forests, feel free to reach out!

Credits: This work is the result of working through the fast.ai Machine Learning 1 course. I have occasionally borrowed code snippets or ideas from the lessons on Random Forests, the code for which can be found at this GitHub repo.

Contributors: prratek