Coder Social home page Coder Social logo

oop-regressor-seattle-ds-102819's Introduction

Build Your Own Regressor

Your task today is to create a new class, MeanRegressor, which implements a similar interface to a sklearn regressor like sklearn.linear_model.LinearRegression. However, unlike a more sophisticated model implementation, your model does not actually take any of the feature variables into account, it just always predicts the mean of the target variable of the training data.

Requirements

  1. Build a class MeanRegressor that can be initialized
  2. Write a method MeanRegressor#fit(X, y):
    • X is a two-dimensional matrix (nested NumPy array, nested Python list, or Pandas dataframe) of data rows and features. Your model will be ignoring it.
    • y is a list (NumPy array or Python list) representing the target variable
    • The model should determine the mean of y and store it, to be used in the predict method
    • This method does not return anything
  3. Write a method MeanRegressor#predict(X):
    • X is a two-dimensional matrix. Your model will be ignoring its features, and only using the count of rows.
    • This method returns the mean of the training data for each row of X, i.e. a list containing the same number repeated as many times as necessary.
  4. Write a method MeanRegressor#score(X, y):
    • X is a two-dimensional matrix and y is a list of target variables
    • This method will compute the R2 for how well the features of X are able to predict the target of y. As a reminder, R2 is calculated as 1 - residual sum of squares/total sum of squares, where residual sum of squares is the sum of all ((y_true - y_pred)2) and total sum of squares is the sum of all ((y_true - y_pred.mean())2). So, if you are scoring with the same y that was used for the fit, you should expect the score to be exactly zero.

These requirements have test coverage. To run the tests, run pytest in bash from the root of this repository. There should be an initial print-out saying that you failed all 5 tests. Re-run pytest as you implement the requirements, and eventually they should all pass.

Hints

You will need at least one attribute (AKA member variable) to make your MeanRegressor work. One option would be storing some information about model fit that you manually create, and another option would be instantiating a DummyRegressor and passing user inputs into it.

Stretch Goals

If you have the previous requirements working and there is time remaining, consider these additional goals:

  1. Check whether X and y are valid inputs and raise appropriate exceptions if they are not. For example, if the user tries to run fit with X having 5 rows and y having 10 target variables, produce a readable/understandable error that explains why this is invalid input.
  2. See if you can reuse code from your Mod 2 project, and feed the King County housing data into your MeanRegressor. How does this "dummy" model compare to your previous final model, in terms of R^2 and residuals? Explore this with data visualizations.

oop-regressor-seattle-ds-102819's People

Contributors

hoffm386 avatar wvsharber avatar aabrahamson3 avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.