Build Your Own Regressor

Your task today is to create a new class, MeanRegressor, that implements an interface similar to that of a scikit-learn regressor such as sklearn.linear_model.LinearRegression. Unlike a more sophisticated model, yours does not take any of the feature variables into account: it always predicts the mean of the target variable of the training data.

Requirements

Build a class MeanRegressor that can be initialized
Write a method MeanRegressor#fit(X, y):
- X is a two-dimensional matrix (nested NumPy array, nested Python list, or Pandas dataframe) of data rows and features. Your model will be ignoring it.
- y is a list (NumPy array or Python list) representing the target variable
- The model should determine the mean of y and store it, to be used in the predict method
- This method does not return anything
Write a method MeanRegressor#predict(X):
- X is a two-dimensional matrix. Your model will be ignoring its features, and only using the count of rows.
- This method returns the mean of the training data for each row of X, i.e. a list containing the same number repeated as many times as necessary.
Write a method MeanRegressor#score(X, y):
- X is a two-dimensional matrix and y is a list of target variables
- This method computes the R² for how well the model's predictions for X match the target y. As a reminder, R² is calculated as 1 - (residual sum of squares / total sum of squares), where the residual sum of squares is the sum of all (y_true - y_pred)² and the total sum of squares is the sum of all (y_true - y_true.mean())². So, if you score with the same y that was used for the fit, y_pred equals y_true.mean() for every row, the two sums are equal, and you should expect the score to be exactly zero.
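To make the R² arithmetic concrete, here is a minimal sketch using NumPy. The helper name r_squared and the sample numbers are assumptions made for illustration; they are not part of the repository or its tests. The total sum of squares is taken around the mean of y_true, which is the standard definition.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Return 1 - RSS/TSS for two equal-length sequences of numbers.

    Hypothetical helper for illustration only. It does not guard against
    a constant y_true, where TSS would be zero.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rss = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    tss = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1 - rss / tss


y = [3.0, 5.0, 7.0, 9.0]

# Predicting the mean (6.0) for every row makes RSS equal to TSS.
mean_predictions = np.full(len(y), np.mean(y))
score_of_mean = r_squared(y, mean_predictions)  # 0.0

# Predicting every value perfectly makes RSS zero.
score_of_perfect = r_squared(y, y)              # 1.0
```

If you score against a y that differs from the training targets, the stored training mean no longer equals y_true.mean(), so the score will generally be below zero.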

These requirements have test coverage. To run the tests, run pytest in bash from the root of this repository. There should be an initial print-out saying that you failed all 5 tests. Re-run pytest as you implement the requirements, and eventually they should all pass.

Hints

You will need at least one attribute (AKA member variable) to make your MeanRegressor work. One option is to store some information about the model fit yourself (for example, the mean computed in fit); another is to instantiate a scikit-learn DummyRegressor (sklearn.dummy.DummyRegressor) and pass the user's inputs into it.
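For reference, this is roughly how scikit-learn's DummyRegressor behaves with the "mean" strategy, which is the behavior your MeanRegressor is meant to mirror. The arrays below are made-up sample data, and the snippet assumes scikit-learn and NumPy are installed.

```python
import numpy as np
from sklearn.dummy import DummyRegressor

# Made-up training data: four rows, two features (the features are ignored).
X_train = np.array([[1, 10], [2, 20], [3, 30], [4, 40]])
y_train = np.array([3.0, 5.0, 7.0, 9.0])

reference = DummyRegressor(strategy="mean")
reference.fit(X_train, y_train)

# Three new rows in, three copies of the training mean (6.0) out.
X_new = np.array([[100, 0], [200, 0], [300, 0]])
reference_predictions = reference.predict(X_new)

# Scoring against the training targets gives an R² of zero.
reference_score = reference.score(X_train, y_train)
```

Comparing your own class against these values on the same inputs is one way to sanity-check it before running the test suite.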

Stretch Goals

If you have the previous requirements working and there is time remaining, consider these additional goals:

Check whether X and y are valid inputs and raise appropriate exceptions if they are not. For example, if the user tries to run fit with X having 5 rows and y having 10 target values, raise an error with a readable message that explains why this input is invalid.
See if you can reuse code from your Mod 2 project, and feed the King County housing data into your MeanRegressor. How does this "dummy" model compare to your previous final model, in terms of R² and residuals? Explore this with data visualizations.
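For the input-validation goal, one possible shape for the row-count check is sketched below. The function name check_row_counts and the wording of the message are assumptions for illustration, not something the tests require. It relies only on len(), which returns the number of rows for nested lists, NumPy arrays, and Pandas dataframes alike.

```python
def check_row_counts(X, y):
    """Raise ValueError if X and y disagree on the number of rows.

    Hypothetical helper: call it at the top of fit (and score) before
    doing any arithmetic.
    """
    n_rows = len(X)
    n_targets = len(y)
    if n_rows != n_targets:
        raise ValueError(
            f"X has {n_rows} rows but y has {n_targets} target values; "
            "fit needs exactly one target value per row of X."
        )


# Matching lengths pass silently.
check_row_counts([[1, 2], [3, 4]], [10.0, 20.0])
```

ValueError is used here because the arguments have the right types but inconsistent contents; other checks, such as rejecting a one-dimensional X, could follow the same pattern.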
