date-a-scientist's Issues

Project Review

Rubric Score

Criteria 1: Valid Python Code

  • Score Level: 4/4
  • Comment(s): Code included runs without any errors.

Criteria 2: Exploration of Data

  • Score Level: 3/4
  • Comment(s): Data is explored somewhat, and the experimental question(s) chosen are logical and based on the data exploration. Features chosen to answer the question make sense, but some of the most obvious features are left out/not addressed.

You only explored a few of the features, and you did not explore any of the free-form text data. One of the great values of Data Science is the possibility to uncover insights that our intuition alone can't detect, so I encourage you to fully explore all the features.

Word clouds are a great way to visualize text. Also, think about how you can turn free-form text into features (vectorization, converting to categorical values through clustering, etc.).
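As a rough illustration of the vectorization idea, here is a minimal bag-of-words sketch in plain Python. The essay snippets and the helper names `tokenize` and `bag_of_words` are made up for this example and are not taken from your project; in practice a library vectorizer such as scikit-learn's `CountVectorizer` would likely be more convenient.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(documents):
    """Return (vocabulary, matrix), where matrix[i][j] counts word j in document i."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[word] for word in vocabulary])
    return vocabulary, matrix


# Hypothetical essay snippets, invented purely for illustration.
essays = [
    "I love hiking and I love coffee",
    "Coffee, books, and long walks",
]
vocabulary, matrix = bag_of_words(essays)
for word, column in zip(vocabulary, zip(*matrix)):
    print(word, column)
```

Each row of `matrix` is a numeric feature vector for one essay, which is the form most models expect.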

Another worthy area of exploration is the interaction between features. See if any features are correlated with one another. How would this affect the models you use?
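One way to check for correlated features is to compute the Pearson correlation for every pair of numeric columns. The sketch below does this in plain Python; the column names and values are hypothetical, and with pandas you would more likely call `DataFrame.corr()`. Strongly correlated inputs matter because they can make the coefficients of a linear model unstable and hard to interpret.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)


def correlation_pairs(columns):
    """Return {(name_a, name_b): correlation} for every distinct pair of columns."""
    names = list(columns)
    return {
        (a, b): pearson(columns[a], columns[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Hypothetical numeric features, invented purely for illustration.
features = {
    "age": [22, 35, 47, 51, 29],
    "height": [65, 70, 68, 72, 66],
    "essay_length": [300, 120, 80, 60, 250],
}
for pair, r in correlation_pairs(features).items():
    print(pair, round(r, 3))
```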

Criteria 3: Machine Learning Techniques used correctly

  • Score Level: 3/4
  • Comment(s): 75-90% of algorithms are used correctly and the correct conclusions are drawn from the results.

I would suggest formulating a separate question specifically for regression. As a side note, you should do some research on why Logistic Regression, despite its name, is a classification technique. It will help clarify the distinction between regression and classification. Also, for KNN, k does not need to be three; it is a hyperparameter you can tune (don't mix up KNN with K-Means).
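To make the point about k concrete, here is a small sketch of a KNN classifier in plain Python that is scored with several values of k. The points, labels, and function names are hypothetical and are not from your script; the toy training set deliberately contains one noisy point so that k = 1 and k = 3 disagree.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


def accuracy_for_k(train_points, train_labels, test_points, test_labels, k):
    """Fraction of test points whose predicted label matches the true label."""
    correct = sum(
        knn_predict(train_points, train_labels, point, k) == label
        for point, label in zip(test_points, test_labels)
    )
    return correct / len(test_points)


# Hypothetical toy data: two clusters plus one noisy "b" point near the "a" cluster.
train_points = [(1, 1), (1, 2), (2, 1), (2, 3), (8, 8), (8, 9), (9, 8)]
train_labels = ["a", "a", "a", "b", "b", "b", "b"]
test_points = [(2, 2.6), (9, 9)]
test_labels = ["a", "b"]

# Try several odd values of k instead of fixing k = 3.
for k in (1, 3, 5):
    score = accuracy_for_k(train_points, train_labels, test_points, test_labels, k)
    print(f"k={k}: accuracy={score:.2f}")
```

On real data you would compare these scores on a held-out validation set and keep the k that performs best.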

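You also encoded body_type as an ordinal feature. Generally this is bad practice, because the categories have no natural order and an ordinal encoding imposes an arbitrary ranking on them. I would encourage you to keep it as a categorical feature and one-hot encode it.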
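A minimal sketch of one-hot encoding, assuming body_type holds plain string categories. The sample values and the `one_hot_encode` helper are hypothetical; with pandas, `pd.get_dummies` does the same job on a DataFrame column.

```python
def one_hot_encode(values):
    """Return (categories, rows), where each row has a single 1 in its category's column."""
    categories = sorted(set(values))
    rows = [
        [1 if value == category else 0 for category in categories]
        for value in values
    ]
    return categories, rows


# Hypothetical body_type values, invented purely for illustration.
body_types = ["average", "fit", "athletic", "fit"]
categories, encoded = one_hot_encode(body_types)
print(categories)
for value, row in zip(body_types, encoded):
    print(value, row)
```

Each category gets its own 0/1 column, so the model never sees an implied order such as "athletic" being less than "average".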

Criteria 4: Report: Are conclusions clear and supported by data?

  • Score Level: 4/4
  • Comment(s): Question(s) are stated clearly. The results of 2 regression algorithms and 2 classification algorithms are shown. Conclusions are clearly stated and based on evidence.

Although I am giving you full credit based on the rubric criteria, I do not fully agree with the conclusion you reached. You chose accuracy as your evaluation metric. Have you considered trying other evaluation metrics (e.g. precision, recall, F1 score, MSE)? Also, think about which feature had the most significant effect on the outcome.
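To show why accuracy alone can mislead, here is a short sketch that computes precision, recall, and F1 next to accuracy for a binary classifier. The label lists are invented for illustration, and the function name is hypothetical; scikit-learn's `precision_score`, `recall_score`, and `f1_score` provide the same metrics.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the class given by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical imbalanced labels: only 2 of 10 samples are positive.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Here accuracy looks strong while precision and recall show that the positive class is found only half the time, which is the kind of gap worth checking before drawing conclusions.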

Criteria 5: Code formatting

  • Score Level: 4/4
  • Comment(s): Code is formatted clearly and readable.

Simply stated, your Python script is beautiful. All printouts are neatly organized and separated, which made your code really easy to read. My only complaint is the naming of the variable feature_label. Feature and label are two distinct concepts, and the variable name you chose could be confusing.

To take your project to the next level, consider putting your code and conclusions in a Jupyter notebook and uploading it to GitHub. This is a great way to showcase your work to employers.

Overall Score: 18/20

Overall, good job on this project. Although I think you should continue developing this project by trying more models and testing different features, you demonstrated a clear understanding of the overall process of applying ML.
