marcpaulo15 / titanic_streamlit

Top 13% solution for the Kaggle Titanic competition using a Gradient Boosting Classifier, plus a Streamlit app to play with the models.

License: MIT License

Languages: Jupyter Notebook 95.12%, Python 4.88%
Topics: data-science, data-science-introduction, data-visualization, gradient-boosting, gradient-boosting-classifier, hyperparameter-optimization, kaggle-titanic, machine-learning, machine-learning-introduction, model-optimization

Titanic Streamlit App

In this project, we present a solution to the Kaggle Titanic competition that scores 0.78 on the test dataset (top 13% as of 07/09/2023). We also present a simple but beautiful Streamlit app that lets users pretend they are aboard the Titanic and find out their chances of surviving the sinking. The Kaggle notebook is available at https://www.kaggle.com/code/marcpaulo/titanic-playground-for-new-kagglers-0-78

[Image: demo screenshot]

This project may be exciting for those who are starting their Data Science journey, exploring the Kaggle environment, or learning how to use Streamlit in their projects. There are also some TO-DOs and other ideas for you to implement in order to further your knowledge and achieve a better score in the competition.

The topics that we cover here are:

1. Exploratory Data Analysis (EDA): exhaustive feature analysis, outlier detection, missing values, visualization of the distribution and structure of the data, an initial guess at the importance of each feature, etc. #pandas #seaborn #numpy

2. Data Preprocessing: create a new feature from the existing ones (feature extraction), deal with missing data, and design and implement clean, efficient, and reusable Pipelines and Data Transformers using the #sklearn library.

3. Let's train some Models: test basic classification algorithms (LogisticRegression, SVC, RandomForest, GradientBoosting, KNN) and run GridSearch with Cross-Validation for hyperparameter optimization. Plot the GridSearch results to compare the performance of the different settings and choose the best configuration.
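
As a rough illustration of topics 2 and 3, the sketch below chains a feature-extraction step, imputation, encoding, and a Gradient Boosting classifier into one sklearn Pipeline, then tunes it with a small cross-validated grid search. It is not the code from the notebooks: the `FamilySize` feature, the helper names, the column groupings, the parameter grid, and the synthetic toy passengers are all assumptions made for this example. Only the column names follow the Kaggle Titanic dataset.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, OneHotEncoder, StandardScaler

# Assumed column groupings (not taken from the notebooks).
NUMERIC = ["Pclass", "Age", "Fare", "FamilySize"]
CATEGORICAL = ["Sex", "Embarked"]


def add_family_size(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical engineered feature: relatives aboard plus the passenger."""
    out = df.copy()
    out["FamilySize"] = out["SibSp"] + out["Parch"] + 1
    return out


def build_search() -> GridSearchCV:
    """Preprocessing + classifier in one Pipeline, wrapped in a grid search."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    preprocess = ColumnTransformer([
        ("num", numeric, NUMERIC),
        ("cat", categorical, CATEGORICAL),
    ])
    model = Pipeline([
        ("features", FunctionTransformer(add_family_size)),
        ("preprocess", preprocess),
        ("clf", GradientBoostingClassifier(random_state=0)),
    ])
    # Deliberately tiny grid so the example runs in seconds.
    grid = {"clf__n_estimators": [50, 100], "clf__max_depth": [2, 3]}
    return GridSearchCV(model, grid, cv=3, scoring="accuracy")


def make_toy_passengers(n: int = 120, seed: int = 0) -> pd.DataFrame:
    """Synthetic stand-in for data/train.csv, for illustration only."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "Pclass": rng.integers(1, 4, n),
        "Sex": rng.choice(["male", "female"], n),
        "Age": rng.uniform(1, 70, n),
        "SibSp": rng.integers(0, 3, n),
        "Parch": rng.integers(0, 3, n),
        "Fare": rng.uniform(5, 100, n),
        "Embarked": rng.choice(["S", "C", "Q"], n),
    })
    df.loc[::10, "Age"] = np.nan        # simulate missing ages
    df.loc[::25, "Embarked"] = np.nan   # simulate missing ports
    noise = rng.random(n) < 0.1         # flip about 10% of the labels
    df["Survived"] = ((df["Sex"] == "female") ^ noise).astype(int)
    return df


if __name__ == "__main__":
    data = make_toy_passengers()
    X, y = data.drop(columns="Survived"), data["Survived"]
    search = build_search().fit(X, y)
    print(search.best_params_, round(search.best_score_, 3))
```

Keeping the preprocessing inside the Pipeline means each cross-validation fold fits its imputers and scalers only on its own training split, so the grid-search scores are not inflated by leakage. With the real data, `make_toy_passengers()` would be replaced by `pd.read_csv("data/train.csv")`.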

Project Structure

/data/train.csv: data used to train the models. It contains information about 891 passengers.
/images/RMS_titanic.jpg: image used to decorate the Streamlit app.
/main_app/streamlit_app.py: launches the app using the powerful Streamlit library.
/models/trained_grad_boost.pkl: trained model resulting from the training notebooks and used in the App.
/notebooks/training_playground.ipynb: long notebook implementing the Data Science part in detail.
/notebooks/training_grad_boost.ipynb: short version of the playground notebook in which we only train and save the best model.
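
To show how the saved model and the app might fit together, here is a minimal sketch of a Streamlit script that loads a pickled estimator and reports a survival probability. It is not the contents of `streamlit_app.py`. It assumes the pickle holds a full sklearn pipeline that accepts raw passenger columns and exposes `predict_proba` with class 1 meaning "survived". The helper names, widget labels, and value ranges are hypothetical.

```python
import pickle
from pathlib import Path

import pandas as pd

MODEL_PATH = Path("models/trained_grad_boost.pkl")  # path from the project layout


def load_model(path: Path = MODEL_PATH):
    """Unpickle the trained estimator. Only unpickle files you trust."""
    with path.open("rb") as fh:
        return pickle.load(fh)


def survival_probability(model, passenger: dict) -> float:
    """Return P(survived) for one passenger given as a column -> value dict.

    Assumes the model's second predict_proba column is the 'survived' class.
    """
    row = pd.DataFrame([passenger])
    return float(model.predict_proba(row)[0, 1])


def main() -> None:
    # Imported here so the helpers above can be used without Streamlit.
    import streamlit as st

    st.title("Would you survive the Titanic?")
    # Hypothetical inputs; the real app may ask for different fields.
    passenger = {
        "Pclass": st.selectbox("Ticket class", [1, 2, 3]),
        "Sex": st.selectbox("Sex", ["male", "female"]),
        "Age": st.slider("Age", 0, 80, 30),
        "SibSp": st.number_input("Siblings/spouses aboard", 0, 8, 0),
        "Parch": st.number_input("Parents/children aboard", 0, 6, 0),
        "Fare": st.number_input("Fare", 0.0, 520.0, 32.0),
        "Embarked": st.selectbox("Port of embarkation", ["S", "C", "Q"]),
    }
    if st.button("Predict"):
        prob = survival_probability(load_model(), passenger)
        st.metric("Chance of survival", f"{prob:.0%}")
```

To try it, add a call to `main()` as the last line of the script and launch it with `streamlit run` followed by the script path. Keeping `survival_probability` free of Streamlit calls makes the prediction logic easy to test on its own.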
