mlproject's Introduction

End to End Machine Learning Project

Agenda:

  1. Set up github
    • new environment
    • setup.py
    • requirements.txt
  2. src folder and build the package

Create repo

Create an empty repo on GitHub.

Setup venv

Create a folder and open it in VS Code. In that folder, create a virtual environment.

conda create -p venv python==3.8 -y

conda activate ./venv

First commit in repo

From a terminal on your local machine:

Initialize the git repo

git init

Create and add readme file

git add README.md
git commit -m "First commit"

Connect the local repo to GitHub. When you create an empty repo on GitHub, it shows you the commands needed to connect it. Before pushing, make sure your git configuration has your email set (for example with git config --global user.email).

git branch -M main
git remote add origin https://github.com/josrodand/mlproject.git
git push -u origin main

Create .gitignore file

You can do this from GitHub. Select "Create new file" and name it .gitignore. Choose Python as the template and the file is filled in automatically.

Commit the new .gitignore file from GitHub.

After that, pull the change into your local repo:

git pull

This process can be automated later on.

setup.py

setup.py lets us build our machine learning project as a package. You can keep updating the package and install it in other projects, and later upload it to PyPI.

Create src

Create the src folder with an __init__.py file inside it.

Create requirements.txt

List all the modules the project needs. You can add -e . as the last line so that installing the requirements also installs the project itself through setup.py.

After that, install the requirements:

pip install -r requirements.txt

Having -e . in the requirements file ties the installation to setup.py and creates a package metadata folder: mlproject.egg-info
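
Because -e . is a pip option rather than a package name, setup.py cannot pass it on as a dependency. A minimal sketch of one way to handle this, assuming setup.py reads requirements.txt itself (the function names parse_requirements and get_requirements and the version number in the comment are illustrative, not taken from this repo):

```python
from typing import List

# The editable-install line that may close requirements.txt.
EDITABLE_INSTALL = "-e ."


def parse_requirements(text: str) -> List[str]:
    """Return the package lines of a requirements file, skipping blank lines,
    comments and the '-e .' line."""
    requirements = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line == EDITABLE_INSTALL:
            continue
        requirements.append(line)
    return requirements


def get_requirements(path: str) -> List[str]:
    """Read a requirements file from disk and parse it."""
    with open(path, encoding="utf-8") as handle:
        return parse_requirements(handle.read())


# In setup.py the result would typically be passed on, for example:
#     from setuptools import find_packages, setup
#     setup(name="mlproject", version="0.0.1", packages=find_packages(),
#           install_requires=get_requirements("requirements.txt"))
```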

Create components

Create a components folder with these files:

  • init file
  • data_ingestion.py
  • data_transformation.py
  • model_trainer.py

Create pipeline

Create a pipeline folder with these files:

  • init file
  • train_pipeline.py
  • predict_pipeline.py

More files

  • Create logger.py, utils.py and exception.py in src folder
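
The folders and files listed so far can also be created in one go. The sketch below assumes that components and pipeline live inside src (the notes do not say this explicitly) and uses a hypothetical helper named scaffold:

```python
from pathlib import Path
from typing import List

# File names come from the lists above; placing components/ and pipeline/
# under src/ is an assumption.
PROJECT_FILES = [
    "src/__init__.py",
    "src/logger.py",
    "src/utils.py",
    "src/exception.py",
    "src/components/__init__.py",
    "src/components/data_ingestion.py",
    "src/components/data_transformation.py",
    "src/components/model_trainer.py",
    "src/pipeline/__init__.py",
    "src/pipeline/train_pipeline.py",
    "src/pipeline/predict_pipeline.py",
]


def scaffold(root: str = ".") -> List[Path]:
    """Create missing folders and empty files; return the files created."""
    created = []
    for relative in PROJECT_FILES:
        path = Path(root) / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():
            path.touch()
            created.append(path)
    return created
```

Running scaffold() a second time creates nothing, so it is safe to call on an existing project.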

Exception code

We have created a custom exception handler that wraps an error and reports the file, the line number and the type of the error.
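
A minimal sketch of such a handler, not necessarily the code in exception.py. The names CustomException and describe_error are assumptions, and the sketch reads the location from the error's own traceback:

```python
def describe_error(error: BaseException) -> str:
    """Build a message with the error type, file and line where it was raised."""
    tb = error.__traceback__
    # Walk to the innermost frame, which is where the error was raised.
    while tb is not None and tb.tb_next is not None:
        tb = tb.tb_next
    if tb is None:
        # The error was never raised, so there is no location to report.
        return f"{type(error).__name__}: {error}"
    filename = tb.tb_frame.f_code.co_filename
    return f"{type(error).__name__} in {filename}, line {tb.tb_lineno}: {error}"


class CustomException(Exception):
    """Wraps another exception and keeps a readable description of it."""

    def __init__(self, error: BaseException):
        self.detail = describe_error(error)
        super().__init__(self.detail)
```

A typical use is to catch an error inside a component and re-raise it as CustomException(exc), so the log shows where it came from.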

Logging code

We create logging code that lets the project write log files to a directory.

We can test the logging code by running python logger.py. It creates a logs directory containing a folder named after the date, with the log file inside.
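
One way to get that layout is sketched below. The function name build_logger, the date and time formats and the log line format are assumptions, not the contents of logger.py:

```python
import logging
import os
from datetime import datetime
from typing import Tuple


def build_logger(log_root: str = "logs", name: str = "mlproject") -> Tuple[logging.Logger, str]:
    """Create <log_root>/<date>/<time>.log and return a logger writing to it."""
    now = datetime.now()
    log_dir = os.path.join(log_root, now.strftime("%Y_%m_%d"))
    os.makedirs(log_dir, exist_ok=True)
    log_path = os.path.join(log_dir, now.strftime("%H_%M_%S") + ".log")

    handler = logging.FileHandler(log_path)
    handler.setFormatter(
        logging.Formatter("[%(asctime)s] %(lineno)d %(name)s - %(levelname)s - %(message)s")
    )

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger, log_path
```

Calling build_logger() and then logger.info("Logging has started") should leave one new file under logs/.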

Start project

First download the data from GitHub and put it in the notebook/data folder.

Create notebooks: eda and model training.

Create process modules

  • Metric code, model selection, etc in utils
  • Training code in model_trainer
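
As a rough illustration of what the metric and model selection helpers in utils could look like, the sketch below scores any object that has fit and predict methods. Using R² as the metric and the names r_squared, evaluate_models and best_model_name are assumptions:

```python
from typing import Any, Dict, Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(y_true) / len(y_true)
    ss_tot = sum((y - mean) ** 2 for y in y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Constant targets have no variance to explain; report 0.0 in that case.
    return 1.0 - ss_res / ss_tot if ss_tot else 0.0


def evaluate_models(models: Dict[str, Any], X_train, y_train, X_test, y_test) -> Dict[str, float]:
    """Fit every model on the training data and score it on the test data."""
    report = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        report[name] = r_squared(y_test, model.predict(X_test))
    return report


def best_model_name(report: Dict[str, float]) -> str:
    """Return the name of the model with the highest score."""
    return max(report, key=report.get)
```

model_trainer could then call evaluate_models with a dictionary of candidate models and keep the one returned by best_model_name.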

We will go through these step by step.

Data ingestion implementation

We create a DataIngestionconfig class that holds the paths. We use dataclass to create classes that only have attributes; when a class needs methods, it is better to write it as a regular class.

What we have done here is create a class that covers the whole ingestion step. For now it basically reads from a CSV file and splits the data into train and test sets.

This generates an artifacts directory with the raw, train and test CSV files. The class creates the directories automatically from the configured paths.
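
The ingestion step described above can be sketched as follows. To stay self-contained this version uses only the standard library (csv and random) for reading and splitting, which may differ from the project's actual code. The class name DataIngestion, the method name run, the file names and the 80/20 split are assumptions:

```python
import csv
import os
import random
from dataclasses import dataclass
from typing import Tuple


@dataclass
class DataIngestionconfig:
    # Attribute-only class: just the output paths (file names are assumed).
    raw_data_path: str = os.path.join("artifacts", "raw.csv")
    train_data_path: str = os.path.join("artifacts", "train.csv")
    test_data_path: str = os.path.join("artifacts", "test.csv")


class DataIngestion:
    def __init__(self, config: DataIngestionconfig = None):
        self.config = config or DataIngestionconfig()

    @staticmethod
    def _write(path: str, header, rows) -> None:
        # Create the parent directory from the path before writing.
        os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
        with open(path, "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)
            writer.writerows(rows)

    def run(self, source_csv: str, test_size: float = 0.2, seed: int = 42) -> Tuple[str, str]:
        """Read source_csv, save a raw copy, then save shuffled train/test splits."""
        with open(source_csv, newline="") as handle:
            reader = csv.reader(handle)
            header = next(reader)
            rows = list(reader)

        self._write(self.config.raw_data_path, header, rows)

        shuffled = rows[:]
        random.Random(seed).shuffle(shuffled)
        n_test = int(len(shuffled) * test_size)
        self._write(self.config.test_data_path, header, shuffled[:n_test])
        self._write(self.config.train_data_path, header, shuffled[n_test:])
        return self.config.train_data_path, self.config.test_data_path
```

Calling DataIngestion().run with the path to the downloaded CSV would write the three files under artifacts/.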

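It is a good idea to add the artifacts folder to .gitignore so that it is not pushed to GitHub. The author of the project being followed added it, but the folder was still uploaded, so this needs checking (one possible cause: .gitignore does not affect files that git is already tracking).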
