Project: Data Cleaning and Manipulation with Pandas

Overview

The goal of this project is to combine everything you have learned about data wrangling, cleaning, and manipulation with Pandas so you can see how it all works together. For this project, you will start with one of the provided messy data sets.

You will need to import a data set, use your data wrangling skills to clean it up, prepare it to be analyzed, and then export it as a clean CSV data file.
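
As a rough illustration of that import, clean, and export flow, here is a minimal sketch. The sample data, the column names, and the `clean` helper are all hypothetical stand-ins, and an in-memory string replaces the real file so the example runs on its own; with the actual data set you would pass file paths to `pd.read_csv` and `to_csv`.

```python
import io

import pandas as pd

# Hypothetical messy data standing in for the provided data set.
RAW_CSV = """Name , Age,City
 Alice ,30,Madrid
Bob,,Paris
 Alice ,30,Madrid
"""


def clean(raw: pd.DataFrame) -> pd.DataFrame:
    """Apply a few illustrative cleaning steps (not a complete recipe)."""
    df = raw.copy()
    # Normalize column names: trim whitespace and lowercase them.
    df.columns = df.columns.str.strip().str.lower()
    # Trim stray whitespace inside a text column.
    df["name"] = df["name"].str.strip()
    # Remove exact duplicate rows and renumber the index.
    return df.drop_duplicates().reset_index(drop=True)


# Import: read the CSV (here from a string buffer instead of a file path).
raw_df = pd.read_csv(io.StringIO(RAW_CSV))

# Clean: run the cleaning steps.
clean_df = clean(raw_df)

# Export: without a path, to_csv returns the CSV text; pass a file name
# such as "clean_data.csv" (a hypothetical name) to write a file instead.
clean_csv = clean_df.to_csv(index=False)
```

Keeping the cleaning logic in one function makes it easy to rerun the whole flow from the raw data whenever a step changes.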

You will be working individually for this project, but we'll be guiding you along the process and helping you as you go. Show us what you've got!


Technical Requirements

The technical requirements for this project are as follows:

  • The data set that we provide is significantly messy. Apply the different cleaning and manipulation techniques you have learned to it.
  • Import the data using Pandas.
  • Examine the data for potential issues.
  • Use at least 8 of the cleaning and manipulation methods you have learned on the data.
  • Produce a Jupyter Notebook that shows the steps you took and the code you used to clean and transform your data set.
  • Export a clean CSV version of your data using Pandas.
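
To make the "examine" and "clean" requirements more concrete, the sketch below first summarizes potential issues and then applies a handful of common Pandas methods. The sample rows and column names (`price`, `signup_date`, `country`) are invented for illustration, and filling missing prices with the median is only one possible choice; your data set will call for its own decisions.

```python
import io

import pandas as pd

# Hypothetical sample; the real project uses the provided data set.
RAW_CSV = """id,price,signup_date,country
1,$10.50,2021-01-05,es
2,,2021-02-30,ES
3,$7.00,2021-03-12,Fr
3,$7.00,2021-03-12,Fr
"""

df = pd.read_csv(io.StringIO(RAW_CSV))

# Examine: collect a quick summary of potential issues.
issues = {
    "missing_per_column": df.isna().sum().to_dict(),
    "duplicate_rows": int(df.duplicated().sum()),
    "dtypes": df.dtypes.astype(str).to_dict(),
}

# Clean: drop duplicates, then fix types and inconsistent text.
cleaned = (
    df.drop_duplicates()
    .assign(
        # Strip the currency symbol and convert to a number.
        price=lambda d: pd.to_numeric(
            d["price"].str.replace("$", "", regex=False), errors="coerce"
        ),
        # Parse dates; impossible dates such as 2021-02-30 become NaT.
        signup_date=lambda d: pd.to_datetime(
            d["signup_date"], format="%Y-%m-%d", errors="coerce"
        ),
        # Standardize country codes to upper case.
        country=lambda d: d["country"].str.upper(),
    )
    .reset_index(drop=True)
)

# Fill the remaining missing prices with the median (an assumption).
cleaned["price"] = cleaned["price"].fillna(cleaned["price"].median())
```

Printing `issues` before cleaning and again after rebuilding it from `cleaned` is a simple way to document in the notebook what each step changed.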

Necessary Deliverables

The following deliverables should be pushed to your GitHub repo for this chapter.

  • A cleaned CSV data file containing the results of your data wrangling work.
  • A Jupyter Notebook (data-wrangling.ipynb) containing all Python code and commands used in the importing, cleaning, manipulation, and exporting of your data set.
  • A README.md file containing a detailed explanation of the process followed in the importing, cleaning, manipulation, and exporting of your data as well as your results, obstacles encountered, and lessons learned.

Suggested Ways to Get Started

  • Examine the data and try to understand what the fields mean before diving into data cleaning and manipulation methods.
  • Break the project down into steps: use the topics covered in the lessons to form a checklist, add anything else you can think of that may be wrong with your data set, and then work through the checklist.
  • Use the tools in your toolkit: your knowledge of Python, data structures, Pandas, and data wrangling.
  • Work through the lessons in class & ask questions when you need to! Think about adding relevant code to your project each night, instead of, you know... procrastinating.
  • Commit early and commit often. Don’t be afraid of doing something incorrectly, because you can always roll back to a previous version.
  • Consult documentation and resources provided to better understand the tools you are using and how to accomplish what you want.
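
One way to turn the checklist idea into code is to write each item as a small function and run the functions in order, recording how the shape of the data changes at every step. This is only a sketch: the three steps, the `CHECKLIST` name, and the `run_checklist` helper are hypothetical, and your own checklist will depend on what is wrong with your data set.

```python
import pandas as pd


def strip_column_names(df: pd.DataFrame) -> pd.DataFrame:
    """Remove leading and trailing whitespace from column names."""
    return df.rename(columns=lambda name: name.strip())


def drop_empty_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows in which every value is missing."""
    return df.dropna(how="all")


def drop_duplicate_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows that exactly repeat an earlier row."""
    return df.drop_duplicates()


# The checklist: a description paired with the function that handles it.
CHECKLIST = [
    ("strip whitespace from column names", strip_column_names),
    ("drop completely empty rows", drop_empty_rows),
    ("drop duplicate rows", drop_duplicate_rows),
]


def run_checklist(df, checklist):
    """Apply each step in order and log the shape before and after."""
    log = []
    for description, step in checklist:
        before = df.shape
        df = step(df)
        log.append((description, before, df.shape))
    return df, log


# Tiny made-up frame with untidy column names, a duplicate, and an empty row.
raw = pd.DataFrame({" a ": [1, 1, None], "b ": ["x", "x", None]})
tidy, log = run_checklist(raw, CHECKLIST)

for description, before, after in log:
    print(f"{description}: {before} -> {after}")
```

The log doubles as raw material for the README, since it records which steps were run and what effect each one had.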

Useful Resources

Contributors

daniloxxv, evankiske, ta-data-mex
