This repository provides the data in support of the course Advanced Pandas for Data Analytics, provided by Cloud Academy.
I suggest following these steps to replicate the course on your local machine. Open your favourite terminal emulator, and then:
git clone https://github.com/cloudacademy/advanced-pandas-for-data-analytics.git
cd advanced-pandas-for-data-analytics
mkvirtualenv -p python3 <NAME_ENV>
pip install -r requirements.txt
jupyter notebook
You are now ready to get your hands dirty: enjoy!
This course uses the Bike Sharing Dataset, which is available in the data folder. You can find more information on the data here.
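As a minimal sketch of how the data could be loaded, the helper below reads every CSV file found in a folder into a pandas DataFrame. The function name `load_csv_files` is hypothetical, and the sketch assumes the files in the data folder are plain CSV files; it deliberately does not assume any particular file name.

```python
from pathlib import Path

import pandas as pd


def load_csv_files(data_dir="data"):
    """Read every CSV file in data_dir into a dict keyed by file name (without extension).

    Hypothetical helper: it assumes the folder contains plain CSV files.
    """
    frames = {}
    for csv_path in sorted(Path(data_dir).glob("*.csv")):
        frames[csv_path.stem] = pd.read_csv(csv_path)
    return frames


if __name__ == "__main__":
    # Print the name and shape of each file that was found.
    for name, frame in load_csv_files().items():
        print(name, frame.shape)
```

Run from the repository root, this should list whichever CSV files the data folder contains, together with their row and column counts.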
Among its many fields, the following variables are worth describing:
- workingday: 1 if the day is neither a weekend nor a holiday, otherwise 0.
- weathersit: weather situation, encoded as one of four values:
  - 1: Clear, Few clouds, Partly cloudy;
  - 2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist;
  - 3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds;
  - 4: Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog.
- temp: Normalized temperature in Celsius. The values are divided by 41 (max).
- atemp: Normalized feeling temperature in Celsius. The values are divided by 50 (max).
- hum: Normalized humidity. The values are divided by 100 (max).
- windspeed: Normalized wind speed. The values are divided by 67 (max).
- casual: count of casual users.
- registered: count of registered users.
- cnt: count of total rental bikes including both casual and registered.
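The field descriptions above can be sketched in pandas as follows. This is only an illustration under stated assumptions: the two rows are toy values made up for the example (not taken from the dataset), and the `WEATHER_LABELS` names, the `_raw` column suffix, and the `add_readable_columns` helper are hypothetical. The scaling factors are the maxima listed above.

```python
import pandas as pd

# Toy rows with made-up values, for illustration only (not real dataset records).
sample = pd.DataFrame({
    "workingday": [1, 0],
    "weathersit": [1, 3],
    "temp": [0.5, 0.25],
    "atemp": [0.5, 0.25],
    "hum": [0.8, 0.4],
    "windspeed": [0.0, 0.5],
    "casual": [3, 10],
    "registered": [13, 20],
    "cnt": [16, 30],
})

# Maxima used for normalization, as listed in the field descriptions.
SCALE = {"temp": 41, "atemp": 50, "hum": 100, "windspeed": 67}

# Hypothetical short labels summarizing the four weathersit categories.
WEATHER_LABELS = {
    1: "clear",
    2: "mist",
    3: "light precipitation",
    4: "heavy precipitation",
}


def add_readable_columns(df):
    """Return a copy with de-normalized columns and a weather label column."""
    out = df.copy()
    for column, max_value in SCALE.items():
        # Undo the normalization: the stored value is the raw value divided by max.
        out[f"{column}_raw"] = out[column] * max_value
    out["weather_label"] = out["weathersit"].map(WEATHER_LABELS)
    return out


readable = add_readable_columns(sample)

# cnt is described as the sum of casual and registered users.
counts_consistent = (sample["casual"] + sample["registered"] == sample["cnt"]).all()
```

Applying `add_readable_columns` to a DataFrame loaded from the data folder should work the same way, provided the column names match the ones described above.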