A project focusing on data extraction, exploration and visualization of the impact of COVID-19 worldwide, with a closer look at the US. The code is available in the Jupyter notebook (.ipynb file).
The goal has been to work with "real-life" datasets, apply thorough data cleaning and exploration, and eventually identify emerging trends of the coronavirus pandemic.
Workflow followed
The process I followed has generally been:
- Data import
- Data cleaning, reshaping, quality checks
- Exploratory Data Analysis (EDA)
- Final interactive visualization in Tableau
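The cleaning and reshaping step can be sketched roughly as below. This is a minimal illustration, not the notebook's actual code: it assumes the case data arrives in a wide layout (one column per date, with `Province/State`, `Country/Region`, `Lat` and `Long` as identifier columns), and the helper name `to_long` and all numbers in the sample frame are made up for the example.

```python
import pandas as pd

# Tiny stand-in for a wide time-series table.
# Values are illustrative only, NOT real case counts.
wide = pd.DataFrame({
    "Province/State": [None, "Hubei", "Beijing"],
    "Country/Region": ["Italy", "China", "China"],
    "Lat": [41.87, 30.98, 40.18],
    "Long": [12.57, 112.27, 116.41],
    "1/22/20": [0, 10, 1],
    "1/23/20": [0, 15, 2],
})


def to_long(df: pd.DataFrame) -> pd.DataFrame:
    """Reshape wide date columns to long format and aggregate per country.

    Hypothetical helper; the column names are assumptions about the input.
    """
    id_cols = ["Province/State", "Country/Region", "Lat", "Long"]

    # One row per (location, date) instead of one column per date.
    long_df = df.melt(id_vars=id_cols, var_name="date", value_name="confirmed")
    long_df["date"] = pd.to_datetime(long_df["date"], format="%m/%d/%y")

    # Sum provinces/states up to country level.
    country = long_df.groupby(
        ["Country/Region", "date"], as_index=False
    )["confirmed"].sum()

    # Basic quality checks: no missing and no negative counts.
    assert country["confirmed"].notna().all()
    assert (country["confirmed"] >= 0).all()

    return country.sort_values(["Country/Region", "date"]).reset_index(drop=True)


long_cases = to_long(wide)
print(long_cases)
```

The long format is usually easier to plot with matplotlib/seaborn and to export for Tableau, since each row carries a single observation.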
For data import and mining I have used the wget and pandas libraries, while for EDA I have also used matplotlib, seaborn and folium. The COVID-19 data is extracted from the Johns Hopkins University GitHub page, while the country population data is extracted from the latest available United Nations and World Bank datasets.
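Combining case counts with population data typically means joining the two sources on country name and normalizing. The sketch below shows one way this could look, under assumptions: the column names (`country`, `confirmed`, `population`), the placeholder countries and all values are hypothetical, and the notebook may do this differently.

```python
import pandas as pd

# Placeholder data, NOT real figures.
cases = pd.DataFrame({
    "country": ["A", "B", "C"],
    "confirmed": [500, 1200, 30],
})
population = pd.DataFrame({
    "country": ["A", "B"],
    "population": [1_000_000, 4_000_000],
})

# Left join keeps every country from the case data; the indicator column
# records which rows found no population match.
merged = cases.merge(population, on="country", how="left", indicator=True)

# Quality check: list countries without a population match, which often
# happens when sources spell country names differently.
unmatched = merged.loc[merged["_merge"] == "left_only", "country"].tolist()
print("No population match for:", unmatched)

# Cases per 100,000 inhabitants; stays NaN where population is missing.
merged["cases_per_100k"] = merged["confirmed"] * 100_000 / merged["population"]
print(merged[["country", "confirmed", "population", "cases_per_100k"]])
```

Inspecting `unmatched` before computing per-capita figures helps avoid silently dropping countries from the later visualizations.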