COVID-19 Trend and Analysis
The project is primarily a comparative study of algorithms on global real-time COVID-19 data. It applies prediction principles from data science and uses supervised machine learning algorithms for time-series forecasting. The algorithms proposed for this project are:
- Decision Tree Induction (Keras/Python)
- CNN Regressor (Keras/Python)
- LSTM (Keras/Python)
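
Supervised algorithms like the ones listed above need input/target pairs, so a time series is usually reframed as sliding windows before training. The following is a minimal sketch of that reframing in plain Python, not the project's actual preprocessing: the helper name `make_windows`, the window length of 3, and the sample counts are all assumptions made for illustration.

```python
from typing import List, Tuple


def make_windows(series: List[float], window: int) -> Tuple[List[List[float]], List[float]]:
    """Turn a univariate series into (input window, next value) pairs.

    Hypothetical helper for illustration; not part of the project code.
    """
    if window < 1 or window >= len(series):
        raise ValueError("window must be at least 1 and shorter than the series")
    inputs, targets = [], []
    for start in range(len(series) - window):
        # The model sees `window` consecutive values...
        inputs.append(series[start:start + window])
        # ...and is trained to predict the value that follows them.
        targets.append(series[start + window])
    return inputs, targets


# Made-up cumulative counts, for illustration only (not taken from the dataset).
daily_confirmed = [1, 2, 4, 7, 11, 16]
X, y = make_windows(daily_confirmed, window=3)
```

Each row of `X` holds three consecutive days and the matching entry of `y` is the following day's value. A decision tree could likely consume such rows directly, while an LSTM or CNN in Keras would typically need them reshaped to include a feature dimension.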
Global real-time COVID-19 data link: https://data.humdata.org/dataset/novel-coronavirus-2019-ncov-cases
Countries selected for cross-analysis and comparison: China and Canada.
Provinces selected for cross-analysis and comparison: Hebei (China) and Ontario (Canada).
Summary: This project analyzes the WHO dataset and builds predictive models using several machine learning regression approaches: decision trees, LSTM, and a CNN regressor. The coronavirus disease outbreak began in 2019 in Wuhan, China. The primary data has 91 entries, covering various countries by date, with confirmed cases, confirmed deaths, and recovered cases. The dataset is split 70:30, with 70 percent used for training and validation and 30 percent used for testing. The key objective is to apply different artificial intelligence approaches to predict confirmed cases, confirmed deaths, and recovered cases. Various visualization techniques are then used to draw meaningful inferences from the models' predictions and to analyze the results. The prediction models, such as LSTM and CNN, are evaluated using metrics such as the R2 score and Mean Squared Error.
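
The 70:30 split and the two evaluation measures mentioned in the summary can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the project's code: the split is assumed to be chronological (the first 70 percent of rows for training and validation, the remainder for testing), the function names are hypothetical, and the sample values are made up.

```python
from typing import List, Sequence, Tuple


def chronological_split(rows: Sequence, train_fraction: float = 0.7) -> Tuple[list, list]:
    """Split rows in order, without shuffling (an assumption; the source does not say)."""
    cut = int(len(rows) * train_fraction)
    return list(rows[:cut]), list(rows[cut:])


def mean_squared_error(y_true: List[float], y_pred: List[float]) -> float:
    """Average of squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: List[float], y_pred: List[float]) -> float:
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


# 91 placeholder rows standing in for the 91 dataset entries.
rows = list(range(91))
train_rows, test_rows = chronological_split(rows)

# Made-up actual and predicted values, for illustration only.
actual = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 8.0]
mse = mean_squared_error(actual, predicted)
r2 = r2_score(actual, predicted)
```

With 91 entries, this split yields 63 rows for training and validation and 28 for testing. A lower Mean Squared Error and an R2 score closer to 1 both indicate a better fit, which is how the models can be compared on the held-out 30 percent.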