Emanuele De Sanctis
Data Analysis, Barcelona, 31-07-2019
The objective of this project is to conduct a general analysis of wildfires in the U.S. and to determine whether factors such as the time of year and the cause of the fire influence the area burned by a wildfire. In particular, I set out to answer two questions:
- Are wildfires more dangerous in the summer? That is, is the mean area burned by a wildfire larger when the wildfire happens during the summer?
- Is the area burned by a wildfire related to its cause?
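One way to frame these two questions statistically is as a comparison of group means: a two-sample test for summer versus the rest of the year, and a one-way ANOVA across fire causes. The sketch below is only an illustration under those assumptions, not necessarily the tests used in this project. The function names and the sample values are hypothetical and do not come from the dataset.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of each sample mean (sample variance / n)
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of values."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical burned areas, made up for illustration only
summer_area = [12.0, 30.5, 8.0, 55.0, 21.0]
other_area = [3.0, 9.5, 4.0, 15.0, 6.5]
t_stat, dof = welch_t(summer_area, other_area)

# Hypothetical burned areas grouped by (made-up) cause
by_cause = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat = anova_f(by_cause)
```

A positive `t_stat` would point toward a larger mean burned area in summer, and a large `f_stat` toward differences between causes. Turning either statistic into a p-value requires the corresponding t or F distribution, which a statistics library can provide.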
The dataset I used was downloaded from Kaggle. It contains 1.88 million observations spanning 24 years, from 1992 to 2015, acquired from the reporting systems of federal, state, and local fire organizations.
Citation: Short, Karen C. 2017. Spatial wildfire occurrence data for the United States, 1992-2015 [FPA_FOD_20170508]. 4th Edition. Fort Collins, CO: Forest Service Research Data Archive. https://doi.org/10.2737/RDS-2013-0009.4
The dataset is very detailed, so I first cleaned it to remove the information that was not required for my analysis. I then conducted a general exploratory analysis using basic plotting to better understand how the phenomenon evolved over the years and to get a clearer idea of what kind of information I was working with. Once that was done, I used statistical analysis to answer the questions stated earlier. Finally, I created a choropleth map to show which areas of the U.S. have had more fires per square kilometer. The choropleth map can be found here.
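The quantity behind the choropleth map, fires per square kilometer, can be sketched as a count of fires per region divided by that region's area. This is a minimal illustration assuming the regions are identified by a code on each fire record. The function name, the region codes, and the areas below are hypothetical placeholders, not values from the dataset.

```python
from collections import Counter


def fires_per_km2(fire_regions, area_km2):
    """Return fire density (fires per square kilometer) for each region.

    fire_regions: iterable with one region code per recorded fire
    area_km2: dict mapping region code to its area in square kilometers
    """
    counts = Counter(fire_regions)
    # Regions with no recorded fires get a density of 0.0
    return {region: counts[region] / area for region, area in area_km2.items()}


# Hypothetical region codes and areas, for illustration only
records = ["AA", "AA", "BB"]
areas = {"AA": 100.0, "BB": 50.0, "CC": 80.0}
density = fires_per_km2(records, areas)
```

Normalizing by area rather than mapping raw counts keeps large regions from dominating the map simply because of their size.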
I created a Trello board for general planning and a repository for backup and version control, but since the data was all contained in a single dataset, there wasn't much need for further organization.