This project is an attempt at an open-source competition to analyze and predict water well functionality in Tanzania. The work could also inform suggestions for infrastructure changes, geographical decisions, and population focus. Ideally, we would like to identify, with a reasonable degree of certainty, areas or well types that are especially functional or unusually non-functional.
The problem is, of course, access to safe and reliable drinking water. It is quite literally a matter of life and death, and a correct analysis of it would be invaluable. Population growth, coupled with fewer and fewer wells being built while more and more break, could spell disaster for Tanzania if not properly addressed.
The data used for this project was generously provided by the Tanzanian Ministry of Water and by Taarifa, an open-source infrastructure project helping to bring water, as well as awareness, to the nation of Tanzania. The data covers just under 60,000 water wells across the country: their status (functional versus non-functional) as well as other attributes such as source type, age, and those responsible for their installation. This is a ternary classification problem, and there are multiple data sets to cross-examine.
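As a rough illustration of cross-examining multiple data sets, the sketch below joins a table of well features to a table of well statuses and counts the classes. It assumes the two tables share an `id` column and that the label column is called `status`; those names, and the three toy rows, are invented for illustration and are not taken from the actual data.

```python
import csv
import io
from collections import Counter

# Toy stand-ins for two separate files. Column names and rows are
# hypothetical, chosen only to make the example runnable.
FEATURES_CSV = """id,source_type,construction_year
1,spring,1999
2,borehole,2008
3,river,1985
"""
LABELS_CSV = """id,status
1,functional
2,non-functional
3,functional
"""


def join_on_id(features_text, labels_text, key="id"):
    """Attach each well's status to its feature row, matching on `key`."""
    labels = {
        row[key]: row["status"]
        for row in csv.DictReader(io.StringIO(labels_text))
    }
    joined = []
    for row in csv.DictReader(io.StringIO(features_text)):
        if row[key] in labels:  # skip wells with no recorded status
            joined.append({**row, "status": labels[row[key]]})
    return joined


wells = join_on_id(FEATURES_CSV, LABELS_CSV)
status_counts = Counter(w["status"] for w in wells)
```

Checking `status_counts` before modeling is a cheap way to see how unbalanced the classes are, which affects how much an accuracy figure can be trusted.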
Our model scored just over 81% accuracy, allowing us to identify well functionality with a reasonable degree of certainty. We also tested feature importance and found that geographical location plays a very significant role in well functionality: the further north and west a well is, the more likely it is to be functional. This is most likely due to the proximity of Tanzania's northwestern border to Lake Victoria, Africa's largest lake.
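For readers who want to sanity-check these two kinds of result, here is a minimal sketch of how accuracy is computed and how a geographic pattern could be inspected by hand. It is not the model or the feature-importance method used in this project. The field names `latitude`, `longitude`, and `status` are assumptions, and the four rows are made-up toy values, not project data.

```python
from statistics import median


def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def functional_rate_by_quadrant(wells):
    """Split wells at the median latitude/longitude and report the share
    of functional wells in each quadrant that contains any wells."""
    lat_mid = median(w["latitude"] for w in wells)
    lon_mid = median(w["longitude"] for w in wells)
    totals, working = {}, {}
    for w in wells:
        # Tanzania lies south of the equator, so a larger (less negative)
        # latitude is further north; a smaller longitude is further west.
        ns = "north" if w["latitude"] > lat_mid else "south"
        ew = "west" if w["longitude"] < lon_mid else "east"
        quad = f"{ns}-{ew}"
        totals[quad] = totals.get(quad, 0) + 1
        working[quad] = working.get(quad, 0) + (w["status"] == "functional")
    return {q: working[q] / totals[q] for q in totals}


# Hypothetical toy rows, for illustration only.
toy_wells = [
    {"latitude": -2.0, "longitude": 31.0, "status": "functional"},
    {"latitude": -3.0, "longitude": 32.0, "status": "functional"},
    {"latitude": -9.0, "longitude": 38.0, "status": "non-functional"},
    {"latitude": -10.0, "longitude": 39.0, "status": "functional"},
]
toy_predictions = ["functional", "non-functional", "non-functional", "functional"]

toy_accuracy = accuracy([w["status"] for w in toy_wells], toy_predictions)
toy_rates = functional_rate_by_quadrant(toy_wells)
```

A quadrant table like this is much cruder than a model-based importance score, but it makes a directional claim about north and west easy to verify against the raw data.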
As far as future work goes, there is still much to be done. We could cross-examine different well failures and the factors they share, and possibly use that to predict and prevent such events. Additionally, we could estimate the average lifespan of a well or pump before it goes dry, which would help in planning infrastructure for future projects and in anticipating inevitable breaks or malfunctions. Lastly, we could determine the ideal person-to-well ratio, as this would save materials from unnecessary overlap, a saving that matters all the more considering how much of the country is currently facing a drought.
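One possible starting point for the person-to-well question is sketched below. It assumes each well record carries a `region`, a `population` served, and a `status`; these field names and the toy rows are hypothetical, and summing per-well population is only a rough proxy for the number of people in a region.

```python
from collections import defaultdict


def people_per_functional_well(wells):
    """Return people per functional well for each region, or None when a
    region has no functional wells at all."""
    population = defaultdict(int)
    functional = defaultdict(int)
    for w in wells:
        population[w["region"]] += w["population"]
        if w["status"] == "functional":
            functional[w["region"]] += 1
    return {
        region: (population[region] / functional[region]
                 if functional[region] else None)
        for region in population
    }


# Hypothetical toy rows, for illustration only.
toy_records = [
    {"region": "A", "population": 300, "status": "functional"},
    {"region": "A", "population": 100, "status": "non-functional"},
    {"region": "B", "population": 250, "status": "non-functional"},
]
ratios = people_per_functional_well(toy_records)
```

Returning `None` rather than dividing by zero keeps regions with no working wells visible in the output, and those are arguably the regions that matter most.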
Email: [email protected] Github: luca-caruccio Linkedin: Luca Caruccio
-Introduction and Overview
-Business Problems
-Data Methods
-Results
-Conclusions
-Future Work and Next Steps