
berrli / environmental-insights


Code repository for Environmental Insights, a Python package for accessing and analysing ambient air pollution concentration data.

Home Page: https://arxiv.org/abs/2403.03664

License: GNU General Public License v3.0



Environmental Insights

Environmental Insights is a Python package for downloading and visualising air pollution concentration data in the UK and globally. Alongside the downloaded data, a set of functions is provided to manipulate the air pollution concentrations and explore air pollution futures. The package is a companion to the paper entitled "Environmental Insights: Democratizing Access to Ambient Air Pollution Data and Predictive Analytics with an Open-Source Python Package", with the following abstract:
Ambient air pollution is a pervasive issue with wide-ranging effects on human health, ecosystem vitality, and economic structures. Utilizing data on ambient air pollution concentrations, researchers can perform comprehensive analyses to uncover the multifaceted impacts of air pollution across society. To this end, we introduce Environmental Insights, an open-source Python package designed to democratize access to air pollution concentration data. This tool enables users to easily retrieve historical air pollution data and employ a Machine Learning model for forecasting potential future conditions. Moreover, Environmental Insights includes a suite of tools aimed at facilitating the dissemination of analytical findings and enhancing user engagement through dynamic visualizations. This comprehensive approach ensures that the package caters to the diverse needs of individuals looking to explore and understand air pollution trends and their implications.

This is a continuation of the work started in our recently published paper in Environment and Planning B: Urban Analytics and City Science, "Estimating annual ambient air pollution using structural properties of road networks".

Downloading the required data

If you have accessed this work from Github, please read the associated paper that describes the use of this package and its purpose, available here:

Due to GitHub file size limitations, all of the models and data needed to use this package are hosted on Google Drive. The link to the Google Drive folder is: https://drive.google.com/drive/folders/18ZLO8XqtFp3c4WrUJVfSH0fmAXFmL8il?usp=sharing

The Google Drive folder holds the outputs of a range of research studies, and the package is designed to be used in conjunction with that data and those models. The easiest way to integrate the data into the package is to put the corresponding Google Drive contents into the relevant package folder. The mapping between the two is as follows:

  • environmental_insights_data/air_pollution/uk_complete_set : Data for A Framework for Scalable Ambient Air Pollution Concentration Estimation
  • environmental_insights_data/air_pollution/global_complete_set : Data for A Data-Driven Supervised Machine Learning Approach to Estimating Global Ambient Air Pollution Concentrations With Associated Prediction Intervals
  • environmental_insights_data/feature_vector/uk_typical_day : Data for Environmental Insights
    • This directory contains the feature vector data used to predict air pollution concentrations across England using the typical day framework proposed in this package's companion paper.
  • environmental_insights_data/feature_vector/supporting_data : Supporting Data for Environmental Insights
    • This directory contains supporting datasets not generated by the research but important to running the tutorial and some other functions of the package.
  • environmental_insights_models/uk : Models/UK
    • This directory contains the data-driven supervised machine learning models for England.
  • environmental_insights_models/global : Models/Global
    • This directory contains the global data-driven supervised machine learning models.
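
Once the Google Drive contents have been copied across, a quick check that every folder in the mapping above is present and non-empty can save a confusing failure later. The following is a minimal sketch, not part of the package: the folder names are taken from the list above, while the helper name missing_data_dirs and the assumption that the folders sit directly under the directory you pass in are illustrative only.

```python
from pathlib import Path

# Folder names copied from the mapping above.
EXPECTED_DIRS = [
    "environmental_insights_data/air_pollution/uk_complete_set",
    "environmental_insights_data/air_pollution/global_complete_set",
    "environmental_insights_data/feature_vector/uk_typical_day",
    "environmental_insights_data/feature_vector/supporting_data",
    "environmental_insights_models/uk",
    "environmental_insights_models/global",
]


def missing_data_dirs(package_root):
    """Return the expected folders that are absent or empty under package_root.

    Hypothetical helper: it assumes the folders live directly under
    package_root, which may not match your local layout.
    """
    root = Path(package_root)
    missing = []
    for rel in EXPECTED_DIRS:
        folder = root / rel
        # A folder counts as missing if it does not exist or has no entries.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(rel)
    return missing
```

Calling missing_data_dirs(".") from the directory that holds the two top-level folders returns an empty list when everything is in place, and otherwise lists the folders that still need to be downloaded.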

Recommended Use

The code has been validated with Python version 3.9.12. You can create a conda environment with this version using the command:

    conda create -n environmental_insight_enviro python=3.9.12

  • Step 1: Ensure that JupyterLab is installed. Install instructions are available here.
  • Step 2: Run the code in "package_installation.ipynb" to ensure all of the required packages are available.
  • Step 3: Run through the "tutorial.ipynb" file, which explains the basic concepts of the package.
  • Step 4: Look through the files in "Documentation" that describe the complete functionality of the package.

The tutorial contains a code snippet that will download the required packages for the software via the requirements.txt file. The Python packages that are required are:

  • lightgbm (3.3.3)
  • geopandas (0.14.1)
  • pandas (2.1.3)
  • scipy (1.11.4)
  • matplotlib (3.8.2)
  • overpy (0.6)
  • shapely (2.0.2)
  • pyarrow (14.0.1)
  • pyogrio (0.7.2)

While the code may work with other versions of these packages, these are the versions that testing was conducted on.
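
To see how an existing environment compares with the tested versions, the installed versions can be read with the standard library's importlib.metadata. This is a minimal sketch rather than something shipped with the package: the version numbers are copied from the list above, and the helper names compare_versions and mismatches are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list of tested packages above.
TESTED_VERSIONS = {
    "lightgbm": "3.3.3",
    "geopandas": "0.14.1",
    "pandas": "2.1.3",
    "scipy": "1.11.4",
    "matplotlib": "3.8.2",
    "overpy": "0.6",
    "shapely": "2.0.2",
    "pyarrow": "14.0.1",
    "pyogrio": "0.7.2",
}


def compare_versions(tested=TESTED_VERSIONS, lookup=version):
    """Map each package name to (tested, installed).

    installed is None when the package is not found in the environment.
    """
    report = {}
    for name, wanted in tested.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


def mismatches(report):
    """Keep only the packages whose installed version differs from the tested one."""
    return {name: pair for name, pair in report.items() if pair[0] != pair[1]}


if __name__ == "__main__":
    for name, (wanted, installed) in mismatches(compare_versions()).items():
        print(f"{name}: tested with {wanted}, found {installed}")
```

A mismatch is not necessarily a problem, since other versions may work; the report only highlights where an environment departs from the tested configuration.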

These packages can be installed via code provided in the "package_installation.ipynb" file.

The recommended way to use this package is within a Jupyter notebook, which is the format the tutorial is written in. The tutorial is available in the file "tutorial.ipynb". Using Conda and JupyterLab for this is also recommended.

Aspects of the package

There are three critical components to the package:

  • Data: The data aspect of the work provides access to air pollution concentration data, both in the UK (at a 1km x 1km hourly resolution) and globally (at a 0.25-degree hourly resolution). Feature vector data is also included for both the UK and global models, so you can see the environmental conditions the models were developed on and that led to the predictions they made.
  • Models: The models aspect of the work provides access to the trained LightGBM models. With the feature vector data, you can predict air pollution concentrations. Further, the feature vectors can be changed to explore hypothetical situations such as "What would happen to the air pollution if the average wind speed doubled on a Friday in June in London?"
  • Functions: A set of supporting functions has been created to simplify the package's use. These cover accessing the data and models, alongside visualisation and making predictions.
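
The "what if" idea described in the list above comes down to copying a feature vector, changing one feature, and comparing the two sets of predictions. The sketch below illustrates that pattern only; it does not use the package's own functions. The helper what_if, the column name wind_speed, and the StandInModel class are all hypothetical, and the stand-in's prediction rule is made up so the example runs without the downloaded models. Any object with a predict method that accepts a pandas DataFrame could be passed in its place.

```python
import pandas as pd


def what_if(model, features, column, factor):
    """Predict on the original features and on a copy with one column scaled.

    Hypothetical helper: `model` is any object exposing predict(DataFrame).
    """
    scenario = features.copy()  # leave the caller's data untouched
    scenario[column] = scenario[column] * factor
    return pd.DataFrame(
        {
            "baseline": model.predict(features),
            "scenario": model.predict(scenario),
        },
        index=features.index,
    )


class StandInModel:
    """Placeholder model; NOT one of the package's trained LightGBM models."""

    def predict(self, X):
        # Made-up rule for illustration: concentration falls as wind rises.
        return (40.0 - 2.0 * X["wind_speed"]).to_numpy()


if __name__ == "__main__":
    # "wind_speed" is an assumed column name, not taken from the package.
    features = pd.DataFrame({"wind_speed": [1.0, 5.0]})
    print(what_if(StandInModel(), features, "wind_speed", 2.0))
```

Returning the baseline and scenario predictions side by side keeps the comparison explicit, and working on a copy means the original feature vectors can be reused for further scenarios.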

All visualisations made through the functions within the package are stored in the directory "environmental_insights_visulisations".

Testing

The tests for the package can be found in the "tests" directory and use Python's built-in unittest framework. To make the tests easier to run, they are also included in the test_workbook.ipynb Jupyter notebook.

Documentation

The documentation for the project is included in the directory "Documentation". It provides an overview of the different functions included in the package.

Author

Liam Berrisford
