
Processing field data from ODK to OpenStreetMap format, and other field data collection utils.

License: GNU Affero General Public License v3.0

Python 95.17% Makefile 2.00% Dockerfile 2.50% Shell 0.22% kvlang 0.10%
opendatakit openstreetmap

osm-fieldwork's Introduction

OSM Fieldwork


Processing field data from ODK to OpenStreetMap format, and other field utils.


📖 Documentation: https://hotosm.github.io/osm-fieldwork/

🖥️ Source Code: https://github.com/hotosm/osm-fieldwork


History 📖

The History of the OSM Fieldwork Project begins with Rob Savoye, Senior Technical Lead at Humanitarian OpenStreetMap Team.

In 2010, Rob's rural volunteer fire department faced the challenge of outdated, giant paper mapbooks with incomplete information. (Rob is also a long-time free software developer for the GNU project, a firefighter, a climber, and disaster tech support.) Google had limited address and remote road coverage, and the lack of cell service made it impossible to rely on for verification. Determined to find a solution, Rob turned to OpenStreetMap (OSM). His first step was importing building footprints and addresses into OSM, which greatly helped the fire department locate places quickly and easily. Response time was nearly halved.

Since most of the roads were dirt jeep trails, Rob then ground-truthed the highway and trail data in OSM. Over several years, he added precise information about all the highways in his area, enabling the fire department to determine the appropriate response vehicles for each scenario. Once the fire district maps were improved, he expanded his efforts to the remote regions of Colorado and a few neighboring states, which proved invaluable during large wildland fires.

Ground-truthing, done in the field with mobile devices, became an integral part of his work. To streamline data collection, Rob relied heavily on ODK and eventually created additional software for the data processing, which had previously been time-consuming and tedious. Now, transferring data from his phone to OSM requires minimal effort. To this day, Rob continues his field mapping while continuously enhancing the software used in the project.

About OSM Fieldwork

Osm-Fieldwork is a project for processing data collected using ODK into OpenStreetMap format. It includes several utility programs that automate parts of the data flow, like creating satellite imagery basemaps and data extracts from OpenStreetMap so they can be used with ODK Collect. Many of these steps are currently a manual process. All of the programs in osm-fieldwork are designed to function as the backend of a webpage, but also to work standalone and offline. The standalone programs are simple command line programs run in a terminal. They were originally created for producing emergency response maps in the Western United States, which is explained in this talk from SOTM-US 2022 titled OSM For Firefighting. Much of the tech and usage is explained in these tech briefs. These programs are now part of the backend for the Field Mapping Tasking Manager project at HOT.

Installation

To install osm-fieldwork, you can use pip. Here are two options:

  • Directly from the main branch:
pip install git+https://github.com/hotosm/osm-fieldwork.git
  • Latest on PyPi:
pip install osm-fieldwork

Configure

Osm-Fieldwork can be configured using a simple config file ($HOME/.osm-fieldwork) in your home directory, or using environment variables.

Config file

The config file is used to store the credentials to access an ODK Central server. You must have an account on the Central server of course for this to work. That file looks like this:

url=https://foo.org
user=[email protected]
passwd=arfood
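
As an illustration of the file format above, here is a minimal sketch of reading such key=value lines in Python. The helper names `parse_config` and `load_config` are hypothetical and are not part of osm-fieldwork; the project's own loader may behave differently.

```python
from pathlib import Path


def parse_config(text: str) -> dict[str, str]:
    """Parse simple key=value lines into a dict, skipping blank lines."""
    settings: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def load_config(path: Path = Path.home() / ".osm-fieldwork") -> dict[str, str]:
    """Read the config file from the home directory (path taken from the text above)."""
    return parse_config(path.read_text())
```

Calling `parse_config("url=https://foo.org\npasswd=arfood")` would yield a dict with the keys `url` and `passwd`.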

Environment Variables

  • LOG_LEVEL

If present, will change the log level. Defaults to DEBUG.

  • ODK_CENTRAL_URL

The URL for an ODKCentral server to connect to.

  • ODK_CENTRAL_USER

The user for ODKCentral.

  • ODK_CENTRAL_PASSWD

The password for ODKCentral.

  • ODK_CENTRAL_SECURE

If set to False, will allow insecure connections to the ODKCentral API. Else defaults to True.
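
The defaults described above can be sketched as follows. This is only an illustration of the documented behaviour, not the project's code: the function name `central_settings` is hypothetical, and treating the value "False" case-insensitively is an assumption.

```python
import os
from collections.abc import Mapping


def central_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect the documented variables, applying the documented defaults."""
    secure_raw = env.get("ODK_CENTRAL_SECURE", "True")
    return {
        "log_level": env.get("LOG_LEVEL", "DEBUG"),  # defaults to DEBUG
        "url": env.get("ODK_CENTRAL_URL"),
        "user": env.get("ODK_CENTRAL_USER"),
        "passwd": env.get("ODK_CENTRAL_PASSWD"),
        # Assumption: only the literal "false" (any case) disables verification.
        "secure": secure_raw.strip().lower() != "false",
    }
```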

Using the Container Image

  • osm-fieldwork scripts can be used via the pre-built container images.
  • These images come with all dependencies bundled, so are simple to run.

Run a specific command:

docker run --rm -v $PWD:/data ghcr.io/hotosm/osm-fieldwork:latest json2osm <flags>

Run interactively (to use multiple commands):

docker run --rm -it -v $PWD:/data ghcr.io/hotosm/osm-fieldwork:latest

Note: the output directory should always be /data/... to persist data.

Utility Programs

These programs are more fully documented in this file. This is just a short overview.

CSVDump.py

This program converts the data collected from ODK Collect into the proper OpenStreetMap tagging schema. The conversion is controlled by a YAML file, so it is easy to modify for other projects. The output is two files, one of which is in OSM XML format and suitable for OSM. No converted data should ever be uploaded to OSM without validating the conversion in JOSM. For efficient conversion from ODK to OSM, it's best to use the XLSForm library as templates, as everything is designed to work together.

basemapper.py

This program creates basemaps of satellite imagery, and produces files in mbtiles format for ODK Collect and sqlitedb files for Osmand. Imagery basemaps are very useful when the map data is lacking, or in ODK Collect for selecting a location other than where you are standing. In Osmand, the basemaps are very useful for navigation where the map data is lacking. Imagery can be downloaded from ESRI, Bing, USGS Topo maps, or Open Aerial Map.

make_data_extract.py

This program makes data extracts from OpenStreetMap data. Multiple input sources are supported: a local postgresql database, or the HOT maintained Underpass database.

json2osm

odk2csv.py, odk2geojson.py, odk2osm.py

These programs are used when working offline for extended periods. They convert the ODK XML format on your mobile device into the same CSV format used for submissions downloaded from ODK Central, or the JSON format also from Central.

odk_client.py

This program is a simple command line client to an ODK Central server. This allows you to list projects, appusers, tasks, and submissions. You can also delete projects, tasks, and appusers, but this should only be used by developers as it does direct database access, and you could lose all your data.

filter_data.py

This program is used to support humanitarian data models. It extracts the tags and values from the data models document developed by HOT, and compares those to the taginfo database to help fine tune what data goes into OSM or the private output data. This is to avoid flooding OSM with obscure tags that aren't supported by the community. It also filters data extracts so they work with ODK Collect.

osm2favorites.py

This is a silly program, but it takes a GeoJson file, usually an OSM data extract and generates a GPX file with styling for OsmAnd. This is useful when ground-truthing map data, as it can be used for navigating to those areas.
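
To make the idea concrete, here is a minimal sketch of turning GeoJSON point features into GPX waypoints using only the Python standard library. It is not the osm2favorites.py implementation: the function name `geojson_to_gpx` is hypothetical, only Point geometries are handled, and the OsmAnd styling is left out.

```python
import json
import xml.etree.ElementTree as ET


def geojson_to_gpx(geojson_text: str) -> str:
    """Convert GeoJSON Point features into a GPX 1.1 document of waypoints."""
    data = json.loads(geojson_text)
    gpx = ET.Element(
        "gpx",
        version="1.1",
        creator="example-sketch",
        xmlns="http://www.topografix.com/GPX/1/1",
    )
    for feature in data.get("features", []):
        geometry = feature.get("geometry") or {}
        if geometry.get("type") != "Point":
            continue  # this sketch skips lines and polygons
        # GeoJSON stores coordinates as [longitude, latitude].
        lon, lat = geometry["coordinates"][:2]
        wpt = ET.SubElement(gpx, "wpt", lat=str(lat), lon=str(lon))
        name = (feature.get("properties") or {}).get("name")
        if name:
            ET.SubElement(wpt, "name").text = name
    return ET.tostring(gpx, encoding="unicode")
```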

Best Practices and troubleshooting

To ensure the quality of your converted data, here are some best practices to follow:

  • Always validate your conversion in JOSM before uploading to OpenStreetMap.

  • Use the XLSForm library as templates to ensure that your ODK Collect data is compatible with the conversion process.

  • If you're having trouble with the conversion process, try using the utility programs included with Osm-Fieldwork to troubleshoot common issues.

For more info visit the troubleshooting page.

By following these best practices and using the utility programs included with Osm-Fieldwork, you can effectively process data collection from ODK into OpenStreetMap format. However, please note that while Osm-Fieldwork has been tested and used in various projects, it is still in active development and may have limitations or issues that need to be resolved.

XLSForm library

In the XForms directory is a collection of XLSForms that support the new HOT data models for humanitarian data collection. These cover many categories like healthcare, waterpoints, waste distribution, etc... All of these XLSForms are designed to have an efficient mapper data flow, edit existing OSM data, and support the data models.

The data models specify the preferred tag values for each data item, with a goal of both tag completeness and tag correctness. Each data item is broken down into basic and extended survey questions when appropriate.

What is an XLSForm?

An XLSForm is a spreadsheet-based form design tool that allows you to create complex forms for data collection using a simple and intuitive user interface. With XLSForms, you can easily design and test forms on your computer, then deploy them to mobile devices for data collection using ODK Collect or other data collection tools. XLSForms use a simple and structured format, making it easy for you to share and collaborate on form designs with your team or other organizations.

Using the XLSForm Library with Osm-Fieldwork

The XLSForms in the XForms directory of the XLSForm Library have been designed to support the HOT data models and have an efficient mapper data flow. These forms also allow for editing of existing OSM data and support the data models, specifying the preferred tag values for each data item with the goal of both tag completeness and tag correctness.

Here are some examples of how to use the XLSForm Library with Osm-Fieldwork:

  • Download an XLSForm from the XForms directory:
wget https://github.com/hotosm/xlsform/raw/master/XForms/buildings.xls
  • Convert the XForm to OSM XML using CSVDump:

  • Use the resulting OSM XML file with JOSM or other OSM editors to validate and edit the data before uploading it to OpenStreetMap.

Summary

The XLSForm Library is a valuable resource for organizations involved in humanitarian data collection, as it provides a collection of pre-designed forms that are optimized for efficient mapper data flow and tag completeness/correctness. By using the XLSForm Library with Osm-Fieldwork, you can streamline your data collection process and ensure the quality of your data.

Osm-Fieldwork is a powerful tool for processing data collection from ODK into OpenStreetMap format. By following the best practices outlined in this documentation and using the utility programs included with Osm-Fieldwork, you can streamline your data collection process and ensure the quality of your converted data. If you have any questions or issues with osm-fieldwork, please consult the project's documentation or seek support from the project's community.

osm-fieldwork's People

Contributors

azharcodeit, ivangayton, joemaren, krtonga, kshitijrajsharma, linda-njau, mohammadareeb95, ndacyayisenga-droid, neelimagoogly, nrjadkry, owolabioromidayo, petya-kangalova, pre-commit-ci[bot], robsavoye, roseford, rsavoye, spwoodcock, sujanadh, varun2948

osm-fieldwork's Issues

Create program to merge ODK data with OSM

When using an external dataset with ODK Collect that contains existing OSM data, the existing tags need to be conflated with the tags from ODK Collect. This would be after the ODK data is converted to OSM format. Often the existing OSM data may have a few tags already, but ODK is adding more values, so they need to be merged.
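
The merge described in this issue could look roughly like the sketch below. The function name `merge_tags` is hypothetical, and letting non-empty ODK values override existing OSM values on conflict is an assumption; the issue does not say how conflicts should be resolved.

```python
def merge_tags(existing: dict[str, str], collected: dict[str, str]) -> dict[str, str]:
    """Merge tags collected in ODK into the tags already present in OSM."""
    merged = dict(existing)  # copy, so the original OSM tags are not mutated
    for key, value in collected.items():
        # Assumption: empty ODK answers never erase an existing OSM value.
        if value:
            merged[key] = value
    return merged


osm_tags = {"building": "yes", "name": "Town Hall"}
odk_tags = {"building": "civic", "building:levels": "2", "name": ""}
print(merge_tags(osm_tags, odk_tags))
```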

Create ODK Central client

It would be useful to have a CLI program to access the ODK Central server remote REST API for basic operations. It is inefficient to frequently upload XForms with attachments when developing the XForm. A simple CLI to upload the XForm and the attachments, as well as download the submissions, would be much more efficient, and allow other scripts or cron jobs to access the server.

Create tags in data extract when using Overpass

Currently when using Overpass for the data extract, the tags don't make it to the data file. They are in the result data from the query, so just need to be processed into the OGR Feature.
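
As a rough illustration of carrying the tags through, the sketch below converts one node element from an Overpass JSON result into a feature that keeps its tags. It uses a plain GeoJSON-style dict rather than an OGR Feature, and the function name `node_to_feature` is hypothetical.

```python
def node_to_feature(element: dict) -> dict:
    """Turn an Overpass JSON node element into a GeoJSON-style feature with tags."""
    return {
        "type": "Feature",
        "id": element["id"],
        "geometry": {
            "type": "Point",
            # GeoJSON order is [longitude, latitude].
            "coordinates": [element["lon"], element["lat"]],
        },
        # Nodes without tags simply get an empty properties dict.
        "properties": dict(element.get("tags", {})),
    }
```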

XLSXForms need to support select_from_file

When using an external data file, in our case one that contains existing OSM data, the XLSXForms need to extract the tag/values from the existing data and use those to populate the default values when using ODK or Kobo Collect. Otherwise the existing data values need to be reentered manually.

Create XLSForm for buildings

There is a lot of building data in the data models. This varies from healthcare facilities to shops, etc... There is currently no XLSForm for buildings, but I have several older ones I can base this on. Rather than create separate XLSForms for each type of building, this one starts with the function of the building, and using conditionals, changes the survey questions. So much building data is shared between the types that having multiple XLSForms would mean a lot of duplication.

How to deal with tags on points that sit atop buildings?

Tags on points or polygons?

As Kristen and Ivan prepared to test the FMTM for building tagging in Boulder, we realized that there's an inconsistency in the way that tags are applied to buildings.

In many cases, the building polygon (the 'way') is tagged with building=<something, often 'yes'>, a name, number of levels, etc. This makes it straightforward to add tags to the building.

However, in other cases, the building polygon has few or no tags (not much more than building=yes) but there's a point (standalone OSM node) that is tagged with most of the useful information about the building: the name, the opening hours, etc.

A particularity of the situation along Pearl Street in Boulder is that restaurants appear to be tagged with amenity=restaurant, while shops appear to be tagged with the key shop (as in shop=fabric or shop=gift). So simply pulling all of the amenities from OSM along with the buildings wouldn't catch shops, and we're not sure what other important types would also be missed.

What is the implication for building tagging?

The old OMK workflow generally assumed that it would often be appropriate for building polygons to be directly tagged. This doesn't work well when a building contains multiple amenities (a mall, or a multi-story building with a different business on each floor; that being said, a pile of 2D points doesn't do much better for the multi-story situation either). The current situation requires that we either choose between polygons and points, in which case we'll certainly fail to see/load information from one of the two layers, or bring in both, which seems likely to result in a messy, cluttered task for which it's more difficult to decide if it's properly completed.

There are a few possible approaches to this:

  1. Ignore it and tag the building polygons without looking at the tags on the points
  2. Import amenities, shops, and whatever other keys refer to points that contain information about the buildings we're visiting, and select those points rather than the building polygon centroids
  3. Integrate the tags from the points into the polygons using JOSM or other before starting work on a project
  4. Other?

It seems likely that different approaches will make sense for different contexts; wherever a particular style of mapping seems to predominate and the local community seems to be happy with it we probably shouldn't mess with it.

However, we should probably have a think about the various possible strategies, and maybe have a conversation with the data model and data quality people!

Support data from select_from_file

ODK supports loading data from an external file, which can be selected and edited based on the XForm. In our case these are existing POIs or buildings in OSM. Odkconvert needs to be able to preserve the tags from the existing OSM data as the data is converted to OSM. Since the tags in this case are already in OSM syntax, they don't need to be converted.

Convert the healthcare XLSForm

While an OSM tag used in the name column of the survey and choices sheets becomes the OSM tag/value, not everything can be handled this way. Usually some tweaking of the YAML config file is needed. This task is to make sure the conversion to OSM is good.

Improve XForm for Waterways

Modify the XLSForm used for the Ruwa-Niger project used to collect data on waterways. The changes involve 2 things. The primary one is improving the efficiency of collecting map data. The secondary is to modify the XLSForm name column in the survey and choices sheets to be a closer match to approved OSM tags/values to make it easier to convert them from ODK to OSM.

Improve XForm for Educational Facilities

Modify the XLSForm used for the Ruwa-Niger project used to collect data on schools. These changes involve 2 things. The primary one is improving the efficiency of collecting map data. The secondary is to modify the XLSForm name column in the survey and choices sheets to be a closer match to approved OSM tags/values to make it easier to convert them from ODK to OSM.

Add support to Toilets XLSForm for OSM data

This is to support using ODK Collect to edit OSM data, in this case, toilets. This requires support for making the data extract, and using that data as defaults in ODK Collect.

Improve XForm for Market Places

Modify the XLSForm used for the Ruwa-Niger project used to collect data on markets. These changes involve 2 things. The primary one is improving the efficiency of collecting map data. The secondary is to modify the XLSForm name column in the survey and choices sheets to be a closer match to approved OSM tags/values to make it easier to convert them from ODK to OSM.

Refactor to support multiple output values

On occasion a single value in the ODK XML data can generate multiple OSM output tags. For example using power=solar would become "generator::source=solar,power=generator". The YAML syntax in the config file needs to support this using a comma as the delimiter.
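
A minimal sketch of splitting such a comma-delimited value into separate tags is shown below. The function name `expand_tags` is hypothetical and this is not the project's parser; the input string is the example quoted above, used verbatim.

```python
def expand_tags(spec: str) -> dict[str, str]:
    """Split a comma-delimited "key=value,key=value" string into a dict of tags."""
    tags: dict[str, str] = {}
    for pair in spec.split(","):
        # Split on the first "=" so the key may contain colons untouched.
        key, sep, value = pair.strip().partition("=")
        if sep and key:
            tags[key] = value
    return tags


print(expand_tags("generator::source=solar,power=generator"))
```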

Update toilets XLSForm to edit OSM data

Now that this XLSForm supports the data model, it needs to be extended to support editing existing OSM data. This will let mappers edit this feature to be tag complete.

Improve XForm for Religious Facilities

Modify the XLSForm used for the Ruwa-Niger project used to collect data on religious facilities. These changes involve 2 things. The primary one is improving the efficiency of collecting map data. The secondary is to modify the XLSForm name column in the survey and choices sheets to be a closer match to approved OSM tags/values to make it easier to convert them from ODK to OSM.

Update Healthcare XLSForm to work with OSM

It's now possible to load OSM data into ODK Collect and use that to set the default values for all questions, so the Healthcare XLSForm should be updated to do so. While the data extract file will need to be generated for different mapping projects, that's also been automated.

Investigate using pyodk

In mid Sept, the organization that maintains the ODK Central server released an open source Python module to use the REST API of the server. Currently odkconvert has its own Python module, but it would be preferred to use something with more community support. The new code should be investigated for Endpoint completeness.

Create new XLSForm for Landuse

There is a Landuse category in the data model, but there isn't an existing XLSForm for this, so one will have to be created.

Handling empty data extracts

When FMTM creates the data extracts, for some features, like public toilets, there may not be any data in most of the data files. An XLSForm, though, expects to access several fields like title and id. If these don't exist, ODK Collect will crash. There are two workarounds for now. One is to extend the empty feature collection to have 1 entry with the minimal keywords in it. There would be no display of existing data in Collect, but it won't crash. The other workaround is to edit the XLSForm to not use a data extract from OSM, and upload it as a custom XLS to FMTM.
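
The first workaround might be sketched as below. The function name `ensure_nonempty` is hypothetical, and the placeholder's geometry and the placement of title and id under properties are assumptions; the issue only says the entry needs the minimal keywords.

```python
def ensure_nonempty(collection: dict) -> dict:
    """Return the collection unchanged, or with one placeholder feature if empty."""
    if collection.get("features"):
        return collection
    placeholder = {
        "type": "Feature",
        # Assumption: a dummy point geometry is acceptable for the placeholder.
        "geometry": {"type": "Point", "coordinates": [0.0, 0.0]},
        # The fields the issue names as required by the XLSForm.
        "properties": {"title": "", "id": ""},
    }
    return {**collection, "features": [placeholder]}
```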

Improve XForm for Water Points

Modify the XLSForm used for the Ruwa-Niger project used to collect data on water points. The changes involve 2 things. The primary one is improving the efficiency of collecting map data. The secondary is to modify the XLSForm name column in the survey and choices sheets to be a closer match to approved OSM tags/values to make it easier to convert them from ODK to OSM.

Create CLI for CSVDump.py and odk_client.py

We are using these modules in FMTM, but running standalone would be easier with a CLI.

  • odk_client is a CLI for odkcentral. It is more of a debugging tool, but its direct access into the DB lets you clean up mistakes that you can't do through the UI, like deleting projects.
  • CSVDump.py is used to extract data for going to remote / offline mapping areas. Most OSM software requires an internet connection, so this is useful for those that regularly map offline.

I propose we add a CLI for these two modules using Click (https://pypi.org/project/click/), to install into a user's bin after pip installing, allowing them to run something like:
odkconvert client function --flag
odkconvert dump --flag

Create XLSForm for Place

Currently there is no XLSForm for the Place category in the data models, so one will have to be created.

Improve XForm for public toilets

Modify the XLSForm used for the Ruwa-Niger project used to collect data on public toilets. These changes involve 2 things. The primary one is improving the efficiency of collecting map data. The secondary is to modify the XLSForm name column in the survey and choices sheets to be a closer match to approved OSM tags/values to make it easier to convert them from ODK to OSM.

Update XLSForms to support the data models

The draft data models document specifies the types of humanitarian data, as well as the data points to collect. Not all of the models are map oriented. For the models that do have map data, there may be required data points not in the XLSForm. All the XLSForms in this library will need to be enhanced to support these models where possible.

Create XLSForm library

As HOT and other humanitarian mappers often collect the same data, like water points, it would be more efficient to have a library of template XLSForms that are designed for ease of use and mapper efficiency. These would also implement the HOT data models as closely as possible.

Add XLSForm for highways

There is a road network XLSForm, but it collects few details about the highways. This task uses an enhanced version of an XLSForm already in use at HOT, modifying it to support the new data models.

Create GeoJson data extract of existing buildings or amenities

To support the select_one_from_file functionality of ODK, a GeoJson data file of existing buildings is required. Creating that data file requires either extracting it from a postgres database, or using Overpass Turbo. Both of those tasks have a learning curve, and this needed to be automated for FMTM too. This script uses either Overpass or postgres to extract all existing buildings in the user supplied boundary, so now it's really easy.

Improve XForm for Waste Disposal

Modify the XLSForm used for the Ruwa-Niger project used to collect data on waste disposal. These changes involve 2 things. The primary one is improving the efficiency of collecting map data. The secondary is to modify the XLSForm name column in the survey and choices sheets to be a closer match to approved OSM tags/values to make it easier to convert them from ODK to OSM.

Improve XForm for Road Networks

Modify the XLSForm used for the Ruwa-Niger project used to collect data on road networks. These changes involve 2 things. The primary one is improving the efficiency of collecting map data. The secondary is to modify the XLSForm name column in the survey and choices sheets to be a closer match to approved OSM tags/values to make it easier to convert them from ODK to OSM.

Create XLSForm for Healthcare

Currently there is no XLSForm for Healthcare, so one will have to be created. There are a few older XLSForms for health facilities I'll use as a start.

Update logging throughout codebase

The logger needs to be instantiated using:
log = logging.getLogger(__name__)
in all modules, to allow this to be hooked by upstream loggers.

The logging still needs to work when odkconvert is used standalone and when it's incorporated into fmtm.

The code above will make it work with upstream fmtm, but to make it work standalone we need to create a basicLogger (https://github.com/hotosm/fmtm/blob/14a917952596cab53a7cb8bfc426551dd4a0068c/src/backend/main.py#L44) instance in the modules that will be called.

CSVDump.py and odk_client.py are called standalone and need the logger defined first. I will need to check if there are any other modules. Input from @rsavoye would be great if possible!

Improve XForm for Cemeteries

Modify the XLSForm used for the Ruwa-Niger project used to collect data on cemeteries. These changes involve 2 things. The primary one is improving the efficiency of collecting map data. The secondary is to modify the XLSForm name column in the survey and choices sheets to be a closer match to approved OSM tags/values to make it easier to convert them from ODK to OSM.

Improve odk_client.py command line

odk_client.py does a lot of admin tasks for the ODK Central server. Its current command line arguments aren't very intuitive, as they mostly exist for testing the code during development. The command line arguments should probably be grouped differently, given better choices, etc... As odk_client is the only remote admin program for Central that exists, it should be improved so others can work on it.

Create simple mobile UI to replicate CLI

A UI would be nice for ODK -> OSM file format conversion.

  • Kivy is a small, simple cross-platform UI framework for Python.
  • We could use a Kivy UI to create a mobile interface for functions in osm-fieldwork.
  • We could link the important functions to buttons with a few select / input fields.
  • This would make data collection in the field easier.
  • The app would compile to Android (and possibly iOS too).

Additional info:

  • Many users only have a phone or tablet, no laptop.
  • Phones often have a longer battery life too, which is useful for the field.
  • XML data could be converted directly from the phone storage, so no internet would be needed (limited connectivity in the field).
  • Optional extra: also upload data to OSM.

Create XLSForm for Drainage

The data models have a category of drainage, which is different from water point, so a new XLSForm will have to be created.

CI: add Github workflows for publishing to PyPi

  • Related to PR #46.
  • First we need to decide on how to manage version control.
  • I personally recommend commitizen, it would allow for bumping versions in multiple files, plus auto-changelogs based on 'conventional commit' messages.

We could then include a section in pyproject.toml:

[tool.commitizen]
name = "cz_conventional_commits"
version = "0.1.0"
version_files = [
    "pyproject.toml:version",
    "odkconvert/__version__.py",
    "Makefile:VERSION",
]
  • Once version control is handled, we could automatically push versions to pypi based on git tags.
