trefoil's Introduction

Trefoil (formerly Clover)

Because today might be your lucky day.

(Note: this was renamed from clover on 4/20/2018 due to name conflicts on PyPI.)

Geospatial operations with NetCDF files and numpy arrays.

Why?

We needed a library to consolidate a series of utility scripts and general geospatial operations on NetCDF files and numpy arrays. We found we were creating a lot of purpose-built scripts for other projects that involved heavy processing of NetCDF climate and model outputs. Where possible, we have been pulling out general patterns and placing them here. When we looked for existing work, we didn't find anything that quite met our needs: a clean API with no strong assertions about data model or compliance with CF conventions (we aspire to conventions, but not all data meet them).

Specifically, we want to provide:

  • simple and fast API for rendering numpy arrays to images
  • simple API to provide utility functions that make working with NetCDF data easier
  • simple command line interface to make common operations easy and portable
  • analysis operations to simplify using geometries alongside raster data
  • analysis operations to summarize across various dimensions of spatial and temporal-spatial datasets (anything more than 3 dimensions makes our heads hurt!)

We are trying to avoid reimplementing anything well-handled elsewhere. Where possible, we contribute functionality to other libraries (e.g., rasterio) where we think that the functionality is general enough not to depend on living within trefoil.

Where is it being used?

This is a core dependency for ncdjango, our Django-based NetCDF map server.

We are using this on a variety of internal projects within the Conservation Biology Institute.

Installation

pipenv is used for managing dependencies in this project.

pipenv install trefoil

No longer directly maintained / supported:

On Windows, install the dependencies that require compiling from Python Windows Packages, then install the remainder using pip.

Command line interface

This is currently undergoing heavy development.
See CLI docs for more information.

Work in progress

This is still under active development, as we have time and need. All APIs are subject to change until we hit version 1.0.

Specifically, we need to work on:

  • standardizing API patterns
  • documentation
  • test coverage and correctness
  • roadmap

Contributors:

With inspiration from Tim Sheehan and Ken Ferschweiler.

See Also:

  • rasterio: Geospatial I/O and operations on rasters, done right.
  • OCGIS: Geoprocessing on CF compatible climate datasets.
  • scikit-image: Python image processing
  • python-rasterstats: Summary statistics of rasters using geometries

trefoil's People

Contributors

brendan-ward, dependabot[bot], kennino, nikmolnar

trefoil's Issues

Add CLI for creating mask from shapefile

Allow user to create a mask netcdf file from a template netCDF file and a shapefile. Primary use would be for masking out pixels from other NetCDF files in other CLI commands, e.g., render_netcdf.

Same basic usage as rasterio's rio mask command.

Experiment with WebP for image output

Smaller than PNG, even optimized. Not sure how it compares to smaller paletted images produced here.

Only supported in Google products and a couple others, not IE or FF. Bummer!

Nonetheless, see how it compares for our usage here. A significant reduction may argue for making it a parameter to renderers.

Create CLI to generate style file for a netcdf file(s)

Given similar commands about colorspace, renderer type, value range, etc from render_netcdf, use these to generate a style JSON file.

Initially this could be limited to a single variable; it would be nice if this is handled gracefully on subsequent runs with other variables to build a consolidated file (first run creates file for variable 1, second run appends variable 2).
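The create-then-append behavior described above could be sketched as follows. This is a minimal illustration, not trefoil's implementation: the function name update_style_file, the one-key-per-variable file layout, and the renderer dictionary keys are all assumptions made for the example.

```python
import json
import os


def update_style_file(path, variable, renderer):
    """Add or replace the entry for one variable in a JSON style file.

    Hypothetical layout: a single JSON object with one key per variable.
    """
    styles = {}
    if os.path.exists(path):
        # Later runs load the existing file so earlier variables are kept.
        with open(path) as f:
            styles = json.load(f)
    styles[variable] = renderer
    with open(path, "w") as f:
        json.dump(styles, f, indent=2, sort_keys=True)
    return styles
```

Running it once for variable 1 creates the file; running it again with variable 2 rewrites the file with both entries. Rerunning for a variable that is already present overwrites its entry.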

An additional interface could be to build default (black to white) stretched renderers for all listed variables, and leave it up to the user to populate the file with alternative colors.

Investigate palette transparency in latest version of Pillow

Pillow might have solved the issue with setting transparency value into the palette, so that we can keep a paletted image instead of converting to RGBA for purposes of setting the transparency layer. This should bring image size down a good bit, and may obviate some of the need for later tools like pngquant / tinypng downstream.

Cannot import 'get_data_window' from rasterio > 1.0a(x)

Using rasterio==1.0a5, the get_data_window method is no longer importable directly from rasterio. The equivalent function is now imported using the windows module, like the below (from here)

from rasterio import windows

def test_data_window_unmasked(data):
    window = windows.get_data_window(data)
    assert window == ((0, data.shape[0]), (0, data.shape[1]))

I'd have to dig into where that change was made, and it's not game-breaking right now, but it may be something to consider when using window operations.

Extract stack of netCDF files to CSV

Use case:

  • One netCDF file contains the IDs of reporting units (digitized from polygons).
  • Extract from a large stack of netCDF files that match exactly in terms of spatial footprint

Output should be CSV, with first column as reporting unit ID, and one column per netCDF file.

Allow control of precision.

Will need to build in chunking for the data extraction to avoid running out of memory, but tune so that there isn't a lot of I/O churn.
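One possible shape for the chunked extraction, sketched with plain numpy arrays standing in for the netCDF variables. The issue does not say which statistic to report per reporting unit, so the mean is an assumption here, as are the function name zonal_means_to_csv and its parameters. The sketch also assumes non-negative integer IDs and ignores masked or nodata values.

```python
import csv

import numpy as np


def zonal_means_to_csv(zones, layers, names, out, chunk_rows=256, precision=3):
    """Write the mean of each layer per reporting unit ID as CSV.

    zones: 2D array of non-negative integer IDs.
    layers: sequence of 2D arrays with the same shape as zones.
    out: writable text file object.
    """
    ids = np.unique(zones)
    n = int(ids.max()) + 1
    sums = np.zeros((len(layers), n))
    counts = np.zeros(n)
    # Read a block of rows at a time so only one chunk per layer is in memory.
    for start in range(0, zones.shape[0], chunk_rows):
        rows = slice(start, start + chunk_rows)
        z = np.asarray(zones[rows]).ravel()
        counts += np.bincount(z, minlength=n)
        for i, layer in enumerate(layers):
            values = np.asarray(layer[rows], dtype=float).ravel()
            sums[i] += np.bincount(z, weights=values, minlength=n)
    writer = csv.writer(out)
    writer.writerow(["id"] + list(names))
    for unit in ids:
        means = sums[:, unit] / counts[unit]
        writer.writerow([int(unit)] + [f"{m:.{precision}f}" for m in means])
```

A larger chunk_rows means fewer reads per file at the cost of more memory, which is the tuning trade-off mentioned above. The zones array itself is read in full once to find the IDs.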

Add window option to to_netcdf CLI

Some rasters have undesirable initial extents (e.g., well beyond world bounds), which makes reprojection and use in maps difficult.

Provide an option to crop to coordinates in the source CRS or to world bounds, and possibly an autocrop option as well.
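Cropping to world bounds amounts to intersecting the raster's extent with a limiting extent. A minimal sketch, assuming bounds are given as (left, bottom, right, top) and that the default limits are geographic degrees; the function name clamp_bounds is hypothetical and not part of the to_netcdf CLI.

```python
def clamp_bounds(bounds, limits=(-180.0, -90.0, 180.0, 90.0)):
    """Intersect (left, bottom, right, top) bounds with limiting bounds."""
    left, bottom, right, top = bounds
    l_left, l_bottom, l_right, l_top = limits
    clamped = (
        max(left, l_left),
        max(bottom, l_bottom),
        min(right, l_right),
        min(top, l_top),
    )
    # An empty intersection means the raster lies entirely outside the limits.
    if clamped[0] >= clamped[2] or clamped[1] >= clamped[3]:
        raise ValueError("bounds do not overlap the limits")
    return clamped
```

The clamped bounds would still need to be converted to a pixel window using the raster's transform before reading.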

Shapefile input for 'zones' command not valid

I'm trying to create zones using a shapefile like this:

trefoil zones --attribute Clusters_c --like MERRA2_data\MERRA-2.tavg1_2d_lnd_Nx.20170101.SUB.nc4 gis\sweden.shp .

The input shapefile (gis/sweden.shp) is a valid shapefile, but trefoil throws me an error:

Error: Invalid value for "INPUT": gis/sweden.shp is not a valid input file

I think this is because INPUT argument is created using @file_in_arg from rasterio.rio.options. I think that assumes that the input file should be some format supported by GDAL (see rasterio.rio.options.file_in_handler and rasterio.shutil.exists). Am I understanding something wrong here?

Update travis config

Drop python 3.4 and add 3.5 and 3.6.

We can also theoretically drop specific download of rasterio wheel and GDAL dependencies and follow the model set here instead.

Guess year as date type

Sometimes we get a time dimension (e.g., called 'time') which is lacking units or any other attributes that could indicate what type of time it is.

If it is an integer type, and on the range 1800 to 2200 or so, guess that it is years.
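The heuristic could look roughly like this. The function name looks_like_years is hypothetical, and the default range simply mirrors the "1800 to 2200 or so" suggestion above.

```python
import numbers


def looks_like_years(values, low=1800, high=2200):
    """Guess whether a sequence of time values represents calendar years.

    True only if the sequence is non-empty and every value is an integer
    (Python or numpy) within [low, high].
    """
    values = list(values)
    if not values:
        return False
    return all(
        isinstance(v, numbers.Integral)
        and not isinstance(v, bool)
        and low <= v <= high
        for v in values
    )
```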

Guessing may be OK for use in Data Basin where user can override the guess on import; it may have undesirable side effects in other places within clover though.

Post to Pypi

Preferably after stable 1.0* of rasterio lands.

`describe` fails on datasets with `char`-type variables

Trying to use describe (CLI or API) on a dataset with one or more char-type variables causes the following error:

  File "/Users/nikmolnar/projects/trefoil/trefoil/netcdf/describe.py", line 76, in describe
    'min': data.min().item(),
  File "/Users/nikmolnar/virtualenv/trefoil-_zYcxnHA/lib/python3.6/site-packages/numpy/core/_methods.py", line 29, in _amin
    return umr_minimum(a, axis, None, out, keepdims)
TypeError: cannot perform reduce with flexible type

I'm not advocating that we support rendering, or performing other analysis on char variables, but describe should ignore such variables and return information for other variables, instead of failing outright.
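The suggested behavior can be illustrated with a standalone sketch that checks the dtype before computing statistics. This is not the code in trefoil/netcdf/describe.py; the function name describe_variables and the shape of the returned dictionary are assumptions for the example.

```python
import numpy as np


def describe_variables(variables):
    """Summarize a mapping of name -> array, skipping stats for non-numeric data."""
    summary = {}
    for name, data in variables.items():
        data = np.asarray(data)
        info = {"dtype": str(data.dtype)}
        # char variables have a string dtype, which min() and max() cannot reduce.
        if np.issubdtype(data.dtype, np.number) and data.size:
            info["min"] = data.min().item()
            info["max"] = data.max().item()
        summary[name] = info
    return summary
```

With this check, a char variable still appears in the summary with its dtype, while numeric variables get their min and max as before.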

Test mapnik rendering backend

As an alternative to Pillow.

Looks like we can feed in image data directly from python

In particular, we could leverage mapnik's png output options to also use lossy compression for smaller rendered images instead of relying on something like pngquant.

Downsides may be an increased dependency surface. Will need to benchmark performance, flexibility, and size of output files to see if it is worth it.

Add LAB, HCL, HSL color spaces for interpolation

Can use colormath in short term for converting between color spaces, but ideally should re-implement these as vector methods instead.

LAB interpolation is linear and easy. HCL and HSL rely on similar tricks as for HSV - see D3 implementation of these interpolation methods.
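Since LAB interpolation is linear per channel, a pure-Python sketch is short. The function name interpolate_lab is hypothetical, and converting the resulting LAB values to RGB (for example with colormath) is left out.

```python
def interpolate_lab(start, end, steps):
    """Linearly interpolate between two (L, a, b) colors, including both ends."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    return [
        tuple(s + (e - s) * i / (steps - 1) for s, e in zip(start, end))
        for i in range(steps)
    ]
```

A vectorized version would apply the same formula to whole arrays of channel values at once.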

Filename order is not consistent between platforms (Unix & Windows)

I am using a subprocess to call clover's to_netcdf on a filename pattern such as It*-Ts*-sc.tif to produce a NetCDF file with a time dimension. The problem is that the order in which the files are stacked differs between Windows and Unix systems, which makes the order of the timesteps unpredictable. I am not using the datetime-pattern parameter, because these filenames do not contain datetimes at all.

For example, if my stack of GeoTIFFs are the following:

It0001-Ts0000-sc.tif
It0001-Ts0001-sc.tif
It0001-Ts0002-sc.tif
It0001-Ts0003-sc.tif

calling glob.glob on Windows will return them in ascending order:

import glob

pattern = 'It*-Ts*-sc.tif'
files = glob.glob(pattern)
print(files)
['It0001-Ts0000-sc.tif', 'It0001-Ts0001-sc.tif', 'It0001-Ts0002-sc.tif', 'It0001-Ts0003-sc.tif', ...]

On Unix, the pattern seems unpredictable:

print(files)
['It0001-Ts0003-sc.tif', 'It0001-Ts0000-sc.tif', 'It0001-Ts0002-sc.tif', 'It0001-Ts0001-sc.tif', ...]

Would it be possible to add a sort parameter to the to_netcdf command? In my implementation I have modified clover.cli.convert.py @ line 87 to call filesnames.sort(), which puts the filenames back in the same order as they are on Windows.
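glob.glob returns matches in arbitrary order, so sorting the result is the usual way to get the same order on every platform. A minimal sketch of what a sort option might do; the function name list_inputs and its sort parameter are hypothetical, not the actual to_netcdf code.

```python
import glob


def list_inputs(pattern, sort=True):
    """Expand a glob pattern, optionally sorting for a platform-independent order."""
    filenames = glob.glob(pattern)
    return sorted(filenames) if sort else filenames
```

Plain lexicographic sorting works for zero-padded names such as It0001-Ts0002-sc.tif. Names without padding would need a numeric-aware sort key.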

Add logging of operations to NetCDF attributes

Use attributes to store log of operations performed on a NetCDF file, e.g., when creating from ArcASCII.

Consider adding a new option to CLI for log messages from user, similar to git commit messages.

Derive username from operating system info and include that in logging.

Use history attribute from CF conventions.
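A rough sketch of how such a log entry might be built. A plain dictionary stands in for the NetCDF global attributes. The function name append_history and the "timestamp user: message" line format are assumptions. The only convention relied on is the CF recommendation that each history line begin with a timestamp.

```python
import getpass
from datetime import datetime, timezone


def append_history(attrs, message, user=None, now=None):
    """Append a timestamped line to the 'history' entry of an attribute mapping.

    user defaults to the current operating system user; now defaults to the
    current time and is assumed to be in UTC.
    """
    user = user or getpass.getuser()
    now = now or datetime.now(timezone.utc)
    entry = f"{now:%Y-%m-%dT%H:%M:%SZ} {user}: {message}"
    previous = attrs.get("history", "")
    # Keep earlier entries and add the new one on its own line.
    attrs["history"] = f"{previous}\n{entry}" if previous else entry
    return attrs
```

The message argument is where a user-supplied CLI log message, in the spirit of a git commit message, would go.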

Add map preview tool for EEMS

Use the EEMS parser to show the basic layers from the EEMS file in a sidebar on the left (table of contents style), wherein each node onclick shows the associated layer / variable.

Typically all variables are in the same netCDF file.

Default renderer is blue to red, similar to reverse of palettable.diverging.RdYlBu_11

Can skip input variables / datasets for now, limit to those produced by EEMS.
