Coder Social home page Coder Social logo

Comments (3)

nmandery avatar nmandery commented on June 11, 2024 1

At this point this H3->raster conversion has not been implemented.

The naive-approach you mention is already possible using gdal_rasterize, but - as you mention - may not be as efficient as large number of cells may be created when mixed h3 resolutions are used.

Given a lower h3 resolution R and a higher resolution R+N I see no way to generate the R+N outline of a R-cell without generating all the child cells. A possible optimization would be to avoid doing this resolution homogenization eagerly - which may create quite large datasets - and instead to it lazily for each cell just when its passed to the rasterization. This could be implemented using the change_cell_resolution from with in rust. Given that, it would still require a part of the rasterization code to be ported to rust as I would like to avoid having the gdal dependency in the rust implementation.

Another way to tackle this while staying in python would be applying h3_set_to_multi_polygon to the R+N-children of each cell. This would result in a MultiPolygon geometry containing a single Polygon of the R+N-outline for each cell and could be done using geopandas and written to a vector dataset afterwards. This dataset could then be used with gdal_rasterize or rasterio.features.rasterize to generate the raster. This approach would also have the benefit of not requiring to have all child-cells in memory at once.

Let me know if this helps. I currently do no see me coming around to implement any rasterization within this library, but I am happy to accept a PR.

from h3ronpy.

alpha-beta-soup avatar alpha-beta-soup commented on June 11, 2024

You're right, of course there's no way not to compute all of the child cells, the best case optimisation is just not to put all of them (complete de-compaction) into memory at once. I'm currently unable to contribute this to this project, since I've never used Rust, but I appreciate the discussion anyway.

from h3ronpy.

nmandery avatar nmandery commented on June 11, 2024

I played around with rasterizing dataframes a few days ago - so maybe this is of some help for you. Its not the fastest for large numbers of cells, but it works for the few tests I have done. Rasterization is done using rasterio. The part of building the child cells is still missing, but could be easily added in the for loop in rasterize_dataframe without requiring building all un-compacted child cells at once:

import pandas as pd
import typing
import h3.api.numpy_int as h3
import numpy as np
from h3ronpy.util import dataframe_to_geodataframe
from rasterio.transform import from_bounds
from rasterio.features import rasterize
import rasterio

def sample_df() -> pd.DataFrame:
    distances = []
    h3indexes = []
    for i_distance, i_h3indexes in enumerate(h3.hex_range_distances(h3.geo_to_h3(12.3, 43.1, 12) , 30)):
        distances.extend(len(i_h3indexes) * [i_distance,])
        h3indexes.extend(i_h3indexes)
    return pd.DataFrame({
                "h3index": h3indexes,
                "distance": distances
                })
    
def rasterize_dataframe(df: pd.DataFrame, size: typing.Tuple[int], h3index_column_name="h3index", value_column_name="value", nodata=0):
    gdf = dataframe_to_geodataframe(df[[h3index_column_name, value_column_name]], column_name=h3index_column_name)
    transform = from_bounds(*gdf.total_bounds, *size)
    arr = np.full(size, nodata, dtype=df[value_column_name].dtype)
    for row in gdf.iterrows():
        rasterize([row[1]["geometry"],], out_shape=size, out=arr, default_value=row[1][value_column_name], fill=nodata, transform=transform)
    return arr, transform


def run():
    df = sample_df()
    df["distance"] = df["distance"].astype(np.uint8)
    size=(1000, 1000)
    nodata=255
    arr, transform = rasterize_dataframe(df, size, value_column_name="distance", nodata=nodata)
    assert df['distance'].dtype == arr.dtype
    with rasterio.open("test.tif", 'w', driver="GTiff", width=size[0], height=size[1], count=1,
                       dtype=arr.dtype, nodata=nodata, compress='lzw', transform=transform) as ds:
        ds.write(arr, 1)

if __name__=='__main__':
    run()

It could even be parallelized by splitting the dataframe into chunks (within rasterize_dataframe) , which could then be rasterized in sub-processes into an individual numpy array. In the end these numpy arrays could then be merged into a single one.

If this would solve the problem, maybe we can add it to the python-parts of this library.

from h3ronpy.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.