
Example H3 data sets (open issue in h3)

ajfriend commented on June 20, 2024

Comments (3)

isaacbrodsky commented on June 20, 2024

It would be nice to have a collection of data sets using H3 that folks can use for examples or that are just generally useful.

This seems like it could be helpful as a reference dataset.

Some ideas:

Another set that comes to mind is the various US census geometries (essentially, anything in the TIGER dataset).

https://geodatasets.readthedocs.io/en/latest/introduction.html is a Python package that does something similar, but for general geographic datasets.

Aside from what examples we want, I think we'd also need to decide:

* what data format we'd use, or if we'd use multiple

I think it would make sense to have multiple formats: some users might want a simple text-based format like CSV or JSON, while others may prefer an efficient binary format like Parquet (with cells stored as uint64).

* how we store the examples---in the repo, or point to external hosting

Considering the format duplication, the fact that the text files can be very large, and the relatively independent maintenance concerns, I recommend hosting them outside of the repo. I believe we already do that in master for the country geometries used in testing.
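To make the format trade-off concrete, here is a minimal sketch of the same cells in text and binary form. It assumes only that an H3 cell ID is a 64-bit integer whose usual string form is lowercase hex. The sample IDs and the helper names `hex_to_uint64` and `uint64_to_hex` are illustrative, and the packed bytes only stand in for a uint64 Parquet column (writing real Parquet would need a library such as pyarrow, which is not used here).

```python
import csv
import io
import json
import struct

# Sample cell IDs in hex-string form (illustrative; any valid cells would do).
cells_hex = ["8928308280fffff", "8928308280bffff", "89283082807ffff"]


def hex_to_uint64(cell: str) -> int:
    """Parse a hex cell string into an integer that fits in 64 bits."""
    value = int(cell, 16)
    if not 0 <= value < 2**64:
        raise ValueError(f"not a 64-bit value: {cell!r}")
    return value


def uint64_to_hex(value: int) -> str:
    """Format an integer cell ID back into lowercase hex."""
    return format(value, "x")


cells_int = [hex_to_uint64(c) for c in cells_hex]

# Text formats: one cell string per CSV row, or a JSON list of strings.
csv_buf = io.StringIO()
writer = csv.writer(csv_buf)
writer.writerow(["cell"])
writer.writerows([c] for c in cells_hex)
csv_text = csv_buf.getvalue()
json_text = json.dumps({"cells": cells_hex})

# Binary stand-in for a uint64 column: 8 little-endian bytes per cell.
packed = struct.pack(f"<{len(cells_int)}Q", *cells_int)
unpacked = list(struct.unpack(f"<{len(cells_int)}Q", packed))
```

The text forms are easy to inspect and diff, while the packed form costs a fixed 8 bytes per cell and round-trips through `uint64_to_hex` when a string is needed.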


ajfriend commented on June 20, 2024

> I think it would make sense to have multiple formats: some users might want a simple text-based format like CSV or JSON, while others may prefer an efficient binary format like Parquet (with cells stored as uint64).

Agreed.

> Considering the format duplication, the fact that the text files can be very large, and the relatively independent maintenance concerns, I recommend hosting them outside of the repo. I believe we already do that in master for the country geometries used in testing.

Yes, I definitely agree we should host these through a separate repo (maybe something like h3datasets?). It was more that I was wondering whether that repo should host the raw data itself or point to some other storage location. The geodatasets package uses the latter strategy. If we used the former strategy, I was curious whether we might run into GitHub file and repo size limits (the repo we point to here comes in at 17 GB). Maybe we can start with the in-repo approach and pivot to external hosting if necessary. If we do end up needing external storage, any ideas on what services we might use?


isaacbrodsky commented on June 20, 2024

> Yes, I definitely agree we should host these through a separate repo (maybe something like h3datasets?). It was more that I was wondering whether that repo should host the raw data itself or point to some other storage location. The geodatasets package uses the latter strategy. If we used the former strategy, I was curious whether we might run into GitHub file and repo size limits (the repo we point to here comes in at 17 GB). Maybe we can start with the in-repo approach and pivot to external hosting if necessary. If we do end up needing external storage, any ideas on what services we might use?

Ah, I see. The two options I'd suggest are S3 and Cloudflare R2. R2 is cheaper and more modern (which, incidentally, can cause issues if you happen to use HTTP-only software, as it enforces SSL). In the meantime, the repo seems like an OK place to start.
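For the "point to external storage" strategy, a rough sketch of what a small registry could look like follows. Everything here is hypothetical: the `REGISTRY` contents, the example.com URL, and the `fetch` helper are placeholders for illustration, not an existing h3 or geodatasets API. The idea is only that the repo stores names, URLs, and checksums, while the bytes live on whatever host is chosen (S3, R2, or elsewhere) and are cached locally on first use.

```python
import hashlib
from pathlib import Path
from typing import Callable
from urllib.request import urlopen

# Hypothetical registry: the name, URL, and checksum are placeholders.
REGISTRY = {
    "example_cells": {
        "url": "https://example.com/h3datasets/example_cells.csv",
        "sha256": None,  # published digest would go here; None skips the check
    },
}


def _http_get(url: str) -> bytes:
    """Default downloader: fetch the whole file over HTTP(S)."""
    with urlopen(url) as resp:
        return resp.read()


def fetch(name: str, cache_dir: str,
          download: Callable[[str], bytes] = _http_get) -> Path:
    """Return a local path for a dataset, downloading it on first use."""
    entry = REGISTRY[name]
    filename = entry["url"].rsplit("/", 1)[-1]
    target = Path(cache_dir) / filename
    if not target.exists():
        data = download(entry["url"])
        expected = entry["sha256"]
        if expected is not None:
            actual = hashlib.sha256(data).hexdigest()
            if actual != expected:
                raise ValueError(f"checksum mismatch for {name!r}")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
    return target
```

Because the registry only holds URLs, moving from in-repo files to S3 or R2 later would mean editing entries rather than changing how users call `fetch`. The `download` parameter is injectable so the caching logic can be exercised without network access.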
