Comments (3)
It would be nice to have a collection of data sets using H3 that folks can use for examples or are just generally useful.
This seems like it could be helpful as a reference dataset.
Some ideas:
Another one that comes to mind is the set of US census geometries (essentially, anything in the TIGER dataset).
https://geodatasets.readthedocs.io/en/latest/introduction.html is a Python package that does something similar, but for general geographic datasets.
Aside from what examples we want, I think we'd also need to decide:
* what data format we'd use, or if we'd use multiple
I think it would make sense to have multiple formats: some users might want a simple text-based format like CSV or JSON, while others may prefer an efficient binary format like Parquet (storing cells as uint64).
* how we store the examples---in the repo, or point to external hosting
Considering the format duplication, the fact that the text files can be very large, and the relatively independent maintenance concerns, I recommend hosting them outside of the repo. I believe we already do that in master for the country geometries used in testing.
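To make the text-versus-binary trade-off concrete, here is a minimal sketch using only the Python standard library. It assumes cells are exchanged either as their usual hexadecimal strings (CSV) or as raw 64-bit unsigned integers, which is what a uint64 Parquet column would hold. The sample cell values and the helper names `hex_to_uint64` and `uint64_to_hex` are illustrative, not part of any H3 binding.

```python
import csv
import io
import struct

# Illustrative sample: H3 cell indexes in their hexadecimal string form.
cells_hex = ["8928308280fffff", "8928308280bffff", "89283082807ffff"]


def hex_to_uint64(cell: str) -> int:
    """Parse a hex-string H3 index into an integer that fits in 64 bits."""
    value = int(cell, 16)
    if not 0 <= value < 2**64:
        raise ValueError(f"not a 64-bit index: {cell!r}")
    return value


def uint64_to_hex(value: int) -> str:
    """Format a 64-bit integer index as a lowercase hex string."""
    return format(value, "x")


# Text representation: a one-column CSV with a header row.
text_buf = io.StringIO()
writer = csv.writer(text_buf)
writer.writerow(["cell"])
writer.writerows([c] for c in cells_hex)
csv_bytes = text_buf.getvalue().encode("utf-8")

# Binary representation: 8 bytes per cell, little-endian unsigned 64-bit.
cells_int = [hex_to_uint64(c) for c in cells_hex]
packed = struct.pack(f"<{len(cells_int)}Q", *cells_int)

# The two representations carry the same information.
assert [uint64_to_hex(v) for v in cells_int] == cells_hex

print(f"CSV: {len(csv_bytes)} bytes, packed uint64: {len(packed)} bytes")
```

A real Parquet file would add metadata and compression on top of the integer column, so the sizes printed here only indicate the raw per-cell difference.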
> I think it would make sense to have multiple formats: some users might want a simple text-based format like CSV or JSON, while others may prefer an efficient binary format like Parquet (storing cells as uint64).
Agreed.
> Considering the format duplication, the fact that the text files can be very large, and the relatively independent maintenance concerns, I recommend hosting them outside of the repo. I believe we already do that in master for the country geometries used in testing.
Yes, I definitely agree we should host these through a separate repo (maybe something like h3datasets?). I was wondering more whether that repo should host the raw data itself or point to some other storage location. The geodatasets package uses the latter strategy. If we went with the former, I was curious whether we might run into GitHub file and repo size limits (the repo we point to here comes in at 17GB). Maybe we can start with the in-repo approach and pivot to external hosting if necessary. If we do end up needing external storage, any ideas on what services we might use?
> Yes, I definitely agree we should host these through a separate repo (maybe something like h3datasets?). I was wondering more whether that repo should host the raw data itself or point to some other storage location. The geodatasets package uses the latter strategy. If we went with the former, I was curious whether we might run into GitHub file and repo size limits (the repo we point to here comes in at 17GB). Maybe we can start with the in-repo approach and pivot to external hosting if necessary. If we do end up needing external storage, any ideas on what services we might use?
Ah, I see. The two options I'd suggest are S3 and Cloudflare R2. R2 is cheaper and more modern (which incidentally can cause issues if you happen to use HTTP-only software, as it enforces SSL). In the meantime, keeping the data in the repo seems like an OK place to start.
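For reference, the "point to external storage" strategy could look roughly like the sketch below: the repo holds only a small registry of URLs and checksums, and files are downloaded and cached on first use. Everything here is an assumption for illustration (the `REGISTRY` layout, the `fetch` helper, the example URL, and the cache location); none of it reflects an existing h3datasets package or the geodatasets API.

```python
import hashlib
import urllib.request
from pathlib import Path

# Hypothetical registry: the repo would store only these small entries,
# while the data files live on external storage (e.g. S3 or R2).
REGISTRY = {
    "example_cells": {
        "url": "https://example.com/h3datasets/example_cells.csv",
        "sha256": "<sha256 hex digest of the hosted file>",
    },
}


def fetch(name, registry=REGISTRY, cache_dir=None) -> Path:
    """Download a registered dataset once, verify it, and return its local path."""
    entry = registry[name]
    if cache_dir is None:
        cache_dir = Path.home() / ".cache" / "h3datasets"  # assumed location
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    # Use the last URL path segment as the cached file name.
    target = cache_dir / entry["url"].rsplit("/", 1)[-1]
    if target.exists():
        return target

    with urllib.request.urlopen(entry["url"]) as response:
        data = response.read()

    # Reject the download if it does not match the recorded checksum.
    digest = hashlib.sha256(data).hexdigest()
    if digest != entry["sha256"]:
        raise ValueError(f"checksum mismatch for {name!r}")

    target.write_bytes(data)
    return target
```

Because the registry only records a URL per dataset, moving from in-repo files to S3 or R2 later would mean changing those URLs rather than the calling code.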