Comments (15)
Another option is nbsphinx: https://nbsphinx.readthedocs.io/en/0.8.7/ That's what's used in dask-examples: https://github.com/dask/dask-examples, which are rendered at https://examples.dask.org/.
from torchgeo.
Here is another example of how to do this: https://github.com/PyTorchLightning/lightning-tutorials
Started looking into this. You can directly render a notebook using `nbsphinx`. `pandoc` is required if you have any markdown in your notebook. This is what Lightning does for their tutorials.
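For reference, a minimal sketch of the Sphinx side of this, assuming the docs are configured in a `conf.py` and that the notebooks should be rendered with their stored outputs rather than re-executed. The file location and the choice of `"never"` are assumptions, not something decided in this thread:

```python
# docs/conf.py (hypothetical location)

extensions = [
    "nbsphinx",  # lets Sphinx treat .ipynb files as documentation sources
]

# Skip build output and Jupyter's autosave folders when collecting sources.
exclude_patterns = ["_build", "**.ipynb_checkpoints"]

# Assumption: use the outputs already saved in each notebook instead of
# running the notebook during the docs build ("always" and "auto" also exist).
nbsphinx_execute = "never"
```

With that in place, a notebook can be listed in a `toctree` like any other source file.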
However, PyTorch does something completely different. They instead store the file as a `.py` file and encode the rst in comments. I'm guessing this makes it easier to test? Not sure how this gets automatically converted to a notebook when you open in Google Colab/MS Learn.
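As far as I can tell, this `.py` layout is the Sphinx-Gallery format, which generates both the rendered page and a downloadable notebook from the script. A minimal sketch of what such a file looks like (the title and contents here are made up for illustration):

```python
"""
Hypothetical tutorial title
===========================

This module docstring is reST and becomes the introduction of the page.
"""

# %%
# Compute a sum
# -------------
# A comment block that starts with ``# %%`` is rendered as reST text,
# and the code that follows it becomes a code cell.

values = [1, 2, 3]
total = sum(values)
print(total)
```

Because the file is ordinary Python, it can be run or linted like any other script, which may be part of why it is easier to test.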
On the thread of running and testing notebooks to make sure they remain up-to-date, `nbmake` seems like a good way to integrate things with `pytest`: https://semaphoreci.com/blog/test-jupyter-notebooks-with-pytest-and-nbmake
However, that requires a specific conda environment to be active, which I don't want. Also, we'll need all dependencies installed and have data available. Some of these training loops could be very time-intensive to run.
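A rough sketch of what the `nbmake` integration could look like when driven from Python, assuming `pytest` and the `nbmake` plugin are installed. The helper names, the `docs/tutorials` path, and the timeout value are hypothetical; `--nbmake` and `--nbmake-timeout` are the plugin's own flags:

```python
def nbmake_args(notebook_dir, timeout_s=None):
    """Build the pytest arguments that run every notebook under notebook_dir."""
    args = ["--nbmake"]
    if timeout_s is not None:
        # Per-cell timeout, so a long training loop fails instead of hanging CI.
        args.append(f"--nbmake-timeout={timeout_s}")
    args.append(notebook_dir)
    return args


def run_notebooks(notebook_dir, timeout_s=None):
    """Run the notebooks through pytest and return its exit code."""
    import pytest  # imported here so building the arguments needs no extra packages

    return pytest.main(nbmake_args(notebook_dir, timeout_s))
```

For example, `run_notebooks("docs/tutorials", timeout_s=600)` would be equivalent to running `pytest --nbmake --nbmake-timeout=600 docs/tutorials` from the command line.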
Okay, here's what I've decided. We'll use `nbsphinx` to render the tutorial notebooks and `nbmake` to test them. Tests will be split into:
- unit tests (fast, run on every push/pull_request to any branch)
- integration/functional tests (slow, run on every push/pull_request to a release branch)
This will allow us to iterate quickly on PRs without inundating CI but still make sure that the entire stack, including data download and model training, works as expected before each release. We'll move testing of `setup.py` and `train.py` to the integration/functional tests, which will greatly speed up the unit tests as well.
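The split above boils down to a simple rule keyed on the target branch. A minimal sketch of that rule, assuming release branches follow a `releases/*` naming pattern (the pattern and the function name are assumptions for illustration, not the project's actual CI configuration):

```python
from fnmatch import fnmatch

# Assumption: release branches are named like "releases/v0.1".
RELEASE_BRANCH_PATTERN = "releases/*"


def suites_for(target_branch):
    """Return the test suites to run for a push or pull request to target_branch."""
    suites = ["unit"]  # fast tests run for every branch
    if fnmatch(target_branch, RELEASE_BRANCH_PATTERN):
        # Slow tests (data download, model training, notebooks) only gate releases.
        suites.append("integration")
    return suites
```

In practice the same rule would live in the CI workflow's branch filters; the function just makes the decision explicit.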
Another possibility instead of downloading the data ourselves is to use existing datasets in the cloud. I don't think Google Colab has access to any satellite imagery, and the Planetary Computer is not yet available to the general public. Are there any other cloud services that could work?
Google Earth Engine. There are a few datasets available for ML as of now:
- BigEarthNet
- LandCoverNet
I would be keen on assisting with this at some point.
This also gave me an idea to contribute more datasets to the community catalog.
Does GEE support running Jupyter notebooks? I've only ever used JavaScript in their code editor. It's hard to make any assumptions about data availability since the notebook needs to run on Colab, PC, and CI.
Yes, it does via the GEE Python API.
Some drawbacks:
- The interactive leafmap/folium map will not stay alive, so you would have to opt for static images (which will need to be downloaded).
- Perhaps more serious, the user would need to download the data to their drive (or GD or GCS), which has been made easier with geedim for image data (the workflow I currently use). However, I do not know of an instance where this can be avoided. I wonder if streaming the data as batches from GEE would be fast enough, or how much of a delay that would introduce.
Okay, so this would be no different than our current approach of downloading data from Planetary Computer. Just another source of data.
My apologies. I think it is going to be the case on all platforms for the foreseeable future (until GEE directly supports NNs, which is likely not any time soon).
Side note: in the geedim package the author used an approach based on rasterio to write image patches in chunks. Perhaps useful for inference.
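To illustrate the general idea of processing an image in chunks, here is a minimal sketch that only computes the tile offsets. This is not geedim's actual code; the function name and the tuple layout are assumptions, chosen so each tuple lines up with the `(col_off, row_off, width, height)` arguments of rasterio's `Window`:

```python
def iter_windows(width, height, chunk):
    """Yield (col_off, row_off, win_width, win_height) tiles covering the image."""
    for row_off in range(0, height, chunk):
        for col_off in range(0, width, chunk):
            # Tiles on the right and bottom edges are clipped to the image bounds.
            win_width = min(chunk, width - col_off)
            win_height = min(chunk, height - row_off)
            yield (col_off, row_off, win_width, win_height)
```

Each tile could then be read, passed through a model, and written back independently, so the full image never has to fit in memory at once.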
Even if GEE directly supported NNs, they wouldn't support TorchGeo, so they aren't really relevant to us other than a possible data source. It would be much more fruitful to be able to directly support data in Colab or Planetary Computer. There's some work in progress on the PC side, but I'm not sure what's available in Colab.
Yep, GEE used to have a lot more data, although I think PC might have already caught up on that front. GEE is still far more user-friendly and easier to scale, so it's winning for non-CS people. But GEE is also very limited because it doesn't support NNs. In that sense, GEE is ~10 years behind TorchGeo 😄
(The entire geospatial community is ~10 years behind the computer vision community; computer vision folks haven't used anything other than CNNs for over a decade.)
We're hoping to provide something as easy as possible for geospatial researchers hoping to explore deep learning methods. Of course, TorchGeo isn't restricted to Colab or PC; you can use it on your laptop, supercomputer, or in the cloud (AWS, Azure, GCP, etc.). As long as you can get your hands on some data, and you can afford compute time, you can use TorchGeo.