Comments (11)
That's correct. Each dataset is slightly different (e.g. vector vs. raster labels, imagery bands), so there is no "one size fits all" notebook. We don't have an example notebook for the SpaceNet dataset, so your best bet would be to explore how the SpaceNet datasets are structured in our API and repurpose an existing notebook.
from mlhub-tutorials.
Hi Ashwin,
You shouldn't be receiving that error. I'm looking into it and will get back to you shortly.
Best,
Kevin
@ashnair1 This should now be resolved. The links to the source imagery were pointing to an invalid item ID and the 'label' asset really should have had the 'labels' key instead. Both of these issues have been fixed.
The image issue seems to be resolved. But I can't seem to access labels.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-6-1830123587e0> in <module>
----> 1 get_items(f'https://api.radiant.earth/mlhub/v1/collections/{collectionId}/items?limit={limit}', max_items_downloaded=1)
<ipython-input-4-7e48d49c1897> in get_items(uri, classes, cloud_and_shadow, seasonal_snow, max_items_downloaded, items_downloaded)
88
89 # Download the label and source imagery for the item
---> 90 download_source_and_labels(feature)
91
92 # Stop downloaded items if we reached the maximum we specify
<ipython-input-4-7e48d49c1897> in download_source_and_labels(item)
38 import pdb
39 pdb.set_trace()
---> 40 labels = item.get('assets').get('labels')
41 links = item.get('links')
42
TypeError: 'NoneType' object is not subscriptable
It seems the item doesn't have a labels asset:
ipdb> item.get('assets')
{'MS': {'href': 'https://api.radiant.earth/mlhub/v1/download/cd9b80a8ba7a6e2c05f8420a11caa583dac46ff7ca02030e23a4ccafc4527b4f', 'title': 'MS-geotiff', 'type': 'image/tiff; application=geotiff'}, 'PAN': {'href': 'https://api.radiant.earth/mlhub/v1/download/6991b88e54cf4f1d95779d9ca07d0cd5696a64cd2f01c06c6ebeccfecc6f71a9', 'title': 'PAN-geotiff', 'type': 'image/tiff; application=geotiff'}, 'PS-MS': {'href': 'https://api.radiant.earth/mlhub/v1/download/0e2d142874a26ee67e36ec048c6ef75ada84fe86f0c23dc8cb99bf631611a9dd', 'title': 'PS-MS-geotiff', 'type': 'image/tiff; application=geotiff'}, 'PS-RGB': {'href': 'https://api.radiant.earth/mlhub/v1/download/75563ba9260f85f3496c7391719c34c0ab2e371e05b047f46f00a5bc8179fc1d', 'title': 'PS-RGB-geotiff', 'type': 'image/tiff; application=geotiff'}}
It looks like you're using the BigEarthNet notebook. In all of the datasets except SpaceNet, we've separated source imagery items and label items into different collections. Source imagery items will not have a labels asset. Since the BigEarthNet notebook expects the source imagery and labels to be in separate collections, it errors out when it reaches a source imagery item instead of a label item. You can add a check for a labels asset and skip the item if it has none.
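A minimal sketch of such a check, assuming each item is a plain dict as returned by the API. The helper names get_labels_asset and split_items are hypothetical and are not part of the BigEarthNet notebook:

```python
def get_labels_asset(item):
    """Return the item's 'labels' asset, or None if the item has none.

    Also tolerates records with a missing or empty 'assets' field.
    """
    assets = item.get('assets') or {}
    return assets.get('labels')


def split_items(features):
    """Separate features into label items and all other records
    (for SpaceNet, the others are mostly source imagery items)."""
    label_items, other_items = [], []
    for feature in features:
        if get_labels_asset(feature) is None:
            other_items.append(feature)
        else:
            label_items.append(feature)
    return label_items, other_items
```

In the notebook's loop, the same test could guard the call to download_source_and_labels(feature) so that items without a labels asset are skipped instead of raising a TypeError.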
After adding the check for labels, I was able to download the MS, PAN, PS-MS and PS-RGB versions of img64. But then a record appeared that had no assets field:
{'description': 'SpaceNet 2 Khartoum Chipped Training Dataset', 'extent': {'spatial': {'bbox': [[32.4858384, 15.5138111999, 32.5665684, 15.7402062]]}, 'temporal': {'interval': [['2015-04-13T00:00:00Z', None]]}}, 'id': 'sn2_AOI_5_Khartoum', 'license': 'CC-BY-SA-4.0', 'links': [{'href': 'https://api.radiant.earth/mlhub/v1/collections/sn2_AOI_5_Khartoum', 'rel': 'self'}, {'href': 'https://api.radiant.earth/mlhub/v1/', 'rel': 'parent'}, {'href': 'https://api.radiant.earth/mlhub/v1/', 'rel': 'root'}, {'href': 'https://api.radiant.earth/mlhub/v1/collections/sn2_AOI_5_Khartoum/items', 'rel': 'items'}], 'properties': {'license': 'CC-BY-SA-4.0', 'providers': [{'name': 'SpaceNet LLC', 'roles': ['processor', 'host', 'licensor', 'producer'], 'url': 'https://api.radiant.earth/mlhub/v1/download/017ab8ab69ffa44271d32452fe85eab079ac53ef3370b2eed74e2e87769eae57'}]}, 'providers': [{'name': 'SpaceNet LLC', 'roles': ['processor', 'host', 'licensor', 'producer'], 'url': 'https://api.radiant.earth/mlhub/v1/download/078e2ee114866281d8d728c610d8cce8b3780edb1f7e010c49a5e20776c636ee'}], 'stac_extensions': ['label'], 'version': 1}
Could you elaborate on what this record is for?
That would be a STAC Collection record, which doesn't have assets. I'm not sure how your script navigated to that page, but links with the rel type "parent" or "collection" in an item will link to that item's collection.
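As a rough sketch, a script could tell the two record types apart and see which links lead back to the collection. This assumes, as in the record above, that Collection records carry no assets field. The helper names is_stac_item and collection_hrefs are hypothetical:

```python
def is_stac_item(record):
    """Heuristic: treat a record as an item only if it has an 'assets' dict.

    Assumption: Collection records, like the one shown above, have no
    'assets' field.
    """
    return isinstance(record.get('assets'), dict)


def collection_hrefs(item):
    """Return the hrefs of links whose rel type points at the item's collection."""
    return [
        link['href']
        for link in item.get('links', [])
        if link.get('rel') in ('parent', 'collection')
    ]
```

A download loop that only follows links with the rel type "next" and skips anything for which is_stac_item returns False should not end up processing a Collection record.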
Right. Just to clarify, I'm trying to repurpose the download code from the BigEarthNet notebook to download the SpaceNet datasets. However, it seems it might not be as simple as just replacing collectionId in the notebook (from bigearthnet_v1_labels to sn2_AOI_3_Paris), as I originally thought.
As a side note, do you have any examples of using the API to download the SpaceNet datasets? That would be really helpful, since it differs from the other datasets.
Just a follow-up question. How can I check the structure of the SpaceNet dataset in the API? I can't seem to find the labels.
Edit: I've observed a couple of things and wanted to know if it was intentional.
(Pdb) rc = requests.get('https://stac-api.radiant.earth/collections/sn2_AOI_3_Paris/items?limit=1000', headers=headers)
(Pdb) rc1 = requests.get('https://api.radiant.earth/mlhub/v1/collections/sn2_AOI_3_Paris/items?limit=1000', headers=headers)
(Pdb) rc1.json().keys()
*** json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
(Pdb) rc.json().keys()
dict_keys(['type', 'stac_extensions', 'context', 'numberMatched', 'numberReturned', 'features', 'links'])
- The link https://api.radiant.earth/mlhub/v1/collections (which is provided in the notebook) doesn't seem to support large limits, but https://stac-api.radiant.earth/collections does.
- Setting limit to a low value such as 10 won't return labels, as the initial records are images and label items only appear later on. Setting limit to 10 gives 10 images instead of 10 image-label pairs. This seems like a problem if you don't want to download the entire dataset and only want a subset. I suppose you could filter out source images, download the labels, and then download the imagery, since the labels link to the imagery, but that would still involve iterating over the entire dataset.
I'm wondering why the SpaceNet datasets alone are structured like this. The workflow specified in your tutorial makes a lot of sense, but it only works if the source imagery and labels are separate. Grouping the imagery and labels into one collection without image-label pairing makes it harder to get a subset and makes downloading the entire dataset tedious. Of course, I might be missing something obvious, such as querying only the labels from the dataset and downloading the images via the links field. If that's the case, please do let me know.
Hi Ashwin,
I pushed some fixes to the API this morning, which should fix the issue with large limits. Accessing the stac-api domain is not currently supported; it's only used for internal testing. The SpaceNet team created their first catalog for the SN2 challenge, which included both labels and imagery in the same collection. When we created the catalogs for the rest of the challenges, we kept the same format for consistency, as they would be using the catalogs we generated as well. For the SpaceNet dataset, the best path really is to iterate through the dataset and determine which items are labels and which are imagery.
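A minimal sketch of that iteration, stopping once a small subset of labels has been collected. It assumes that label items carry a labels asset with an href and that they point at their imagery through links with the rel type "source" (the convention in the STAC label extension, not confirmed for these collections in this thread). The helper names source_hrefs and take_label_pairs are hypothetical:

```python
def source_hrefs(label_item):
    """Return the hrefs of the imagery a label item points to.

    Assumption: source imagery is linked with rel == 'source'.
    """
    return [
        link['href']
        for link in label_item.get('links', [])
        if link.get('rel') == 'source'
    ]


def take_label_pairs(features, max_pairs):
    """Walk the features in order and collect up to max_pairs tuples of
    (labels href, list of source imagery hrefs), skipping non-label items."""
    pairs = []
    for feature in features:
        if len(pairs) >= max_pairs:
            break
        labels = (feature.get('assets') or {}).get('labels')
        if labels is None:
            continue  # source imagery or another record without labels
        pairs.append((labels['href'], source_hrefs(feature)))
    return pairs
```

Because features can be any iterable, it could be a generator that requests one page of items at a time, so iteration stops as soon as enough labels are found rather than after the whole collection has been read.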
Best,
Kevin