Comments (16)
Here is the source code from NASA for querying and exporting HLS data.
geojson.io for creating geojson
The burn scar scripts (https://github.com/NASA-IMPACT/hlsfm_burn_scar_pipeline) don't do things terribly efficiently. Here is how my scripts match up.
The bs script "0. subset_burn_shapefile.py" is specific to burn scars, so I don't copy it.
The bs script "1. save_HLS_query_dataframe.py" stores all potential download URLs for the AOI. This can basically be copied to my notebook "1_CDL_save_HLS_query.ipynb".
The bs script "2. create_HLS_masks_bulk.py" does MANY things (inefficiently, by downloading full HLS tiles for each geojson), so I split it up. My script "2a_CDL_create_HLS_masks_bulk.ipynb" takes care of file processing (loading the geojson for the AOI/chip, identifying the closest tile, downloading all images for that tile as HDF, extracting HDF metadata, and converting to TIF).
There are at least two issues with script (2a). First, the function to return cloud cover and spatial coverage metadata does not work; it returns an empty dictionary:
`nasa_hls.get_metadata_from_hdf(hdf_dir + local_name, fields=['cloud_cover', 'spatial_coverage'])`
Second, the function to convert from HDF to TIF does not work. I have a workaround for this as well, but it is not ideal:
`nasa_hls.convert_hdf2tiffs(Path(hdf_dir + local_name), Path(tiff_dir))`
I will also create a script (2b) that takes care of the cropping/masking of the .tif files (not done yet).
Thanks @mcecil.
Couple of things:
- What's the difference between `geojson_file` and `geojson_rpj_file` in `2a_CDL_preprocess_HLS_to_TIF.ipynb`? I want to run the code using a sample AOI but am not sure which of these I should replace.
- For simplicity of not having to change all the paths in the first cell of `2a_CDL_preprocess_HLS_to_TIF.ipynb`, define a `root_path` variable that we can set at the top, and just use that for all the file paths in the following.
- Did you run one sample HDF through `nasa_hls.convert_hdf2tiffs` to see if you get any error or missing CRS in the output GeoTIFF?
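For the `root_path` suggestion, something like this minimal sketch is what I have in mind (the directory layout and file names are hypothetical):

```python
from pathlib import Path

# Set once at the top of the notebook (path is hypothetical)
root_path = Path("/projects/hls_cdl")

# Everything else derives from root_path, so only one line changes per machine
geojson_file = root_path / "chips" / "chip_bboxes.geojson"
hdf_dir = root_path / "hdf"
tiff_dir = root_path / "tif"
```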
I would just use the base geojson, which I can share (in lat/long). I tried reprojecting it (in R), but I'm not sure it worked correctly. I'll push the geojson to the repo.
Yes, I can update the root_path.
I did run a sample HDF through `nasa_hls.convert_hdf2tiffs`, and it does some weird things. It attempts to create a folder for each image but does not populate it. Here is the error.
I've resolved the issues with the NASA_HLS functions.
- For `get_metadata_from_hdf` and `convert_hdf2tiffs`, I had to replace single quotes (') with double quotes (") for the command sent to the shell. I also had to do slight editing of the output for the metadata. For both functions, I had to create my own version of the function in the notebook.
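A minimal sketch of the kind of replacement I mean for the metadata function, reading the HDF global attributes with GDAL instead of the packaged shell call (the key matching is an assumption and is not the nasa_hls internals):

```python
from osgeo import gdal

def get_hdf_metadata(hdf_path, fields=("cloud_cover", "spatial_coverage")):
    """Workaround sketch: read scene-level metadata from the HDF global attributes."""
    ds = gdal.Open(str(hdf_path))
    meta = ds.GetMetadata()  # dict of global attributes exposed by the HDF4 driver
    out = {}
    for field in fields:
        # Match loosely, e.g. 'cloud_coverage' in the file vs 'cloud_cover' requested
        for key, value in meta.items():
            if field.split("_")[0] in key.lower():
                out[field] = value
                break
    return out
```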
The cropping/masking is more complicated. The workflow creates Boolean masks using the entire tile raster as a reference raster (so a large file), and there still seems to be an error with the georeferencing after cropping. I'm not sure if this would affect the DL model, as the error may exist for both the mask and the band layers.
In any case, I have got this to work using rasterio masking and cropping. It seems to work but has some edge effects (some pixels on the border are not in the mask).
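A minimal sketch of the rasterio masking/cropping approach I mean (file names and the GeoJSON layout are hypothetical; geometries must already be in the raster's CRS):

```python
import json
import rasterio
from rasterio.mask import mask

tif_path = "HLS.S30.T15STT.2020032.B04.tif"  # hypothetical band GeoTIFF
geojson_path = "chip_bbox.geojson"           # hypothetical chip polygon

with open(geojson_path) as f:
    shapes = [feat["geometry"] for feat in json.load(f)["features"]]

with rasterio.open(tif_path) as src:
    # crop=True trims the raster to the shapes' bounding box;
    # pixels outside the polygons become nodata (the edge effects show up here)
    out_image, out_transform = mask(src, shapes, crop=True, nodata=0)
    out_meta = src.meta.copy()
    out_meta.update(
        height=out_image.shape[1],
        width=out_image.shape[2],
        transform=out_transform,
        nodata=0,
    )

with rasterio.open("chip_masked.tif", "w", **out_meta) as dst:
    dst.write(out_image)
```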
For HLS image tracking:
- Create a pandas table with columns for: HLS tile, date, image name, month, cloud cover (%), spatial coverage (%)
- Save as CSV.
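A minimal sketch of that tracking table (column names and the example row are illustrative):

```python
import pandas as pd

# One row per HLS scene; the values below are placeholders
tracking = pd.DataFrame(
    columns=["tile", "date", "image_name", "month", "cloud_cover_pct", "spatial_coverage_pct"]
)

tracking.loc[len(tracking)] = [
    "T15STT", "2020-02-01", "HLS.S30.T15STT.2020032.v1.4", 2, 0, 100
]

tracking.to_csv("hls_image_tracking.csv", index=False)
```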
Final steps to follow:
- Load all chip bounding boxes from the GeoJSON file
- Find all the HLS tiles that need to be downloaded
- Download all scenes for all tiles (in HDF format)
- Filter HDF files based on scene-level cloud cover, and only keep the ones with 0% cloud cover
- Sort the dates of the remaining HDF scenes (remaining means 0% cloud) and select three dates: the first, the middle, and the last scene (a sketch follows this list).
- Convert the three selected scenes to GeoTIFF (in this case COG)
- Reproject the GeoTIFFs to CDL CRS.
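A minimal sketch of the cloud filtering and date selection steps, assuming the tracking CSV described earlier (column names are assumptions):

```python
import pandas as pd

tracking = pd.read_csv("hls_image_tracking.csv", parse_dates=["date"])

# Keep only fully cloud-free scenes, then sort chronologically
clear = (
    tracking[tracking["cloud_cover_pct"] == 0]
    .sort_values("date")
    .reset_index(drop=True)
)

# Pick the first, middle, and last clear scene
picks = clear.iloc[[0, len(clear) // 2, len(clear) - 1]]
print(picks[["tile", "date", "image_name"]])
```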
Let's cover the chipping of HLS and CDL in #5.
I've created a 'workflow' notebook in the rewrite branch that will go through all steps.
A weird issue is occurring. I've selected three images for conversion to COG, and one of them does not convert. It creates an empty folder in the tif directory, but no files. The other two HDFs convert fine.
The '007' image does not convert, while the '032' and '052' images do convert from HDF to COG. I tested with my old code and was able to get the '007' image to convert.
@mcecil please share the URL of the HDF file for the '007' image so I can try it on my end and see if I can debug.
Here is the bad file: https://hls.gsfc.nasa.gov/data/v1.4/S30/2020/15/S/T/T/HLS.S30.T15STT.2020007.v1.4.hdf
Subbing in days 32 and 52 should give files that work.
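A minimal reproduction sketch, assuming the same `convert_hdf2tiffs` call as in the notebook (the local directory names are hypothetical):

```python
import urllib.request
from pathlib import Path
import nasa_hls

url = "https://hls.gsfc.nasa.gov/data/v1.4/S30/2020/15/S/T/T/HLS.S30.T15STT.2020007.v1.4.hdf"
hdf_dir = Path("hdf")   # hypothetical local directories
tiff_dir = Path("tif")
hdf_dir.mkdir(exist_ok=True)
tiff_dir.mkdir(exist_ok=True)

local_hdf = hdf_dir / url.split("/")[-1]
urllib.request.urlretrieve(url, str(local_hdf))

# Fails for the 2020007 scene but works for 2020032 and 2020052
nasa_hls.convert_hdf2tiffs(local_hdf, tiff_dir)
```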
Not sure if it matters, but the reprojected HLS TIFs have a weird zero-data rectangle above the HLS values.
The images do align, though. I checked pixel alignment and also road overlap (so the HLS image is in the right place).
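For reference, a minimal sketch of the reprojection step that produces these TIFs, assuming rasterio and EPSG:5070 (CDL's Albers CRS) as the target; file names are hypothetical:

```python
import rasterio
from rasterio.warp import calculate_default_transform, reproject, Resampling

src_path = "HLS.S30.T15STT.2020032.B04.tif"  # hypothetical band GeoTIFF
dst_path = "HLS_B04_cdl_crs.tif"
dst_crs = "EPSG:5070"  # assumption: CDL's Albers Equal Area CRS

with rasterio.open(src_path) as src:
    transform, width, height = calculate_default_transform(
        src.crs, dst_crs, src.width, src.height, *src.bounds
    )
    meta = src.meta.copy()
    meta.update(crs=dst_crs, transform=transform, width=width, height=height, nodata=0)
    with rasterio.open(dst_path, "w", **meta) as dst:
        # Areas of the new grid not covered by the source are filled with nodata
        reproject(
            source=rasterio.band(src, 1),
            destination=rasterio.band(dst, 1),
            src_transform=src.transform,
            src_crs=src.crs,
            dst_transform=transform,
            dst_crs=dst_crs,
            resampling=Resampling.nearest,
        )
```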
to-do list:
- test with real candidate chips
- test on AWS, cloud storage
- get list of all HLS tiles
- selection of 3 candidate scenes should be on a per-tile basis
- check cloud cover threshold for different tiles
- confirm tracking method for HDF files, TIF conversion, chipping
- confirm months used (March - Sept?)
- Does QA band need to be included?
- how to deal with partial failures (i.e. HDF image not converting)
- band ordering for chips.
- put chipping into a function
For the cloud coverage issue, I tested one tile, T15STT. There were 0 images in Mar-Sept with 0% cloud cover; there were 7 images with <= 5% cloud cover.
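A minimal sketch of the per-tile threshold check behind those counts, again assuming the tracking-table columns (the thresholds are just examples):

```python
import pandas as pd

tracking = pd.read_csv("hls_image_tracking.csv", parse_dates=["date"])
season = tracking[tracking["date"].dt.month.between(3, 9)]  # March-September

# How many scenes per tile survive each cloud-cover threshold
for threshold in (0, 5, 10, 20):
    counts = (
        season[season["cloud_cover_pct"] <= threshold]
        .groupby("tile")["image_name"]
        .count()
    )
    print(f"<= {threshold}% cloud cover:")
    print(counts)
```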
> Not sure if it matters, but the reprojected HLS TIFs have a weird zero-data rectangle above the HLS values. The images do align, though. I checked pixel alignment and also road overlap (so the HLS image is in the right place).

@mcecil we forgot to talk about this in our call. This is the result of interpolation. I wouldn't worry about it.
https://nasa-openscapes.github.io/2021-Cloud-Hackathon/tutorials/02_Data_Discovery_CMR-STAC_API.html
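That tutorial covers searching HLS through the CMR-STAC API; a minimal sketch of such a query, assuming `pystac_client` and the LPCLOUD catalog (the collection name, bounding box, and date range are examples, not settled choices):

```python
from pystac_client import Client

# NASA's CMR-STAC endpoint for the LP DAAC cloud holdings
catalog = Client.open("https://cmr.earthdata.nasa.gov/stac/LPCLOUD")

search = catalog.search(
    collections=["HLSS30.v2.0"],        # HLS Sentinel-2 surface reflectance, v2.0
    bbox=[-94.0, 37.0, -93.0, 38.0],    # example AOI in lon/lat
    datetime="2020-03-01/2020-09-30",   # March-September window
)

for item in search.items():
    cloud = item.properties.get("eo:cloud_cover")
    print(item.id, cloud)
```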
@mcecil and @kordi1372, I just noticed we didn't close the issues on this report from the first version of the code that Mike developed.
It's best if we close these, since they are already implemented with v1.4 of the data, add a tag on GitHub to keep a record of the current working version of the code (let me know if you need help with this), and start a new set of issues for Fatemeh to update the code to use v2.0 of the data.