
HamedAlemo commented on July 20, 2024

Here is the source code from NASA for querying and exporting HLS data.


mcecil commented on July 20, 2024

You can use geojson.io for creating the GeoJSON files.


mcecil commented on July 20, 2024

The burn scar scripts (https://github.com/NASA-IMPACT/hlsfm_burn_scar_pipeline) don't do things terribly efficiently. Here is how my scripts match up.

[screenshot: table mapping the burn scar scripts to my notebooks]

The burn scar script "0. subset_burn_shapefile.py" is specific to burn scars, so I don't copy it.

The burn scar script "1. save_HLS_query_dataframe.py" stores all potential download URLs for the AOI. This can basically be copied into my notebook "1_CDL_save_HLS_query.ipynb".

The burn scar script "2. create_HLS_masks_bulk.py" does many things (inefficiently, by downloading full HLS tiles for each GeoJSON), so I split it up. My script "2a_CDL_create_HLS_masks_bulk.ipynb" takes care of file processing: loading the GeoJSON for the AOI/chip, identifying the closest tile, downloading all images for that tile as HDF, extracting HDF metadata, and converting to TIF.

There are at least two issues with script 2a. First, the function that returns cloud cover and spatial coverage metadata does not work; it returns an empty dictionary.

nasa_hls.get_metadata_from_hdf(hdf_dir+local_name, fields=['cloud_cover', 'spatial_coverage'])

Second, the function that converts HDF to TIF does not work. I have a workaround for this too, but it is not ideal.

nasa_hls.convert_hdf2tiffs(Path(hdf_dir+local_name), Path(tiff_dir))
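
One possible workaround is to loop over the HDF subdatasets with GDAL directly. This is only a rough sketch (not necessarily the workaround I ended up with), and it assumes a GDAL build with the HDF4 driver and that subdataset names end in the band identifier:

```python
from pathlib import Path
from osgeo import gdal

def hdf_to_tiffs(hdf_path: Path, tiff_dir: Path) -> None:
    """Write every subdataset of an HLS HDF4 file to its own GeoTIFF.

    Sketch only: assumes GDAL was built with the HDF4 driver and that
    subdataset names end with the band identifier (e.g. ':Grid:B04').
    """
    tiff_dir.mkdir(parents=True, exist_ok=True)
    ds = gdal.Open(str(hdf_path))
    for sds_name, _description in ds.GetSubDatasets():
        band = sds_name.split(":")[-1]              # e.g. "B04"
        out_file = tiff_dir / f"{hdf_path.stem}.{band}.tif"
        gdal.Translate(str(out_file), sds_name)     # one GeoTIFF per band
```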

I will also create a script (2b) that takes care of cropping/masking the .tif files (not done yet).


HamedAlemo commented on July 20, 2024

Thanks @mcecil.
A couple of things:

  • What's the difference between geojson_file and geojson_rpj_file in 2a_CDL_preprocess_HLS_to_TIF.ipynb? I want to run the code using a sample AOI but I'm not sure which of these I should replace.
  • So we don't have to change all the paths in the first cell of 2a_CDL_preprocess_HLS_to_TIF.ipynb, define a root_path variable that we can set at the top and use for all the file paths that follow (see the sketch below).
  • Did you run one sample HDF through nasa_hls.convert_hdf2tiffs to see whether you get any error or a missing CRS in the output GeoTIFF?
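
Something like this minimal sketch; the directory layout and location are just an example:

```python
from pathlib import Path

# Set once at the top of the notebook; all other paths derive from it.
root_path = Path("/data/cdl_hls")            # hypothetical location
hdf_dir = root_path / "hdf"
tiff_dir = root_path / "tif"
geojson_file = root_path / "chips.geojson"
```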


mcecil commented on July 20, 2024

I would just use the base GeoJSON, which I can share (in lat/long). I tried reprojecting it (in R) but I'm not sure it worked correctly. I'll push the GeoJSON to the repo.

Yes, I can update the root_path.

I did run a sample HDF through nasa_hls.convert_hdf2tiffs, and it does some weird things: it attempts to create a folder for each image but does not populate it. Here is the error.

[screenshot: error traceback from nasa_hls.convert_hdf2tiffs]


mcecil commented on July 20, 2024

I've resolved the issues with the nasa_hls functions.

  • For get_metadata_from_hdf and convert_hdf2tiffs, I had to replace single quotes (') with double quotes (") in the command sent to the shell (see the sketch below). I also had to do some slight editing of the metadata output. For both functions, I had to create my own version of the function in the notebook.
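
The quoting change looks roughly like this simplified sketch; this is not the actual nasa_hls source, and the subdataset name is an assumed HLS v1.4 HDF4-EOS identifier:

```python
import subprocess

hdf_file = "HLS.S30.T15STT.2020007.v1.4.hdf"   # hypothetical file name

# The subdataset path was originally wrapped in single quotes, which the
# shell here did not strip, so gdal_translate could not resolve the path.
# Wrapping it in double quotes instead fixes the call.
sds = f'HDF4_EOS:EOS_GRID:"{hdf_file}":Grid:B04'
subprocess.run(f'gdal_translate {sds} "B04.tif"', shell=True, check=True)
```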

The cropping/masking is more complicated. The existing workflow creates Boolean masks using the entire tile raster as a reference raster (so a large file), and there still seems to be an error with the georeferencing after cropping. I'm not sure whether this would affect the DL model, since the error may exist for both the mask and the band layers.

In any case, I have got this to work using rasterio masking and cropping. It seems to work but has some edge effects (some pixels on the border are not in the mask).
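
A sketch of the rasterio approach, assuming the chip geometries live in a GeoJSON; the file names are placeholders:

```python
import geopandas as gpd
import rasterio
from rasterio.mask import mask

chips = gpd.read_file("chips.geojson")        # hypothetical chip polygons

with rasterio.open("hls_band.tif") as src:
    # Reproject the chip geometries to the raster CRS before masking.
    shapes = chips.to_crs(src.crs).geometry
    out_image, out_transform = mask(src, shapes, crop=True)
    meta = src.meta.copy()
    meta.update(height=out_image.shape[1],
                width=out_image.shape[2],
                transform=out_transform)

with rasterio.open("chip.tif", "w", **meta) as dst:
    dst.write(out_image)
```

The border effects may be related to rasterio's default all_touched=False behavior, which only keeps pixels whose centers fall inside the geometry.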


mcecil commented on July 20, 2024

For HLS image tracking:

  • Create a pandas table with columns for HLS tile, date, image name, month, cloud cover (%), and spatial coverage (%).
  • Save it as a CSV (sketch below).
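
A minimal sketch of the tracking table; the column names are a suggestion and the row shown is a placeholder:

```python
import pandas as pd

columns = ["tile", "date", "image_name", "month",
           "cloud_cover_pct", "spatial_coverage_pct"]

# One row per downloaded HLS scene; the example values are placeholders.
records = [
    ("T15STT", "2020-01-07", "HLS.S30.T15STT.2020007.v1.4", 1, 0, 100),
]
tracking = pd.DataFrame(records, columns=columns)
tracking.to_csv("hls_image_tracking.csv", index=False)
```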


HamedAlemo commented on July 20, 2024

Final steps to follow:

  • Load all chip bounding boxes from the GeoJSON file.
  • Find all the HLS tiles that need to be downloaded.
  • Download all scenes for all tiles (in HDF format).
  • Filter the HDF files based on scene-level cloud cover, and only keep the ones with 0% cloud cover.
  • Sort the dates of the remaining (i.e. 0% cloud) HDF scenes, and select three dates: the first, the middle, and the last scene (see the sketch below).
  • Convert the three selected scenes to GeoTIFF (in this case COG).
  • Reproject the GeoTIFFs to the CDL CRS.
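
A sketch of the filter-and-select step, assuming the tracking table from the earlier comment (the column names are assumptions):

```python
# Keep only fully clear scenes, sort chronologically, and pick the
# first, middle, and last dates. Assumes at least three clear scenes.
clear = tracking[tracking["cloud_cover_pct"] == 0].sort_values("date")
dates = clear["date"].tolist()
selected_dates = [dates[0], dates[len(dates) // 2], dates[-1]]
```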

Let's cover the chipping of HLS and CDL in #5.


mcecil commented on July 20, 2024

I've created a 'workflow' notebook in the rewrite branch that goes through all the steps.

A weird issue is occurring. I've selected three images for conversion to COG, and one of them does not convert: it creates an empty folder in the tif directory, but no files. The other two HDFs convert fine.

The '007' image does not convert, while the '032' and '052' images do convert from HDF to COG. I tested with my old code and was able to get the '007' image to convert.

[screenshot: tif output folders for the three scenes]


HamedAlemo commented on July 20, 2024

@mcecil please share the URL of the HDF file for the 007 image so I can try it on my end and see if I can debug it.


mcecil commented on July 20, 2024

Here is the bad file: https://hls.gsfc.nasa.gov/data/v1.4/S30/2020/15/S/T/T/HLS.S30.T15STT.2020007.v1.4.hdf

Subbing in days 32 and 52 should give files that work.


mcecil commented on July 20, 2024

Not sure if it matters, but the reprojected HLS TIFFs have a weird 0-data rectangle above the HLS values.

The images do align, though. I checked pixel alignment and also road overlap (so the HLS image is in the right place).

[screenshot: reprojected HLS TIFF showing the 0-data rectangle]


mcecil commented on July 20, 2024

To-do list:

  • Test with real candidate chips.
  • Test on AWS / cloud storage.
  • Get a list of all HLS tiles.
  • Selection of the 3 candidate scenes should be on a per-tile basis.
  • Check the cloud cover threshold for different tiles.
  • Confirm the tracking method for HDF files, TIF conversion, and chipping.
  • Confirm the months used (March - September?).
  • Does the QA band need to be included?
  • How to deal with partial failures (i.e. an HDF image not converting)?
  • Band ordering for chips.
  • Put chipping into a function.

For the cloud cover issue, I tested one tile, T15STT. There were 0 images in March-September with 0% cloud cover, but there were 7 images with <= 5% (see the sketch below).
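
A sketch of the per-tile selection with the relaxed threshold, using the same assumed tracking-table columns as above:

```python
def pick_three(df):
    """First, middle, and last scene by date; assumes len(df) >= 3."""
    df = df.sort_values("date")
    return df.iloc[[0, len(df) // 2, -1]]

# Relax the scene-level threshold from 0% to <= 5% cloud cover and
# select three scenes separately for each HLS tile.
candidates = tracking[tracking["month"].between(3, 9)
                      & (tracking["cloud_cover_pct"] <= 5)]
selected = candidates.groupby("tile", group_keys=False).apply(pick_three)
```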


HamedAlemo commented on July 20, 2024

Regarding the 0-data rectangle above the HLS values in the reprojected TIFFs: @mcecil we forgot to talk about this in our call. This is the result of interpolation; I wouldn't worry about it.


kordi1372 commented on July 20, 2024

https://nasa-openscapes.github.io/2021-Cloud-Hackathon/tutorials/02_Data_Discovery_CMR-STAC_API.html


HamedAlemo commented on July 20, 2024

@mcecil and @kordi1372, I just noticed we didn't close the issues on this report from the first version of the code that Mike developed.
It's best if we close these, since they are already implemented with v1.4 of the data, add a tag on GitHub to keep a record of the current working version of the code (let me know if you need help with this), and start a new set of issues for Fatemeh to update the code to use v2.0 of the data.
