
Comments (28)

mcecil commented on July 20, 2024

I've downloaded the 2020 layer from here. https://www.nass.usda.gov/Research_and_Science/Cropland/Release/index.php

HamedAlemo commented on July 20, 2024

@mcecil I have the data in #1. I will crop it to our AOI and send you the final GeoTIFF with the bounding boxes for each chip.
One thing I need though is a sample HLS file. We need to project these two datasets to the same CRS. Let's discuss in our meeting.

mcecil commented on July 20, 2024

We've decided that we will use the CDL CRS for all geospatial layers.

  • Hamed will create chip boundaries (GeoJSON) based on the CDL layer.
  • Mike will include a raster transform to reproject the HLS data to the CDL CRS. This will happen during the HDF to single-layer TIF conversion step (see the sketch below).
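
A minimal sketch of that reprojection step using rasterio (file names are placeholders; the actual conversion script may differ):

import rasterio
from rasterio.warp import calculate_default_transform, reproject, Resampling

hls_path = "hls_band.tif"          # placeholder: one single-layer HLS TIF
cdl_path = "cdl_2020.tif"          # placeholder: the CDL raster whose CRS we match
out_path = "hls_band_cdl_crs.tif"

with rasterio.open(cdl_path) as cdl:
    dst_crs = cdl.crs              # target CRS comes from the CDL layer

with rasterio.open(hls_path) as src:
    transform, width, height = calculate_default_transform(
        src.crs, dst_crs, src.width, src.height, *src.bounds)
    profile = src.profile.copy()
    profile.update(crs=dst_crs, transform=transform, width=width, height=height)
    with rasterio.open(out_path, "w", **profile) as dst:
        reproject(
            source=rasterio.band(src, 1),
            destination=rasterio.band(dst, 1),
            src_transform=src.transform, src_crs=src.crs,
            dst_transform=transform, dst_crs=dst_crs,
            resampling=Resampling.nearest)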

HamedAlemo commented on July 20, 2024

Following the update in #4 we will use this ticket for chip generation:

  • Loop over chip AOIs.
  • Load all three HLS scenes and clip them to the target chip AOI.
  • Export one file per time step with all bands merged together.
  • Clip the CDL to the corresponding chip AOI and export it as a TIF as well (a rough sketch of this loop is below).
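
A rough sketch of this loop, assuming the chip AOIs sit in a GeoJSON and each HLS date has already been converted to a multi-band TIF in the CDL CRS (all paths are placeholders). This version stacks the three dates into one merged file per chip; writing one file per date is the same loop with a different output step.

import geopandas as gpd
import numpy as np
import rasterio
from rasterio.mask import mask

chips = gpd.read_file("chip_bboxes.geojson")                 # placeholder: chip AOIs
hls_scenes = ["hls_t1.tif", "hls_t2.tif", "hls_t3.tif"]      # placeholder: one TIF per date
cdl_path = "cdl_2020_cdl_crs.tif"                            # placeholder: reprojected CDL

for idx, chip in chips.iterrows():
    geom = [chip.geometry]

    # Clip each HLS date to the chip AOI and stack the bands.
    stacked, transform, profile = [], None, None
    for scene in hls_scenes:
        with rasterio.open(scene) as src:
            data, transform = mask(src, geom, crop=True)
            stacked.append(data)
            profile = src.profile.copy()
    merged = np.concatenate(stacked, axis=0)                 # (dates * bands, H, W)
    profile.update(count=merged.shape[0], height=merged.shape[1],
                   width=merged.shape[2], transform=transform)
    with rasterio.open(f"chips/chip_{idx}_merged.tif", "w", **profile) as dst:
        dst.write(merged)

    # Clip the CDL to the same AOI and write the label chip.
    with rasterio.open(cdl_path) as src:
        label, label_transform = mask(src, geom, crop=True)
        label_profile = src.profile.copy()
    label_profile.update(count=1, height=label.shape[1],
                         width=label.shape[2], transform=label_transform)
    with rasterio.open(f"chips/chip_{idx}_mask.tif", "w", **label_profile) as dst:
        dst.write(label)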

mcecil commented on July 20, 2024

Other questions:

  • Do we want to scale reflectance values?
  • Do we want other bands (like NIR)? I think band 5 is red edge.

HamedAlemo commented on July 20, 2024

@mcecil

  • No scaling of reflectance values. Let's keep them as integers (float values would take much more disk space to store).
  • Good catch on the bands. It was my bad to suggest band B05. Let's go with B08 for now, which is NIR. We can only feed 4 bands at this time.

HamedAlemo commented on July 20, 2024

@mcecil two updates:

  • Following our discussion today, I reviewed the percentage of pixels with various unacceptable QA flags. Given the noise in the QA band, I suggest we only discard a scene (and consequently the chip) if any single unacceptable flag is present in more than 5% of the pixels of one scene (per time step). So don't look at this cumulatively over time or across all flags; only individual flags, at each time.
  • Let's use the following QA values as accepted ones (I have also pasted the code that I used to derive this).
    [0, 4, 32, 36, 64, 68, 96, 100, 128, 132, 160, 164, 192, 196, 224, 228]
import nasa_hls

# Build the accepted QA list: keep only the QA codes that have no cloud,
# cirrus, snow, or cloud-shadow flag set.
qa_table = nasa_hls.get_qa_look_up_table()
qa_table = qa_table[~qa_table["cloud"]]
qa_table = qa_table[~qa_table["cirrus"]]
qa_table = qa_table[~qa_table["snow"]]
qa_table = qa_table[~qa_table["cloud_shadow"]]
qa_table.index  # yields the accepted QA values listed above
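
A minimal sketch of the per-flag, per-scene check this implies, assuming each QA layer has been exported as a single-band TIF (the path and helper name are placeholders):

import numpy as np
import rasterio
import nasa_hls

qa_table = nasa_hls.get_qa_look_up_table()
FLAGS = ["cloud", "cirrus", "snow", "cloud_shadow"]

def scene_passes_qa(qa_path, threshold=0.05):
    # Fail the scene if any single flag covers more than 5% of its pixels;
    # flags are checked individually, not cumulatively.
    with rasterio.open(qa_path) as src:
        qa = src.read(1)
    for flag in FLAGS:
        flag_values = qa_table.index[qa_table[flag]]   # QA codes with this flag set
        if np.isin(qa, flag_values).mean() > threshold:
            return False
    return True

If any of a chip's three scenes fails this check, the whole chip is discarded.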

mcecil commented on July 20, 2024

output file names:
chips/"chip_"_merged.tif
chips/"chip_".mask.tif
chips_qa/"chip_"_qa.tif

Check band values for -1000 (bad values). If any pixels are bad, then discard the chip.

Any remaining negative values per band get converted to 0.
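
A small sketch of that check (the path and function name are placeholders):

import numpy as np
import rasterio

def clean_bands(merged_path):
    # Returns the cleaned array, or None if the chip should be discarded.
    with rasterio.open(merged_path) as src:
        data = src.read()               # integer reflectance, (bands, H, W)
    if (data == -1000).any():           # -1000 marks bad pixels: discard the chip
        return None
    return np.clip(data, 0, None)       # remaining negative values become 0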

mcecil commented on July 20, 2024

Mike to create binary output chips in a separate folder, "chips_binary". This will include both the HLS bands and the binary crop/non-crop mask.

mcecil commented on July 20, 2024

CDL crop classes [1,2,3,4,5,6,10,11,12,13,14,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,66,67,68,69,70,71,72,74,75,76,77,92,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,236,237,238,240,241,242,243,244,245,246,247,248,249,250,254]
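
A minimal sketch of deriving the binary crop/non-crop mask from a CDL chip using these classes (paths are placeholders, and only the first few codes are shown; use the full list above):

import numpy as np
import rasterio

# First few codes only; the full CDL crop class list is in the comment above.
CDL_CROP_CLASSES = [1, 2, 3, 4, 5, 6, 10, 11, 12, 13, 14, 21, 22, 23, 24]

def write_binary_mask(cdl_chip_path, out_path):
    # Map CDL codes to 1 (crop) / 0 (non-crop) and write to chips_binary.
    with rasterio.open(cdl_chip_path) as src:
        cdl = src.read(1)
        profile = src.profile.copy()
    binary = np.isin(cdl, CDL_CROP_CLASSES).astype("uint8")
    profile.update(dtype="uint8", count=1, nodata=None)
    with rasterio.open(out_path, "w", **profile) as dst:
        dst.write(binary, 1)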

mcecil commented on July 20, 2024

@HamedAlemo here are the NA counts for each chip. It looks like half of the chips have at least some NA values.

[image: NA counts per chip]

mcecil commented on July 20, 2024

In the one example I looked at, the NA value was in a very dark area (cloud shadow?), so clipping the band value to 0 (current output) might be reasonable.

mcecil commented on July 20, 2024

Another example, with QA flag.

We would remove 14 of 30 chips based on the 5% QA threshold.

Of the remaining 16 chips, 5 have some NA values.

[image: NA counts for the remaining chips]

HamedAlemo commented on July 20, 2024

@mcecil the 16 out of 30 for QA flags is reasonable. I had almost the same number when I picked 5%. For now, let's go with the most restrictive option to generate a v1 of the dataset and drop all chips that have NaN values. You should clip any pixel that is negative (not -1000 though) to 0, but no-data should be kept as no-data until we better understand why this is so common.

mcecil commented on July 20, 2024

Well, this means we will just drop any chips that have no data, correct?

So there is no question of keeping pixels with no data; we discard those chips entirely.

HamedAlemo commented on July 20, 2024

Yes.

mcecil commented on July 20, 2024

Great, just checked 3 tiles, and we would now keep 84 of 102, so a bit better.

Also, do we need to keep all the HDFs? They are taking a lot of space and will soon fill the C drive. If I just keep the 3 HDFs we use, it should be a lot better.

HamedAlemo commented on July 20, 2024

Oh no, delete all the extra HDFs.

mcecil commented on July 20, 2024

OK great, I thought we might be saving them for later.

mcecil commented on July 20, 2024

We also need to update the hls_hdf_to_cog.py script to include 'QA' bands. I've been changing this manually. This script is in the Docker file.

mcecil commented on July 20, 2024

Summarizing outstanding issues:

  • Update hls_hdf_to_cog.py script to include QA bands.
  • Check on chips that are assigned tile '01SBU'.
  • Consider what to do with HLS tiles that do not have 3 images with 100% spatial coverage and < 5% cloud cover (skipping these for now).
  • The 'workflow' notebook does not run well when you have to stop and start again, or when you only run a subset of tiles. I need to fix this.
  • Need to calculate per-band mean and SD across all chips (see the sketch below).
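
A minimal sketch of the pooled per-band statistics, assuming each merged chip stores its layers date-major (bands 1-4 for date 1, then date 2, then date 3); the glob pattern and layer ordering are assumptions:

import glob
import numpy as np
import rasterio

N_BANDS = 4                      # B, G, R, NIR pooled across all three dates
sums = np.zeros(N_BANDS)
sq_sums = np.zeros(N_BANDS)
counts = np.zeros(N_BANDS)

for path in glob.glob("chips/*_merged.tif"):
    with rasterio.open(path) as src:
        data = src.read().astype("float64")      # (dates * N_BANDS, H, W)
    for layer in range(data.shape[0]):
        band = layer % N_BANDS                   # fold the three dates onto 4 bands
        sums[band] += data[layer].sum()
        sq_sums[band] += (data[layer] ** 2).sum()
        counts[band] += data[layer].size

means = sums / counts
stds = np.sqrt(sq_sums / counts - means ** 2)    # pooled per-band mean and SD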

mcecil commented on July 20, 2024

Mike tasks

  • clean up repo DONE
  • implement reducing the spatial coverage threshold from 100 to 90 to 80, etc. DONE
  • include per-tile tracking for the 5 tasks (HDF download, TIF conversion, TIF reprojection, chipping, filtering) DONE
  • continue to exclude chips that have any NA values DONE
  • means and standard deviations should be pooled for all dates and calculated per band (so 4 total). DONE
  • fix chip tracking DONE
  • record dates of images for chips DONE

update readme with

  • instructions to run workflow notebook DONE
  • checking HLS tiles for weird things like "01SBU" DONE
  • add section called "Assumptions" (separate from "Instructions") that includes logic for chip generation DONE
  • anything else unclear

@HamedAlemo tasks:

  • Update hls_hdf_to_cog.py script to include QA bands.
  • Confirm means and standard deviations are per band (4 vs 12) in Thursday meeting

HamedAlemo commented on July 20, 2024

@mcecil I submitted a PR for the QA band naming fix in hls_hdf_to_cog.py (here). I will let you know when it's merged; then you should be able to rerun your container and it will automatically pull the updated code.

mcecil commented on July 20, 2024

Ok I've updated the scripts as mentioned, and will push this to Github today.

Things I have not done:

  • changing the code to download TIFs instead of HDF files. Because we use the HDF metadata for cloud coverage and spatial coverage, I'm not sure we actually want to download TIFs directly.

HamedAlemo commented on July 20, 2024

Sounds good, thanks @mcecil. We can address switching to the new TIFs instead of HDFs this week.

kordi1372 commented on July 20, 2024

It seems likely that we will have to use a different band number depending on what HLS product we are using.
The product HLSL30.002 has to use band 5 because there is no band 8 and the infrared band (NIR narrow) is equivalent to band 5, while the product HLSS30.002 has to use band 8A.
There is some ambiguity regarding Band 8 and Band 8A selection. Sentinel-2 Band 8A offers a spectral range that is fully compatible with Landsat 8 Band 5, so I think Band 8A should be used.
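
A small hypothetical helper reflecting that mapping (the product-ID checks are assumptions about the naming convention, not part of any existing script):

def nir_band(product_id):
    # Landsat-based HLS product: NIR is band 5; Sentinel-2-based product: NIR narrow is B8A.
    if "L30" in product_id:
        return "B05"
    if "S30" in product_id:
        return "B8A"
    raise ValueError(f"Unknown HLS product: {product_id}")

# e.g. nir_band("HLSS30.002") -> "B8A", nir_band("HLSL30.002") -> "B05"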

kordi1372 commented on July 20, 2024

https://lpdaac.usgs.gov/resources/e-learning/getting-started-cloud-native-hls-data-python/

HamedAlemo commented on July 20, 2024

Thanks @kordi1372. We will use the HLSS30.002 product (which has bands based on the Sentinel-2 sensor). So we need to select whatever the NIR band is in HLSS30.002, which seems to be B8A as indicated in the link you shared.
