go-eratosthenes / dhdt

Extracting topography from mountain glaciers, through the use of shadows cast by surrounding mountains.

Home Page: https://dhdt.readthedocs.io

License: Apache License 2.0

Python 99.04% TeX 0.96%
landsat8 rapideye sentinel-2 photohypsometry python

dhdt's Introduction


dhdt

Extracting topography from mountain glaciers, through the use of shadows cast by surrounding mountains. Imagery from optical satellite systems is used, covering all mountain ranges on Earth.

Installation

Download and access the package folder using git:

git clone https://github.com/GO-Eratosthenes/dhdt.git
cd dhdt

The dependencies are most easily installed with conda from the conda-forge channel (see Miniforge installers for a minimal Conda installation). Create and activate a virtual environment with all the required dependencies:

conda env create -n dhdt -f environment.yml
conda activate dhdt

Install dhdt using pip (add the -e option to install in development mode):

pip install .
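
To quickly check that the installation worked, you can try importing the package:

python -c "import dhdt"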

Contributing

If you want to contribute to the development of this package, have a look at the contribution guidelines.

License

Copyright (c) 2023, Netherlands eScience Center, Utrecht University

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Credits

This package was created with Cookiecutter and the NLeSC/python-template.

dhdt's People

Contributors

dicaearchus, fnattino, rogerkuou


dhdt's Issues

Datasets to fetch [OVERVIEW]

Auxiliary data:

  • RGI (v 7): #59
  • DEM (Copernicus): #60
  • Tiling scheme for Sentinel 2 (MGRS): #61
  • Coast shapefile: #62

Optical data:

  • Sentinel-2 data (L1C 10 m bands + all XML files + detector masks, L2A cloud mask) with cloud cover < 70% (a search sketch is given below)
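
A minimal search sketch for such scenes, assuming the Earth Search STAC endpoint mentioned elsewhere in this repository and the pystac-client library (endpoint, collection name and example coordinates are assumptions):

import pystac_client

# Hypothetical sketch: query a public STAC API for Sentinel-2 L1C scenes over a
# point of interest with less than 70% cloud cover.
client = pystac_client.Client.open("https://earth-search.aws.element84.com/v1")
search = client.search(
    collections=["sentinel-2-l1c"],
    intersects={"type": "Point", "coordinates": [-127.5, 62.3]},  # example location
    query={"eo:cloud_cover": {"lt": 70}},
)
for item in search.items():
    print(item.id, item.properties["eo:cloud_cover"])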

Find matching Sentinel-2 L1C and L2A items

Hi @dicaearchus, here is a snippet that finds matching scenes from the L1C and L2A catalogs:

I have tested it on the full data catalogs on Spider, and it indeed seems that we don't have L2A assets for all scenes before 2019; there are also some scenes in 2019 (and one in 2020) for which we only have L1C data.

I have also verified that there are no L1C scenes matching multiple L2A scenes using the employed criteria.

import fnmatch

import pystac


CATALOG_L1C_PATH = "/project/eratosthenes/Data/sentinel-2/sentinel2-l1c-small/catalog.json"
CATALOG_L2A_PATH = "/project/eratosthenes/Data/sentinel-2/sentinel2-l2a-small/catalog.json"
ITEM_ID_L1C = "S2A_MSIL1C_20230219T200451_R128_T09VWJ_20230219T214414"


def get_items(catalog_l1c, catalog_l2a, item_id_l1c):

    item_l1c = catalog_l1c.get_item(item_id_l1c, recursive=True)
    item_l2a = None

    platform = item_l1c.properties["platform"][-2:].upper()
    datetime = item_l1c.datetime.strftime("%Y%m%dT%H%M%S")
    tile_id = item_l1c.properties["s2:mgrs_tile"]
    item_l2a_id_match = f"S{platform}_MSIL2A_{datetime}_*_T{tile_id}_*"
    for item in catalog_l2a.get_all_items():
        if fnmatch.fnmatch(item.id, item_l2a_id_match):
            item_l2a = item
            break
    return item_l1c, item_l2a


def main():
    catalog_l1c = pystac.Catalog.from_file(CATALOG_L1C_PATH)
    catalog_l2a = pystac.Catalog.from_file(CATALOG_L2A_PATH)

    item_l1c, item_l2a = get_items(catalog_l1c, catalog_l2a, ITEM_ID_L1C)


if __name__ == "__main__":
    main()

Credentials for PyPI

Need to add PyPI credentials to GH repo secrets (after creating an account on PyPI)

inconsistent folder specification

The specification of the "data" folder is inconsistent:

in the Bologna examples it is DATA_DIR = "/project/eratosthenes/Data/",
while in handler_randolph it is RGI_DIR_DEFAULT = os.path.join('.', 'data', 'RGI').

Which one should be used?

different orbits are used for elevation change calculation

Suntraces are used without regard for which orbit the data comes from.
Since the re-orthorectification does not always work like a charm, errors easily propagate.
Hence a constraint on the orbit number or observation angle would be very helpful.
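
A minimal grouping sketch, assuming the Sentinel-2 product IDs follow the pattern shown elsewhere in this tracker (e.g. S2A_MSIL1C_20230219T200451_R128_T09VWJ_...), where the "_Rxxx_" token encodes the relative orbit:

from collections import defaultdict

def relative_orbit(item_id):
    # the "Rxxx" token in the product ID encodes the relative orbit number
    return next(tok for tok in item_id.split("_")
                if tok.startswith("R") and tok[1:].isdigit())

def group_by_orbit(item_ids):
    # only combine suntraces/scenes that share the same relative orbit
    groups = defaultdict(list)
    for item_id in item_ids:
        groups[relative_orbit(item_id)].append(item_id)
    return groups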

pdf of manuscript

It can be removed from the repository; the GitHub action will generate a new PDF every time the paper files are updated. The PDF can be found under the Actions tab, in the "Artifacts" section (currently not there; the next modification will trigger its generation).

Registration question

Hello,

I'm new to registration. I tried testing with my own dataset and found that the frequency-based subpixel results are not very good (unstable), while the spatial subpixel results are better (get_top_moment or get_top_2d_gaussian). The paper mostly states that phase correlation plus curve fitting/line fitting works best, but there is no open-source implementation of that curve/line fitting. Could you recommend a more accurate and stable method for registration?

Thx

ftps

The ftps library is not mentioned in the requirements. It also appears to be no longer maintained, and if I remember correctly, it gave quite a few issues (problems in the underlying pycurl?). Should we replace it or drop it?

Deprecation warning in `read_sentinel2`

/data/volume_2/fnattino/RedGlacier/preprocessing/notebooks/dhdt/input/read_sentinel2.py:987: DeprecationWarning: parsing timezone aware datetimes is deprecated; this will raise an error in the future
  rec_time = np.datetime64(att.text, 'ns')
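
A possible fix (a sketch, assuming att.text is an ISO timestamp ending in the "Z" UTC designator): strip the timezone before handing the string to numpy, so it is parsed as a naive UTC timestamp:

import numpy as np

# example value standing in for att.text
text = "2023-02-19T20:04:51.024Z"
# np.datetime64 deprecated parsing timezone-aware strings; drop the trailing
# "Z" and parse the remainder as a naive UTC timestamp instead
rec_time = np.datetime64(text.rstrip("Z"), "ns")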

MGRS-related functions

The following functions require the MGRS tiling GeoJSON. The default location where they look for the file should be the same default location where it is downloaded (./data/MGRS?):

They could be moved to dhdt/auxiliary/handler_mgrs.py, together with the function that downloads and converts the KML tiling file.

Masking data in `match_pair`

match_pair from dhdt.processing.coupling_tools deals with arrays and masks, which can be passed as independent objects or as masked arrays. The function assumes that True values in the mask stand for valid points, but this is the opposite of the masked-array convention. This should be fixed and/or the chosen convention should be explained in the docs.
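
For reference, a minimal sketch of going from the numpy masked-array convention (True = invalid) to the convention match_pair currently assumes (True = valid):

import numpy as np

arr = np.ma.masked_invalid(np.array([[1.0, np.nan], [3.0, 4.0]]))
# masked arrays flag *invalid* entries with True, so invert the mask to obtain
# a "True means valid" array as currently expected by match_pair
valid = ~np.ma.getmaskarray(arr)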

Memory leak OSRSpatialReferenceShadow ?

Warning when running dhdt/examples/Brintnell-Bologna-icefield/03-preprocess-imagery.py:

swig/python detected a memory leak of type 'OSRSpatialReferenceShadow *', no destructor found.

Release

After tests are in place and installation verified, we should release version 0.1

MGRS tiling module

Create a module to:

  • fetch the Sentinel-2 tiling scheme from Copernicus (see the link to the KML file provided here).
  • clean geometries (extract bounds from geometry arrays).
  • save it in an easier-to-handle format (GeoJSON).

The module is not yet present; it is to be added as dhdt/auxiliary/handler_mgrs.py

Part of #52

Functionalities currently in RedGlacier repo

Potential functionalities that could be extracted from the RedGlacier scripts and notebooks and added to this repository:

  • Image retrieval:
    • Search and save metadata from the matching Sentinel-2 tiles in STAC format, either by talking directly to a STAC endpoint (Earth Search service on AWS) or by combining metadata from the Copernicus Open Access Hub and data from Google Cloud Storage. See here.
    • Download actual satellite images. See here.
  • Preprocessing:
    • Retile the DEM according to the Sentinel-2 tiling scheme. Link
    • Rasterise the RGI and extract bounding box related to a specific glacier ID. Link
    • Rasterize and retile coastline shape according to Sentinel-2 scheme. Link
    • Compute the cloud cover mask using s2cloudless (a rough sketch follows this list). Link.
    • Generate an output STAC catalog where to store preprocessing results. Link.
    • Run preprocessing: compute shadow-enhanced image, coregister with artificial image, extract shadows, extract caster-casted lines. Illustrative notebook and script.
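
As an illustration of the s2cloudless step above, a rough sketch (the band subset, scaling and parameter values here are illustrative assumptions, not the settings used in RedGlacier):

import numpy as np
from s2cloudless import S2PixelCloudDetector

# reflectances scaled to 0-1, shape (n_scenes, height, width, 10) for the band
# subset B01, B02, B04, B05, B08, B8A, B09, B10, B11, B12 (all_bands=False)
bands = np.random.rand(1, 256, 256, 10)

detector = S2PixelCloudDetector(threshold=0.4, average_over=4, dilation_size=2,
                                all_bands=False)
cloud_prob = detector.get_cloud_probability_maps(bands)
cloud_mask = detector.get_cloud_masks(bands)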

Handling of paths

Can we use paths to data files that are independent of the SAFE format?

Green vs Blue band in entropy shadow enhancement method

Hi @dicaearchus, a question about the "entropy" shadow enhancement method. I see that it works with only two visible bands plus NIR. From the docstring of the entropy_shade_removal function I gather that it should take the highest-frequency band as its first input argument, which for the 10 m bands would be the Blue band ("B02"). But instead the Green band ("B03") seems to be passed here:

https://github.com/GO-Eratosthenes/start-code/blob/1ebf1b59e0d78d4bd4c00ce1a2126755f25b2a0b/eratosthenes/preprocessing/shadow_transforms.py#L95

Is this intended, and does it work better for some reason?

Deprecation warnings from docstrings with math

Docstrings that include backslashes (e.g. docstrings containing formulas) give deprecation warnings, see e.g.:

/home/eratosthenes-fnattino/dhdt/dhdt/processing/matching_tools_differential.py:101: DeprecationWarning: invalid escape sequence \p

The docstring convention seems to suggest that raw strings should be used (r""" ... """):

Use r"""raw triple double quotes""" if you use any backslashes in your docstrings.
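
A minimal illustration (the function name is just a placeholder):

def phase_slope(Q):
    r"""Estimate the displacement from the cross-spectrum phase.

    The leading ``r`` makes this a raw docstring, so backslashes in formulas
    like :math:`\phi = \arctan\left(\Im(Q)/\Re(Q)\right)` stay literal and no
    "invalid escape sequence" DeprecationWarning is raised.
    """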

Define root of `data` folder

In handler_rgi and handler_sentinel2, the default data dir is defined in the root package dir. This is fine when working in development mode, but if dhdt is installed with pip, files will end up in system directories that are not easy to access. So it is probably better to define "./data" as the default location for data.
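
A minimal sketch of such a default (the DHDT_DATA_DIR environment variable is a hypothetical override, not an existing setting):

import os

# default to "./data" relative to the working directory, with an optional
# environment-variable override (hypothetical variable name)
DATA_DIR_DEFAULT = os.environ.get("DHDT_DATA_DIR", os.path.join(".", "data"))
RGI_DIR_DEFAULT = os.path.join(DATA_DIR_DEFAULT, "RGI")
MGRS_DIR_DEFAULT = os.path.join(DATA_DIR_DEFAULT, "MGRS")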

error in MGRS union

The function get_generic_s2_raster gives an error when run:

File ".../dhdt/dhdt/generic/handler_sentinel2.py", line 268, in get_generic_s2_raster
geom_merged = shapely.unary_union(geom_utm, grid_size=0.001)
AttributeError: module 'shapely' has no attribute 'unary_union'

Any idea what could have gone wrong @rogerkuou ?
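
This looks like it could be a Shapely version issue: the top-level shapely.unary_union function (and its grid_size argument) only exists in Shapely >= 2.0. A version-tolerant sketch (geom_utm here stands in for the geometries in the traceback):

import shapely
from shapely.geometry import box
from shapely.ops import unary_union

geom_utm = [box(0, 0, 1, 1), box(0.5, 0.5, 1.5, 1.5)]  # placeholder geometries

if hasattr(shapely, "unary_union"):
    # Shapely >= 2.0: top-level function with grid_size support
    geom_merged = shapely.unary_union(geom_utm, grid_size=0.001)
else:
    # Shapely 1.x: fall back to shapely.ops.unary_union (no grid_size)
    geom_merged = unary_union(geom_utm)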


Structuring and getting an overview

Dear all,

In the process of getting a grip on the structure of the code, I am looking for tools to get an overview.
I found some examples, like Mermaid, as shown below.
Would it be an idea to have something like this already available for the code?
Because I am very much lost at this point.

The example below is for an image matching function which I have implemented:

```mermaid
graph TD
A[fa:fa-camera-retro I1] -->|mat_to_gray| B(fa:fa-camera-retro I1)
B -->|Fourier transform| C[S1]
C -->J

D[fa:fa-subscript beta] -->|raised_cosine| E[fa:fa-filter W1]
E --> J

G[fa:fa-camera-retro I2] -->|mat_to_gray| H(fa:fa-camera-retro I2)
H -->|Fourier transform| I(S2)
I -->|cross spectrum| J{Q}

F[fa:fa-subscript beta] -->|raised cosine| K[fa:fa-filter W2]
K --> J
J -->|normalize to unit vectors| L(Qn)

C -->|absolute| M(S1a)
M -->|thresh_masking| N(fa:fa-filter Ws)
N -->|two_point_step_size| O{fa:fa-subscript m, snr}
L --> O
```

Coastal module

Fetch, rasterise and retile coastal data (GSHHS_f_L1.geojson).

Current module: dhdt/auxiliary/handler_coastal.py

Part of #52

Mosaic DEM tiles

Hi @dicaearchus - I managed to get the retiling of the DEM to work with the following code snippet:

import glob
import affine
import matplotlib.pyplot as plt
import pyproj
import rioxarray

from rasterio.enums import Resampling
from rioxarray.merge import merge_arrays


def mosaic_tiles(dem_tiles_filenames, shape, crs, transform):
    merged = None
    for f in dem_tiles_filenames:
        dem_tile = rioxarray.open_rasterio(f)

        # reproject to image CRS
        dem_tile_xy = dem_tile.rio.reproject(crs,
                                             transform=transform,
                                             resampling=Resampling.lanczos,
                                             shape=shape)
        if merged is None:
            merged = dem_tile_xy
        else:
            merged = merge_arrays([merged, dem_tile_xy])
    return merged


def main():
    # input
    transform = affine.Affine.from_gdal(
        399960.0, 10.0, 0.0, 6700020.0, 0.0, -10.0
    )
    crs = pyproj.CRS.from_epsg(32605)
    shape = (10980, 10980)
    filenames = glob.glob("data/Copernicus_DSM_10*")

    # compute mosaic of tiles
    tile = mosaic_tiles(
        dem_tiles_filenames=filenames,
        shape=shape,
        crs=crs,
        transform=transform,
    )

    # make plot
    tile.squeeze().plot.imshow(vmin=0, vmax=3000)
    plt.show()
    return


if __name__ == "__main__":
    main()

The issue was the missing shape argument in the call to dem_tile.rio.reproject: if the output shape is not specified, the shape of the input array is used for the output array as well. In our case the lat/lon input tiles have rather coarse resolution, so missing the shape argument led to only a few values around the top-left corner and an empty output for all but one tile.

When the output of dem_tile.rio.reproject is fixed with the correct transform/shape, pad_box and clip_box are no longer necessary, since the reprojected raster already has the target coords/shape.

Output of the code above: (screenshot of the mosaicked DEM plot, 2022-03-24)

Problematic geometries for tiles next to the antimeridian line

The Copernicus KML file that defines the MGRS tiles has some problematic geometries for the tiles crossing the antimeridian line. Once converted to the UTM CRS, the tile boundaries are not "square" -- see e.g. the following snippet:

import geopandas as gpd

from fiona.drvsupport import supported_drivers
supported_drivers["KML"] = "rw"

gdf = gpd.read_file("https://sentinel.esa.int/documents/247904/1955685/S2A_OPER_GIP_TILPAR_MPC__20151209T095117_V20150622T000000_21000101T000000_B00.kml", driver="KML")

tile = gdf[gdf.Name == "01CCV"]

# get EPSG from description field
print(tile.Description.item())

tile.to_crs("EPSG:32701").unary_union

Output:

'TILE PROPERTIES<br><table border=0 cellpadding=0 cellspacing=0  width=250 style="FONT-SIZE: 11px; FONT-FAMILY: Verdana, Arial, Helvetica, sans-serif;"><tr><td bgcolor="#E3E1CA" align="right"><font COLOR="#000000"><b>TILE_ID</b></font></td><td bgcolor="#E4E6CA"> <font COLOR="#008000">01CCV</font></td></tr><tr><td bgcolor="#E3E1CA" align="right"><font COLOR="#000000"><b>EPSG</b></font></td><td bgcolor="#E4E6CA"> <font COLOR="#008000">32701</font></td></tr><tr><td bgcolor="#E3E1CA" align="right"><font COLOR="#000000"><b>MGRS_REF</b></font></td><td bgcolor="#E4E6CA"> <font COLOR="#008000">-72.01265627 177.18928449 -72.077536527 -179.91249232 -72.972797991 179.93922476 -72.9043013 176.89507973</font></td></tr><tr><td bgcolor="#E3E1CA" align="right"><font COLOR="#000000"><b>UTM_WKT</b></font></td><td bgcolor="#E4E6CA"> <font COLOR="#008000">MULTIPOLYGON(((300000 2000020,300000 1890220,409800 1890220,409800 2000020,300000 2000020)))</font></td></tr><tr><td bgcolor="#E3E1CA" align="right"><font COLOR="#000000"><b>LL_WKT</b></font></td><td bgcolor="#E4E6CA"> <font COLOR="#008000">MULTIPOLYGON(((177.189340361676 -72.0124778858137,176.864623786205 -72.9914734627827,-179.775147078577 -73.064632943735,-179.627444421211 -72.081397340079,177.189340361676 -72.0124778858137)))</font></td></tr></table>'


RGI module

Fetch, rasterise and retile the RGI (version 6 or 7), verifying that it works with v7.

Current module: dhdt/auxiliary/handler_rgi.py

Clean up the use of OGR by using GeoPandas instead.

Part of #52

Sentinel 2 tiling grid

Functions in dhdt.generic.handler_sentinel2 assume a GeoJSON tiling scheme file is present.

Copernicus provides the gridding scheme via this KML file: https://sentinels.copernicus.eu/documents/247904/1955685/S2A_OPER_GIP_TILPAR_MPC__20151209T095117_V20150622T000000_21000101T000000_B00.kml

Should one have a function to get this file (and convert it to GeoJSON if better suited)?

In general it would be nice to have functions to retrieve all the needed auxiliary files (this one, the RGI index, etc.) and an (optional) overarching function to retrieve all auxiliary files needed by the package. A rough conversion sketch for the KML file follows.
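
A rough sketch of fetching and converting the tiling scheme with GeoPandas, mirroring the snippet in the antimeridian issue above (the output path and column selection are assumptions):

import geopandas as gpd
from fiona.drvsupport import supported_drivers

# allow fiona/geopandas to read KML
supported_drivers["KML"] = "rw"

KML_URL = ("https://sentinels.copernicus.eu/documents/247904/1955685/"
           "S2A_OPER_GIP_TILPAR_MPC__20151209T095117_V20150622T000000_21000101T000000_B00.kml")

gdf = gpd.read_file(KML_URL, driver="KML")
# keep the tile name and geometry, write to an easier-to-handle format
gdf[["Name", "geometry"]].to_file("data/MGRS/sentinel2_tiles.geojson", driver="GeoJSON")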

Deprecated `scipy.ndimage.interpolation`

dhdt/postprocessing/solar_tools.py:234: DeprecationWarning: Please use `rotate` from the `scipy.ndimage` namespace, the `scipy.ndimage.interpolation` namespace is deprecated.
  Ms = ndimage.interpolation.rotate(Mrs, -az, axes=(1, 0),
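
The warning points at the fix: call rotate from the top-level scipy.ndimage namespace instead (a minimal sketch with placeholder inputs):

import numpy as np
from scipy import ndimage

Mrs, az = np.eye(5), 30.0  # placeholders for the shadow raster and azimuth
# rotate is available directly from scipy.ndimage; the deprecated
# scipy.ndimage.interpolation namespace is no longer needed
Ms = ndimage.rotate(Mrs, -az, axes=(1, 0))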

`match_pair` fails

I could trace this back to i1, j1, i2, j2 being floats instead of integers, which seems to derive from map2pix (in pad_images_and_filter_coord_list) returning floats for i and j. Could we make it such that the image-based coordinates (array indices) returned by map2pix are integers, @dicaearchus? (A possible workaround is sketched after the traceback below.)

Full error traceback:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [19], in <cell line: 4>()
      2 print('start matching')
      3 sample_X, sample_Y = pix2map(geoTransform, sample_I, sample_J)
----> 4 match_X, match_Y, match_score = match_pair(Shd, Si,
      5                                            stable.data, stable.data,
      6                                            geoTransform, geoTransform,
      7                                            sample_X, sample_Y,
      8                                            temp_radius=WINDOW_SIZE,
      9                                            search_radius=WINDOW_SIZE,
     10                                            correlator='ampl_comp',
     11                                            subpix='moment',
     12                                            metric='peak_entr')  # 'robu_corr'
     14 dY, dX = sample_Y - match_Y, sample_X - match_X
     15 IN = ~np.isnan(dX)

File /data/volume_2/fnattino/RedGlacier/preprocessing/notebooks/dhdt/processing/coupling_tools.py:172, in match_pair(I1, I2, L1, L2, geoTransform1, geoTransform2, X_grd, Y_grd, temp_radius, search_radius, correlator, subpix, processing, boi, precision_estimate, metric, **kwargs)
    169 for counter in tqdm(range(len(i1))): # loop through all posts
    170     # create templates
    171     if correlator in frequency_based:
--> 172         I1_sub = create_template_off_center(I1,i1[counter],j1[counter],
    173                                             2*temp_radius)
    174         I2_sub = create_template_off_center(I2,i2[counter],j2[counter],
    175                                             2*search_radius)
    176         L1_sub = create_template_off_center(L1,i1[counter],j1[counter],
    177                                             2*temp_radius)

File /data/volume_2/fnattino/RedGlacier/preprocessing/notebooks/dhdt/processing/coupling_tools.py:533, in create_template_off_center(I, i, j, width, filling)
    531     I_sub = I[i-radius[0]:i+radius[1], j-radius[0]:j+radius[1], :]
    532 else:
--> 533     I_sub = I[i-radius[0]:i+radius[1], j-radius[0]:j+radius[1]]
    534 return I_sub

TypeError: slice indices must be integers or None or have an __index__ method
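
A possible workaround until map2pix returns integer indices (a sketch; i1, j1, i2, j2 are the coordinate arrays named above, and whether to round or floor them is a design choice):

import numpy as np

i1 = j1 = i2 = j2 = np.array([10.4, 200.6])  # placeholders for the map2pix output

# cast the image coordinates to integer array indices before they are used
# to slice the templates in create_template_off_center
i1, j1 = np.round(i1).astype(int), np.round(j1).astype(int)
i2, j2 = np.round(i2).astype(int), np.round(j2).astype(int)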

DEM module

Fetch, retile and merge Copernicus DEM.

Current module: dhdt/auxiliary/handler_copernicusdem.py

Try to clean up the use of ftps (a stale library that is no longer updated); use something else? (see #48)

Part of #52

Issue with installation of `morphsnake`

In order to install the morphsnake dependency, one first needs to have numpy and scipy installed, which means that installing the start-code repository requires an environment with gdal, numpy and scipy already in place.
