Coder Social home page Coder Social logo

smnorris / fwapg Goto Github PK

View Code? Open in Web Editor NEW
9.0 5.0 5.0 7 MB

PostgreSQL tools for working with British Columbia's Freshwater Atlas

Home Page: https://smnorris.github.io/fwapg

License: MIT License

Shell 6.28% PLpgSQL 82.01% Makefile 10.94% Dockerfile 0.76%
fwa british-columbia postgis freshwater-atlas linear-referencing watersheds rivers streams lakes

fwapg's People

Contributors

frantarkenton avatar smnorris avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

fwapg's Issues

FWA_WatershedAtMeasure QA

  • don't include polys with invalid codes
  • double check against fwakit.watersheds
  • test cut watersheds for validity

tidy upstream area scripts

Adding the upstream area attribute to streams is useful but can be taken from fwa_watersheds_upstream_area, no need to calculate upstream area twice

FWA_WatershedAtMeasure performance

Unsurprisingly, bigger areas take longer to build:

Sooke River @ mouth:

time psql -c "SELECT wscode_ltree FROM fwa_refinedwatershed(354153927, 0)"
real	0m1.366s
user	0m0.009s
sys	0m0.010s

Cowichan River @ 7.2km:

time psql -c "SELECT wscode_ltree FROM fwa_refinedwatershed(354155148, 7200)"
real	0m4.057s
user	0m0.007s
sys	0m0.006s

Running the Fraser River at measure of 1000m takes too long to wait for.

Fix for now is to adjust the query to use fwa_assessment_watersheds_poly as pre-aggregated areas. As a test, this takes ~47s for Fraser River at measure 1000m. This is acceptable for current use case, but is there a faster way to get the outer edge of the geometry collection? Also, using the assessment watersheds is only quick if only using that layer, there doesn't seem to be a quick way to select upstream assessment watersheds and then tack on the first order watersheds that are not upstream of that first selection.

A cache of some kind is probably the way to go but using a built in postgis function would be nice.

FWA_WatershedAtMeasure fails on Spillimacheen River

SELECT * FROM FWA_WatershedAtMeasure(356570193, 4999.9999);
ERROR:  more than one row returned by a subquery used as an expression
CONTEXT:  PL/pgSQL function fwa_watershedatmeasure(integer,double precision) line 26 at RETURN QUERY

FWA_Upstream - queries fail to use index when wrapped in the function

Calling something like this can be extremely slow, the function still doesn't seem to use the indexes reliably.

(4.5m for a query joining ~6000 points to streams, restricted to a single watershed group, but only 2.4s when using the raw upstream SQL)

-- using function, 4.5m
SELECT a.col1, sum(st_length(geom)) as length_upstream
FROM a
LEFT OUTER JOIN streams b
ON FWA_Upstream(
    a.blue_line_key,
    a.downstream_route_measure,
    a.wscode_ltree,
    a.localcode_ltree,
    b.blue_line_key,
    b.downstream_route_measure,
    b.wscode_ltree,
    b.localcode_ltree
   )

-- spell it out, 2.4s
SELECT a.col1, sum(st_length(geom)) as length_upstream
FROM a
LEFT OUTER JOIN streams b
ON 
    -- b is a child of a, always
  b.wscode_ltree <@ a.wscode_ltree
  AND
      -- conditional upstream join logic, based on whether watershed codes are equivalent
    CASE
      -- first, consider simple case - streams where wscode and localcode are equivalent
      -- this is all segments with equivalent bluelinekey and a larger measure
      -- (plus fudge factor)
       WHEN
          a.wscode_ltree = a.localcode_ltree AND
          (
              (b.blue_line_key <> a.blue_line_key OR
               a.downstream_route_measure < b.downstream_route_measure + .01)
          )
       THEN TRUE
       -- next, the more complicated case - where wscode and localcode are not equal
       WHEN
          a.wscode_ltree != a.localcode_ltree AND
          (
           -- higher up the blue line (plus fudge factor)
              (b.blue_line_key = a.blue_line_key AND
               a.downstream_route_measure < b.downstream_route_measure + .01 )
              OR
           -- tributaries: b wscode > a localcode and b wscode is not a child of a localcode
              (b.wscode_ltree > a.localcode_ltree AND
               NOT b.wscode_ltree <@ a.localcode_ltree)
              OR
           -- capture side channels: b is the same watershed code, with larger localcode
              (b.wscode_ltree = a.wscode_ltree
               AND b.localcode_ltree > a.localcode_ltree)
          )
        THEN TRUE
    END 

Tidy, modify, or remove function FWA_LengthInstream()

Although I don't think I'm currently using the function, the code in FWA_LengthInstream() is outdated. It is useful though, and could be used for fish habitat modelling. It should be updated to use the FWA_Upstream/ FWA_Downstream funcs directly.

Also, the func would probably be much more useful if it was generalized to be similar to those functions - simply return true if a feature is 'instream' between two points. Length can be derived from the returned features.

To find features between two locations (without creating a graph/making routing queries), find downstream path of both input locations and return only features that are not common to the two result sets:

Something like this, but with some kind of logic that ensures no results are returned when querying features that are not connected (I think this would connect features across the ocean):

WITH dnstr_a AS
(
    SELECT
      linear_feature_id,
      blue_line_key,
      downstream_route_measure,
      wscode_ltree,
      localcode_ltree,
      geom
    FROM whse_basemapping.fwa_stream_networks_sp s
    WHERE FWA_Downstream(360878058, 10524, '400.431358.815756'::ltree, '400.431358.815756.466791'::ltree,
    s.blue_line_key, s.downstream_route_measure, s.wscode_ltree, s.localcode_ltree)
),

dnstr_b AS
(
    SELECT
      linear_feature_id,
      blue_line_key,
      downstream_route_measure,
      wscode_ltree,
      localcode_ltree,
      geom
    FROM whse_basemapping.fwa_stream_networks_sp s
    WHERE FWA_Downstream(360881038, 5115, '400.431358.918528'::ltree, '400.431358.918528.104334'::ltree,
    s.blue_line_key, s.downstream_route_measure, s.wscode_ltree, s.localcode_ltree)
)

SELECT
  COALESCE(a.linear_feature_id, b.linear_feature_id) as linear_feature_id,
  COALESCE(a.geom, b.geom) as geom
FROM dnstr_a a
FULL OUTER JOIN dnstr_b b
on a.linear_feature_id = b.linear_feature_id
WHERE (a.linear_feature_id IS NULL AND b.linear_feature_id IS NOT NULL)
OR (a.linear_feature_id IS NOT NULL AND b.linear_feature_id IS NULL)

CUT method at measure very close to mouth of a river emptying into lake

SELECT * FROM fwa_watershedrefine(356568595, 21.6)
Screen Shot 2019-07-12 at 10 57 16 AM

Too many watersheds adjacent to the double line river are selected for the cut. And cutting is not necessary in this case, the point is at the start of the river waterbody where the fundamental watersheds terminate at the same location (although the waterbody does not start at measure 0 - we may be able to assume the fundamental watersheds terminate at the same location where fwa_watershed_code = local_watershed_code but this needs checking.)

basic handling of points in YT/NWT/AB/AK using hydrosheds

As per smnorris/bcbasins#12.

A function is needed that takes a point or coordinates and returns and aggregated shape including the hydroshed it is within plus those upstream. Just take the query from FWA_WatershedExBC():

-- For streams in other areas we use hydrosheds, which has far less detail
-- Find the hydroshed at the point of interest and work upstream from there
    WITH RECURSIVE walkup (hybas_id, geom) AS
        (
            SELECT hybas_id, wsd.geom
            FROM hydrosheds.hybas_lev12_v1c wsd
            INNER JOIN (select * FROM FWA_LocateAlong(blkey, meas)) as pt
            ON ST_Intersects(wsd.geom, pt.geom)

            UNION ALL

            SELECT b.hybas_id, b.geom
            FROM hydrosheds.hybas_lev12_v1c b,
            walkup w
            WHERE b.next_down = w.hybas_id
        )
    SELECT
      'hybas_na_lev12_v1c' AS source,
      ST_Union(w.geom) as geom
    FROM walkup w;

finding trans-boundary streams and border - stream intersections

When finding points where streams likely cross the border with fwa_upstreambordercrossings(blkey, meas), the fuzzy result of the -75m shift and the approximate location of the border created by create_fwa_approx_borders.sql can lead to border points being created in the wrong drainage, resulting in incorrect returns from subsequent calls to fwa_upstreambordercrossings.
For example on Ewart Creek, because the height of land is right on the border, the point is created in the adjacent drainage (Wall Creek):
Screen Shot 2019-07-15 at 12 55 34 PM

QGIS slow to list layers

QGIS is extremely slow to list spatial tables of general 'Geometry' type.
Work around this by specifying type, eg geometry(MultiLineString,3005)

Python dependencies

Note python dependency for the extras in README, provide a requirements.txt and/or environment.yml and remove pgdata dependency from the extras scripts.

upgrade to postgres 13 / postgis 3.1 / geos 3.9

Upgrading to pg13/postgis3.1/geos3.9 gives significant performance benefits. The fixed precision operations should also remove the need for the lostgis ST_Safe_ functions.

Also, because the ST_Safe_ functions do not work with postgres parallelization, some of the FWA functions that currently use them may not work under postgres 13 anyway - removing these functions is a priority.

download fwa.zip

This isn't exactly an issue with your scripts but I'm having trouble downloading the fwa.zip file required before running your scripts.

When I run

wget https://geobc.s3-us-west-2.amazonaws.com/FWA.zip

I get

--2020-05-05 20:01:35--  https://geobc.s3-us-west-2.amazonaws.com/FWA.zip
Resolving geobc.s3-us-west-2.amazonaws.com (geobc.s3-us-west-2.amazonaws.com)... 52.218.251.41
Connecting to geobc.s3-us-west-2.amazonaws.com (geobc.s3-us-west-2.amazonaws.com)|52.218.251.41|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2020-05-05 20:01:36 ERROR 403: Forbidden.

Can you confirm that you are still able to download from that link?
Cheers!

add new linear referencing functions for pg_featureserv

  1. Return a point at specified meters upstream
  2. Return multiple points at specified distance along stream with labels (0 starting at mouth and moving up)

I think a linear function would be useful too (provided a blue_line_key and two measures, return a single feature representing the stream between them) - but that is out of scope for now.

FWA_Upstream - improve comparison of lines and points

For locations on the same blue_line_key, the function compares point locations only (at the provided measures).
This gives a weird result if considering whether point b at measure 10 is upstream of line at measure 5, with length 10. The point is at the midpoint of the line but the function currently returns true for this instance, length/upstream measure is not considered.

v0.0.1

Other tools depend on fwapg and the FWA data is being continuously updated by GeoBC. To help with stability, fwapg releases should be created, with each version linked to a specific data extract.

FWA_WatershedAtMeasure still too slow

Peace @ Ft St John takes >5min. Ok for local processing but not so good for requests via pg_featureserv:

 LOG:  duration: 313465.312 ms  statement: select * from postgisftw.fwa_watershedatmeasure(359572348, 1590599);

FWA_WatershedAtMeasure - returns no results if no watersheds upstream

For example:

# SELECT * FROM  FWA_WatershedAtMeasure(356551270, 681);
 wscode_ltree | localcode_ltree | area_ha | refine_method | geom
--------------+-----------------+---------+---------------+------
(0 rows)

Terminus of this first order watershed poly is at downstream_route_measure = 606.3m

I can't recall if this is expected behaviour based on other watersheds but it doesn't seem right in this case.

encoding of gnis_name in FWA_NAMED_WATERSHEDS

Encoding of NAMED_WATERSHEDS_POLY column gnis_name in FWA.gpkg file dumped from the warehouse is off.

Appears to be an issue for that layer only, the gnis_name Barrière River comes through fine in the streams table.

$ ogrinfo FWA.gpkg -sql "SELECT gnis_name FROM FWA_NAMED_WATERSHEDS_POLY WHERE GNIS_NAME LIKE 'Barri%'"
INFO: Open of `FWA.gpkg'
      using driver `GPKG' successful.
Layer name: SELECT
Geometry: None
Feature Count: 7
Layer SRS WKT:
(unknown)
GNIS_NAME: String (80.0)
OGRFeature(SELECT):0
  GNIS_NAME (String) = Barri� River

extras - calculate upstream area for watersheds

Current script populates a column in the streams table. Because upstream of each watershed is needed as well, it is probably more efficient to calculate upstream area for each watershed and work with that.

clean outputs from hydrosheds

Some cleaning of the aggregated output might be required, there are some slivers in the test shape. Fixing #45 might be enough though.
Screen Shot 2021-04-08 at 4 30 19 PM

tests

A few basic tests for the various fwa_ functions would be valuable.

functions missing

Not all required functions are currently being added. Review and tidy.

FWA_WatershedAtMeasure output type

Currently the function RETURNS TABLE. This makes sense for the typical case where just requesting a single watershed, especially via the current api.

When using the db directly, it would be nice to use the function to generate watersheds within a more general/typical context.

eg - I have a bunch of IDs and want a set of results with watersheds for each:

SELECT
 stream_crossing_id,
 downstream_route_measure,
 blue_line_key,
 FWA_WatershedAtMeasure(blue_line_key, downstream_route_measure)
FROM bcfishpass.crossings
WHERE stream_crossing_id in (50152,62515,50181,50185,50261,62423,62516,197533,197534,197542,197555);

This gives a result like this, the output from the function is a nested table / values with watershed codes, refine method and geom:

62515	112	356547712	(300.625474.584724.233916.539602,300.625474.584724.233916.539602,444.72,DEM,<geom>)

Returning the watershed codes and area isn't really necessary, but returning the refine method as well as the output geom definitely is - perhaps return an array holding these two values rather than a table would be easier to work with?

FWA_Upstream/Downstream - tolerances and returning features at equivalent measures

Returning all streams upstream of a point location where there is a break in the stream network is a common query.
Currently, the stream feature at the measure provided will not be returned by FWA_Upstream() because measures of the stream and point are equivalent (+/- the provided tolerance). There should to be an option to return features with equivalent measures rather than tacking them on with another subquery.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.