smnorris / fwapg Goto Github PK
View Code? Open in Web Editor NEWPostgreSQL tools for working with British Columbia's Freshwater Atlas
Home Page: https://smnorris.github.io/fwapg
License: MIT License
PostgreSQL tools for working with British Columbia's Freshwater Atlas
Home Page: https://smnorris.github.io/fwapg
License: MIT License
fwakit.watersheds
poissonconsulting/fwatlasbc#23
Not sure how/if pg_featureserv can handle warnings but otherwise this should be simple.
Provincial layers previously too large for the distribution service are now directly downloadable via the DataBC catalogue.
To support loading via GUI supplied by postgis-baselayers, the script could potentially be converted to a make file, see https://github.com/kokoalberti/postgis-baselayers/blob/master/app/datasets/naturalearth/Makefile
Adding the upstream area attribute to streams is useful but can be taken from fwa_watersheds_upstream_area
, no need to calculate upstream area twice
Unsurprisingly, bigger areas take longer to build:
Sooke River @ mouth:
time psql -c "SELECT wscode_ltree FROM fwa_refinedwatershed(354153927, 0)"
real 0m1.366s
user 0m0.009s
sys 0m0.010s
Cowichan River @ 7.2km:
time psql -c "SELECT wscode_ltree FROM fwa_refinedwatershed(354155148, 7200)"
real 0m4.057s
user 0m0.007s
sys 0m0.006s
Running the Fraser River at measure of 1000m takes too long to wait for.
Fix for now is to adjust the query to use fwa_assessment_watersheds_poly
as pre-aggregated areas. As a test, this takes ~47s for Fraser River at measure 1000m. This is acceptable for current use case, but is there a faster way to get the outer edge of the geometry collection? Also, using the assessment watersheds is only quick if only using that layer, there doesn't seem to be a quick way to select upstream assessment watersheds and then tack on the first order watersheds that are not upstream of that first selection.
A cache of some kind is probably the way to go but using a built in postgis function would be nice.
SELECT * FROM FWA_WatershedAtMeasure(356570193, 4999.9999);
ERROR: more than one row returned by a subquery used as an expression
CONTEXT: PL/pgSQL function fwa_watershedatmeasure(integer,double precision) line 26 at RETURN QUERY
We should probably be adding M values to the geoms on load (rather than calculating them on the fly whenever doing linear referencing):
http://postgis.net/docs/ST_AddMeasure.html
http://postgis.net/docs/ST_LocateAlong.html
http://postgis.net/docs/ST_LocateBetween.html
Calling something like this can be extremely slow, the function still doesn't seem to use the indexes reliably.
(4.5m for a query joining ~6000 points to streams, restricted to a single watershed group, but only 2.4s when using the raw upstream SQL)
-- using function, 4.5m
SELECT a.col1, sum(st_length(geom)) as length_upstream
FROM a
LEFT OUTER JOIN streams b
ON FWA_Upstream(
a.blue_line_key,
a.downstream_route_measure,
a.wscode_ltree,
a.localcode_ltree,
b.blue_line_key,
b.downstream_route_measure,
b.wscode_ltree,
b.localcode_ltree
)
-- spell it out, 2.4s
SELECT a.col1, sum(st_length(geom)) as length_upstream
FROM a
LEFT OUTER JOIN streams b
ON
-- b is a child of a, always
b.wscode_ltree <@ a.wscode_ltree
AND
-- conditional upstream join logic, based on whether watershed codes are equivalent
CASE
-- first, consider simple case - streams where wscode and localcode are equivalent
-- this is all segments with equivalent bluelinekey and a larger measure
-- (plus fudge factor)
WHEN
a.wscode_ltree = a.localcode_ltree AND
(
(b.blue_line_key <> a.blue_line_key OR
a.downstream_route_measure < b.downstream_route_measure + .01)
)
THEN TRUE
-- next, the more complicated case - where wscode and localcode are not equal
WHEN
a.wscode_ltree != a.localcode_ltree AND
(
-- higher up the blue line (plus fudge factor)
(b.blue_line_key = a.blue_line_key AND
a.downstream_route_measure < b.downstream_route_measure + .01 )
OR
-- tributaries: b wscode > a localcode and b wscode is not a child of a localcode
(b.wscode_ltree > a.localcode_ltree AND
NOT b.wscode_ltree <@ a.localcode_ltree)
OR
-- capture side channels: b is the same watershed code, with larger localcode
(b.wscode_ltree = a.wscode_ltree
AND b.localcode_ltree > a.localcode_ltree)
)
THEN TRUE
END
Although I don't think I'm currently using the function, the code in FWA_LengthInstream()
is outdated. It is useful though, and could be used for fish habitat modelling. It should be updated to use the FWA_Upstream
/ FWA_Downstream
funcs directly.
Also, the func would probably be much more useful if it was generalized to be similar to those functions - simply return true if a feature is 'instream' between two points. Length can be derived from the returned features.
To find features between two locations (without creating a graph/making routing queries), find downstream path of both input locations and return only features that are not common to the two result sets:
Something like this, but with some kind of logic that ensures no results are returned when querying features that are not connected (I think this would connect features across the ocean):
WITH dnstr_a AS
(
SELECT
linear_feature_id,
blue_line_key,
downstream_route_measure,
wscode_ltree,
localcode_ltree,
geom
FROM whse_basemapping.fwa_stream_networks_sp s
WHERE FWA_Downstream(360878058, 10524, '400.431358.815756'::ltree, '400.431358.815756.466791'::ltree,
s.blue_line_key, s.downstream_route_measure, s.wscode_ltree, s.localcode_ltree)
),
dnstr_b AS
(
SELECT
linear_feature_id,
blue_line_key,
downstream_route_measure,
wscode_ltree,
localcode_ltree,
geom
FROM whse_basemapping.fwa_stream_networks_sp s
WHERE FWA_Downstream(360881038, 5115, '400.431358.918528'::ltree, '400.431358.918528.104334'::ltree,
s.blue_line_key, s.downstream_route_measure, s.wscode_ltree, s.localcode_ltree)
)
SELECT
COALESCE(a.linear_feature_id, b.linear_feature_id) as linear_feature_id,
COALESCE(a.geom, b.geom) as geom
FROM dnstr_a a
FULL OUTER JOIN dnstr_b b
on a.linear_feature_id = b.linear_feature_id
WHERE (a.linear_feature_id IS NULL AND b.linear_feature_id IS NOT NULL)
OR (a.linear_feature_id IS NOT NULL AND b.linear_feature_id IS NULL)
Point to whse_basemapping schema, not temp (used for test)
Add to whse_basemapping.fwa_stream_networks_sp
, for quicker fish habitat modelling queries:
upstream_lake_ha
upstream_reservoir_ha
upstream_wetland_ha
SELECT * FROM fwa_watershedrefine(356568595, 21.6)
Too many watersheds adjacent to the double line river are selected for the cut. And cutting is not necessary in this case, the point is at the start of the river waterbody where the fundamental watersheds terminate at the same location (although the waterbody does not start at measure 0 - we may be able to assume the fundamental watersheds terminate at the same location where fwa_watershed_code
= local_watershed_code
but this needs checking.)
The query works with FTP data posted 2016-04-25 but not with the latest:
psql:sql/create_fwa_assessment_watersheds_streams.sql:27: ERROR: duplicate key value violates unique constraint "fwa_assessment_watersheds_streams_lut_pkey"
DETAIL: Key (linear_feature_id)=(868144294) already exists.
Smoky Creek (blue_line_key=356534008
) is nowhere near the border, but result returned includes all of the Kootenay drainage in the US:
https://www.hillcrestgeo.ca/fwapg/functions/fwa_watershedatmeasure/items.html?blue_line_key=356534008&downstream_route_measure=1016&limit=50
As per smnorris/bcbasins#12.
A function is needed that takes a point or coordinates and returns and aggregated shape including the hydroshed it is within plus those upstream. Just take the query from FWA_WatershedExBC()
:
-- For streams in other areas we use hydrosheds, which has far less detail
-- Find the hydroshed at the point of interest and work upstream from there
WITH RECURSIVE walkup (hybas_id, geom) AS
(
SELECT hybas_id, wsd.geom
FROM hydrosheds.hybas_lev12_v1c wsd
INNER JOIN (select * FROM FWA_LocateAlong(blkey, meas)) as pt
ON ST_Intersects(wsd.geom, pt.geom)
UNION ALL
SELECT b.hybas_id, b.geom
FROM hydrosheds.hybas_lev12_v1c b,
walkup w
WHERE b.next_down = w.hybas_id
)
SELECT
'hybas_na_lev12_v1c' AS source,
ST_Union(w.geom) as geom
FROM walkup w;
The function needs to return an indicator if a point is outside BC. Currently this only happens if there is also stream in BC within the tolerance.
When finding points where streams likely cross the border with fwa_upstreambordercrossings(blkey, meas)
, the fuzzy result of the -75m shift and the approximate location of the border created by create_fwa_approx_borders.sql
can lead to border points being created in the wrong drainage, resulting in incorrect returns from subsequent calls to fwa_upstreambordercrossings
.
For example on Ewart Creek, because the height of land is right on the border, the point is created in the adjacent drainage (Wall Creek):
QGIS is extremely slow to list spatial tables of general 'Geometry' type.
Work around this by specifying type, eg geometry(MultiLineString,3005)
Clip hydrosheds data to areas outside of BC, FWA boundaries need to take priority within BC.
add watershed_group_id
Rather than copying pasting the uptream/downstream logic into all the various queries.
See poissonconsulting/fwapgr#12
request https://hillcrestgeo.ca/fwa/v1/watershed/359567332?downstream_route_measure=10&srid=4326
returns
{
"statusCode": 500,
"code": "42702",
"error": "Internal Server Error",
"message": "column reference \"geom\" is ambiguous"
}
Note python dependency for the extras
in README, provide a requirements.txt
and/or environment.yml
and remove pgdata
dependency from the extras scripts.
Upgrading to pg13/postgis3.1/geos3.9 gives significant performance benefits. The fixed precision operations should also remove the need for the lostgis
ST_Safe_
functions.
Also, because the ST_Safe_
functions do not work with postgres parallelization, some of the FWA functions that currently use them may not work under postgres 13 anyway - removing these functions is a priority.
This isn't exactly an issue with your scripts but I'm having trouble downloading the fwa.zip file required before running your scripts.
When I run
wget https://geobc.s3-us-west-2.amazonaws.com/FWA.zip
I get
--2020-05-05 20:01:35-- https://geobc.s3-us-west-2.amazonaws.com/FWA.zip
Resolving geobc.s3-us-west-2.amazonaws.com (geobc.s3-us-west-2.amazonaws.com)... 52.218.251.41
Connecting to geobc.s3-us-west-2.amazonaws.com (geobc.s3-us-west-2.amazonaws.com)|52.218.251.41|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2020-05-05 20:01:36 ERROR 403: Forbidden.
Can you confirm that you are still able to download from that link?
Cheers!
I think a linear function would be useful too (provided a blue_line_key and two measures, return a single feature representing the stream between them) - but that is out of scope for now.
Cut exbc data by bc boundary for the LAKE method as we do for others here: https://github.com/smnorris/fwapg/blob/main/sql/functions/FWA_WatershedAtMeasure.sql#L329
blue_line_key = 356570372
works on other blue_line_keys, e.g.
https://www.hillcrestgeo.ca/fwapg/functions/fwa_locatealonginterval/items.html?blue_line_key=356568123
For locations on the same blue_line_key, the function compares point locations only (at the provided measures).
This gives a weird result if considering whether point b at measure 10 is upstream of line at measure 5, with length 10. The point is at the midpoint of the line but the function currently returns true for this instance, length/upstream measure is not considered.
Other tools depend on fwapg and the FWA data is being continuously updated by GeoBC. To help with stability, fwapg releases should be created, with each version linked to a specific data extract.
Peace @ Ft St John takes >5min. Ok for local processing but not so good for requests via pg_featureserv:
LOG: duration: 313465.312 ms statement: select * from postgisftw.fwa_watershedatmeasure(359572348, 1590599);
For example:
# SELECT * FROM FWA_WatershedAtMeasure(356551270, 681);
wscode_ltree | localcode_ltree | area_ha | refine_method | geom
--------------+-----------------+---------+---------------+------
(0 rows)
Terminus of this first order watershed poly is at downstream_route_measure = 606.3m
I can't recall if this is expected behaviour based on other watersheds but it doesn't seem right in this case.
This is a data error that has been reported to GeoBC.
I'm only noting the issue here as a reminder to run an extract when fixes are available.
Rather than calculate as needed, just include it in the table.
Encoding of NAMED_WATERSHEDS_POLY
column gnis_name
in FWA.gpkg
file dumped from the warehouse is off.
Appears to be an issue for that layer only, the gnis_name
Barrière River
comes through fine in the streams table.
$ ogrinfo FWA.gpkg -sql "SELECT gnis_name FROM FWA_NAMED_WATERSHEDS_POLY WHERE GNIS_NAME LIKE 'Barri%'"
INFO: Open of `FWA.gpkg'
using driver `GPKG' successful.
Layer name: SELECT
Geometry: None
Feature Count: 7
Layer SRS WKT:
(unknown)
GNIS_NAME: String (80.0)
OGRFeature(SELECT):0
GNIS_NAME (String) = Barri� River
Current script populates a column in the streams table. Because upstream of each watershed is needed as well, it is probably more efficient to calculate upstream area for each watershed and work with that.
Some cleaning of the aggregated output might be required, there are some slivers in the test shape. Fixing #45 might be enough though.
A few basic tests for the various fwa_
functions would be valuable.
Expect more changes to this func so add a simple test to make sure it is returning something consistent in the common use cases.
poissonconsulting/fwatlasbc#16
# select fwa_watershedatmeasure(356528119, 0);
ERROR: more than one row returned by a subquery used as an expression
CONTEXT: PL/pgSQL function fwa_watershedatmeasure(integer,double precision) line 26 at RETURN QUERY
George has applied the fixes and they are available in the warehouse.
On next extract, remove fixes from https://github.com/smnorris/fwapg/blob/main/04_assmnt_wsds_lut.sh#L20
Halfway River @ 90536m
# SELECT * FROM fwa_watershedrefined(359570578, 90536);
ERROR: 2nd arg must be smaller then 3rd arg
CONTEXT: PL/pgSQL function fwa_watershedrefined(integer,double precision) line 21 at RETURN QUERY
This name is still a bit misleading - the function doesn't return the stream geoms it returns the nearest point geom on the nearest stream(s). FWA_IndexPoint
is probably better, like the EPA service (https://www.epa.gov/waterdata/point-indexing-service)
Not all required functions are currently being added. Review and tidy.
Currently the function RETURNS TABLE
. This makes sense for the typical case where just requesting a single watershed, especially via the current api.
When using the db directly, it would be nice to use the function to generate watersheds within a more general/typical context.
eg - I have a bunch of IDs and want a set of results with watersheds for each:
SELECT
stream_crossing_id,
downstream_route_measure,
blue_line_key,
FWA_WatershedAtMeasure(blue_line_key, downstream_route_measure)
FROM bcfishpass.crossings
WHERE stream_crossing_id in (50152,62515,50181,50185,50261,62423,62516,197533,197534,197542,197555);
This gives a result like this, the output from the function is a nested table / values with watershed codes, refine method and geom:
62515 112 356547712 (300.625474.584724.233916.539602,300.625474.584724.233916.539602,444.72,DEM,<geom>)
Returning the watershed codes and area isn't really necessary, but returning the refine method as well as the output geom definitely is - perhaps return an array holding these two values rather than a table would be easier to work with?
https://www.postgresql.org/about/news/1943/
ltree
watershed codes rather than creating with ogr2ogr and altering tableReturning all streams upstream of a point location where there is a break in the stream network is a common query.
Currently, the stream feature at the measure provided will not be returned by FWA_Upstream()
because measures of the stream and point are equivalent (+/- the provided tolerance). There should to be an option to return features with equivalent measures rather than tacking them on with another subquery.
blue_line_key = 360887232
downstream_route_measure = 8412
Result includes entire Atlin/Tagish drainage, should just be Tutshi Lake. This is probably a trans-boundary data quirk?
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.