kinverarity1 / python-sa-gwdata

Python package for the Groundwater Data section of the DEW WaterConnect website

Home Page: https://python-sa-gwdata.readthedocs.io/en/latest/index.html

License: MIT License

Python 8.73% Jupyter Notebook 91.27%
data-access groundwater python south-australia

python-sa-gwdata's Introduction

python-sa-gwdata

Python code to get groundwater data for South Australia

This code provides the Python package sa_gwdata to make it easier to download and access groundwater data from the South Australian Department for Environment and Water's Groundwater Data website. It also provides some help for getting related data from the Department for Energy and Mining's South Australian Resources Information Gateway (SARIG) website.

This is an unofficial side-project done in my spare time.

How to use

Check out the complete package documentation, and some tutorial Jupyter Notebooks in the notebooks folder.

Define the wells you are interested in manually:

>>> import sa_gwdata
>>> wells = sa_gwdata.find_wells("5928-203 and also ULE 96")
>>> wells
["LKW042", "ULE096"]

(It has recognised automatically that 5928-203 is also known as LKW042).

Or search for wells by geographic area:

>>> wells = sa_gwdata.find_wells_in_lat_lon([-34.65, -34.62], [135.47, 135.51])

Then you can download data as pandas DataFrames:

>>> wls = sa_gwdata.water_levels(wells)
>>> tds = sa_gwdata.salinities(wells)
>>> dlogs = sa_gwdata.drillers_logs(wells)

There is also full access to the underlying set of web services which provide a variety of data in JSON format.

Start a session with Groundwater Data:

>>> session = sa_gwdata.WaterConnectSession()

On initialisation it downloads some summary information.

>>> session.networks
{'ANGBRM': 'Angas Bremer PWA',
 'AW_NP': 'Alinytjara Wilurara Non-Prescribed Area',
 'BAROOTA': 'Baroota PWRA',
 'BAROSSA': 'Barossa PWRA',
 'BAROSS_IRR': 'Barossa irrigation wells salinity monitoring',
 'BERI_REN': 'Berri and Renmark Irrigation Areas',
 'BOT_GDNS': 'Botanic Gardens wetlands',
 'CENT_ADEL': 'Central Adelaide PWA',
 'CHOWILLA': 'Chowilla Floodplain',
 ...
}

With this information we can make some direct REST calls:

>>> r = session.get("GetObswellNetworkData", params={"Network": "CENT_ADEL"})
>>> r.df.head(5)
	aq_mon	chem	class	dhno	drill_date	lat	latest_open_date	latest_open_depth	latest_sal_date	latest_swl_date	...	pwa	replaceunitnum	sal	salstatus	stat_desc	swl	swlstatus	tds	water	yield
0	Tomw(T2)	Y	WW	27382	1968-02-07	-34.764662	1992-02-20	225.00	2013-09-02	2018-09-18	...	Central Adelaide	NaN	Y	C	OPR	3.47	C	3620.0	Y	2.00
1	Qhcks	N	WW	27437	1963-01-01	-34.800905	1963-01-01	6.40	1984-02-01	1986-03-05	...	Central Adelaide	NaN	Y	H	NaN	5.86	H	1121.0	Y	NaN
2	Tomw(T1)	Y	WW	27443	1972-04-20	-34.811124	2014-04-01	0.00	1991-10-09	2003-07-04	...	Central Adelaide	NaN	Y	H	BKF	NaN	H	2030.0	Y	5.00
3	Tomw(T1)	Y	WW	27504	1978-02-28	-34.779893	1978-02-28	144.50	2016-04-06	2011-09-18	...	Central Adelaide	NaN	Y	H	OPR	11.21	H	2738.0	Y	0.00
4	Tomw(T1)	Y	WW	27569	1975-01-01	-34.891250	1975-07-09	131.10	1986-11-13	1988-09-21	...	Central Adelaide	NaN	Y	H	BKF	9.90	H	42070.0	Y	12.50

Install

You will need Python 3.8 or a more recent version.

$ pip install -U python-sa-gwdata

This installs the latest release of the Python package sa_gwdata.

To install the latest code from GitHub, make sure you have the dependencies pandas and requests installed, then use:

$ pip install https://github.com/kinverarity1/python-sa-gwdata/archive/master.zip

License

MIT

python-sa-gwdata's People

Contributors

codacy-badger, kinverarity1

python-sa-gwdata's Issues

Interfacing with Water Connect problem

Describe the bug
Hi @kinverarity1, this is a really awesome package and it has previously worked very well. It seems the Water Connect site has changed something in the last month or two, see below:

To Reproduce
Steps to reproduce the behavior:

If trying to run the example from the "How to use" section, i.e.:

>>> import sa_gwdata
>>> wells = sa_gwdata.find_wells("5928-203 and also ULE 96")

Expected behavior

>>> wells
["LKW042", "ULE096"]

Screenshots
image

Desktop (please complete the following information):

  • OS: macOS Monterey 12.6
  • Python version: 3.7.12
  • python-sa-gwdata version: 0.9.0

Create spatial index for efficient rectangle querying to avoid 10000 limit

The simplest strategy would be to divide each 1-degree square into four.

A much faster approach would be to generate a more efficient partition, i.e. some kind of equal-density method limited to rectangles.

This would be very useful, because you could then use the drill date in the state-wide cache layer to decide on an efficient strategy for updating the different bulk data download functions, or use the observation well network fields in the unit number search JSON.
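
The "divide by four" idea can be sketched as a recursive, quadtree-style split: any rectangle holding more wells than the limit is cut into four quadrants, and the process repeats. This is only an illustration. `count_wells` is a hypothetical callback standing in for whatever query reports how many wells fall in a rectangle, and the `(lat_min, lon_min, lat_max, lon_max)` ordering is an assumption.

```python
from typing import Callable, List, Tuple

# (lat_min, lon_min, lat_max, lon_max); this ordering is an assumption of the sketch.
Box = Tuple[float, float, float, float]


def split_until_under_limit(
    box: Box,
    count_wells: Callable[[Box], int],
    limit: int = 10000,
    max_depth: int = 12,
) -> List[Box]:
    """Return rectangles that together cover `box`.

    Each rectangle holds at most `limit` wells, unless `max_depth`
    splits were reached first (a guard against endless recursion).
    """
    if count_wells(box) <= limit or max_depth == 0:
        return [box]
    lat_min, lon_min, lat_max, lon_max = box
    lat_mid = (lat_min + lat_max) / 2
    lon_mid = (lon_min + lon_max) / 2
    quadrants = [
        (lat_min, lon_min, lat_mid, lon_mid),
        (lat_min, lon_mid, lat_mid, lon_max),
        (lat_mid, lon_min, lat_max, lon_mid),
        (lat_mid, lon_mid, lat_max, lon_max),
    ]
    boxes: List[Box] = []
    for quadrant in quadrants:
        boxes.extend(
            split_until_under_limit(quadrant, count_wells, limit, max_depth - 1)
        )
    return boxes
```

Dense areas end up with many small rectangles and sparse areas with a few large ones, which approximates the equal-density goal without needing a separate index.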

Add bulk download CSV water level requests

They don't have all the data but they're a damn sight quicker than doing individual requests.

e.g. https://www.waterconnect.sa.gov.au/_layouts/15/dfw.sharepoint.wdd/WDDDMS.ashx/GetWaterLevelDownload?bulkOutput=CSV

In the POST content the key exportdata appears to have string-encoded JSON:

{"Box":+[-35.606559,139.234534,-35.158702,140.460883],
+"DHNOs":+[83454,83495,83498,83499,83502,83504,83516,83527,83577,
83644,83646,83647,83654,83663,83664,83665,83666,83703,84741,84751,
84791,84792,84793,84795,84802,84803,84868,84878,84879,148693,167427,
171071,173662,182486,182487,190264,192545,196037,196355,200067,
201427,201465,140892,212313,232824,234615,238843,240307,102041,
205988],+"Type":+"CSV",+"Anomalous":+true,+"Pumping":+true}
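
A minimal sketch of how a request body like the one above could be assembled. The key names (`Box`, `DHNOs`, `Type`, `Anomalous`, `Pumping`) and the `exportdata` field come from the captured request; the function name `build_export_body` is hypothetical, and whether the server needs further fields or cookies is not known.

```python
import json
from urllib.parse import urlencode


def build_export_body(dh_nos, box, anomalous=True, pumping=True):
    """Return a form-encoded POST body with JSON stored under 'exportdata'."""
    payload = {
        "Box": list(box),        # same four numbers, in the same order as the capture
        "DHNOs": list(dh_nos),   # drillhole numbers to export
        "Type": "CSV",
        "Anomalous": anomalous,
        "Pumping": pumping,
    }
    # json.dumps separates items with ", " and ": "; form encoding turns
    # those spaces into "+", which matches the "+" signs seen in the capture.
    return urlencode({"exportdata": json.dumps(payload)})
```

The resulting string would be sent as the body of a POST with content type application/x-www-form-urlencoded.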

August update:

The full list of bulk data service names:

  • GetSummaryDownload
  • GetWaterLevelDownload
  • GetSalinityDownload
  • GetWaterChemistryDownloadAllData
  • GetConstructionDownload
  • GetConstructionDetailsDownload (returns ZIP)
  • GetDrillersLogDownload
  • GetElevationDownload
  • GetLithologicalLogDownload
  • GetHydroStratLogDownload
  • GetStratLogDownload
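
If every service follows the same URL pattern as the GetWaterLevelDownload example earlier in this issue (an assumption, not something confirmed here), the download URLs could be built with a small helper. The names `BULK_BASE_URL`, `BULK_SERVICES` and `bulk_download_url` are hypothetical.

```python
# Base path taken from the GetWaterLevelDownload example URL in this issue.
BULK_BASE_URL = (
    "https://www.waterconnect.sa.gov.au/_layouts/15/"
    "dfw.sharepoint.wdd/WDDDMS.ashx/"
)

# Service names as listed above.
BULK_SERVICES = (
    "GetSummaryDownload",
    "GetWaterLevelDownload",
    "GetSalinityDownload",
    "GetWaterChemistryDownloadAllData",
    "GetConstructionDownload",
    "GetConstructionDetailsDownload",  # returns ZIP rather than CSV
    "GetDrillersLogDownload",
    "GetElevationDownload",
    "GetLithologicalLogDownload",
    "GetHydroStratLogDownload",
    "GetStratLogDownload",
)


def bulk_download_url(service, output="CSV"):
    """Build a bulk download URL, rejecting names that are not in the list."""
    if service not in BULK_SERVICES:
        raise ValueError(f"unknown bulk service: {service!r}")
    return f"{BULK_BASE_URL}{service}?bulkOutput={output}"
```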

Bug: "can't set attribute"

AttributeError                            Traceback (most recent call last)
<ipython-input-5-ade1c9c139c9> in <module>
      1 wls = (
----> 2     db.water_levels(db.find_wells(well_id))
      3     .pipe(wrap_technote.filter_wl_observations)
      4 )
      5 wls.info()

c:\devapps\projects\dew_gwdata\dew_gwdata\_sageodata.py in find_wells(self, input_text, **kwargs)
    239         logger.debug("unit_nos -> {}".format(r_unit_nos))
    240         all_dh_nos = list(set(dh_nos + r_obs_nos + r_unit_nos))
--> 241         return self._create_well_instances(all_dh_nos)
    242 
    243 

c:\devapps\projects\dew_gwdata\dew_gwdata\_sageodata.py in _create_well_instances(self, dh_nos)
    198         df["name"] = df["dh_name"]
    199         df["unit_no.hyphen"] = df["unit_hyphen"]
--> 200         return Wells([Well(**vals.to_dict()) for _, vals in df.iterrows()])
    201 
    202     def find_wells(self, input_text, **kwargs):

c:\devapps\projects\dew_gwdata\dew_gwdata\_sageodata.py in <listcomp>(.0)
    198         df["name"] = df["dh_name"]
    199         df["unit_no.hyphen"] = df["unit_hyphen"]
--> 200         return Wells([Well(**vals.to_dict()) for _, vals in df.iterrows()])
    201 
    202     def find_wells(self, input_text, **kwargs):

c:\devapps\projects\python-sa-gwdata\sa_gwdata\identifiers.py in __init__(self, *args, **kwargs)
    270         self.obs_no = ObsNo()
    271         self.name = ""
--> 272         self.set(*args, **kwargs)
    273 
    274     def set(self, dh_no, unit_no="", obs_no="", **kwargs):

c:\devapps\projects\python-sa-gwdata\sa_gwdata\identifiers.py in set(self, dh_no, unit_no, obs_no, **kwargs)
    278         self.set_obs_no(obs_no)
    279         for key, value in kwargs.items():
--> 280             self.set_well_attribute(key, value)
    281 
    282     def set_well_attribute(self, key, value):

c:\devapps\projects\python-sa-gwdata\sa_gwdata\identifiers.py in set_well_attribute(self, key, value)
    283         key = key.lower()
    284         self._attributes.append(key)
--> 285         setattr(self, key, value)
    286 
    287     def set_obs_no(self, *args):

AttributeError: can't set attribute
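
Python versions before 3.11 typically report "can't set attribute" when `setattr` targets a property that has no setter (later versions word the message differently). The traceback does not show which key was involved, so treating that as the cause here is an assumption. The sketch below uses a hypothetical `DemoWell` class, not the package's real `Well`, to show the failure and one possible guard.

```python
class DemoWell:
    """Hypothetical stand-in for a well class; not the package's real Well."""

    def __init__(self, **kwargs):
        self._attributes = []
        for key, value in kwargs.items():
            self.set_well_attribute(key, value)

    @property
    def title(self):
        # Read-only property: it has a getter but no setter.
        return "demo well"

    def set_well_attribute(self, key, value):
        key = key.lower()
        class_attr = getattr(type(self), key, None)
        if isinstance(class_attr, property) and class_attr.fset is None:
            # Skip keys that would collide with a read-only property.
            return
        self._attributes.append(key)
        setattr(self, key, value)


def unguarded_set(obj, key, value):
    """Mirror the failing call from the traceback: a bare setattr."""
    setattr(obj, key, value)
```

With the guard, `DemoWell(name="ULE096", title="ignored")` keeps `name` and silently drops `title`, whereas `unguarded_set(well, "title", "x")` raises AttributeError. Renaming the incoming key or raising a clearer error would be equally valid choices.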

pandas bug - columns cannot be sets

Describe the bug
Error when calling sa_gwdata.find_wells.

To Reproduce

Python 3.11.0 | packaged by conda-forge | (main, Oct 25 2022, 06:12:32) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sa_gwdata
>>> wells = sa_gwdata.find_wells("5928-203 and also ULE 96")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\devapps\syski\python-sa-gwdata\sa_gwdata\waterconnect_funcs.py", line 34, in find_wells
    session = get_global_session()
              ^^^^^^^^^^^^^^^^^^^^
  File "C:\devapps\syski\python-sa-gwdata\sa_gwdata\waterconnect.py", line 85, in get_global_session
    __waterconnect_session = WaterConnectSession()
                             ^^^^^^^^^^^^^^^^^^^^^
  File "C:\devapps\syski\python-sa-gwdata\sa_gwdata\waterconnect.py", line 128, in __init__
    self.well_cache = pd.DataFrame(columns=set(self.well_id_cols.values()))
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\devapps\syski\mambaforge\envs\working\Lib\site-packages\pandas\core\frame.py", line 640, in __init__
    raise ValueError("columns cannot be a set")
ValueError: columns cannot be a set
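
The failing line passes a `set` as `columns`, which the pandas version in the traceback rejects. A minimal sketch of one way around it is to de-duplicate into a list instead. The mapping below is hypothetical, since the real contents of `well_id_cols` are not shown in the traceback.

```python
def unique_columns(id_cols):
    """Return the de-duplicated values of a mapping as a list, in first-seen order."""
    # dict.fromkeys keeps insertion order, so the column order is deterministic,
    # unlike iterating over a set.
    return list(dict.fromkeys(id_cols.values()))


# Hypothetical mapping, for illustration only.
well_id_cols = {
    "DHNO": "dh_no",
    "dhno": "dh_no",
    "OBSNO": "obs_no",
    "MAPNUM": "unit_no",
}
columns = unique_columns(well_id_cols)
```

The resulting list can be passed as `pd.DataFrame(columns=columns)`, which pandas accepts.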

Expected behavior
Shouldn't be an exception :-)

Improve documentation

Describe the bug
The JSON web service methods are too prominent in the documentation. The focus should instead be on (1) what are the bulk CSV/pandas DataFrame methods and (2) what data they return.

To Reproduce
Read the docs and be confused :-)

Expected behavior
Read the docs and be enlightened :-)

Screenshots
N/A

Desktop (please complete the following information):

  • OS: N/A
  • Python version: N/A
  • python-sa-gwdata version: 0.10

Additional context
N/A

Add config/.rc file

Is your feature request related to a problem? Please describe.
There will shortly be a need for package-wide configuration: e.g. location of a local caching database to store previous queries.

Describe the solution you'd like
A configuration file setup. We need something simple and clear, ideally perhaps toml format.
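
A minimal sketch of what loading such a file could look like. The request prefers TOML, but the standard-library TOML reader (`tomllib`) only exists from Python 3.11, while the README states Python 3.8 or later, so this sketch swaps in the INI format read by the standard-library `configparser`. The section name, keys and default values are all hypothetical.

```python
import configparser

# Hypothetical defaults; none of these names exist in the package today.
DEFAULTS = {
    "cache": {
        "path": "~/.sa_gwdata/cache.sqlite",
        "max_size_mb": "500",
    }
}


def load_config(text=""):
    """Parse INI-style config text, layered on top of the defaults."""
    parser = configparser.ConfigParser()
    parser.read_dict(DEFAULTS)   # defaults are loaded first
    parser.read_string(text)     # user settings override them
    return parser
```

For example, `load_config("[cache]\nmax_size_mb = 50\n").getint("cache", "max_size_mb")` gives 50, while the `path` setting keeps its default.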

Store data in a local database cache

Is your feature request related to a problem? Please describe.
Every time you request data that you have already downloaded, it is downloaded again.

Describe the solution you'd like
A locally stored database cache. Ideally this would be allowed to have a maximum size, and therefore we would benefit from an ORM so that we can easily do bulk removes, updates, and inserts. The data model should be based only on what's available via the bulk CSV downloads. The JSON service queries are an extra goodie which could be mapped across later. So SQLAlchemy would be an additional dependency. SQLite would be the obvious choice for the cache itself.

The default behaviour of whether to rely solely on the cached results, or update from Groundwater Data, should be configurable, but an easy rule of thumb would be that for some queries like water_levels and salinities, an update should be run if the query is on a new day. Others like well_summary, and so on, should probably be more like a month. In any case: add a keyword argument to the query methods: update_cache=False/True/"auto".
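
The rule of thumb above could be sketched as a small decision function. The function name `should_update`, the `DAILY_QUERIES` set and the choice of 30 days for "about a month" are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Queries refreshed whenever the calendar day has changed (per the rule of thumb).
DAILY_QUERIES = {"water_levels", "salinities"}
# "More like a month" is interpreted here as 30 days; this is an assumption.
SLOW_MAX_AGE = timedelta(days=30)


def should_update(query, last_updated, now, update_cache="auto"):
    """Decide whether to refresh cached results for `query`.

    `last_updated` is the datetime of the last download, or None if
    nothing has been cached yet.
    """
    if update_cache is True:
        return True
    if update_cache is False:
        return False
    if update_cache != "auto":
        raise ValueError('update_cache must be True, False or "auto"')
    if last_updated is None:
        return True
    if query in DAILY_QUERIES:
        return now.date() > last_updated.date()
    return now - last_updated >= SLOW_MAX_AGE
```

With `update_cache="auto"`, a water_levels result cached at 23:00 is refreshed at 01:00 the next day, while a well_summary result cached ten days ago is reused.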

Identify wells without having to query per row of a table

Is your feature request related to a problem? Please describe.

If I have well identifiers scattered across multiple columns, how can I find those wells in a consistent way, that makes it easy to query for new data? e.g.

dh_no   obs_no   unit_no
        ule91
123456
                 S-6628-123
        PTA 114  662820331

Describe the solution you'd like

A way to add a new column to the above table by querying only a minimal number of times (i.e. 2-3 times, not once per row). The new column should contain at a minimum a reliable well identifier.
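
One possible starting point, sketched under assumptions: gather every non-empty identifier from the table into a single text string, so that one `find_wells` call can replace a call per row. The README shows `find_wells` accepting free text with several identifiers, but whether it recognises every form in the example table is not confirmed. The helper name `collect_identifiers` is hypothetical, and rows are plain dictionaries here rather than a DataFrame.

```python
def collect_identifiers(rows, columns=("dh_no", "obs_no", "unit_no")):
    """Gather the unique non-empty identifier strings from all rows and columns."""
    seen = {}
    for row in rows:
        for column in columns:
            value = str(row.get(column) or "").strip()
            if value:
                seen.setdefault(value, None)  # dict keeps first-seen order
    return list(seen)


# The example table from this request, one dictionary per row.
rows = [
    {"obs_no": "ule91"},
    {"dh_no": "123456"},
    {"unit_no": "S-6628-123"},
    {"obs_no": "PTA 114", "unit_no": "662820331"},
]
query_text = ", ".join(collect_identifiers(rows))
# A single call such as sa_gwdata.find_wells(query_text) could then replace
# one call per row; it is left out here because it needs network access.
```

Matching the returned wells back to their rows would still need a second step, which this sketch leaves open.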
