aodn / data-services

Scripts which are used to process incoming data in the data ingestion pipeline

License: GNU General Public License v3.0

MATLAB 1.54% Shell 0.54% R 1.77% Python 84.68% PLpgSQL 0.17% Rich Text Format 10.98% Dockerfile 0.02% HTML 0.23% Jupyter Notebook 0.07%


data-services's Issues

po_s3_del does not handle * or ? in filename

Using * or ? in a filename passed to po_s3_del appears to work, but is deceptive and a source of errors.

ggalibert@10-aws-syd:~/$ po_s3_del IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/15/IMOS_ACORN_V_20160515T1?3000Z_CBG_FV00_1-hour-avg.nc
Deleting 'IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/15/IMOS_ACORN_V_20160515T1?3000Z_CBG_FV00_1-hour-avg.nc' with index deletion
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/15/IMOS_ACORN_V_20160515T103000Z_CBG_FV00_1-hour-avg.nc'
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/15/IMOS_ACORN_V_20160515T113000Z_CBG_FV00_1-hour-avg.nc'
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/15/IMOS_ACORN_V_20160515T123000Z_CBG_FV00_1-hour-avg.nc'
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/15/IMOS_ACORN_V_20160515T133000Z_CBG_FV00_1-hour-avg.nc'
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/15/IMOS_ACORN_V_20160515T143000Z_CBG_FV00_1-hour-avg.nc'
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/15/IMOS_ACORN_V_20160515T153000Z_CBG_FV00_1-hour-avg.nc'
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/15/IMOS_ACORN_V_20160515T163000Z_CBG_FV00_1-hour-avg.nc'
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/15/IMOS_ACORN_V_20160515T173000Z_CBG_FV00_1-hour-avg.nc'
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/15/IMOS_ACORN_V_20160515T183000Z_CBG_FV00_1-hour-avg.nc'

In the above example the files have been deleted from S3 but not from the database...

Either this should be fixed, or it should be better documented somewhere that this functionality cannot be used.
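A minimal sketch of a guard that could be added to po_s3_del until wildcard handling is properly supported (the argument handling shown is an assumption, not the actual script):

# Refuse shell wildcard characters so that S3 objects cannot be deleted
# without the matching database index entries also being removed.
for object_name in "$@"; do
    case "$object_name" in
        *\**|*\?*)
            echo "po_s3_del: wildcards in '$object_name' are not supported" >&2
            exit 1
            ;;
    esac
done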

Environmental variables - discussion


A few weeks ago, @danfruehauf set up environment variables on 10-nsp so POs can use them in their scripts. The main reason for this is to make this repo scalable. We've changed infrastructure quite a few times in the last few years, so it is pretty clear that being scalable is a necessity.

However, I feel like we haven't taken into account the different use cases, and these environment variables seem to me to be a burden as well as being quite hard to use.

I'd like to have everyone's opinion on this matter, and to know better how POs are thinking of using them.

Current environment variables

Name Default Purpose
$OPENDAP_DIR /mnt/opendap OpenDAP
$PUBLIC_DIR /mnt/imos-t4/IMOS/public Public
$ARCHIVE_DIR /mnt/imos-t4/IMOS/archive Archive
$INCOMING_DIR /mnt/imos-t4/IMOS/staging Incoming
$OPENDAP_IMOS_DIR $OPENDAP_DIR/1/IMOS/opendap IMOS OpenDAP
$PUBLIC_IMOS_DIR $PUBLIC_DIR IMOS public
$ARCHIVE_IMOS_DIR $ARCHIVE_DIR IMOS archive
$WIP_DIR /mnt/ebs/wip Work In Progress tmp dir
$DATA_SERVICES_DIR /mnt/ebs/data-services Where this git repo is deployed
$LOG_DIR /mnt/ebs/log/data-services Designated log dir

Most of our scripts currently use a config file, generally called config.txt. Prior to the 'environment variables' era, a config file would be written like this:

config.txt

script.path             = /mnt/ebs/data-services/AATAMS/AATAMS_sattag_nrt
dataWIP.path            = /mnt/imos-t4/project_officers/wip/AATAMS/AATAMS_sattag_nrt
australianTags.filepath = /mnt/imos-t4/IMOS/archive/eMII/TALEND_harvester/AATAMS/metadata/aatams_sattag_metadata.csv
dataInput.path          = /mnt/imos-t4/IMOS/archive/eMII/TALEND_harvester/AATAMS/aatams_sattag_nrt/unzipped

This config file can then be used easily by any programming language.

In Matlab:

dataWIP_path = readConfig('dataWIP.path', 'config.txt','=');

In Python:

from configobj import ConfigObj 
config       = ConfigObj('config.txt')
dataWIP_path = config.get('dataWIP.path')

In Bash:

configfile=config.txt
ii=0
while read line; do
if [[ "$line" =~ ^[^#]*= ]]; then
        # http://stackoverflow.com/questions/369758/how-to-trim-whitespace-from-bash-variable
        name[ii]=`echo "$line" | cut -d'=' -f 1 | sed -e 's/^ *//' -e 's/ *$//'`
        value[ii]=`echo "$line" | cut -d'=' -f 2- | sed -e 's/^ *//' -e 's/ *$//'`
        ((ii++))
fi
done < "$configfile"

# this part of the code finds the dataWIP.path value in config.txt
for (( jj = 0 ; jj < ${#value[@]} ; jj++ ));
do
    if [[ "${name[jj]}" =~ "dataWIP.path" ]] ; then
         dataWIP_path=${value[jj]} ;
    fi
done

It often happens that one main script actually calls routines written in several programming languages. This happens because we sometimes find routines on the internet, written in Perl, Python or another language, which do the job we want, and we don't (or can't) reinvent the wheel every time.

For example, the previous version of the AATAMS_DM script used to:
1 - run a shell script, which
2 - called Matlab, which
3 - called Python to convert some Microsoft Access database files.

This is to point out that some of our code can be 'complex' in terms of the technologies being used.

How is the config file supposed to be written?

Nothing has been clarified since the beginning. The env vars were created, ready to be used. But how? No one has sat down to talk about it.

Possibility 1

It was first suggested to hardcode a Matlab function into the config file to read the environment variable. In Matlab, this function is called getenv. For example, if one wants to use the $OPENDAP_IMOS_DIR variable, the config file would be written like this:

config.txt

opendapFAIMMS.path  = strcat(getenv('OPENDAP_IMOS_DIR'),'/FAIMMS')
  1. Let's try to read this in Matlab:
testVar = readConfig('opendapFAIMMS.path', 'config.txt','=');
class testVar
>> class testVar

ans =

char

>> testVar

testVar =

strcat(getenv('OPENDAP_IMOS_DIR'),'/FAIMMS')

This means that when read in Matlab, it will be loaded as a string, and not as a command to run. The only way to change this would be to change all our Matlab scripts to convert the string into a command. Every single *.m file using this variable would have to be changed. Again, more work and a source of mistakes.

In Matlab, the command eval executes a Matlab expression contained in a text string. So all our *.m files would need to use:

testVar = readConfig('opendapFAIMMS.path', 'config.txt','=');
testVar = eval(testVar)

Anyway, writing a config file this way shows that it could potentially only be read by Matlab, and not by Python, Bash, etc., which is clearly not workable for many scripts.

Possibility 2

We saw above that hardcoding a config file as either:

config.txt for Matlab

opendapFAIMMS.path  = strcat(getenv('OPENDAP_IMOS_DIR'),'/FAIMMS')

or

config.txt for Bash

opendapFAIMMS.path  = $OPENDAP_IMOS_DIR/FAIMMS

is not the way to go if more than one programming language is going to use config.txt.

In this case, it seems like the only possible way is to put the environment variable part not into the config file, but into the programming language itself. In other words, hardcoding half of the configuration into the different Matlab, Bash and Python scripts. It seems a bit counter-productive. See the example below:

config.txt

opendapFAIMMS.path  = FAIMMS

and on Matlab

opendapFaimmsPath = strcat(getenv('OPENDAP_IMOS_DIR'), readConfig('opendapFAIMMS.path', 'config.txt', '='));

Not to mention that those Matlab, Python, etc. scripts will need to check that the environment variables actually exist, and create default values if not. Which means more coding/testing... for something that already works.
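For comparison, a minimal Python sketch of the same Possibility 2 approach (using os.environ and configobj as in the earlier example; the fallback default and variable names are assumptions, and the extra boilerplate is exactly the burden described above):

import os
from configobj import ConfigObj

# Half of the path comes from the environment (with a hardcoded fallback in
# case the variable is missing), the other half from config.txt.
opendap_imos_dir = os.environ.get('OPENDAP_IMOS_DIR', '/mnt/opendap/1/IMOS/opendap')
config = ConfigObj('config.txt')
opendapFaimmsPath = os.path.join(opendap_imos_dir, config['opendapFAIMMS.path'])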

Possibility 2 - Cons 1

Although it may not seem like much, modifying every single routine to handle this can take a lot of time, and it is never done without mistakes.

Possibility 2 - Cons 2

Because part of the configuration would be hard-coded into the scripts, if for any reason (and such reasons exist) a file used by a script has to be relocated from $PUBLIC_IMOS_DIR to $ARCHIVE_IMOS_DIR,

or even, in the near future, once the copying of data changes from $OPENDAP_IMOS_DIR to $INCOMING_DIR and POs no longer have write access, then ALL *.m, *.sh and *.py files will again have to be changed. That doesn't seem serious at all: no compiled program works like this; they read a config file, and that is all.

Possibility 2 - Cons 3

This also makes things way more difficult to code and test on our own machines. It forces us to develop in a rigid environment.

Possibility 2 - Cons 4

Giving some of our code to third parties becomes impossible.

Possibility 3

Unknown - anyone welcome to edit

Which solution ?

I'm seeking help and advice here. IMO we're not heading down the right path, although we do need to find a way to make this repository scalable.

Solution 1

A config.txt.bckp could be created with all the env variables written in it:

config.txt.bckp

opendapFAIMMS.path  = $OPENDAP_IMOS_DIR/FAIMMS

This config.txt.bckp could be read and translated by a common bash script, ninjaTranslate.sh, for the whole data-services repo. This bash script should:

  1. find all config.txt.bckp in the data-services repo
  2. translate all env variables found in config.txt.bckp to their correct value
  3. create equivalent config.txt files with the correct values

This would be equivalent to applying a patch to the full repo every time the master branch gets checked out.
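A possible sketch of ninjaTranslate.sh, assuming the gettext envsubst utility is available on the boxes (the exact behaviour is of course up for discussion):

#!/usr/bin/env bash
set -eu

# Find every config.txt.bckp in the repo, expand the environment variables it
# references, and write the result next to it as config.txt.
find "$DATA_SERVICES_DIR" -name 'config.txt.bckp' | while read -r bckp; do
    envsubst < "$bckp" > "${bckp%.bckp}"    # config.txt.bckp -> config.txt
done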

Solution 2

If we want to stick with Possibility 1 mentioned above, we will need to create as many config files as there are programming languages used by a given piece of software:

-config.py.txt
-config.bash.txt
-config.mat.txt

This seems really dirty, and again a source of mistakes.

Solution 3

running out of ideas

Conclusion

I'm sure the dev team would have heaps of ideas and feedback. @dnahodil, since you're the Man this week, could we sit down with a few people sometime?

SOOP - TRV duplicates

AIMS now provides QAQC'd SOOP TRV data in real time, as opposed to delayed mode in the past, without notifying us. As a result, we have duplicates. The code needs to handle deleting the previously downloaded files.

Talend processing hangs

10-aws-syd

root@10-aws-syd:~$ ps aux  | grep talend | grep ANMN
root     31039  0.0  0.0  50428  1848 pts/1    T    10:55   0:00 sudo -u talend /mnt/ebs/talend/bin/talend-trigger -c /mnt/ebs/talend/etc/trigger.conf --delete -f IMOS/ANMN/NRS/REAL_TIME/NRSYON/humidity/percentage_relative_humidity_channel_4100/2011/QAQC/IMOS_ANMN_M_20110101T000000Z_NRSYON_FV01_END-20110124T123928Z_C-20150325T105818Z.nc,IMOS/ANMN/NRS/REAL_TIME/NRSYON/humidity/percentage_relative_humidity_channel_4100/2011/QAQC/IMOS_ANMN_M_20110101T000000Z_NRSYON_FV01_END-20110124T123928Z_C-20150325T105818Z.nc
talend   31040  0.0  0.0  52432  9332 pts/1    Tl   10:55   0:00 ruby /mnt/ebs/talend/bin/talend-trigger -c /mnt/ebs/talend/etc/trigger.conf --delete -f IMOS/ANMN/NRS/REAL_TIME/NRSYON/humidity/percentage_relative_humidity_channel_4100/2011/QAQC/IMOS_ANMN_M_20110101T000000Z_NRSYON_FV01_END-20110124T123928Z_C-20150325T105818Z.nc,IMOS/ANMN/NRS/REAL_TIME/NRSYON/humidity/percentage_relative_humidity_channel_4100/2011/QAQC/IMOS_ANMN_M_20110101T000000Z_NRSYON_FV01_END-20110124T123928Z_C-20150325T105818Z.nc
talend   31042  0.0  0.0   4396   612 pts/1    T    10:55   0:00 sh -c /mnt/ebs/talend/jobs/anmn_nrs_dar_yon-anmn_nrs_dar_yon/java/ANMN_NRS_DAR_YON_harvester/ANMN_NRS_DAR_YON_harvester_run.sh --context_param paramFile="/mnt/ebs/talend/jobs/anmn_nrs_dar_yon-anmn_nrs_dar_yon/etc/anmn_nrs_dar_yon-anmn_nrs_dar_yon.conf" --context_param base=/tmp/d20160506-31040-1ozfy8x --context_param fileList=/tmp/harvester_file_list20160506-31040-16c82ee --context_param logDir=/tmp/d20160506-31040-1tsjkno
talend   31044  0.0  0.0   4396   608 pts/1    T    10:55   0:00 /bin/sh /mnt/ebs/talend/jobs/anmn_nrs_dar_yon-anmn_nrs_dar_yon/java/ANMN_NRS_DAR_YON_harvester/ANMN_NRS_DAR_YON_harvester_run.sh --context_param paramFile=/mnt/ebs/talend/jobs/anmn_nrs_dar_yon-anmn_nrs_dar_yon/etc/anmn_nrs_dar_yon-anmn_nrs_dar_yon.conf --context_param base=/tmp/d20160506-31040-1ozfy8x --context_param fileList=/tmp/harvester_file_list20160506-31040-16c82ee --context_param logDir=/tmp/d20160506-31040-1tsjkno
talend   31047  0.0  0.6 2648556 96964 pts/1   Tl   10:55   0:01 java -Xms256M -Xmx1024M -cp classpath.jar: anmn_nrs_dar_yon_ts.anmn_nrs_dar_yon_harvester_0_1.ANMN_NRS_DAR_YON_harvester --context_param paramFile=/mnt/ebs/talend/jobs/anmn_nrs_dar_yon-anmn_nrs_dar_yon/etc/anmn_nrs_dar_yon-anmn_nrs_dar_yon.conf --context_param base=/tmp/d20160506-31040-1ozfy8x --context_param fileList=/tmp/harvester_file_list20160506-31040-16c82ee --context_param logDir=/tmp/d20160506-31040-1tsjkno

Attempted to kill the connection client on db-prod without any result.

Had to kill them manually on 10-aws-syd.

IMOS Checker: wave height variables are not vertical coordinates

The IMOS checker is failing on files containing wave height parameters, which it is erroneously identifying as vertical coordinates. E.g.

        float WWSH(TIME) ;
                WWSH:standard_name = "sea_surface_wind_wave_significant_height" ;
                WWSH:long_name = "sea_surface_wind_wave_significant_height" ;
                WWSH:units = "m" ;
                WWSH:reference_datum = "sea surface" ;
                WWSH:positive = "up" ;

results in

    WWSH                               :3:    20/21 :  
        standard_name                  :3:     1/ 2 :  
            vertical                   :3:     0/ 1 : Variable WWSH appears to
                                                      be a vertical coordinate,
                                                      should have attribute
                                                      standard_name = 'height'

messy repo

I don't want to police people, but the repo is starting to get pretty messy. When we first had a chat about it, we mentioned this (see the readme):

Folder structure
The suggested naming convention we agreed on with the developers for the different POs' scripts was: [FACILITY_NAME]/[SUB-FACILITY_NAME][script_name][programming_language]

example : FAIMMS/faimms_data_rss_channels_process_matlab

Most of the folders don't respect this, don't have an easy-to-read folder hierarchy, and have no readme file explaining what the scripts are supposed to do...

I think we (PO's) should all try to make this repo more consistent and cleaner.

Thoughts anyone ?

incrond has no thread limit

incrond has no thread limit. It could be replaced by another tool; however, it seems pretty stable.

We'll need to decide whether we are going to limit the number of threads it can spawn (a problem if many files are uploaded at once - a thread bomb). Another option is to use a different tool.

I don't think it's urgent, but I'm putting it here so we don't forget about it.

ANMN burst average input flag question

Does this come from Monique's original code? Can you explain the rationale behind it? I'm not sure I understand... I would have removed anything that is not flagged 0, 1 or 2.

Could not find uploader for file

A recent upload from one of the regular ANMN uploaders had a couple of erroneous files, but the error message was sent to me because:

Mar 21 11:28:56 10-aws-syd ANMN_NRS: Could not find uploader for file '/mnt/ebs/tmp/tmp.Y3UBpkI1oe/IMOS_ANMN-NRS_CDEKSTUZ_20090527T020337Z_NSNSI_FV00_Profile-SBE-19plus_C-20151214T043847Z.nc'

Incoming files not detected?

There are 112 files in /mnt/ebs/incoming/ANMN/QLD, uploaded between 16:30 and 18:30 last night, which have not triggered the incoming handler.

Python coding style and consistency

Now that more of us are starting to write more Python code, it might be helpful to converge a bit in terms of coding style. In particular, naming conventions for variables, functions, classes and files. It just makes the code a bit more readable, so it's worth putting a bit of effort (not heaps) into it. As this detailed style guide for Python code suggests "A style guide is about consistency. Consistency with this style guide is important. Consistency within a project is more important. Consistency within one module or function is most important. "

My own code needs some cleaning up, but first we should agree on what to aim for. I propose we adopt the following conventions as an ideal:

  • Use 4 spaces (not tabs) per indentation level, and >=4 spaces for long lines (examples)
  • Module names (i.e. names of .py files) should be lower_case, with underscores if necessary
  • Class names should use CapWords (capitalise first letter of every word, no underscores). If any word is an abbreviation, capitalise the whole word, e.g. use ANMNFileClassifier instead of AnmnFileClassifier
  • Function names should be lower_case, with underscores between words (alternatively, if we decide we have too many in mixedCase already, we can stick with that, though IMHO they're less readable. Also we use the lower_case convention in our bash scripts)
  • Variable names should be lower_case, same as functions
  • Use leading underscores for non-public methods and variables
  • Constants within a module should be defined at the top and use UPPER_CASE
  • Docstrings - describe each module, class and function by adding a "triple-double-quoted" string directly after the definition, e.g.
    def do_something(arg):
        """Do something with arg and return the result.
        If no argument given, do nothing.
        """

These are just suggestions, open for discussion.
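To make the list above concrete, here is a small illustrative module following these conventions (all the names are made up for the example):

"""Classify ANMN files for publication (illustrative example only)."""

DEFAULT_FACILITY = 'ANMN'    # module-level constant in UPPER_CASE


class ANMNFileClassifier(object):
    """Work out the publication path for ANMN files."""

    def __init__(self, facility=DEFAULT_FACILITY):
        self.facility = facility
        self._cache = {}    # leading underscore: non-public attribute

    def destination_path(self, file_name):
        """Return the publication path for file_name.

        Results are cached, so repeated calls with the same name are cheap.
        """
        if file_name not in self._cache:
            self._cache[file_name] = '/'.join([self.facility, file_name])
        return self._cache[file_name]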

File disappears after going to $INCOMING_DIR

Some files completely disappear from the system after going to $INCOMING_DIR

How to reproduce

on aws10, connect as PO user, and do a source ./env at the base of data-services dir (would be good to add this automatically via chef btw)

# copy a SOOP ASF MT file to its incoming dir
cp $WIP_DIR/SOOP/SOOP_XBT_ASF_SST/data_unsorted/ASF_SST/ship/2016/IMOS_SOOP-ASF_MT_20160404T000000Z_VLMJ_FV01_C-20160405T040053Z.nc $INCOMING_DIR/SOOP/ASF/MT   

input_logf SOOP_ASF_MT
Apr  6 14:44:36 10-aws-syd SOOP_ASF_MT: '/mnt/ebs/tmp/tmp.oltFjHYb6U/IMOS_SOOP-ASF_MT_20160404T000000Z_VLMJ_FV01_C-20160405T040053Z.nc' is not a valid file, aborting.

### similar behaviour with 
cp $WIP_DIR/SOOP/SOOP_XBT_ASF_SST/data_unsorted/ASF_SST/ship/2016/IMOS_SOOP-SST_T_20160405T000300Z_VRDU8_FV01_C-20160406T001307Z.nc $INCOMING_DIR/SOOP/SST

input_logf SOOP_SST
Apr  6 15:16:27 10-aws-syd SOOP_SST: '/mnt/ebs/tmp/tmp.hr9huOeRry/IMOS_SOOP-SST_T_20160405T000300Z_VRDU8_FV01_C-20160406T001307Z.nc' is not a valid file, aborting.

OK, why not - the file is 'not valid'. Well, the original one is. But it looks like there is an issue with the tmp file here: https://github.com/aodn/data-services/blob/master/lib/common/util.sh#L165. The definition of 'valid' is a bash check to see whether the file exists or not, so this is a very low-level check.

The issue is that we should still be able to locate the 'bad' file, but it is neither in $ERROR_DIR nor in $INCOMING_DIR. This is EXTREMELY concerning. Of course, Nagios won't warn us about anything because there are no files in $ERROR_DIR. So we have potentially lost many files.

Both these folders are empty, as well as the thredds folder where the file should go to.

ls -la $INCOMING_DIR/SOOP/ASF/MT 
ls -la $ERROR_DIR/SOOP_ASF_MT

I wish I was wrong, I've tried and checked many times.

@smancini @jonescc @pblain

ANMN T-gridded product file names don't meet naming convention

The process creates file names like this:
IMOS_ANMN-NSW_Temperature_20100702T003500Z_CH070_FV02_CH070-1007-regridded_END-20100907T000500Z_C-20141211T025746Z.nc

This doesn't actually meet the IMOS file naming convention because the third field (after the sub-facility code) should consist of data codes indicating the type of parameters in the file. In this case it should be 'TZ' instead of 'Temperature'.

suggestions to get the data-services scripts on nsp

We need to find a way to Chef-manage this repo so it contains everyone's code:

  1. download the master branch to the location defined by Dan (?) on nsp5
  2. maybe create a folder (called crontab_files) in the main directory of this repo with the crontab files we want. Those files can be moved automatically
  3. ...

ANMN burst average new global attribute product_processing_description

Is this new global attribute product_processing_description really useful?

I can understand it could be good to trace where this FV02 file comes from, but this reference only half addresses the question, since this particular file in master can have a myriad of different versions. Either you keep track of a precisely identified version of the code or you don't.

Pipeline fails to send email to uploader

From the process.log:

Jun 16 15:23:55 10-aws-syd ANMN_NSW: Sending NetCDF Checker report to 'xxx@yyy'
Jun 16 15:23:55 10-aws-syd ANMN_NSW: cat: /mnt/ebs/log/data-services/ANMN_NSW/NRSPHB_2014.zip.20160616-152355.log: No such file or directory
Jun 16 15:23:55 10-aws-syd ANMN_NSW: Could not process file '/mnt/ebs/tmp/tmp.DqF094L3d6/NRSPHB_2014.zip': IMOS_ANMN-NRS_CDEKOSTUZ_20140120T224730Z_NRSPHB_FV00_Profile-SBE19plus_C-20160606T060514Z.nc has incorrect name or was uploaded to 
the wrong place
Jun 16 15:23:55 10-aws-syd ANMN_NSW: Moving '/mnt/ebs/tmp/tmp.DqF094L3d6/NRSPHB_2014.zip' -> '/mnt/ebs/error/ANMN_NSW/NRSPHB_2014.zip.20160616-152355'
Jun 16 15:23:56 10-aws-syd ANMN_NSW: smtp-server: 535 Authentication failed: Bad username / password#015
Jun 16 15:23:56 10-aws-syd ANMN_NSW: . . . message not sent.

Where is S3 mounted on 10-aws-syd ???

The ANMN pipelines (and presumably others) assume the public files are accessible at the path pointed to by the $DATA_DIR env variable. On 10-aws-syd, this is /mnt/imos-data. However, that location just has an empty hierarchy for SOOP TMV (no files):

mhidas@10-aws-syd:~$ ls -aR /mnt/imos-data/
/mnt/imos-data/:
.  ..  IMOS

/mnt/imos-data/IMOS:
.  ..  public

/mnt/imos-data/IMOS/public:
.  ..  SOOP

/mnt/imos-data/IMOS/public/SOOP:
.  ..  TMV

(etc...)

Where are all the S3 files? The incoming handlers need to be able to see them so that they can determine if any previous versions of a file need to be dealt with before indexing the new file.

Need regular clean-up of `/mnt/ebs/tmp` on 10-aws-syd

This directory is used for temp storage during processing (see https://github.com/aodn/chef-private/pull/1776), and it looks like files are not always cleaned up. There's currently 29 GB worth of stuff in there. For the moment this is not a problem as there's still 156 GB free on /mnt/ebs, but as this filesystem is also used for incoming, error, logging and various other things, some bad things could happen if it fills up.

So, it would be good to periodically clean out the oldest files in /mnt/ebs/tmp. We could easily add a cron job here (see the sketch below), but maybe this should be set up in chef?
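For example, something like this cron.d entry (the 7-day threshold and schedule are just placeholders to discuss):

# Remove anything under /mnt/ebs/tmp not modified for more than 7 days.
30 2 * * * root find /mnt/ebs/tmp -mindepth 1 -mtime +7 -delete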

@jonescc @julian1 @danfruehauf any thoughts?

s3 mime in po box - error

When dealing with a file added to the pipeline via the PO box, s3_put_no_index_keep_file is eventually called.

If the file is not a NetCDF, there used to be some sort of issue downloading the file via the S3 explorer.

Commit 184f266 was supposed to fix this. However, it introduces an issue in the PO box, because echo $S3CMD > s3cmd-mocked doesn't know about the MIME types.

generic_handler: failing regex filter should be reported to uploader

For a file handled by the generic_incoming_handler, if the name doesn't match the regex filter, it currently just moves it to the error dir, with a message like Did not pass regex filter '^IMOS_ANMN-NRS.*realtime.*.nc$' in the log. Usually it will be the uploader who can fix this, so this feedback needs to go to them.
It could also be reworded to something more helpful, like 'The file was either uploaded to the wrong incoming directory, or has an incorrect file name.'

Need a unified source of standard netCDF attributes

Many of our processes that create netCDF files require templates to set global (and variable) attributes. Currently this is being done in a variety of different ways, from multiple sources, often setting the same basic attributes (project, acknowledgements, etc..):

  • the Matlab Toolbox has text files like this for global attributes, and imosParameters.txt for variable attributes
  • IMOSnetCDF.py uses text files like this for both global and variable attributes. The format used is consistent with CDL, though the files are not complete CDL files. (It also reads a copy of the imosParameters.txt file from the Toolbox, which is out of date now...)
  • Loz has a generate_netcdf_att.py script which reads global and variable attributes from a config file
  • the ACORN current_generator code simply sets attributes in a acorn_constants.py file.

There may be others too...

There are two issues here:

  1. We have redundant code doing the same thing in different ways.
  2. We have redundant versions of the same standard global attributes in several locations.

It would be helpful to come up with a solution that removes, or at least minimises, both of these issues.
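As a starting point for discussion, a hypothetical sketch of what a single shared helper could look like: one template file holding the standard global attributes (simple 'name = value' lines; the path, format and function names here are assumptions), applied the same way by every process:

from netCDF4 import Dataset


def read_attribute_template(template_path):
    """Parse 'name = value' lines into a dict, skipping blanks and comments."""
    attributes = {}
    with open(template_path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith('#'):
                continue
            name, _, value = line.partition('=')
            attributes[name.strip()] = value.strip()
    return attributes


def apply_global_attributes(netcdf_path, template_path):
    """Set every attribute from the template as a global attribute."""
    with Dataset(netcdf_path, 'a') as nc:
        nc.setncatts(read_attribute_template(template_path))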

IMOS checks on "LATITUDE", "LONGITUDE" and "TIME" too specific

These checks only look for variables with the names LATITUDE, LONGITUDE and TIME (the same issue for vertical variables has already been fixed - aodn/compliance-checker#61). Variables are not required to have these specific names (though in most IMOS data they do), and should be identified according to their name or attributes (see the sketch after this list). e.g. LATITUDE is any variable that has:

  • variable name in _possibley list (cf/util.py), OR
  • standard_name="latitude", OR
  • axis="Y", OR
  • units in _possibleyunits list (cf/util.py)
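A rough sketch (not the checker's actual code) of what identifying LATITUDE by attributes could look like; the name and unit tuples below are placeholders standing in for _possibley and _possibleyunits:

POSSIBLE_LAT_NAMES = ('LATITUDE', 'latitude', 'lat')    # placeholder for _possibley
POSSIBLE_LAT_UNITS = ('degrees_north', 'degree_north', 'degrees_N', 'degree_N')


def looks_like_latitude(var):
    """Return True if a netCDF variable should be treated as latitude."""
    return (var.name in POSSIBLE_LAT_NAMES
            or getattr(var, 'standard_name', None) == 'latitude'
            or getattr(var, 'axis', None) == 'Y'
            or getattr(var, 'units', None) in POSSIBLE_LAT_UNITS)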

Incoming file with space in filename not handled

This should (hopefully) be a rare occurrence, but just so everyone's aware, if a file is uploaded with any space characters in the file name, it will not trigger the pipeline.

e.g.

$ ftp incoming.aodn.org.au
...
ftp> put "Filename with space.docx"
... (wait) ...
ftp> ls
-rw-rw-r--    1 110      100         95669 May 27 00:42 Filename with space.docx

Pipeline deleted incoming (corrupted) zip file

From input_log ANMN_NRS:

Mar  1 16:25:44 10-aws-syd ANMN_NRS: Unzipping '/mnt/ebs/incoming/ANMN/NRS/NRSKAI_2015_11.zip' to /mnt/ebs/wip/ANMN_NRS
Mar  1 16:25:44 10-aws-syd ANMN_NRS: End-of-central-directory signature not found.  Either this file is not
Mar  1 16:25:44 10-aws-syd ANMN_NRS: a zipfile, or it constitutes one disk of a multi-part archive.  In the
Mar  1 16:25:44 10-aws-syd ANMN_NRS: latter case the central directory and zipfile comment will be found on
Mar  1 16:25:44 10-aws-syd ANMN_NRS: the last disk(s) of this archive.
Mar  1 16:25:44 10-aws-syd ANMN_NRS: unzip:  cannot find zipfile directory in one of /mnt/ebs/incoming/ANMN/NRS/NRSKAI_2015_11.zip or
Mar  1 16:25:44 10-aws-syd ANMN_NRS: Processing 0 extracted files...
Mar  1 16:25:44 10-aws-syd ANMN_NRS: /mnt/ebs/incoming/ANMN/NRS/NRSKAI_2015_11.zip.zip, and cannot find /mnt/ebs/incoming/ANMN/NRS/NRSKAI_2015_11.zip.ZIP, period.

This was handled by ANMN/common/incoming_handler.sh, which in turn uses the unzip_file function (https://github.com/aodn/data-services/blob/master/lib/common/util.sh#L417). It seems this function doesn't return a non-zero exit value when the unzip fails.
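A sketch of the kind of change that could be made (this is not the actual util.sh code): capture unzip's exit status and propagate it, so a corrupted zip ends up in the error directory instead of being reported as '0 extracted files':

unzip_file() {
    local zip_file=$1 dest_dir=$2

    unzip -o -d "$dest_dir" "$zip_file"
    local unzip_status=$?

    if [ $unzip_status -ne 0 ]; then
        echo "Failed to extract '$zip_file' (unzip exit status $unzip_status)" >&2
        return $unzip_status
    fi
}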

pipeline processing (ANMN_AM) trying to delete file on S3 using 'rm'

When processing new real-time data from ANMN-AM and trying to delete the previous version, we get this in the log:

Sep 13 12:25:17 10-nsp-mel ANMN_AM: ERROR: Invalid command: u'rm'
Sep 13 12:25:17 10-nsp-mel ANMN_AM: Could not set delete 's3://imos-data/IMOS/ANMN/AM/NRSMAI/CO2/real-time/IMOS_ANMN-AM_KST_20150423T000000Z_NRSMAI_FV00_NRSMAI-CO2-1504-realtime-raw_END-20150912T020000Z_C-20150912T022514Z.nc'

Strange, because we're explicitly saying...

s3_rm IMOS/$path_hierarchy/`basename $prev_file`
rm -f $prev_file

where $prev_file is the result of a search on opendap. I have also confirmed that the previous versions are correctly being deleted on the opendap filesystem, but not on S3.

A couple of issues with python modules used by AIMS realtime scripts

While looking at #578 I noticed that the function pass_netcdf_checker seems to be imported from aims.realtime_util (in anmn_nrs_aims.py, faimms.py, and soop_trv.py) but that module in turn imports it from util.

Three minor issues with this:

  • Why isn't lib/aims/realtime_util.py under lib/python/ ?
  • It's generally a good idea to avoid from module import *
  • Why don't those scripts just import the pass_netcdf_checker directly from lib/python/util.py ?

Needs refactoring

The primary S3 interaction code is duplicated/copy-pasted in multiple places.

meteo@debian:~/imos/data-services$ md5sum $( grep -Rl    'export \-f s3_put_no_index_keep'   )
grep: SOOP/SOOP_TRV/env: No such file or directory
grep: FAIMMS/REALTIME/env: No such file or directory
grep: AATAMS/AATAMS_sattag_dm/env: No such file or directory
grep: ANMN/NRS_AIMS/REALTIME/env: No such file or directory
grep: AUV/auv_viewer_campaign_processing/env: No such file or directory
8129b50f81a4c291e44194348d7d5394  ACORN/ACORN_data_aggregation/lib/common/s3.sh
8129b50f81a4c291e44194348d7d5394  lib/common/s3.sh
8129b50f81a4c291e44194348d7d5394  SOOP/SOOP_TRV/lib/common/s3.sh
8129b50f81a4c291e44194348d7d5394  SOOP/SOOP_aggregation_dataset/lib/common/s3.sh
8129b50f81a4c291e44194348d7d5394  FAIMMS/REALTIME/lib/common/s3.sh
8129b50f81a4c291e44194348d7d5394  AATAMS/AATAMS_sattag_dm/lib/common/s3.sh
8129b50f81a4c291e44194348d7d5394  SRS/SRS_OC_BODBAW/lib/common/s3.sh
8129b50f81a4c291e44194348d7d5394  ANMN/NRS_AIMS/REALTIME/lib/common/s3.sh
8129b50f81a4c291e44194348d7d5394  AUV/auv_viewer_campaign_processing/lib/common/s3.sh

generic_incoming_handler not sending emails!

I only just realised that the generic handler only uses the file_error function to log when a file has failed compliance checks, instead of file_error_and_report_to_uploader to also send an email to the uploader!

This means a backup recipient email will also need to be passed as an argument to the generic handler (or set as an env variable?)

ANMN burst average product might be out of sync with original FV01 in terms of metadata

As I was telling Loz, if we rebuild a NetCDF from scratch by looking at a lookup table for variable attributes and at a specific NetCDF structure (dimensions, extra scalar variables, global attributes, variable attributes), then the day we update the content, the structure of the FV01 file or the nature of the CF/IMOS checks, the resulting FV02 will not be consistent with the FV01 files or won't pass the new checks. The lookup table will need updating and the burst average code will need to be modified to adapt to the new file structure or new checks.

We don't need to worry about historical files that don't pass checks. They should be re-uploaded or fixed once and for all. This burst average process should be as agnostic as possible about what makes a file pass the checker or not. All it should do is process files that have just passed the checker in order to produce files that will automatically pass it again.

A simpler and more generic approach consists of taking the FV01 file as it is as the starting point for building a new FV02.

For example, if we have:

        TIME = 250840 ;
variables:
        double TIME(TIME) ;
                TIME:standard_name = "time" ;
                TIME:long_name = "time" ;
                TIME:units = "days since 1950-01-01 00:00:00 UTC" ;
                TIME:calendar = "gregorian" ;
                TIME:axis = "T" ;
                TIME:valid_min = 0. ;
                TIME:valid_max = 90000. ;
        double LATITUDE ;
                LATITUDE:standard_name = "latitude" ;
                LATITUDE:long_name = "latitude" ;
                LATITUDE:units = "degrees_north" ;
                LATITUDE:axis = "Y" ;
                LATITUDE:reference_datum = "geographical coordinates, WGS84 projection" ;
                LATITUDE:valid_min = -90. ;
                LATITUDE:valid_max = 90. ;
        double LONGITUDE ;
                LONGITUDE:standard_name = "longitude" ;
                LONGITUDE:long_name = "longitude" ;
                LONGITUDE:units = "degrees_east" ;
                LONGITUDE:axis = "X" ;
                LONGITUDE:reference_datum = "geographical coordinates, WGS84 projection" ;
                LONGITUDE:valid_min = -180. ;
                LONGITUDE:valid_max = 180. ;
        float NOMINAL_DEPTH ;
                NOMINAL_DEPTH:standard_name = "depth" ;
                NOMINAL_DEPTH:long_name = "nominal depth" ;
                NOMINAL_DEPTH:units = "metres" ;
                NOMINAL_DEPTH:axis = "Z" ;
                NOMINAL_DEPTH:positive = "down" ;
                NOMINAL_DEPTH:reference_datum = "sea surface" ;
                NOMINAL_DEPTH:valid_min = -5.f ;
                NOMINAL_DEPTH:valid_max = 12000.f ;
        float TEMP(TIME) ;
                TEMP:coordinates = "TIME LATITUDE LONGITUDE NOMINAL_DEPTH DEPTH" ;
                TEMP:standard_name = "sea_water_temperature" ;
                TEMP:long_name = "sea_water_temperature" ;
                TEMP:units = "Celsius" ;
                TEMP:valid_min = -2.5f ;
                TEMP:valid_max = 40.f ;
                TEMP:_FillValue = 999999.f ;
                TEMP:ancillary_variables = "TEMP_quality_control" ;
        byte TEMP_quality_control(TIME) ;
                TEMP_quality_control:long_name = "quality flag for sea_water_temperature" ;
                TEMP_quality_control:standard_name = "sea_water_temperature status_flag" ;
                TEMP_quality_control:valid_min = 0b ;
                TEMP_quality_control:valid_max = 9b ;
                TEMP_quality_control:_FillValue = 99b ;
                TEMP_quality_control:quality_control_set = 1. ;
                TEMP_quality_control:quality_control_conventions = "IMOS standard set using the IODE flags" ;
                TEMP_quality_control:flag_values = 0b, 1b, 2b, 3b, 4b, 5b, 6b, 7b, 8b, 9b ;
                TEMP_quality_control:flag_meanings = "No_QC_performed Good_data Probably_good_data Bad_data_that_are_potentially_correctable Bad_data Value_changed Not_used Not_used Not_used Missing_value" ;
                TEMP_quality_control:quality_control_global_conventions = "Argo reference table 2a (see http://www.cmar.csiro.au/argo/dmqc/user_doc/QC_flags.html), applied on data in position only (between global attributes time_deployment_start and time_deployment_end)" ;
                TEMP_quality_control:quality_control_global = "A" ;
// global attributes:
                :toolbox_input_file = "\\\\Pearl\\imos\\NRS\\Yongala\\MOORINGS\\Field\\20140516_GM27Trip5912\\Data\\Wetlabs\\WQM0064_002.DAT" ;
                :toolbox_version = "2.4 - PCWIN64" ;
                :file_version = "Level 1 - Quality Controlled Data" ;
                :file_version_quality_control = "Quality controlled data have passed quality assurance procedures such as automated or visual inspection and removal of obvious errors. The data are using standard SI metric units with calibration and other routine pre-processing applied, all time and location values are in absolute coordinates to agree to standards and datum, metadata exists for the data or for the higher level dataset that the data belongs to. This is the standard IMOS data level and is what should be made available to eMII and to the IMOS community." ;
                :project = "Integrated Marine Observing System (IMOS)" ;
                :Conventions = "CF-1.6,IMOS-1.3" ;
                :standard_name_vocabulary = "CF-1.6" ;
                :title = "Yongala National Reference Station - Bottom Frame. Deployed November 2013" ;
                :institution = "ANMN-NRS" ;
                :date_created = "2015-03-02T06:20:17Z" ;
                :abstract = "The Queensland and Northern Australia mooring sub-facility is based at the Australian Institute for Marine Science in Townsville.  The sub-facility is responsible for moorings in two geographic regions: Queensland Great Barrier Reef, where four pairs of regional moorings and one National Reference Station are maintained; and Northern Australia, where a National Reference Station and transect of the Timor Sea comprising four regional moorings, are maintained." ;
                :comment = "Not recording on recovery - stopped on 23/12/2013. 80% BLIS remaining Geospatial vertical min/max information has been computed using the Gibbs-SeaWater toolbox (TEOS-10) v3.02 from latitude and relative pressure measurements (calibration offset usually performed to balance current atmospheric pressure and acute sensor precision at a deployed depth)." ;
                :source = "Bottom Frame - Tripod" ;
                :instrument = "WETLABS WQM" ;
                :keywords = "WQM, TIME, LATITUDE, LONGITUDE, NOMINAL_DEPTH, TEMP, PRES_REL, PSAL, DOX1_2, CHLU, TURB, DOX2, DEPTH" ;
                :references = "http://www.imos.org.au" ;
                :netcdf_version = "4.1.3" ;
                :quality_control_set = 1. ;
                :site_code = "NRSYON" ;
                :platform_code = "NRSYON" ;
                :deployment_code = "NRSYON-1311" ;
                :featureType = "timeSeries" ;
                :naming_authority = "IMOS" ;
                :instrument_serial_number = "064" ;
                :instrument_sample_interval = 1. ;
                :instrument_burst_interval = 900. ;
                :instrument_burst_duration = 59. ;
                :institution_address = "Australian Institute of Marine Science, 1526 Cape Cleveland Road, Cape Cleveland, Queensland, 4810" ;
                :institution_postal_address = "AIMS, PMB 3, Townsville MC, Townsville 4810, Queensland, Australia" ;
                :history = "2015-03-02T06:20:40Z - depthPP: Depth computed using the Gibbs-SeaWater toolbox (TEOS-10) v3.02 from latitude and relative pressure measurements (calibration offset usually performed to balance current atmospheric pressure and acute sensor precision at a deployed depth)." ;
                :quality_control_log = "imosImpossibleDateQC(dateMin=01/01/2007, dateMax=02/03/2015) did not fail on any TIME sample.\nimosImpossibleLocationSetQC(distanceKmPlusMinusThreshold=2.5) did not fail on any LATITUDE sample.\nimosImpossibleLocationSetQC(distanceKmPlusMinusThreshold=2.5) did not fail on any LONGITUDE sample.\nimosInOutWaterQC(in=10/11/13 04:12:00, out=02/06/14 00:20:00) flagged 105 TEMP samples with flag Bad_data.\nimosInOutWaterQC(in=10/11/13 04:12:00, out=02/06/14 00:20:00) flagged 105 PRES_REL samples with flag Bad_data.\nimosInOutWaterQC(in=10/11/13 04:12:00, out=02/06/14 00:20:00) flagged 105 PSAL samples with flag Bad_data.\nimosInOutWaterQC(in=10/11/13 04:12:00, out=02/06/14 00:20:00) flagged 105 DOX1_2 samples with flag Bad_data.\nimosInOutWaterQC(in=10/11/13 04:12:00, out=02/06/14 00:20:00) flagged 105 CHLU samples with flag Bad_data.\nimosInOutWaterQC(in=10/11/13 04:12:00, out=02/06/14 00:20:00) flagged 105 TURB samples with flag Bad_data.\nimosInOutWaterQC(in=10/11/13 04:12:00, out=02/06/14 00:20:00) flagged 105 DOX2 samples with flag Bad_data.\nimosInOutWaterQC(in=10/11/13 04:12:00, out=02/06/14 00:20:00) flagged 105 DEPTH samples with flag Bad_data.\nimosGlobalRangeQC(min=-2.5, max=40) did not fail on any TEMP sample.\nimosGlobalRangeQC(min=-15, max=12000) did not fail on any PRES_REL sample.\nimosGlobalRangeQC(min=2, max=41) did not fail on any PSAL sample.\nimosGlobalRangeQC(min=0, max=900000) did not fail on any DOX1_2 sample.\nimosGlobalRangeQC(min=0, max=100) did not fail on any CHLU sample.\nimosGlobalRangeQC(min=0, max=880000) did not fail on any DOX2 sample.\nimosGlobalRangeQC(min=-5, max=12000) did not fail on any DEPTH sample.\nimosImpossibleDepthQC(zNominalMargin=15, maxAngle=70 => min=12.1742, max=33.3248) did not fail on any PRES_REL sample.\nimosImpossibleDepthQC(zNominalMargin=15, maxAngle=70 => min=12.1, max=33.12) did not fail on any DEPTH sample.\nimosSalinityFromPTQC() did not fail on any PSAL sample." ;
                :geospatial_lat_min = -19.3042333333 ;
                :geospatial_lat_max = -19.3042333333 ;
                :geospatial_lon_min = 147.6205666667 ;
                :geospatial_lon_max = 147.6205666667 ;
                :instrument_nominal_height = 0.5 ;
                :instrument_nominal_depth = 27.1 ;
                :site_depth_at_deployment = 27.6 ;
                :geospatial_vertical_min = 27.99f ;
                :geospatial_vertical_max = 27.99f ;
                :geospatial_vertical_positive = "down" ;
                :local_time_zone = 10. ;
                :time_deployment_start = "2013-11-10T04:12:00Z" ;
                :time_deployment_start_origin = "TimeFirstInPos" ;
                :time_deployment_end = "2014-06-02T00:20:00Z" ;
                :time_deployment_end_origin = "TimeLastInPos" ;
                :time_coverage_start = "2013-11-09T22:59:44Z" ;
                :time_coverage_end = "2013-12-23T16:45:33Z" ;
                :data_centre = "eMarine Information Infrastructure (eMII)" ;
                :data_centre_email = "[email protected]" ;
                :author_email = "[email protected]" ;
                :author = "Rigby, Paul" ;
                :principal_investigator = "Steinberg, Craig" ;
                :principal_investigator_email = "[email protected]" ;
                :institution_references = "http://www.imos.org.au/emii.html" ;
                :citation = "The citation in a list of references is: \"IMOS [year-of-data-download], [Title], [data-access-URL], accessed [date-of-access].\"." ;
                :acknowledgement = "Any users of IMOS data are required to clearly acknowledge the source of the material in the format: \"Data was sourced from the Integrated Marine Observing System (IMOS) - IMOS is supported by the Australian Government through the National Collaborative Research Infrastructure Strategy and the Super Science Initiative.\" If relevant, also credit other organisations involved in collection of this particular datastream (as listed in \'credit\' in the metadata record)." ;
                :distribution_statement = "Data may be re-used, provided that related metadata explaining the data has been reviewed by the user, and the data is appropriately acknowledged. Data, products and services from IMOS are provided \"as is\" without any warranty as to fitness for a particular purpose." ;
                :project_acknowledgement = "The collection of this data was funded by IMOS and delivered through the Queensland and Northern Australia Mooring sub-facility of the Australian National Mooring Network operated by the Australian Institute of Marine Science. IMOS is supported by the Australian Government through the National Collaborative Research Infrastructure Strategy, the Super Science Initiative and the Department of Employment, Economic Development and Innovation of the Queensland State Government. The support of the Tropical Marine Network (University of Sydney, Australian Museum, University of Queensland and James Cook University) on the GBR is also acknowledged." ;

We want to keep everything as it is, except:
-we get rid of any '_quality_control' variable.
-we duplicate any variable PARAM that is a function of TIME so that we have: PARAM, PARAM_min, PARAM_max, PARAM_sdv and PARAM_num_obs. These variables automatically inherit the data and attributes from PARAM. Apply mean or median, min, max, standard deviation and count to the data of each of them respectively (flags are taken into account so that this is only done on "good" data), update the relevant attributes accordingly, add cell_methods, etc. (a rough sketch of the statistics step follows this list).
-TIME is treated slightly differently since it doesn't have a TIME_min, TIME_max, etc.
-for global attributes, most of them will still be relevant, some will need to be deleted, some updated.
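A rough sketch of the statistics step described above, assuming numpy arrays holding one burst of a PARAM and its QC flags (the 'good' flag set of 0, 1 and 2 follows the earlier burst-average discussion and is not settled):

import numpy as np


def burst_stats(values, flags, good_flags=(0, 1, 2)):
    """Return mean, min, max, standard deviation and count of the 'good' samples."""
    good = np.isin(flags, good_flags)
    data = values[good]
    if data.size == 0:
        return np.nan, np.nan, np.nan, np.nan, 0
    return data.mean(), data.min(), data.max(), data.std(), data.size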

burst-averaging code writes creation date in local timezone

The last field in the IMOS filename and the date_created attribute are both supposed to contain the UTC date and time the file was created. burst_average.py writes these with the required 'Z' on the end, but the time used is actually local time wherever the process is running (see lines 218 and 253).

See e.g. https://github.com/aodn/data-services/blob/master/lib/python/IMOSnetCDF.py#L102 for a fix, or why not just use that code, which already generates correct IMOS file names and attributes?
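A minimal sketch of generating both fields from the same UTC timestamp (the variable names are illustrative, not the actual burst_average.py code):

from datetime import datetime

now_utc = datetime.utcnow()
creation_field = now_utc.strftime('C-%Y%m%dT%H%M%SZ')     # last field of the file name
date_created = now_utc.strftime('%Y-%m-%dT%H:%M:%SZ')     # date_created global attribute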

IOOS checker and plugins new installation makes AIMS scripts fail

The FAIMMS, ANMN NRS DARWIN YONGALA and SOOP TRV scripts run as cron jobs every day. They also use the NETCDF-CHECKER before pushing a new file to $INCOMING_DIR, to prevent bad files from being downloaded and pushed. This used to work with no issues until last Tuesday night, on both NSP14 and AWS10.

The issue seems to be related to a bad installation of the IOOS CHECKER/plugin system.

On NSP14, run ipython, and paste the following lines

import os
import sys
import tempfile
netcdf_file_path='/mnt/imos-test-data/IMOS/FAIMMS/Heron_Island/Relay_Pole_1/[email protected]_channel_26/2009/NO_QAQC/IMOS_FAIMMS_T_20090101T000000Z_HIRP1_FV00.nc'
netcdf_checker_path = os.path.dirname(os.path.realpath(os.environ.get('NETCDF_CHECKER')))
sys.path.insert(0, netcdf_checker_path)
import cchecker

tmp_json_checker_output = tempfile.mkstemp()
test='imos'
return_value, errors = cchecker.ComplianceChecker.run_checker(netcdf_file_path, [test] , 'None', 'normal', tmp_json_checker_output[1], 'json')

This will fail with the following message :

No valid checkers found for tests 'imos'
---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
/home/lbesnard/data-services/FAIMMS/REALTIME/<ipython-input-17-eeb824a74047> in <module>()
----> 1 return_value, errors = cchecker.ComplianceChecker.run_checker(netcdf_file_path, [test] , 'None', 'normal', tmp_json_checker_output[1], 'json')

/var/lib/netcdf-checker/compliance_checker/runner.py in run_checker(cls, ds_loc, checker_names, verbose, criteria, output_filename, output_format)
     48 
     49         elif output_format == 'json':
---> 50             groups = cls.json_output(cs, score_groups, output_filename, ds_loc, limit)
     51 
     52         else:

/var/lib/netcdf-checker/compliance_checker/runner.py in json_output(cls, cs, score_groups, output_filename, ds_loc, limit)
    121                     cs.json_output(checker, groups, f, ds_loc, limit)
    122 
--> 123         return groups
    124 
    125     @classmethod

UnboundLocalError: local variable 'groups' referenced before assignment

A bit hard to debug, since the PO box doesn't reflect what is on NSP14 and AWS10. This again shows how important it is to have the PO box similar to what is on prod.

However, the issue can be tracked down to this file and line:
/var/lib/netcdf-checker/compliance_checker/runner.py, line 34

(Pdb) core_groups = cs.run(ds, 'imos')
No valid checkers found for tests 'imos'
(Pdb) core_groups = cs.run(ds, 'cf')
No valid checkers found for tests 'cf'

It looks like the tests are not known. The checker plugins were probably not installed in the correct way.

I think this should be raised as a VIT, since it means those 3 datasets won't run until this is fixed.

@smancini @pblain

LJCO not running

@danfruehauf
the cron file for my new script is empty :

see on nsp10

lbesnard@10-nsp-mel:~$ cat /etc/cron.d/_po_SRS_OC_LJCO_AERONET 
# This file is managed by Chef, do not modify it by hand
#

[email protected]

while

lbesnard@10-nsp-mel:~$ cat /mnt/ebs/data-services/cron.d/SRS_OC_LJCO_AERONET 
[email protected]

0 22 * * * lbesnard python SRS/SRS_OC_LJCO_AERONET/downloadAeronetData.py
lbesnard@10-nsp-mel:~$ 

Did I do anything wrong?

Make checker output logs and files in error directory unique

Currently if a file fails the netCDF checker, the checker output is saved in the log directory under the netcdf filename with '.log' appended. If the uploader fixes some of the file and re-uploads it with exactly the same filename, the checker output will get appended to the previous log, which could be confusing if it is then sent back to the uploader. So we need to start a new log for the second upload.

The failed netcdf file itself gets copied to the error directory with a timestamp appended to its name. This makes it a bit more unique, but technically it could still be overwritten if the same file is uploaded twice in quick succession (perhaps unlikely).

So both the log file and the copy in the error directory should be named in a way that makes them unique, i.e. referring to a specific incoming handler process.
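One possible way to do this (the paths and variable names are illustrative, not the actual handler code): include the handler's process ID alongside the timestamp when building the log name, e.g.

# timestamp + PID of the handler process, so two uploads of the same file
# never share a log
log_file="$LOG_DIR/$handler_name/$(basename "$incoming_file").$(date +%Y%m%d-%H%M%S).$$.log"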

Unittest depends on response from geoserver-systest

I just ran the data-services unittests in the PO box. This normally passes, but this time I got:

Executing: 'lib/test/python/test_wfs_query.py'
#############################
E
======================================================================
ERROR: test_wfs_request_matching_file_pattern (__main__.TestGenerateNetCDFAtt)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "lib/test/python/test_wfs_query.py", line 20, in test_wfs_request_matching_file_pattern
    res =  wfs_request_matching_file_pattern(self.layer, self.pattern, geoserver_url=self.geoserver_url, url_column=self.url_column, s3_bucket_url=True)[0]
  File "/vagrant/src/data-services/lib/python/util.py", line 87, in wfs_request_matching_file_pattern
    response  = wfs11.getfeature(typename=imos_layer_name, filter=filterxml)
  File "/usr/local/lib/python2.7/dist-packages/owslib/feature/wfs110.py", line 225, in getfeature
    u = openURL(base_url, data, method, timeout=self.timeout)
  File "/usr/local/lib/python2.7/dist-packages/owslib/util.py", line 186, in openURL
    **rkwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 50, in request
    response = session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 468, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 576, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 435, in send
    raise ReadTimeout(e, request=request)
ReadTimeout: HTTPConnectionPool(host='geoserver-systest.aodn.org.au', port=80): Read timed out. (read timeout=30)

----------------------------------------------------------------------
Ran 1 test in 31.865s

FAILED (errors=1)

I don't think it's a good idea to have this dependence on a server that has nothing to do with anything in the data-services repo. Isn't the idea of a unittest that it tests a single, isolated unit of code?

check_netcdf_cf & check_netcdf_imos - missing arg options

Both the check_netcdf_cf and check_netcdf_imos bash functions are missing the ability to pass some of the netcdf-checker options, such as -criteria [{lenient,normal,strict}] / -c [{lenient,normal,strict}].

This would help pass some files where only low-priority issues fail the check.
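A sketch of how the options could be threaded through (the wrapper bodies are assumptions; only the checker flags quoted above are taken as given):

# Forward any extra arguments (e.g. '-c lenient') to the checker.
check_netcdf_cf() {
    local file=$1; shift
    $NETCDF_CHECKER -t=cf "$@" "$file"
}

check_netcdf_imos() {
    local file=$1; shift
    $NETCDF_CHECKER -t=imos "$@" "$file"
}

# example: check_netcdf_cf somefile.nc -c lenient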

@danfruehauf , do you think you could do this ?

File published by pipeline as CF compliant but is not!

This file was stamped by the pipeline as having passed CF checks:

compliance_checker_version: 1.1.1 (79b23fe87f7e932974b93f7bdc1de29d70d18e74)
compliance_checker_last_updated: 2016-01-21 01:31:49 UTC
history: 2016-05-30T01:33:17Z - timeOffsetPP: TIME values and time_coverage_start/end global attributes have been applied the following offset : +0 hours.
2016-05-30 03:28:39 UTC: passed compliance checks: cf (IOOS compliance checker version 1.1.1)

But when I run a CF check on it using the same version of the checker, it fails!

$ cchecker.py -t=cf http://thredds.aodn.org.au/thredds/dodsC/IMOS/ANMN/NRS/NRSMAI/Biogeochem_profiles/IMOS_ANMN-NRS_CDEKOSTUZ_20150224T235659Z_NRSMAI_FV01_Profile-SBE-19plus_C-20160530T013313Z.nc
Running Compliance Checker on the dataset from: http://thredds.aodn.org.au/thredds/dodsC/IMOS/ANMN/NRS/NRSMAI/Biogeochem_profiles/IMOS_ANMN-NRS_CDEKOSTUZ_20150224T235659Z_NRSMAI_FV01_Profile-SBE-19plus_C-20160530T013313Z.nc


--------------------------------------------------------------------------------
                    The dataset scored 429 out of 430 points                    
                              during the cf check                               
--------------------------------------------------------------------------------
...
--------------------------------------------------------------------------------
                  Reasoning for the failed tests given below:                   


Name                             Priority:     Score:Reasoning
--------------------------------------------------------------------------------
var                                    :2:    97/98 :  
    DIRECTION                          :2:     1/ 2 :  
        check_coordinates              :2:     0/ 1 : The variable DIRECTION
                                                      does not have associated
                                                      coordinates

😕 ❓❓❓

ANMN burst average list_var_to_average improvement

For the function list_var_to_average: in order to list the variables that need to be averaged, I would have listed the specific parameters to average rather than the ones we don't want to average. That way, if a new parameter is introduced, we don't risk adding it to the FV02 product without knowing. The day we want to add a parameter, at least it is a conscious choice.

utils/check-netcdf.sh

/vagrant/src/data-services/lib/common/check-netcdf.sh: line 12: 28112 Killed                  $NETCDF_CHECKER $file "$@" &>$tmp_checker_output

 /vagrant/src/data-services/lib/common/check-netcdf.sh: line 12:  2114 Segmentation fault      $NETCDF_CHECKER $file "$@" &>$tmp_checker_output

/vagrant/src/data-services/lib/common/check-netcdf.sh: line 22: $log_file: ambiguous redirect

only happens when too many files come at once in incoming_dir

@danfruehauf any idea ?

pipeline processing: moves published file into error directory when archive fails!

I was just experimenting with a new pipeline I'm setting up (ABOS_SOTS) and noticed a case where a previously published file needed to be archived, and this process failed (because the archive directory didn't exist on the VM -- but that's not really relevant). The outcome was that the previously published file ended up in the error directory, along with the newly uploaded version.

This should not happen! The pipeline should never un-publish a file unless it replaces it with a newer version, and if it does so, the previous version should be archived.

Here's the full log:

Aug 26 18:19:01 vagrant-ubuntu-precise-64 ABOS_SOTS: mkdir: cannot create directory `/mnt/imos-t4': Permission denied
Aug 26 18:19:01 vagrant-ubuntu-precise-64 ABOS_SOTS: Could not process file '/mnt/opendap/1/IMOS/opendap/ABOS/SOTS/PULSE/IMOS_ABOS-SOTS_W_20150331T130000Z_Pulse_FV01_Pulse-11-MRU-Surface-wave-height_END-20150813T230000Z_fixed.nc': Could not create directory '/mnt/imos-t4/IMOS/archive/ABOS/SOTS/PULSE'
Aug 26 18:19:01 vagrant-ubuntu-precise-64 ABOS_SOTS: Moving '/mnt/opendap/1/IMOS/opendap/ABOS/SOTS/PULSE/IMOS_ABOS-SOTS_W_20150331T130000Z_Pulse_FV01_Pulse-11-MRU-Surface-wave-height_END-20150813T230000Z_fixed.nc' -> '/vagrant/src/error/ABOS_SOTS/IMOS_ABOS-SOTS_W_20150331T130000Z_Pulse_FV01_Pulse-11-MRU-Surface-wave-height_END-20150813T230000Z_fixed.nc.20150826-181900'
Aug 26 18:19:01 vagrant-ubuntu-precise-64 ABOS_SOTS: Could not process file '/var/incoming/ABOS/SOTS/Pulse/IMOS_ABOS-SOTS_W_20150331T130000Z_Pulse_FV01_Pulse-11-MRU-Surface-wave-height_END-20150813T230000Z_new.nc': Not moved out of incoming directory
Aug 26 18:19:01 vagrant-ubuntu-precise-64 ABOS_SOTS: Moving '/var/incoming/ABOS/SOTS/Pulse/IMOS_ABOS-SOTS_W_20150331T130000Z_Pulse_FV01_Pulse-11-MRU-Surface-wave-height_END-20150813T230000Z_new.nc' -> '/vagrant/src/error/ABOS_SOTS/IMOS_ABOS-SOTS_W_20150331T130000Z_Pulse_FV01_Pulse-11-MRU-Surface-wave-height_END-20150813T230000Z_new.nc.20150826-181900'

sharing common routines

Since we now use a common repository, it would be wise to create a shared folder for commonly used routines/packages, some sort of eMII toolbox, so we don't reinvent the wheel all the time. The benefits are more peer review, more knowledge being shared, better coding...

The most common routine we use in our MATLAB scripts is, for example, readConfig.m, first created for the toolbox. It would make more sense to have only one version of it in case it needs updating.

I also just created a set of routines for an AATAMS script to query an SQLite file and get the data into a Matlab format depending on the column type (https://github.com/aodn/data-services/tree/master/AATAMS/AATAMS_sattag_dm/subroutines/sqlite).
This could definitely be of some use to someone at some stage. If this kind of 'package' is not shared, it's more than likely to go unnoticed.

Suggestion

data-services  
              \
               -cron.d
               -AATAMS
               -***  
               -lib/common
                   \
                    -MATLAB
                        \
                         -SQLite
                    -PYTHON
                    -R

Thoughts anyone ?

IMOS check on Conventions attribute too strict

The IMOS checkers currently expect the value of the Conventions global attribute to include the exact string "CF-1.6,IMOS-1.3" or "CF-1.6,IMOS-1.4". However, according to the NetCDF conventions, the value of this attribute can be either space or comma-separated, so we should allow for that.
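A minimal sketch (not the checker's actual code) of a more permissive parse:

import re


def conventions_tokens(conventions):
    """Split a Conventions attribute on commas and/or whitespace."""
    return [token for token in re.split(r'[,\s]+', conventions) if token]

# both separators should then be accepted:
assert 'IMOS-1.3' in conventions_tokens('CF-1.6,IMOS-1.3')
assert 'IMOS-1.4' in conventions_tokens('CF-1.6 IMOS-1.4')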
