stormsurgelive / asgs Goto Github PK

The Automated Solution Generation System (ASGS) provides software infrastructure for automating coastal ocean modelling for real time decision support, and provides a variety of standalone command line tools for pre- and post-processing.

Home Page: https://asgs.sh

License: GNU General Public License v3.0

Perl 16.32% Shell 13.28% Fortran 56.55% C 0.58% Java 0.43% Python 1.12% Gnuplot 0.50% HTML 0.75% Makefile 0.36% CMake 0.03% MATLAB 3.26% Awk 0.03% Roff 0.10% F* 6.66% Starlark 0.01% Smarty 0.01%

adcirc forecast automation tropical cyclones

asgs's Introduction

For a Quick Start, see:

https://github.com/StormSurgeLive/asgs/wiki/ASGS-Cheat-Sheet

Release Engineering:

Stable Release Tag: 2024.stable.2 (also always stable branch)

Latest Stable Release - git checkout stable; git pull origin stable

Development/Bleeding Edge - git checkout master; git pull origin master

asgs's People

Stargazers

Watchers

asgs's Issues

make location of asgs log file configurable in asgs config file

By default, the ASGS log file is written to the same directory where the ASGS is started. This is opaque somewhat non-intuitive. If the log file is stored in the Operator's home directory, lengthy and/or numerous asgs log files can fill up the Operator's home directory. We should have a new ASGS variable called SYSLOGDIR. We should default to setting SYSLOGDIR=$SCRATCHDIR/asgs/log (making the directory if necessary) and give the Operator the ability to set SYSLOGDIR in each asgs config file.

mixed case advisories eventually

A heads up that the NWS local products will no longer be all caps by May 11. The bulletins do not specifically mention the NHC, but that is expected to happen eventually, regardless.

http://www.nws.noaa.gov/os/notification/scn16-12mixed_case.htm

The parsing of National Hurricane Center advisories in the ASGS assumes all caps which is valid for the 2016 hurricane season, but case sensitivity should be removed.

Translate cpra post processing Matlab to Python

The Matlab scripting to plot CPRA interest locations should be rewritten to Python to provide better cross-platform support.

adcirc test case support

The ASGS is oriented toward real time processing, but would work well as a testbed for ADCIRC modifications. In order to run test cases like the shinnecock inlet with various parameters/options, support is required for simply picking up a set of files and executing them with different fort.15 parameters, without looking for external forcing etc. Also, infrastructure is required for post processing to compare the output with expected output.

handle intermediate advisories

The NHC sometimes issues intermediate advisories where they update the storm location or other information between their normal advisory times. The ASGS may not notice these intermediate advisories in real time, or have time to run them before the next advisory is released. These intermediate advisories are also often accompanied by partial lines in the associated BEST track files. The ASGS should be modified to handle these situations without choking on the files.

new pressure wind relationship

The pressure-wind relationships used in storm_track_gen.pl would benefit from the introduction of a different pressure-wind relationship derived after the very active 2005 hurricane season. This equation has more specific variants depending on whether the storm is strengthening or weakening, which basin it is in (Atlantic, Gulf of Mexico, or Caribbean), and whether it is north or south of 25N latitude. The equation(s) and their description can be found in the following reference:

Brown, P. D., J. L. Franklin, and C. Landsea, 2006: A fresh look at tropical cyclone pressure-wind relationships using recent reconnaissance-based best track data (1998–2005). Preprints, 25th Conf. on Hurricanes and Tropical Meteorology, Monterey, CA, Amer. Meteor. Soc., 3B.5. [Available online at http://ams.confex.com/ams/pdfpapers/107190.pdf].

/output Compile Error

I compiled the output directory on Lodestar 5 at UT-Austin and received this error:

../util/nodalattr/nodalattr.f90(328): error #5082: Syntax error, found '<' when expecting one of: <LABEL> <END-OF-STATEMENT> ; TYPE INTEGER REAL COMPLEX BYTE CHARACTER CLASS DOUBLE ...
<<<<<<< HEAD
^
../util/nodalattr/nodalattr.f90(346): error #5082: Syntax error, found '==' when expecting one of: <LABEL> <END-OF-STATEMENT> ; TYPE INTEGER REAL COMPLEX BYTE CHARACTER CLASS DOUBLE ...
=======
^

This is out of my expertise to resolve, any help would be greatly appreciated.

enable meteorologically forced runs without tidal spinup or other hotstart file

ASGS should be able to run a forecast from coldstart without any tidal or other initialization to generate a hotstart file.

2014stable version doc updates needed

Documentation does not include the updated configuration information for the 2014stable version of the ASGS, including definition of the storm ensemble, the PERCENT parameter, and the negative percent required for veerLeft.

numerical instability detection

ASGS can't tell when ADCIRC or ADCIRC+SWAN is having numerical instabilities.

see if a job is hanging in the hpc queue

It is sometimes the case that a parallel job will experience a fatal error and crash, but will not immediately exit an hpc queue. This causes the job to burn up core hours needlessly, as well as delaying or stopping production. The ASGS needs to be able to monitor a job to determine whether this is happening, perhaps by checking the .out file from the job periodically if it is available and/or by checking the last update times of the output files to see if they haven't been updated within a configurable window of time. The window of time could be based on the output frequency and expected performance (timesteps/sec).

produce estimated time of completion for each job

Once benchmarking features have been developed, it should be used by the ASGS for each run to produce an estimated time of completion.

test/debug production asgs on hatteras with rabbitmq merged in

I've merged the upstream/2014stable branch into the 2014stable-rmq branch on the renci-unc fork, resolved conflicts, committed changes and pushed the result back to the renci-unc fork on github. Now I need to test/debug on hatteras to confirm that everything still works ...

test/debug asgs with rabbitmq merged in on queenbee

automatically produce moving hindcasts at configurable intervals

When comparing model hindcast and nowcast data, it is helpful to be able to produce moving window nowcasts (e.g., last 24 hours, last 2 M2 tidal cycles, last 7 days, last 10 days, last 30 days, entire storm history, etc) that look back across model results from previous advisories or cycles. The moving window hindcasts can then be compared to measured data for continuous validation. Due to disk space limitations, it probably only makes sense to do this for station data and min/max data, not for fulldomain data.

test/debug rabbitmq enabled asgs on hatteras

The monitoring changes to asgs using rabbitmq are on the renci-unc fork of the asgs in the 2014stable-rmq branch. Once these changes have been merged into the production version of the asgs and conflicts are resolved, the merged code must be tested and verified on hatteras.

active status monitoring needed

The ASGS sends out emails to notify the operator each time it successfully completes a milestone: detecting new data, completing a forecast, etc. However, it does not always send an email when it experiences failures. The result of this is that Operators receive a large volume of email about successes, rather then targeted emails that provide notice of failure. It is easy to not notice that an ASGS instance has stopped responding, due to this error by omission. An automated external monitoring mechanism is required to positively notify the Operator when results are not produced on schedule.

test/debug asgs with rabbitmq merged in on lonestar5

add comments to all variable initializations in asgs_main.sh to indicate what they mean

There is a list of variable declarations in asgs_main.sh that establishes default values but does not provide any indication of what each variable is meant to do or what the valid values are for each variable. Comments should be added to provide greater transparency and to make it easier to onboard new developers and operators.

run.properties metadata

run.properties file should arguably contain all parameter values from fort.15 file

opendap path contains storm name which may be ambiguous

The opendap path on the THREDDS server where results are stored contains the storm name as a path element. This creates ambiguity for two reasons: (1) the NHC sometimes uses the storm number spelled out as the storm name before a system can become a named storm (e.g., using TWO as the name of the 2nd storm of a year until the storm is officially named, subsequently using the actual name); and (2) reuse of storm names from year-to-year. This path element should be changed to the storm number and year instead (e.g., 122005 instead of Katrina).

implement a data tank subsystem

The asgs repository holds all of the template files and scripts needed for the asgs to run, but it is not appropriate to store large files (like mesh and nodal attributes files). The repository is also not the most optimal place to store binary files that are not built during the installation process. For these purposes, the asgs should have a "data tank" subsystem that can be used to store these large files. The data in the tank should be accessible via cp as well as scp and http for remote data storage. This system can also be used to store grib files and similar files locally so that they only have to be downloaded once, rather than being re-downloaded over and over by different ensemble members. The tank can be filled by the asgs or by separate dedicated processes.

serial processing mode

The ASGS is hardcoded to assume that all runs are executed in parallel (always running adcprep etc). Make the ASGS capable of running single processor compute jobs (i.e., adcirc instead of padcirc).

logging related to a specific run

All the log messages associated with a particular run should be written into the run directory as well as written to the main asgs log file.

All log messages use tee or similar so that the ensemble member directory and the main asgs log file get duplicates so that the ensemble member directory has only messages related to that ensemble member.

cut down asgs-operators and create asgs-announce

One way to improve the use of our asgs-operators mailing list is to dramatically reduce the list membership. The list audience currently includes anyone with an interest in real time ADCIRC model guidance that has asked me to be on it. We could reduce this audience so that it consists exclusively of ASGS (and CERA) Operators. Basically just the recipients of this email.

For those being removed from the asgs-operators list, we will create a new asgs-announce mailing list to provide status updates and other good news.

asgs needs a readme.md file

We need intro documentation to provide orientation.

performance analysis and turnaround time

need to have a subsystem that tracks performance analysis (timesteps per second) for different meshes under different conditions (wave coupling on or off, etc); should also keep track of time required for different phases of execution (waiting in queue, running adcprep, executing, post processing, etc); should also produce charts and graphs

automate the updating of station lists

For each nowcast and forecast cycle, the ASGS instance should retrieve the latest CERA station list and compare it with its own CERA station list. If the CERA station list has changed, then the ASGS should use the new CERA station list. @carolakaiser

create json formatted run.properties file

The run.properties file will be central to the 2019stable but there are some issues with the conventional format of keyword/value pairs. In particular, we need to enable multiple values for a given keyword. We would also like to use an existing standard format that can be modified programmatically from a variety of languages with existing support for the format.

swan comment line limits

The control_file_gen.pl script does not enforce the character length limits on the SWAN comment lines at the top of the fort.26 file. Also, the fort.26 template developer must be aware of these limits when developing hard coded comment lines in the fort.26 template. Comment lines that are too long will cause SWAN to produce an error message and exit.

The definitions of these comment lines and corresponding character length limits are as follows (from the SWAN documentation at http://swanmodel.sourceforge.net/download/zip/swanuse.pdf):

(line 1) ’name’ 'nr' ('name' is the name of the project, at most 16 characters long. Default: blanks) ’nr’ is the run identification (to be provided as a character string; e.g. the run number) to distinguish this run among other runs for the same project; it is at most 4 characters long. It is the only required information in this command.
(line 2) ’title1’ is a string of at most 72 characters provided by the user to appear in the output of the program for the user’s convenience. Default: blanks.
(line 3) ’title2’ same as ’title1’
(line 4). ’title3’ same as ’title1’.

real time river gage boundary conditions

The code and metadata developed for supporting the setting/resetting of river boundaries with real time gage data should be merged into the asgs master branch.

see if a job has disappeared from the hpc queue

It sometimes happens that a job will mysteriously disappear from an hpc queue, which prevents the ASGS from seeing either a .finish or .error file that announces that the job is either completed successfully or crashed with a fatal error (because these files are written with commands at the end of the queue script, which won't be reached if someone cancels the job). This is actually a useful feature sometimes because it allows an Operator to tweak the files in a job by cancelling it, making changes, and resubmitting, without the ASGS noticing. The drawback to that approach is that the ASGS needs to keep track of wall clock time, and misses the wall clock time lost when the job is resubmitted. This capability can be retained by allowing the Operator to create a .pause file that will tell the ASGS to suspend active monitoring of the job until it sees the .pause file has been deleted.

create intro page for asgs wiki

Need to create an asgs wiki to provide documentation on various topics and subjects.

need a simple high level meteorology-only configuration option

Running a forecast ensemble member in meteorology-only mode is very helpful for producing winds without land roughness or canopy coefficient for visualization and analysis (among other things).

The ASGS can run ensemble members in met-only mode if the Operator sets all the right ADCIRC parameters in the config file for that ensemble member. In order to simplify the configuration file and make it less error-prone, there should be a single parameter to set in the config file that sets the right parameters to make an ensemble member run in meteorology-only mode.

re-read config file periodically while waiting for next advisory

We may adjust the ASGS config one or more times between advisories and need the ASGS to be aware of those changes between advisories, as well as when attempting to download meteorological data.

troubleshoot file i/o errors on hatteras

There seems to be an issue on hatteras with starting up parallel processes that all read the hotstart file. I've added retries that simply resubmit the parallel job, and this works eventually (after a dozen or more retries) on a lower number of processors (e.g., 160 cores). But at higher core counts (over 600), adcirc+swan won't start without an i/o error message, even after resubmitting the job over 100 times.

reduce queue script templates so there is only one for each queue system

In the 2014stable version and previous versions, there was a separate set of queue script templates for each HPC platform, in order to take into account the differences (and idiosycracies) between platforms. However, a more thoroughly templatized queue script along with improved metadata and configuration flexibility should take care of these differences, allowing us to use just one queue script template for SLURM and another for PBS.

perform nowcasts when BEST track files are updated rather than waiting for forecast/advisory

The 2014stable version of the ASGS performs a nowcast/forecast cycle when a new forecast/advisory is issued. However, the updated BEST track file that is needed to perform the nowcast is actually released mid-cycle between forecast/advisory times. Significant improvements in turnaround time can be achieved by running the nowcast as soon as the BEST track file is available so that the ASGS is ready to run the forecast immediately upon issuance of the next forecast/advisory.

aswip options

ASGS may not pass the correct values to aswip for NWS19 and NWS20; aswip in v51 now has -m 4 as the default value and may require the -z option.

initial water level version difference

The ASGS always looks for a fort.88 file if river flux has been turned on. However, this is only appropriate for ADCIRC versions v50 and v49. ADCIRC v51 implements this as a nodal attribute. ASGS should detect the version of ADCIRC and activate the appropriate behavior.

automated updating of stations lists

The list of stations and/or their coordinates may require dynamic updating; multiple stations lists need to be combined to create the full stations list (e.g., the CERA station list, available via URL would need to be added to stations lists maintained as a static file or available via other dynamic means). Configuration options should include getting the entire stations list from a URL or executing a program that retrieves/constructs the stations list in a self-contained way. Whatever list acquisition/construction method is used, the ASGS must have a failover mode that uses the last-known-good stations list in case the list construction script fails for some reason or the remote station database is offline etc.

asgs blithely soldiers on when crucial files are not found and when key scripts fail

When the ASGS first starts up, it checks for the existence of many different files including the various adcirc executables, the fort.15 template file, etc. However, after the ASGS is running, it often assumes that all its files are present. In fact, I just had a case where the fort.26 template file was missing, causing the control_file_gen.pl script to fail. If the ASGS had checked the exit status of control_file_gen.pl or for the existence of the fort.26 template file, it could have avoided going ahead with building the rest of the job, submitting it to the queue, and then having it crash in the queue and hang there.

NHC radii are out of date

The NHC radii in storm_track_gen.pl are out of date and should be updated for the 2020 hurricane season.

high memory usage by NAM2OWI.pl

As reported by @BrianOBlanton : The NAM2OWI perl code reads a bunch of text data into memory for later output, and this can cause the processing job to die w/o diagnostics when running on a vm with 2Gb of memory. This processing job can be moved to a compute node, but eventually we should re-write this perl code to be more efficient...

merge rabbitmq-enabled asgs with production version

This task is just the merge part of the monitoring, the test, debug, production deployment, non-RENCI production deployment, and documentation are separate tasks.

tides only runs

The ASGS should support the option of producing periodic tides only runs of configurable duration and at configurable intervals so that storm surge results can be separated from storm tide results.

automated historical storm production

In order to run the ASGS in test mode, the Operator must construct hindcast and forecast files manually; this is a tedious and error prone process. An automated method for downloading and constructing pairs of hindcasts and forecasts is required whereby the Operator only needs to give the number and year of the historical storm and the advisory to start on, and the ASGS handles the job of constructing the met files and issuing the advisories to itself.

doc update SCRIPTDIR

Documentation does not reflect that all exe be within SCRIPTDIR.
Operator documentation does not adequately cover Rmax variations.
Documentation should reflect the use of screen -S to name screen sessions; useful in high production situations

monthly readiness exercises

ASGS Operators will hold exercises every month during hurricane season to demonstrate readiness.

maxele.63 [adcirc to netcdf] segmentation error

Hi,
I tried with multiple maxele.63 files (different sources) but everytime it is yielding this error, can you look into the matter please?
thanks
arslaan@Linux:~/Documents/KalpanaTesting_DONE/script$ adcirc2netcdf.x --meshfile 'fort.14' --attfile 'generic_atts.txt' --datafile 'maxele.63'
INFO: adcirc2netcdf was compiled with the following netcdf library: 4.4.0 of Mar 29 2016 11:41:40 $
INFO: Processing --meshfile fort.14.
INFO: Processing --attfile generic_atts.txt.
INFO: Processing --datafile maxele.63.
INFO: adcirc2netcdf.f90: Checking number of nodes in data file.
INFO: adcirc2netcdf.f90: Searching for file maxele.63 ...
INFO: adcirc2netcdf.f90: The file maxele.63 was found.
INFO: adcirc2netcdf.f90: The file maxele.63 was opened successfully.
INFO: Creating NetCDF file 'maxele.63.nc'.
INFO: adcirc2netcdf.f90: Opening netcdf metadata/attributes file.
INFO: adcirc2netcdf.f90: Searching for file generic_atts.txt ...
INFO: adcirc2netcdf.f90: The file generic_atts.txt was found.
INFO: adcirc2netcdf.f90: The file generic_atts.txt was opened successfully.
INFO: Finished reading metadata/attributes file.
INFO: adcirc2netcdf.f90: Searching for file fort.14 ...
INFO: adcirc2netcdf.f90: The file fort.14 was found.
INFO: adcirc2netcdf.f90: The file fort.14 was opened successfully.
INFO: Mesh file comment line: FLAT88_fluxbndbathycorrected.grd
INFO: Reading mesh file dimensions.
INFO: Allocating memory for elevation specified boundaries.
INFO: Allocating memory for flux specified boundaries.
WARNING: Number of flux boundary nodes was set to 583 but 582 were found.
INFO: Finished reading mesh file dimensions.
INFO: Reading mesh file coordinates, connectivity, and boundary data.
INFO: adcirc2netcdf.f90: Searching for file fort.14 ...
INFO: adcirc2netcdf.f90: The file fort.14 was found.
INFO: adcirc2netcdf.f90: The file fort.14 was opened successfully.
INFO: Finished reading mesh file coordinates, connectivity, and boundary data.
INFO: Writing mesh definitions to netcdf.
INFO: Finished writing mesh definitions to netcdf.
INFO: Adding data attributes to netCDF file.
fn%numVarNetCDF= 0
nc_dimid = 1 0
initFileMetaData : enter
num xdmf= 1
xdmf init finished
netcdf vartype init finished
netcdf init(1) finished
initFileMetaData : return
INFO: Finished adding data attributes to netCDF file.
INFO: Mesh has been written to the netCDF file.
INFO: adcirc2netcdf.f90: Searching for file maxele.63 ...
INFO: adcirc2netcdf.f90: The file maxele.63 was found.
INFO: adcirc2netcdf.f90: The file maxele.63 was opened successfully.

allocating f%rdata

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0 0x7FA72A45BE08
#1 0x7FA72A45AF90
#2 0x7FA729DA24AF
#3 0x7FA72A77F060
#4 0x7FA72A7B37B1
#5 0x414C11 in __asgsio_MOD_writeonedataset
#6 0x403ED8 in MAIN__ at adcirc2netcdf.f90:?
Segmentation fault (core dumped)

stormsurgelive / asgs Goto Github PK

asgs's Introduction

asgs's People

Stargazers

Watchers

Forkers

asgs's Issues

Recommend Projects

Recommend Topics

Recommend Org