gem / oq-engine

OpenQuake's Engine for Seismic Hazard and Risk Analysis

Home Page: https://github.com/gem/oq-engine/#openquake-engine

License: GNU Affero General Public License v3.0

Python 97.99% Shell 0.75% HTML 0.36% JavaScript 0.69% CSS 0.05% C++ 0.06% Jinja 0.06% BitBake 0.03%
earthquakes seismic hazard risk risk-analysis risk-assessment hazard-assessment openquake python cluster

oq-engine's Introduction

OpenQuake Engine

OpenQuake Logo

The OpenQuake Engine is open-source software providing calculation and assessment of seismic hazard and risk, as well as decision-making tools, via the data, methods and standards that are being developed by the GEM (Global Earthquake Model) Foundation and its collaborators. DOI: 10.13117/openquake.engine


Current Long Term Support (LTS) release - for users wanting stability

Current LTS version is the OpenQuake Engine 3.16 'Angela':

The code name for version 3.16 is Angela, in memory of the Italian science journalist Piero Angela. What's new

Latest release - for users needing the latest features

Latest stable version is the OpenQuake Engine 3.19.* What's new

Documentation

Since version 3.19 the OpenQuake Engine documentation has been consolidated into a single site: https://docs.openquake.org/oq-engine/master/manual/

Mirrors

A mirror of this repository, hosted in Pavia (Italy), is available at https://mirror.openquake.org/git/GEM/oq-engine.git.

The main download server (downloads.openquake.org) is hosted in Nürnberg (Germany).

License

The OpenQuake Engine is released under the GNU Affero General Public License v3.0.

Contacts

Thanks


The OpenQuake Engine is developed by the Global Earthquake Model Foundation (GEM) with the support of

Public Partners

Private Partners

Governors

Advisors

Associate Partners

Project Partners

Products Distribution Partners


If you would like to help support development of OpenQuake, please contact us at [email protected]. For more info visit the GEM website at https://www.globalquakemodel.org/partners

oq-engine's People

Contributors

acerisara, angri, antonioettorre, bwyss, catalinayepes, cb-quakemodel, danciul, daniviga, feuchner, g-weatherill, gvallarelli, hascar, joshuamckenty, kejohnso, klunk386, lanatodorovic15, larsbutler, matley, mbarbon, micheles, mmpagani, monellid, nackerley, nastasi-oq, ptormene, raoanirudh, rcgee, rodolfopuglia, vot4anto, vup1120


oq-engine's Issues

Fix calculation workflow in eventBasedMixin

This spec covers the following user story:
https://www.pivotaltracker.com/story/show/11078327

What needs to be done?

Correct calculation workflow implementation for event based hazard/risk calculation.

In the execute() method of the EventBasedMixin class (opensha.py line 575), two main loops are defined:

  • one over NUMBER_OF_SEISMICITY_HISTORIES
  • one over NUMBER_OF_LOGIC_TREE_SAMPLES

This is the current implementation (lines from 592 to 602):

histories = int(self.params['NUMBER_OF_SEISMICITY_HISTORIES'])
realizations = int(self.params['NUMBER_OF_LOGIC_TREE_SAMPLES'])
for i in range(0, histories):
    pending_tasks = []
    for j in range(0, realizations):
        self.store_source_model(source_model_generator.getrandbits(32))
        self.store_gmpe_map(gmpe_generator.getrandbits(32))

The correct implementation is the following:

histories = int(self.params['NUMBER_OF_SEISMICITY_HISTORIES'])
realizations = int(self.params['NUMBER_OF_LOGIC_TREE_SAMPLES'])
for i in range(0, realizations):
    pending_tasks = []
    self.store_source_model(source_model_generator.getrandbits(32))
    self.store_gmpe_map(gmpe_generator.getrandbits(32))
    for j in range(0, histories):

That is, the outer loop runs over NUMBER_OF_LOGIC_TREE_SAMPLES: each iteration samples and stores a source model and a GMPE map, and then, given the stored source model and GMPEs, starts an inner loop over NUMBER_OF_SEISMICITY_HISTORIES.

All curves for a site shall be combined in a single hazard diagram

Presently a hazard job writes each of the following curve types to a
separate file, i.e. all curves of one type go into one file:

- hazard curves (resulting from logic tree passes)
- mean curves
- curves for a given quantile

A job with 3 sites, 2 realisations and 2 quantiles would
currently result in the following files:

- 1 file with 6 hazard curves (resulting from logic tree passes)
- 1 file with 3 mean curves
- 2 files (with 3 quantile curves each)

The desired feature is a diagram that combines all of these
curve types for a given site, i.e. we would end up with 3 files (one per
site), each containing all of the respective curves.

Please see also this example of what the diagrams should look like once this feature has been completed.

Other aspects of this feature:

- allow the user to specify a colour per
    - curve type
    - quantile
  in the job configuration file
- provide a legend for the combined diagram

Allow definition of hazard IML values from vulnerability file

This is a detailed spec for:
https://www.pivotaltracker.com/story/show/9211159

See also:
OpenQuake book, Chapter 10: Deterministic event based calculator

Requirements

  • The job config defines [HAZARD] INTENSITY_MEASURE_LEVELS.
  • If risk calculations will be employed in the job:
    • Verify that the hazard IMLs are in the proper range (see Restrictions below)
      • If the hazard IMLs are in the proper range, no change is necessary
      • If the hazard IMLs are not in the proper range, a new linear scale of hazard IMLs shall be re-calculated (at run time) given the formulas below
        • A message shall be logged indicating that the values were recalculated before running the job
        • The number of IMLs generated shall be equal to the number defined in the hazard job config

Important:

  • A vulnerability configuration can contain multiple discreteVulnerabilitySets; thus, it can contain multiple sets of IML values.
  • Lowerbound and Upperbound values (see below) must be calculated for each discreteVulnerabilitySet.
  • Hazard IMLs must be calculated using the lowest Lowerbound value and the highest Upperbound value; this ensures that the new hazard IML range is valid for the entire vulnerability model.

Given a set of vulnerability IMLs:

n = the number of hazard IMLs to be calculated
Lowerbound LB = IML_1 - ((IML_2 - IML_1) / 2)
Upperbound UB = IML_n + ((IML_n - IML_(n-1)) / 2)
Delta d = (UB - LB) / 2

Thus:

new_hazard_imls = [ LB + (d * x) for x in range(n) ]

Note: This will only create a linear scale of values. In the future we will likely need the functionality to calculate a logarithmic scale as well.
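
A minimal sketch of the rescaling described above, assuming the vulnerability IMLs arrive as a sorted list of floats (the function and variable names are illustrative, not the engine's API):

def rescale_hazard_imls(vuln_imls, n):
    """Derive n linearly spaced hazard IMLs from sorted vulnerability IMLs."""
    lb = vuln_imls[0] - (vuln_imls[1] - vuln_imls[0]) / 2.0     # Lowerbound
    ub = vuln_imls[-1] + (vuln_imls[-1] - vuln_imls[-2]) / 2.0  # Upperbound
    d = (ub - lb) / 2.0  # Delta, as defined in the formulas above
    return [lb + (d * x) for x in range(n)]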

Restrictions

  • All IML values must be > 0.0
  • Hazard IML values are subject to the following limitations: IML_1 <= Lowerbound, IML_n >= Upperbound

Proposed solution:

  • Modify the validate() decorator in openquake/job/__init__.py
    • If risk calculations are employed in the job, implement a check to verify the hazard IML range (and re-calculate if necessary)

Python2.7 related test failure (test_filter_attribute_constraint)

ERROR: This test uses the attribute constraint filter to select items

Traceback (most recent call last):
  File "/p/work/oq/tests/parser_hazard_curve_unittest.py", line 269, in test_filter_attribute_constraint
    self.nrml_element.reset()
  File "/p/work/oq/openquake/producer.py", line 88, in reset
    self.file.seek(0)
ValueError: I/O operation on closed file
-------------------- >> begin captured logging << --------------------
root: DEBUG: Found data at /p/work/oq/docs/schema/examples/hazard-curves.xml
--------------------- >> end captured logging << ---------------------

Calculation of loss maps using the deterministic event based method

What needs to be done?

The input for the calculation of loss maps is a set of ground motion fields (GMFs) provided by the hazard calculation subsystem where each GMF is a collection of (location, IML) 2-tuples.

The resulting loss map is also a list of 2-tuples where each tuple consists of the following data
- location
- the mean of the loss ratios for the given location

The loss ratio is obtained from a vulnerability function which is a collection of (IML, loss ratio) 2-tuples. Please note that each location is assigned an appropriate vulnerability function.

The loss map is calculated as follows:

  loss_map = []
  num_of_gmfs = len(gmfs)
  for location in locations:
      loss_ratio_sum = 0.0
      # Select the vulnerability function for the location at hand.
      vf = vulnerability_function[location]
      for gmf in gmfs:
          # Look up the IML for the location at hand.
          iml = gmf[location]
          # Calculate the loss ratio as explained in section
          # "12.2. Calculation workflow" of the OpenQuake book.
          loss_ratio = ...
          # Sum up all the loss ratios for the location at hand.
          loss_ratio_sum = loss_ratio_sum + loss_ratio
      # Append the mean of the loss ratios for the location at hand.
      loss_map.append((location, loss_ratio_sum / num_of_gmfs))

Please see also the whiteboard diagram, Vitor's explanations as well as the user story in the pivotal tracker.

Solution outline

What could the solution look like?

Changes needed

What's a good way to structure the software changes needed?

Test data needed

What input/output data is needed for tests/verification?

PSHA input model postgis database schema

We need a database schema for the PSHA input model

Problem description

We require a database schema for storing PSHA input model entities like: fault, rupture and seismic source.

The equivalent NRML (XML) schema exists already and the database schema must be interoperable with the former i.e. we must be able to convert as follows:

NRML file -> PSHA input database -> NRML file

Solution outline

Define a postgres/postgis database schema that is capable of storing faults, ruptures and seismic sources. Very early work on the database schema has already started and it looks like this.

Problems and questions identified

What needs to be done?

  1. Is it sufficient for the database schema to support the simple/complex faults, seismic sources and ruptures and the data associated with these?
  2. What earthquake catalog data needs to be stored in the database? What are the entities and the relationships between these?

High complexity

The NRML schema is very complex and it is not obvious at this point what the database schema should look like. Figuring this out will require time as well as quite a bit of interaction with domain experts e.g. D. Monelli and/or F. Euchner.

Multiple spatial reference systems in NRML files

It is possible to define a spatial reference system (SRS) per geometry in a NRML file i.e. in the most complex case we could have a number of fault/source geometries with different spatial reference systems in the same file.

PostGIS databases on the other hand have a fixed SRS per geometry column. Hence the following questions arise:

  1. What is a suitable SRS for the geometry columns in the PSHA inputs database schema (epsg:4326)?
  2. Assuming that we use some SRS SRS_X for the geometry columns in the PSHA inputs database:
    1. Do we need to perform SRS transformations for all geometries encountered in an NRML file that use an SRS other than SRS_X (probably yes)?
    2. Is there such an SRS (that we can use for database geometry columns and) that facilitates arbitrary SRS transformations from/to it? If no: do we need to constrain the SRS' that may be used in NRML files?
    3. What SRS should we assume in the case where the srsName attribute is absent from the NRML file?

Test data needed

We currently have some sample NRML files in docs/schema/examples and can work towards defining a database schema that's capable of handling these. In case more is needed we will work with the domain experts to provide the additional inputs.

Pivotal tracker link

https://www.pivotaltracker.com/story/show/12197273

Loss Map XML Serialization Component

This spec covers the following user stories

What needs to be done?

To serialize mean / standard deviation loss values we need a new XML component and an updated NRML schema.

Solution outline

Refactor openquake/output/risk.py by creating a BaseXMLWriter class that is inherited by all the different risk cases (LossCurve/LossRatio/LossMap):

  • implementation of LossMapXMLWriter
  • refactoring of CurveXMLWriter

Test data needed

This branch also includes updated NRML files. I have taken over what is needed for the story from my branch.

Add the capability of setting the number of samples in the configuration file

This is a detailed spec for:
https://www.pivotaltracker.com/story/show/10315249

Description

In the probabilistic event based scenario, the number of samples used to generate the set of loss ratios is defined in the code.

from openquake.logs import LOG
from openquake.risk.common import collect, loop

DEFAULT_NUMBER_OF_SAMPLES = 25


def _compute_loss_ratios(vuln_function, ground_motion_field_set,
        epsilon_provider, asset):

Since it is useful to change this parameter dynamically, we want the engine to load and use a value specified in the configuration file.

Proposed solution

Add a parameter to the configuration file, and make the ProbabilisticEventMixin use this configuration value.
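
A minimal sketch of the proposed change, assuming a flat parameter dictionary and a hypothetical parameter name (the actual key chosen for the configuration file may differ):

DEFAULT_NUMBER_OF_SAMPLES = 25

def number_of_samples(params):
    # Fall back to the hard-coded default when the configuration file
    # does not define the (hypothetical) PROB_NUM_OF_SAMPLES parameter.
    value = params.get('PROB_NUM_OF_SAMPLES')
    return int(value) if value else DEFAULT_NUMBER_OF_SAMPLES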

Validate job region constraints

We need to validate our region constraints in some areas, such as the _read_sites_from_exposure() method in openquake/job/__init__.py.

In this method, if all of the assets in the exposure file are located outside the job's region constraint, the 'sites' list will be empty and this will cause problems later on in the job.

We should add some validation before the site list is returned, like this:

if len(sites) == 0:
  raise RuntimeError(
    "No sites found in exposure file `%s` for the given region constraints."
    " The region constraints defined in the job config are: %s"
    % (self[EXPOSURE], self['REGION_VERTEX']))

return sites

Notes:

  • This is going to break a handful of our tests and might expose a few other more serious errors. This could take some time to fully resolve.
  • We may want to do region constraint validation in other similar areas as well.

tests:JobTestCase.test_job_runs_with_a_good_config failure

I managed to place a breakpoint in openquake/xml.py, line 122 (see the diff here: http://paste.ubuntu.com/591272/) and could introspect the asset element as follows:

(Pdb) element
<Element {http://openquake.org/xmlns/nrml/0.2}asset at 0x55a85f0>
(Pdb) element.attrib
{'{http://www.opengis.net/gml}id': 'a161801'}
(Pdb) [(a, a.attrib) for a in element.iterancestors()]
[(<Element {http://openquake.org/xmlns/nrml/0.2}lossCurveList at 0x5260b40>, {'{http://www.opengis.net/gml}id': 'c1'}), (<Element {http://openquake.org/xmlns/nrml/0.2}riskResult at 0x5260f00>, {'{http://www.opengis.net/gml}id': 'rr'}), (<Element {http://openquake.org/xmlns/nrml/0.2}nrml at 0x5260af0>, {'{http://www.opengis.net/gml}id': 'nrml'})]
(Pdb) element.getchildren()
[<Element {http://openquake.org/xmlns/nrml/0.2}site at 0x55a8230>]
(Pdb) element.getchildren()[0].getchildren()
[]

All the files with a site tag with id=a161801:

$ grep -l '<asset..*a161801' -r smoketests/simplecase/
smoketests/simplecase/computed_output/simplecase-block-BLOCK:1.xml
smoketests/simplecase/computed_output/simplecase-loss-block-BLOCK:2.xml
smoketests/simplecase/computed_output/simplecase-block-BLOCK:2.xml
smoketests/simplecase/computed_output/simplecase-loss-block-BLOCK:1.xml
smoketests/simplecase/small_exposure.xml

The same for the asset's ancestor tag:

$ grep -l 'lossCurveList..*c1' -r smoketests/simplecase/
smoketests/simplecase/computed_output/simplecase-loss-block-BLOCK:2.xml
smoketests/simplecase/computed_output/simplecase-loss-block-BLOCK:1.xml

I checked the tags in both files above and they are good:

in simplecase-loss-block-BLOCK:2.xml

  <asset gml:id="a161801">
    <site>
      <gml:Point srsName="epsg:4326">
        <gml:pos>-118.186739 33.779013</gml:pos>
      </gml:Point>
    </site>

in simplecase-loss-block-BLOCK:1.xml

  <asset gml:id="a161801">
    <site>
      <gml:Point srsName="epsg:4326">
        <gml:pos>-118.186739 33.779013</gml:pos>
      </gml:Point>
    </site>

Nevertheless, the "site" child in the asset element that is causing the problem is empty.

One possible explanation is a parser bug i.e. the XML files are good but they are not being parsed correctly.

Please let me know what you think.

Clean up java code base

This spec covers the following user story:

What needs to be done?

Remove unused/unneeded/non-openquake-pertinent code in openquake java code base.

All classes in the following packages should be moved to a separate repository (almost all of these classes were developed during the GEM1 project and are either no longer used or not pertinent to openquake):

  • java/org/gem/engine/hazard
  • java/org/gem/engine/hazard/map
  • java/org/gem/engine/hazard/models
  • java/org/gem/engine/hazard/parsers (except NRMLConstants.java and SourceModelReader.java which are currently used in openquake to read source data)
  • java/org/gem/params
  • java/org/gem/scratch

The following classes should be also moved:

  • java/org/gem/CalculationSettings.java
  • java/org/gem/IMLLis.java
  • java/org/gem/IMLList.java
  • java/org/gem/moment_rate.java

All classes in the package java/org/gem/ipe should be moved into opensha-lite package java/org/opensha/sha/imr/attenRelImpl (NOTE: rename Chandler_Lam2002_stable_continental.java to CL_2002_AttenRel.java to be consistent with naming convention in opensha-lite. AW_2010_AttenRel.java and BW_1997_AttenRel.java already exist in java/org/opensha/sha/imr/attenRelImpl, overwrite with those in java/org/gem/ipe. Delete campbell_2008_coeff.txt)

Add utils.tasks.parallelize()

We need to be able to run a number of celery tasks that all receive the same parameters:

2011 Apr 05 10:41:29 acerisara: I don't understand your "use utils.tasks.distribute() to spawn N tasks but passing always the same input parameters"
2011 Apr 05 10:41:34 comment
...
2011 Apr 05 10:50:32 al-maisan: in my case, I need to call a Java component N times by passing always the same sites
2011 Apr 05 10:50:46 al-maisan: not different subsets for each set
2011 Apr 05 10:53:09 acerisara: I see, that's a (much) simpler use case (which is a good thing!). I will probably write a separate function (utils.tasks.parallelize() ?) that caters to it.
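
A minimal sketch of what such a helper could look like, assuming a plain celery task object is passed in (the name and signature are assumptions, not the final API):

def parallelize(task_count, a_task, kwargs):
    # Spawn task_count instances of the same celery task, all receiving
    # identical keyword arguments, and block until every result arrives.
    async_results = [a_task.delay(**kwargs) for _ in range(task_count)]
    return [result.get() for result in async_results]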

Support the specification of the number of celery tasks for the Classical mixin

What needs to be done?

In the execute() method of Classical Mixin (opensha.py) a celery task is created for each logic tree realization. A celery task is responsible for calculating hazard curves for all the sites in the region of interest. This approach does not scale very well when a large number of sites is present.

We should give the user the option to define how many celery tasks to use. This is also useful for studying how computation time scales with the number of celery tasks.

Please see also the user story in the pivotal tracker.

Solution outline

Introduce a HAZARD_TASKS config.gem parameter that allows the user to specify how many celery tasks should be used. Observe that parameter in the classical mixin.

The tests must make sure that the calculation result is correct irrespective of the number of celery tasks used.

Changes needed

When the HAZARD_TASKS parameter is present it must be greater than zero, and each celery task is to handle len(site_list)/HAZARD_TASKS sites.
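
A minimal sketch of the implied site-list partitioning (the helper name is illustrative only):

def split_site_list(site_list, hazard_tasks):
    # Each celery task handles roughly len(site_list) / HAZARD_TASKS sites.
    if hazard_tasks < 1:
        raise ValueError("HAZARD_TASKS must be greater than zero")
    chunk_size = max(1, len(site_list) // hazard_tasks)
    return [site_list[i:i + chunk_size]
            for i in range(0, len(site_list), chunk_size)]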

Test data needed

No additional scientific input/output data is needed for tests/verification.

Mixin classes cannot have __init__ methods

This makes pylint unhappy in certain cases, example:

openquake/risk/job/probabilistic.py: [W0232, ProbabilisticEventMixin] Class has no __init__ method
openquake/risk/job/probabilistic.py: [W0201, ProbabilisticEventMixin.epsilon] Attribute 'samples' defined outside __init__

Update the parser to read single event NRML

This issue is related to this user story https://www.pivotaltracker.com/story/show/5162969

What needs to be done?

In order to do the deterministic event based analysis we need to read ruptures from an external file in NRML format. The NRML file must contain data only for a single rupture (not many ruptures). A rupture can be of three types (point, simple fault rupture, complex fault rupture).

Solution outline

A component that is able to parse ruptures in NRML format will be added to the codebase. The component will return an instance of org.opensha.sha.earthquake.EqkRupture that matches the definition provided in the file.

Test data needed

An example file for a point / simple fault / complex fault rupture will be added to the codebase under docs/schema/examples.

'REGION_VERTEX' parameter in job config files is poorly named

Example:
REGION_VERTEX = 38.0, -122.2, 38.0, -121.7, 37.5, -121.7, 37.5, -122.2

The value represented here is not a 'vertex' but rather a list of vertices. A more accurate name would be something like 'REGION_CONSTRAINT' or 'REGION_BOUNDS'.

Deterministic Event Based calculation (hazard part)

This spec covers the following user stories

What needs to be done?

To compute mean / stddev losses per asset and per ground motion field the risk subsystem needs to receive as input a set of ground motion fields. This set is computed by the hazard subsystem. This will be the workflow:

  • The hazard subsystem reads the rupture model stored in an external NRML file specified in the configuration file (the key will be SINGLE_RUPTURE_MODEL)
  • The hazard subsystem creates a GMPE as specified by the user in the configuration file (the key will be GMPE_MODEL_NAME)
  • The hazard subsystem loads all the sites contained in the region specified in the configuration file (the key is REGION_VERTEX)
  • The hazard subsystem computes N ground motion fields, where N is a parameter specified by the user in the configuration file (the key will be NUMBER_OF_GROUND_MOTION_FIELDS_CALCULATION)
  • The hazard subsystem stores the results in KVS for later processing

The computation will also take into account if the user wants to compute correlated or uncorrelated ground motion fields, by reading the GROUND_MOTION_CORRELATION parameter specified in the configuration file.

The engine will trigger this type of computation if the HAZARD_CALCULATION_MODE parameter in the configuration file is specified with the key "Deterministic".

As agreed with the customer, no parallelization will be added for the time being. An idea, in case of uncorrelated sites, is to create a task for each one of the N calculations.
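
For illustration, a hypothetical configuration fragment using the keys listed above (the values and the exact layout of the configuration file are assumptions, not part of this spec):

HAZARD_CALCULATION_MODE = Deterministic
SINGLE_RUPTURE_MODEL = single_rupture.xml
GMPE_MODEL_NAME = BW_1997_AttenRel
REGION_VERTEX = 38.0, -122.2, 38.0, -121.7, 37.5, -121.7, 37.5, -122.2
NUMBER_OF_GROUND_MOTION_FIELDS_CALCULATION = 20
GROUND_MOTION_CORRELATION = false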

Solution outline

A mixin that loads all the inputs needed and triggers the ground motion field calculator (Java side) will be added to the codebase.

Test data needed

Since this is an architectural story and the ground motion field calculator already has unit tests that check the accuracy of the scientific data, only the workflow will be tested. The end-to-end suite that scientists will provide later on will check the scientific accuracy of the results of the whole calculation (hazard + risk).

GEM SUN server cleanup

This explains how we would like the gemsun machines to be set up and
provides some of the rationale for it.

The gemsun servers will ideally all
- be dual-boot machines so we can easily try different flavours
and/or releases of linux
- run the server variants of the operating system in question (e.g.
Ubuntu server) since these are better suited for the envisaged
work loads.

The envisaged disk partitioning and operating system set-up would be as
follows:

Partition   Size (GB)   Mount point   Operating system   Partition type
1           32          /             OS1                primary
2           32          /             OS2                primary
3           16          /var          OS1                logical
4           16          /var          OS2                logical
5           8           swap          shared             logical
6           8           /tmp          shared             logical
7           rest        /home         shared             logical

Due to the "freshness" of the OpenQuake software dependencies
(primarily rabbitmq and redis server) and the availability of the
OpenQuake packages we'd want to install Ubuntu 11.04 server first.

Once the installation is complete we would
- configure the systems so that the hazard/risk calculations are
executed in distributed fashion.
- test drive the servers by running the OpenQuake smoke tests

If all goes well we are done. In the somewhat unlikely(?) case of problems
with the Ubuntu 11.04 server installation we would proceed to
- installing Ubuntu 10.10 server (the current production release)
- installing OpenQuake on top of that
- configuring the servers and test driving them as described above

Lack of good blackbox test suite hampers development

The smoke tests we have at present are too heavy in terms of run time and computation resources.

We need to have a fast and high-quality black box suite that runs in less than ten minutes and gives us the confidence that the system is intact if it passes.

The blackbox test suite is also to be run by the merge robot.

Developers currently do not/cannot afford to run the smoke tests prior to committing change sets. This leads to unnoticed bugs that creep into the system.

The most recent example of this is: https://github.com/gem/openquake/issues#issue/130

Remove all references to 'OpenGEM'

Somehow we missed a bunch of these:

$ grep "OpenGEM" -Rn .

./bin/openquake:54:flags.DEFINE_string('config_file', 'openquake-config.gem', 'OpenGEM configuration file')
./build.xml:32:
./celeryconfig.py:22:Config for all installed OpenGEM binaries and modules.
Binary file ./celeryconfig.pyc matches
Binary file ./docs/build/doctrees/environment.pickle matches
./docs/make.bat:91: echo.^> qcollectiongenerator %BUILDDIR%\qthelp\OpenGEM.qhcp
./docs/make.bat:93: echo.^> assistant -collectionFile %BUILDDIR%\qthelp\OpenGEM.ghc
./docs/Makefile:75: @echo "# qcollectiongenerator $(BUILDDIR)/qthelp/OpenGEM.qhcp"
./docs/Makefile:77: @echo "# assistant -collectionFile $(BUILDDIR)/qthelp/OpenGEM.qhc"
./docs/Makefile:84: @echo "# mkdir -p $$HOME/.local/share/devhelp/OpenGEM"
./docs/Makefile:85: @echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/OpenGEM"
./docs/schema/README.rst:2:OpenGEM NRML schema
./docs/source/conf.py:18:# OpenGEM documentation build configuration file, created by
./docs/source/conf.py:187:htmlhelp_basename = 'OpenGEMdoc'
./docs/source/conf.py:202: ('index', 'OpenGEM.tex', u'OpenQuake Documentation',
./fabfile.py:144: print "OpenGEM deployment."

Fix test_ssh_handler_raises_on_bad_credentials

ERROR: test_ssh_handler_raises_on_bad_credentials (tests.SFTPHandlerTestCase)

Traceback (most recent call last):
  File "/p/work/oq/tests/handlers_unittest.py", line 29, in test_ssh_handler_raises_on_bad_credentials
    self.assertRaises(handlers.HandlerError, sftp_handler.handle)
  File "/usr/lib/python2.7/unittest/case.py", line 465, in assertRaises
    callableObj(*args, **kwargs)
  File "/p/work/oq/openquake/job/handlers.py", line 92, in handle
    transport = paramiko.Transport((host, int(port)))
  File "/usr/lib/pymodules/python2.7/paramiko/transport.py", line 302, in __init__
    'Unable to connect to %s: %s' % (hostname, reason))
SSHException: Unable to connect to localhost: [Errno 111] Connection refused

Consider correlation of the vulnerability between similar building typologies

= Correlation of assets with the same building typologies =

== Introduction ==

When the risk is calculated using vulnerability functions (VF) with a
supplied uncertainty (i.e. coefficient of variation (COV) values) the
algorithm needs to make use of an "epsilon" value that has a standard
normal distribution with a zero mean and a standard deviation of one.
Suitable epsilon values can be obtained from
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.norm.html

The user will be able to specify (in the job configuration file) whether
assets with identical building typologies should be considered
uncorrelated (case 1) or "perfectly correlated" (case 2).

The risk calculation algorithm is fed sample random values for each
- asset in case 1
- building typology in case 2 (i.e. assets with the same building
typology will "share" random values).

There is a "structure type" parameter in the exposure file (supplied by
the user) and two assets with the same structure type can be considered
to be of the same building typology.

For more information please see also pages 55/56 of the OpenQuake book
as well as the photograph of Vitor's drawings from the sprint planning
in Pavia
(https://github.com/gem/openquake/wiki/images/risk-asset-correlation.jpg).

== Solution proposal ==

In either case the random value sampler is a "service component" to be
used by the probabilistic risk calculation algorithms and should ideally
be provided using dependency injection
(http://en.wikipedia.org/wiki/Dependency_injection).

The logic that processes the job configuration file could instantiate a
random value sampler component of the appropriate type (uncorrelated
versus correlated assets) and pass it as a parameter to the risk
calculation algorithm.

We could write 2 components that both implement the same interface with
a single function

get_epsilon_value(asset) -> float

One component would draw a random value for each asset whereas the other
one would be drawing/caching random values for each building typology
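
A minimal sketch of the two service components, assuming scipy is used for the standard-normal sampling and assuming a hypothetical 'structureType' attribute on the asset record:

from scipy.stats import norm


class UncorrelatedEpsilonProvider(object):
    """Case 1: an independent epsilon value per asset."""

    def get_epsilon_value(self, asset):
        return norm.rvs()


class CorrelatedEpsilonProvider(object):
    """Case 2: assets of the same building typology share one epsilon value."""

    def __init__(self):
        self._epsilon_by_typology = {}

    def get_epsilon_value(self, asset):
        # The structure type in the exposure file identifies the typology.
        typology = asset['structureType']
        if typology not in self._epsilon_by_typology:
            self._epsilon_by_typology[typology] = norm.rvs()
        return self._epsilon_by_typology[typology]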

The problem at hand could be broken down into 3 branches:
1 - service component for uncorrelated assets
2 - service component for correlated assets
3 - changes to the code that processes the job configuration file

User Story: Plot hazard maps

This is a detailed spec for:
https://www.pivotaltracker.com/story/show/10016217

Abstract

The OpenQuake engine is currently generating hazard maps which can be serialized to NRML. The focus of this story is to generate a geotiff for each hazard map. We also want to create a simple html wrapper which displays the geotiff as well as a color legend.

Requirements

  • It has been requested that the discrete color map known as "seminf-haxby" (http://soliton.vm.bytemark.co.uk/pub/cpt-city/jjg/misc/tn/seminf-haxby.png.index.html) be used by default for the hazard maps.
  • That said, it is conceivable that users of OpenQuake may want to use many different color maps. The implementation of this story will include a general way of reading in most color maps from a cpt file.
  • The job config shall include an additional parameter (HAZARD_MAP_CPT) specifying a cpt file path.
  • A single geotiff file (.tiff) will be created for each hazard map.
  • A simple html wrapper will also be created for displaying the map and color legend. (This is already implemented for GMF geotiff generation so we should be able to re-use some code/patterns here.)
  • Hazard maps will consist of a grid of single pixels, each pixel representing a site.
    • Each site/pixel in the map will have a color value corresponding to the site IML value
    • Colors are determined using either fixed or relative color scaling (see the sketch after this list)
      • Fixed: Colors are mapped across a range of min and max IML values
        • To get fixed color scaling, Hazard Map IML min/max values can be defined in the job config file as HAZARD_MAP_IML_MIN and HAZARD_MAP_IML_MAX
          • These values must be positive ( >= 0.0)
          • MAX must be > MIN
      • Relative: Colors are mapped across only the min and max IML values
        existing in a given map
        • If no HAZARD_MAP_IML_MIN and _MAX are defined, hazard maps will default to relative scaling.
      • If a site IML value is outside of the range defined for the map, the closest in-range color shall be used (i.e., if a site IML value is < IML_MIN, the lowest color scale value shall be used; if a site IML value is > IML_MAX, the highest color scale value shall be used)
        • There is much debate over what should actually be drawn on a hazard map in case of such outlier values; this solution is currently implemented, but is subject to change in the future
      • The class which writes the hazard maps will include an optional parameter to specify IML MIN/MAX
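
A minimal sketch of the color-index mapping described above, assuming the cpt color scale has been read into a list of n_colors colors and that IMLs are plain floats (names are illustrative only):

def iml_to_color_index(iml, map_imls, n_colors, iml_min=None, iml_max=None):
    # Relative scaling: use the min/max IML present in the map when no
    # fixed HAZARD_MAP_IML_MIN / HAZARD_MAP_IML_MAX are configured.
    if iml_min is None or iml_max is None:
        iml_min, iml_max = min(map_imls), max(map_imls)
    # Outlier IML values are clamped to the closest in-range color.
    iml = min(max(iml, iml_min), iml_max)
    if iml_max == iml_min:
        return 0
    fraction = (iml - iml_min) / (iml_max - iml_min)
    return int(round(fraction * (n_colors - 1)))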

Refactor LogicTree classes

This spec covers the following user story:
https://www.pivotaltracker.com/story/show/11544573

What needs to be done?

Refactor/re-implement the logic tree class (java/org/gem/engine/logictree.LogicTree.java) to match the current data model as specified in nrml_seismic.xsd. The current implementation of the logic tree class dates back to the GEM1 project and does not exactly reflect the actual data model. Tests are not present.

Logic tree data model

In the context of OpenQuake, a logic tree is a data structure allowing the user to define uncertainties in the input data for hazard calculations. Input data for hazard calculations consist of a source model and a ground motion model. A logic tree therefore allows a user to define one or more source models (and possibly uncertainties on parameters the source model depends on) and one or more ground motion prediction equations (GMPEs) to be used in hazard calculations.

A logic tree (as per schema in nrml_seismic.xsd) is currently structured as an (unbounded) sequence of logic tree branch sets, with a logic tree branch set consisting of an (unbounded) sequence of logic tree branches. A logic tree branch is defined as a sequence (bounded, maxOccurs=1) of an uncertainty model (defined as a string) and an uncertainty weight (non negative double).

A logic tree currently has an id attribute (required) and a tectonic region attribute (optional). The tectonic region attribute is set only when the logic tree describes uncertainties in the ground motion model (that is, it defines a set of possible GMPEs which need to be associated with a tectonic region type).

A logic tree branch set has two attributes (both required): branching level and uncertainty type. The branching level defines the position of the branch set in the logic tree (see figure ...). The uncertainty type specifies what type of uncertainty the branch set is describing. The uncertainty type is of string type and restricted to particular values. Currently four possible values are defined: gmpeModel, sourceModel, maxMagnitudeGutenbergRichterRelative, bValueGutenbergRichterRelative. Each of these values allows the logic tree processor (currently implemented in the class org.gem.engine.LogicTreeProcessor.java) to interpret what the uncertainty model in the logic tree branches refers to.

  • If uncertainty type == gmpeModel, then the value of the element uncertainty model is interpreted as a string containing the name of a GMPE.
  • If uncertainty type == sourceModel, then the value of the element uncertainty model is interpreted as a string containing the name of a file defining a source model
  • If uncertainty type == maxMagnitudeGutenbergRichterRelative, then the value of the element uncertainty model is interpreted as a double representing the relative uncertainties on the maximum magnitude of a Gutenberg Richter magnitude frequency distribution.
  • If uncertainty type == bValueGutenbergRichterRelative, then the value of the element uncertainty model is interpreted as a double representing the relative uncertainties on the b value of a Gutenberg Richter magnitude frequency distribution.

NOTE
In the current data model, the user is expected to define only one branch set per branching level (meaning that a branch set at branching level N is "connected" to all branches defined in branching level N-1, i.e. the logic tree has a symmetric shape). However, it is envisioned to give the user the possibility to define more than one branch set per branching level (for instance by differentiating each branch set with an attribute specifying which particular branch it applies to, which may require defining an index for each branch too). This is just a note about a possible future requirement which may be useful to take into account in the object design.
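
For orientation only, a Python rendering of the data model just described (the actual refactoring targets the Java class; all names here are illustrative):

class LogicTreeBranch(object):
    def __init__(self, uncertainty_model, uncertainty_weight):
        self.uncertainty_model = uncertainty_model    # string, interpreted per uncertainty type
        self.uncertainty_weight = uncertainty_weight  # non-negative double


class LogicTreeBranchSet(object):
    def __init__(self, branching_level, uncertainty_type, branches):
        self.branching_level = branching_level    # position of the branch set in the tree
        self.uncertainty_type = uncertainty_type  # e.g. 'gmpeModel', 'sourceModel', ...
        self.branches = branches                  # ordered sequence of LogicTreeBranch


class LogicTree(object):
    def __init__(self, tree_id, branch_sets, tectonic_region=None):
        self.id = tree_id                       # required id attribute
        self.branch_sets = branch_sets          # ordered by branching level
        self.tectonic_region = tectonic_region  # only set for GMPE logic trees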

Usage of logic tree data in OpenQuake

Inside the engine logic tree data are used by the logic tree processor (org.gem.engine.LogicTreeProcessor.java), which is responsible for constructing a model (source model or GMPE) that can then be used for hazard calculation.

"Constructing a model" means looping over the branching levels defined in the logic tree, select an uncertainty model, perform the action corresponding to the selected uncertainty model.

From the usage point of view, a logic tree data object should give access to the defined branching levels (in a ordered way). For each branching level, the associated branch set must be accessible.

Dependencies

Logic tree data are currently used in the logic tree processor (org.gem.engine.LogicTreeProcessor.java), in the logic tree data reader (org.gem.engine.LogicTreeReader.java) and in the GMPE logic tree data (org.gem.engine.GmpeLogicTreeData.java).

installation on ubuntu 10.10 server problem: celeryd config file missing

Hi there

I followed the manual installation instructions on a fresh ubuntu 10.10 server and made it to the last step under (Running OpenQuake)

...
cd /to/your/openquake/dir/
celeryd

However, celeryd won't start because it cannot find the celeryconfig.py file, which nevertheless resides in the same directory. I also cannot find the default setup script /etc/default/celeryd.

Any ideas ?

See call and installation-log below


heiri@sepp:~/openquake$ celeryd
/usr/local/lib/python2.6/dist-packages/celery/loaders/default.py:53: NotConfigured: No celeryconfig.py module found! Please make sure it exists and is available to Python.
  NotConfigured)
Traceback (most recent call last):
  File "/usr/local/bin/celeryd", line 9, in <module>
    load_entry_point('celery==2.1.4', 'console_scripts', 'celeryd')()
  File "/usr/local/lib/python2.6/dist-packages/celery/bin/celeryd.py", line 166, in main
    worker.execute_from_commandline()
  File "/usr/local/lib/python2.6/dist-packages/celery/bin/base.py", line 40, in execute_from_commandline
    return self.run(*args, **vars(options))
  File "/usr/local/lib/python2.6/dist-packages/celery/bin/celeryd.py", line 85, in run
    return Worker(**kwargs).run()
  File "/usr/local/lib/python2.6/dist-packages/celery/apps/worker.py", line 98, in run
    self.init_loader()
  File "/usr/local/lib/python2.6/dist-packages/celery/apps/worker.py", line 147, in init_loader
    "Celery needs to be configured to run celeryd.")
celery.exceptions.ImproperlyConfigured: Celery needs to be configured to run celeryd.


heiri@sepp:~$ sudo easy_install celery
install_dir /usr/local/lib/python2.6/dist-packages/
Searching for celery
Best match: celery 2.1.4
celery 2.1.4 is already the active version in easy-install.pth
Installing celeryctl script to /usr/local/bin
Installing celeryd script to /usr/local/bin
Installing camqadm script to /usr/local/bin
Installing celeryev script to /usr/local/bin
Installing celeryd-multi script to /usr/local/bin
Installing celerybeat script to /usr/local/bin

Support definition of Gutenberg Richter magnitude frequency distribution uncertainties (a and b value, and max magnitude)

This spec covers the following user stories:

What needs to be done?

Allow the user to define uncertainties in the Gutenberg Richter magnitude frequency distribution parameters (a and b values, and maximum magnitude) and include them in the logic tree.

Solution Outline

  • extend substitution group of magnitudeFrequencyDistribution element in nrml_seismic.xsd to include new element of type truncatedGutenbergRichterWithUncertainties. This type allows defining multiple (a,b)-pairs (each associated to a weight), and multiple maximum magnitude values (each associated to a weight)
  • extend SourceModelReader.java class to accept a random number generator, to be used to sample an (a,b)-pair and a maximum magnitude value when a magnitude frequency distribution is defined as a truncatedGutenbergRichterWithUncertainties.

smoketests/deterministic seems broken as well

Can someone comment on this please? This is what I am seeing on gemsun02:

Traceback (most recent call last):
  File "bin/openquake", line 162, in <module>
    job.run_job(FLAGS.config_file)
  File "openquake/job/__init__.py", line 51, in run_job
    a_job = Job.from_file(job_file)
  File "openquake/job/__init__.py", line 161, in from_file
    job = Job(params, sections=sections, base_path=base_path)
  File "openquake/job/__init__.py", line 176, in __init__
    self.to_kvs()
  File "openquake/job/__init__.py", line 333, in to_kvs
    self._slurp_files()
  File "openquake/job/__init__.py", line 324, in _slurp_files
    with open(path) as data_file:
IOError: [Errno 2] No such file or directory: '/home/muharem/tmp/openquake/smoketests/deterministic/GmpeLogicTree.inp'

Performance regression in unit test suite

The run time of the unit test suite quadrupled.

The following bash script was run last night on gemsun02 in order to exercise the test suite: http://paste.ubuntu.com/591606/.

According to the results (http://paste.ubuntu.com/591605/) the following change set caused the runtime of the unit test suite to quadruple:

commit bf18813219abd2ccae5254e9e92d9e6aa97982e8
Author: root <root@roundabout.(none)>
Date:   Fri Apr 8 08:33:20 2011 +0000

    Squashed commit of the following:

    commit b2bb2994e353c9f7e4b4d4ab55326bfed992a09d
    Author: Damiano Monelli <[email protected]>
    Date:   Mon Mar 21 16:44:04 2011 +0100

        added opensha-lite corrected for area source bug

lxml version 2.3 causes problems with xml parsers

This error:

Traceback (most recent call last):
  File "/Users/larsbutler/proj/openquake/tests/parser_hazard_curve_unittest.py", line 269, in test_filter_attribute_constraint
    self.nrml_element.reset()
  File "/Users/larsbutler/proj/openquake/openquake/producer.py", line 88, in reset
    self.file.seek(0)
ValueError: I/O operation on closed file

I had lxml version 2.3 installed. I downgraded to 2.2.8 and it fixed the issue. Basically, lxml 2.3 modified the etree.iterparse() function and closes the file handle when parsing is complete.

We need more bullet-proofing in our parser code so that this kind of breakage is less likely to happen.

Proposal:

Modify FileProducer.reset() (in openquake/producer.py) as follows (see the sketch after this list):

  • Check if the file is still open
  • If the file is still open, call seek(0)
  • Else, just reopen the file
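
A minimal sketch of that change, assuming FileProducer keeps the open file object in self.file (as the traceback above suggests):

def reset(self):
    # Re-read the file from the beginning, even if lxml closed the handle.
    if not self.file.closed:
        self.file.seek(0)
    else:
        self.file = open(self.file.name, 'r')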

Integrate deterministic risk/hazard calculations

This spec addresses the following story:
https://www.pivotaltracker.com/story/show/10958017

Abstract

Functionality was recently completed which performs deterministic hazard calculations. The purpose of this story is to write a small deterministic risk piece which performs additional computations given the available hazard, vulnerability, and exposure data. The outputs of this computation are:

  • mean loss value for the region (a single number)
  • standard deviation loss value for the region (also a single number)

What needs to be done?

  • Perform a deterministic risk calculation based on deterministic hazard calculation results
    • Execution of the DeterministicEventBasedMixin (hazard) will produce the necessary hazard data (GMFs)
    • The Ground Motion Field (hazard) data will be stored in the KVS
  • Serialize the calculation results to a loss map XML file (using LossMapXMLWriter)
  • Print or return the mean & standard deviation loss values for the region (defined in the job config)

Solution outline

openquake/risk/job/deterministic.py:

  • Implement execute() method of DeterministicEventBasedMixin
    • use @preload decorator, to load exposure and vulnerability information into the KVS
    • return True to indicate successful completion
  • Implement a compute_risk() method in the Deterministic mixin
    • Load the vulnerability model from the KVS
    • Calculate mean & stddev loss for the entire region (see the sketch after this list)
      • This is calculated by summing all of the asset losses for a given region
        • A single loss value (single number) is computed for the region
      • We do this for multiple realizations and create a list of loss values
      • We compute the mean & stddev values from this list of region losses
    • Print the mean & stddev loss values for the region
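
A minimal sketch of the aggregation step, assuming per-asset losses are already available for each GMF/realization (names are illustrative only):

import numpy

def region_mean_stddev_loss(asset_losses_per_realization):
    # One aggregate loss per realization: the sum of all asset losses
    # in the region for that ground motion field.
    region_losses = [sum(asset_losses)
                     for asset_losses in asset_losses_per_realization]
    return numpy.mean(region_losses), numpy.std(region_losses)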

Test data needed

  • smoketests/deterministic/ files

In particular, we need:

  • Exposure file
  • Job config with proper region constraints

Term definitions

  • GMF (Ground Motion Field): A collection of data nodes, each node containing latitude, longitude, and an IML value.
  • Realization == GMF
  • Loss == Loss Ratio * Asset Value
