
sead_change_control's Issues

20200203_DML_ADD_YEAR_TYPES: Populate tbl_years_types

Populate tbl_years_types as follows (values from Bugs plus an extra 'Unknown', which maps to NULLs and '?' in the Bugs import):

years_type_id | name        | description                                                        | date_updated
------------- | ----------- | ------------------------------------------------------------------ | ------------
              | Calendar    | Calendar years                                                     |
              | C14         | Radiocarbon years                                                  |
              | Radiometric | Radiometric years (but not C14)                                    |
              | Unknown     | Unknown years type (either to be defined or unspecified in source) |
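A minimal DML sketch of the change (column names are taken from the table above; that years_type_id is generated and that date_updated has a default are assumptions to be checked against the actual DDL):

```sql
-- Sketch: populate tbl_years_types with the four values above.
-- Assumes years_type_id is generated and date_updated defaults;
-- adjust to the actual table definition.
insert into tbl_years_types (name, description)
values
    ('Calendar',    'Calendar years'),
    ('C14',         'Radiocarbon years'),
    ('Radiometric', 'Radiometric years (but not C14)'),
    ('Unknown',     'Unknown years type (either to be defined or unspecified in source)');
```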

20191213_BUGS_IMPORT_POST_APPLY: Bugs Sample codes need to be stored as alt refs in SEAD

In BugsCEP, the unique identifier for a sample is TSample.SampleCODE.
This value is stored in bugs_trace during import but not in the main SEAD database.
SampleCODE needs to be stored in SEAD as:
tbl_sample_alt_refs.alt_ref with type "Database identifier" (a new alt_ref_type)

This is needed for tracing samples through SEAD to Bugs, for example when Francesca creates a matrix of all insect samples.
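A hedged sketch of the change (the bugs_trace column names and the exact layout of tbl_alt_ref_types/tbl_sample_alt_refs are assumptions; adjust to the actual schema):

```sql
-- Sketch: register the new alt_ref_type and copy Bugs SampleCODEs
-- into tbl_sample_alt_refs. Column names in bugs_trace are assumed.
insert into tbl_alt_ref_types (alt_ref_type, description)
values ('Database identifier', 'Identifier of the record in the source database');

insert into tbl_sample_alt_refs (physical_sample_id, alt_ref, alt_ref_type_id)
select bt.physical_sample_id,
       bt.sample_code,          -- assumed column holding TSample.SampleCODE
       t.alt_ref_type_id
from bugs_trace bt
join tbl_alt_ref_types t on t.alt_ref_type = 'Database identifier';
```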

Dendrochronology record type needed

Dendrochronology data needs a record type to enable some query and filter functionality.

tbl_record_types.record_name = 'Dendrochronology'
tbl_record_types.record_name = SEE ROGER'S QUERY

Update tbl_methods.record_type_id to match the new record_type
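The two steps above could be sketched as follows (the column record_name follows the issue text, the description column name and the method-selection predicate are placeholders):

```sql
-- Sketch: add the new record type and point the relevant methods at it.
insert into tbl_record_types (record_name, description)  -- description column assumed
values ('Dendrochronology', 'Dendrochronology data');

update tbl_methods
set record_type_id = (select record_type_id
                      from tbl_record_types
                      where record_name = 'Dendrochronology')
where method_id in (0);  -- placeholder: replace with the dendro method_ids
```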

Raä-IDs

Raä IDs should be displayed next to the site name in the site filter; this requires back-end support.

Isotope: Remaining errors in Isotope data

Remaining errors in Isotope data:

  • Reference to missing method_id:
    • method_id 23 in tbl_isotope_measurement
    • method_id 5 in tbl_sample_groups
  • Reference to missing sample_type_id:
    • sample_type_id 15 in tbl_physical_samples
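Queries along these lines can locate the orphaned rows (a sketch; standard foreign-key column names are assumed):

```sql
-- Find isotope measurements whose method_id has no match in tbl_methods.
select im.*
from tbl_isotope_measurement im
left join tbl_methods m on m.method_id = im.method_id
where im.method_id is not null
  and m.method_id is null;

-- The same left-join pattern applies to tbl_sample_groups (method_id)
-- and tbl_physical_samples (sample_type_id vs tbl_sample_types).
```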

Filter has corrupt name

Filter with facetCode "tbl_denormalized_measured_values_37" has a DisplayTitle of "P┬░". The sequence "┬░" is what the UTF-8 bytes for "°" look like when decoded as CP437, so this is an encoding translation error.
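If the stored value really is the mis-decoded UTF-8 for "P°", the repair could look like this (the facet table and column names here are assumptions about the facet schema):

```sql
-- Sketch: restore the intended title. Table and column names assumed.
update facet.facet
set display_title = 'P°'
where facet_code = 'tbl_denormalized_measured_values_37';
```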

Deploy of submission failed: "Cannot change to directory ./general"

Error output from the sqitch command:

Cannot change to directory ./general: No such file or directory

Trace begun at /bin/../lib/perl5/App/Sqitch.pm line 279
App::Sqitch::_parse_core_opts('App::Sqitch', 'ARRAY(0x558dc7e04d30)') called at /bin/../lib/perl5/App/Sqitch.pm line 182
App::Sqitch::go('App::Sqitch') called at /bin/sqitch line 17

Integration of temperature proxy data

Timeline needs temperature data in the database. I have created the table tbl_temperatures for this purpose, but it should be integrated into the sead change control so that it carries over between database versions. The name is also wrong: according to Phil this is not actually temperature data but only a proxy for it. Perhaps this data should also be inserted into an existing table somewhere rather than having its own? If you don't mind I'll leave it up to you @visead and @roger-mahler to figure out where/how to best integrate this data into the SEAD database in a more proper manner.

Attaching SQL for creating the table and inserting the data.

tbl_temperatures.zip

tbl_measured_value_dimensions and tbl_analysis_entity_dimensions are empty

These two tables are currently empty and not in use. They appear to be intended as another layer for storing information about subsamples and their measured dimensions, e.g. before- and after-burning weights for LOI analysis.

This functionality seems to exist indirectly in other, populated tables, although I might be confusing things.

Since the tables aren't in use and sample information is available from other tables, I would suggest looking into removing them.

20200203_DML_METHOD_UPDATE: Update characters in method description

20200203_DML_METHOD_UPDATE

The method description needs editing to remove the linebreak between 'use.' and 'See', as it causes problems when exporting data.

"Swedish Rikets Nät (National Grid) system. Full name "RT 90 2.5 gon V 0:-15". X = south-north, Y = west-east. Essentially superceded by SWEREF 99 although still in extensive use.
See http://www.lantmateriet.se/templates/LMV_Page.aspx?id=4766&lang=EN (NOTE: include URL as biblio link)"
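A sketch of the edit (assumes PostgreSQL escape-string syntax; the exact whitespace around the linebreak may differ, hence the regex):

```sql
-- Replace the linebreak between 'use.' and 'See' with a single space.
update tbl_methods
set description = regexp_replace(description, E'use\\.\\s*\\n\\s*See', 'use. See')
where description like '%RT 90 2.5 gon V%';
```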

20190101_DDL_FACET_SCHEMA: Region facet request

Can we have a "Region" facet under the space-time group?

Contents would be something like (filtering sites?):

select location_name, site_id, location_type_id
from tbl_locations
inner join tbl_site_locations
  on tbl_locations.location_id = tbl_site_locations.location_id
where tbl_locations.location_type_id in (2, 7, 14, 16, 18)
order by location_name;

Use case: A point of entry for all sites in Västerbotten linked from https://sparfran10000ar.se/

New result facet that targets datasets

A new result facet that targets datasets is added using the following JSON specification:

{
   "facet_id": 40,
   "facet_code": "result_facet_dataset",
   "display_title": "Datasets",
   "description": "Datasets",
   "facet_group_id":"99",
   "facet_type_id": 1,
   "category_id_expr": "tbl_datasets.dataset_id",
   "category_name_expr": "tbl_datasets.dataset_name",
   "sort_expr": "tbl_datasets.dataset_name",
   "is_applicable": false,
   "is_default": false,
   "aggregate_type": "count",
   "aggregate_title": "Number of datasets",
   "aggregate_facet_code": null,
   "tables": [
   {
   	"sequence_id": 1,
   	"table_name": "tbl_datasets",
   	"udf_call_arguments": null,
   	"alias":  null
   } ],
   "clauses": [  ]
}

20190101_DDL_FACET_SCHEMA: New facet "Master datasets"

A new facet with the following JSON specification:

{
	"facet_id": 38,
	"facet_code": "dataset_master",
	"display_title": "Master datasets",
	"description": "Master datasets",
	"facet_group_id": "2",
	"facet_type_id": 1,
	"category_id_expr": "tbl_dataset_masters.master_set_id ",
	"category_name_expr": "tbl_dataset_masters.master_name",
	"sort_expr": "tbl_dataset_masters.master_name",
	"is_applicable": true,
	"is_default": false,
	"aggregate_type": "count",
	"aggregate_title": "Number of samples",
	"aggregate_facet_code": "result_facet",
	"tables": [
    	{
    		"sequence_id": 1,
    		"table_name": "tbl_dataset_masters",
    		"udf_call_arguments": null,
    		"alias":  null
    	}
    ],
	"clauses": [  ]
}

Inconsistent sqitch.plan prevents deploy

Manual changes (deletions) in the plan of already-deployed changes prevent deploy.
Changes:
[x] Make the system forward-only for now (remove reverts)
[x] Make all changes idempotent
[x] Do a sqitch rebase (performs a revert and deploy in sequence)

Update all MAL datasets whose master_set_id is NULL to ID 2

Erik's mail, May 5, 2018:

I have trouble with the datasets in sead_master_9. There are a bunch in there without master_set_id. I don't know if these should be there or not?

Should we have some kind of collection of all the lookup data somewhere so we can include it in a datastore separately?
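The NULL-to-2 update itself is a one-liner (assuming master_set_id lives on tbl_datasets and that the NULL rows are all MAL's; add a restricting predicate if other masters can also have NULLs):

```sql
-- Sketch: assign master_set_id 2 (MAL) to datasets lacking one.
update tbl_datasets
set master_set_id = 2
where master_set_id is null;
```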

Consolidate public schema.

[x] Set up sqitch project

Compare the different SEAD versions for schema changes, excluding bugs-related data.
Create a DDL change (using sqitch) for each identified difference.

Wrong record_type for some methods?

Methods relating to geoarchaeology might have been marked with the wrong record_type: they have been marked as "Non-biological taxa" (record_type_id 10) instead of "Soil chemistry/property" (record_type_id 12). The mix-up is visible in the web browser when sorting sites by record type in the table view.

Only samples containing soil chemistry seem to have this error.
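A hedged correction sketch (the predicate identifying the affected geoarchaeology methods is a placeholder and must be replaced with the actual method_ids):

```sql
-- Sketch: move mis-marked soil chemistry methods from record_type 10
-- ('Non-biological taxa') to 12 ('Soil chemistry/property').
update tbl_methods
set record_type_id = 12
where record_type_id = 10
  and method_id in (0);  -- placeholder: replace with the affected method_ids
```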

SEAD @2019.12: Release checklist

Deploy new SEAD database

Note! This checklist _only_ deploys the database. Relevant systems need to be deployed separately and in conjunction.

Prerequisites

  • Docker
  • Git

Description

This will be the first deploy of the SEAD database using the new change control system. The starting point of the deployment is an SQL dump of sead_master_9, stored as the zipped file sead_master_9_public.sql.gz in the starting_point folder of the sead_change_control repository.

The overall deployment process is as follows:

  1. Prepare deploy
  2. Create new staging
  3. Import pending submissions
  4. Deploy submissions
  5. Import bugs data
  6. Switch production database
  7. Deploy relevant systems

Prepare deploy

Make sure you have an up-to-date version of the SEAD repositories sead_change_control and sead_clearinghouse*.

Run git clone if you haven't checked out the repository before, or update the code using git pull

% mkdir -p ~/source && cd ~/source
% git clone https://github.com/humlab-sead/sead_change_control.git
% git clone https://github.com/humlab-sead/sead_clearinghouse.git
% git clone https://github.com/humlab-sead/sead_clearinghouse_import.git

Freeze deployment by adding tag @2019.12 to all Sqitch plans (for each subfolder xyz) to be included in the release (this can also be done manually by editing the .plan files):

% sqitch tag --tag @2019.12 --plan-file ./xyz/sqitch.plan --note "2019 December release"

Add tag @2019.12 to all relevant repositories.

Create new staging

$ ./bin/deploy_staging --user humlab_admin --target-db-name sead_staging --create-database --source-type dump --on-conflict drop --deploy-to-tag @2019.12

Import pending submissions

The following workflow imports and commits submissions that have been prepared in Excel files, i.e. Ceramics, Dendrochronology and Isotope data. The flow assumes that relevant updates to the SEAD database model have been committed and that all lookup data referenced by the new data submissions has been inserted (separate change requests).

Check out or update the sead_clearinghouse and sead_clearinghouse_import systems.

  • Prepare submission data (Excel) files and copy them to sead_clearinghouse_import/data/input.
  • Import data to the SEAD Clearinghouse System using sead_clearinghouse_import. Edit options in ~/source/sead_clearinghouse_import/runner.py, i.e. specify the names of input files and data types.
  • Verify imported data using SEAD Clearinghouse.
  • Commit & deploy submission data to SEAD Change Control System
  • Deploy submission change request to SEAD staging

Run the following commands to import the data to Clearinghouse:

$ cd ~/source/sead_clearinghouse_import
$ pipenv shell
$ vi runner.py # edit run options
$ SEAD_CH_PASSWORD=qwerty python runner.py

Store staging phase 1 database:

create database sead_staging_phase_1 owner sead_master template sead_staging;

Deploy submissions to SEAD Change Control system:

$ cd ~/source/sead_clearinghouse/transport_system
$ ./deploy_submission.sh --dbname=sead_staging --id=1 --force --add-change-request
$ ./deploy_submission.sh --dbname=sead_staging --id=2 --force --add-change-request
$ ./deploy_submission.sh --dbname=sead_staging --id=3 --force --add-change-request
$ ./deploy_submission.sh --dbname=sead_staging --id=4 --force --add-change-request

Store staging phase 2 database:

create database sead_staging_phase_2 owner sead_master template sead_staging;

Deploy submissions

$ sqitch deploy --target staging --mode change --no-verify -C ./submissions 20191220_DML_SUBMISSION_CERAMICS_001_COMMIT
$ sqitch deploy --target staging --mode change --no-verify -C ./submissions 20191220_DML_SUBMISSION_DENDRO_BUILDING_002_COMMIT
$ sqitch deploy --target staging --mode change --no-verify -C ./submissions 20191220_DML_SUBMISSION_DENDRO_ARCHEOLOGY_003_COMMIT
$ sqitch deploy --target staging --mode change --no-verify -C ./submissions 20191220_DML_SUBMISSION_ISOTOPE_004_COMMIT

Store staging phase 3 database:

create database sead_staging_phase_3 owner sead_master template sead_staging;

Import bugs data

$ cd ~/source/sead_bugs_import || cd ~/source && git clone https://github.com/humlab-sead/sead_bugs_import
$ cd ~/source/sead_bugs_import && git pull
$ mvn -Dmaven.test.skip=true clean
$ mvn -Dmaven.test.skip=true package
$ vi config/application.properties # set sead_staging as target database
$ java -jar target/bugs.import-0.1-SNAPSHOT.jar --file=./bugsdata/bugsdata_20190503.mdb  > bugsimport_20191220.log 2>&1 &
$ # Execute post_bugs_import.sh (not yet created), or run:
$ psql -d sead_staging -c "select bugs_import.post_import_updates();"

Create Bugs import change request:

% cd ~/source/sead_change_control/
% sqitch add --change-name 20191221_DML_SUBMISSION_BUGS_20190303_COMMIT --note "Initial Bugs Import" --chdir ./submissions
% cd ~/source/sead_change_control/submissions/deploy/20191221_DML_SUBMISSION_BUGS_20190303_COMMIT
% # Use Navicat to diff the public schema pre/post bugs import into public_data_diff.sql
% pg_dump --data-only --blobs -d sead_staging --schema bugs_import -h seadserv.humlab.umu.se -F p -U humlab_admin -f bugs_import_schema.sql
% gzip bugs_import_schema.sql
% gzip public_data_diff.sql

Switch production database

-- alter database sead_production rename to sead_yyyymm;
create database sead_production owner sead_master template sead_staging;

Deploy relevant systems

Rebuild SEAD Query API:

% cd ~/source/sead_query_api/
% # wget https://github.com/humlab-sead/..././docker-build.sh && chmod +x ./docker-build.sh
% vi conf/appsettings.Production.json # edit run settings
% docker-compose down
% ./docker-build.sh

Start API

% docker-compose up -d

20200203_DML_FACET_UI_ELEMENTS_UPDATE: Correct Filter group and item titles

Correct the visible filter elements as follows (current → new):

measured_values → measurements
  MS → Magnetic sus.
  LOI → Loss on Ignition
  PT? → Phosphates

space_time → space/time
  Site → Sites
  Country → Countries
  Sample group → Sample groups

Taxa → Taxon

Reorder filter groups:
space/time
ecology
measurements
taxonomy

Reorder items within the space/time filter:
Master datasets
Countries
Sites
Sample groups
Time periods
Geochronology

20191125_DML_CERAMICS_LOOKUP: Ceramics lookup data

The CH import system has guards that prevent import of new lookup data via Clearinghouse. The ceramics data import files were created before these guards were activated.

The lookup data are added to SEAD via a change request in the change control system, and the Excel file is updated with the new SEAD system identities returned by the inserts.

20200315_DML_ECOCODE_ADD_SYSTEM_4: Anolds & van der Maarel (plants)

tbl_ecocode_system.name 'Anolds & van der Maarel (plants)' was not transferred from SEAD master 8 during migration. This needs doing, including code assignments to taxa and a reference for the system.
The system provides ecology codes for plants in the Netherlands, in environments as classified for biodiversity surveys. Part of Ida Lundberg's Magister project.
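The first step could be sketched as below (the exact system table name and columns are assumptions; the values and the taxa code assignments must come from sead_master_8):

```sql
-- Sketch: re-create the missing ecocode system. Column names assumed;
-- the related code-to-taxa rows and the reference must follow.
insert into tbl_ecocode_system (name, definition)
values ('Anolds & van der Maarel (plants)',
        'Ecology codes for plants in the Netherlands, classified for biodiversity surveys');
```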

New facet "Dataset methods"

A new facet with the following specification:

{
	"facet_id": 39,
	"facet_code": "dataset_methods",
	"display_title": "Dataset methods",
	"description": "Dataset methods",
	"facet_group_id": "2",
	"facet_type_id": 1,
	"category_id_expr": "tbl_methods.method_id ",
	"category_name_expr": "tbl_methods.method_name",
	"sort_expr": "tbl_methods.method_name",
	"is_applicable": true,
	"is_default": false,
	"aggregate_type": "count",
	"aggregate_title": "Number of datasets",
	"aggregate_facet_code": "result_facet_dataset",
	"tables": [
    	{
    		"sequence_id": 1,
    		"table_name": "tbl_dataset_masters",
    		"udf_call_arguments": null,
    		"alias":  null
    	},
    	{
    		"sequence_id": 2,
    		"table_name": "tbl_datasets",
    		"udf_call_arguments": null,
    		"alias":  null
    	}
    ],
	"clauses": [  ]
}

20170101_DDL_BIBLIO_REFACTOR_MODEL: Biblio refactoring MAL data transfer

tbl_biblio is being simplified, and existing MAL data in this and related tables needs to be merged into the simplified fields. The order of fields to be concatenated depends on the publication type, e.g. book vs. journal article.

Create strings for title and full_reference.

Phil will create the field list for each type, e.g.:

if publication_type = journal then
    tbl_biblio.title = tbl_biblio.title & " " & tbl...

for publication_type_id in: 9, 5, 4, 6, 12, 26, 23, 22
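The per-type concatenation could be shaped as a CASE expression (only the shape is shown; the field names used here are placeholders for Phil's per-type lists):

```sql
-- Sketch: build full_reference per publication type. Field names
-- after publication_type_id are placeholders for Phil's lists.
update tbl_biblio
set full_reference =
    case publication_type_id
        when 9 then concat_ws(' ', authors, year, title, journal, volume, pages)
        else concat_ws(' ', authors, year, title)
    end
where publication_type_id in (9, 5, 4, 6, 12, 26, 23, 22);
```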

20190503_DDL_TAXA_ATTRIBUTE_TYPE_LENGTH SqlException

Import fails with exception "ERROR: value too long for type character varying(25)" when values are inserted into tbl_taxa_measured_attributes, column attribute_type.

Sample values that exceed 25 characters:

  • Length larva final (10th) instar
  • Length (brachypterous females)
  • Length female brachypterous
  • Length female macropterous
  • Length, head-end of elytra
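A sketch of the fix implied by the change name (varchar(50) is an assumption; pick a bound that fits the data, or use text):

```sql
-- Sketch: widen the column so the longer attribute names fit.
alter table tbl_taxa_measured_attributes
    alter column attribute_type type character varying(50);
```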
