ohdsi / perseus

[under development] Tools for ETL into OMOP CDM and deployment of OHDSI toolstack

License: Apache License 2.0



Introduction

Perseus combines an intuitive, easy-to-use web-based UI for designing and implementing ETL (extract, transform, and load) configurations with services for converting native/raw data to the OMOP Common Data Model (CDM).

Additionally, Perseus has embedded tools for searching the standardized vocabularies, generating documentation for the ETL process, creating code mappings, and checking data quality.

Wiki

Getting started

Contact Us: [email protected]

Features

  • Map source tables and columns to the CDM
  • Combine source tables
  • Use pre-built SQL functions (replace, concat, …)
  • Use pre-built source-to-source and source-to-standard vocabulary lookups (ICD-9, ICD-10, NDC, …)
  • Create custom lookups for the concept_id fields
  • Set constant values for the CDM fields
  • Use system/auto-generated values for the CDM fields
  • Auto-mapping for similar fields
  • OMOP vocabulary search
  • Data quality checks
  • Search mappings between new codes and OMOP standard concepts
  • Convert data from native format to CDM
  • Logic for creating eras (DRUG_ERA, CONDITION_ERA, …)
  • Logic for grouping visit occurrence/observation_period records
  • Auto domain switching
  • Create an ETL specification

Screenshots

Start page

Link tables

Link fields

Concept configuration

Lookup configuration

Code mappings - Import

Code mappings

Technology

  • Angular 12
  • Python 3.7
  • Java 17
  • R 4.1.3
  • PostgreSQL 13.2
  • .NET Core 3.1

Deployment server requirements

  • Unix / Windows OS with Docker,
  • 4 GB RAM,
  • ~10 GB HDD (depends on vocabulary size),
  • Open ports: 80, 443.

Getting Started

Docker Compose and Podman

Install docker compose as a plugin

apt install docker-compose-plugin

You can use Podman instead of Docker:

apt install python3-pip
pip install podman-compose

If you have an issue downloading from Docker Hub, check the configuration file:

/etc/containers/registries.conf

It needs to contain the following line: unqualified-search-registries = ["docker.io"]

podman-compose pull
podman-compose up -d

Vocabulary

Get the link to the vocabulary from Athena.

cd vocabulary-db

Download the vocabulary archive and extract it to the vocabulary directory (full path: vocabulary-db/vocabulary). For example:

unzip <downloaded archive> -d vocabulary

Database deployment can take a long time if the vocabulary is large.

Then continue with the Starting with Docker Compose section.

SMTP server

Multi-user

(Optional)

cd user
  • To get user registration links by e-mail you should configure SMTP server settings first. Edit file named user-envs.txt in the user directory with the following content (without spaces):

SMTP_SERVER=<your SMTP server host address>
SMTP_PORT=<your SMTP port>
SMTP_EMAIL=<email from which registration links will be sent to users>
SMTP_USER=<SMTP login>
SMTP_PWD=<SMTP password>
TOKEN_SECRET_KEY=<token encoding key>

Then continue with the Starting with Docker Compose section.

Test user

Single-user

If you want to skip multi-user mode, use the user with these credentials:

Email:

Password:

perseus

Starting with Docker Compose

To start all containers at once using docker-compose:

  • make sure docker-compose is installed
  • set the vocabulary link (see the Vocabulary section)
  • configure the SMTP server as described in the SMTP section (optional)

Unix:

./startup.sh

Windows:

startup.cmd

Open localhost:80 in your browser, preferably Google Chrome.

Starting each container separately

CONTAINERS

Perseus uses auxiliary services to scan, convert and validate data.

Below are links to these services, which should be included in the app build.

White-rabbit service

https://github.com/SoftwareCountry/WhiteRabbit

Cdm-builder service

https://github.com/SoftwareCountry/ETL-CDMBuilder

Data-quality-check service

https://github.com/SoftwareCountry/DataQualityDashboard

Finally

Open localhost:80 in your browser, preferably Google Chrome.

Getting Involved

License

Perseus is licensed under Apache License 2.0

perseus's People

Contributors

chmatvey, windman, mariadolotova, bradanton, etokareva, alexandrlopoukhov, dependabot[bot], alexanderlopoukhov, kostiushenko, costebano, eugenezubkov, natb1, ladycaster, delvish, munkk, dotnetpart, coderdes, nnurlan, ssamus, alondhe, nurlan-umetov

perseus's Issues

[FEATURE REQUEST] Athena API Functions

Requested feature for improved visibility of the Athena instance within Perseus.

Criteria: Display the current instance's Athena vocabulary version within the "Vocabulary" tab, as well as the most up-to-date version of Athena available. If the current version is older than the most up-to-date supported version, allow the user to manually request/update the vocabulary to the newest version.

Considered done when current and newest versions are properly displayed within tab. Athena API request for update can be placed on lower priority.

Attached is a rough outline of the proposed changes:

[screenshot]

Automatically build Perseus Docker container images and push to Docker Hub when code changes in GitHub

Automatically build Perseus Docker container images and push them to Docker Hub when code changes are pushed to the OHDSI Perseus github repo. Tag the Docker image with a version number.

Benefits of this change:

  • It will ensure that we always have the latest Perseus Docker images in Docker Hub.
  • It is much faster to deploy Perseus using pre-built Docker Hub images instead of building them locally

One way to accomplish this is to develop GitHub Actions

Update the Perseus docker-compose.yaml file to download pre-built Perseus Docker images from Docker Hub

Currently the Perseus docker-compose.yml file is configured to build the Perseus Docker images locally.
This results in a very time consuming process for deployment of Perseus for end users.

Action:

Update the Perseus docker compose yaml file configuration to pull the latest pre-built Perseus Docker images from Docker Hub. Also update the README.md documentation to reflect this change.

See also github action #30 which ensures that the latest Perseus Docker images are maintained in Docker Hub
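The switch described in this issue amounts to replacing build: sections with image: references in the compose file. A minimal sketch of what one service entry might look like after the change; the image name and tag below are placeholders, not published artifacts:

```yaml
# Hypothetical service entry after the change (image name/tag are placeholders)
services:
  backend:
    # build: ./perseus-api           # before: image built locally
    image: ohdsi/perseus-backend:1.0   # after: pulled from Docker Hub
```

With image: entries in place, docker compose pull fetches everything up front and docker compose up no longer triggers local builds.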

[Low Prio] REQUEST: Port wiki from old repo to OHDSI repo

For consistency purposes, moving the wiki from the old repo to this one should eventually be done, as we are recommending that future users refer only to this repo for information.

Additionally, updating the wiki section of "Convert native to CDM" with a more explicit example of connection string (see below) has been a common user request.

[screenshot]

Scan report fails to load

I tried loading CSV files, it works, builds the scan report. But then trying to use the scan report gives this error:

Failed to load report: type "empty" does not exist LINE 1: ...RCHAR(15),"Page" INT,"Item" INT,"PatientDataType" EMPTY,"Pag... ^

Docker compose fails on solr

Using master branch, when running the docker compose, I am seeing it error out on the perseus-solr step:

------
 > [perseus-solr internal] load metadata for docker.io/library/solr:8.8.1:
------
------
 > [perseus-files-manager internal] load metadata for docker.io/library/openjdk:17-alpine:
------
failed to solve: rpc error: code = Unknown desc = failed to solve with frontend dockerfile.v0: failed to create LLB definition: no match for platform in manifest sha256:4b6abae565492dbe9e7a894137c966a7485154238902f2f25e9dbd9784383d81: not found
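"No match for platform in manifest" usually means the host architecture (for example arm64 on Apple Silicon) has no matching variant of the published image. A hedged workaround, assuming Docker Compose builds, is to pin the platform in an override file; the service names below mirror the log and may differ from the actual compose file:

```yaml
# docker-compose.override.yml (hypothetical): force amd64 images under emulation
services:
  solr:
    platform: linux/amd64
  files-manager:
    platform: linux/amd64
```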

bug when loading a scan report

After doing a clean build of Perseus I get the following error from _create_source_schema_by_scan_report:

(full stack below)
__init__() got multiple values for argument 'schema'

Looking at this
https://stackoverflow.com/questions/75282511/df-to-table-throw-error-typeerror-init-got-multiple-values-for-argument
suggests that the issue could have to do with pandas version mismatches. I'll also attach a pip freeze.

  1. Has anybody else encountered this?
  2. The version of pandas is quite old, and many of the libraries are several years old (notably Angular), which creates a number of landmines when debugging/developing. Are there any prospects for getting the libs up to date? I suspect this would be a prerequisite for most new work.

[2023-04-14 02:35:28,397] ERROR: http://localhost/backend/api/create_source_schema_by_scan_report request returned error: __init__() got multiple values for argument 'schema'
backend | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1516, in full_dispatch_request
backend | rv = self.dispatch_request()
backend | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1502, in dispatch_request
backend | return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
backend | File "/app/utils/username_header.py", line 12, in decorator
backend | return f(current_user, *args, **kwargs)
backend | File "/app/perseus_api.py", line 82, in create_source_schema_by_scan_report
backend | raise error
backend | File "/app/perseus_api.py", line 78, in create_source_schema_by_scan_report
backend | .create_source_schema_by_scan_report(current_user, etl_mapping.id, etl_mapping.scan_report_name)
backend | File "/app/services/source_schema_service.py", line 32, in create_source_schema_by_scan_report
backend | return _create_source_schema_by_scan_report(username, etl_mapping_id, scan_report_path)
backend | File "/app/services/source_schema_service.py", line 92, in _create_source_schema_by_scan_report
backend | raise e
backend | File "/app/services/source_schema_service.py", line 55, in _create_source_schema_by_scan_report
backend | from overview group by table;""")
backend | File "/usr/local/lib/python3.7/site-packages/pandasql/sqldf.py", line 156, in sqldf
backend | return PandaSQL(db_uri)(query, env)
backend | File "/usr/local/lib/python3.7/site-packages/pandasql/sqldf.py", line 58, in __call__
backend | write_table(env[table_name], table_name, conn)
backend | File "/usr/local/lib/python3.7/site-packages/pandasql/sqldf.py", line 121, in write_table
backend | index=not any(name is None for name in df.index.names)) # load index into db if all levels are named
backend | File "/usr/local/lib/python3.7/site-packages/pandas/io/sql.py", line 440, in to_sql
backend | pandas_sql = pandasSQL_builder(con, schema=schema)
backend | File "/usr/local/lib/python3.7/site-packages/pandas/io/sql.py", line 508, in pandasSQL_builder
backend | return SQLDatabase(con, schema=schema, meta=meta)
backend | File "/usr/local/lib/python3.7/site-packages/pandas/io/sql.py", line 940, in __init__
backend | meta = MetaData(self.connectable, schema=schema)

APScheduler==3.9.1
azure-common==1.1.28
azure-core==1.26.4
azure-identity==1.10.0
azure-keyvault-secrets==4.4.0
backports.zoneinfo==0.2.1
certifi==2022.12.7
cffi==1.15.1
charset-normalizer==3.1.0
click==8.1.3
cryptography==40.0.1
Flask==2.0.3
Flask-Cors==3.0.10
greenlet==2.0.2
idna==3.4
importlib-metadata==6.3.0
isodate==0.6.1
itsdangerous==2.1.2
Jinja2==3.1.2
MarkupSafe==2.1.2
msal==1.21.0
msal-extensions==1.0.0
msrest==0.7.1
numpy==1.21.6
oauthlib==3.2.2
pandas==0.23.4
pandasql==0.7.3
peewee==3.16.0
portalocker==2.7.0
psycopg2-binary==2.9.6
py-postgresql==1.2.2
pycparser==2.21
PyJWT==2.6.0
python-dateutil==2.8.2
pytz==2023.3
pytz-deprecation-shim==0.1.0.post0
requests==2.28.2
requests-oauthlib==1.3.1
six==1.16.0
SQLAlchemy==2.0.9
typing_extensions==4.5.0
tzdata==2023.3
tzlocal==4.3
urllib3==1.26.15
waitress==2.1.2
Werkzeug==2.2.3
xlrd==1.2.0
zipp==3.15.0
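Stripped of the pandas specifics, the traceback shows a generic Python TypeError: a function received the same argument both positionally and by keyword, which is typical of mismatched library versions (the freeze above pairs pandas 0.23.4 with SQLAlchemy 2.0.9, which postdates it by years). A minimal, library-free reproduction of the error class; the function name here is illustrative only:

```python
# A value is passed both positionally and as schema=..., mirroring how an old
# pandas can call into a newer SQLAlchemy whose signature has shifted.
def sql_builder(con, schema=None):
    return con, schema

try:
    sql_builder("conn", "public", schema="public")  # 'schema' supplied twice
except TypeError as e:
    message = str(e)

print(message)  # prints: sql_builder() got multiple values for argument 'schema'
```

Pinning mutually compatible versions (for instance, an SQLAlchemy release from the 1.x line for a pandas this old) is the usual way out of this class of error.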

[Feature request] Provide option for non-standard certificates

Note: my knowledge of certificates is very limited, so expect vagueness in my description

I am trying to install and run Perseus on my work computer.
The company I work for issues its own certificates so that network traffic can be inspected.
This causes, for example, RUN npm install to fail due to certificate checks failing, which I got past by setting npm config set strict-ssl false.

I believe this new issue I ran into (log below) is caused by a similar problem.
Given that Perseus is to be used within hospital networks I assume I am not the only person to face similar problems.
Could you think of a secure way of dealing with this? For example by (optionally) manually providing the certificate to be trusted within the setup/containers?

=> ERROR [build 7/10] RUN dotnet restore "org.ohdsi.cdm.presentation.builderwebapi/org.ohdsi.cdm.presentation.builderwebapi.csproj 12.7s

[build 7/10] RUN dotnet restore "org.ohdsi.cdm.presentation.builderwebapi/org.ohdsi.cdm.presentation.builderwebapi.csproj":
#22 3.366 Determining projects to restore...
#22 7.071 Restored /src/org.ohdsi.cdm.framework.common/org.ohdsi.cdm.framework.common.csproj (in 298 ms).
#22 7.072 Restored /src/org.ohdsi.cdm.framework.etl/org.ohdsi.cdm.framework.etl.common/org.ohdsi.cdm.framework.etl.common.csproj (in 298 ms).
#22 7.548 /usr/share/dotnet/sdk/3.1.426/NuGet.targets(128,5): error : Unable to load the service index for source https://api.nuget.org/v3/index.json. [/src/org.ohdsi.cdm.presentation.builderwebapi/org.ohdsi.cdm.presentation.builderwebapi.csproj]ntation. 12.7s
#22 7.548 /usr/share/dotnet/sdk/3.1.426/NuGet.targets(128,5): error : The SSL connection could not be established, see inner exception. [/src/org.ohdsi.cdm.presentation.builderwebapi/org.ohdsi.cdm.presentation.builderwebapi.csproj]
#22 7.548 /usr/share/dotnet/sdk/3.1.426/NuGet.targets(128,5): error : The remote certificate is invalid according to the validation procedure. [/src/org.ohdsi.cdm.presentation.builderwebapi/org.ohdsi.cdm.presentation.builderwebapi.csproj]
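Not part of Perseus itself, but one common pattern for corporate-CA environments is to copy the CA into each build stage and register it with the OS trust store, which dotnet restore uses directly and npm can use via NODE_EXTRA_CA_CERTS. A hypothetical Debian-based Dockerfile fragment; the certificate file name is a placeholder:

```dockerfile
# Hypothetical additions for a Debian-based build stage
COPY corporate-ca.crt /usr/local/share/ca-certificates/corporate-ca.crt
RUN update-ca-certificates
# Node-based builds also honor this variable for extra trusted CAs
ENV NODE_EXTRA_CA_CERTS=/usr/local/share/ca-certificates/corporate-ca.crt
```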

Proposal: integration with Broadsea 3.0

As discussed with @leeevans, we'd like to add Perseus services to Broadsea. This would simply leverage Perseus docker images, not alter Perseus in any way.

With Broadsea 3.0, we use traefik as a proxy, so the idea would be to add Perseus services and middlewares that allow them to work with all other services in Broadsea.


BUG (High Prio): "Convert to CDM" feature has duplicate naming, causing error

Originally posted by @Zwky26 in #4 (comment)
Error as originally posted here is caused by multiple entities being named "person_id". In a test mapping, I included only a single column and met the minimum requirements for pushing a "Convert to CDM" request.

Within the generated SQL preview, this means that the only mentions of "person_id" are those that are hardcoded within Perseus' logic. This might mean the error is independent of the mapping, and needs to be renamed to avoid ambiguity

Changing Perseus standard ports to accommodate other docker stacks on the same machine


Summary

After 4+ months away from the project, I restarted my Perseus VM and re-familiarized myself with the application. I then tried to install it on my main server where I run other stacks, including Broadsea. I understand that Ajit and Lee are endeavoring to include a Perseus profile in Broadsea 3.1 and much work has been accomplished.

For now, I would like to understand the implications of changing the standard ports specified in the docker-compose.yml. I made port adjustments in the attached file. (docker-compose.txt)

Value Summary

Clarification on what breaks, and on the best ports/method to change to ensure compatibility in a multi-stack installation environment, would assist current efforts to utilize the application stack as well as to integrate it as a profile alongside other useful stacks for the OHDSI community.
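As a general rule for Compose stacks (an assumption about Perseus's setup, not verified against its compose file), only the host side of each published port needs to change; service-to-service traffic is resolved over the Compose network and is unaffected. A hypothetical override:

```yaml
# docker-compose.override.yml (hypothetical service name)
services:
  frontend:
    ports:
      - "8080:80"   # host port changed from 80; container port stays 80
```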

BUG (Med Priority): Leftmost Column Recognition

SQL Functions fail to recognize the leftmost column of a table


Renaming "primary_code" to "p_code" does not resolve the issue. Swapping the position of the two columns in the CSV reveals that it is not recognizing the leftmost column name.

The error occurs for tables seemingly regardless of how many total columns there are (1-20 columns), BUT this issue does not occur for user-created views.

BUG (Low Prio/Edge Case): Create View fails to work when table name capitalized

An edge case occurs when creating a View that does not resolve, but also does not throw any error. Table names are censored for
proprietary reasons.

SELECT DISTINCT pos.*
    FROM table0
    INNER JOIN table1
        ON table1.pat_id = table0.epic_pat_id
    INNER JOIN TABLE2 AS pos
        ON table1.cur_prim_loc_id = pos.pos_id

The edge case occurs when TABLE2 is capitalized. Changing it to lowercase allows it to run without issue. The error only occurs when joining using a capitalized table. When simply doing

SELECT DISTINCT *
    FROM TABLE2

the capitalization does not have an effect.

Also, the edge case only occurs when joining multiple tables (it does not occur when joining exactly 2 tables total).
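For context (not from the report itself): plain PostgreSQL folds unquoted identifiers to lowercase, while quoted identifiers keep their case, so a query layer that starts quoting table names under some conditions can silently change which relation is referenced:

```sql
-- Unquoted identifiers fold to lowercase: these reference the same relation.
SELECT * FROM TABLE2;
SELECT * FROM table2;

-- A quoted identifier keeps its case and may name a different relation.
SELECT * FROM "TABLE2";  -- only matches a table created as "TABLE2"
```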

Error when scanning from csv file "Required request header 'Username' for method parameter type String is not present" pops up

Loading from a report seems to work, but trying to do that from a CSV file results in this error: "Required request header 'Username' for method parameter type String is not present". It looks like WhiteRabbit isn't being spoken to properly, as the whiterabbit container logs show the same error as on the screen. Does the CSV file need to be a special CSV file or just a normal file? Not sure what I'm missing. Any help appreciated.
Cheers
Andy

Node error on docker compose build

Hello,

I am getting the below error when building using docker compose:

 => ERROR [perseus-frontend build-step 6/6] RUN npm run build:prod                                                25.8s
 => CACHED [perseus-frontend stage-1 2/7] RUN apt-get update     && apt-get install -y --no-install-recommends op  0.0s
 => CACHED [perseus-frontend stage-1 3/7] COPY nginx.conf /etc/nginx/nginx.conf                                    0.0s
------
 > [perseus-vocabularydb internal] load build context:
------
------
 > [perseus-frontend build-step 6/6] RUN npm run build:prod:
#0 5.490
#0 5.490 > [email protected] build:prod /usr/src/app
#0 5.490 > ng build --configuration production
#0 5.490
#0 24.45 An unhandled exception occurred: Cannot find module 'node:assert'
#0 24.45 Require stack:
#0 24.45 - /usr/src/app/node_modules/@ngtools/webpack/src/resource_loader.js
#0 24.45 - /usr/src/app/node_modules/@ngtools/webpack/src/ivy/plugin.js
#0 24.45 - /usr/src/app/node_modules/@ngtools/webpack/src/ivy/index.js
#0 24.45 - /usr/src/app/node_modules/@ngtools/webpack/src/index.js
#0 24.45 - /usr/src/app/node_modules/@angular-devkit/build-angular/src/webpack/configs/common.js
#0 24.45 - /usr/src/app/node_modules/@angular-devkit/build-angular/src/webpack/configs/index.js
#0 24.45 - /usr/src/app/node_modules/@angular-devkit/build-angular/src/builders/browser/index.js
#0 24.45 - /usr/src/app/node_modules/@angular/cli/node_modules/@angular-devkit/architect/node/node-modules-architect-host.js
#0 24.45 - /usr/src/app/node_modules/@angular/cli/node_modules/@angular-devkit/architect/node/index.js
#0 24.45 - /usr/src/app/node_modules/@angular/cli/models/architect-command.js
#0 24.45 - /usr/src/app/node_modules/@angular/cli/commands/build-impl.js
#0 24.45 - /usr/src/app/node_modules/@angular-devkit/schematics/tools/export-ref.js
#0 24.45 - /usr/src/app/node_modules/@angular-devkit/schematics/tools/index.js
#0 24.45 - /usr/src/app/node_modules/@angular/cli/utilities/json-schema.js
#0 24.45 - /usr/src/app/node_modules/@angular/cli/models/command-runner.js
#0 24.45 - /usr/src/app/node_modules/@angular/cli/lib/cli/index.js
#0 24.45 - /usr/src/app/node_modules/@angular/cli/lib/init.js
#0 24.45 - /usr/src/app/node_modules/@angular/cli/bin/ng
#0 24.45 See "/tmp/ng-3EtPm3/angular-errors.log" for further details.
#0 24.53 npm ERR! code ELIFECYCLE
#0 24.53 npm ERR! syscall spawn
#0 24.53 npm ERR! file sh
#0 24.53 npm ERR! errno ENOENT
#0 24.56 npm ERR! [email protected] build:prod: `ng build --configuration production`
#0 24.56 npm ERR! spawn ENOENT
#0 24.56 npm ERR!
#0 24.56 npm ERR! Failed at the [email protected] build:prod script.
#0 24.56 npm ERR! This is probably not a problem with npm. There is likely additional logging output above.
#0 24.64
#0 24.64 npm ERR! A complete log of this run can be found in:
#0 24.64 npm ERR!     /root/.npm/_logs/2023-01-26T22_21_00_798Z-debug.log
------
failed to solve: executor failed running [/bin/sh -c npm run build:${env}]: exit code: 1

[BUG] Convert to CDM Fails; "Data can't be empty"

Hi Perseus Team,

I think I am very close to having a working ETL process but am running into an issue at the final step. When I go to click convert to CDM with the following connection I have:

[screenshot]

I get the following error:

[screenshot]

What is going wrong here? I consulted the wiki but it seems like I am doing nothing wrong. Furthermore, I am able to successfully generate an additional 10000 rows of fake data. The issue also persists with fake data generation:

[screenshot]

Which again gives:

[screenshot]

Thanks!

Enable Codespaces for Perseus

Allow developers to interact with the Perseus project through Github codespaces.

This would help, for example, software developers who are interested in working with the code but don't have the horsepower on their local machines.

CDM Builder ETL template

General

The .xml template file is used for reading data from a source table and creating abstract OMOP objects, which will later be converted to the CDM format.

Generic pattern:

<QueryDefinition>
  <Query>
  <!--SQL SELECT Statement-->
  </Query>
    <OMOP_entity>
       <OMOP_entity_Definition>
          <!--fields mapping-->
       </OMOP_entity_Definition>
       <!--...-->
       <OMOP_entity_Definition>
       </OMOP_entity_Definition>
    </OMOP_entity>
    <!--...-->
    <OMOP_entity>
    <!--...-->
    </OMOP_entity>
</QueryDefinition>

One source table can generate many OMOP entities:

<ConditionOccurrence>
	<ConditionOccurrenceDefinition>
		<!--fields mapping-->
	</ConditionOccurrenceDefinition>
</ConditionOccurrence>
<VisitOccurrence>
	<VisitOccurrenceDefinition>
		<!--fields mapping-->
	</VisitOccurrenceDefinition>
</VisitOccurrence>

Fields mapping

Used for getting data from a source field and moving it to the specified OMOP entity field.

 <MeasurementDefinition> <!--OMOP entity name-->
	<PersonId>person_id</PersonId> <!--person_id goes to Measurement.PersonId-->
	<StartDate>measurement_date</StartDate>
	<Concepts> <!--ConceptId region-->
		<Concept name="MeasurementConceptId"> <!--Concept name-->
			<ConceptIdMapper>
				<Mapper>
					<Lookup>ConditionIcd10</Lookup> <!--Lookup name, sql stored in separate file-->
				</Mapper>
			</ConceptIdMapper>
			<Fields>
                                 <!--Value from the measurement_concept_id_1 through ConditionIcd10 lookup will be mapped to the corresponding ConceptId-->
				<Field key="measurement_concept_id_1"/> 
			</Fields>
		</Concept>
	</Concepts>
</MeasurementDefinition>

Lookup file example: ConditionIcd10

The fields mapping has a set of common fields (PersonId, StartDate, EndDate, ConceptId, …) and fields specific to the current OMOP entity (UniqueDeviceId for DEVICE_EXPOSURE, PrecedingVisitDetailId for VISIT_DETAIL, …). Through the common fields, each OMOP entity can be converted to any other.
For instance, suppose we decided that a set of source fields will generate a Condition Occurrence entity and uses the ICD-10 lookup. If, after mapping to ICD-10, the resulting concept turns out to have a domain different from the initial one (in our case Condition), the relevant OMOP entity will be created automatically, similar to the STEM-to-CDM approach.

Concept region

There are different ways to define the Concept Id:

  1. Use the value from the source as it is.
<Concept name="MeasurementConceptId">
  <Fields>
    <Field conceptId="unit_concept_id"/>
  </Fields>
</Concept>
  2. Use a lookup.
<Concepts>
  <Concept name="MeasurementConceptId">
    <ConceptIdMapper>
      <Mapper>
        <Lookup>snomed</Lookup>
      </Mapper>
    </ConceptIdMapper>
    <Fields>
      <Field key="measurement_concept_id_1"/>
    </Fields>
  </Concept>
</Concepts>
  3. Use different source values for concept_id and source_value.
<Concept name="MeasurementConceptId">
  <ConceptIdMapper>
    <Mapper>
      <Lookup>snomed</Lookup>
    </Mapper>
  </ConceptIdMapper>
  <Fields>
    <Field key="measurement_concept_id_1" sourceKey="measurement_source_value_1"/>
  </Fields>
</Concept>
  4. The same as the previous, plus a specified type_concept_id.
<Concept name="MeasurementConceptId">
  <ConceptIdMapper>
    <Mapper>
      <Lookup>snomed</Lookup>
    </Mapper>
  </ConceptIdMapper>
  <Fields>
    <Field key="measurement_concept_id_1" sourceKey="measurement_source_value_1" typeId="measurement_type_concept_id_1"/>
  </Fields>
</Concept>
  5. Multiple source fields; a new OMOP entity will be created for each field.
<Concepts>
  <Concept name="MeasurementConceptId">
    <ConceptIdMapper>
      <Mapper>
        <Lookup>snomed</Lookup>
      </Mapper>
    </ConceptIdMapper>
    <Fields>
      <Field key="measurement_concept_id_1"/>
      <Field key="measurement_concept_id_2"/>
      <Field key="measurement_concept_id_3"/>
    </Fields>
  </Concept>
</Concepts>

HowTo: Add an HTTPS redirect for Perseus?

Here is my amended nginx server.docker.conf:

server {

        listen       88;
        listen  [::]:88;
        server_name  sandbox.acumenus.net;

        location /.well-known/acme-challenge/ {
                root /verify;
                default_type "text/plain";
        }

        location = /.well-known/acme-challenge/ {
                return 404;
		}

		# Redirect all other requests to HTTPS
		location / {
			return 301 https://$host:2443$request_uri;
		}
}
		
server {

        listen 2443 ssl;
        server_name  sandbox.acumenus.net;

        ssl_protocols           TLSv1.2 TLSv1.3;
		ssl_certificate /etc/letsencrypt/archive/sandbox.acumenus.net/fullchain2.pem; # path to your fullchain.pem
		ssl_certificate_key /etc/letsencrypt/archive/sandbox.acumenus.net/privkey2.pem; # path to your privkey.pem
		ssl_session_timeout     10m;

    location / {
		proxy_set_header            X-Real-IP $remote_addr;
		proxy_set_header            X-Forwarded-For $proxy_add_x_forwarded_for;
		proxy_set_header            X-Forwarded-Proto $scheme;
		proxy_set_header            Host $host;
		proxy_pass                  http://172.17.0.1:4200;
		# proxy_pass                  http://host.docker.internal:4200;

	}
}

Connects insecure on port 88.
Connection refused on port 2443.

Any ideas what I might be doing wrong here? Any help would be greatly appreciated.
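Not from the report itself, but a common cause of "connection refused" on a newly added listener in a containerized nginx is that the port is never published; it has to appear in the service's port mappings as well as in the nginx config. A hypothetical fragment, assuming the proxy runs as a Compose service:

```yaml
services:
  nginx:                 # hypothetical service name
    ports:
      - "88:88"
      - "2443:2443"      # must be published for the HTTPS listener to be reachable
```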

Feature (Low Prio): Sort USAGI results by Match Score

Feedback received during demo: as implemented in the Java app version, being able to click on the header to enable "Sort by Match Score Ascending" or "Descending" would be a small but appreciated quality-of-life change

[BUG] Unable to Connect to Postgres Database

Hi Perseus Team,

I am seemingly unable to scan a PostgreSQL database instance from within Perseus, and I am not sure why this is failing.
Attached is a video of me attempting a connection, but it fails:

[video: perseus_breakage.mp4]

I followed the wiki and am unsure as to what happened -- this feels like a bug.

Thanks!

Clinician/analyst focused vocabulary mapping workflow.

Summary

The vocabulary mapping that exists in Perseus today is one of its key sources
of value because it creates a streamlined user experience compared to other
options. This benefit is diminished, though, by workflows that are indirect
and include ETL activities such as database integration and design of the data
flow. We want a streamlined workflow for vocabulary mapping that specifically
targets an analyst-type persona (clinician, informaticist, data scientist, statistician, etc.).

  • Include a one paragraph brief summarizing the scope of the proposal.
    Outline the scope of the proposal by using plain language, and also be explicit
    about what is out of scope.

  • Summarize the key user personas that will be affected by this proposal. Describe the
    user workflow.

  • Summarize the value created by the proposal - ex. what audience will be impacted
    and what is the size of that audience. What are the limitations of the existing solutions.

  • Describe a strategy for acceptance testing of the proposal.

  • Is the proposal supported by the current (non-functional) design?
    If not, also draft a design proposal for review. List the relevant design proposals and
    how they are related to this feature proposal.

  • Draft the development tasks that would be required to implement the proposal.
    Estimate the level of effort for each task in person days. No task should take more
    than 5 person days.

  • Summarize the anticipated maintenance costs for the proposal.

    • Does the proposal have a dependency on
      cross-cutting functionality such as database management, scheduling, logging etc.? What
      maintenance would be required if the implementation of that cross-cutting concern changed.
    • Does the proposal have an implicit dependency on future releases?
      For example, support for new SDK's, bindings or new documentation specifications
      would all require ongoing updates in future releases.
    • End user support: What support will end user personas require to use the feature?
      How will that support requirement be mitigated?
  • Provide a plan for executing/sponsoring the implementation and maintenance of the proposal.
    What are the main risks to implementation and how can they be mitigated?

  • Attend the Perseus Working Group to discuss your proposal.

User Personas and Workflows

The target persona is a clinician-type user - differentiated from an engineering-type
user by domain knowledge and technical capability. This persona needs a "self-service"
workflow that enables them to create vocabulary mappings that can be provided to
the engineering-type persona as part of a specification to develop the ETL. The workflow
must be streamlined so that the steps make sense from the perspective of different personas.
I.e. the analyst's experience should be personalized so they are not expected to
know Perseus implementation details to accomplish their task. Ex:

  • Analyst should not be exposed to data integration tasks.
  • Vocabulary mapping should be decoupled from data flow design.
  • Mapping activities should use existing data profile, rather than importing CSV.

Value Summary

Vocabulary mapping is a core use case for Perseus, and it is differentiated
from other solutions by its streamlined user experience. Curating the
workflow for the analyst will help ensure the experience stays streamlined.

Acceptance Testing

An analyst persona can navigate to a page in Perseus
and - "self-service", without support from other resources
or additional documentation - perform
all the tasks necessary to create a vocabulary mapping
to serve as part of an ETL specification. This includes:

  • defining new vocabularies
  • mapping vocabularies to columns in the source data
  • validating and customizing mappings

In particular, these activities can be done without context switching
or navigating around engineering-focused activities. Ex. the vocabulary
mapping workflow must use an existing data profile rather than
asking the analyst to export and upload a CSV of their health data.

These acceptance criteria will be validated against user feedback. In
addition, an automated testing strategy must be developed to
verify implementation details and mitigate regressions.

Design Considerations

There are several reasons to recommend partitioning the operational
and software architecture into a new "component" focused on the vocabulary mapping
use case.

  • The Perseus project is not optimized for developer experience. Ex:
    • Software dependencies span many technologies and are often badly out of date.
    • The container environment is very heavy, has many interdependencies, and offers very limited tooling for development.
  • We would like to invite new contributors to participate in development
    of this key use case. We expect software boundaries to follow team boundaries.
  • We expect user experience boundaries to flow from software boundaries.

For these reasons, the design recommendation is to build a component that can be
developed, operationalized, and tested as a unit decoupled from the core Perseus codebase.

Implementation Plan

Maintenance Plan

Execution Plan

Working Group Notes

MacOS Port 5000 conflict

As you know I was having a strange problem with a conflict on port 5000.

I found the issue! And I wanted to alert you so that you can inform the broader community of potential users who might try to use Docker Desktop for macOS, especially since the Monterey and, currently, Ventura releases of the OS.

Apple uses port 5000 for incoming AirPlay requests to stream music from devices to the macOS machine. Turning this off is simple:
(screenshot: the AirPlay Receiver setting being disabled in macOS System Preferences)
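Before starting Perseus, it can help to check whether something (such as AirPlay Receiver) is already bound to port 5000. This is a minimal sketch using only the Python standard library; the port number and hint text are taken from the report above.

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP listener accepts connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        # connect_ex returns 0 on a successful connection, an errno otherwise
        return s.connect_ex((host, port)) == 0

if __name__ == "__main__":
    if port_in_use(5000):
        print("port 5000 is busy (on macOS, check the AirPlay Receiver setting)")
    else:
        print("port 5000 is free")
```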

R-Server build fails

Building R-Server using the latest commit to master of SoftwareCountry/DataQualityDashboard results in the following error:

DOCKER_BUILDKIT=0 docker-compose build r-serve
Step 33/41 : RUN ./java-secure
 ---> Running in 898db1092bc3
./java-secure: 5: cd: can't cd to etc/java-11-openjdk/security
sed: can't read java.security: No such file or directory
1 error occurred:
        * Status: The command '/bin/sh -c ./java-secure' returned a non-zero code: 2, Code: 2
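The message `cd: can't cd to etc/java-11-openjdk/security` suggests the `java-secure` script uses a relative path (no leading `/`), which is resolved against the current working directory and therefore only works if the script happens to run from `/`. A minimal reproduction of that failure mode, assuming this reading of the error:

```python
import os
import tempfile

# A path without a leading "/" is resolved relative to the current working
# directory, so "cd etc/java-11-openjdk/security" fails unless run from "/".
with tempfile.TemporaryDirectory() as workdir:
    os.chdir(workdir)  # simulate the build step running somewhere other than "/"
    relative = "etc/java-11-openjdk/security"
    print("relative path resolves:", os.path.isdir(relative))           # relative path resolves: False
    print("absolute form is absolute:", os.path.isabs("/" + relative))  # absolute form is absolute: True
```

If this reading is correct, prefixing the path with `/` in `java-secure` would be the fix; note also that the directory name assumes OpenJDK 11 while the stack otherwise lists Java 17, which may be worth checking.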

Merge Perseus service wrapper code into the main OHDSI GitHub repos & remove forks

Lee Evans:

Work on aligning the Perseus Docker containers with the latest official releases of OHDSI tools like DQD, WhiteRabbit etc. I need to think about what that would look like from a practical point of view - but please add a placeholder in the scope of work for now.

  1. White Rabbit
  2. CDM Builder
  3. Data Quality Dashboard

internal Perseus team item: https://dev.arcadia.spb.ru/Arcadia-CDM/Perseus/_workitems/edit/240

[BUG] Errors in Setting Up Perseus

Hi Perseus Team,

Saw your recent presentations about the Perseus project at various OHDSI calls - amazing work! Very excited about it and have already started exploring how to use it. With that said, I have run into a variety of errors with setting up Perseus and am blocked. Was hoping I could receive some assistance. The following sections break up what I have encountered.

System and Software Details

Operating System: RHEL 8
Docker version: N/A
Podman: 4.1.1
Perseus: 6a40cd6

Merge: 86ee545 72b7c42
Author: Matvey Chudakov [email protected]
Date: Sat Sep 17 10:22:06 2022 +0300

Merge pull request #675 from SoftwareCountry/staging

Staging

Attempted Procedure and Expected Result

  1. Clone the Perseus repo onto your computer
  2. Make sure to have podman and podman-compose installed on your machine
    • For RHEL based systems sudo dnf install podman podman-compose
    • For Debian based systems sudo apt-get install podman podman-compose
    • For OSX systems sudo brew install podman podman-compose
  3. Navigate to https://athena.ohdsi.org/vocabulary/list and download all vocabularies that do not require a license
    • This will take 15 - 20 minutes
  4. With the downloaded Athena zip, save it to the perseus/vocabulary-db directory.
  5. Create a new folder within vocabulary-db called vocabulary
  6. Unzip the Athena zip file into vocabulary
  7. Navigate to the root of the Perseus repository and run the shell script startup.sh
    • Could be run bash startup.sh or ./startup.sh
  8. Navigate one level up from the root of the Perseus repository and clone the following repository:
    • git clone https://github.com/SoftwareCountry/ETL-CDMBuilder.git
  9. Edit the Dockerfile and change where it says: build: ../ETL-CDMBuilder to build: ../ETL-CDMBuilder/source
  10. Start the compose process by running: sudo podman-compose up -d

I expect that the compose process should just "work" and that I can navigate to https://localhost:80 from my browser.

NOTE: Steps 8 and 9 were required because the OHDSI ETL-CDMBuilder that Perseus tries to call is too old and errors out. I noticed that the SoftwareCountry fork had the most recent changes, so I replaced that step within the Docker process manually to provide a recent version that had an existing Dockerfile; the other version does not have a Dockerfile and errors.

What Actually Happens

Instead of the above process working, it fails and I am unable to access Perseus. Here is the error message that I see after everything builds:

STEP 1/33: FROM r-base:4.1.3
STEP 2/33: ARG prop=default
--> Using cache f50afaf4061e1e70d6eaa1cbb3f66647737ae4f67aa92c027926e8252f11e1b6
--> f50afaf4061
STEP 3/33: RUN apt-get update
Get:1 http://deb.debian.org/debian testing InRelease [161 kB]
Get:2 http://deb.debian.org/debian testing/main amd64 Packages [8,490 kB]
Get:3 http://cdn-fastly.deb.debian.org/debian sid InRelease [158 kB]
Get:4 http://cdn-fastly.deb.debian.org/debian sid/main amd64 Packages [9,343 kB]
Fetched 18.2 MB in 3s (6,197 kB/s)
Reading package lists...['podman', '--version', '']
using podman version: 4.1.1
** excluding:  set()
['podman', 'inspect', '-t', 'image', '-f', '{{.Id}}', 'perseus_web']
['podman', 'inspect', '-t', 'image', '-f', '{{.Id}}', 'perseus_shareddb']
['podman', 'inspect', '-t', 'image', '-f', '{{.Id}}', 'perseus_frontend']
['podman', 'inspect', '-t', 'image', '-f', '{{.Id}}', 'perseus_vocabularydb']
['podman', 'inspect', '-t', 'image', '-f', '{{.Id}}', 'perseus_files-manager']
['podman', 'inspect', '-t', 'image', '-f', '{{.Id}}', 'perseus_user']
['podman', 'inspect', '-t', 'image', '-f', '{{.Id}}', 'perseus_solr']
['podman', 'inspect', '-t', 'image', '-f', '{{.Id}}', 'perseus_r-serve']
podman build -t perseus_r-serve -f ../DataQualityDashboard/R/Dockerfile --build-arg prop=docker ../DataQualityDashboard/R
exit code: 134
['podman', 'inspect', '-t', 'image', '-f', '{{.Id}}', 'perseus_backend']
['podman', 'inspect', '-t', 'image', '-f', '{{.Id}}', 'perseus_white-rabbit']
['podman', 'inspect', '-t', 'image', '-f', '{{.Id}}', 'perseus_athena']
['podman', 'inspect', '-t', 'image', '-f', '{{.Id}}', 'perseus_cdm-builder']
['podman', 'inspect', '-t', 'image', '-f', '{{.Id}}', 'perseus_usagi']
['podman', 'inspect', '-t', 'image', '-f', '{{.Id}}', 'perseus_data-quality-dashboard']
['podman', 'network', 'exists', 'perseus_default']
podman create --name=web --label io.podman.compose.config-hash=123 --label io.podman.compose.project=perseus --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=perseus --label com.docker.compose.project.working_dir=/home/[email protected]/FOSS/Perseus --label com.docker.compose.project.config_files=docker-compose.yaml --label com.docker.compose.container-number=1 --label com.docker.compose.service=web --net perseus_default --network-alias web -p 80:80 --restart always perseus_web
exit code: 125
podman volume inspect perseus_shareddb || podman volume create perseus_shareddb
['podman', 'volume', 'inspect', 'perseus_shareddb']
['podman', 'network', 'exists', 'perseus_default']
podman create --name=shareddb --label io.podman.compose.config-hash=123 --label io.podman.compose.project=perseus --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=perseus --label com.docker.compose.project.working_dir=/home/[email protected]/FOSS/Perseus --label com.docker.compose.project.config_files=docker-compose.yaml --label com.docker.compose.container-number=1 --label com.docker.compose.service=shareddb -v perseus_shareddb:/data/postgres --net perseus_default --network-alias shareddb -p 5432:5432 perseus_shareddb
exit code: 125
['podman', 'network', 'exists', 'perseus_default']
podman create --name=frontend --label io.podman.compose.config-hash=123 --label io.podman.compose.project=perseus --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=perseus --label com.docker.compose.project.working_dir=/home/[email protected]/FOSS/Perseus --label com.docker.compose.project.config_files=docker-compose.yaml --label com.docker.compose.container-number=1 --label com.docker.compose.service=frontend --net perseus_default --network-alias frontend -p 4200:4200 perseus_frontend
exit code: 125
podman volume inspect perseus_vocabularydb || podman volume create perseus_vocabularydb
['podman', 'volume', 'inspect', 'perseus_vocabularydb']
['podman', 'network', 'exists', 'perseus_default']
podman create --name=vocabularydb --label io.podman.compose.config-hash=123 --label io.podman.compose.project=perseus --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=perseus --label com.docker.compose.project.working_dir=/home/[email protected]/FOSS/Perseus --label com.docker.compose.project.config_files=docker-compose.yaml --label com.docker.compose.container-number=1 --label com.docker.compose.service=vocabularydb -v perseus_vocabularydb:/data/postgres --net perseus_default --network-alias vocabularydb -p 5431:5432 --healthcheck-command /bin/sh -c pg_isready' '-q' '-d' 'vocabulary' '-U' 'perseus --healthcheck-interval 30s --healthcheck-timeout 60s --healthcheck-retries 10 perseus_vocabularydb
exit code: 125
['podman', 'network', 'exists', 'perseus_default']
podman create --name=files-manager --label io.podman.compose.config-hash=123 --label io.podman.compose.project=perseus --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=perseus --label com.docker.compose.project.working_dir=/home/[email protected]/FOSS/Perseus --label com.docker.compose.project.config_files=docker-compose.yaml --label com.docker.compose.container-number=1 --label com.docker.compose.service=files-manager -e SPRING_PROFILES_ACTIVE=docker --net perseus_default --network-alias files-manager -p 10500:10500 perseus_files-manager
exit code: 125
['podman', 'network', 'exists', 'perseus_default']
podman create --name=user --label io.podman.compose.config-hash=123 --label io.podman.compose.project=perseus --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=perseus --label com.docker.compose.project.working_dir=/home/[email protected]/FOSS/Perseus --label com.docker.compose.project.config_files=docker-compose.yaml --label com.docker.compose.container-number=1 --label com.docker.compose.service=user --env-file /home/[email protected]/FOSS/Perseus/user/user-envs.txt -e USER_ENV=Docker --net perseus_default --network-alias user -p 5001:5001 perseus_user
exit code: 125
podman volume inspect perseus_solr || podman volume create perseus_solr
['podman', 'volume', 'inspect', 'perseus_solr']
['podman', 'network', 'exists', 'perseus_default']
podman create --name=solr --label io.podman.compose.config-hash=123 --label io.podman.compose.project=perseus --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=perseus --label com.docker.compose.project.working_dir=/home/[email protected]/FOSS/Perseus --label com.docker.compose.project.config_files=docker-compose.yaml --label com.docker.compose.container-number=1 --label com.docker.compose.service=solr -v perseus_solr:/var/solr --net perseus_default --network-alias solr -p 8983:8983 perseus_solr
exit code: 125
['podman', 'network', 'exists', 'perseus_default']
podman create --name=r-serve --label io.podman.compose.config-hash=123 --label io.podman.compose.project=perseus --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=perseus --label com.docker.compose.project.working_dir=/home/[email protected]/FOSS/Perseus --label com.docker.compose.project.config_files=docker-compose.yaml --label com.docker.compose.container-number=1 --label com.docker.compose.service=r-serve --net perseus_default --network-alias r-serve -p 6311:6311 perseus_r-serve
exit code: 125
['podman', 'network', 'exists', 'perseus_default']
podman create --name=backend --label io.podman.compose.config-hash=123 --label io.podman.compose.project=perseus --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=perseus --label com.docker.compose.project.working_dir=/home/[email protected]/FOSS/Perseus --label com.docker.compose.project.config_files=docker-compose.yaml --label com.docker.compose.container-number=1 --label com.docker.compose.service=backend -e PERSEUS_ENV=Docker --net perseus_default --network-alias backend -p 5000:5000 perseus_backend
exit code: 125
['podman', 'network', 'exists', 'perseus_default']
podman create --name=white-rabbit --label io.podman.compose.config-hash=123 --label io.podman.compose.project=perseus --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=perseus --label com.docker.compose.project.working_dir=/home/[email protected]/FOSS/Perseus --label com.docker.compose.project.config_files=docker-compose.yaml --label com.docker.compose.container-number=1 --label com.docker.compose.service=white-rabbit -e SPRING_PROFILES_ACTIVE=docker --net perseus_default --network-alias white-rabbit -p 8000:8000 perseus_white-rabbit
exit code: 125
['podman', 'network', 'exists', 'perseus_default']
Starting OpenBSD Secure Shell server: sshd.

PostgreSQL Database directory appears to contain a database; Skipping initialization

Starting OpenBSD Secure Shell server: sshd.

PostgreSQL Database directory appears to contain a database; Skipping initialization

,--.   ,--. ,--.      ,--.   ,--.             ,------.           ,--.    ,--.    ,--.   ,--.
|  |   |  | |  ,---.  `--' ,-'  '-.  ,---.    |  .--. '  ,--,--. |  |-.  |  |-.  `--' ,-'  '-.
|  |.'.|  | |  .-.  | ,--. '-.  .-' | .-. :   |  '--'.' ' ,-.  | | .-. ' | .-. ' ,--. '-.  .-'
|   ,'.   | |  | |  | |  |   |  |   \   --.   |  |\  \  \ '-'  | | `-' | | `-' | |  |   |  |
'--'   '--' `--' `--' `--'   `--'    `----'   `--' '--'  `--`--'  `---'   `---'  `--'   `--'

WhiteRabbitService 0.4.0
Powered by Spring Boot 2.6.3
2022-09-29 20:08:03.576  INFO 1 --- [           main] c.a.w.WhiteRabbitServiceApplication      : Starting WhiteRabbitServiceApplication v0.4.0 using Java 17.0.2 on 46a72233bf0d with PID 1 (/app.jar started by root in /)
2022-09-29 20:08:03.578  INFO 1 --- [           main] c.a.w.WhiteRabbitServiceApplication      : The following profiles are active: docker
2022-09-29 20:08:04.538  INFO 1 --- [           main] .s.d.r.c.RepositoryConfigurationDelegate : Bootstrapping Spring Data JPA repositories in DEFAULT mode.
2022-09-29 20:08:04.723  INFO 1 --- [           main] .s.d.r.c.RepositoryConfigurationDelegate : Finished Spring Data repository scanning in 176 ms. Found 9 JPA repository interfaces.
2022-09-29 20:08:05.539  INFO 1 --- [           main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat initialized with port(s): 8000 (http)
2022-09-29 20:08:05.555  INFO 1 --- [           main] o.apache.catalina.core.StandardService   : Starting service [Tomcat]
2022-09-29 20:08:05.556  INFO 1 --- [           main] org.apache.catalina.core.StandardEngine  : Starting Servlet engine: [Apache Tomcat/9.0.56]
2022-09-29 20:08:05.617  INFO 1 --- [           main] o.a.c.c.C.[.[localhost].[/white-rabbit]  : Initializing Spring embedded WebApplicationContext
2022-09-29 20:08:05.617  INFO 1 --- [           main] w.s.c.ServletWebServerApplicationContext : Root WebApplicationContext: initialization completed in 1992 ms
2022-09-29 20:08:05.848  INFO 1 --- [           main] o.hibernate.jpa.internal.util.LogHelper  : HHH000204: Processing PersistenceUnitInfo [name: default]
2022-09-29 20:08:05.964  INFO 1 --- [           main] org.hibernate.Version                    : HHH000412: Hibernate ORM core version 5.6.4.Final
2022-09-29 20:08:06.196  INFO 1 --- [           main] o.hibernate.annotations.common.Version   : HCANN000001: Hibernate Commons Annotations {5.1.2.Final}
2022-09-29 20:08:06.367  INFO 1 --- [           main] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Starting...
2022-09-29 20:08:06.681  INFO 1 --- [           main] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Start completed.
2022-09-29 20:08:06.702  INFO 1 --- [           main] org.hibernate.dialect.Dialect            : HHH000400: Using dialect: org.hibernate.dialect.PostgreSQLDialect
App config initialized
2022-09-29 20:08:08.389  INFO 1 --- [           main] o.h.e.t.j.p.i.JtaPlatformInitiator       : HHH000490: Using JtaPlatform implementation: [org.hibernate.engine.transaction.jta.platform.internal.NoJtaPlatform]
2022-09-29 20:08:08.400  INFO 1 --- [           main] j.LocalContainerEntityManagerFactoryBean : Initialized JPA EntityManagerFactory for persistence unit 'default'

,------.              ,--.               ,-----.                      ,--. ,--.   ,--.               ,------.                    ,--.      ,--.                                ,--.
|  .-.  \   ,--,--. ,-'  '-.  ,--,--.   '  .-.  '   ,--.,--.  ,--,--. |  | `--' ,-'  '-. ,--. ,--.   |  .-.  \   ,--,--.  ,---.  |  ,---.  |  |-.   ,---.   ,--,--. ,--.--.  ,-|  |
|  |  \  : ' ,-.  | '-.  .-' ' ,-.  |   |  | |  |   |  ||  | ' ,-.  | |  | ,--. '-.  .-'  \  '  /    |  |  \  : ' ,-.  | (  .-'  |  .-.  | | .-. ' | .-. | ' ,-.  | |  .--' ' .-. |
|  '--'  / \ '-'  |   |  |   \ '-'  |   '  '-'  '-. '  ''  ' \ '-'  | |  | |  |   |  |     \   '     |  '--'  / \ '-'  | .-'  `) |  | |  | | `-' | ' '-' ' \ '-'  | |  |    \ `-' |
`-------'   `--`--'   `--'    `--`--'    `-----'--'  `----'   `--`--' `--' `--'   `--'   .-'  /      `-------'   `--`--' `----'  `--' `--'  `---'   `---'   `--`--' `--'     `---'
                                                                                         `---'
DataQualityDashboard 0.0.4
Powered by Spring Boot 2.6.3
2022-09-29 20:08:09.316  INFO 1 --- [           main] c.a.D.DataQualityDashboardApplication    : Starting DataQualityDashboardApplication v0.0.4 using Java 17-ea on 68e203986cb1 with PID 1 (/app.jar started by root in /)
2022-09-29 20:08:09.320  INFO 1 --- [           main] c.a.D.DataQualityDashboardApplication    : The following profiles are active: docker
2022-09-29 20:08:09.410  WARN 1 --- [           main] JpaBaseConfiguration$JpaWebConfiguration : spring.jpa.open-in-view is enabled by default. Therefore, database queries may be performed during view rendering. Explicitly configure spring.jpa.open-in-view to disable this warning
2022-09-29 20:08:09.802  INFO 1 --- [           main] c.a.c.i.jackson.JacksonVersion           : Package versions: jackson-annotations=2.13.1, jackson-core=2.13.1, jackson-databind=2.13.1, jackson-dataformat-xml=2.13.1, jackson-datatype-jsr310=2.13.1, azure-core=1.26.0, Troubleshooting version conflicts: https://aka.ms/azsdk/java/dependency/troubleshoot
2022-09-29 20:08:09.883  INFO 1 --- [           main] AbstractAzureServiceClientBuilderFactory : Will configure the default credential of type DefaultAzureCredential for class com.azure.identity.DefaultAzureCredentialBuilder.
2022-09-29 20:08:10.342  INFO 1 --- [           main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat started on port(s): 8000 (http) with context path '/white-rabbit'
2022-09-29 20:08:10.359  INFO 1 --- [           main] c.a.w.WhiteRabbitServiceApplication      : Started WhiteRabbitServiceApplication in 7.337 seconds (JVM running for 7.835)
2022-09-29 20:08:10.574  INFO 1 --- [           main] .s.d.r.c.RepositoryConfigurationDelegate : Bootstrapping Spring Data JPA repositories in DEFAULT mode.
2022-09-29 20:08:10.762  INFO 1 --- [           main] .s.d.r.c.RepositoryConfigurationDelegate : Finished Spring Data repository scanning in 177 ms. Found 3 JPA repository interfaces.
2022-09-29 20:08:11.294  INFO 1 --- [           main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat initialized with port(s): 8001 (http)
2022-09-29 20:08:11.305  INFO 1 --- [           main] o.apache.catalina.core.StandardService   : Starting service [Tomcat]
2022-09-29 20:08:11.305  INFO 1 --- [           main] org.apache.catalina.core.StandardEngine  : Starting Servlet engine: [Apache Tomcat/9.0.56]
2022-09-29 20:08:11.347  INFO 1 --- [           main] o.a.c.c.C.[.[.[/data-quality-dashboard]  : Initializing Spring embedded WebApplicationContext
2022-09-29 20:08:11.347  INFO 1 --- [           main] w.s.c.ServletWebServerApplicationContext : Root WebApplicationContext: initialization completed in 1946 ms
2022-09-29 20:08:11.532  INFO 1 --- [           main] o.hibernate.jpa.internal.util.LogHelper  : HHH000204: Processing PersistenceUnitInfo [name: default]
2022-09-29 20:08:11.585  INFO 1 --- [           main] org.hibernate.Version                    : HHH000412: Hibernate ORM core version 5.6.4.Final
2022-09-29 20:08:11.772  INFO 1 --- [           main] o.hibernate.annotations.common.Version   : HCANN000001: Hibernate Commons Annotations {5.1.2.Final}
2022-09-29 20:08:11.896  INFO 1 --- [           main] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Starting...
2022-09-29 20:08:12.100  INFO 1 --- [           main] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Start completed.
2022-09-29 20:08:12.119  INFO 1 --- [           main] org.hibernate.dialect.Dialect            : HHH000400: Using dialect: org.hibernate.dialect.PostgreSQLDialect
2022-09-29 20:08:12.906  INFO 1 --- [           main] o.h.e.t.j.p.i.JtaPlatformInitiator       : HHH000490: Using JtaPlatform implementation: [org.hibernate.engine.transaction.jta.platform.internal.NoJtaPlatform]
2022-09-29 20:08:12.916  INFO 1 --- [           main] j.LocalContainerEntityManagerFactoryBean : Initialized JPA EntityManagerFactory for persistence unit 'default'
2022-09-29 20:08:13.423  WARN 1 --- [           main] JpaBaseConfiguration$JpaWebConfiguration : spring.jpa.open-in-view is enabled by default. Therefore, database queries may be performed during view rendering. Explicitly configure spring.jpa.open-in-view to disable this warning
2022-09-29 20:08:13.617  INFO 1 --- [           main] o.s.b.a.w.s.WelcomePageHandlerMapping    : Adding welcome page: class path resource [static/index.html]
2022-09-29 20:08:13.747  INFO 1 --- [           main] c.a.c.i.jackson.JacksonVersion           : Package versions: jackson-annotations=2.13.1, jackson-core=2.13.1, jackson-databind=2.13.1, jackson-dataformat-xml=2.13.1, jackson-datatype-jsr310=2.13.1, azure-core=1.26.0, Troubleshooting version conflicts: https://aka.ms/azsdk/java/dependency/troubleshoot
2022-09-29 20:08:13.813  INFO 1 --- [           main] AbstractAzureServiceClientBuilderFactory : Will configure the default credential of type DefaultAzureCredential for class com.azure.identity.DefaultAzureCredentialBuilder.
2022-09-29 20:08:14.058  INFO 1 --- [           main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat started on port(s): 8001 (http) with context path '/data-quality-dashboard'
2022-09-29 20:08:14.066  INFO 1 --- [           main] c.a.D.DataQualityDashboardApplication    : Started DataQualityDashboardApplication in 5.804 seconds (JVM running for 7.004)
2022-09-29 20:08:14.087  INFO 1 --- [           main] ConditionEvaluationReportLoggingListener : 

Error starting ApplicationContext. To display the conditions report re-run your application with 'debug' enabled.
2022-09-29 20:08:14.102 ERROR 1 --- [           main] o.s.boot.SpringApplication               : Application run failed

java.lang.IllegalStateException: Failed to execute CommandLineRunner
	at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:772) ~[spring-boot-2.6.3.jar!/:2.6.3]
	at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:753) ~[spring-boot-2.6.3.jar!/:2.6.3]
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:309) ~[spring-boot-2.6.3.jar!/:2.6.3]
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:1303) ~[spring-boot-2.6.3.jar!/:2.6.3]
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:1292) ~[spring-boot-2.6.3.jar!/:2.6.3]
	at com.arcadia.DataQualityDashboard.DataQualityDashboardApplication.main(DataQualityDashboardApplication.java:17) ~[classes!/:0.0.4]
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:na]
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:78) ~[na:na]
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:na]
	at java.base/java.lang.reflect.Method.invoke(Method.java:568) ~[na:na]
	at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:49) ~[app.jar:0.0.4]
	at org.springframework.boot.loader.Launcher.launch(Launcher.java:108) ~[app.jar:0.0.4]
	at org.springframework.boot.loader.Launcher.launch(Launcher.java:58) ~[app.jar:0.0.4]
	at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:88) ~[app.jar:0.0.4]
Caused by: com.arcadia.DataQualityDashboard.service.error.RException: Cannot connect: r-serve
	at com.arcadia.DataQualityDashboard.service.r.RConnectionCreatorImpl.createRConnection(RConnectionCreatorImpl.java:70) ~[classes!/:0.0.4]
	at com.arcadia.DataQualityDashboard.DataQualityDashboardApplication.lambda$run$0(DataQualityDashboardApplication.java:24) ~[classes!/:0.0.4]
	at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:769) ~[spring-boot-2.6.3.jar!/:2.6.3]
	... 13 common frames omitted
Caused by: org.rosuda.REngine.Rserve.RserveException: Cannot connect: r-serve
	at org.rosuda.REngine.Rserve.RConnection.<init>(RConnection.java:90) ~[Rserve-1.8.1.jar!/:na]
	at org.rosuda.REngine.Rserve.RConnection.<init>(RConnection.java:60) ~[Rserve-1.8.1.jar!/:na]
	at com.arcadia.DataQualityDashboard.service.r.RConnectionCreatorImpl.createRConnection(RConnectionCreatorImpl.java:56) ~[classes!/:0.0.4]
	... 15 common frames omitted
Caused by: java.net.UnknownHostException: r-serve
	at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:567) ~[na:na]
	at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:331) ~[na:na]
	at java.base/java.net.Socket.connect(Socket.java:630) ~[na:na]
	at java.base/java.net.Socket.connect(Socket.java:581) ~[na:na]
	at java.base/java.net.Socket.<init>(Socket.java:505) ~[na:na]
	at java.base/java.net.Socket.<init>(Socket.java:285) ~[na:na]
	at org.rosuda.REngine.Rserve.RConnection.<init>(RConnection.java:85) ~[Rserve-1.8.1.jar!/:na]
	... 17 common frames omitted

2022-09-29 20:08:14.115  INFO 1 --- [           main] j.LocalContainerEntityManagerFactoryBean : Closing JPA EntityManagerFactory for persistence unit 'default'
2022-09-29 20:08:14.117  INFO 1 --- [           main] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Shutdown initiated...
2022-09-29 20:08:14.121  INFO 1 --- [           main] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Shutdown completed.
10.89.0.1 - - [29/Sep/2022:20:12:05 +0000] "GET / HTTP/1.1" 499 0 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.5112.115 Safari/537.36" "-"
10.89.0.1 - - [29/Sep/2022:20:13:05 +0000] "GET / HTTP/1.1" 504 569 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.5112.115 Safari/537.36" "-"
podman create --name=athena --label io.podman.compose.config-hash=123 --label io.podman.compose.project=perseus --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=perseus --label com.docker.compose.project.working_dir=/home/jzelko3@[REDACTED]/FOSS/Perseus --label com.docker.compose.project.config_files=docker-compose.yaml --label com.docker.compose.container-number=1 --label com.docker.compose.service=athena -e ATHENA_ENV=Docker --net perseus_default --network-alias athena -p 5002:5002 perseus_athena
exit code: 125
['podman', 'network', 'exists', 'perseus_default']
podman create --name=cdm-builder --label io.podman.compose.config-hash=123 --label io.podman.compose.project=perseus --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=perseus --label com.docker.compose.project.working_dir=/home/jzelko3@[REDACTED]/FOSS/Perseus --label com.docker.compose.project.config_files=docker-compose.yaml --label com.docker.compose.container-number=1 --label com.docker.compose.service=cdm-builder -e ASPNETCORE_ENVIRONMENT=Docker --net perseus_default --network-alias cdm-builder -p 9000:9000 perseus_cdm-builder
exit code: 125
['podman', 'network', 'exists', 'perseus_default']
podman create --name=usagi --label io.podman.compose.config-hash=123 --label io.podman.compose.project=perseus --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=perseus --label com.docker.compose.project.working_dir=/home/jzelko3@[REDACTED]/FOSS/Perseus --label com.docker.compose.project.config_files=docker-compose.yaml --label com.docker.compose.container-number=1 --label com.docker.compose.service=usagi -e USAGI_ENV=Docker --net perseus_default --network-alias usagi -p 5003:5003 perseus_usagi python /app/main.py
exit code: 125
['podman', 'network', 'exists', 'perseus_default']
podman create --name=data-quality-dashboard --label io.podman.compose.config-hash=123 --label io.podman.compose.project=perseus --label io.podman.compose.version=0.0.1 --label com.docker.compose.project=perseus --label com.docker.compose.project.working_dir=/home/jzelko3@[REDACTED]/FOSS/Perseus --label com.docker.compose.project.config_files=docker-compose.yaml --label com.docker.compose.container-number=1 --label com.docker.compose.service=data-quality-dashboard -e SPRING_PROFILES_ACTIVE=docker --net perseus_default --network-alias data-quality-dashboard -p 8001:8001 perseus_data-quality-dashboard
exit code: 125
podman start -a web
podman start -a shareddb
podman start -a frontend
podman start -a vocabularydb
podman start -a files-manager
podman start -a user
podman start -a solr
podman start -a r-serve
exit code: 125
podman start -a backend
podman start -a white-rabbit
podman start -a athena
podman start -a cdm-builder
podman start -a usagi
podman start -a data-quality-dashboard
exit code: 1
exit code: 139
2022-09-29 20:13:35.925  INFO 1 --- [ionShutdownHook] j.LocalContainerEntityManagerFactoryBean : Closing JPA EntityManagerFactory for persistence unit 'default'
2022-09-29 20:13:35.927  INFO 1 --- [ionShutdownHook] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Shutdown initiated...
2022-09-29 20:13:35.949  INFO 1 --- [ionShutdownHook] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Shutdown completed.

I am quite at a loss as to what to do. I doubt that it is podman-related, as podman is a drop-in replacement for Docker. It seems to be coming from a step in the build process, since r-serve seemingly fails to build.

Happy to report more on this so we can figure out what to do! Thanks for all the hard work so far!

~ tcp 🌳

Misc Notes

NOTE: There will be some interactive prompts that pop up during the compose process that you will need to navigate through
NOTE: This will take a tremendous amount of time to set up, so don't be concerned if it is slow
NOTE: For some reason, one cannot be connected to any VPN while running this stack or else there will be download errors.

[FEATURE REQUEST] Add a To-Do List to Repo

Users have requested future functionality/changes to support. As many of these are long-term projects, they should not be categorized as "issues" but as feature requests. To prevent the Issues tab from overflowing with feature requests, can we add a "To-Do" list to the README, or a Kanban board similar to the one on the SoftwareCountry repo?

Most highly requested features:

  • [ATHENA API] Allow calling functions such as: see version type, force pull most recent vocab version
  • [Databricks Connection] Support connection endpoint to databricks environment. Long term project
  • [Setup Support] Including a tutorial for standing up own instance of Perseus using Docker

Building DQ Dashboard takes over 24 hours

I'm not sure if this is a bug or a normal experience. Building the latest (master) commit of SoftwareCountry/DataQualityDashboard ran for over 24 hours before I accidentally killed it. After restarting, it has been running for several hours and appears to be stuck on step 15/15 of the build, `./mvnw package`. I am not on powerful hardware, but this seems excessive. I have already added `platform: linux/amd64` to the docker-compose, so I don't think it's a platform issue.

What is the normal experience for building the dashboard, does it normally take a long time?

[BUG] Can not cast as datetime (only timestamp)

Cannot cast a CSV column read in as Date type to datetime; I am able to cast it to timestamp. While this is probably sufficient, it's a bit confusing, as the Type is listed as Datetime in the target table for birth_datetime.
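As a workaround outside Perseus, the CSV column can be converted explicitly before loading. A minimal pandas sketch (column names `birth_date`/`birth_datetime` are assumptions for illustration, not Perseus behavior); pandas' `datetime64` values map to SQL timestamp/datetime when written to a database:

```python
import io
import pandas as pd

# Inline CSV standing in for a source file with a date-typed column
csv = io.StringIO("person_id,birth_date\n1,1980-05-17\n2,1992-11-03\n")
df = pd.read_csv(csv)

# read_csv leaves date-like strings as plain strings; to_datetime
# converts them to datetime64[ns], i.e. a full date + time value
df["birth_datetime"] = pd.to_datetime(df["birth_date"])
print(df["birth_datetime"].dt.strftime("%Y-%m-%d %H:%M:%S").tolist())
```

The midnight time component (`00:00:00`) is what a date-to-datetime cast produces in most SQL dialects as well.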


[BUG] build fails on apple silicon

I'm working through the build on an Apple Silicon Mac. Has anybody else tried this? The build is failing due to platform issues.

 => ERROR [internal] load metadata for docker.io/library/openjdk:17-alpine                                                                 0.2s
------
 > [internal] load metadata for docker.io/library/openjdk:17-alpine:
------
failed to solve with frontend dockerfile.v0: failed to create LLB definition: no match for platform in manifest sha256:4b6abae565492dbe9e7a894137c966a7485154238902f2f25e9dbd9784383d81: not found
ERROR: Service 'files-manager' failed to build : Build failed

I have a PR to fix this first issue. The build takes a while, though, so I will continue to update this if I run into any additional issues.
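For others hitting the same `no match for platform in manifest` error, the per-service workaround mentioned in the DQ Dashboard report above is a one-line compose override. A hedged sketch (the service name is taken from the error output; the `build` context path is an assumption for illustration):

```yaml
# docker-compose.yaml (fragment): force an amd64 image for the failing
# service when building on Apple Silicon (runs under Rosetta emulation).
services:
  files-manager:
    platform: linux/amd64
    build: ./files-manager
```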

(Med-High Prio) Bug: Request to "Convert to CDM" breaks, path missing

After connecting to an Azure SQL database, clicking "Convert to CDM" and sending the actual request runs into an error. It seems like part of the requests go through, and the server receives them (activity spikes), but no changes are actually finished and pushed to the server. The full error message is below.

Creating CDM database...
Warning: CDM database exists
Warning: CDM schema exists
Warning: CDM tables exists
CDM tables truncated
Loading locations...
Locations was loaded
Loading care sites...
Care sites was loaded
Loading providers...
Providers was loaded
Saving lookups...
Lookups was saved
CMSPlaceOfService - Loading ...
DONE - 73 ms | KeysCount=260
CMSPlaceOfService - Uploading to File Manager...
CMSPlaceOfService - Uploading DONE
Could not find a part of the path '/app/ETL/Common/Scripts/PostgreSQL/DropChunkTable.sql'.
