Comments (5)
- I could imagine wanting to represent REDCap data dictionaries in this format, but in practice the only time I really care about a REDCap project's representation is when it's in table form -- where each REDCap form corresponds to a table and each field to a column.
- I think the main advantage is that it allows for easier development of tooling around models.
- Yes, but I can imagine this being cumbersome unless it is absolutely necessary.
- Nothing comes to mind immediately, although a link to available formats (JSON, md, etc.) on the HTML (default?) view would be nice. It might also be nice to eventually track what downstream projects are using elements of these models.
- While potentially useful, I think it might be slightly more confusing. It might be worthwhile to dig in and focus on RDBMS models rather than trying to ALSO incorporate non-table-based models. I'm in favor of keeping things simple to begin with to establish use cases and workflow -- and build from there. I think thinking about these things is important and worthwhile, but people might hear "directed acyclic graph" and lose interest due to lack of familiarity. I'm in favor of keeping the bar low to encourage use.
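The table view described in the first point above (one table per REDCap form, one column per field) can be sketched roughly as follows. This is only an illustration: the row keys `form_name` and `field_name` and the sample rows are assumptions made for the example, not taken from an actual project.

```python
from collections import defaultdict


def forms_to_tables(dictionary_rows):
    """Group data-dictionary rows into {form: [field, ...]}, keeping row order."""
    tables = defaultdict(list)
    for row in dictionary_rows:
        # Each form becomes a table; each field becomes a column of that table.
        tables[row["form_name"]].append(row["field_name"])
    return dict(tables)


# Hypothetical data-dictionary rows, for illustration only.
rows = [
    {"form_name": "demographics", "field_name": "record_id"},
    {"form_name": "demographics", "field_name": "city"},
    {"form_name": "visit", "field_name": "visit_date"},
]

tables = forms_to_tables(rows)
for table, columns in tables.items():
    print(table, columns)
```

Forms that group cleanly like this are the "effectively just tables" case; anything with nested sections would need the path-based format instead.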
from data-models.
- If I wanted to have a complete Origins view of the PCGC data (complete as far as the data hub is concerned), I would want to encode three REDCap forms; however, my forms aren't actually hierarchical. If I encoded them, I could represent them as simple tables .... I wonder what percentage of REDCap forms are effectively just tables.
- If I were forced to use the non-tabular format for one thing, I might like to use it for everything. The non-tabular format is certainly not bad, but it might increase difficulty of uptake.
- Defining mappings would be needed to produce a full Origins view of the data, but unless there were some imperative requirement, it's hard to imagine doing that for PCGC.
- Re: current views, they are a little hard for me to look at and digest as a human being. I think I'm used to looking at this kind of data in tabular format (à la the csv source files).
- Re: the non-tabular format, it seems reasonable for its purpose (the only other alternative I can think of would be to replace path-name with name-type-parent so people could, for instance, define 'form' and 'section' names explicitly). I agree with Tyler that if this were the only format available, it would discourage use. At the moment I would lean toward allowing this format to live alongside the current one, if it were implemented.
BTW, so tables.csv for non-tables would look like this?
model | version | path | name | description |
---|---|---|---|---|
redcap_project | v1 | demographics | Demographics form | |
redcap_project | v1 | demographics | location | Location section within the Demographics form |
and then definitions/demographics.location.csv (or definitions/demographics/location.csv) might look like:
model | version | path | field | required | ref_path | ref_field | description |
---|---|---|---|---|---|---|---|
i2b2_for_pedsnet | v2 | demographics/location | city | No |
?
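To make the path-based idea concrete, here is a minimal sketch of how rows like the ones above could be folded into a tree of nodes. It assumes `/` as the path separator (as in `demographics/location`) and an empty path for top-level nodes; the function name, the node layout and the sample rows are hypothetical.

```python
def build_tree(rows):
    """Nest rows by path; each node is {"description": str, "children": dict}."""
    root = {"description": "", "children": {}}
    for row in rows:
        # The full location of a node is its parent path plus its own name.
        segments = [s for s in row["path"].split("/") if s] + [row["name"]]
        node = root
        for segment in segments:
            node = node["children"].setdefault(
                segment, {"description": "", "children": {}}
            )
        node["description"] = row["description"]
    return root


# Hypothetical rows, loosely modeled on the example tables above.
rows = [
    {"path": "", "name": "demographics", "description": "Demographics form"},
    {"path": "demographics", "name": "location",
     "description": "Location section within the Demographics form"},
    {"path": "demographics/location", "name": "city", "description": ""},
]

tree = build_tree(rows)
location = tree["children"]["demographics"]["children"]["location"]
print(sorted(location["children"]))
```

In this layout a form, a section and a field are all the same kind of node, distinguished only by their depth in the path.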
@tjrivera Thanks. I agree with the sentiment and the pragmatic view on what is necessary now. Getting into the space of modeling arbitrary data models may not be appropriate just yet. At the same time, I don't want to get into a situation where we need to force a data model into a table-based structure simply because that is all we support.
@murphyke Yes, except there would no longer be a difference between table vs. field, since they are just segments/nodes in the path. I agree that if this is rolled out, both formats should be supported.
That all being said, this certainly increases the complexity of defining and maintaining the data models, since the structure is less well-defined. This is not so much a technical challenge (generators, forms) as a representational one. The current views are organized around the fixed table-field relationship because its semantics are well defined. Allowing for multiple levels and different node types introduces arbitrary relationships, which makes it difficult or impossible to represent all possible variations in a single view. For this reason, a model type would likely need to be introduced to toggle the representation layers.
My gut tells me this is a rabbit hole we are not quite ready for.
Having only just seen this, I have to say that I also feel this is more trouble than it would be worth at this point. I like that the repo is currently easy to describe and fear that more complexity could scare away potential contributors/users.
If this is requested in the future, we will start the conversation again.