steno-aarhus / ukbaid

Aid Steno Researchers Who Work on the UKB RAP.

Home Page: https://steno-aarhus.github.io/ukbAid/

License: Other

R 94.57% JavaScript 3.83% SCSS 1.60%
data-extraction data-import reproducibility ukbiobank

ukbaid's Issues

Documentation of variables

Below is a list of covariates commonly used in health research. We need to write code to include each variable, and the covariates will be split between us:

  1. Smoking @Fie-Langmann
  2. Alcohol @nielsbock
  3. Age - @arborzhang
  4. Sex - Already coded
  5. Education - @arborzhang
  6. Income - @arborzhang
  7. Physical activity - Christina C. Dahm
  8. Townsend Deprivation Score - @Fie-Langmann
  9. Ethnicity - @danielibsen
  10. Baseline medication - Daniel Witte
  11. Anthropometry
    • BMI @nunonog
    • WC @nunonog
    • Fat mass @nunonog
    • Lean body mass @nunonog
    • Bioimpedance measures
    • Leg length
    • Sitting height
    • Weight
  12. Diet - @danielibsen
  13. Diabetes - Daniel Witte
    • Hba1c
  14. Lipids @arborzhang
  15. Family history of diseases @Fie-Langmann

Add in FAQ about cli version

If you get an error about "cli out of date, is already loaded", terminate the RStudio session, open a new one, and start again from scratch.

create dataset using ukbAid

After adding variables whose entries start with quotes (e.g., p93, "Systolic blood pressure, manual reading"), the dataset could not be created. The error message received was:
failed to export data: Field name(s) not found: 'p93_i0', 'p94_i0', 'p95_i0'] Please check job logs and error files for more details.
@lwjohnst86
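
A quick pre-check before starting the export job might help narrow down which requested names the dataset does not contain. The following is a minimal sketch in base R, not part of ukbAid: `find_missing_fields()` is a hypothetical helper, and both vectors are illustrative values that would have to be replaced with the names from your own project.

```r
# Hypothetical helper (not part of ukbAid): return the requested field names
# that are absent from the names known to exist in the dataset.
find_missing_fields <- function(requested, available) {
  setdiff(requested, available)
}

# Illustrative values only: the real list of available names has to come
# from your own project.
requested <- c("p31", "p93_i0", "p94_i0", "p95_i0")
available <- c("p31", "p93_i0_a0", "p93_i0_a1", "p94_i0_a0", "p94_i0_a1")

not_found <- find_missing_fields(requested, available)
print(not_found)
```

Any name that comes back from this check would likely trigger the same "Field name(s) not found" error if it were passed to the export.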

Things to include in the guidelines/project application

  • Justify why you should be on this project
  • Who is keeping an eye on you/is your project part of PhD or side project?
  • Does your project fall under the application?
  • What's your connection to Steno?
  • (Give note that all this information will be public)
  • Name of your supervisor (if PhD)
  • What is your primary affiliation?
  • Have you read the documentation?
  • Do you agree to the conditions of participating in this project?

Code review notes and things to add to the documentation

Structured:

  • First step is seeing what people have issues with
  • Second step, select a repo (a volunteer's if someone offers, otherwise chosen at random) and go through it from the start (e.g. targets) and the first code
  • During each session, identify which composite variables could be converted to a ukbAid function

Notes:

  • After sourcing and moving to project (when starting RAP), make sure to restart R session to refresh things
  • Start with a random selection of people and a larger number of variables, then narrow the variable list down to what is needed
  • Update the diagram of phases to show selecting a larger list of variables and looking into it with a random sample before doing the final selection. This could be done before finishing the protocol, in the checklist
  • When making new datasets and uploading them to RAP, delete old datasets with same name
  • Documentation on processing data and saving back to Parquet format
  • When needing help or having an issue, write an Issue in your own repo and include an @username mention of me and others that might be relevant
  • Links to how to insert bibliography references into a Quarto doc
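
The random-sample note above could be sketched roughly as follows. This is an illustration in base R under stated assumptions: `sample_participants()` is a hypothetical helper (not a ukbAid function), and the data frame is made-up data standing in for a wide extract with many candidate variables.

```r
# Hypothetical helper: keep a random subset of participants (rows) so that a
# wide table with many candidate variables stays small enough to explore.
sample_participants <- function(data, n, seed = 123) {
  set.seed(seed)               # fixed seed so the same subset is drawn each time
  n <- min(n, nrow(data))      # never ask for more rows than exist
  data[sample.int(nrow(data), n), , drop = FALSE]
}

# Made-up data standing in for a wide extract.
wide <- data.frame(
  eid = 1:1000,
  p31 = sample(c("Female", "Male"), 1000, replace = TRUE),
  p34 = sample(1937:1970, 1000, replace = TRUE)
)

subset_rows <- sample_participants(wide, n = 100)
dim(subset_rows)
```

Exploring the subset first (for instance with histograms and tables) keeps the session light while the variable list is still being narrowed down.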

Tasks:

  • @Fie-Langmann will start making/brainstorming how to combine smoking/tobacco status
  • Exploration Quarto doc for help with variable selection phase, like with histograms and tables

Info to get

  • What is the UK Biobank statement they suggest to include in manuscript for this project? Including number. (@AlisaDK)

Remove rspm from imports

rspm should actually be using ppm, and it doesn't install on Windows. We don't really need it anyway.

Documentation to include

  • We have obligations, since we've submitted the project under the AU and Midt records (fortegnelser) and to the ethics boards, so we can't accept everyone

Error when running the "create-data.R" commands in RAP

When I try to create a .csv file with the variables I plan to use in my analysis, I get an error saying that the field names of data-fields with arrays are not found (see code/output below). Also, when I look up the field IDs of the data-fields causing the errors, the number of arrays doesn't fit. In the error message below, p20107 is listed with 10 arrays (0-9), which is also the case in the UKB data showcase (https://biobank.ndph.ox.ac.uk/ukb/field.cgi?id=20107). This does seem a bit odd, however, as the showcase has more than 10 different categories for the data-field. When I look up some of the other field IDs in the showcase, I find the same issue: the number of arrays for a given data-field in project-variables_original.csv doesn't match the showcase.
The main problem is that the error prevents me from creating my dataset and starting the actual statistical analyses on UKB data.

Below is a copy of the console content from my failed try to create a dataset in the RAP:

library(magrittr)
ukbAid::get_username()
[1] "FieLangmann"
readr::read_csv(here::here("data-raw/rap-variables.csv")) %>%
  dplyr::pull(field_id) %>%
  ukbAid::create_csv_from_database()

Rows: 1588 Columns: 3
── Column specification ──────────────────────────────────────────────────────────
Delimiter: ","
chr (3): field_id, rap_variable_name, id

ℹ Use spec() to retrieve the full column specification for this data.
ℹ Specify the column types or set show_col_types = FALSE to quiet this message.
ℹ Started extracting the variables and converting to CSV.
! This function runs for quite a while, at least 5 minutes or more. Please be patient to let it finish.
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/dxpy/scripts/dx.py", line 2858, in run_one
dxexecution.wait_on_done()
File "/usr/local/lib/python3.8/dist-packages/dxpy/bindings/dxjob.py", line 283, in wait_on_done
raise DXJobFailureError(err_msg)
dxpy.exceptions.DXJobFailureError: Job has failed because of AppError: Failed to export data: Field name(s) not found: ['p20107_i0_a0', 'p20107_i0_a1', 'p20107_i0_a2', 'p20107_i0_a3', 'p20107_i0_a4', 'p20107_i0_a5', 'p20107_i0_a6', 'p20107_i0_a7', 'p20107_i0_a8', 'p20107_i0_a9', 'p20107_i1_a0', 'p20107_i1_a1', 'p20107_i1_a2', 'p20107_i1_a3', 'p20107_i1_a4', 'p20107_i1_a5', 'p20107_i1_a6', 'p20107_i1_a7', 'p20107_i1_a8', 'p20107_i1_a9', 'p20107_i2_a0', 'p20107_i2_a1', 'p20107_i2_a2', 'p20107_i2_a3', 'p20107_i2_a4', 'p20107_i2_a5', 'p20107_i2_a6', 'p20107_i2_a7', 'p20107_i2_a8', 'p20107_i2_a9', 'p20107_i3_a0', 'p20107_i3_a1', 'p20107_i3_a2', 'p20107_i3_a3', 'p20107_i3_a4', 'p20107_i3_a5', 'p20107_i3_a6', 'p20107_i3_a7', 'p20107_i3_a8', 'p20107_i3_a9', 'p20110_i0_a0', 'p20110_i0_a1', 'p20110_i0_a10', 'p20110_i0_a2', 'p20110_i0_a3', 'p20110_i0_a4', 'p20110_i0_a5', 'p20110_i0_a6', 'p20110_i0_a7', 'p20110_i0_a8', 'p20110_i0_a9', 'p20110_i1_a0', 'p20110_i1_a1', 'p20110_i1_a10', 'p20110_i1_a2', 'p20110_i1_a3', 'p20110_i1_a4', 'p20110_i1_a5', 'p20110_i1_a6', 'p20110_i1_a7', 'p20110_i1_a8', 'p20110_i1_a9', 'p20110_i2_a0', 'p20110_i2_a1', 'p20110_i2_a10', 'p20110_i2_a2', 'p20110_i2_a3', 'p20110_i2_a4', 'p20110_i2_a5', 'p20110_i2_a6', 'p20110_i2_a7', 'p20110_i2_a8', 'p20110_i2_a9', 'p20110_i3_a0', 'p20110_i3_a1', 'p20110_i3_a10', 'p20110_i3_a2', 'p20110_i3_a3', 'p20110_i3_a4', 'p20110_i3_a5', 'p20110_i3_a6', 'p20110_i3_a7', 'p20110_i3_a8', 'p20110_i3_a9', 'p20111_i0_a0', 'p20111_i0_a1', 'p20111_i0_a10', 'p20111_i0_a11', 'p20111_i0_a2', 'p20111_i0_a3', 'p20111_i0_a4', 'p20111_i0_a5', 'p20111_i0_a6', 'p20111_i0_a7', 'p20111_i0_a8', 'p20111_i0_a9', 'p20111_i1_a0', 'p20111_i1_a1', 'p20111_i1_a10', 'p20111_i1_a11', 'p20111_i1_a2', 'p20111_i1_a3', 'p20111_i1_a4', 'p20111_i1_a5', 'p20111_i1_a6', 'p20111_i1_a7', 'p20111_i1_a8', 'p20111_i1_a9', 'p20111_i2_a0', 'p20111_i2_a1', 'p20111_i2_a10', 'p20111_i2_a11', 'p20111_i2_a2', 'p20111_i2_a3', 'p20111_i2_a4', 'p20111_i2_a5', 
'p20111_i2_a6', 'p20111_i2_a7', 'p20111_i2_a8', 'p20111_i2_a9', 'p20111_i3_a0', 'p20111_i3_a1', 'p20111_i3_a10', 'p20111_i3_a11', 'p20111_i3_a2', 'p20111_i3_a3', 'p20111_i3_a4', 'p20111_i3_a5', 'p20111_i3_a6', 'p20111_i3_a7', 'p20111_i3_a8', 'p20111_i3_a9', 'p41270_a0', 'p41270_a1', 'p41270_a10', 'p41270_a100', 'p41270_a101', 'p41270_a102', 'p41270_a103', 'p41270_a104', 'p41270_a105', 'p41270_a106', 'p41270_a107', 'p41270_a108', 'p41270_a109', 'p41270_a11', 'p41270_a110', 'p41270_a111', 'p41270_a112', 'p41270_a113', 'p41270_a114', 'p41270_a115', 'p41270_a116', 'p41270_a117', 'p41270_a118', 'p41270_a119', 'p41270_a12', 'p41270_a120', 'p41270_a121', 'p41270_a122', 'p41270_a123', 'p41270_a124', 'p41270_a125', 'p41270_a126', 'p41270_a127', 'p41270_a128', 'p41270_a129', 'p41270_a13', 'p41270_a130', 'p41270_a131', 'p41270_a132', 'p41270_a133', 'p41270_a134', 'p41270_a135', 'p41270_a136', 'p41270_a137', 'p41270_a138', 'p41270_a139', 'p41270_a14', 'p41270_a140', 'p41270_a141', 'p41270_a142', 'p41270_a143', 'p41270_a144', 'p41270_a145', 'p41270_a146', 'p41270_a147', 'p41270_a148', 'p41270_a149', 'p41270_a15', 'p41270_a150', 'p41270_a151', 'p41270_a152', 'p41270_a153', 'p41270_a154', 'p41270_a155', 'p41270_a156', 'p41270_a157', 'p41270_a158', 'p41270_a159', 'p41270_a16', 'p41270_a160', 'p41270_a161', 'p41270_a162', 'p41270_a163', 'p41270_a164', 'p41270_a165', 'p41270_a166', 'p41270_a167', 'p41270_a168', 'p41270_a169', 'p41270_a17', 'p41270_a170', 'p41270_a171', 'p41270_a172', 'p41270_a173', 'p41270_a174', 'p41270_a175', 'p41270_a176', 'p41270_a177', 'p41270_a178', 'p41270_a179', 'p41270_a18', 'p41270_a180', 'p41270_a181', 'p41270_a182', 'p41270_a183', 'p41270_a184', 'p41270_a185', 'p41270_a186', 'p41270_a187', 'p41270_a188', 'p41270_a189', 'p41270_a19', 'p41270_a190', 'p41270_a191', 'p41270_a192', 'p41270_a193', 'p41270_a194', 'p41270_a195', 'p41270_a196', 'p41270_a197', 'p41270_a198', 'p41270_a199', 'p41270_a2', 'p41270_a20', 'p41270_a200', 'p41270_a201', 'p41270_a202', 
'p41270_a203', 'p41270_a204', 'p41270_a205', 'p41270_a206', 'p41270_a207', 'p41270_a208', 'p41270_a209', 'p41270_a21', 'p41270_a210', 'p41270_a211', 'p41270_a212', 'p41270_a213', 'p41270_a214', 'p41270_a215', 'p41270_a216', 'p41270_a217', 'p41270_a218', 'p41270_a219', 'p41270_a22', 'p41270_a220', 'p41270_a221', 'p41270_a222', 'p41270_a223', 'p41270_a224', 'p41270_a225', 'p41270_a226', 'p41270_a227', 'p41270_a228', 'p41270_a229', 'p41270_a23', 'p41270_a230', 'p41270_a231', 'p41270_a232', 'p41270_a233', 'p41270_a234', 'p41270_a235', 'p41270_a236', 'p41270_a237', 'p41270_a238', 'p41270_a239', 'p41270_a24', 'p41270_a240', 'p41270_a241', 'p41270_a242', 'p41270_a25', 'p41270_a26', 'p41270_a27', 'p41270_a28', 'p41270_a29', 'p41270_a3', 'p41270_a30', 'p41270_a31', 'p41270_a32', 'p41270_a33', 'p41270_a34', 'p41270_a35', 'p41270_a36', 'p41270_a37', 'p41270_a38', 'p41270_a39', 'p41270_a4', 'p41270_a40', 'p41270_a41', 'p41270_a42', 'p41270_a43', 'p41270_a44', 'p41270_a45', 'p41270_a46', 'p41270_a47', 'p41270_a48', 'p41270_a49', 'p41270_a5', 'p41270_a50', 'p41270_a51', 'p41270_a52', 'p41270_a53', 'p41270_a54', 'p41270_a55', 'p41270_a56', 'p41270_a57', 'p41270_a58', 'p41270_a59', 'p41270_a6', 'p41270_a60', 'p41270_a61', 'p41270_a62', 'p41270_a63', 'p41270_a64', 'p41270_a65', 'p41270_a66', 'p41270_a67', 'p41270_a68', 'p41270_a69', 'p41270_a7', 'p41270_a70', 'p41270_a71', 'p41270_a72', 'p41270_a73', 'p41270_a74', 'p41270_a75', 'p41270_a76', 'p41270_a77', 'p41270_a78', 'p41270_a79', 'p41270_a8', 'p41270_a80', 'p41270_a81', 'p41270_a82', 'p41270_a83', 'p41270_a84', 'p41270_a85', 'p41270_a86', 'p41270_a87', 'p41270_a88', 'p41270_a89', 'p41270_a9', 'p41270_a90', 'p41270_a91', 'p41270_a92', 'p41270_a93', 'p41270_a94', 'p41270_a95', 'p41270_a96', 'p41270_a97', 'p41270_a98', 'p41270_a99', 'p41271_a0', 'p41271_a1', 'p41271_a10', 'p41271_a11', 'p41271_a12', 'p41271_a13', 'p41271_a14', 'p41271_a15', 'p41271_a16', 'p41271_a17', 'p41271_a18', 'p41271_a19', 'p41271_a2', 'p41271_a20', 
'p41271_a21', 'p41271_a22', 'p41271_a23', 'p41271_a24', 'p41271_a25', 'p41271_a26', 'p41271_a27', 'p41271_a28', 'p41271_a29', 'p41271_a3', 'p41271_a30', 'p41271_a31', 'p41271_a32', 'p41271_a33', 'p41271_a34', 'p41271_a35', 'p41271_a36', 'p41271_a37', 'p41271_a38', 'p41271_a39', 'p41271_a4', 'p41271_a40', 'p41271_a41', 'p41271_a42', 'p41271_a43', 'p41271_a44', 'p41271_a45', 'p41271_a46', 'p41271_a5', 'p41271_a6', 'p41271_a7', 'p41271_a8', 'p41271_a9', 'p41272_a0', 'p41272_a1', 'p41272_a10', 'p41272_a100', 'p41272_a101', 'p41272_a102', 'p41272_a103', 'p41272_a104', 'p41272_a105', 'p41272_a106', 'p41272_a107', 'p41272_a108', 'p41272_a109', 'p41272_a11', 'p41272_a110', 'p41272_a111', 'p41272_a112', 'p41272_a113', 'p41272_a114', 'p41272_a115', 'p41272_a116', 'p41272_a117', 'p41272_a118', 'p41272_a119', 'p41272_a12', 'p41272_a120', 'p41272_a121', 'p41272_a122', 'p41272_a123', 'p41272_a13', 'p41272_a14', 'p41272_a15', 'p41272_a16', 'p41272_a17', 'p41272_a18', 'p41272_a19', 'p41272_a2', 'p41272_a20', 'p41272_a21', 'p41272_a22', 'p41272_a23', 'p41272_a24', 'p41272_a25', 'p41272_a26', 'p41272_a27', 'p41272_a28', 'p41272_a29', 'p41272_a3', 'p41272_a30', 'p41272_a31', 'p41272_a32', 'p41272_a33', 'p41272_a34', 'p41272_a35', 'p41272_a36', 'p41272_a37', 'p41272_a38', 'p41272_a39', 'p41272_a4', 'p41272_a40', 'p41272_a41', 'p41272_a42', 'p41272_a43', 'p41272_a44', 'p41272_a45', 'p41272_a46', 'p41272_a47', 'p41272_a48', 'p41272_a49', 'p41272_a5', 'p41272_a50', 'p41272_a51', 'p41272_a52', 'p41272_a53', 'p41272_a54', 'p41272_a55', 'p41272_a56', 'p41272_a57', 'p41272_a58', 'p41272_a59', 'p41272_a6', 'p41272_a60', 'p41272_a61', 'p41272_a62', 'p41272_a63', 'p41272_a64', 'p41272_a65', 'p41272_a66', 'p41272_a67', 'p41272_a68', 'p41272_a69', 'p41272_a7', 'p41272_a70', 'p41272_a71', 'p41272_a72', 'p41272_a73', 'p41272_a74', 'p41272_a75', 'p41272_a76', 'p41272_a77', 'p41272_a78', 'p41272_a79', 'p41272_a8', 'p41272_a80', 'p41272_a81', 'p41272_a82', 'p41272_a83', 'p41272_a84', 'p41272_a85', 
'p41272_a86', 'p41272_a87', 'p41272_a88', 'p41272_a89', 'p41272_a9', 'p41272_a90', 'p41272_a91', 'p41272_a92', 'p41272_a93', 'p41272_a94', 'p41272_a95', 'p41272_a96', 'p41272_a97', 'p41272_a98', 'p41272_a99', 'p6150_i0_a0', 'p6150_i0_a1', 'p6150_i0_a2', 'p6150_i0_a3', 'p6150_i1_a0', 'p6150_i1_a1', 'p6150_i1_a2', 'p6150_i1_a3', 'p6150_i2_a0', 'p6150_i2_a1', 'p6150_i2_a2', 'p6150_i2_a3', 'p6152_i0_a0', 'p6152_i0_a1', 'p6152_i0_a2', 'p6152_i0_a3', 'p6152_i0_a4', 'p6152_i1_a0', 'p6152_i1_a1', 'p6152_i1_a2', 'p6152_i1_a3', 'p6152_i1_a4', 'p6152_i2_a0', 'p6152_i2_a1', 'p6152_i2_a2', 'p6152_i2_a3', 'p6152_i2_a4', 'p6152_i3_a0', 'p6152_i3_a1', 'p6152_i3_a2', 'p6152_i3_a3', 'p6152_i3_a4'] Please check job logs and error files for more details.
dxpy.utils.resolver.ResolutionError: Unable to resolve "data-FieLangmann-leha.csv" to a data object or folder name in '/'
✔ Finished saving to CSV. Check "/mnt/project/users/FieLangmann" or the project folder on the RAP to see that it was created.
[1] "job-GY93488JqFgVJJzXZQV5x8q2" NA
Warning message:
In system(table_exporter_command, intern = TRUE) :
running command 'dx run app-table-exporter --brief --wait -y -idataset_or_cohort_or_dashboard=record-GJ3kvBQJbxZX8fxKJ62kgk0V -ifield_names='p31' -ifield_names='p34' -ifield_names='p52' -ifield_names='p190' -ifield_names='p191' -ifield_names='p738_i0' -ifield_names='p738_i1' -ifield_names='p738_i2' -ifield_names='p738_i3' -ifield_names='p1239_i0' -ifield_names='p1239_i1' -ifield_names='p1239_i2' -ifield_names='p1239_i3' -ifield_names='p1249_i0' -ifield_names='p1249_i1' -ifield_names='p1249_i2' -ifield_names='p1249_i3' -ifield_names='p1538_i0' -ifield_names='p1538_i1' -ifield_names='p1538_i2' -ifield_names='p1538_i3' -ifield_names='p1548_i0' -ifield_names='p1548_i1' -ifield_names='p1548_i2' -ifield_names='p1548_i3' -ifield_names='p2443_i0' -ifield_names='p2443_i1' -ifield_names='p2443_i2' -ifield_names='p2443_i3' -ifield_names='p2453_i0' -ifield_names='p2453_i1' -ifield_names='p2453_i2' -ifield_names='p2453_i3' -ifield_names='p2887_i0' -ifield_names='p2887_i1' -ifield_names='p2887_i2' - [... truncated]
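
To compare the array counts in the error message with the showcase, it may help to split the reported names into their parts. Below is a minimal sketch in base R: `parse_field_names()` is a hypothetical helper (not part of ukbAid), the naming pattern `p<field>_i<instance>_a<array>` is inferred from the names in the log above, and the input vector is a small hand-picked subset of those names.

```r
# Split RAP-style field names such as "p20107_i0_a9" or "p41270_a12" into
# field ID, instance, and array index. Instance and array parts are optional.
parse_field_names <- function(field_names) {
  pattern <- "^p([0-9]+)(_i([0-9]+))?(_a([0-9]+))?$"
  parts <- regmatches(field_names, regexec(pattern, field_names))
  # Empty capture groups (part not present) become NA.
  to_int <- function(x) if (nzchar(x)) as.integer(x) else NA_integer_
  data.frame(
    name = field_names,
    field_id = vapply(parts, function(p) to_int(p[2]), integer(1)),
    instance = vapply(parts, function(p) to_int(p[4]), integer(1)),
    array = vapply(parts, function(p) to_int(p[6]), integer(1))
  )
}

# A few names copied from the error message above.
not_found <- c(
  "p20107_i0_a0", "p20107_i0_a9", "p20110_i0_a10",
  "p41270_a0", "p41270_a242", "p6150_i2_a3"
)
parsed <- parse_field_names(not_found)

# Highest array index seen per field, for comparison with the UKB showcase.
max_array <- tapply(parsed$array, parsed$field_id, max)
print(max_array)
```

Running the same helper on the full list from the error message would give one row per missing name, which can then be checked field by field against the showcase.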
