Framework and modular code for creating an analysis-ready dataset in the IDI. NOTE (June 2019): the code is currently being updated in the IDI to work on SAS Grid and Windows 10.
The data foundation requires a library declaration file to be run first, to define the sand library (and the other IDI libraries) used in the code. This file has to be either added as a dependency or explicitly invoked from the si_main.sas script.
The SASAUTOS macro si_subset_idi_dataset uses a hard-coded schema name, DL-MAA2016-15. The macro therefore works when users within the SIA project run the code to create qualifications data, but it fails for anybody outside that project.
si_create_rollup_vars.sas: the parameter si_aggs_cols currently needs to be set to department. The macro doesn't work if no roll-up level is chosen, i.e. if we want to roll up by snz_uid only.
For example, choosing si_aggs_cols=%str(snz_uid) fails.
Instead of having equally sized periods for all roll-up variables, it would be better to have variable-sized periods closer to the "as-at" date. For example, having variables 3 months before & after the as-at date, 6 months before & after, 12 months before & after, etc.
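The variable-sized periods suggested above could be generated along the following lines. This is only a sketch of the date logic in Python, not part of the SAS macros: the function names, the window labels and the choice of cumulative windows (each one starting or ending at the as-at date) are assumptions made for illustration.

```python
from datetime import date
import calendar


def add_months(d, months):
    """Shift a date by whole months, clamping the day to the end of the target month."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def rollup_windows(as_at, sizes_in_months=(3, 6, 12)):
    """Return (label, start, end) windows before and after the as-at date.

    Assumption: windows are cumulative, so the 6-month window contains
    the 3-month window, and so on.
    """
    windows = []
    for n in sizes_in_months:
        windows.append((f"before_{n}m", add_months(as_at, -n), as_at))
        windows.append((f"after_{n}m", as_at, add_months(as_at, n)))
    return windows


if __name__ == "__main__":
    for label, start, end in rollup_windows(date(2019, 6, 30)):
        print(label, start, end)
```

If non-overlapping periods are preferred (0 to 3 months, 3 to 6 months, 6 to 12 months), the start of each window would be the end of the previous one instead of the as-at date.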
The SI data foundation takes too long to roll up events with far-future end dates, because it essentially breaks each event up into periods of the specified period duration. This is a problem when the end date is 31-12-9999. In such cases the end date needs to default to the current date or the date of the IDI refresh.
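To illustrate why the 31-12-9999 end date is costly and what the proposed default would do, here is a small Python sketch. It is not the SAS implementation: the function names, the fixed-length period in days and the way the default date is passed in are all assumptions.

```python
from datetime import date

# Placeholder end date used for open-ended events.
HIGH_DATE = date(9999, 12, 31)


def cap_end_date(end_date, default_date):
    """Replace the 31-12-9999 placeholder with a default date.

    default_date is assumed to be the current date or the IDI refresh date.
    """
    return default_date if end_date == HIGH_DATE else end_date


def count_periods(start, end, period_days):
    """Number of fixed-length periods needed to cover start..end inclusive."""
    total_days = (end - start).days + 1
    return -(-total_days // period_days)  # ceiling division


if __name__ == "__main__":
    start = date(2015, 1, 1)
    refresh = date(2019, 6, 30)  # hypothetical refresh date
    uncapped = count_periods(start, HIGH_DATE, 30)
    capped = count_periods(start, cap_end_date(HIGH_DATE, refresh), 30)
    print(uncapped, capped)
```

An open-ended event is split into far more periods when the placeholder is left in place than when the end date is capped first, which is where the roll-up time goes.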
Around 20 files in ./sasautos/ and other folders of this repo have the SIA's IDI_Sandpit schema (DL-MAA2016-15) hard-coded into them. This needs to be changed to a schema available to the project running the code, since other projects can't write to the SIA schema. The long-term solution would be to parameterise the schema; in the short term, perhaps just a note in the readme alerting people to the need to make these changes.
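Until the schema is parameterised, a small helper script can list the places that need changing. The Python sketch below is not part of this repo: the function name, the default file suffixes and the assumption that the schema always appears as the literal text DL-MAA2016-15 are illustrative only.

```python
from pathlib import Path

# Schema name that is hard-coded in the repo, taken from the note above.
HARD_CODED_SCHEMA = "DL-MAA2016-15"


def find_hard_coded(root, needle=HARD_CODED_SCHEMA, suffixes=(".sas", ".sql")):
    """Return (path, [line numbers]) for each file under root containing needle."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        lines = [
            number
            for number, line in enumerate(text.splitlines(), start=1)
            if needle in line
        ]
        if lines:
            hits.append((path, lines))
    return hits


if __name__ == "__main__":
    # Run from the root of the repo to list every file and line to review.
    for path, lines in find_hard_coded("."):
        print(path, lines)
```

The script only reports locations; each hit still needs to be edited by hand to use the schema your own project can write to.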