public-health-scotland / slfhelper
An R package for working with the SLFs
Home Page: https://public-health-scotland.github.io/slfhelper/
License: Other
Add the ability to select a specific partnership and return only its data.
In the latest update, social care variables were added and variables were generally rearranged, so the variable list files need updating.
Additional tests and a rewrite may be required.
If we're returning more than one year's worth of data, we should always return the `year` variable, even if it wasn't specified in the `columns` argument.
Other cases would be when the `recid`, `partnership` etc. arguments are used: if we are returning more than one value, always return the filtering variable. The current code will extract `recid` etc. if it's needed for filtering, but then not return it if it wasn't specifically asked for in the `columns` parameter.
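A minimal sketch of the rule described above, in base R. The helper name `columns_to_return()` and the example values are hypothetical and only illustrate the idea; they are not taken from the package.

```r
# Hypothetical helper: work out which columns should be returned.
# `columns` is what the user asked for; `filters` is a named list of the
# filter arguments (e.g. year, recid, partnership), NULL when unused.
columns_to_return <- function(columns, filters) {
  # Filters that were supplied with more than one value
  is_multi <- vapply(filters, function(x) length(x) > 1, logical(1))
  multi <- names(filters)[is_multi]
  # Put them first, without duplicating columns already requested
  unique(c(multi, columns))
}

# Illustrative call: two years requested, a single recid, no partnership
columns_to_return(
  columns = c("anon_chi", "recid"),
  filters = list(year = c("1718", "1819"), recid = "01B", partnership = NULL)
)
```

Here `year` is added because two years were requested, while `recid` is returned only because it was asked for explicitly.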
Check whether the submitted values look correct by comparing them to the inbuilt lookups. We will need to add a lookup for recids, which would be useful in its own right. `stringdist` could be used, as in opendata, to suggest corrections.
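One way this check might look, sketched with base R's `utils::adist()` as a stand-in for the stringdist package mentioned above. The function name `check_values()` and the toy lookup are assumptions for illustration only.

```r
# Hypothetical checker: compare submitted values against a lookup and
# suggest the closest valid value for anything unrecognised.
check_values <- function(values, valid) {
  bad <- setdiff(values, valid)
  if (length(bad) == 0) {
    return(invisible(values))
  }
  # Edit distance between each bad value (rows) and each valid value (columns)
  distances <- utils::adist(bad, valid)
  suggestions <- valid[apply(distances, 1, which.min)]
  stop(
    "Unrecognised value(s): ",
    paste0("'", bad, "' (did you mean '", suggestions, "'?)", collapse = ", "),
    call. = FALSE
  )
}

# Toy lookup, not a real package lookup
toy_lookup <- c("acute", "maternity", "outpatient")
check_values("maternity", toy_lookup)
```

Swapping `utils::adist()` for a stringdist function would allow other distance measures, at the cost of an extra dependency.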
Could be done like the other filters, i.e. `filter(!is.na(anon_chi))`, but since all the missing CHIs appear at the start of the file it might be better to have an index of row numbers (which would need to be kept up to date; this could be checked with tests). The index could then be used to do `read_fst(from = <first_row_with_non_missing_chi>)`.
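A rough sketch of the index idea, assuming the fst package is installed. The index `first_chi_row`, its row numbers and the function name are all placeholders invented for illustration.

```r
# Hypothetical index: first row with a non-missing anon_chi, per year.
# The row numbers are placeholders, not real values.
first_chi_row <- c("1718" = 1001L, "1819" = 1201L)

read_without_missing_chi <- function(path, year) {
  from <- first_chi_row[[year]]
  # fst::read_fst() takes `from` (and `to`) to read only a range of rows
  fst::read_fst(path, from = from)
}
```

A test could then confirm that row `from - 1` has a missing CHI and row `from` does not, which would flag an out-of-date index.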
Use `fst::metadata_fst` to get the file size, then read in chunks. Read and filter each chunk sequentially to reduce overall memory usage (though probably slower than the current approach). Alternatively, read the chunks in parallel to improve read speed?
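The sequential version could be sketched as follows, assuming the fst package is installed. The function names, the `keep` argument and the default chunk size are assumptions made for illustration.

```r
# Split n_rows (assumed >= 1) into consecutive from/to ranges of at most
# chunk_size rows each.
chunk_ranges <- function(n_rows, chunk_size) {
  from <- seq(1, n_rows, by = chunk_size)
  to <- pmin(from + chunk_size - 1, n_rows)
  data.frame(from = from, to = to)
}

# Read and filter one chunk at a time, then combine the filtered pieces.
# `keep` is a function that takes a chunk and returns a logical row filter.
read_filtered_in_chunks <- function(path, keep, chunk_size = 1e6) {
  n_rows <- fst::metadata_fst(path)$nrOfRows
  ranges <- chunk_ranges(n_rows, chunk_size)
  pieces <- lapply(seq_len(nrow(ranges)), function(i) {
    chunk <- fst::read_fst(path, from = ranges$from[i], to = ranges$to[i])
    chunk[keep(chunk), , drop = FALSE]
  })
  do.call(rbind, pieces)
}

chunk_ranges(10, 4)
```

Only one unfiltered chunk is held in memory at a time. Replacing `lapply()` with a parallel equivalent would be the natural starting point for the parallel variant.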
e.g. `cost_vars()`, which would include all cost-related variables. The idea is to make column selection easier.
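At its simplest, such a helper could just return a character vector of names. The variable names below are placeholders for illustration, not the actual list of SLF cost variables.

```r
# Hypothetical helper: returns the names of the cost-related variables.
# The names here are placeholders only.
cost_vars <- function() {
  c("cost_a", "cost_b", "cost_c")
}

# The helper can then be combined with other column names
columns <- c("year", "anon_chi", cost_vars())
columns
```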
Include checks for the existence of files.
Include a check and warning if the hscdiip file is newer than, or equal to (hash check?), the dev file.
Possibly include temporary building of fst files if required.
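The existence and hash checks could be sketched in base R as below. The function name and both path arguments are hypothetical; `tools::md5sum()` and `file.mtime()` are used here as one possible way to compare the files.

```r
# Hypothetical helper: both paths are supplied by the caller.
check_slf_files <- function(dev_path, hscdiip_path) {
  paths <- c(dev_path, hscdiip_path)
  missing <- paths[!file.exists(paths)]
  if (length(missing) > 0) {
    stop("File(s) not found: ", paste(missing, collapse = ", "), call. = FALSE)
  }
  # Equal content (same MD5 hash) or a more recent modification time
  same_content <- unname(tools::md5sum(dev_path) == tools::md5sum(hscdiip_path))
  hscdiip_newer <- file.mtime(hscdiip_path) >= file.mtime(dev_path)
  if (same_content || hscdiip_newer) {
    warning(
      "The hscdiip file is the same as, or newer than, the dev file.",
      call. = FALSE
    )
  }
  invisible(TRUE)
}
```

Hashing large files is slow, so comparing modification times first and hashing only when they are inconclusive may be preferable.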
Have value labels stored somehow, then have functions for people to use this easily. One idea is to store them as JSON, with get functions:
library(magrittr) # provides %>%

# Variable metadata, serialised to a JSON string
vars <- list(gender = list(
  description = "Patient's gender",
  values = list("1" = "Female", "2" = "Male", "9" = "Unknown"),
  type = "integer"
)) %>%
  jsonlite::toJSON()

# Look up the value labels for a single variable
get_values <- function(variable) {
  json <- jsonlite::fromJSON(vars)
  return(json[[variable]][["values"]])
}

get_values("gender")
Could also have functions that would 'swap out' values for labels (to be used for plots etc.), e.g. `mutate(gender = slf_factor_labels(gender))`.
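A minimal sketch of such a 'swap out' function, using base R's `factor()`. The `labels` argument and the `gender_labels` lookup are assumptions; the codes and labels are copied from the gender example above.

```r
# Lookup of codes to labels, taken from the gender example above
gender_labels <- c("1" = "Female", "2" = "Male", "9" = "Unknown")

# Hypothetical implementation: convert codes to a labelled factor
slf_factor_labels <- function(x, labels = gender_labels) {
  factor(as.character(x), levels = names(labels), labels = unname(labels))
}

slf_factor_labels(c(1, 2, 9, 2))
```

Codes that are not in the lookup become `NA`, which may or may not be the behaviour wanted for plots.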
Make better use of purrr by using `list_of_years %>% map(function, common_args)` rather than `complicated_list %>% pmap(...)`.
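To illustrate the difference, here is a small comparison assuming purrr is installed. `read_year()` is a hypothetical stand-in for whatever function reads one year of data, and the year and column values are illustrative.

```r
library(purrr)

# Hypothetical stand-in for a function that reads one year of data
read_year <- function(year, columns) {
  data.frame(year = year, n_columns = length(columns))
}

years <- c("1718", "1819")
common_columns <- c("anon_chi", "recid")

# Simpler form: iterate over the years, pass the shared argument once
simple <- map(years, read_year, columns = common_columns)

# pmap() form: the shared argument has to be repeated for every year
complicated <- pmap(
  list(year = years, columns = list(common_columns, common_columns)),
  read_year
)
```

Both calls produce one data frame per year. The `map()` version avoids building the parallel list just to repeat arguments that never change.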
Provide an error message if a year isn't valid.
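A sketch of what the validation might look like in base R. The function name, the year format and the list of available years are assumptions for illustration; the package would supply its own list.

```r
# Hypothetical list of available years
available_years <- c("1718", "1819", "1920")

check_year <- function(year, valid = available_years) {
  year <- as.character(year)
  invalid <- setdiff(year, valid)
  if (length(invalid) > 0) {
    stop(
      "Invalid year(s): ", paste(invalid, collapse = ", "),
      ". Available years are: ", paste(valid, collapse = ", "), ".",
      call. = FALSE
    )
  }
  invisible(year)
}

check_year(c("1718", "1819"))
```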
Older files won't necessarily work with new files as some variables are missing.
It would be nice if you could specify multiple years and have them returned combined using `dplyr::bind_rows`.
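This could be sketched as follows, assuming dplyr is installed. `read_one_year()` and `read_years()` are hypothetical names, and the data returned is a toy example.

```r
# Hypothetical stand-in that returns toy data for one year
read_one_year <- function(year) {
  data.frame(year = year, anon_chi = c("a", "b"))
}

# Read each requested year, then stack the results into one data frame
read_years <- function(years) {
  dplyr::bind_rows(lapply(years, read_one_year))
}

read_years(c("1718", "1819"))
```

`dplyr::bind_rows()` fills columns that are missing from a given year with `NA`, which is relevant when older files lack newer variables.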