rhoinc / sas-codebook
A SAS macro for generating a concise summary of every variable in a SAS dataset.
License: MIT License
When the VAR= list is long enough to cause multiple pages of output within each BY= level, SAS crashes. This appears to be caused by the use of STARTPAGE=NOW at the top of each BY= level: with STARTPAGE=NOW in effect, SAS gives up trying to control page breaks, so if there are more variables than will fit on a page, it crashes.
Short-term hack for users: don't specify more than 8 variable names in the VAR= parameter. Call the macro multiple times if there are more than 8 variables of interest.
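The short-term hack could be automated with a small wrapper that splits a long variable list into groups of at most 8. This is only a sketch: the wrapper name %codebook_chunks and its parameters are hypothetical, and it writes each group to the log instead of calling the codebook macro, since the exact call signature is not shown here.

```sas
/* Hypothetical helper: split a space-delimited variable list into */
/* groups of at most SIZE names. Not part of sas-codebook.         */
%macro codebook_chunks(var=, size=8);
  %local n i word chunk;
  %let n = %sysfunc(countw(&var, %str( )));
  %let chunk = ;
  %do i = 1 %to &n;
    %let word  = %scan(&var, &i, %str( ));
    %let chunk = &chunk &word;
    /* Emit the group when it is full or when the list is exhausted. */
    %if %sysfunc(mod(&i, &size)) = 0 or &i = &n %then %do;
      %put NOTE: one codebook call would use VAR=&chunk;
      /* The real codebook macro call, with VAR=&chunk, would go here. */
      %let chunk = ;
    %end;
  %end;
%mend codebook_chunks;

%codebook_chunks(var=v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11)
```

Each group would produce its own PDF, so the output file name (for example via a prefix parameter) would also need to vary per call.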
Possible long-term workarounds include:
To appease those who would say "that's not a codebook", consider adding an appendix at the end which shows more than just the "Top 5" that are shown in the 1" strips. It would likely include counts and maybe percents. It might still need some sort of maximum for cases with 100+ unique values.
If this is implemented, consider including hyperlinks so that folks could jump back and forth from the strips to the appendix and back. Not sure how to do this, but how hard could it be!?
I'm running the macro "codebook_generic" and the PDF looks great, but the graphic for each variable doesn't end up on the PDF. Only the graphic for the first variable appears. The graphics just get output to PNG files in my data folder. I've only specified the macro variables "DATA" and "PDFPREFIX". I'm using SAS/STAT 14.1. Do you know what might be going on?
Distinguish sample size at each level of the panelby variable.
Spencer just got 3700 PNG files - ick!
Primary motivator is a treatment/control type comparison.
Running the codebook on multiple datasets, interactively on SAS Grid through EG at least, creates a substantial lag of nearly a minute per dataset. Recommend substantially paring down messages sent to the log.
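One way to pare down the log, sketched under the assumption that the macro does not already manage these options itself: save the caller's settings, switch off the noisiest ones for the duration of the run, and restore them afterwards. The macro variable names (_cb_notes and so on) are made up for illustration.

```sas
/* Remember the caller's current settings so they can be restored. */
%let _cb_notes  = %sysfunc(getoption(notes));
%let _cb_source = %sysfunc(getoption(source));
%let _cb_mprint = %sysfunc(getoption(mprint));

/* Suppress notes, source echo, and macro-generated code in the log. */
options nonotes nosource nomprint;

/* ... the codebook macro call would go here ... */

/* Put the caller's settings back. */
options &_cb_notes &_cb_source &_cb_mprint;
```

Warnings and errors are still written to the log with these settings; only the routine messages are dropped.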
add %nrbquote to deal with apostrophes
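A minimal illustration of the idea, assuming the apostrophes arrive through values such as variable labels that end up in macro variables. The macro variable names here are hypothetical and not taken from the macro's source.

```sas
/* Store a label containing an apostrophe in a macro variable. */
data _null_;
  call symputx('varlabel', "Subject's age at baseline");
run;

/* %nrbquote masks the unmatched quote (and any & or %) in the  */
/* resolved value, so later macro statements do not treat it as */
/* the start of a quoted string.                                */
%let safe_label = %nrbquote(&varlabel);
%put NOTE: label is &safe_label;
```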
Latencies never seem to appear on page 1, so perhaps create several one-page outputs and stack them together on the back end. Would lose the automatic page numbering, though could possibly manually number the one-at-a-time pages as they are created. Getting the "of Y" part would require more pre-planning/measuring/counting, but again still possible.
Not sure if there's much added value here, but maybe a hyper-visual individual would find value in it.
Add a table at the top of page 1 which displays the number of observations in the dataset for each level of the panelby variable. Will likely want to match the sort of the subsequent plots, but this is not a certainty. Might display counts as text or bar charts.
Complete list of all combinations of the by variables and count of obs within each level. Could run into width issues if lots of variables are specified. Might need to go landscape for this first page and then flip back to portrait for subsequent pages.
Consider adding more high-level information at the top of each report related to the dataset. For instance:
The macro creates secondary versions of variables by adding prefixes (e.g., VAR1 becomes CB_CHAR_VAR1). This is problematic when incoming variable names are already approaching the length limit of 32. Need to update the code to somehow get around this issue.
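One possible way around the length limit, offered only as a sketch: build the secondary names from each variable's position in the dataset instead of its original name, and keep a lookup table that maps back to the original. The dataset names _cb_vars and _cb_map are hypothetical, and sashelp.class stands in for the incoming dataset.

```sas
/* List the variables of the incoming dataset with their positions. */
proc contents data=sashelp.class
              out=_cb_vars(keep=name varnum) noprint;
run;

/* Build short aliases such as CB_CHAR_1, CB_CHAR_2, ...; these stay */
/* well under 32 characters no matter how long NAME is.              */
data _cb_map;
  set _cb_vars;
  length alias $32;
  alias = cats('CB_CHAR_', varnum);
run;

proc print data=_cb_map noobs;
  var varnum name alias;
run;
```

The macro would then look up ALIAS wherever it currently concatenates the prefix with the original name.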
Low priority query from DMEDS