leeper/slopegraph
Edward Tufte-Inspired Slopegraphs
These should be called panel.first and panel.last, respectively, to be consistent with graphics::plot.default.
When I run this:
if (!require("ghit")) {
    install.packages("ghit")
}
ghit::install_github("leeper/slopegraph")
I get this error:
Error in read.dcf(file = tmpf) : cannot open the connection
In addition: Warning message:
In read.dcf(file = tmpf) :
cannot open compressed file '//var/folders/d6/vj5j60497bs_kc9z28wc3j6c0000gn/T//RtmpeVYawE/ghitdrat/src/contrib/PACKAGES', probable reason 'No such file or directory'
Wide format is most sensible, but it is untidy. It would be useful to allow input data to be in long format.
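As a rough illustration of what long-format input would involve, the sketch below pivots a small long data frame into the wide layout (one row per unit, one column per time point) using base R's reshape(). The data and the column names unit, time and value are made up for the example; nothing here is part of the package.

```r
# Made-up long-format data: one row per (unit, time) observation.
long <- data.frame(
  unit  = c("A", "A", "B", "B"),
  time  = c("t1", "t2", "t1", "t2"),
  value = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# Pivot to wide: one row per unit, one column per time point.
wide <- reshape(long, idvar = "unit", timevar = "time", direction = "wide")

# Move the unit names into the row names and strip the "value." prefix
# that reshape() adds to the new columns.
rownames(wide) <- wide$unit
wide$unit <- NULL
names(wide) <- sub("^value\\.", "", names(wide))

print(wide)
```

A long-format interface could do this conversion internally and then hand the result to the existing wide-format code path.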
I didn't do a pull request because it seems you all are in a bit of flux, but I did want to give you proper credit and a look at the slightly different way I approached the task. I took a look at your base code and there's probably an opportunity to merge some pieces here.
https://ibecav.github.io/slopegraph/
Thanks for your efforts.
Chuck
Tufte has a lot: http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=0003nk
I'd like to display slopegraphs with points with only a single decimal, so I use decimal=1 in my specification. But if I do, I get overlap of labels (e.g. VT and CT below), which I wouldn't get if I used decimal=2. But using the latter is so untidy! Is there any way to get the benefit of the latter without displaying more than one significant digit?
One Significant Digit
Two Significant Digits
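One possible direction, sketched here under the assumption that label text and label position can be computed separately: format the printed value to one decimal while keeping the more precise value for placement. The numbers for VT and CT are invented for illustration, and this is not a description of how the package currently works.

```r
# Invented values; only the state abbreviations come from the report above.
values <- c(VT = 1.21, CT = 1.24)

# Text shown on the plot: one decimal place.
shown <- sprintf("%.1f", values)

# Values used to decide vertical placement: keep the extra precision.
placed <- round(values, 2)

identical(shown[1], shown[2])    # the printed labels coincide
placed[["VT"]] < placed[["CT"]]  # but the positions still differ
```

With the two kept apart, overlap detection can work on the finer values even though the reader only sees one decimal.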
col.lines is not working correctly in the function 'slopegraph'. It can be fixed easily by adding 'n' to the cbind command that creates 'todraw' (line 155), thus passing the index of the line in question to the apply function on line 157. Then add the line 'i <- rowdata[5]' to the apply function (at line 162, for example) and the lines will be colored as expected.
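The fix described above can be pictured with a stripped-down stand-in for the plotting loop. The todraw matrix, its column names and the colours are assumptions made for the example; only the idea of carrying the row index n as a fifth column and reading it back with rowdata[5] comes from the report.

```r
# Hypothetical stand-in for the matrix of segments to draw.
# Columns 1-4 are segment coordinates; column 5 ("n") is the row index.
todraw <- cbind(
  x1 = c(1, 1, 1),
  x2 = c(2, 2, 2),
  y1 = c(3, 5, 7),
  y2 = c(4, 4, 8),
  n  = 1:3
)
col.lines <- c("red", "blue", "green")

# Inside apply() each row arrives as a plain vector, so the index has to
# travel with the row in order to pick the matching colour.
picked <- apply(todraw, 1, function(rowdata) {
  i <- rowdata[5]
  col.lines[i]
})

print(unname(picked))
```

Without the index column, the function applied to each row has no way of knowing which element of col.lines belongs to it.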
I'm using ggslopegraph() exclusively because of how awesome it looks and how great ggplot2 is. But when I call ggslopegraph() initially I get the following error:
Error in strwidth(sprintf(fmt, long[["value"]])) :
plot.new has not been called yet
While invoking plot.new() works, this should not be necessary, right? I've never needed to call it for any other ggplot2 methods.
Great package! But I'm still having difficulty with missing values. I'm attaching some data on state polarization for the northeast. Everything's there except the 2014 value for MA. This messes up the plot. I'd like to have the 2008 point for MA go directly to 2014. But as it is, MA gets orphaned after 2008, and the point for 2017 (the end) is not connected to anything.
na.span should do the job but it doesn't.
This is the code I use:
library(stringr)  # provides str_c()
rownames(sg) <- sg$st
sg$st <- NULL
colnames(sg) <- str_c("Year.", colnames(sg))
slopegraph(sg, col.line = 'gray', col.lab = 'black', decimals = 1,
           xlabels = c('1996', '2000', '2004', '2008', '2014', '2017'),
           cex.lab = 1, cex.num = 0.75)
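For the missing-value case described above, the desired behaviour amounts to joining each observed point to the next observed point, skipping any gaps. The helper below is a hypothetical sketch of that rule, not the package's na.span implementation, and the MA values are invented.

```r
# Hypothetical helper: given one row of values, return the pairs of column
# positions that should be connected when missing values are spanned.
span_segments <- function(values) {
  present <- which(!is.na(values))
  if (length(present) < 2) {
    return(data.frame(from = integer(0), to = integer(0)))
  }
  data.frame(from = head(present, -1), to = tail(present, -1))
}

# Invented values for MA; only the missing 2014 entry mirrors the report.
ma <- c(`1996` = 0.8, `2000` = 0.9, `2004` = 1.1,
        `2008` = 1.2, `2014` = NA, `2017` = 1.5)

seg <- span_segments(ma)
print(seg)
```

The last segment runs from position 4 (2008) to position 6 (2017), so the final point is no longer orphaned.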
ggslopegraph2 mostly works, except when it is called inside a function. Then it rejects the dataframe I pass to it with:
Error: The first object in your list 'sg.df' does not exist. It should be a dataframe
My code is as follows:
ggslopegraph2(sg.df, times=year, measurement=comp.diffs, grouping=st, title = NULL) +
    theme_bw() + labs(x=NULL, y=NULL) + theme(legend.position = "none")
In the debugger I have verified that sg.df is absolutely a dataframe. What could be going on?
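One plausible explanation, offered as an assumption rather than a reading of the package source: if the function validates its first argument by looking up the argument's name with exists(), a data frame that is local to the calling function will not be found, even though it is a perfectly good data frame. The two validators below are hypothetical and only illustrate that scoping behaviour.

```r
# Hypothetical validator that looks the argument up by name.
# This is a guess at the failure mode, not the package's code.
check_by_name <- function(dataframe) {
  nm <- deparse(substitute(dataframe))
  if (!exists(nm)) {
    stop("The first object in your list '", nm, "' does not exist.",
         call. = FALSE)
  }
  invisible(TRUE)
}

# Alternative: inspect the value that was actually passed.
check_by_value <- function(dataframe) {
  if (!is.data.frame(dataframe)) {
    stop("The first argument should be a dataframe", call. = FALSE)
  }
  invisible(TRUE)
}

wrapper <- function(check) {
  # local_df exists only inside this function's frame.
  local_df <- data.frame(year = 1:2, value = c(0.1, 0.2))
  check(local_df)
}

by_name  <- tryCatch(wrapper(check_by_name),  error = function(e) "error")
by_value <- tryCatch(wrapper(check_by_value), error = function(e) "error")
print(list(by_name = by_name, by_value = by_value))
```

If this is what is happening, the same call would succeed at the top level, where sg.df lives in the global environment, which would match the "works except inside a function" symptom.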
...with the goal of getting it on CRAN!
The current binning algorithm simply puts labels on new lines when they're too close. A better version would evenly space the numbers within the range of overlap. It will have the same even appearance, but probably line up better with slope lines.
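A minimal sketch of the evenly spaced alternative, under stated assumptions: labels whose values are closer than a minimum gap are treated as one cluster and spread symmetrically around the cluster mean, one gap apart. The function name spread_labels and the min_gap parameter are hypothetical, and this single pass does not handle a spread cluster colliding with its neighbours.

```r
# Hypothetical helper: nudge label positions so that values closer than
# min_gap are spaced evenly around the centre of their cluster.
spread_labels <- function(y, min_gap) {
  ord <- order(y)
  ys  <- y[ord]
  # Start a new cluster whenever the gap to the previous value is large enough.
  grp <- cumsum(c(TRUE, diff(ys) >= min_gap))
  out <- ys
  for (g in unique(grp)) {
    idx <- which(grp == g)
    n   <- length(idx)
    if (n > 1) {
      centre   <- mean(ys[idx])
      out[idx] <- centre + (seq_len(n) - (n + 1) / 2) * min_gap
    }
  }
  # Return positions in the original input order.
  out[order(ord)]
}

pos <- spread_labels(c(10, 10.2, 10.4, 20), min_gap = 1)
print(pos)
```

Because each cluster stays centred on its mean, the adjusted labels remain close to the ends of their slope lines instead of stacking downward from the first label.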
Thanks for all your work on this so far.
A couple of things.
The code I'm using is below and the resulting slopegraph is at this link. This shows the issues.
https://dl.dropboxusercontent.com/u/10963448/slope_graph.jpeg
library("RColorBrewer")
capex <- matrix(c(
    700348, 550203, 504668, 262529, 351732,   # Perth
    1355928, 942090, 735799, 609752, 686136,  # Melbourne
    805693, 792228, 713762, 629305, 641685,   # Sydney
    504764, 989579, 653563, 517648, 487636,   # South East Queensland
    256668, 230838, 143365, 59393, 48937,     # Canberra
    595851, 530075, 331038, 187945, 152124,   # Adelaide
    51978, 58080, 64789, 25600, NA),          # Darwin
    nrow = 7, byrow = TRUE)
capex <- capex/1000
capex <- data.frame(capex)
cities <- c('Perth', 'Melbourne', 'Sydney','SE Queensland', 'Canberra', 'Adelaide', 'Darwin')
row.names(capex) <- cities
names(capex) <- c('X2010-11', 'X2011-12', 'X2012-13', 'X2013-14', 'X2014-15')
my_col = brewer.pal(9, name = 'Paired')
par(oma = c(1,4,1,1))
slopegraph(capex,
           labels = c('2010-11', '2011-12', '2012-13', '2013-14', '2014-15'),
           decimals = 0,
           col.lines = my_col,
           col.lab = my_col,
           family = 'sans')
Dear Thomas,
Thank you for the amazing package!
Could you tell me the correct citation for the slopegraph package?
With ggslopegraph2, a box frame is drawn around the numeric values by default. Could you help me remove the box frame around the L1, L2, .... numeric values?
Best wishes, Zsolt
The new tools svglite and rsvg provide us with some new potential for this package. Fortunately, I think it requires no changes to the existing slopegraph code. I'll work up some examples to demonstrate the new potential.
Thanks so much for this great package.
This comes up in the New York Times example.
For example, using the following dataset,
# A tibble: 14 x 3
   X__1            SE_withOPTIMAL `SE_without OPTIMAL`
 * <chr>                    <dbl>                <dbl>
 1 HR_male                 0.0340               0.0357
 2 HR_age                  0.0186               0.0211
 3 HR_nowmsk               0.0408               0.0450
 4 HR_oxygen               0.0457               0.0439
 5 HR_fev1pre              0.0416               0.0412
 6 HR_statin               0.0433               0.0413
 7 HR_azithromycin         0.0387               0.0359
 8 HR_LAMA                 0.0511               0.0456
 9 HR_LABA                 0.0480               0.0504
10 HR_ICS                  0.0507               0.0756
11 HR_sgrq                 0.0130               0.0132
12 HR_BMI10                0.0268               0.0290
13 HR_OPTIMAL              0.0512               0.307
Suggestion: put Installation before Examples in README
When I run the GDP or the cancer data example, I get only the top line. I know this was working fine last year, but while updating the code I realized that the plot no longer reproduces. My environment is given below.
R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 14393)
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] slopegraph_0.1.7
loaded via a namespace (and not attached):
[1] tools_3.3.1
Bug report via email:
Here's a data frame of state polarization by 4 year intervals. When I enter the following interactively, I get the slopegraph in the Rstudio plot window.
slopegraph(na.omit(sg), col.line='gray',decimals=1,labels=c('1996','2000','2004','2008','2014'), binval=1.5)
BTW, the slopegraph doesn't work unless I omit NAs, but this isn't quite what I want as I lose the entire set of observations if a state has any missing values. (see also #3)
The big question though is why I get a strange error when I set up the pdf:
pdf('Plots/Legislatures_2016/States/polarization_slopegraph.pdf', height=16, width=12, family='Palatino')
slopegraph(na.omit(sg), col.line='gray',decimals=1,labels=c('1996','2000','2004','2008','2014'), binval=1.5)
The error I get from just adding the pdf command at the top is:
Error in `row.names<-.data.frame`(`*tmp*`, value = value) :
duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique value when setting 'row.names': ‘0.4’
I don't understand why row names would be set at one of the values? It should be state abbreviations!