leeper / slopegraph

Edward Tufte-Inspired Slopegraphs

R 100.00%

slopegraph cran r tufte ggplot2 dataviz data-visualization

slopegraph's Issues

Rename `add.before` and `add.after` arguments

These should be called panel.first and panel.last, respectively, to be consistent with graphics::plot.default.
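
For reference, a minimal sketch of how `panel.first` and `panel.last` behave in `graphics::plot.default`, which is the convention the rename would follow. It only exercises base R; nothing here is taken from the slopegraph source.

```r
# Draw to a null PDF device so the example writes no file.
pdf(NULL)

# panel.first is evaluated after the axes are set up but before the points
# are drawn; panel.last is evaluated after the points are drawn.
plot(1:10,
     panel.first = grid(),
     panel.last  = abline(h = 5, lty = 2))

dev.off()

# Both argument names are formal arguments of plot.default.
has_args <- c("panel.first", "panel.last") %in%
  names(formals(graphics::plot.default))
```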

Not able to load package

When I run this:

if (!require("ghit")) {
install.packages("ghit")
}
ghit::install_github("leeper/slopegraph")

I get this error:

Error in read.dcf(file = tmpf) : cannot open the connection
In addition: Warning message:
In read.dcf(file = tmpf) :
cannot open compressed file '//var/folders/d6/vj5j60497bs_kc9z28wc3j6c0000gn/T//RtmpeVYawE/ghitdrat/src/contrib/PACKAGES', probable reason 'No such file or directory'

Suggestion: Allow input data in long format

Wide format is most sensible, but it is untidy. It would be useful to allow input data to be in long format.
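
Until long input is supported, one workaround is to pivot to wide before plotting. A minimal base R sketch, assuming a long data frame with hypothetical columns `country`, `year`, and `value` (these names are illustrative, not part of the package):

```r
# Hypothetical long-format data: one row per (country, year) observation.
long <- data.frame(
  country = c("A", "A", "B", "B"),
  year    = c(2000, 2010, 2000, 2010),
  value   = c(1.5, 2.5, 3.0, 2.0)
)

# Pivot to wide: one row per country, one column per year.
wide <- reshape(long, idvar = "country", timevar = "year", direction = "wide")

# Move the identifier into the row names and tidy the column names,
# which is the wide layout used in the examples in these issues.
rownames(wide) <- wide$country
wide$country <- NULL
names(wide) <- sub("^value\\.", "", names(wide))
```

The resulting `wide` has the observations as row names and one column per time point.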

Thanks and some thoughts

I didn't do a pull request because it seems you all are in a bit of flux, but I did want to give you proper credit and a look at the slightly different way I approached the task. I took a look at your base code, and there's probably an opportunity to merge some pieces here.

https://ibecav.github.io/slopegraph/

Thanks for your efforts.

Chuck

Add other examples

Tufte has a lot: http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=0003nk

Display labels by fewer decimals

I'd like to display slopegraphs with points with only a single decimal, so I use decimal=1 in my specification. But if I do, I get overlapping labels (e.g. VT and CT below), which I wouldn't get if I used decimal=2. But using the latter is so untidy! Is there any way to get the benefit of the latter without displaying more than one significant digit?

One Significant Digit

Two Significant Digits
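
One possible reading of the request, sketched under the assumption that labels are positioned at the rounded values (this has not been checked against the package source): keep the unrounded values for vertical placement and round only the text that is printed. The state values below are made up for illustration.

```r
# Hypothetical values for two states that are close together.
values <- c(VT = 1.54, CT = 1.46)

# Rounding the positions themselves makes the two points tie exactly.
rounded_pos <- round(values, 1)

# Formatting only the label text keeps the positions distinct
# while still printing a single decimal.
label_text <- sprintf("%.1f", values)
```

Here `rounded_pos` collapses both states onto the same height, whereas using `values` for placement and `label_text` for display keeps them apart.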

rownames don't show in example codes

Examples fail to produce rownames as shown in the example screenshots (R v3.4.3 in Windows)

> slopegraph(cancer, col.lines = 'gray', col.lab = "black", 
+            xlim = c(-.5,5.5), cex.lab = 0.5, cex.num = 0.5,
+            xlabels = c('5 Year','10 Year','15 Year','20 Year'))

col.lines[i] not working

col.lines is not working correctly in the function 'slopegraph'. It can be fixed easily by adding 'n' to the cbind command that creates 'todraw' (line 155), thus passing the index of the line in question to the apply function on line 157. Then add the line 'i <- rowdata[5]' to the apply function (line 162, e.g.) and the lines will be colored as expected.
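
The pattern described above can be sketched in isolation. The matrix below is a hypothetical stand-in for the package's internal 'todraw' object, and the column layout is an assumption; only the idea of carrying the row index as a fifth column comes from the report.

```r
# Hypothetical segment coordinates: x0, y0, x1, y1 for three lines.
segs <- cbind(x0 = c(1, 1, 1), y0 = c(3, 2, 1),
              x1 = c(2, 2, 2), y1 = c(1, 3, 2))
col.lines <- c("red", "blue", "gray")

# Append the row index 'n' so each row knows which line it belongs to.
todraw <- cbind(segs, n = seq_len(nrow(segs)))

# Inside apply(), recover the index from the fifth element and use it
# to pick that line's colour (a real version would call segments() here).
picked <- apply(todraw, 1, function(rowdata) {
  i <- rowdata[5]
  col.lines[i]
})
```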

Better handle reversed y-axis

plot.new() needed for ggslopegraph?

I'm using ggslopegraph() exclusively because of how awesome it looks and how great ggplot2 is. But when I call ggslopegraph() initially I get the following error:

Error in strwidth(sprintf(fmt, long[["value"]])) : 
  plot.new has not been called yet

While invoking plot.new() works, this should not be necessary, right? I've never needed to call it for any other ggplot2 methods.

Handle missing data points better

Great package! But I'm still having difficulty with missing values. I'm attaching some data on state polarization for the northeast. Everything's there except the 2014 value for MA. This messes up the plot. I'd like to have the 2008 point for MA go directly to 2014. But as it is, MA gets orphaned after 2008, and the point for 2017 (the end) is not connected to anything.

na.span should do the job but it doesn't.

This is the code I use:

rownames(sg)=sg$st; sg$st=NULL
colnames(sg)=str_c("Year.",colnames(sg))
    
slopegraph(sg, col.line='gray',col.lab = 'black', decimals=1,
     xlabels=c('1996','2000','2004','2008','2014','2017'), 
     cex.lab = 1, cex.num=0.75)

slopegraph problem 042117.zip
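
The behaviour requested above (2008 connecting directly to 2017 when 2014 is missing) amounts to dropping the NA points before building line segments. A minimal sketch with a hypothetical helper, `span_segments()`, and made-up values; it is not the package's `na.span` implementation:

```r
# Build segments between consecutive non-missing points of one row.
# 'y' holds the values, 'x' the column positions.
span_segments <- function(y, x = seq_along(y)) {
  keep <- !is.na(y)
  xs <- x[keep]
  ys <- y[keep]
  n <- length(ys)
  if (n < 2) {
    # Fewer than two observed points: nothing to connect.
    return(data.frame(x0 = numeric(0), y0 = numeric(0),
                      x1 = numeric(0), y1 = numeric(0)))
  }
  data.frame(x0 = xs[-n], y0 = ys[-n], x1 = xs[-1], y1 = ys[-1])
}

# Hypothetical row with the fifth time point missing.
ma <- c(0.8, 0.9, 1.1, 1.2, NA, 1.5)
segs <- span_segments(ma)
```

The last segment in `segs` runs from position 4 to position 6, skipping the missing point instead of orphaning the final value.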

ggslopegraph2 problem: not a data frame

ggslopegraph2 mostly works, except when it is called inside a function. Then it rejects the dataframe I pass to it with:

Error: The first object in your list 'sg.df' does not exist. It should be a dataframe

My code is as follows:

    ggslopegraph2(sg.df, times=year, measurement=comp.diffs, grouping=st, title = NULL) +  
      theme_bw() + labs(x=NULL, y=NULL) + theme(legend.position = "none")

In the debugger I have verified that sg.df is absolutely a dataframe. What could be going on?
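
One plausible cause, offered as an assumption since the ggslopegraph2 source is not shown here: if the function validates its input by looking up the argument's name with `exists()`, a data frame that lives only inside the calling function will not be found. The two checkers below are hypothetical and exist only to illustrate the difference.

```r
# Name-based check: looks up the *name* of the argument, starting from
# this function's own environment and then its enclosure. A variable
# local to the caller is not on that search path.
check_by_name <- function(dataframe) {
  name <- deparse(substitute(dataframe))
  exists(name)
}

# Value-based check: inspects the object that was actually passed.
check_by_value <- function(dataframe) {
  is.data.frame(dataframe)
}

caller <- function(check) {
  local_df <- data.frame(a = 1)  # exists only inside caller()
  check(local_df)
}

by_name  <- caller(check_by_name)   # FALSE: the name is not visible
by_value <- caller(check_by_value)  # TRUE: the value is a data frame
```

If this is what happens, the call works at the top level because the name is then visible from the global environment.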

Request to take over development of slopegraph...

...with the goal of getting it on CRAN!

Fix vertically overlapping value and row labels

The current binning algorithm simply puts labels on new lines when they're too close. A better version would evenly space the numbers within the range of overlap. It will have the same even appearance, but probably line up better with slope lines.
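
A rough sketch of the even-spacing idea, not the package's algorithm. It groups labels whose neighbours are closer than a hypothetical `min_gap`, then spreads each group evenly around its mean with `min_gap` between labels. It makes a single pass, so a spread group could still collide with a neighbouring group.

```r
spread_labels <- function(y, min_gap) {
  ord <- order(y)
  ys <- y[ord]
  # Start a new cluster whenever the gap to the previous label is wide enough.
  cluster <- cumsum(c(TRUE, diff(ys) >= min_gap))
  out <- ys
  for (k in unique(cluster)) {
    idx <- which(cluster == k)
    if (length(idx) > 1) {
      centre <- mean(ys[idx])
      width <- min_gap * (length(idx) - 1)
      # Evenly spaced positions centred on the cluster mean.
      out[idx] <- seq(centre - width / 2, centre + width / 2,
                      length.out = length(idx))
    }
  }
  # Return positions in the original input order.
  out[order(ord)]
}

# Three labels crowded near 10 and one isolated label at 20.
adjusted <- spread_labels(c(10, 10.2, 10.1, 20), min_gap = 1)
```

Because the spread is centred on the cluster mean, the adjusted labels stay near the lines they describe rather than stacking in one direction.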

Matching label and line colours and consistent decimals

Thanks for all your work on this so far.

A couple of things.

Could the label colours match the line colours?
The number of decimal places is not consistent between columns.

The code I'm using is below and the resulting slopegraph is at this link. This shows the issues.

https://dl.dropboxusercontent.com/u/10963448/slope_graph.jpeg

library("RColorBrewer")
capex <- matrix(c(
  700348, 550203, 504668, 262529, 351732, # Perth
  1355928, 942090, 735799, 609752, 686136, # Melbourne
  805693, 792228, 713762, 629305, 641685, # Sydney
  504764, 989579, 653563, 517648, 487636, # South East Queensland
  256668, 230838, 143365, 59393, 48937, # Canberra
  595851, 530075, 331038, 187945, 152124, # Adelaide
  51978, 58080, 64789, 25600, NA), # Darwin
  nrow = 7, byrow = TRUE)


capex <-  capex/1000
capex <-  data.frame(capex)
cities <- c('Perth', 'Melbourne', 'Sydney','SE Queensland', 'Canberra', 'Adelaide', 'Darwin')
row.names(capex) <- cities
names(capex) <- c('X2010-11', 'X2011-12', 'X2012-13', 'X2013-14', 'X2014-15')

my_col = brewer.pal(9, name = 'Paired')
par(oma = c(1,4,1,1))

slopegraph(capex, 
           labels =  c('2010-11', '2011-12', '2012-13', '2013-14', '2014-15'),
           decimals = 0,
           col.lines = my_col,
           col.lab = my_col,
           family = 'sans')
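
On the inconsistent decimals: one common cause in R, assumed here rather than confirmed from the package source, is converting rounded numbers to text, which drops trailing zeros. Formatting with `sprintf()` keeps a fixed number of decimals. The values are illustrative.

```r
x <- c(2.04, 3.14)

# Rounding then converting drops the trailing zero on the first value.
as_char <- as.character(round(x, 1))

# A fixed-width format keeps the same number of decimals for every label.
fixed <- sprintf("%.1f", x)
```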

Citation and removing box frame of L1, L2, ... numeric values

Dear Thomas,

Thank you for the amazing package!
Could you tell me the correct citation for the slopegraph package?
With ggslopegraph2, a box frame is drawn around the numeric values by default. Could you help me remove the box frame around the L1, L2, ... numeric values?

Best wishes, Zsolt

Convert README to README.Rmd

svglite and rsvg offer new potential

The new tools svglite and rsvg provide us with some new potential for this package. Fortunately, I think it requires no changes to the existing slopegraph code. I'll work up some examples to demonstrate the new potential.

Thanks so much for this great package.

Handle NA values in df

This comes up in the New York Times example.

inappropriate rounding messes up the plot, when data points are close to zero

For example, using the following dataset,

# A tibble: 14 x 3
   X__1            SE_withOPTIMAL `SE_without OPTIMAL`
 * <chr>                    <dbl>                <dbl>
 1 HR_male                 0.0340               0.0357
 2 HR_age                  0.0186               0.0211
 3 HR_nowmsk               0.0408               0.0450
 4 HR_oxygen               0.0457               0.0439
 5 HR_fev1pre              0.0416               0.0412
 6 HR_statin               0.0433               0.0413
 7 HR_azithromycin         0.0387               0.0359
 8 HR_LAMA                 0.0511               0.0456
 9 HR_LABA                 0.0480               0.0504
10 HR_ICS                  0.0507               0.0756
11 HR_sgrq                 0.0130               0.0132
12 HR_BMI10                0.0268               0.0290
13 HR_OPTIMAL              0.0512               0.307

will result in a plot that the rounding messes up (image not included).
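
As a rough illustration of the rounding problem in this report, using two values from the table above: rounding to a fixed number of decimals can collapse small values to zero, whereas rounding to significant digits keeps them apart. Whether the package should switch to `signif()` is a design question; this only shows the difference.

```r
# Two standard errors taken from the table above.
x <- c(0.0340, 0.0357)

# Fixed decimals: both values collapse to zero.
by_decimals <- round(x, 1)

# Significant digits: the values stay distinguishable.
by_signif <- signif(x, 2)
```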

Suggestion: put Installation before Examples in README

Only the top line showing

In executing the GDP or the cancer data example, I am able to get only the top line. I know this was working fine last year but while updating the code, I realized that it is not reproducing. My environment is given below.

R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 14393)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] slopegraph_0.1.7

loaded via a namespace (and not attached):
[1] tools_3.3.1

Problem when saving as PDF

Bug report via email:

Here's a data frame of state polarization by 4 year intervals. When I enter the following interactively, I get the slopegraph in the Rstudio plot window.

slopegraph(na.omit(sg), col.line='gray',decimals=1,labels=c('1996','2000','2004','2008','2014'), binval=1.5)

BTW, the slopegraph doesn't work unless I omit NAs but this isn't quite what I want as I lose the entire set of observations if a state has any missing values. (see also #3)

The big question though is why I get a strange error when I set up the pdf:

pdf('Plots/Legislatures_2016/States/polarization_slopegraph.pdf', height=16, width=12, family='Palatino')
slopegraph(na.omit(sg), col.line='gray',decimals=1,labels=c('1996','2000','2004','2008','2014'), binval=1.5)

The error I get from just adding the pdf command at the top is:

Error in `row.names<-.data.frame`(`*tmp*`, value = value) : 
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique value when setting 'row.names': ‘0.4’

I don't understand why row names would be set to one of the values. They should be state abbreviations!
