peep's Introduction

peep

By default, R's head and tail functions print six rows and all columns. When a dataset has a large number of columns, the output is hard to read because each row wraps over several lines.

The default output would look like

> dim(xy)
[1]  43 205

> head(xy)
    Sample   SLC25A5      MEOX1       CD4       HFE     GABRA3     RNH1        VIM       MYOC
1 sample 1 169.28930 0.08670971 18.975543 14.149516 0.10174355 5.021980 1674.99418  0.2268739
2 sample 2 122.55490 0.11898911 16.644021  5.116629 0.06980979 4.246349  610.15708  0.9339964
3 sample 3  13.49505 0.39190042  7.006248  8.686389 0.61902692 1.242957   88.91675 11.1807812
4 sample 4 111.84583 0.07831660  8.329725  2.129984 0.34920184 8.478207  194.47382  0.0000000
       CAPG      ZIC2       EPHA3      ELN      NTN1     ABCC9    CYBRD1      NTN4    NUAK1  SLC25A3
1 48.951242 0.9216751  5.97980198 3.591096 4.5935819 12.916405 45.980617 10.402553 6.992939 81.73932
2 10.396297 1.0434491 19.52954610 4.078562 5.2744688 40.005469 17.813941 15.819005 9.638139 60.31633
3  2.468217 3.6529801  2.96681949 5.578765 0.5051865  6.682470  1.563846  1.976812 2.269706 11.34410
4 24.162310 3.4463885  0.07879854 3.225239 2.1506768  1.343375 20.593887  4.886979 3.763131 56.44569
...

with columns spilling over several lines and no end in sight when the data has a large number of columns (think expression data in bioinformatics).

It came down to typing xy[1:5, 1:5] for the rest of my life or developing a function that would make this easier. Enter peep. It prints the first and last few rows and columns. If any rows or columns have been omitted, it adds a horizontal or vertical delimiter of dots to indicate that there is something there.
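The idea of keeping only the corners of a table can be sketched in base R. This is a minimal illustration, not peep's actual implementation: the helper name `corner_subset`, its argument `n`, and the demo data `xy_demo` are all assumptions made for this example, and drawing the dot delimiters is left out.

```r
# Hypothetical helper, not peep's actual code: keep the first and last `n`
# rows and columns of a data frame or matrix.
corner_subset <- function(x, n = 3) {
  # Positions of the first and last n items. If there are at most 2 * n
  # items, keep them all so that nothing is selected twice.
  keep <- function(len) {
    if (len <= 2 * n) seq_len(len) else c(seq_len(n), seq(len - n + 1, len))
  }
  rows <- keep(nrow(x))
  cols <- keep(ncol(x))
  list(
    data = x[rows, cols, drop = FALSE],
    rows_omitted = length(rows) < nrow(x),  # TRUE if a row delimiter is needed
    cols_omitted = length(cols) < ncol(x)   # TRUE if a column delimiter is needed
  )
}

# Made-up data with the same shape as xy above (43 rows, 205 columns).
xy_demo <- as.data.frame(matrix(seq_len(43 * 205), nrow = 43))
res <- corner_subset(xy_demo, n = 3)
dim(res$data)
#> [1] 6 6
```

The two flags tell a printing routine whether a line of dots should be drawn between the first and last blocks of rows or columns.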

> peep(xy)
       Sample SLC25A5   MEOX1    CD4    HFE  GABRA3    MMP12 SPON1   MSMB  CCL4  CCL3
01:  sample 1   169.3 0.08671  18.98  14.15  0.1017  ·  5.77 45.49  8.803 77.19 76.08
02:  sample 2   122.6   0.119  16.64  5.117 0.06981  · 114.9 274.7 0.2449 44.92 41.04
03:  sample 3    13.5  0.3919  7.006  8.686   0.619  · 4.861 2.803    278 9.299 5.599
04:  sample 4   111.8 0.07832   8.33   2.13  0.3492  · 146.8 560.5      0 25.22 16.66
05:  sample 5   92.04       0 0.9135 0.8531  0.8672  · 91.63 8.617  4.848 8.899  6.03
06:  sample 6   63.44  0.3779  4.487  11.16   4.573  · 273.6  7.65  20.03 15.84 9.727
            ·       ·       ·      ·      ·       ·  ·     ·     ·      ·     ·     ·
38: sample 38   119.9 0.06032  2.194  8.856       0  · 1.158 4.308      0 8.464 19.14
39: sample 39   83.36       0  5.265  1.505       0  · 7.512 225.4      0 17.25 16.83
40: sample 40   100.1       0  4.783  1.692       0  · 88.81 18.54      0 55.66 5.438
41: sample 41   77.02 0.01632  14.92  3.389 0.01915  ·     0 68.71      0 18.79  11.5
42: sample 42   39.37  0.1706  4.553  4.284  0.2891  · 6.476 20.21  8.893 21.19 9.978
43: sample 43   36.44       0  1.064  4.322  0.4157  · 0.544 5.627      0 5.444 2.933

This work was inspired by data.table and by pandas, a Python package for working with DataFrames.

Installation

To install from GitHub, use the remotes package:

remotes::install_github("romunov/peep")

peep's People

Contributors

romunov

peep's Issues

disable returning of results

When printing with peep, prevent the function from returning anything; printing should be its only side effect. If the result is not saved into a variable, the returned object just duplicates the printed output.
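One common way to get this behaviour in R is to return the value through `invisible()`, which suppresses auto-printing at the top level. The sketch below uses a hypothetical stand-in function, `show_only`, and is not peep's actual code; returning `invisible(x)` instead of `invisible(NULL)` would be an alternative if the function should stay usable in a pipeline.

```r
# Hypothetical stand-in for a print-only function; not peep's actual code.
show_only <- function(x) {
  print(utils::head(x, 3))  # the printing side effect
  invisible(NULL)           # the return value is not auto-printed
}

show_only(mtcars)         # prints three rows once, nothing else
out <- show_only(mtcars)  # still prints; `out` is NULL
is.null(out)
#> [1] TRUE
```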

Last column gets printed separately

Example:

library(peep)
library(data.table)
x <- cbind(mtcars, mtcars)
setDT(x)
peep(x)

Output (to reproduce, set the console width to 107 characters):

> peep(x)
     mpg cyl  disp  hp drat    wt  qsec vs am gear    disp.1 hp.1 drat.1  wt.1 qsec.1 vs.1 am.1 gear.1
 1:   21   6   160 110  3.9  2.62 16.46  0  1    4  ·    160  110    3.9  2.62  16.46    0    1      4
 2:   21   6   160 110  3.9 2.875 17.02  0  1    4  ·    160  110    3.9 2.875  17.02    0    1      4
 3: 22.8   4   108  93 3.85  2.32 18.61  1  1    4  ·    108   93   3.85  2.32  18.61    1    1      4
 4: 21.4   6   258 110 3.08 3.215 19.44  1  0    3  ·    258  110   3.08 3.215  19.44    1    0      3
 5: 18.7   8   360 175 3.15  3.44 17.02  0  0    3  ·    360  175   3.15  3.44  17.02    0    0      3
 6: 18.1   6   225 105 2.76  3.46 20.22  1  0    3  ·    225  105   2.76  3.46  20.22    1    0      3
 7:    ·   ·     ·   ·    ·     ·     ·  ·  ·    ·  ·      ·    ·      ·     ·      ·    ·    ·      ·
 8:   26   4 120.3  91 4.43  2.14  16.7  0  1    5  ·  120.3   91   4.43  2.14   16.7    0    1      5
 9: 30.4   4  95.1 113 3.77 1.513  16.9  1  1    5  ·   95.1  113   3.77 1.513   16.9    1    1      5
10: 15.8   8   351 264 4.22  3.17  14.5  0  1    5  ·    351  264   4.22  3.17   14.5    0    1      5
11: 19.7   6   145 175 3.62  2.77  15.5  0  1    5  ·    145  175   3.62  2.77   15.5    0    1      5
12:   15   8   301 335 3.54  3.57  14.6  0  1    5  ·    301  335   3.54  3.57   14.6    0    1      5
13: 21.4   4   121 109 4.11  2.78  18.6  1  1    4  ·    121  109   4.11  2.78   18.6    1    1      4
    carb
 1:    4
 2:    4
 3:    1
 4:    1
 5:    2
 6:    1
 7:    ·
 8:    2
 9:    2
10:    4
11:    6
12:    8
13:    2
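A rough way to estimate how many columns fit on one console line is to add up the printed width of each column and compare the running total with `getOption("width")`. The helper below, `n_cols_that_fit`, and its `rowname_width` argument are hypothetical; it only approximates how R lays out a data frame and is not a diagnosis of what peep does internally.

```r
# Hypothetical helper, not peep's actual code: count how many leading columns
# of a data frame fit into a console line of `width` characters.
n_cols_that_fit <- function(x, width = getOption("width"), rowname_width = 0) {
  # Printed width of a column: the wider of its header and its longest value.
  col_width <- vapply(seq_along(x), function(i) {
    max(nchar(names(x)[i]), nchar(format(x[[i]])))
  }, integer(1))
  # Assume one separating space in front of every column.
  used <- rowname_width + cumsum(col_width + 1L)
  sum(used <= width)
}

# Small made-up example: column widths are 2 ("22") and 3 ("yyy").
df <- data.frame(a = c(1, 22), bb = c("x", "yyy"))
n_cols_that_fit(df, width = 5)
#> [1] 1
```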

BTW: why are the rownames not padded with zeros?

Display rownames as a first column?

peep shows the rownames as numbers, which is useful because it gives an idea of the size of the data. See this example:

# options("width")
# $width
# [1] 80

library(peep)
# Pass the row and column names directly; x does not exist yet at this point
x <- matrix(1:100, ncol = 50,
            dimnames = list(c("gene1", "gene2"),
                            paste0("s", 1:50)))
x[, 1:3]
#       s1 s2 s3
# gene1  1  3  5
# gene2  2  4  6
# Great, now we know rownames have gene names!

peep(x)
#    s1 s2 s3 s4 s5 s6 s7 s8 s9 s10    s41 s42 s43 s44 s45 s46 s47 s48 s49 s50
# 1:  1  3  5  7  9 11 13 15 17  19  ·  81  83  85  87  89  91  93  95  97  99
# 2:  2  4  6  8 10 12 14 16 18  20  ·  82  84  86  88  90  92  94  96  98 100
# With peep, we now need a second step, rownames(x), to see the gene names.

Could the first column show the rownames, but only if they are not 1:n? Something like:

if (!identical(as.character(seq_len(nrow(x))), rownames(x))) { "add rownames as 1st column" }
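The check above could be fleshed out roughly as follows. This is only a sketch of the request, not code from peep: the helper name `with_rownames_column` and the default column name `"rowname"` are assumptions. It also treats a matrix without row names (where `rownames()` returns `NULL`) as having default row names.

```r
# Hypothetical helper, not part of peep: prepend row names as a column when
# they carry information beyond the default 1..n sequence.
with_rownames_column <- function(x, name = "rowname") {
  rn <- rownames(x)
  default_rn <- as.character(seq_len(nrow(x)))
  # NULL row names (plain matrix) or "1".."n" count as default: nothing to add.
  if (is.null(rn) || identical(rn, default_rn)) {
    return(as.data.frame(x))
  }
  df <- as.data.frame(x)
  rownames(df) <- NULL  # the names now live in their own column
  out <- cbind(data.frame(rn, stringsAsFactors = FALSE), df)
  names(out)[1] <- name
  out
}

m <- matrix(1:100, ncol = 50,
            dimnames = list(c("gene1", "gene2"), paste0("s", 1:50)))
with_rownames_column(m)[, 1:4]  # gene names appear in the first column
```

With the names stored as an ordinary column, they would survive any later trimming of rows and columns for display.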
