Coder Social home page Coder Social logo

pst's Introduction

pst

pst is a command line tool for processing and combining columns across column oriented files. pst can also compute row-wise statistics (such as mean, median, variance, etc).

usage

usage: pst <options> file1 file2 ...

options:
  -c="": compute statistics across column values in each output row.
    Please note that each value in the output has to be convertible into a float
    for this to work. The computed statistics are determined by a comma separated
    list of actions. The result of each action is printed as a separate column value.
    Currently supported compute actions are:
        - mean  : compute row mean
        - std   : compute row standard deviation
        - var   : compute row variance
        - median: compute row median
        - max   : compute maximum value of row
        - min   : compute minimum value of row
    Thus, "mean, std, median" will result in three columns per row, with the
    mean, standard deviation and median of the raw column values.
  -h=false: show basic usage info
  -i="": specify the input columns to extract. This flag is optional.
    The spec format is "<column list file1>|<column list file2>|..."
    where each column specifier is of the form col_i,col_j,col_k-col_n, ....
    If the number of specifiers is less than the number of files, the last
    specifier i will be applied to files i through N, where N is the total
    number of files provided. If this flag is not provided all input columns
    will be extracted.
  -n=1: number of threads (default: 1)
  -o="": specify the order in which to print the output columns. This flag is optional.
    The spec format is "i,j,k-l,m,..", where 0 < i,j,k,l,m, ... < numCol, and
    numCol is the total number of columns extracted from the input files.
    Columns can be specified multiple times and ranges are accepted. If this
    option is not provided the columns are pasted in the order in which they
    are extracted.
  -r="": specify which rows to process and output. This flag is optional.
    If not specified all rows will be output. Rows can be specified by a comma
    separated list of row IDs or row ID ranges. E.g., "1,2,4-8,22" will process
    rows 1, 2, 4, 5, 7, 22.
  -s="": column separator for input files. The default separator is whitespace.
  -t=" ": column separator for output files. The default separator is a single space.

Notes

The output file is assembled in memory and thus requires sufficient storage
to hold the complete final output data.

The input column specifiers are zero based and can include ranges. The end
of a range is not included in the output, i.e. the range 2-5 selects columns
2, 3, and 4.

Examples

pst -i "0,1" file1 file2 file3 > outfile

This command selects columns 0 and 1 from each of file1, file2, and file3
 and outputs them to outfile (which thus contains 6 columns).


pst -i "0,1|3" file1 file2 file3 > outfile

This invocation selects columns 0 and 1 from file1, and column 3 from file2
and file3. outfile contains 4 columns.


pst -i "0,1|3|4-5" file1 file2 file3 > outfile

This command selects column 0 and 1 from file1, column 3 from file2, and
columns 4 and 5 from file 3. outfile contains 5 columns.


pst -t "," -s ";" -i "0,1|3|4-5" file1 file2 file3 > outfile

This command splits the input files into columns with ';' as
separator. It selects column 0 and 1 from file1, column 3 from file2, and
columns 4 and 5 from file 3. outfile contains 5 columns each separated
by ','.


pst -c -t "," -s ";" -i "0,1|3|4-5" file1 file2 file3 > outfile

Same as above but instead of outputting 5 columns, it computes and prints
for each row the mean and variance across each 5 columns. Please note that
this assumes that each column entry can be converted into a float value.

pst's People

Contributors

haskelladdict avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.