PSV - Pipe Separated Values

CSV without doubts!

PSV is a text-based data format for tabular data. It is similar to CSV in function, but more strictly defined. Although RFC 4180 exists for CSV, in practice there are many incompatibilities between CSV implementations. For example, opening a CSV file in a spreadsheet program often brings up a plethora of (mostly technical) options. The goal of PSV is to simplify reading and writing tabular data by defining as many technical aspects as possible.

PSV in a Nutshell

  1. PSV files are text files encoded with UTF-8
  2. Every line represents a row delimited by LF or CRLF characters
  3. The last line has no line ending
  4. Every row consists of one or more fields separated by a pipe character
  5. There are no leading or trailing pipe characters
  6. Some characters are escaped by a leading backslash
    • carriage return: \r
    • line feed: \n
    • backslash: \\
    • pipe character: \|
  7. All other characters preceded by a backslash are treated as is

Comparison to CSV (RFC 4180)

One line sample

CSV:

aaa,bbb,ccc

PSV:

aaa|bbb|ccc

Double-quoted fields and multiple lines

CSV:

"aaa","bbb","ccc"<CRLF>
zzz,yyy,xxx

PSV:

aaa|bbb|ccc<CRLF>
zzz|yyy|xxx

Fields containing line breaks

CSV:

"aaa","b<CRLF>
bb","ccc"<CRLF>
zzz,yyy,xxx

PSV:

aaa|b\nbb|ccc<CRLF>
zzz|yyy|xxx

Fields containing separators and quotes

CSV:

"aaa","b""bb","ccc"<CRLF>
zzz,"yy,y",xxx

PSV:

aaa|b"bb|ccc<CRLF>
zzz|yy\|y|xxx

PSV in Depth

1. PSV files are text files encoded with UTF-8

There are no doubts about the encoding. It's always UTF-8. No BOM!
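
A minimal Python sketch of how a reader might enforce this rule. The function name `decode_psv` is hypothetical, and rejecting a BOM with an error is an assumption: the rule only says that a BOM must not be present, not what a parser should do when it finds one.

```python
import codecs


def decode_psv(data: bytes) -> str:
    """Decode raw bytes of a PSV stream (hypothetical helper)."""
    # Assumption: a leading UTF-8 BOM is treated as an error.
    if data.startswith(codecs.BOM_UTF8):
        raise ValueError("PSV streams must not start with a UTF-8 BOM")
    # Strict decoding: invalid UTF-8 raises UnicodeDecodeError.
    return data.decode("utf-8")


print(decode_psv("aaa|bbb|ccc".encode("utf-8")))  # aaa|bbb|ccc
```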

2. Every line represents a row delimited by LF or CRLF characters

Since carriage return and line feed characters inside fields are escaped, a parser can assume that an unescaped LF or CRLF always represents the end of a row.

A carriage return can therefore simply be ignored when parsing a PSV stream.
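
A minimal Python sketch of this rule, assuming the whole stream has already been decoded into a string. The function name `split_rows` is illustrative and not part of the specification.

```python
def split_rows(text: str) -> list[str]:
    """Split a decoded PSV stream into raw (still escaped) rows."""
    # A literal CR never occurs inside field content (it is written as
    # backslash + "r"), so every raw CR belongs to a CRLF line ending
    # and can be dropped before splitting on LF.
    return text.replace("\r", "").split("\n")


print(split_rows("aaa|bbb|ccc\r\nzzz|yyy|xxx"))
# ['aaa|bbb|ccc', 'zzz|yyy|xxx']
```

The escaped form `\r` inside a field is left untouched here, because it consists of a backslash and the letter `r` rather than a real carriage return.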

3. The last line has no line ending

If the last line in a PSV stream ends with a CRLF or LF line ending, the parser will create another row with one empty field.
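
On the writing side, this means the line ending goes between rows and never after the last one. A small Python sketch, assuming the rows are already encoded as PSV lines; `join_rows` is a hypothetical helper name.

```python
def join_rows(rows: list[str], newline: str = "\r\n") -> str:
    """Join already-encoded PSV rows; the line ending goes between rows only."""
    if newline not in ("\n", "\r\n"):
        raise ValueError("PSV rows are delimited by LF or CRLF")
    return newline.join(rows)


stream = join_rows(["aaa|bbb", "zzz|yyy"])
print(repr(stream))  # 'aaa|bbb\r\nzzz|yyy'

# A trailing line ending is read back as one more row with an empty field:
print((stream + "\r\n").replace("\r", "").split("\n"))
# ['aaa|bbb', 'zzz|yyy', '']
```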

4. Every row consists of one or more fields separated by a pipe character

Every row has at least one field. Multiple fields are separated by the pipe character "|".

5. There are no leading or trailing pipe characters

A leading or trailing pipe character in a line represents an additional empty field.
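
The effect can be seen in a small Python sketch that splits one raw row at unescaped pipe characters. The name `split_fields` is hypothetical, and the fields are returned still escaped.

```python
def split_fields(row: str) -> list[str]:
    """Split one raw PSV row at unescaped pipes; fields stay escaped."""
    fields, current, i = [], [], 0
    while i < len(row):
        ch = row[i]
        if ch == "\\" and i + 1 < len(row):
            # Keep the escape pair together so an escaped pipe does not split.
            current.append(row[i:i + 2])
            i += 2
        elif ch == "|":
            fields.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    fields.append("".join(current))
    return fields


print(split_fields("aaa|bbb"))    # ['aaa', 'bbb']
print(split_fields("|aaa|bbb|"))  # ['', 'aaa', 'bbb', '']
```

An empty line yields a single empty field, which matches the rule that every row has at least one field.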

6. Some characters are escaped by a leading backslash

The following characters in a field's content are treated specially and are escaped with a leading backslash:

  • carriage return: \r
  • line feed: \n
  • backslash: \\
  • pipe character: \|

This is one of the core aspects of PSV!
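
A writer could apply the four escapes above roughly as follows. This is a sketch in Python, and `escape_field` is a hypothetical name, not something the specification defines.

```python
def escape_field(value: str) -> str:
    """Escape one field value for writing to a PSV row."""
    # Backslash is handled first so that the backslashes added by the
    # later replacements are not escaped a second time.
    return (
        value.replace("\\", "\\\\")
        .replace("\r", "\\r")
        .replace("\n", "\\n")
        .replace("|", "\\|")
    )


print(escape_field("b\nbb"))  # b\nbb  (backslash followed by the letter n)
print(escape_field("yy|y"))   # yy\|y
print("|".join(escape_field(f) for f in ["zzz", "yy|y", "xxx"]))  # zzz|yy\|y|xxx
```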

7. All other characters preceded by a backslash are treated as is

A backslash can theoretically occur anywhere in a stream. If the following character is none of the ones listed in Rule 6, the backslash is ignored and the character is taken as is.
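
Rules 6 and 7 together can be sketched as a single decoding function in Python. The name `unescape_field` is hypothetical. Two points are assumptions of this sketch rather than statements of the specification: an unknown escape drops the backslash and keeps the following character, and a lone backslash at the very end of a field is kept unchanged.

```python
# Characters with a special meaning after a backslash (Rule 6).
_ESCAPES = {"r": "\r", "n": "\n", "\\": "\\", "|": "|"}


def unescape_field(raw: str) -> str:
    """Decode one raw PSV field into its actual content."""
    out, i = [], 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\" and i + 1 < len(raw):
            nxt = raw[i + 1]
            # Known escapes are translated; any other character is kept
            # as is and the backslash is dropped (assumed reading of Rule 7).
            out.append(_ESCAPES.get(nxt, nxt))
            i += 2
        else:
            # Ordinary character, or a trailing lone backslash (assumption).
            out.append(ch)
            i += 1
    return "".join(out)


print(repr(unescape_field(r"b\nbb")))  # 'b\nbb' (contains a real line feed)
print(unescape_field(r"yy\|y"))        # yy|y
print(unescape_field(r"a\xb"))         # axb
```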

Implementations
