csv-spectrum's Introduction

csv-spectrum

A collection of CSV files that serves as an acid test for CSV parsing libraries. JSON versions of the CSVs are included for verification purposes.

The goal of this repository is to capture test cases that represent the entire CSV spectrum.

Please use these in your test suites and send contributions in the form of more/improved test cases.

It is also a Node module that you can require() in your tests.

Some CSVs here were included from csvkit.

https://github.com/maxogden/binary-csv uses csv-spectrum and passes all tests

programmatic usage

var spectrum = require('csv-spectrum')
spectrum(function(err, data) {
  // data is an array of objects holding the csv and json versions of each test
})

data looks like this:

[ { csv: <Buffer 66 69 72 73 74 2c 6c 61 73 74 2c 61 64 64 72 65 73 73 2c 63 69 74 79 2c 7a 69 70 0a 4a 6f 68 6e 2c 44 6f 65 2c 31 32 30 20 61 6e 79 20 73 74 2e 2c 22 41 ...>,
    json: <Buffer 5b 0a 20 20 7b 0a 20 20 20 20 22 66 69 72 73 74 22 3a 20 22 4a 6f 68 6e 22 2c 0a 20 20 20 20 22 6c 61 73 74 22 3a 20 22 44 6f 65 22 2c 0a 20 20 20 20 22 ...>,
    name: 'comma_in_quotes' },
  { csv: <Buffer 61 2c 62 0a 31 2c 22 68 61 20 22 22 68 61 22 22 20 68 61 22 0a 33 2c 34 0a>,
    json: <Buffer 5b 0a 20 20 7b 0a 20 20 20 20 22 61 22 3a 20 22 31 22 2c 0a 20 20 20 20 22 62 22 3a 20 22 68 61 20 5c 22 68 61 5c 22 20 68 61 22 0a 20 20 7d 2c 0a 20 20 ...>,
    name: 'escaped_quotes' }
  // etc
]
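Because `csv` and `json` arrive as Buffers, a test will usually decode them first. The following is a minimal sketch, assuming both buffers hold UTF-8 text. `decodeCase` is a hypothetical helper, not part of csv-spectrum. The sample entry is rebuilt by hand from the `escaped_quotes` bytes printed above, and its JSON is an assumed reconstruction because the printed buffer is truncated.

```javascript
// Hypothetical helper: turn one csv-spectrum entry into plain JS values.
// Assumes both buffers hold UTF-8 text.
function decodeCase(entry) {
  return {
    name: entry.name,
    csv: entry.csv.toString('utf8'),
    expected: JSON.parse(entry.json.toString('utf8'))
  }
}

// Stand-in for one element of `data`, so the sketch runs without the module.
var sample = {
  name: 'escaped_quotes',
  csv: Buffer.from('a,b\n1,"ha ""ha"" ha"\n3,4\n'),
  // Assumed JSON: the buffer printed above is cut off after the first record.
  json: Buffer.from('[{"a":"1","b":"ha \\"ha\\" ha"},{"a":"3","b":"4"}]')
}

var decoded = decodeCase(sample)
console.log(decoded.name, decoded.expected.length)
```

In a real test, each element of `data` from the callback would be passed to the helper in place of `sample`.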

example usage in a test might be:

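var spectrum = require('csv-spectrum')
spectrum(function(err, data) {
  if (err) throw err
  console.log('testing ' + data[0].name)
  // csv2json stands in for the parser under test; parsed rows need a deep comparison
  t.deepEqual(csv2json(data[0].csv), JSON.parse(data[0].json))
})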

csv-spectrum's People

Contributors

klaemo, max-mapper, morganrallen

csv-spectrum's Issues

Create LICENSE file

Would you please create a proper LICENSE file, and include both your copyright information and the text of the license in that file? Not only does it make it easier for people to find the license for your code, it also makes life easier for people like myself who package modules like this in Linux distributions.

Thanks!

readme.md Issue - typo

You have: vsr spectrum = require('csv-spectrum')

It should be var spectrum...

Same issue is on npmjs.org.

Testing Excel CSV quirks

Excel has numerous quirks with regard to CSV generation and reading. For example, if a CSV file starts with "ID", Excel treats it as SYLK: https://support.microsoft.com/en-us/kb/323626

The prescribed workaround is to preface the ID with single quotes. Should there be a test of this condition?

More generally, is the goal for RFC4180 compliance or Excel compliance?
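A test for this condition could start from a check like the one below. It is only a sketch: `startsWithID` is a hypothetical helper, and it matches the literal characters "ID" exactly as quoted in this issue, without trying to model the rest of Excel's behaviour.

```javascript
// Hypothetical check: does this CSV begin with the two characters "ID"?
// Accepts either a Buffer (as csv-spectrum provides) or a string.
function startsWithID(csv) {
  var text = Buffer.isBuffer(csv) ? csv.toString('utf8') : String(csv)
  return text.slice(0, 2) === 'ID'
}

console.log(startsWithID('ID,name\n1,Ada\n'))
console.log(startsWithID(Buffer.from('first,last\nJohn,Doe\n')))
```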

Issue with a pair of double quotes representing blank field

In my experience, two consecutive double quotes should only be used in a CSV file to escape a single double quote in a data record, and such a data record needs to be enclosed in a pair of double quotes. A pair of double quotes should never be used to express a blank field, as it is redundant.

See section 9.3 in http://mastpoint.curzonnassau.com/csv-1203/csv-1203.pdf for a definition of this ideal.

Presenting a pair of double quotes to represent blank text in a CSV throws off parsers that try to conform to the most common way CSV files are written, which closely resembles what the csv-1203 'standard' is trying to illustrate. For instance, Excel will never write this.

Therefore, IMO the tests for 'deepEqual' and 'empty_crlf' both deviate from common standards and should be avoided for CSV parsers that follow strict CSV rules.

Maybe you could consider adding "strict" versus "loose" in your test definitions?
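To make the distinction concrete, here is a rough sketch of a single-record parser with a hypothetical `strict` option. In loose mode a quoted empty field (`""`) parses as an empty string; in strict mode it is rejected, following the reading described in this issue. The function name and the option are assumptions for illustration, not anything csv-spectrum defines.

```javascript
// Split one CSV record (no embedded newlines) into fields.
// opts.strict (hypothetical flag): reject "" used as an empty field.
function parseRecord(line, opts) {
  var strict = Boolean(opts && opts.strict)
  var fields = []
  var field = ''
  var inQuotes = false
  for (var i = 0; i < line.length; i++) {
    var ch = line[i]
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        field += '"' // doubled quote inside a quoted field is a literal quote
        i++
      } else if (ch === '"') {
        // closing quote; an empty field here means the input was ""
        if (strict && field === '') throw new Error('quoted empty field at index ' + i)
        inQuotes = false
      } else {
        field += ch
      }
    } else if (ch === '"') {
      inQuotes = true
    } else if (ch === ',') {
      fields.push(field)
      field = ''
    } else {
      field += ch
    }
  }
  fields.push(field)
  return fields
}

console.log(parseRecord('1,"ha ""ha"" ha"'))
console.log(parseRecord('a,"",c'))
```

Calling `parseRecord('a,"",c', { strict: true })` throws, while `parseRecord('a,,c', { strict: true })` is accepted, which is the kind of split a "strict" versus "loose" test definition would need to encode.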

publish on npm

Would it make sense to publish this repo to npm?

That way at least in node land we could depend on it in our parsers.
