
Decimal mark customization (d3-dsv issue, closed)

d3 avatar d3 commented on May 2, 2024
Decimal mark customization

from d3-dsv.

Comments (8)

badosa avatar badosa commented on May 2, 2024 1

Yes, [email protected] works on Windows.


mbostock avatar mbostock commented on May 2, 2024

This library does not perform any number formatting or parsing, and so has no concept of decimal mark (other than the default string coercion behavior of JavaScript). You’ll need to format your numbers prior to generating DSV (say using a localized version of d3-format), and similarly specify your own number parser (perhaps also using d3-format) if you want this functionality.
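A minimal sketch of that "format first, serialize second" split, written without d3-dsv or d3-format so it runs on its own. The helpers `formatDecimal`, `escapeField` and `toDsv` are hypothetical stand-ins (for a localized number formatter and for a DSV formatter respectively), and the quoting rule only approximates RFC 4180:

```javascript
// Hypothetical helper: render a number with a custom decimal mark.
// Stand-in for a localized formatter such as one built with d3-format.
function formatDecimal(value, decimalMark) {
  return String(value).replace(".", decimalMark);
}

// Hypothetical helper: quote a field that contains the delimiter,
// a double quote, or a line break; double any embedded quotes.
function escapeField(text, delimiter) {
  const needsQuotes = text.includes(delimiter) || /["\r\n]/.test(text);
  return needsQuotes ? `"${text.replace(/"/g, '""')}"` : text;
}

// Stand-in for a DSV formatter: strings in, one delimited string out.
function toDsv(rows, columns, delimiter) {
  const lines = [columns, ...rows.map((row) => columns.map((c) => row[c]))];
  return lines
    .map((fields) => fields.map((f) => escapeField(String(f), delimiter)).join(delimiter))
    .join("\n");
}

const rows = [{ name: "fish", value: 1.23 }];

// Step 1: turn numbers into localized strings before serializing.
const localized = rows.map((d) => ({ ...d, value: formatDecimal(d.value, ",") }));

// Step 2: serialize. The serializer never needs to know which fields were numbers.
const csv = toDsv(localized, ["name", "value"], ",");
const ssv = toDsv(localized, ["name", "value"], ";");
console.log(csv);
console.log(ssv);
```

With a comma as field delimiter the localized value has to be quoted, while with a semicolon it does not, which is presumably why comma-decimal locales tend to pair with semicolon-delimited files.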


badosa avatar badosa commented on May 2, 2024

"This library does not perform any number formatting or parsing"

I can see this... For i18n's sake, though, it seems to me that it should. The solution you suggest seems a very complex way to get a usable CSV out of json2dsv in German, French, Spanish, Italian, Swedish, Norwegian, Danish, Dutch, Czech...

I'm aware that number handling is not mentioned in RFC 4180 (which is an Informational RFC) as it only refers to text-based fields. Because of this (everything is text), and as much as I dislike it, localized versions of CSV-consuming software usually require localized CSVs (they apply a localized conversion of strings to numbers). That makes CSV language-dependent, while JSON is not. Conversion tools between the two should, IMHO, take this into account.

But I understand your reasons (RFC 4180). I'm just arguing that the CLI of d3-dsv would benefit from considering CSV in (international) practice.


mbostock avatar mbostock commented on May 2, 2024

Decoupling the handling of delimiter-separated text fields from handling of numbers greatly simplifies the code (both the internal implementation and the interface). Requiring this library to know which fields are numbers would require the DSV format to store metadata, and there is no broadly-accepted convention for doing this. So while I appreciate your desire to make this process easier, I do not see a good way to make it easier than it already is.


badosa avatar badosa commented on May 2, 2024

Thank you, @mbostock, for taking the time to answer this.

I gather that the command-line interface is just a by-product of d3-dsv that might not deserve too much effort (apparently it didn't even deserve the inclusion of an important feature like columns that, on the other hand, is supported by dsv.format(rows[, columns])).

"Requiring this library to know which fields are numbers would require the DSV format to store metadata"

I understand the difficulty of adding this feature to dsv2json, but as you can guess from my last comment my focus is mainly on json2dsv. When the input is an array of objects, dsv.format(rows[, columns]) could take into account the type of each property in the first element of the array if --output-decimal is specified, and do the proper replacements ([dsv.format(rows[, columns][,decimalChar])]). This is very limited functionality and not bullet-proof (for example, the same property may have different types in different elements, as when null is used for a missing value in the first element of the array), but it seems quite a useful addition to the CLI in real-life scenarios.
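To illustrate the idea (and its weak spot), here is a rough sketch. `localizeNumbers` is a hypothetical function, not something d3-dsv provides, and the sample rows are made up; it only prepares the values and leaves the actual DSV serialization to whatever formatter is used afterwards:

```javascript
// Hypothetical: not part of d3-dsv. Sketches the proposal described above.
function localizeNumbers(rows, columns, decimalChar) {
  const first = rows[0] || {};
  // Decide which columns are numeric by looking only at the first row.
  const numeric = new Set(columns.filter((c) => typeof first[c] === "number"));
  return rows.map((row) => {
    const out = {};
    for (const c of columns) {
      const v = row[c];
      if (numeric.has(c) && typeof v === "number") {
        out[c] = String(v).replace(".", decimalChar); // swap the decimal mark
      } else {
        out[c] = v == null ? "" : String(v); // default coercion, empty for null
      }
    }
    return out;
  });
}

// Made-up sample rows where the first element has a number in "value".
const ok = localizeNumbers(
  [{ name: "fish", value: 1.23 }, { name: "fowl", value: 4.5 }],
  ["name", "value"],
  ","
);

// The limitation: a null in the first element hides the numeric column,
// so later numbers in that column keep the dot.
const missed = localizeNumbers(
  [{ name: "fish", value: null }, { name: "fowl", value: 4.5 }],
  ["name", "value"],
  ","
);
console.log(ok, missed);
```

Scanning every row instead of only the first would make the detection more robust, at the cost of a full pass over the data before any output is written.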

But, again, I understand the CLI (where this feature makes more sense) is not D3.org's priority, so this is probably an unnecessary and ugly addition to the d3-dsv module. That's why I ended up adding a similar functionality to my jsonstat-conv. This module converts JSON-stat (a format which has the needed metadata to detect number fields) to several flavors of JSON (and CSV) that can be used as an input of json2csv: the latter will receive number fields properly translated into strings with the requested decimal mark.

Field delimiter: comma; decimal mark: dot

curl http://ec.europa.eu/eurostat/wdds/rest/data/v2.1/json/en/tesem120?precision=1 | jsonstat2arrobj -b geo -d sex,age,unit -t | json2csv > unr.csv

Field delimiter: semicolon; decimal mark: comma

curl http://ec.europa.eu/eurostat/wdds/rest/data/v2.1/json/en/tesem120?precision=1 | jsonstat2arrobj -b geo -d sex,age,unit -k -t | json2csv -w ";" > unr.csv

By the way, this works perfectly on a Mac but on Windows json2csv returns:

Error: ENOENT: no such file or directory, stat 'C:\dev\stdin'
    at Error (native)

Could this have to do with the use of "/dev/stdin" in dash.js?


mbostock avatar mbostock commented on May 2, 2024

It’s not a question of CLI vs. API. The issue is that the API deals with string input and string output exclusively. Anything that is not a string is coerced to a string using JavaScript’s default behavior (which is not localizable, as far as I am aware).
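A quick way to see that default coercion, as a small sketch: every implicit number-to-string conversion below goes through the same language-defined rule, which always uses a dot as the decimal mark regardless of the machine's locale settings.

```javascript
const value = 1234.5;

// Four common ways a number ends up as a string without explicit formatting.
const coerced = [
  String(value),    // explicit conversion
  `${value}`,       // template literal
  value + "",       // string concatenation
  value.toString(), // Number.prototype.toString
];

console.log(coerced);
```

Locale-aware output only appears when it is requested explicitly (for example through a formatter), which is why it has to happen outside the string-in, string-out layer.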

So again, if you want to control the formatting of numbers to strings, you must format them before passing them to dsvFormat (or *2dsv). And if you want to control the parsing of strings into numbers, you must parse them after receiving them from dsvParse (or dsv2*).
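For the parsing direction, a minimal sketch of converting localized strings into numbers after the parser has returned them. `parseLocalized` is a hypothetical helper, and `parsedRows` is a hand-written stand-in for parser output (all fields are strings), with made-up values:

```javascript
// Hypothetical helper: parse a string that uses custom decimal and grouping marks.
function parseLocalized(text, decimalMark, groupMark) {
  const normalized = text
    .split(groupMark).join("")  // drop every grouping separator
    .replace(decimalMark, "."); // switch to the mark that Number() understands
  return Number(normalized);
}

// Shape of what a DSV parser hands back: every field is a string.
const parsedRows = [
  { name: "fish", value: "1,23" },
  { name: "whale", value: "1 234,5" },
];

// Convert the known-numeric column after parsing; other fields pass through.
const typedRows = parsedRows.map((d) => ({
  ...d,
  value: parseLocalized(d.value, ",", " "),
}));
console.log(typedRows);
```

The caller, not the DSV layer, decides which columns are numeric here. With d3-dsv, the optional row conversion function accepted by its parse methods would be a natural place to apply this kind of helper.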

If you want to do this on the command line, I recommend using ndjson-cli. For example, given the following CSV input:

name,value
fish,1.23

You can reformat the number column to a different locale like so:

csv2json -n < in.csv \
  | ndjson-map -r d3=d3-format '(d.value = d3.formatLocale({decimal: ",", thousands: " ", grouping: [3]}).format("")(+d.value), d)' \
  | json2csv -n \
  > out.csv

Which results in:

name,value
fish,"1,23"

You’ll need to npm install -g ndjson-cli d3-dsv d3-format to get the above to work.

The Windows issue is unrelated to this issue and my guess is it’s an issue with the rw library.


mbostock avatar mbostock commented on May 2, 2024

FYI, I’ve also released [email protected] as my fourth attempt to get rw working on Windows. If you uninstall and reinstall you should get the newer version, and hopefully that will make it work on Windows.


badosa avatar badosa commented on May 2, 2024

Thank you for the tip on [email protected]. Don't have a Windows machine beside me right now but I'll try it when I have one.

Thank you also for the sample code: I've used and love ndjson-cli but never tried d3-format.

