inventaire / isbn3

This project forked from tadas-s/isbnjs

ISBN utils: parse, validate, format, audit

Home Page: http://inventaire.github.io/isbn3/

JavaScript 87.54% Shell 3.17% HTML 9.29%
isbn parse validate format audit isbn10 isbn13

isbn3's Introduction

isbn3

An ISBN JavaScript Library.

Please note that this is a fork of isbn2, which was a fork of the isbn package, itself forked from the original isbnjs project on Google Code.

Range data is generated from isbn-international.org data.

Added features compared to isbn2:

  • recover from common errors:
    • ignore bad hyphenation (ex: 978-1933988030)
  • modularize and update the code for ES6, in a class-less way
  • improve performance (see benchmark)
  • auto-update groups data every month

Demo

Install

From the command line:

npm install isbn3

Then in your JS file:

const ISBN = require('isbn3')

Alternatively, you can call the ES5 browserified version of the module from an HTML file, which sets the module object on window.ISBN:

<script type="application/javascript" src="./node_modules/isbn3/dist/isbn.js"></script>

See ./index.html or the live demo for an example.

Functions

parse

ISBN.parse('1-933988-03-7')
// => {
// source: '1-933988-03-7',
// isValid: true,
// isIsbn10: true,
// isIsbn13: false,
// group: '1',
// publisher: '933988',
// article: '03',
// check: '7',
// isbn13: '9781933988030',
// isbn13h: '978-1-933988-03-0',
// check10: '7',
// check13: '0',
// groupname: 'English language',
// isbn10: '1933988037',
// isbn10h: '1-933988-03-7'
// }

ISBN.parse('1933988037')
// => same result, but with source === '1933988037'

ISBN.parse('978-4-87311-336-4')
// => {
//   source: '978-4-87311-336-4',
//   isValid: true,
//   isIsbn10: false,
//   isIsbn13: true,
//   prefix: '978',
//   group: '4',
//   publisher: '87311',
//   article: '336',
//   check: '4',
//   isbn13: '9784873113364',
//   isbn13h: '978-4-87311-336-4',
//   check10: '9',
//   check13: '4',
//   groupname: 'Japan',
//   isbn10: '4873113369',
//   isbn10h: '4-87311-336-9'
// }

ISBN.parse('9784873113364')
// => same result, but with source === '9784873113364'

ISBN.parse('978-4873113364')
// => same result, but with source === '978-4873113364'

ISBN.parse('979-10-96908-02-8')
// => {
//   source: '979-10-96908-02-8',
//   isValid: true,
//   isIsbn10: false,
//   isIsbn13: true,
//   prefix: '979',
//   group: '10',
//   publisher: '96908',
//   article: '02',
//   check: '8',
//   isbn13: '9791096908028',
//   isbn13h: '979-10-96908-02-8',
//   check10: '6',
//   check13: '8',
//   groupname: 'France'
// }

ISBN.parse('not an isbn')
// => null

asIsbn13

ISBN.asIsbn13('4-87311-336-9')           // 9784873113364
ISBN.asIsbn13('4-87311-336-9', true)     // 978-4-87311-336-4

asIsbn10

ISBN.asIsbn10('978-4-87311-336-4')       // 4873113369
ISBN.asIsbn10('978-4-87311-336-4', true) // 4-87311-336-9
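
The arithmetic behind these two conversions can be sketched as follows. This is a minimal illustration of the standard ISBN check digit rules, not the isbn3 implementation: the helper names (check10, check13, toIsbn13, toIsbn10) are hypothetical, and the helpers only accept non-hyphenated input.

```javascript
// Illustrative helpers, not part of the isbn3 API.

// ISBN-10 check character for a 9-digit body: weights 10 down to 2, modulo 11.
function check10 (body9) {
  const sum = [...body9].reduce((acc, digit, i) => acc + Number(digit) * (10 - i), 0)
  const check = (11 - (sum % 11)) % 11
  return check === 10 ? 'X' : String(check)
}

// ISBN-13 check digit for a 12-digit body: alternating weights 1 and 3, modulo 10.
function check13 (body12) {
  const sum = [...body12].reduce((acc, digit, i) => acc + Number(digit) * (i % 2 === 0 ? 1 : 3), 0)
  return String((10 - (sum % 10)) % 10)
}

// Non-hyphenated ISBN-10 -> non-hyphenated ISBN-13 (always under the 978 prefix).
function toIsbn13 (isbn10) {
  const body = '978' + isbn10.slice(0, 9)
  return body + check13(body)
}

// Non-hyphenated ISBN-13 -> non-hyphenated ISBN-10, or null outside the 978 prefix.
function toIsbn10 (isbn13) {
  if (!isbn13.startsWith('978')) return null
  const body = isbn13.slice(3, 12)
  return body + check10(body)
}

console.log(toIsbn13('4873113369')) // 9784873113364
console.log(toIsbn10('9784873113364')) // 4873113369
console.log(toIsbn10('9791096908028')) // null
```

The check digit has to be recomputed in both directions because ISBN-10 and ISBN-13 use different weighting schemes.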

hyphenate

ISBN.hyphenate('9784873113364')          // 978-4-87311-336-4

audit

Get clues about possible mistakes in an ISBN.

For instance, if a French edition in your data has an ISBN-13 starting with 978-1-0, which would place it in an English language group, a prefix mistake may have been made somewhere, and the ISBN may actually start with 979-10 (a French group). This typically happens when a 979-prefixed ISBN-13 was converted to an ISBN-10 (which is wrong, as 979-prefixed ISBNs can't have an ISBN-10), and then converted back to an ISBN-13 with the 978 prefix. This is soooo wrong, but data is a dirty place I'm afraid.

ISBN.audit('9784873113364')
// {
//   "source": "9784873113364",
//   "validIsbn": true,
//   "groupname": "Japan",
//   "clues": []
// }

ISBN.audit('9781090648525')
// {
//   "source": "9781090648525",
//   "validIsbn": true,
//   "groupname": "English language",
//   "clues": [
//     {
//       "message": "possible prefix error",
//       "candidate": "979-10-90648-52-4",
//       "groupname": "France"
//     }
//   ]
// }

ISBN.audit('978-1-0906-4852-4')
// {
//   "source":"978-1-0906-4852-4",
//   "validIsbn":false,
//   "clues":[
//     {
//       "message":"checksum hints different prefix",
//       "candidate":"979-10-90648-52-4",
//       "groupname":"France"
//     }
//   ]
// }
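
The two clues shown above can be reproduced with a small amount of checksum arithmetic. The following is a simplified sketch, not the library's actual audit code: prefixClue and its helpers are hypothetical names, only the 978-1-0 / 979-10 confusion is covered, and candidates are returned without hyphens or group names.

```javascript
// Illustrative helpers, not part of the isbn3 API.

// ISBN-13 check digit for a 12-digit body: alternating weights 1 and 3, modulo 10.
function check13 (body12) {
  const sum = [...body12].reduce((acc, digit, i) => acc + Number(digit) * (i % 2 === 0 ? 1 : 3), 0)
  return String((10 - (sum % 10)) % 10)
}

function isValid13 (digits) {
  return /^\d{13}$/.test(digits) && check13(digits.slice(0, 12)) === digits[12]
}

// Returns a clue object, or null when the 978-1-0 / 979-10 confusion does not apply.
function prefixClue (isbn) {
  const digits = isbn.replace(/\D/g, '')
  if (!/^97810\d{8}$/.test(digits)) return null
  if (isValid13(digits)) {
    // Valid under 978: the 979 twin needs a recomputed check digit.
    const swappedBody = '979' + digits.slice(3, 12)
    return { message: 'possible prefix error', candidate: swappedBody + check13(swappedBody) }
  }
  // Invalid under 978: check whether the same digits are valid under 979.
  const swapped = '979' + digits.slice(3)
  if (isValid13(swapped)) {
    return { message: 'checksum hints different prefix', candidate: swapped }
  }
  return null
}

console.log(prefixClue('9781090648525'))
// { message: 'possible prefix error', candidate: '9791090648524' }
console.log(prefixClue('978-1-0906-4852-4'))
// { message: 'checksum hints different prefix', candidate: '9791090648524' }
console.log(prefixClue('9784873113364')) // null
```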

groups

ISBN.groups['978-99972']
// => {
//   name: 'Faroe Islands',
//   ranges: [ [ '0', '4' ], [ '50', '89' ], [ '900', '999' ] ]
// }
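
As an illustration of why this data matters, here is a rough sketch of how publisher ranges can drive hyphenation. It assumes the data shape shown above and uses a hand-copied sample object; hyphenateWithRanges is a hypothetical helper, not necessarily how isbn3 does it, and it does not validate the check digit.

```javascript
// Sample data copied from the ISBN.groups['978-99972'] example above.
const sampleGroups = {
  '978-99972': {
    name: 'Faroe Islands',
    ranges: [['0', '4'], ['50', '89'], ['900', '999']]
  }
}

// Hypothetical helper: hyphenates a 13-digit string using the given groups data.
function hyphenateWithRanges (digits, groups) {
  const prefix = digits.slice(0, 3)
  // Group identifiers are 1 to 5 digits long: try each length in turn.
  for (let len = 1; len <= 5; len++) {
    const group = digits.slice(3, 3 + len)
    const entry = groups[`${prefix}-${group}`]
    if (!entry) continue
    // Publisher and article digits, without the final check digit.
    const rest = digits.slice(3 + len, 12)
    for (const [start, end] of entry.ranges) {
      const publisher = rest.slice(0, start.length)
      // Same-length digit strings compare correctly as plain strings.
      if (publisher >= start && publisher <= end) {
        const article = rest.slice(start.length)
        return [prefix, group, publisher, article, digits[12]].join('-')
      }
    }
  }
  return null
}

console.log(hyphenateWithRanges('9789997250018', sampleGroups)) // 978-99972-50-01-8
console.log(hyphenateWithRanges('9784873113364', sampleGroups)) // null (group not in the sample)
```

The length of the matching range bounds decides how many digits belong to the publisher, which is why outdated ranges lead to wrong hyphenation.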

CLI

Installing the module globally (npm install -g isbn3) will make the following commands available from your terminal.

If you installed locally (npm install isbn3), the commands can be accessed from the project directory at ./node_modules/.bin, or just by their filename in npm scripts.

isbn

isbn <isbn> <format>

Valid ISBN input examples:
- 9781491574317
- 978-1-4915-7431-7
- 978-1491574317
- isbn:9781491574317
- 9781-hello-491574317
- 030433376X
- 0-304-33376-X

Formats:
- h: hyphen
- n: no hyphen
- 13: ISBN-13 without hyphen
- 13h: ISBN-13 with hyphen (default)
- 10: ISBN-10 without hyphen
- 10h: ISBN-10 with hyphen
- prefix, group, publisher, article, check, check10, check13: output ISBN part value
- data: output all this data as JSON

isbn-audit

Return the results of the audit function as JSON

isbn-audit <isbn>

This command also accepts a stream of newline-delimited ISBNs and outputs a stream of newline-delimited JSON, where each line corresponds to an ISBN that is either invalid or suspected of being malformed. Valid ISBNs with no detected malformation produce no output.

echo '
9784873113364
9781090648525
978-1-0906-4852-4
' | isbn-audit > audit_data.ndjson
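
The resulting file can then be post-processed line by line. A minimal sketch, assuming each line has the shape of the audit results documented above; parseNdjson is a hypothetical helper and the sample lines are hand-written rather than actual command output.

```javascript
// Hypothetical helper: turns newline-delimited JSON text into an array of objects.
function parseNdjson (text) {
  return text
    .split('\n')
    .filter(line => line.trim() !== '')
    .map(line => JSON.parse(line))
}

// Sample lines shaped like the audit results documented above.
const sample = [
  '{"source":"9781090648525","validIsbn":true,"groupname":"English language","clues":[{"message":"possible prefix error","candidate":"979-10-90648-52-4","groupname":"France"}]}',
  '{"source":"978-1-0906-4852-4","validIsbn":false,"clues":[{"message":"checksum hints different prefix","candidate":"979-10-90648-52-4","groupname":"France"}]}'
].join('\n')

// Keep only the entries that failed validation.
const invalid = parseNdjson(sample).filter(entry => !entry.validIsbn)
console.log(invalid.map(entry => entry.source)) // [ '978-1-0906-4852-4' ]
```

With a real file, the text would come from something like fs.readFileSync('audit_data.ndjson', 'utf8').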

isbn-checksum

Return the checksum that would correspond to the passed input (ignoring its current checksum if any).

isbn-checksum <isbn>

isbn-checksum 978-4-87311-336-4
# {
#   "input": "978-4-87311-336-4",
#   "checksumCalculatedFrom": "978487311336",
#   "checksum": "4",
#   "isbn": "978-4-87311-336-4"
# }

isbn-checksum 978-4-87311-336-1
# {
#   "input": "978-4-87311-336-1",
#   "checksumCalculatedFrom": "978487311336",
#   "checksum": "4",
#   "isbn": "978-4-87311-336-4"
# }

isbn-checksum 978-4-87311-336
# {
#   "input": "978-4-87311-336",
#   "checksumCalculatedFrom": "978487311336",
#   "checksum": "4",
#   "isbn": "978-4-87311-336-4"
# }

isbn-checksum 978487311336
# {
#   "input": "978487311336",
#   "checksumCalculatedFrom": "978487311336",
#   "checksum": "4",
#   "isbn": "978-4-87311-336-4"
# }
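
The four calls above give the same checksum because only the first 12 digits are used. A minimal sketch of that behaviour for ISBN-13 input, under the assumption that hyphens are simply stripped; checksumFor is a hypothetical helper, and unlike the command it returns the isbn field without hyphens.

```javascript
// Hypothetical helper mirroring the documented output shape, for ISBN-13 input only.
function checksumFor (input) {
  const digits = input.replace(/\D/g, '')
  if (digits.length !== 12 && digits.length !== 13) return null
  // Drop the existing check digit, if any.
  const body = digits.slice(0, 12)
  // Alternating weights 1 and 3, modulo 10.
  const sum = [...body].reduce((acc, digit, i) => acc + Number(digit) * (i % 2 === 0 ? 1 : 3), 0)
  const checksum = String((10 - (sum % 10)) % 10)
  return { input, checksumCalculatedFrom: body, checksum, isbn: body + checksum }
}

console.log(checksumFor('978-4-87311-336-1').checksum) // 4
console.log(checksumFor('978487311336').isbn) // 9784873113364
```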


Benchmark

Indicative benchmark, nothing super scientific, YMMV.

Running npm run benchmark a few times on a Linux machine with Node.js v8.12 produced the following measurements on average:

  • isbn3
    • load module: 6ms
    • parse 4960 non-hyphenated ISBNs in around 110ms
  • isbn2
    • load module: 4.5ms
    • parse 4960 non-hyphenated ISBNs in around 285ms

The difference is mainly due to the generation of a map of groups in isbn3, which takes more time at initialization but makes group lookups much faster.
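
To illustrate the trade-off in general terms, here is a sketch contrasting a scan over group records with a map built once at load time. The record layout and function names are assumptions made for the example and do not reflect the actual internals of either library.

```javascript
// Sample records; the layout is hypothetical, the group names come from the examples above.
const records = [
  { prefix: '978', group: '4', name: 'Japan' },
  { prefix: '978', group: '99972', name: 'Faroe Islands' },
  { prefix: '979', group: '10', name: 'France' }
]

// Without an index: scan the records on every lookup.
function findByScan (digits) {
  return records.find(record => digits.startsWith(record.prefix + record.group)) || null
}

// With an index: pay the construction cost once, at load time...
const index = new Map(records.map(record => [record.prefix + record.group, record]))

// ...then each lookup costs at most five Map reads (group identifiers are 1 to 5 digits long).
function findByIndex (digits) {
  for (let len = 1; len <= 5; len++) {
    const record = index.get(digits.slice(0, 3 + len))
    if (record) return record
  }
  return null
}

console.log(findByScan('9791096908028').name) // France
console.log(findByIndex('9791096908028').name) // France
```

With only three records the scan is just as fast; the index pays off once there are hundreds of groups and thousands of lookups.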

Development

Test Suite

To run the lint/test suite use:

npm test

Update Groups data

Groups data is fetched from isbn-international.org, and is critical to how this lib parses ISBNs. Unfortunately, those groups aren't fixed once and for all, so the data needs to be updated periodically.

Once a month, a CI job takes care of updating ISBN groups data and publishing a patch version.

To get the latest data, you thus just need to update to the latest version (beware of breaking changes if that makes you switch to a new major version though):

npm install isbn3@latest

isbn3's People

Contributors

bcherny, clemlatz, jum-s, makepanic, maxlath, mvolz, pbakondy, roman991, tadas-s

isbn3's Issues

Consider caching isbn definitions on load

The ISBN definitions seem to be updated fairly frequently as far as I can tell. Would it be possible to make a request on loading the library for the new definitions? The hard-coded ones could be used as back-up if the library is offline / the website is down. If that seems like something that we want to do I could make a PR.
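
A rough sketch of what such a fallback could look like, purely as an illustration of the proposal: loadGroups, fetchGroups and bundledGroups are hypothetical names, and nothing here exists in isbn3.

```javascript
// Stand-in for the hard-coded data shipped with the library.
const bundledGroups = {
  '978-99972': {
    name: 'Faroe Islands',
    ranges: [['0', '4'], ['50', '89'], ['900', '999']]
  }
}

// fetchGroups is a hypothetical async function, supplied by the caller,
// that resolves to a fresh groups object.
async function loadGroups (fetchGroups, fallback = bundledGroups) {
  try {
    const fresh = await fetchGroups()
    // Ignore obviously unusable responses.
    if (fresh && typeof fresh === 'object' && Object.keys(fresh).length > 0) return fresh
  } catch (err) {
    // Offline, or the remote site is down: fall through to the bundled data.
  }
  return fallback
}

// Simulate being offline: the bundled data is returned.
loadGroups(async () => { throw new Error('offline') })
  .then(groups => console.log(Object.keys(groups))) // [ '978-99972' ]
```

Passing the fetcher as a parameter keeps the sketch independent of any particular HTTP client and makes the offline path easy to exercise.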

Add typing to isbn3

Consider adding types for isbn3 to allow usage with TypeScript, e.g.:

interface ISBNAuditObject {
    source: string,
    validIsbn: boolean,
    groupname?: string,
    clues: Array<{
        message: string,
        candidate: string,
        groupname: string
    }>
}

interface ISBNObject {
    source: string,
    isValid: boolean,
    isIsbn10: boolean,
    isIsbn13: boolean,
    prefix?: string,
    group: string,
    publisher: string,
    article: string,
    check: string,
    isbn13?: string,
    isbn13h?: string,
    check10: string,
    check13: string,
    groupname: string,
    isbn10?: string,
    isbn10h?: string
}

declare module "isbn3" {
    // The optional `hyphen` flag follows the asIsbn13/asIsbn10 examples in the README.
    export function parse(isbn: string): ISBNObject | null;
    export function asIsbn13(isbn: string, hyphen?: boolean): string;
    export function asIsbn10(isbn: string, hyphen?: boolean): string;
    export function hyphenate(isbn: string): string;
    export function audit(isbn: string): ISBNAuditObject;
    export const groups: Record<string, {
        name: string,
        ranges: Array<[string, string]>
    }>;
}

Make parser 70x faster?

Hi, I have been working for some time on a really fast ISBN parser/formatter in Java. For this Java library (https://github.com/creativecouple/isbn-validation-java/), I recently found a way to increase the parsing throughput from 6k to 13k ops/millisecond (roughly 75 nanoseconds per parse operation).

When trying out this approach for other programming languages, I found your isbn3 NPM library.
Your benchmark script was not able to measure that tiny amount of time correctly, so I put a 1000x loop around the parsing like this:

for (let i=0; i<1000; i++) {
  const data = isbns.map(isbn => parse(isbn))
}
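
For reference, a self-contained way to time such a repeated loop could look like the sketch below. It uses a stand-in parse function and a two-item input list so that it runs on its own; timeRepeated is a hypothetical helper, not the project's benchmark script.

```javascript
// Stand-in workload so that the sketch runs on its own; swap in the real parse function.
const parse = isbn => isbn.replace(/-/g, '')
const isbns = ['978-4-87311-336-4', '1-933988-03-7']

// Runs fn over all inputs `repeats` times and reports the average cost per call.
function timeRepeated (fn, inputs, repeats) {
  const start = process.hrtime.bigint()
  for (let i = 0; i < repeats; i++) {
    inputs.map(input => fn(input))
  }
  const elapsedNs = process.hrtime.bigint() - start
  const operations = repeats * inputs.length
  return { operations, nsPerOperation: Number(elapsedNs) / operations }
}

const result = timeRepeated(parse, isbns, 1000)
console.log(result.operations) // 2000
console.log(`${result.nsPerOperation.toFixed(1)} ns per operation`)
```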

My question is: Are you interested in rewriting your parsing engine?
Otherwise I would try to create a new npm package with that approach.

I compared your latest version of isbn3 to the old npm lib isbn and then my temporary prototype using either of these different imports:

const { parse } = require('..') // your latest version from Github
const { ISBN: { parse } } = require('isbn') // using npm i --no-save isbn (7 years old!!)
const { parse } = require('isbn-validation-js') // using npm i --no-save git://github.com/creativecouple/isbn-validation-java#tmp-javascript-version

This is the result on my machine with these three approaches:

ISBN3

$ npm run benchmark

load module: 5.498ms
parsed 1000x 5640 non-hyphenated ISBNs in: 1:06.333 (m:ss.mmm)
ISBN 978-0-00-443799-6 Group Name English language

ISBN (7 years old!!)

$ npm run benchmark

load module: 1.033ms
parsed 1000x 5640 non-hyphenated ISBNs in: 10.214s
ISBN 978-0-00-443799-6 Group Name English speaking area

my prototype from https://github.com/creativecouple/isbn-validation-java/tree/tmp-javascript-version

$ npm run benchmark

load module: 1.425ms
parsed 1000x 5640 non-hyphenated ISBNs in: 975.009ms
ISBN 978-0-00-443799-6 Group Name English language

So the old isbn package is still faster than your current version, but as you see it is possible to go sub-second for 5,640,000 parsing operations.
