jhermsmeier / node-vcf
A not so forgiving vCard / vcf parser
License: MIT License
The library appears to run into issues if a vcf file contains multiple vCards. What I see is the records from all the vCards are merged into one entry.
For example (trimmed):
vCard {
version: '3.0',
data:
{ version:
[ [String: '3.0'],
[String: '3.0'],
[String: '3.0'],
[String: '3.0'],
[String: '3.0'],
[String: '3.0'],
[String: '3.0'],
[String: '3.0'],
[String: '3.0'],
[String: '3.0'],
[String: '3.0'],
[String: '3.0'],
[String: '3.0'],
[String: '3.0'] ],
prodid:
[ [String: '-//Apple Inc.//Mac OS X 10.12.1//EN'],
[String: '-//Apple Inc.//Mac OS X 10.12.1//EN'],
[String: '-//Apple Inc.//Mac OS X 10.12.1//EN'],
[String: '-//Apple Inc.//Mac OS X 10.12.1//EN'],
[String: '-//Apple Inc.//Mac OS X 10.12.1//EN'],
[String: '-//Apple Inc.//Mac OS X 10.12.1//EN'],
[String: '-//Apple Inc.//Mac OS X 10.12.1//EN'],
[String: '-//Apple Inc.//Mac OS X 10.12.1//EN'],
[String: '-//Apple Inc.//Mac OS X 10.12.1//EN'],
[String: '-//Apple Inc.//Mac OS X 10.12.1//EN'],
[String: '-//Apple Inc.//Mac OS X 10.12.1//EN'],
[String: '-//Apple Inc.//Mac OS X 10.12.1//EN'],
[String: '-//Apple Inc.//Mac OS X 10.12.1//EN'],
[String: '-//Apple Inc.//Mac OS X 10.12.1//EN'] ],
n:
[ [String: 'Matters;Sylvie;;;'],
[String: 'Dawson;Jesse;;;'],
[String: 'Kelsdaughter;Jen;;;'],
[String: ';;;;'],
[String: 'Jameson;Jen;;;'],
[String: ';;;;'],
[String: 'Dawson;Marc;;;'],
[String: ';Agathe;;;'],
[String: ';;;;'],
[String: 'Doe;John;;;'],
[String: 'Vino;Vito;;;'],
[String: ';Alexei;;;'],
[String: 'Boucher;Marie-Claude;;;'],
[String: ';;;;'] ],
...
It would be useful to have this scenario supported, even if it is via a new function parseMulti(), so as not to break the assumptions of code that depends on the current design.
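As a stopgap, the input could be cut into one string per card before parsing. The following is a minimal sketch of that idea; `splitVCards` is a hypothetical helper written for this example, not part of node-vcf, and it assumes every card is delimited by its own BEGIN:VCARD / END:VCARD pair.

```javascript
// Hypothetical helper (not part of node-vcf): cut a multi-card string
// into one string per BEGIN:VCARD ... END:VCARD block.
function splitVCards(text) {
  const cards = [];
  let current = null;
  for (const line of text.split(/\r\n|\r|\n/)) {
    if (/^BEGIN:VCARD\s*$/i.test(line)) {
      current = []; // start collecting a new card
    }
    if (current) {
      current.push(line);
      if (/^END:VCARD\s*$/i.test(line)) {
        cards.push(current.join('\r\n') + '\r\n');
        current = null; // ignore anything between cards
      }
    }
  }
  return cards;
}

const sample = [
  'BEGIN:VCARD', 'VERSION:3.0', 'N:Doe;John;;;', 'END:VCARD',
  'BEGIN:VCARD', 'VERSION:3.0', 'N:Vino;Vito;;;', 'END:VCARD',
].join('\r\n');

const parts = splitVCards(sample);
console.log(parts.length); // 2
```

Each element of `parts` can then be handed to the parser on its own, so the properties of different contacts are never merged.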
Does node-vcf intentionally dispose of the base64 photo data? I don't see it treated specially in the source. If this is a bug, I'll drum up a testcase...
e.g.
PHOTO;ENCODING=BASE64;JPEG:/9j/4AAQSkZJRgABAQAAAQ
CAgICAgQDAgICAgUEBAMEBgUGBgYFBgYGBwkIBgcJBwYGCAs
2wBDAQICAgICAgUDAwUKBwYHCgoKCgoKCgoKCgoKCgoKCgoK
goKCgoKCgoKCgoKCgr/wAARCABgAGADASIAAhEBAxEB/8QAH
Hi!
The author of the camelcase library decided to support only modern browsers as of version 4.0:
sindresorhus/camelcase#19
As a result, node-vcf requires a transpiler to work in old browsers.
Would you consider downgrading camelcase to 3.0, which exports ES5 code? :)
I'm absolutely bumfuzzled by what get returns.
My code looks something like:
var photo = card.get( 'photo' );
If I console.log( photo ), I get:
{ [String: '<bunch of base64 data>'] encoding: 'BASE64', type: 'jpeg' }
What throws me so severely is that this is supposedly an object (starts with {) but it's made up of an array which doesn't have a property as a key, then no comma between that array's end and the property 'encoding'. Huh?
So I do a JSON.stringify( photo, null, 4 ) and get:
[
"photo",
{
"encoding": "BASE64",
"type": "jpeg"
},
"text",
"<bunch of base64 data>"
]
Why that's different, I don't know, but it at least looks like a proper JavaScript data structure. It's an array, yet when I console.log( photo[0] ) I get "undefined".
I've stuck debug console.logs in the source and so know it's returning this.data[ key ].clone()
(confirmed with typeof), but that's not helping me make sense of this, or get my photo data!
I also know, from test/photo.js, that it's a vCard.Property. Still confused :)
Never seen anything like this! My JS must improve!
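One plausible explanation for the difference is that JSON.stringify calls an object's toJSON() method if it has one, while console.log prints the object itself. The sketch below imitates that behaviour with a hypothetical `DemoProperty` class; it is a stand-in written for illustration, not the library's actual vCard.Property.

```javascript
// DemoProperty is a hypothetical stand-in, NOT node-vcf's vCard.Property.
class DemoProperty {
  constructor(field, value, params) {
    this._field = field;
    this._data = value;
    Object.assign(this, params); // parameters become plain own properties
  }
  valueOf() { return this._data; }   // the raw value as a primitive string
  toString() { return this._data; }
  toJSON() {
    // JSON.stringify() serializes whatever this returns, here a jCard-style array
    const { _field, _data, ...params } = this;
    return [_field, params, 'text', _data];
  }
}

const photo = new DemoProperty('photo', '<base64 data>', { encoding: 'BASE64', type: 'jpeg' });

console.log(photo[0]);              // undefined: the object itself is not an array
console.log(photo.valueOf());       // the value
console.log(photo.encoding);        // a parameter
console.log(JSON.stringify(photo)); // the array form produced by toJSON()
```

Under that assumption, the photo data would be read with valueOf() (or toString()), and the parameters as ordinary properties.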
I am trying to use your plugin to split a big vcard into chunks of 100.
Here is my source code:
const fs = require('fs');
const vCard = require( 'vcf' );
fs.readFile('contacts.vcf', 'utf8', (err, data) => {
  if (err) throw err;
  const cards = vCard.parse( data );
  console.log(cards.length);
  const perChunk = 100; // items per chunk
  const chunks = cards.reduce((resultArray, item, index) => {
    const chunkIndex = Math.floor(index / perChunk);
    if (!resultArray[chunkIndex]) {
      resultArray[chunkIndex] = []; // start a new chunk
    }
    resultArray[chunkIndex].push(item);
    return resultArray;
  }, []);
  console.log(chunks.length);
  let part = 1;
  for (const chunk of chunks) {
    const filename = `part${part}.vcf`;
    let output = ''; // serialized cards for this chunk
    for (const card of chunk) {
      output += card.toString();
      output += '\n';
    }
    // fs.writeFile requires a callback in current Node.js versions
    fs.writeFile(filename, output, 'utf8', (writeErr) => {
      if (writeErr) throw writeErr;
    });
    part++;
  }
});
All is well, except for something strange happening with the encoding.
For example there is a contact named "Déborah Garçon". In VSCode I can see the name correctly, however as soon as I import it into Google Contacts, it appears as "Déborah Garçon".
I
When I skip step 2, it imports the name correctly...
Any ideas?
Just noticed that because on our end it says:
'Invalid vCard: Expected "VERSION:\\d.\\d" but found "'+ version +'"'
and not:
https://github.com/jhermsmeier/node-vcf/blob/master/lib/vcard.js#L280-L286
But the package.json versions match.
Can you update the npm lib please?
I am testing the parsing of vcard files with a bunch of sample vcard examples I found online.
The birthday in them seems to come before the version number, which I would have assumed wouldn't break the parse.
I'm just concerned that if one of my users uploads a vCard in a similar format, one that does NOT have the version number as the first data field, this parser will fail to parse it.
Is this a known flaw of the example site I got these sample vCards from, one that should never be expected in a real vCard? I was expecting this package to keep parsing and search for a version number, but it appears not to.
Let me know what y'all think. For now, I'm just going to move the BDAY below the VERSION.
Error:
SyntaxError: Invalid vCard: Expected "VERSION:\d.\d" but found "BDAY;VALUE=DATE:1963-09-21"
This is the one I'm trying to parse:
BEGIN:VCARD
BDAY;VALUE=DATE:1963-09-21
VERSION:3.0
N:Stenerson;Derik
FN:Derik Stenerson
ORG:Microsoft Corporation
ADR;TYPE=WORK,POSTAL,PARCEL:;;One Microsoft Way;Redmond;WA;98052-6399;USA
TEL;TYPE=WORK,MSG:+1-425-936-5522
TEL;TYPE=WORK,FAX:+1-425-936-7329
EMAIL;TYPE=INTERNET:[email protected]
END:VCARD
BEGIN:VCARD
VERSION:3.0
N:Ganguly;Anik
FN:Anik Ganguly
ORG: Open Text Inc.
ADR;TYPE=WORK,POSTAL,PARCEL:;Suite 101;38777 West Six Mile Road;Livonia;MI;48152;USA
TEL;TYPE=WORK,MSG:+1-734-542-5955
EMAIL;TYPE=INTERNET:[email protected]
END:VCARD
BEGIN:VCARD
VERSION:3.0
N:Moskowitz;Robert
FN:Robert Moskowitz
EMAIL;TYPE=INTERNET:[email protected]
END:VCARD
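The manual workaround of moving VERSION up can also be done in code before parsing. This is only a sketch under stated assumptions: `hoistVersion` is a hypothetical helper (not part of node-vcf), it expects the text of a single card, and it assumes the VERSION line is not folded.

```javascript
// Hypothetical pre-processing helper (not part of node-vcf): move the
// VERSION line so that it directly follows BEGIN:VCARD.
// Assumes `cardText` holds exactly one card.
function hoistVersion(cardText) {
  const lines = cardText.split(/\r\n|\r|\n/);
  const versionIndex = lines.findIndex((line) => /^VERSION:/i.test(line));
  if (versionIndex === -1) return lines.join('\r\n'); // nothing to move
  const [versionLine] = lines.splice(versionIndex, 1);
  const beginIndex = lines.findIndex((line) => /^BEGIN:VCARD\s*$/i.test(line));
  lines.splice(beginIndex + 1, 0, versionLine);
  return lines.join('\r\n');
}

const reordered = hoistVersion([
  'BEGIN:VCARD',
  'BDAY;VALUE=DATE:1963-09-21',
  'VERSION:3.0',
  'FN:Derik Stenerson',
  'END:VCARD',
].join('\n'));

console.log(reordered.split('\r\n')[1]); // VERSION:3.0
```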
Reading the documentation as it currently stands, you could easily miss the fact that there is a parseMultiple() function that can be used to parse a string containing multiple vCards. Updated the documentation to clarify this.
This is a result of the discussion in issue #11.
I exported a set of contacts from my icloud.com account. This file appears to be correct but when importing using the multiple imports method I get this error:
SyntaxError: Invalid vCard: Expected "VERSION:\d.\d" but found "undefined"
Here is how I read and import the file:
var contactData = fs.readFileSync("./samples/multiple-contacts-from-icloud.vcf", "utf8");
var contacts = vCard.parse(contactData); //multiple
I've attached my sample file here; I appended the '.txt' extension so that it could be attached.
multiple-contacts-from-icloud.vcf.txt
Running the following on a vCard 2.1 file returns the version 2.1 attribute in the jCard.
var card = vCard.parse(string);
var jcard = card.toJSON();
I assume we can get a 4.0 version by doing:
var card = vCard.parse(string);
card = vCard.parse(card.toString('4.0'));
var jcard = card.toJSON();
If that's the case, then it might be more obvious for a user to call a function like toJCard() that will do that.
Code to reproduce:
let vcf = <a card>
// Parse into vCard
const card = new vCard().parse(vcf);
// Serialize back to string
vcf = card.toString()
// Throws an exception because newlines in the new vcf string are `\n` and the parser expects `\r\n`
const reparsed = new vCard().parse(vcf);
Just noticed that if a vCard uses \n for line breaks, vcf returns an undefined version match error. This can happen. A workaround would be a check after the first normalize split:
https://github.com/jhermsmeier/node-vcf/blob/master/lib/vcard.js#L263
if (lines.length <= 1) {
lines = lines[0].split(/\n/g);
}
According to RFC 6350 (https://tools.ietf.org/html/rfc6350#section-3.2), line folding is supposed to put only one space/tab character at the beginning of each folded line.
The current regex for normalizing the values does not handle this properly. This is also visible in the main readme.md file, where "United States ofAmerica" is missing a space.
adr: [
{ [String: ';;100 Waters Edge;Baytown;LA;30314;United States of America']
type: 'work',
label: '"100 Waters Edge\\nBaytown, LA 30314\\nUnited States of America"' },
{ [String: ';;42 Plantation St.;Baytown;LA;30314;United States of America']
type: 'home',
label: '"42 Plantation St.\\nBaytown, LA 30314\\nUnited States ofAmerica"' }
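The difference can be shown with two small unfolding functions. Both are illustrations written for this report, not the library's code: `unfold` removes the line break plus exactly one whitespace character, as RFC 6350 section 3.2 describes, while `unfoldGreedy` removes every leading whitespace character and so reproduces the missing space. That the library's regex behaves like the greedy variant is an assumption.

```javascript
// Sketch of RFC 6350 section 3.2 unfolding: drop CRLF plus exactly ONE
// following space or tab. Not node-vcf's actual implementation.
function unfold(text) {
  return text.replace(/\r\n[ \t]/g, '');
}

// Assumed faulty variant for comparison: swallows ALL leading whitespace.
function unfoldGreedy(text) {
  return text.replace(/\r\n[ \t]+/g, '');
}

// The fold was inserted before " America": one space marks the fold,
// the second space belongs to the value.
const folded = 'LABEL:United States of\r\n  America';

console.log(unfold(folded));       // LABEL:United States of America
console.log(unfoldGreedy(folded)); // LABEL:United States ofAmerica
```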
card.add('MEMBER', '1a7935722bfa2289da2934b87769cd14')
This generates:
-M-E-M-B-E-R:1a7935722bfa2289da2934b87769cd14
I expected:
MEMBER:1a7935722bfa2289da2934b87769cd14
I fiddled with the following line to remove the case translation, but have not got a definitive fix:
Line 54 in 37c7bf1
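One way such output can arise is a camelCase-to-dashed conversion that puts a dash in front of every capital letter. The sketch below is a guess at the mechanism, not the library's code; both function names are hypothetical. The second variant only inserts a dash at a lowercase-to-uppercase boundary, so an already uppercase name passes through unchanged.

```javascript
// Assumed cause (hypothetical, not taken from node-vcf): a dash is
// inserted before EVERY uppercase letter.
function naiveFieldName(key) {
  return key.replace(/[A-Z]/g, '-$&').toUpperCase();
}

// Possible alternative: only insert a dash where a lowercase letter or
// digit is directly followed by an uppercase letter.
function boundaryFieldName(key) {
  return key.replace(/([a-z0-9])([A-Z])/g, '$1-$2').toUpperCase();
}

console.log(naiveFieldName('MEMBER'));      // -M-E-M-B-E-R
console.log(boundaryFieldName('MEMBER'));   // MEMBER
console.log(boundaryFieldName('xAbLabel')); // X-AB-LABEL
```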
Hey, we are integrating your lib in our Angular project, which lately reported us that vcf is a cjs lib. Would it be possible to provide an esm export, so it can be optimized by the compiler? From what I see vcf isn't that heavy, so it would just require a small refactor. You could use esbuild
to compile it into different formats.
First, the specification :P --> https://tools.ietf.org/html/rfc7095#section-3.3.1.3:
The vCard specification defines properties with structured values,
for example, "GENDER" or "ADR". In vCard, a structured text value
consists of one or multiple text components, delimited by the
SEMICOLON character. Its equivalent in jCard is a structured
property value, which is an array containing one element for each
text component, with empty/missing text components represented by
zero-length strings.[...]
Per Section 6.3.1 of [RFC6350], the component separator MUST be
specified even if the component value is missing. Similarly, the
jCard array containing the structured data MUST contain all required
elements, even if they are empty.
vCard Example:
ADR;LABEL="123 Maple Ave\nSuite 901\nVancouver BC\nA1B 2C9\nCan
 ada":;;;;;;
jCard Example:
["adr",
{"label":"123 Maple Ave\nSuite 901\nVancouver BC\nA1B 2C9\nCanada"},
"text",
["", "", "", "", "", "", ""]
]
My problem is that using new vCard().parse(vcfString).toJCard() with the following vcfString:
BEGIN:VCARD
VERSION:2.1
ADR;CHARSET=utf-8;WORK:Postfach 9999
END:VCARD
results in:
[
"vcard",
[
["version", {}, "text", "4.0"],
["adr", {"charset": "utf-8", "type": "work"}, "text",
["Postfach 9999"]
]
]
]
whereas I think for "adr" it should be:
[
"vcard",
[
["version", {}, "text", "4.0"],
["adr", {"charset": "utf-8", "type": "work"}, "text",
["Postfach 9999", "", "", "", "", "", ""]
]
]
]
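The expected behaviour amounts to padding a structured value with empty strings up to its required component count. A minimal sketch, assuming the seven ADR components listed in RFC 6350 section 6.3.1; `padStructured` and `ADR_COMPONENTS` are hypothetical names, not part of node-vcf.

```javascript
// ADR has seven components per RFC 6350 section 6.3.1.
const ADR_COMPONENTS = 7;

// Hypothetical helper: pad a structured value with zero-length strings
// until it has `length` components. Longer inputs are returned as-is.
function padStructured(components, length) {
  const missing = Math.max(0, length - components.length);
  return components.concat(new Array(missing).fill(''));
}

console.log(padStructured('Postfach 9999'.split(';'), ADR_COMPONENTS));
// [ 'Postfach 9999', '', '', '', '', '', '' ]
```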
Hi,
When loading a v4 vCard, the parse failed with the error:
Error: Unsupported version \"ARD\"
I've noticed that when there is a single value for email addresses or phone numbers the output is an Object, but when there are multiple values it's an Array. It should always be an Array.
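Until the return shape is consistent, callers can normalize it themselves. A minimal sketch with a hypothetical `toArray` helper (not part of node-vcf):

```javascript
// Hypothetical helper: always hand back an array, whatever shape came in.
function toArray(value) {
  if (value === undefined || value === null) return []; // property absent
  return Array.isArray(value) ? value : [value];
}

console.log(toArray('a@example.com'));                    // one entry
console.log(toArray(['a@example.com', 'b@example.com'])); // unchanged
console.log(toArray(undefined));                          // empty array
```

The result of a property lookup can then be wrapped once, and the rest of the code can loop over it without checking its type.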
The current implementation uses string.split(/\r\n/g) to split strings based on newline characters. This approach only handles the \r\n sequence, which is the newline convention used in Windows. It does not account for newline characters used in other operating systems, such as \n (used in Unix/Linux and modern macOS) and \r (used in older versions of macOS).
Proposed Change:
Replace the current implementation:
string.split(/\r\n/g) with string.split(/\r\n|\r|\n/)
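The effect of the two patterns can be compared on a string that mixes all three conventions. This is a standalone illustration; the sample text is made up for the example.

```javascript
// Made-up input mixing \n, \r\n and \r line endings.
const mixed = 'BEGIN:VCARD\nVERSION:4.0\r\nFN:Forrest Gump\rEND:VCARD';

const crlfOnly = mixed.split(/\r\n/g);        // current pattern
const anyNewline = mixed.split(/\r\n|\r|\n/); // proposed pattern

console.log(crlfOnly.length);   // 2
console.log(anyNewline.length); // 4
```

Listing `\r\n` first in the alternation matters: it lets a CRLF pair match as one separator instead of producing an empty line between `\r` and `\n`.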
There are several instances of const hoisting, which Firefox 36+ considers a JavaScript error:
ReferenceError: can't access lexical declaration `PROPERTIES' before initialization
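The error comes from reading a const binding before its declaration has run. The sketch below reproduces the pattern in isolation; the names are chosen for the example and the code is not taken from the library.

```javascript
// Standalone illustration of reading a const before it is initialized.
function demoTemporalDeadZone() {
  function usesList() {
    return PROPERTIES.length; // fine only once PROPERTIES is initialized
  }

  let early;
  try {
    early = usesList(); // runs before the const declaration below
  } catch (err) {
    early = err.name;
  }

  const PROPERTIES = ['fn', 'n'];
  return { early, late: usesList() };
}

console.log(demoTemporalDeadZone()); // { early: 'ReferenceError', late: 2 }
```

Moving the declaration above its first use avoids the error.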
Here is a small snippet that reproduces the issue:
const Vcf = require('vcf')
const data = `BEGIN:VCARD
VERSION:4.0
NAME;PARAMETER="parameter value contains a :colon":value also contains a :colon
END:VCARD`.replace(/\n/g, '\r\n')
const vCard = new Vcf().parse(data)
console.log(vCard.get('name').valueOf())
This returns:
colon":value also contains a :colon
instead of:
value also contains a :colon
According to the RFC, a colon is allowed in the parameter value as long as it's within double quotes.
param-value = *SAFE-CHAR / DQUOTE *QSAFE-CHAR DQUOTE
any-param = (iana-token / x-name) "=" param-value *("," param-value)
NON-ASCII = UTF8-2 / UTF8-3 / UTF8-4
; UTF8-{2,3,4} are defined in [RFC3629]
QSAFE-CHAR = WSP / "!" / %x23-7E / NON-ASCII
; Any character except CTLs, DQUOTE
SAFE-CHAR = WSP / "!" / %x23-39 / %x3C-7E / NON-ASCII
; Any character except CTLs, DQUOTE, ";", ":"
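In terms of the grammar above, the value starts at the first colon that is not inside a double-quoted parameter value. A minimal sketch of that rule follows; `splitContentLine` is a hypothetical helper, not the library's parser, and it assumes the line is already unfolded.

```javascript
// Hypothetical helper: split an unfolded content line at the first colon
// that lies outside double quotes.
function splitContentLine(line) {
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (ch === '"') {
      inQuotes = !inQuotes; // DQUOTE cannot appear inside a quoted value
    } else if (ch === ':' && !inQuotes) {
      return { head: line.slice(0, i), value: line.slice(i + 1) };
    }
  }
  return { head: line, value: '' }; // no separator found
}

const line = 'NAME;PARAMETER="parameter value contains a :colon":value also contains a :colon';
console.log(splitContentLine(line).value); // value also contains a :colon
```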
Hi Jonas,
Is this the output you expect?
vCard {
version: '2.1',
end: [ { data: 'VCARD' }, { data: 'VCARD' } ],
begin: [ { data: 'VCARD' }, { data: 'VCARD' } ],
n: [ { data: 'Bawrie;Allie;;;' }, { data: 'Mickley;Andrew;;;' } ],
fn: 'Andrew Mickley',
tel:
[ { data: 'CELL;PREF:0427240757' },
{ data: 'HOME:55875233' },
{ data: 'CELL:0429015624' } ] }
For this input?
BEGIN:VCARD
VERSION:2.1
END:VCARD
BEGIN:VCARD
VERSION:2.1
N:Bawrie;Allie;;;
FN:Allie Bawrie
TEL;CELL;PREF:0427240757
TEL;HOME:55875233
END:VCARD
BEGIN:VCARD
VERSION:2.1
N:Mickley;Andrew;;;
FN:Andrew Mickley
TEL;CELL:0429015624
END:VCARD
It seems there's more raw data (and semicolons) and less being recognized/parsed than in your example.
I've had to do a lot of extra parsing because parse() uses the new keyword to create what should be primitive values. I'm sure there is a good reason for this, though I'm not sure what it could be. Perhaps it would be better to adjust the parse() method to output primitives?
Maybe something like this:
{
version: '4.0',
data: {
version: '4.0',
n: ['Gump','Forrest','',''],
fn: 'Forrest Gump',
org: 'Bubba Gump Shrimp Co.',
title: 'Shrimp Man',
photo: {
text: 'http://www.example.com/dir_photos/my_photo.gif',
mediatype: 'image/gif'
},
tel: [
{ uri: 'tel:+11115551212', type: [ 'work', 'voice' ] },
{ uri: 'tel:+14045551212', type: [ 'home', 'voice' ] }
],
adr: [
{
text: ['', '', '100 Waters Edge','Baytown','LA','30314','United States of America'],
type: 'work',
label: '"100 Waters Edge\\nBaytown, LA 30314\\nUnited States of America"'
},
{
text: ['', '', '42 Plantation St.','Baytown','LA','30314','United States of America'],
type: 'home',
label: '"42 Plantation St.\\nBaytown, LA 30314\\nUnited States ofAmerica"'
}
],
email: '[email protected]',
rev: '20080424T195243Z'
}
}
Here is a fiddle
Hi 👋,
I'm using this excellent library with typescript, and I have a request regarding the type definitions. Currently fetching a value looks like this:
// Must assert the type, since get returns string | object.
const photo: string = <string>card.get('photo').valueOf();
It starts looking funny when you need to perform more actions on the value:
const categories = (<string>card.get('photo').valueOf()).split(',');
I propose that Property be generic, maybe still constrain it to just strings and objects with <Type extends string | object>
so that I'm able to do:
const categories = card.get<string>('photo').valueOf().split(',');
The above would also make it possible to type-hint against specific interfaces.
I'm willing to submit a pull request for this.
EDIT:
Also having this for Property | Property[] would be useful, so maybe something like:
const categories = card.get<Property<string>[]>('photo').valueOf().split(',');
EDIT 2:
Even better, it would be nice if get could return a single type (e.g. just Property[]) so that we do not have to check if the returned value is a string or an array.
2.0.6 fixed the bug introduced in 2.0.5 that prevented it from parsing its own output.
But 2.0.6 fails to parse output generated by version 2.0.4.
2.0.4, on the other hand, is able to parse output from versions 2.0.4, 2.0.5 and 2.0.6.
I am trying to parse the vcf file below, and it gives an error saying it expected the version but found BDAY.
The file has the birthday property on its second line, which I think is causing the issue. The tool seems to require the version on the second line.
However, this file imports without trouble on a mobile phone, so this tool should be able to parse it too.
I am attaching the file; please check and let me know.
The file is as follows:
BEGIN:VCARD
BDAY;VALUE=DATE:1963-09-21
VERSION:3.0
N:Stenerson;Derik
FN:Derik Stenerson
ORG:Microsoft Corporation
ADR;TYPE=WORK,POSTAL,PARCEL:;;One Microsoft Way;Redmond;WA;98052-6399;USA
TEL;TYPE=WORK,MSG:+1-425-936-5522
TEL;TYPE=WORK,FAX:+1-425-936-7329
EMAIL;TYPE=INTERNET:[email protected]
END:VCARD
BEGIN:VCARD
VERSION:3.0
N:Ganguly;Anik
FN:Anik Ganguly
ORG: Open Text Inc.
ADR;TYPE=WORK,POSTAL,PARCEL:;Suite 101;38777 West Six Mile Road;Livonia;MI;48152;USA
TEL;TYPE=WORK,MSG:+1-734-542-5955
EMAIL;TYPE=INTERNET:[email protected]
END:VCARD
BEGIN:VCARD
VERSION:3.0
N:Moskowitz;Robert
FN:Robert Moskowitz
EMAIL;TYPE=INTERNET:[email protected]
END:VCARD
Thanks
According to the RFC:
The default type is "voice". These type parameter values can be
specified as a parameter list (e.g., TYPE=text;TYPE=voice) or as a
value list (e.g., TYPE="text,voice"). The default can be
overridden to another set of values by specifying one or more
alternate values. For example, the default TYPE of "voice" can be
reset to a VOICE and FAX telephone number by the value list
TYPE="voice,fax".
Unfortunately, such arguments are parsed incorrectly: "text,voice" gets parsed to ["\"text", "voice\""].
For example use:
TEL;VALUE=uri;PREF=1;TYPE="voice,home":tel:+1-555-555-5555;ext=5555
The parser should eliminate the double quotes. I will make a pull request.
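The intended handling can be sketched as: strip one pair of surrounding double quotes, then split on commas. `parseParamValues` is a hypothetical helper written for this illustration, not the code of the pull request, and it only covers a single quoted value list like the one in the example.

```javascript
// Hypothetical helper: strip one pair of enclosing double quotes from a
// parameter value, then split the value list on commas.
function parseParamValues(raw) {
  const unquoted = /^".*"$/.test(raw) ? raw.slice(1, -1) : raw;
  return unquoted.split(',');
}

console.log(parseParamValues('"voice,home"')); // [ 'voice', 'home' ]
console.log(parseParamValues('work'));         // [ 'work' ]
```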
Using the vCard 2.1 example at https://en.wikipedia.org/wiki/VCard, the adr and label properties are split apart, instead of the expected result of each label being a property of its address. Using a v4.0 card works as expected.
Version 2.1
vCard {
version: '2.1',
data: {
adr: [
{ _data: ';;100 Waters Edge;Baytown;LA;30314;United States of America',
type: 'work'},
{ _data: ';;42 Plantation St.;Baytown;LA;30314;United States of America',
type: 'home'}
],
label: [
{ _data: '"100 Waters Edge\\nBaytown, LA 30314\\nUnited States of America"' },
{ _data: '"42 Plantation St.\\nBaytown, LA 30314\\nUnited States ofAmerica"'}
]
}
}
Version 4.0
vCard {
version: '4.0',
data: {
adr: [
{ _data: ';;100 Waters Edge;Baytown;LA;30314;United States of America',
type: 'work',
label: '"100 Waters Edge\\nBaytown, LA 30314\\nUnited States of America"' },
{ _data: ';;42 Plantation St.;Baytown;LA;30314;United States of America',
type: 'home',
label: '"42 Plantation St.\\nBaytown, LA 30314\\nUnited States ofAmerica"' }
]
}
}
I'm trying to set custom type labels on my tel properties, but they're being uppercased. Is there any reason for this?
I'm setting the property with card.add('tel', '1234567', { type: 'personal' });
I can see the uppercasing is happening in the lib here.
Is there any quick fix to get around this? I'm using vCard version 3.0, and this is what I'm hoping for:
Thanks!
Hi
It seems the parser can't handle text with quoted-printable characters! For example, for this text:
FN;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:Forr=C3=A9st Gump
It should give us this text when decoded:
Forrést Gump
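For reference, decoding such a value by hand could look like the sketch below. `decodeQuotedPrintable` is a hypothetical helper, not something node-vcf provides; it assumes Node.js (for Buffer), a UTF-8 charset, and that all unencoded characters in the input are plain ASCII.

```javascript
// Hypothetical helper (not part of node-vcf): decode a quoted-printable
// string whose bytes are UTF-8. Assumes unencoded characters are ASCII.
function decodeQuotedPrintable(input) {
  const text = input.replace(/=\r?\n/g, ''); // drop soft line breaks
  const bytes = [];
  for (let i = 0; i < text.length; i++) {
    const hex = text.slice(i + 1, i + 3);
    if (text[i] === '=' && /^[0-9A-Fa-f]{2}$/.test(hex)) {
      bytes.push(parseInt(hex, 16)); // "=C3" becomes byte 0xC3
      i += 2;
    } else {
      bytes.push(text.charCodeAt(i)); // literal ASCII character
    }
  }
  return Buffer.from(bytes).toString('utf8');
}

console.log(decodeQuotedPrintable('Forr=C3=A9st Gump')); // Forrést Gump
```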