Coder Social home page Coder Social logo

Comments (19)

amitguptagwl avatar amitguptagwl commented on August 17, 2024 1

please check 'spec/assets/" folder. I'm using sample.xml, ptest.xml, and midsize.xml

from fast-xml-parser.

Delagen avatar Delagen commented on August 17, 2024

My PR was for attrNodeName, and mean that all attributes present as child object with this name using it original names (or prefixed if it specified).

arrayMode. It does not my change. But I expect something like:

"types": [ {
            "#text": "textdetail",
            "type": [
                "val",
                { "@attr": "value", "#text": "val" },
                { "@attr":"value", "#text": "val" , "nestedtag" : ["val"]},
                { "@attr": "value"},
                "val"
            ],
            "config": [ "someconfig" ]
    }]

Mean that single tag value should be wrapped as array of object, instead of single object

from fast-xml-parser.

amitguptagwl avatar amitguptagwl commented on August 17, 2024

Thanks for your quick response. As you initially mentioned for simple migration from xml2js or others for this change, let me check how does that library transforms above example. So the consistency can be mentioned.

Thanks

from fast-xml-parser.

Delagen avatar Delagen commented on August 17, 2024

@amitguptagwl https://github.com/Leonidas-from-XIV/node-xml2js#options
attrkey: object key for collecting attributes. Your previous implementation cannot wrap attributes in subobject, all placed in single level with prefix.
Test: https://github.com/Leonidas-from-XIV/node-xml2js/blob/master/test/parser.test.coffee#L264
You can read other options for future implementation. As for me only this prevent me to migrate.

from fast-xml-parser.

amitguptagwl avatar amitguptagwl commented on August 17, 2024

attrkey: object key for collecting attributes. Your previous implementation cannot wrap attributes in subobject, all placed in single level with prefix.

There was an option attrNodeName for the same purpose.

It's fine if something is not implemented, my main concern is if something is half implemented.

Thanks a lot for your active discussion and contribution.

from fast-xml-parser.

Delagen avatar Delagen commented on August 17, 2024

There was an option attrNodeName for the same purpose.

That was my first PR to this project ))

from fast-xml-parser.

amitguptagwl avatar amitguptagwl commented on August 17, 2024

Thanks @Delagen .

I have just compared the performance of xml2js(0.4.19) with both version V2, V3 and I feel fast-xml-parser can give the advantage in case of small files only. When it comes to big files the performance of xml2js is stable and very fast than this library. And it is pure js based. Hence I feel that there is no advantage to create V3 or adding new features of this library as xml2js is already solving most of the concerns.

from fast-xml-parser.

Delagen avatar Delagen commented on August 17, 2024

Strange... at about 20M
√ xml2js with large (9049ms)
√ fast-xml-parser with large (279ms)
√ xml-js with large (33326ms)

from fast-xml-parser.

amitguptagwl avatar amitguptagwl commented on August 17, 2024

where did you get this figure from? May be I'm testing in the wrong way.

from fast-xml-parser.

Delagen avatar Delagen commented on August 17, 2024

I'l take sample XML SOAP:response with ~80К CDATA base64 response
Make source with xmlStringLargeElements=200

var xmlStringLarge = `<root>${xmlString.repeat(xmlStringLargeElements)}</root>`;

And parse it 100 times.

On 500
xml-js out of memory

with 1000 elements (~80-90MB)
√ xml2js with large (45113ms)
√ fast-xml-parser with large (1317ms)

from fast-xml-parser.

amitguptagwl avatar amitguptagwl commented on August 17, 2024

I'm reading 13m XML file in a string and using benchmark js to test

suite.add('xml2js', {
  'defer': true,
  'fn' : function(deferred) {
      xml2js.parseString(largeXmlData,function(err,result){
        if (err) throw err;
        deferred.resolve();
      });
    }
})
.on('complete', function() {
    console.log(this.name + ":" +   this.hz + " requests/second");
})
.run({ 'async': true });

from fast-xml-parser.

amitguptagwl avatar amitguptagwl commented on August 17, 2024

You're making me to push my changes.... Can you plz share the gist ? I'm definitely doing something wrong. Because for me xml2js always takes 3s for both small (1.5kb) and big (98mb) files.

from fast-xml-parser.

Delagen avatar Delagen commented on August 17, 2024

@amitguptagwl Can you share benchmark project?
Update: I use only 5 iterations on large files.

I am using mocha with simple loop

it("xml2js with large", function() {
		this.timeout(60000);
		var callback = function(err, result) {
			if (err) {
				console.error(err); //eslint-disable-line no-console
			}
			result.root["soap:Envelope"].should.have.lengthOf(xmlStringLargeElements);
			for (var j = 0; j < xmlStringLargeElements; j++) {
				result.root["soap:Envelope"][j]["soap:Body"].should.ok;
			}
		};
		for (var i = 0; i < benchmarkLargeIterations; i++) {
			xml2js.parseString(xmlStringLarge, xml2jsOptions, callback);
		}
	});

Try to remove async & defer, because callback is syncronous

from fast-xml-parser.

amitguptagwl avatar amitguptagwl commented on August 17, 2024

It is just a file in benchmark folder. Let me share the code;

var Benchmark = require('benchmark');
var suite = new Benchmark.Suite("XML Parser benchmark");

var parser = require("../src/parser");
var xml2js = require("xml2js");
//var parser2 = require("../src/parserV2");

var fs = require("fs");
var path = require("path");
var fileNamePath = path.join(__dirname, "../spec/assets/midsize.xml");
var xmlData = fs.readFileSync(fileNamePath).toString();

suite
.add('xml to json + validation V1', function() {
  parser.validate(xmlData); 
  parser.parse(xmlData); 
})
/*.add('xml to json + validation V2 ', function() {
  parser2.parse(xmlData);
})*/
.add('xml2js ', function() {
  xml2js.parseString(xmlData,function(err,result){
    if (err) throw err;
  });
})
/* .add('xml2js', {
  'defer': true,
  'fn' : function(deferred) {
      xml2js.parseString(xmlData,function(err,result){
        console.log("err", err);
        if (err) throw err;
        deferred.resolve();
      });
    }
}) */

.on('start',function(){
	console.log("Running Suite: " + this.name);
})
.on('error',function(e){
	console.log("Error in Suite: " + this.name);
})
.on('abort',function(e){
	console.log("Aborting Suite: " + this.name);
})
/*.on('cycle',function(event){
	console.log("Suite ID:" + event.target.id);
})*/
// add listeners 
.on('complete', function() {
  for (var j = 0; j < this.length; j++) {
    console.log(this[j].name + ":" +   this[j].hz + " requests/second");
  }
})
// run async 
.run({ 'async': true });

from fast-xml-parser.

Delagen avatar Delagen commented on August 17, 2024

@amitguptagwl xml2js speed vary on data. For example: modified 3 lines

var fileNamePath = path.join(__dirname, "../spec/assets/test.xml");
var xmlData = fs.readFileSync(fileNamePath).toString();
xmlData=`<root>${xmlData.repeat(1000)}</root>`;

where test.xml something like

<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
    <soap:Body>
        <rpt:loadReportFileResponseElem
                xmlns:s="http://bus.x.com/common/support/v1"
                xmlns:rpt="http://bus.x.com/service/statement/v1">
            <s:code>0</s:code>
            <s:responseTime>2588</s:responseTime>
            <s:responseDbTime>1893</s:responseDbTime>
            <s:requestId>6b408fd09eb211e7a0807e34820340ec</s:requestId>
            <s:route>172.16.x.x:9192</s:route>
            <rpt:result>
 <rpt:file><![CDATA[
...~80К base64
]]></rpt:file>
            </rpt:result>
        </rpt:loadReportFileResponseElem>
    </soap:Body>
</soap:Envelope>

And result
xml to json V1:5.084196828113811 requests/second
xml2js :0.11925605306783423 requests/second
50!!! times

even with validation (which unneeded if user trust source)
xml to json + validation V1:0.7115240765649846 requests/second
xml2js :0.11756098869218376 requests/second
7 times

xml2js does not have remove namespaces feature that needed by common case, because namespaces names can change and without much complex code parsing of response became pain.
I wrote own cleanXml function that clean up namespaces from result object when used xml2js

on 10000 repeat
xml to json V1:0.4878174819040743 requests/second
xml2js FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory

That old error on SAX parser.
Xml2js full of ugly errors Leonidas-from-XIV/node-xml2js#385 and others)
Thats all ))

from fast-xml-parser.

amitguptagwl avatar amitguptagwl commented on August 17, 2024

Agree, with the XML having CDATA benchmarks are as follows;

xml to json V1: 36.2170364893656 requests/second
xml to json + validation V1: 16.293542685483445 requests/second
xml to json + validation V2 : 18.840299721018933 requests/second
xml2js : 8.493040709182775 requests/second

for small file

xml to json V1: 44887.09328602369 requests/second
xml to json + validation V1: 17996.547289493574 requests/second
xml to json + validation V2 : 17024.9652624249 requests/second
xml2js : 8680.17982076353 requests/second

for small file without CDATA

xml to json V1:22418.135460499547 requests/second
xml to json + validation V1:9835.206488018259 requests/second
xml to json + validation V2 :7393.2310311734145 requests/second
xml2js :6392.260927006273 requests/second

But for big file without CDATA

For data you shared (When I remove CADATA part)
xml to json V1:89.67479082829274 requests/second
xml to json + validation V1:29.936565553420834 requests/second
xml to json + validation V2 :23.292112894835128 requests/second
xml2js :15.190920817625521 requests/second

File I made (13m)

xml to json V1:1.0754129079969013 requests/second
xml to json + validation V1:0.6726000748140463 requests/second
xml to json + validation V2 :0.8346327369734697 requests/second
xml2js :5978.957878234166 requests/second

I can work more to speed it up as it gives better performance at least in case of CDATA or when file size is less than 3-4 mb.

from fast-xml-parser.

amitguptagwl avatar amitguptagwl commented on August 17, 2024

There is some more interesting result. I have created parser V3 which is not doing validation and I've done some more changes.

Now when I tested against 98m file

V1: error out due to RE
V2: 8648ms (include validation as well)
V3: 7441ms (no validation, less RE)
xml2js: 4ms

But when I read the output of xml2js, it just read 2 starting tags and skipped rest of the XML. Hence it was fast.

Moreover, I modified your example and added XML prolog and DOCTYPE. In this case xml2js error out.

Now the only problem is parser V1 is up to 4 times faster than V2 or V3 but it works only for small file up to 15mb then it errors out.

I could have rewritten the JS RE engine if I get the time to complete my other project grapes.

from fast-xml-parser.

Delagen avatar Delagen commented on August 17, 2024

4 times not too big degradation for support of large files, I think it still faster than xml2js.
My old options for xml2js, may be that change some?

var defaultOptions = {
	explicitCharkey:    false,
	trim:               true,
	normalize:          false,
	normalizeTags:      false,
	attrkey:            "$",
	charkey:            "_",
	explicitArray:      false,
	ignoreAttrs:        false,
	mergeAttrs:         false,
	explicitRoot:       true,
	validator:          null,
	xmlns:              false,
	explicitChildren:   false,
	childkey:           "$$",
	charsAsChildren:    false,
	async:              false,
	strict:             true,
	attrNameProcessors: null,
	tagNameProcessors:  null,
	rootName:           "root",
	xmldec:             {
		version:    "1.0",
		encoding:   "utf-8",
		standalone: true
	},
	doctype:            null,
	renderOpts:         {
		pretty:  true,
		indent:  "  ",
		newline: "\n"
	}
};

Can you share your example XML?

from fast-xml-parser.

amitguptagwl avatar amitguptagwl commented on August 17, 2024

Performance (should have been discussed in separate thread) is degraded to handle some cases of XML but no impact due to large files now.

I have removed arraymode from official documents but it is still supported. As soon as I'll verify it, I'll make it as a part of official document.

from fast-xml-parser.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.