Coder Social home page Coder Social logo

Comments (15)

dtjohnson avatar dtjohnson commented on May 11, 2024 1

I just published v1.6.0 that has much better performance. In tests in V8 I was able to generate 9.5 million cells.

from xlsx-populate.

dtjohnson avatar dtjohnson commented on May 11, 2024

At which point is it crashing? Which browser(s)?

Could you provide a full working code example as an attachment that I could run?

from xlsx-populate.

rivlinehud avatar rivlinehud commented on May 11, 2024

Hi ... here is a repository with the project.
hope its helpfull..

Im developing on chrome and the browser memory is inflated on line 55 --> export.js

thanks :-)

from xlsx-populate.

dtjohnson avatar dtjohnson commented on May 11, 2024

Thanks. The problem is that in line 55 you are setting the value for 2 million cells. Whenever you set the value for a cell, an object gets created for it that looks something like this:

{
    attributes: { r: "A1", s: "1", t: "s" },
    children: [{
        name: "v",
        children: [5]
    }]
}

Dumping a heap snapshot, it looks like that object takes around 600 bytes. Multiply that by 2 million and you run into the V8 heap limit. While I'm sure it's possible to find ways to minimize the memory, it's not easy and is unlikely to happen. The good news is that the limit in Chrome/Node.js does seem to be right about 2 million cells. That seems to be a healthy amount for most use cases.

from xlsx-populate.

rivlinehud avatar rivlinehud commented on May 11, 2024

I see ...
Is there a way to stream like the file in order to free the browser memory ?

from xlsx-populate.

dtjohnson avatar dtjohnson commented on May 11, 2024

Nope. The file has to be held in memory in the browser. Do you need more than 2 million cells?

from xlsx-populate.

reviewher avatar reviewher commented on May 11, 2024

@dtjohnson @rivlinehud there's actually a lower limit to be aware of: V8 strings cannot be larger than 256 MB nodejs/node#3175 (comment)

from xlsx-populate.

rivlinehud avatar rivlinehud commented on May 11, 2024

@dtjohnson, yepp ... i need the capability to export large stock files ...
if i cant do it in the browser i will have to do it on the server and that is less convenient,
because of separation of code...

from xlsx-populate.

dtjohnson avatar dtjohnson commented on May 11, 2024

@reviewher, thanks for that. I wasn't aware of the string limit. That's not the issue in this particular problem, but I'm sure I'll run into it at some point.

@rivlinehud, how many cells do you need? We should be able to get to 4 million cells without too much difficulty...

from xlsx-populate.

rivlinehud avatar rivlinehud commented on May 11, 2024

@dtjohnson, 4 million cells would be amaizing !!! :-)

from xlsx-populate.

dtjohnson avatar dtjohnson commented on May 11, 2024

OK. So this is going to be harder than I thought. With the current setup you are actually limited to about 650k cells if you want to output a file. While you can certainly get 2M cells in memory, the downstream steps to turn it into a file consume additional memory. I do think 3-4 million cells should be possible, but it will require a lot more effort.

Here is the basic process the xlsx-populate uses:

  1. Opens a file using JSZip. A file with 4 million cells takes only 120kB uncompressed so I don't think there is any memory concerns at this stage.
  2. Uses sax js to stream the XML text in the files into a corresponding object model. Since sax js uses streaming and the final object model must be held entirely in memory I don't think there is any memory savings here.
  3. As you manipulate the workbook, xlsx-populate manipulates the object model. Since the point of xlsx-populate is to be minimally invasive and fairly ignorant about the full Open XML spec, there is a limit to what we can do here to cut memory. One obvious big savings is to eliminate child arrays if there is only a single child. That's pretty easy to do and seems to cut memory usage by 40% or so.
  4. To write to file it must first convert the object model back to XML. We currently use xmlbuilder-js to create an XML document and then convert it to a string. The issue is that the intermediate XML document is itself pretty large and needs to be held in memory at the same time as the object model. To save memory, we'd need to eliminate this and go straight from object model to an XML string. It's definitely doable, but non-trivial.
  5. Use JSZip to zip up all of the strings into an XLSX file. Again, the size of the strings/zips are not significant so I don't think there is a concern here.

So to summarize, I do think it's possible to get to 3-4 million cells, but there's a decent amount of work to get there. You might be better off looking at another library like Sheet JS for the time being.

from xlsx-populate.

dtjohnson avatar dtjohnson commented on May 11, 2024

OK. So I just published v1.2.0, which is as far as I'm going to take this at this point. I shaved the memory of the data structure a bit and a rewrote the XML builder from scratch, which saved a huuuuge amount of memory. Right now you can get just over 2 million cells in Chrome. Now CPU is the big bottleneck. Also, performance in IE is terrible in general, and this doesn't fair well either. I'm hesitant to push performance too much more at this stage. I'd rather wait for more completeness of the API before. So hopefully this is good enough for you.

from xlsx-populate.

rivlinehud avatar rivlinehud commented on May 11, 2024

thanks so much !! :-)

i will do some tests soon and let you know my experience with the uppgrade

from xlsx-populate.

unchewyglees avatar unchewyglees commented on May 11, 2024

Amazing! I've tried a few libraries that would all run out of memory and crash the browser before finding this one. This was able to do approximately 3.75 million cells in a file I just generated. Great work!

from xlsx-populate.

arthanariramesh avatar arthanariramesh commented on May 11, 2024

Is there a way to over come this memory crash issue. I am trying to load a 60 mb xlsx.

from xlsx-populate.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.