Comments (8)

philipc avatar philipc commented on August 22, 2024

Do you need to share them (reference them simultaneously from multiple Units), or is it enough to be able to reuse the Abbreviations from a previous Unit that you've finished processing?

from gimli.

Swatinem avatar Swatinem commented on August 22, 2024

We are currently keeping all the Units in memory because, I think, there might be cross-references between them.

This means that we can’t easily reuse the Abbreviations from a previous Unit. I wonder if we could lift that restriction at some point, but thus far we construct all the Units lazily and keep them around.

Swatinem avatar Swatinem commented on August 22, 2024

See also getsentry/symbolic#683 for my workaround upstream. I essentially copied all of Unit::new, which is unfortunate.

Having a second constructor that takes Abbreviations would make that workaround a bit less ugly. Then I can mem::swap stuff around as I please to avoid the re-parsing.
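To make the idea concrete, here is a minimal sketch of what such a second constructor could look like. It does not use gimli's real types: `Abbreviations`, `parse_abbreviations`, `Unit`, and the constructor name `with_abbreviations` are all hypothetical stand-ins, and the "parser" simply reads bytes up to a 0 terminator.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a parsed abbreviation table (not gimli's type).
#[derive(Debug, Default, PartialEq)]
pub struct Abbreviations {
    pub codes: Vec<u64>,
}

/// Toy parser (assumption): every byte up to a 0 terminator is one code.
pub fn parse_abbreviations(raw: &[u8]) -> Abbreviations {
    let codes = raw
        .iter()
        .take_while(|&&byte| byte != 0)
        .map(|&byte| u64::from(byte))
        .collect();
    Abbreviations { codes }
}

/// Hypothetical stand-in for a unit that owns a handle to its table.
pub struct Unit {
    pub abbrev_offset: usize,
    pub abbreviations: Arc<Abbreviations>,
}

impl Unit {
    /// Existing-style constructor: always parses the table at `abbrev_offset`.
    pub fn new(debug_abbrev: &[u8], abbrev_offset: usize) -> Unit {
        let raw = debug_abbrev.get(abbrev_offset..).unwrap_or_default();
        Unit {
            abbrev_offset,
            abbreviations: Arc::new(parse_abbreviations(raw)),
        }
    }

    /// Hypothetical second constructor: reuses a table the caller already parsed.
    pub fn with_abbreviations(abbrev_offset: usize, abbreviations: Arc<Abbreviations>) -> Unit {
        Unit {
            abbrev_offset,
            abbreviations,
        }
    }
}
```

With something along these lines, a caller that knows two units share an offset could parse once via `Unit::new` and pass `Arc::clone(&first.abbreviations)` to `Unit::with_abbreviations` for the rest, without swapping fields in and out.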

Swatinem avatar Swatinem commented on August 22, 2024

This is really an interesting case and very much depends on the underlying DWARF data.

For my pathological case, there is only a single Abbreviations table with ~1500 entries in .debug_abbrev that is shared by all CUs.

But other files (for example electron.debug) contain a lot of duplication and seem to have one abbreviation table per CU.

I think the second case already wastes bytes in the raw DWARF by not deduplicating these tables, but on the other hand gimli only parses as much as it needs there (well, except for the abbrevs that are duplicated across CUs).
The first case, however, highlights a shortcoming in gimli: it wastes a lot of CPU and memory parsing the complete Abbreviations table for each CU, even though it could get away with parsing it only once.

philipc avatar philipc commented on August 22, 2024

This is definitely a shortcoming in gimli, which I was aware of in the past and forgot about. It wasn't a problem before we added Dwarf and Unit.

mem::swap sounds too error prone. You would need to swap every time before and after calling something that uses it.

Ideally there would be a reference to an abbreviations cache somewhere, perhaps in Dwarf. I'm not sure how hard it will be to design the ownership for that though.

Swatinem avatar Swatinem commented on August 22, 2024

> before and after calling something that uses it.

My current workaround is very brittle anyway, in the sense that I very carefully avoided calling anything that uses it. That means things can break easily in the future if people are not careful :-(

And yes, having an Abbreviations cache in Dwarf sounds like a good solution to me. But as you mentioned, ownership might be a pain there, and it would either require some more lifetime parameters or just going with Arc.
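For illustration, the Arc variant could be sketched roughly like this. Again, none of this is gimli's actual API: `Dwarf`, `Abbreviations`, the `abbreviations` method, and the `parse_count` counter are hypothetical, the cache is keyed by the offset into the abbreviation section, and the toy parser just reads bytes up to a 0 terminator.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a parsed abbreviation table (not gimli's type).
#[derive(Debug, PartialEq)]
pub struct Abbreviations {
    pub codes: Vec<u64>,
}

/// Hypothetical stand-in for the top-level object that owns the cache.
pub struct Dwarf {
    debug_abbrev: Vec<u8>,
    /// Parsed tables, keyed by their offset into `debug_abbrev`.
    abbreviations_cache: Mutex<HashMap<usize, Arc<Abbreviations>>>,
    /// Counts real parses, only so the effect of the cache is observable.
    parse_count: AtomicUsize,
}

impl Dwarf {
    pub fn new(debug_abbrev: Vec<u8>) -> Self {
        Dwarf {
            debug_abbrev,
            abbreviations_cache: Mutex::new(HashMap::new()),
            parse_count: AtomicUsize::new(0),
        }
    }

    /// Toy parser (assumption): every byte up to a 0 terminator is one code.
    fn parse_at(&self, offset: usize) -> Abbreviations {
        self.parse_count.fetch_add(1, Ordering::Relaxed);
        let raw = self.debug_abbrev.get(offset..).unwrap_or_default();
        let codes = raw
            .iter()
            .take_while(|&&byte| byte != 0)
            .map(|&byte| u64::from(byte))
            .collect();
        Abbreviations { codes }
    }

    /// Returns the table at `offset`, parsing it only on the first request.
    pub fn abbreviations(&self, offset: usize) -> Arc<Abbreviations> {
        let mut cache = self.abbreviations_cache.lock().unwrap();
        let entry = cache
            .entry(offset)
            .or_insert_with(|| Arc::new(self.parse_at(offset)));
        Arc::clone(entry)
    }

    pub fn parse_count(&self) -> usize {
        self.parse_count.load(Ordering::Relaxed)
    }
}
```

The Mutex lets the lookup take `&self`, so units can be constructed lazily without extra lifetime parameters; the trade-off in this sketch is a lock on every lookup and tables that stay alive as long as the cache does.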

Swatinem avatar Swatinem commented on August 22, 2024

@mstange suggested that it might be dsymutil that merges/deduplicates all the Abbreviations, and indeed, the macOS dSYM for https://github.com/electron/electron/releases/tag/v20.1.4 only has a single Abbreviations table with ~1100 entries. But it only has ~18k CUs, which limits the combinatorial explosion.

My hack in getsentry/symbolic#683 to deduplicate the parsing speeds symcache_debug up from ~26 to ~18 seconds and cuts peak RSS from ~6.7G to ~2.6G, which are very good results.

mstange avatar mstange commented on August 22, 2024

Here's a dSYM with 1902 shared abbreviations and 9927 CUs, which can be used to test potential fixes: https://storage.googleapis.com/profiler-get-symbols-fixtures/XUL-E2C3444B769A3A1887EDA7C34A07A56C0.dSYM.tar.bz2
