Comments (9)

klauspost commented on June 16, 2024

Nice find. I will probably modify the "generic" function for this, since it will otherwise be a generic slowdown (keeping things in level 1 cache is critical)

from rawspeed.

LibRaw commented on June 16, 2024

I don't understand your statement.
The maximum slice count is 15 (so, 4 bits), and the offsets[] table is slices+1 in size.
So, for 32-bit offsets, the table fits in 64 bytes (16*4); for 64-bit offsets, in 128 bytes. That is 2 or 4 cache lines.
L1 cache is about 500 times larger.
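
The arithmetic above can be checked at compile time. This is only a sketch: the names `tableBytes` and `cacheLines` are hypothetical, and the 32-byte line size is an assumption chosen because it reproduces the "2 or 4 cache lines" figure (with 64-byte lines the counts would be 1 and 2).

```cpp
#include <cstddef>
#include <cstdint>

// Maximum slice count quoted in the comment; the table holds slices + 1 entries.
constexpr std::size_t kMaxSlices = 15;
constexpr std::size_t kTableEntries = kMaxSlices + 1;

// Size in bytes of an offset table whose entries have type OffsetT.
template <typename OffsetT>
constexpr std::size_t tableBytes() {
  return kTableEntries * sizeof(OffsetT);
}

// Cache lines spanned by `bytes`, assuming the table starts on a line boundary.
// lineBytes is an assumption, not something stated in the thread.
constexpr std::size_t cacheLines(std::size_t bytes, std::size_t lineBytes) {
  return (bytes + lineBytes - 1) / lineBytes;
}

static_assert(tableBytes<std::uint32_t>() == 64, "16 entries * 4 bytes");
static_assert(tableBytes<std::uint64_t>() == 128, "16 entries * 8 bytes");
```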

LibRaw commented on June 16, 2024

I've renamed offset[] to offset64[] for easy refactoring (so the compiler warns until all places are fixed).
Here is my patch: https://www.dropbox.com/s/i64gott37ekx791/offset64.patch?dl=0
Sorry for not using GitHub's pull request mechanism; my repo is not synced with GitHub.

The maximum data offset is now 32-bit, which is enough for 2 GPix Bayer images and for 0.5-0.7 GPix 4/3-component linear DNGs. That is enough if we want to keep RawSpeed 32-bit compatible.
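
The pixel counts above follow from dividing the 4 GiB reachable with a 32-bit byte offset by the bytes per pixel. A minimal sketch, assuming 16-bit (2-byte) samples, which the comment does not state explicitly; `maxPixels` is a hypothetical helper, not part of RawSpeed.

```cpp
#include <cstdint>

// Bytes addressable with a 32-bit byte offset.
constexpr std::uint64_t kAddressableBytes = std::uint64_t{1} << 32;

// Largest pixel count whose buffer still fits, for a given number of
// components per pixel. bytesPerSample = 2 is an assumption (16-bit samples).
constexpr std::uint64_t maxPixels(std::uint64_t components,
                                  std::uint64_t bytesPerSample = 2) {
  return kAddressableBytes / (components * bytesPerSample);
}

// 1 component (Bayer): 2^31 pixels, i.e. 2 GPix.
static_assert(maxPixels(1) == 2147483648ULL, "Bayer");
// 3 components: roughly 0.7 GPix.
static_assert(maxPixels(3) == 715827882ULL, "3-component linear DNG");
// 4 components: roughly 0.5 GPix.
static_assert(maxPixels(4) == 536870912ULL, "4-component linear DNG");
```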

klauspost commented on June 16, 2024

See PR #52 ... If that works for you, I will merge it.

klauspost commented on June 16, 2024

The reason for not using 64 bits is that 64-bit shifts are very slow on 32-bit platforms.

I don't see any reason to slow down 99% of all decodes, so we handle this situation separately.

It still decodes at ~80MP/sec on my 5-year-old Q6600 system.

LibRaw commented on June 16, 2024

Offset calculation is called once per slice. With a maximum of 16 slices per image, the speed difference is not measurable.

LibRaw commented on June 16, 2024

#52: I am unable to check it right now, as I use my own 64-bit offset code. Sometime in the future I'll rebase onto the current rawspeed from GitHub. In any case, I'll wait until the development branch is merged into master.

klauspost commented on June 16, 2024

No, it is used "slices" times per line, see:

uint32 slices = (uint32)slicesW.size() * (frame.h - skipY);

In all DNGs, that is 1 per line (slices are only a CR2 concept).
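
To make the count concrete, the quoted expression can be isolated in a small function. This is a sketch only: `offsetEntries` and its parameters are hypothetical stand-ins for `slicesW.size()`, `frame.h` and `skipY`, and the frame sizes are made-up examples.

```cpp
#include <cstddef>
#include <cstdint>

// Mirrors the quoted expression:
//   slices = slicesW.size() * (frame.h - skipY)
// i.e. one offset entry per slice width, per decoded line.
constexpr std::uint32_t offsetEntries(std::size_t sliceWidthCount,
                                      std::uint32_t frameHeight,
                                      std::uint32_t skipY) {
  return static_cast<std::uint32_t>(sliceWidthCount) * (frameHeight - skipY);
}

// DNG case: a single slice width, so one entry per decoded line.
static_assert(offsetEntries(1, 4000, 0) == 4000, "one entry per line");
// CR2-style case with 3 slice widths: three entries per line.
static_assert(offsetEntries(3, 4000, 0) == 12000, "three entries per line");
```

Under this reading the offset table grows with the image height rather than staying at 16 entries, which is the point being made about cache footprint.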

klauspost commented on June 16, 2024

I have merged code that uses the generic decoder if 28 bits for offsets isn't enough.
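
A rough sketch of what such a check could look like, not the merged code itself. It assumes the fast path packs a 4-bit slice index (maximum 15 slices, as mentioned earlier in the thread) into the top of a 32-bit word, leaving 28 bits for the byte offset. All names here (`chooseDecoder`, `packOffset`, `DecoderPath`) are hypothetical.

```cpp
#include <cstdint>

// Assumption: 4 bits of a 32-bit word hold the slice index, 28 bits the offset.
constexpr std::uint32_t kOffsetBits = 28;
constexpr std::uint32_t kOffsetMask = (std::uint32_t{1} << kOffsetBits) - 1;

enum class DecoderPath { Fast, Generic };

// The largest byte offset into the image is pitchBytes * height - 1, so the
// fast path is usable only while the whole buffer fits in 2^28 bytes.
constexpr DecoderPath chooseDecoder(std::uint64_t pitchBytes,
                                    std::uint64_t height) {
  return (pitchBytes * height <= (std::uint64_t{1} << kOffsetBits))
             ? DecoderPath::Fast
             : DecoderPath::Generic;
}

// Pack and unpack helpers for the fast path (offset must fit in 28 bits,
// slice in 4 bits).
constexpr std::uint32_t packOffset(std::uint32_t offset, std::uint32_t slice) {
  return (offset & kOffsetMask) | (slice << kOffsetBits);
}
constexpr std::uint32_t unpackOffset(std::uint32_t packed) {
  return packed & kOffsetMask;
}
constexpr std::uint32_t unpackSlice(std::uint32_t packed) {
  return packed >> kOffsetBits;
}
```

For example, an 8000-pixel-wide 16-bit image with 6000 rows occupies 96,000,000 bytes and stays on the fast path, while a 512 MiB buffer would be routed to the generic decoder.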