Eval4Inputs optimisation (little-cms, closed)

glennwilton commented on September 27, 2024
Eval4Inputs optimisation

from little-cms.

Comments (5)

mm2 commented on September 27, 2024

Thanks @glennwilton
The current code is the way it is because it has been tested for speed on many optimizing compilers, and this variant runs fast. In some cases the two loops are vectorized. I choose to keep this code for clarity's sake. Also, I found that the whole tetrahedral partition has to be recomputed, because it is sometimes different when K advances. The assembly produced by the optimizer is quite different from the C source, so those additions are probably not there.
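
For readers following along, the kind of evaluation being discussed can be sketched roughly as below. This is only an illustration under assumptions, not the lcms2 source: the `Lut4D` layout and the names `split`, `eval3_tetra` and `eval4` are hypothetical, the LUT has a single output channel, inputs are assumed to lie in [0, 1], and each axis needs at least two nodes. It interpolates tetrahedrally in the two K slices that bracket the input, re-selecting the tetrahedron for each slice, and then blends the two results linearly.

```c
#include <stddef.h>

/* Hypothetical single-channel LUT layout (not lcms2's): n nodes on each of
   the three inner axes, nk slices along K, stored as
   data[((k*n + x)*n + y)*n + z]. */
typedef struct {
    size_t n;
    size_t nk;
    const float *data;
} Lut4D;

/* Split a normalised coordinate f in [0, 1] into a base node index and a
   fractional offset, keeping base + 1 inside the grid. */
static void split(float f, size_t nodes, size_t *base, float *frac)
{
    float p = f * (float)(nodes - 1);
    size_t b = (size_t)p;
    if (b > nodes - 2) b = nodes - 2;
    *base = b;
    *frac = p - (float)b;
}

/* Tetrahedral interpolation inside K slice k at f = {x, y, z}. */
static float eval3_tetra(const Lut4D *l, size_t k, const float f[3])
{
    size_t p[3];
    float r[3];
    int order[3] = {0, 1, 2};

    for (int i = 0; i < 3; i++)
        split(f[i], l->n, &p[i], &r[i]);

    /* Sort the axes by descending fraction; this ordering is what selects
       one of the six tetrahedra in the cell. */
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2 - i; j++)
            if (r[order[j]] < r[order[j + 1]]) {
                int t = order[j];
                order[j] = order[j + 1];
                order[j + 1] = t;
            }

    /* Walk from the base corner to the opposite corner one axis at a time,
       weighting each step by that axis's fraction. */
    const float *slice = l->data + k * l->n * l->n * l->n;
    float prev = slice[(p[0] * l->n + p[1]) * l->n + p[2]];
    float result = prev;
    for (int i = 0; i < 3; i++) {
        int axis = order[i];
        p[axis] += 1;
        float cur = slice[(p[0] * l->n + p[1]) * l->n + p[2]];
        result += r[axis] * (cur - prev);
        prev = cur;
    }
    return result;
}

/* 4D evaluation: tetrahedral in the two bracketing K slices, linear in K. */
float eval4(const Lut4D *l, float fk, const float f[3])
{
    size_t k0;
    float rk;
    split(fk, l->nk, &k0, &rk);
    float a = eval3_tetra(l, k0, f);
    float b = eval3_tetra(l, k0 + 1, f);
    return a + rk * (b - a);
}
```

Because tetrahedral interpolation reproduces linear functions exactly, a LUT filled with a linear ramp is a convenient sanity check for a sketch like this.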

If you want to provide a PR, I will be glad to run it across an extensive throughput test bed I keep for such cases.

glennwilton commented on September 27, 2024

Thanks, I did not consider compiler optimisations, which negate the complexity. I'm using a similar function in JavaScript on 3D/4D LUTs and noticed optimisations that make a big difference in JS, where I need all the optimisations I can get!

mm2 commented on September 27, 2024

Yep, JS is quite a different thing. You can try out what modern optimizers do with C here:

https://godbolt.org/

I tried this code with -Ofast as the only option:

int test()
{
    int a = 0;

    for (int i = 0; i < 10; i++)
    {
        a += i * 2;
    }

    for (int j = 0; j < 10; j++)
    {
        a -= j / 3;
    }

    return a;
}

And the compiler generated this assembly:


test():
        mov     eax, 78
        ret
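
The constant 78 can be checked by hand: the first loop adds 2 * (0 + 1 + ... + 9) = 90, and the second subtracts the integer quotients 0, 0, 0, 1, 1, 1, 2, 2, 2, 3, which total 12. A small sketch that computes the two parts separately is shown here; `sum_doubled` and `sum_thirds` are names made up for this illustration and are not part of the snippet above.

```c
/* Sum of i*2 for i in [0, n): the contribution of the first loop. */
int sum_doubled(int n)
{
    int s = 0;
    for (int i = 0; i < n; i++)
        s += i * 2;
    return s;
}

/* Sum of j/3 (integer division) for j in [0, n): what the second loop
   subtracts. */
int sum_thirds(int n)
{
    int s = 0;
    for (int j = 0; j < n; j++)
        s += j / 3;
    return s;
}
```

With n = 10 the difference `sum_doubled(10) - sum_thirds(10)` is 90 - 12 = 78, the value the compiler loads straight into `eax`.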

glennwilton commented on September 27, 2024

That is interesting. It has been a long time since I did anything in C.

Out of interest, are there any utilities for testing the throughput of LittleCMS? I'm curious to see the speed difference between LCMS and what I've done in JS. My C is very rusty, but I suspect I could modify transicc.

On my Ryzen 7 3700X, in JavaScript (single core, no threads), using a prebuilt 33x33x33 RGB->CMYK LUT with 64-bit float values and 8-bit input/output arrays, Chrome pushes through 39 million pixels per second and Firefox 36 million pixels per second. That is fast enough for what I need, but it would be interesting to see what C code can pump through.
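
A rough way to get a comparable pixels-per-second figure in C is sketched below, assuming a simple `clock()`-based timer is good enough. Everything here is hypothetical scaffolding rather than LittleCMS code: `pixel_fn`, `naive_rgb_to_cmyk` and `measure_pixels_per_second` are invented names, and the naive conversion is only a placeholder workload with no colour management. To time LittleCMS itself, the placeholder would presumably be swapped for a wrapper around `cmsDoTransform` on a prepared transform.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical signature for the routine under test: packed 8-bit RGB in,
   packed 8-bit CMYK out. */
typedef void (*pixel_fn)(const uint8_t *rgb, uint8_t *cmyk, size_t n_pixels);

/* Placeholder workload only: a naive, non-colour-managed RGB -> CMYK. */
void naive_rgb_to_cmyk(const uint8_t *rgb, uint8_t *cmyk, size_t n_pixels)
{
    for (size_t i = 0; i < n_pixels; i++) {
        uint8_t c = (uint8_t)(255 - rgb[3 * i]);
        uint8_t m = (uint8_t)(255 - rgb[3 * i + 1]);
        uint8_t y = (uint8_t)(255 - rgb[3 * i + 2]);
        uint8_t k = c < m ? c : m;
        if (y < k) k = y;
        cmyk[4 * i]     = (uint8_t)(c - k);
        cmyk[4 * i + 1] = (uint8_t)(m - k);
        cmyk[4 * i + 2] = (uint8_t)(y - k);
        cmyk[4 * i + 3] = k;
    }
}

/* Run fn over the buffer `repeats` times and return pixels per second of
   CPU time, or 0.0 if the run was too short for clock() to resolve. */
double measure_pixels_per_second(pixel_fn fn, const uint8_t *rgb,
                                 uint8_t *cmyk, size_t n_pixels, int repeats)
{
    clock_t start = clock();
    for (int i = 0; i < repeats; i++)
        fn(rgb, cmyk, n_pixels);
    clock_t end = clock();

    double seconds = (double)(end - start) / CLOCKS_PER_SEC;
    if (seconds <= 0.0)
        return 0.0;
    return (double)n_pixels * (double)repeats / seconds;
}
```

For a meaningful number the buffer should hold at least a few million pixels and the run should be repeated enough times to last a noticeable fraction of a second.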

mm2 commented on September 27, 2024

No defect was described here, so I am closing the issue.
