Extend C-Bindings? (highwayhash, OPEN)

google commented on May 14, 2024
Extend C-Bindings?

from highwayhash.

Comments (6)

jan-wassenberg commented on May 14, 2024

Hi, thanks for reaching out. We could extend them whenever someone needs them. Is speed a concern? If not, there's already the simple C version in the c/ directory, which includes Cat.
Otherwise, if you're willing to extend the bindings, I'd be happy to review and integrate.

ifadams commented on May 14, 2024

Hi Jan,

Sorry for the slow response.

Speed is indeed a concern; we originally used exactly the ones you pointed out.

Since the original question, we have made some hacky C bindings wrapped around the C++, but so far they are drastically underperforming. We're definitely hitting the AVX instructions (confirmed with valgrind/callgrind) and getting correct hash values back, so I'm not sure what we're doing wrong. We've got aligned allocations in the buffers, and I've seen similarly bad performance when I just use the native CPP class, so I'm clearly doing something wrong.

Any common "gotchas" that you might know of for us to look at would be much appreciated!

Thanks again.

Ian

jan-wassenberg commented on May 14, 2024

Hi Ian, no worries!

> I've seen similarly bad performance when I just use the native CPP class, so I'm clearly doing something wrong.

I'm not sure I understand the problem - are you saying both the C and C++ versions are equally slow? What's the basis for comparison - the published benchmark results?

Some ideas: is the InstructionSets dispatcher involved? Especially with Cat, the app-specific code generating data and the calls to Cat really should be inlined together with only a single call to the dispatcher (instead of once per Append).

Somewhat related: compilers didn't seem able to keep the hash state in registers across calls to Append, so we're loading 128 bytes for every call. Might be even worse on an older/non-Clang compiler.

Also, depending on how the C++ code and its wrapper are compiled, we might get VZEROUPPER after every function call. Does it help to enable -mavx2 in all translation units?
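
The single-dispatch suggestion above can be sketched as follows. This is only an illustration under stated assumptions, not the HighwayHash API: `ToyState` is a toy FNV-1a state standing in for the real hash state, and `Target`, `DispatchAppend`, `DispatchStream` and the `cpu_has_wide` flag are hypothetical names. The point is that the whole per-stream loop lives inside one target-specific function, so the dispatcher runs once per stream rather than once per Append.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Toy streaming state (FNV-1a) standing in for a real hash state. NOT HighwayHash.
struct ToyState {
  uint64_t h = 14695981039346656037ull;
  void Append(const char* bytes, size_t size) {
    assert(bytes != nullptr || size == 0);
    for (size_t i = 0; i < size; ++i) {
      h ^= static_cast<unsigned char>(bytes[i]);
      h *= 1099511628211ull;
    }
  }
};

// Hypothetical instruction-set targets; both run the same toy code here.
enum class Target { kPortable, kWide };

// Counts how often the (hypothetical) dispatcher is consulted.
int g_dispatch_calls = 0;

// Target-specific entry point for a single Append (the pattern to avoid).
template <Target kTarget>
void AppendOne(ToyState* state, const char* bytes, size_t size) {
  state->Append(bytes, size);
}

// Target-specific entry point for the whole stream: the state and the
// Append loop share one function, so the compiler is free to inline them.
template <Target kTarget>
uint64_t HashWholeStream(const std::vector<std::string>& chunks) {
  ToyState state;
  for (const std::string& chunk : chunks) {
    state.Append(chunk.data(), chunk.size());
  }
  return state.h;
}

using AppendFn = void (*)(ToyState*, const char*, size_t);
using StreamFn = uint64_t (*)(const std::vector<std::string>&);

AppendFn DispatchAppend(bool cpu_has_wide) {
  ++g_dispatch_calls;
  if (cpu_has_wide) return &AppendOne<Target::kWide>;
  return &AppendOne<Target::kPortable>;
}

StreamFn DispatchStream(bool cpu_has_wide) {
  ++g_dispatch_calls;
  if (cpu_has_wide) return &HashWholeStream<Target::kWide>;
  return &HashWholeStream<Target::kPortable>;
}

// Pattern to avoid: one dispatch (and one indirect call) per chunk.
uint64_t HashDispatchPerChunk(const std::vector<std::string>& chunks,
                              bool cpu_has_wide) {
  ToyState state;
  for (const std::string& chunk : chunks) {
    DispatchAppend(cpu_has_wide)(&state, chunk.data(), chunk.size());
  }
  return state.h;
}

// Suggested pattern: dispatch once, then stay inside target-specific code.
uint64_t HashDispatchOnce(const std::vector<std::string>& chunks,
                          bool cpu_has_wide) {
  return DispatchStream(cpu_has_wide)(chunks);
}
```

Both functions return the same hash for the same chunks. They differ only in how many times the dispatcher is consulted and in whether the Append loop is visible to the optimizer as a single unit.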

ifadams commented on May 14, 2024

Hi Jan,

Right now, the C++ version with the -mavx2 flag set is running significantly slower than the pure C implementation when we stream file data through them. Our measurement is just a crude timer for MB/s throughput, and we're getting something like <100 MB/s using the C++ library. We're pulling from the page cache, so file throughput is >5 GB/s, so that's not the issue.
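
For reference, a crude throughput timer of the kind described might look like the sketch below. It is an assumption about the measurement, not the code actually used in this thread: `ThroughputMBps` and `MeasureMBps` are hypothetical helper names, 1 MB is taken as 10^6 bytes, and `fn` is whatever hashing call is under test.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>

// Converts a byte count and an elapsed time into MB/s (1 MB = 1e6 bytes).
double ThroughputMBps(uint64_t bytes, double seconds) {
  assert(seconds > 0.0);
  return static_cast<double>(bytes) / 1e6 / seconds;
}

// Calls fn(data, size) `reps` times and returns the observed MB/s.
// Returns 0.0 if the clock reports no elapsed time.
template <typename Fn>
double MeasureMBps(const Fn& fn, const char* data, size_t size, int reps) {
  using Clock = std::chrono::steady_clock;
  const Clock::time_point start = Clock::now();
  for (int i = 0; i < reps; ++i) {
    fn(data, size);
  }
  const std::chrono::duration<double> elapsed = Clock::now() - start;
  if (elapsed.count() <= 0.0) return 0.0;
  const uint64_t total_bytes =
      static_cast<uint64_t>(size) * static_cast<uint64_t>(reps);
  return ThroughputMBps(total_bytes, elapsed.count());
}
```

Timing the same buffer with both the C and the C++ code path, and making sure the hash result is actually consumed so the compiler cannot drop the work, keeps the comparison independent of file I/O.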

That said, the included benchmark is a bit slower than the results published in the README, but not by so much that it could explain the gap. I can provide detailed numbers if that's helpful.

We'll take a look at the stuff you suggested.

Thanks again!

Ian

jan-wassenberg commented on May 14, 2024

Oh, that's a surprise. We aren't using Cat in time-critical apps yet, and I did have trouble with the state not being kept in vector registers, but I'm still surprised it's slower than C.
Would you be able to share disassembly of the relevant parts to see where we're running into trouble?

ifadams commented on May 14, 2024

Hi Jan,

As usual, we got sidetracked before circling back; apologies. I'll see if we can do that. I'm not sure we can (heavyweight bureaucracy on our end, even in innocuous cases), but it's worth a shot.
