Comments (22)

user706 avatar user706 commented on June 29, 2024

Basically:

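#include <functional>
#include <iostream>
#include <type_traits>  // std::is_trivially_copyable, std::decay_t

void print_num(int i)
{
    std::cout << i << '\n';
}

int main()
{
    auto bound = std::bind(print_num, 31337);

    // The check under discussion, applied to the result of std::bind.
    static_assert(std::is_trivially_copyable<std::decay_t<decltype(bound)>>{},
                  "functor not trivially copyable");

    auto b2 = bound;  // copying the bind object itself works fine

    return 0;
}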

The static_assert above fails to compile.
Do we need it in code/extern/generic_cmake/generic/forwarder.hpp?

from generic.

user1095108 avatar user1095108 commented on June 29, 2024

Yes, the static_assert is necessary; it's what distinguishes gnr::forwarder from std::function. You may have noticed that gnr::forwarder has no destructor and does not invoke any destructors. This makes it perform better than std::function: the invocation code compiles to a handful of instructions in most cases, but it also makes the static_assert necessary. You see, gnr::forwarder is for speed fiends who want some of the convenience that comes with std::function but don't want the cost. There would be no reason for gnr::forwarder to exist if it were not for the static_assert.
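
For illustration only, here is a rough sketch of the idea described above: a delegate that stores the functor in a fixed buffer, remembers its type through a stub function, and never runs a destructor. The names tiny_forwarder and times_k, and the default buffer size, are made up for this sketch; this is not the code from forwarder.hpp.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical, simplified stand-in; NOT the real gnr::forwarder.
template <typename Signature, std::size_t N = 4 * sizeof(void*)>
class tiny_forwarder;

template <typename R, typename... Args, std::size_t N>
class tiny_forwarder<R(Args...), N>
{
    // Type-erased invoker; each instantiation of invoke<F> "remembers" F.
    R (*stub_)(void*, Args&&...) = nullptr;

    // Raw storage for the functor. No destructor is ever run on it.
    alignas(std::max_align_t) unsigned char store_[N];

    template <typename F>
    static R invoke(void* p, Args&&... args)
    {
        return (*static_cast<F*>(p))(std::forward<Args>(args)...);
    }

public:
    tiny_forwarder() = default;

    template <typename F>
    tiny_forwarder(F f) noexcept
    {
        static_assert(sizeof(F) <= N, "functor too large");
        // Without a destructor call, only types that need none are safe.
        static_assert(std::is_trivially_copyable<F>::value,
                      "functor not trivially copyable");

        ::new (static_cast<void*>(store_)) F(std::move(f));
        stub_ = &invoke<F>;
    }

    R operator()(Args... args)
    {
        assert(stub_ != nullptr && "calling an empty delegate");
        return stub_(store_, std::forward<Args>(args)...);
    }
};

// Hypothetical functor: plain data, so it passes the static_assert.
struct times_k
{
    int k;
    int operator()(int v) const { return v * k; }
};
```

Under these assumptions, tiny_forwarder<int(int)> f{times_k{3}}; f(14); goes through one indirect call, and copying f is a plain member-wise copy because the class declares no copy constructor or destructor of its own.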

user706 avatar user706 commented on June 29, 2024

Hmm... ok I think I see where you're coming from...

But: Does a missing destructor produce faster code, even for trivially copyable objects, that don't really need a destructor?

Or is the static_assert there to avoid possible leaks (or late deallocation (in the case of smart-pointers being held)) since the destructor (corresponding to placement-new) is not called?
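
To make the question concrete, here is a small sketch of what the trait accepts and rejects. The functor types scale_by and scale_shared are hypothetical and exist only for this example.

```cpp
#include <functional>
#include <memory>
#include <type_traits>

// Hypothetical functor holding plain data: nothing to clean up.
struct scale_by
{
    int factor;
    int operator()(int v) const { return v * factor; }
};

// Hypothetical functor holding a smart pointer: its destructor must run,
// otherwise the reference count is never decremented.
struct scale_shared
{
    std::shared_ptr<int> factor;
    int operator()(int v) const { return v * *factor; }
};

// Accepted by a "trivially copyable only" delegate.
static_assert(std::is_trivially_copyable<scale_by>::value, "");
static_assert(std::is_trivially_destructible<scale_by>::value, "");

// Rejected: skipping ~scale_shared() would leak the shared state.
static_assert(!std::is_trivially_copyable<scale_shared>::value, "");

// std::function itself would also be rejected as a stored functor.
static_assert(!std::is_trivially_copyable<std::function<int(int)>>::value, "");
```

So the assertion does both jobs at once: it rules out types whose destructor matters, and it guarantees that copying the buffer byte by byte is a valid copy.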

user706 avatar user706 commented on June 29, 2024

I think I know why you demand a type that does not need an explicit destructor.

Because, like std::function, gnr::forwarder uses type erasure. (Here functor_type is erased: the template parameter F is known only in assign(), not in the class.)

So how does one handle destruction of a previously placement-new constructed type, when one does not know the type.

buffer->~MyType();    // but one does not know MyType anymore

?

The "simplest solution" is to just bypass the problem, by demanding a type whose destructor does nothing...

user1095108 avatar user1095108 commented on June 29, 2024

The type of the object you erase is "remembered", since you instantiate a function, that acts as an invoker, i.e. the stub. Similarly you can instantiate functions that act as "deleters" and they remember the type too. Or you can instantiate a static class which remembers the type you erased and store a pointer to it. These are all complications, that affect performance, however. I saw from disassembly, that forwarder produced the fastest code of all alternatives, with caveats.
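
As a rough sketch of the "deleter" variant mentioned above (hypothetical names, fixed int(int) signature for brevity, not code from this repository): a second stub is instantiated per functor type and called from the destructor.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <utility>

// Hypothetical illustration of invoker and deleter stubs.
class erased_callable
{
    alignas(std::max_align_t) unsigned char store_[32];
    int  (*invoke_)(void*, int) = nullptr;  // knows F, calls it
    void (*destroy_)(void*) = nullptr;      // knows F, runs ~F()

    template <typename F>
    static int invoke_stub(void* p, int v) { return (*static_cast<F*>(p))(v); }

    template <typename F>
    static void destroy_stub(void* p) { static_cast<F*>(p)->~F(); }

public:
    template <typename F>
    explicit erased_callable(F f)
    {
        static_assert(sizeof(F) <= sizeof(store_), "functor too large");
        ::new (static_cast<void*>(store_)) F(std::move(f));
        invoke_ = &invoke_stub<F>;
        destroy_ = &destroy_stub<F>;
    }

    // Copying would need a third stub; left out to keep the sketch short.
    erased_callable(const erased_callable&) = delete;
    erased_callable& operator=(const erased_callable&) = delete;

    // The extra indirect call here is the cost a forwarder-style class avoids.
    ~erased_callable() { destroy_(store_); }

    int operator()(int v) { return invoke_(store_, v); }
};

// Hypothetical functor that is not trivially copyable.
struct scale_shared
{
    std::shared_ptr<int> factor;
    int operator()(int v) const { return v * *factor; }
};

inline void demo()
{
    auto factor = std::make_shared<int>(3);
    {
        erased_callable c{scale_shared{factor}};
        assert(c(14) == 42);
        assert(factor.use_count() == 2);  // the stored copy holds a reference
    }
    assert(factor.use_count() == 1);      // ~scale_shared() ran via destroy_stub
}
```

The price is a second function pointer per object plus an indirect call on destruction, which is the kind of complication referred to above.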

user706 avatar user706 commented on June 29, 2024

The type of the object you erase is "remembered", since you instantiate a function, that acts as an invoker, i.e. the stub. Similarly you can instantiate functions that act as "deleters" and they remember the type too. Or you can instantiate a static class which remembers the type you erased and store a pointer to it.

Ah yes, thanks!

These are all complications, that affect performance, however. I saw from disassembly, that forwarder produced the fastest code of all alternatives, with caveats.

Very nice!

However... this thing about the fastest code has caveats (as you write). It is very dependent on

  • platform/architecture (e.g. 64 bit vs 32 bit; arm vs intel, etc.)
  • compiler and C++ standard used
  • optimization settings
  • what one is actually measuring (just function call, or including time of constructor and possible destructor)
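
The last point can be sketched as follows: time the construction of the delegate and the call loop separately, so that it is clear which of the two a number refers to. The names timing and measure are hypothetical; this is not taken from the benchmark repository.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

struct timing
{
    double construct_s;  // time spent building the delegate
    double call_s;       // time spent in the call loop only
    long   checksum;     // keeps the calls observable to the optimizer
};

// `make` is any callable returning a delegate invocable as int(int).
template <typename Make>
timing measure(Make make, int iterations)
{
    assert(iterations >= 0);

    using clock = std::chrono::steady_clock;
    using fsec  = std::chrono::duration<double>;

    auto const t0 = clock::now();
    auto delegate = make();            // construction (may allocate)
    auto const t1 = clock::now();

    long sum = 0;
    for (int i = 0; i < iterations; ++i)
        sum += delegate(i);            // invocation only
    auto const t2 = clock::now();

    return {fsec(t1 - t0).count(), fsec(t2 - t1).count(), sum};
}
```

A call such as measure([] { return std::function<int(int)>([](int v) { return 2 * v; }); }, 1000) then reports the two costs as separate fields.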

Because to tell you the truth, in the following measurements gnr::forwarder is slower than std::function:
https://github.com/user706/CxxFunctionBenchmark#sample-result (was run on my machine [Intel Core i7-6700K], with -O3 -std=c++17 -DNDEBUG on gcc-8.1.0)

But yes, if I want no heap, then gnr::forwarder (and embxx::util::StaticFunction) is faster than using "a wrapper (std::ref, or gnr::memfun) in combination with std::function" to avoid heap-allocation.

#include <functional>
// Also needed: the gnr forwarder/memfun headers and the embxx StaticFunction header.

struct A
{
    A(): a(2) {}

    int operator()(int val) { return val * a; }
    int times_a(int val) { return val * a; } // target of gnr::memfun below
    int arr[8] = {}; // pad it fat!
    int a;
};

int main()
{
    A a;
    std::function<int(int)> f1{std::ref(a)};

    auto binder = gnr::memfun<MEMFUN(A::times_a)>(a);
    std::function<int(int)> f2{binder};

    gnr::forwarder<int(int), sizeof(A)> f3{a};

    embxx::util::StaticFunction<int(int), sizeof(A)+4> f4{a};

    for (int i = 0; i < 10; ++i) {
        f1(i); // slow
        f2(i); // slow
        f3(i); // fast
        f4(i); // fastest
    }
}

On my machine, the benchmark here gives the following output:

[caller]
Perf< direct >:          0.5995013430 [s] {checksum: 0}
Perf< forw_memfun >:     1.5839021150 [s] {checksum: 0}
Perf< function_memfun >: 1.5768125770 [s] {checksum: 0}    // std::function with gnr::memfun
Perf< forw >:            1.5624859270 [s] {checksum: 0}    // gnr::forwarder
Perf< function_ref >:    1.5990735100 [s] {checksum: 0}    // std::function with std::ref
Perf< static_func >:     1.0868871510 [s] {checksum: 0}    // embxx::util::StaticFunction

\\\\\\\ memuse printout (#0) -----------
memory_tot                           = 0
memory_accumulation_since_last_print = 0
num_alloc_tot                        = 0
num_alloc_cur                        = 0

user706 avatar user706 commented on June 29, 2024

Because to tell you the truth, in the following measurements gnr::forwarder is slower than std::function:
https://github.com/user706/CxxFunctionBenchmark#sample-result (was run on my machine [Intel Core i7-6700K], with -O3 -std=c++17 -DNDEBUG on gcc-8.1.0)

What numbers do you get, if you do:

git clone https://github.com/user706/CxxFunctionBenchmark.git
cd CxxFunctionBenchmark/
mkdir build/
cd build/
cmake ..   # or if you want to use a custom boost:    cmake -DBOOST_ROOT=/path/to/boost ..
make -j4 VERBOSE=1
./various

?

user706 avatar user706 commented on June 29, 2024

I've run my benchmark again...

got new numbers...
https://pastebin.com/ETXb453D

So in addition to:

However... this thing about the fastest code has caveats (as you write). It is very dependent on

  • platform/architecture (e.g. 64 bit vs 32 bit; arm vs intel, etc.)
  • compiler and C++ standard used
  • optimization settings
  • what one is actually measuring (just function call, or including time of constructor and possible destructor)

one needs to add

  • what else is running on your PC, while you're running the benchmark

This type of benchmarking stuff needs to be taken with a grain of salt, I think...
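
One common way to reduce (not remove) the influence of other processes is to repeat the measurement and keep the best run. A minimal sketch, assuming a hypothetical helper named best_of that is not part of the benchmark repository:

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <limits>

// Runs `run` `trials` times and returns the shortest wall-clock time in
// seconds. The minimum is less sensitive to background load than a single
// run or the mean, but it is still only an approximation.
template <typename F>
double best_of(int trials, F&& run)
{
    assert(trials > 0);

    using clock = std::chrono::steady_clock;
    double best = std::numeric_limits<double>::infinity();

    for (int t = 0; t < trials; ++t) {
        auto const start = clock::now();
        run();
        std::chrono::duration<double> const elapsed = clock::now() - start;
        best = std::min(best, elapsed.count());
    }
    return best;
}
```

Even with this, numbers from different machines, compilers or flags are not comparable, which is the point made above.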

user1095108 avatar user1095108 commented on June 29, 2024

Anyway, I'm surprised, embxx_util_StaticFunction is more versatile than forwarder, produces more code, has virtual functions, yet is faster :) Maybe it's a warning to us, that the choice of a delegate is not as important as actually writing a useful app :)

user706 avatar user706 commented on June 29, 2024

But essentially, your tests are flawed in the sense, that you don't define NDEBUG while compiling, if there are asserts in the code, they will skew the results.

No, I do compile with -DNDEBUG. The CMakeLists.txt sets CMAKE_BUILD_TYPE to Release, and if you then run make VERBOSE=1 you'll see the -DNDEBUG flag being passed.

user1095108 avatar user1095108 commented on June 29, 2024

Ha ha, never mind. If you have time, you can research why one delegate is better than another. It seems you are interested in this, but it really is not very relevant: you can't always get a perfect compiler/architecture fit. As for me, I'm allergic to everything virtual, so even if a delegate using virtual member functions is faster, I am not going to use it :) BTW: he should have qualified the virtuals as final.
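
To show what the remark about final refers to, here is a minimal sketch of a virtual-based invoker. The names invoker_base, invoker_impl and twice are hypothetical; this is not the embxx code.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical type-erasure interface in the style of a virtual-based delegate.
struct invoker_base
{
    virtual ~invoker_base() = default;
    virtual int call(int) = 0;
};

// Marking the class and the override `final` tells the compiler that nothing
// can override call() further, so calls made through invoker_impl<F> itself
// can be resolved statically. Calls through invoker_base& are, in general,
// still dispatched virtually.
template <typename F>
struct invoker_impl final : invoker_base
{
    explicit invoker_impl(F fn) : f(std::move(fn)) {}
    int call(int v) final { return f(v); }
    F f;
};

struct twice
{
    int operator()(int v) const { return 2 * v; }
};

inline void check()
{
    invoker_impl<twice> impl{twice{}};
    invoker_base& base = impl;
    assert(base.call(21) == 42);  // through the base interface
    assert(impl.call(21) == 42);  // static type is final
}
```

Whether this changes the generated code for a given delegate depends on where the static type is visible to the compiler, so it is a hint rather than a guaranteed speed-up.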

user706 avatar user706 commented on June 29, 2024

Anyway, I'm surprised, embxx_util_StaticFunction is more versatile than forwarder, produces more code, has virtual functions, yet is faster :)

But it has a stricter license...

you can't always get a perfect compiler/architecture fit

yip!

user706 avatar user706 commented on June 29, 2024

If you have time, you can research why one delegate is better than another. It seems you are interested in this

Ha, well, partially, but it's a rabbit hole that leads into a gigantic cave of vast proportions, and in the end one does not know up from down.

You put it best here:

Maybe it's a warning to us, that the choice of a delegate is not as important as actually writing a useful app :)

user706 avatar user706 commented on June 29, 2024

If you publish an extensive benchmark people will take a look at it, since C++ delegates fascinate many people for some reason.

I don't think I'll go beyond the few pull-requests I've done here.

But just a thought...

What about a different approach: run-time code generation? (Some JIT approaches could perhaps be used.)

One would generate custom invocation code that is guaranteed to be the fastest possible.
Perhaps in some circumstances the overhead of jumping to that generated invocation routine, and invoking the function, would be smaller than any other approach.

But that would need some time, and maybe I'm just dreaming and in the end it would not really be faster...

user1095108 avatar user1095108 commented on June 29, 2024

Runtime code generation is so fancy, you better do it in your own repository. It's also architecture-dependent, unless you generate code for some virtual machine (like JVM). If you have an idea about your own delegate, just write one - I don't mind. I doubt though, that JIT is the way towards a faster delegate, C++ compilers are simply too good. Better spend your time on something else. I'm satisfied with forwarder.hpp, callback.hpp (which you didn't test at all) and memfun.hpp. If you figure out why others are faster, be sure to tell me :)

user1095108 avatar user1095108 commented on June 29, 2024

Just some more thoughts: bothering with delegates isn't worth it. I suggest you find something worthwhile like graphics..., and implement stuff from that field. There's plenty of delegates to choose from, loool. Or maybe you could just use std::function<>.
