
adaptive_predicates's People

Contributors

mfdeakin

adaptive_predicates's Issues

Improve compile times and reduce compiled sizes

The adaptive_eval_impl class relies on complicated template metaprogramming, which makes it slow to compile and, with certain compilers (clang), can prevent compilation altogether for larger expressions (e.g., the determinant of a 3x3 in-sphere matrix).
I've done some work on replacing the current system with a simpler one and am recording the results here.
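To make the "run-time integer index" and "compile-time integer index" labels used in the results below concrete, here is a minimal sketch of one plausible reading: intermediate results live in a flat array, and each expression node finds its slot through an integer offset that is either a function argument or a template parameter. The names `Sum`, `node_count`, `eval_rt`, `eval_ct`, and `demo_indexing` are hypothetical and are not taken from adaptive_predicates; the actual replacement may look quite different.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical expression node (illustration only, not the library's type).
template <typename L, typename R>
struct Sum {
  L lhs;
  R rhs;
};

// Number of internal nodes, i.e. storage slots the expression needs.
template <typename E>
struct node_count {
  static constexpr std::size_t value = 0;  // leaves need no slot
};
template <typename L, typename R>
struct node_count<Sum<L, R>> {
  static constexpr std::size_t value =
      1 + node_count<L>::value + node_count<R>::value;
};

// Run-time indexing: the slot offset is an ordinary function argument.
inline double eval_rt(double leaf, double *, std::size_t) { return leaf; }
template <typename L, typename R>
double eval_rt(const Sum<L, R> &e, double *store, std::size_t idx) {
  const double l = eval_rt(e.lhs, store, idx + 1);
  const double r = eval_rt(e.rhs, store, idx + 1 + node_count<L>::value);
  store[idx] = l + r;
  return store[idx];
}

// Compile-time indexing: the slot offset is a template parameter.
template <std::size_t Idx>
double eval_ct(double leaf, double *) {
  return leaf;
}
template <std::size_t Idx, typename L, typename R>
double eval_ct(const Sum<L, R> &e, double *store) {
  const double l = eval_ct<Idx + 1>(e.lhs, store);
  const double r = eval_ct<Idx + 1 + node_count<L>::value>(e.rhs, store);
  store[Idx] = l + r;
  return store[Idx];
}

// Usage: both variants fill the same slots and agree on the result.
inline double demo_indexing() {
  Sum<Sum<double, double>, double> e{{1.0, 2.0}, 3.0};
  std::array<double, node_count<decltype(e)>::value> rt{}, ct{};
  const double a = eval_rt(e, rt.data(), 0);
  const double b = eval_ct<0>(e, ct.data());
  assert(a == b && rt == ct);
  return a;
}
```

In a sketch like this, the compile-time variant instantiates one function per (node type, offset) pair, while the run-time variant instantiates one per node type only, which is one way the two could trade off compile time and binary size against run-time cost.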

GCC Results:

Original tests binary size (with debug info):
3.7 MiB

Run-time integer index tests binary size:
3.3 MiB

Compile-time integer index tests binary size:
3.4 MiB

Compile times (seconds):
Original:

tests             test_benchmarks
real     user     real     user
12.286   9.774    10.996   8.581
12.020   9.609    11.043   8.612
12.137   9.626    11.030   8.603
12.150   9.648    11.036   8.655

Run-time indexing:

tests             test_benchmarks
real     user     real     user
12.161   9.619    10.209   7.896
11.609   9.226    10.213   7.914
11.595   9.168    10.147   7.814
11.743   9.245    10.163   7.888

Compile-time indexing:

tests             test_benchmarks
real     user     real     user
11.696   9.285    10.622   8.358
11.723   9.323    10.715   8.335
11.770   9.344    10.590   8.275
11.801   9.363    10.627   8.277

Original benchmarks:

Adaptive eval orient2d      Adaptive eval matrix orient2d
mean (ns)   std dev (ns)    mean (ns)   std dev (ns)
2.73405     0.491417        72.1173     12.1189
107.243     14.8644         381.397     65.1008
2.7146      0.445853        201.584     31.1763
113.471     16.7273         391.132     57.8666

Run-time indexing benchmarks:

Adaptive eval orient2d      Adaptive eval matrix orient2d
mean (ns)   std dev (ns)    mean (ns)   std dev (ns)
2.74295     0.474749        69.0401     9.60249
118.247     15.4077         382.39      71.9227
3.22872     0.548605        230.826     36.1118
124.454     21.0036         400.177     70.1243

Compile-time indexing benchmarks:

Adaptive eval orient2d      Adaptive eval matrix orient2d
mean (ns)   std dev (ns)    mean (ns)   std dev (ns)
3.23738     0.538081        66.9969     10.9523
114.434     19.0912         362.157     49.3593
2.69643     0.364822        239.817     50.3422
127.445     25.6992         386.584     71.8093

Clang Results:

Original tests binary size (with debug info):
2.2 MiB

Run-time integer index tests binary size:
2.1 MiB

Compile-time integer index tests binary size:
2.1 MiB

Compile times (seconds):
Original:

tests
real     user
12.620   12.385
12.842   12.647
12.802   12.599
12.819   12.608

Run-time indexing:

tests
real     user
12.206   11.937
12.370   12.116
12.429   12.150
12.150   11.935

Compile-time indexing:

tests
real     user
12.468   12.270
12.329   12.130
12.535   12.340
12.314   12.110

Original benchmarks:

Adaptive eval orient2d      Adaptive eval matrix orient2d
mean (ns)   std dev (ns)    mean (ns)   std dev (ns)
3.66767     0.477605        40.7895     4.12443
141.515     42.5352         384.805     34.0432
3.76203     0.702961        168.353     23.1179
143.798     26.6885         409.305     62.9992

Run-time indexing benchmarks:

Adaptive eval orient2d      Adaptive eval matrix orient2d
mean (ns)   std dev (ns)    mean (ns)   std dev (ns)
3.79398     0.793719        42.0438     8.52517
137.486     8.10147         417.849     73.3438
3.77942     0.217006        204.442     34.484
145.141     21.3075         430.088     70.8435

Compile-time indexing benchmarks:

Adaptive eval orient2d      Adaptive eval matrix orient2d
mean (ns)   std dev (ns)    mean (ns)   std dev (ns)
3.68414     0.511048        40.3166     3.77443
130.63      10.6486         392.006     37.8768
3.70758     0.572383        208.538     20.4872
148.522     8.42591         426.24      65.8583

GPU Performance of exact evaluation not great

I'm guessing performance would be much better if shared memory were used; as is, register pressure is likely very high.

  • Verify that register pressure is the main bottleneck
  • Implement or use an allocator that can choose to make use of the stack or, if available, fast shared memory
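As a rough illustration of the second item, the sketch below shows a bump allocator that hands out storage from whatever buffer the caller supplies. It is plain host-side C++ with hypothetical names (`BufferArena`, `demo_arena`), not code from the library. The assumption is that on the host the buffer would be a stack array, while in a CUDA kernel the same interface could be pointed at a `__shared__` array instead.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bump allocator over a caller-provided buffer.
// The arena does not own the memory, so the buffer may live on the stack
// or, on a GPU, in shared memory (assumption, not tested here).
template <typename T>
class BufferArena {
public:
  BufferArena(T *buf, std::size_t capacity)
      : buf_(buf), capacity_(capacity), used_(0) {}

  // Returns n contiguous elements, or nullptr if the buffer is exhausted
  // so the caller can fall back to another memory source.
  T *allocate(std::size_t n) {
    if (n > capacity_ - used_) {
      return nullptr;
    }
    T *p = buf_ + used_;
    used_ += n;
    return p;
  }

  // Releases everything at once; individual frees are not supported.
  void reset() { used_ = 0; }

  std::size_t used() const { return used_; }

private:
  T *buf_;
  std::size_t capacity_;
  std::size_t used_;
};

// Usage: carve two blocks out of a stack buffer, then run out of space.
inline std::size_t demo_arena() {
  double stack_buf[8];
  BufferArena<double> arena(stack_buf, 8);
  double *a = arena.allocate(4);
  double *b = arena.allocate(4);
  double *c = arena.allocate(1);  // no room left
  assert(a == stack_buf && b == stack_buf + 4 && c == nullptr);
  return arena.used();
}
```

Keeping the arena non-owning is the design choice that lets the caller decide where the memory comes from; whether this actually relieves register pressure would still need the measurement called for in the first item.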
