Comments (4)
Hello, great library!
I just got back to my old raytracing project (abandoned 10 years ago) and decided it would be nice to work on its performance, which was horrible at best. I started replacing my quick BIH implementation with a BVH, but then decided to switch to an external library and found this one. I got it working, but performance was exactly the same as with my BIH:
bvh::LocallyOrderedClusteringBuilder<bvh::Bvh, int> builder(bvh);
builder.build(bvh::BoundingBox(MasterBVHPrimitive::toVector3(worldbb.min), MasterBVHPrimitive::toVector3(worldbb.max)), bboxes, centers, shapes.size());
bvh::LeafCollapser<bvh::Bvh> collapser(bvh);
collapser.collapse();
I then profiled the execution and found that 35% of the time was being spent inside std::fill, which I traced back to:
struct Vector {
Scalar values[N];
Vector() = default;
bvh__always_inline__ Vector(Scalar s) { std::fill(values, values + N, s); }
As I'm working with 3-dimensional vectors, I then replaced the call to fill with:
values[0] = values[1] = values[2] = s;
My test scene rendering time went from 8 seconds to 3!
Am I using the lib the wrong way? Is there an option that would disable/improve that initialization? Would my change break something somewhere?
Thank you!
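On the question of whether the change could break something: the unrolled assignment matches std::fill only when N is 3. A minimal sketch of a dimension-generic alternative, using a plain loop instead of std::fill, is shown below. `Vec`, `make_filled`, `same_result` and `self_check` are hypothetical names for illustration, not the library's actual `Vector`, and whether the loop is any faster than std::fill depends on the compiler and its flags.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the library's Vector; not its actual definition.
template <typename Scalar, std::size_t N>
struct Vec {
    Scalar values[N];

    Vec() = default;

    // Broadcast constructor written as a plain loop. It is valid for any N,
    // unlike `values[0] = values[1] = values[2] = s`, which assumes N == 3.
    explicit Vec(Scalar s) {
        for (std::size_t i = 0; i < N; ++i)
            values[i] = s;
    }
};

// Reference version using std::fill, for comparison.
template <typename Scalar, std::size_t N>
Vec<Scalar, N> make_filled(Scalar s) {
    Vec<Scalar, N> v;
    std::fill(v.values, v.values + N, s);
    return v;
}

// Returns true when both ways of broadcasting produce the same components.
template <typename Scalar, std::size_t N>
bool same_result(Scalar s) {
    const Vec<Scalar, N> a(s);
    const Vec<Scalar, N> b = make_filled<Scalar, N>(s);
    return std::equal(a.values, a.values + N, b.values);
}

// Small self-check; the asserts are compiled out when NDEBUG is defined.
inline void self_check() {
    assert((same_result<float, 3>(1.5f)));
    assert((same_result<double, 4>(-2.0)));
}
```

Both versions produce identical values, so any difference between them is purely a matter of how well the compiler optimizes each form.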
from bvh.
Hello! @Janos95 I'll look into this. The library you link seems to be focused on closest point search, so chances are it is going to be slower than this library for ray tracing. The numbers they give on their page at least seem to confirm that. @cignox1 Please create a separate issue for this, so as not to pollute this one. I suspect that you forgot to compile with optimizations on, or that your compiler is too old and effectively terrible at optimizing very simple code (see this for an example of how a decent compiler optimizes std::fill). Consider switching to gcc (Mingw64 if you're on Windows) or clang. If you use CMake, set CMAKE_BUILD_TYPE to Release.
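One quick way to check whether a binary was actually built with optimizations is to inspect predefined macros. The sketch below is only an assumption-laden helper, not part of this library: it relies on `__OPTIMIZE__`, which gcc and clang define when optimizing, and on `NDEBUG`, which CMake's default Release flags define for those compilers. The function names are hypothetical.

```cpp
#include <cassert>  // assert() is what NDEBUG switches off
#include <string>

// True when gcc or clang compiled this translation unit with optimizations.
// Other compilers (e.g. MSVC) do not define __OPTIMIZE__, so a "false"
// result there is inconclusive.
constexpr bool built_with_optimizations() {
#ifdef __OPTIMIZE__
    return true;
#else
    return false;
#endif
}

// True when NDEBUG is defined, i.e. assert() expands to nothing.
constexpr bool asserts_disabled() {
#ifdef NDEBUG
    return true;
#else
    return false;
#endif
}

// Human-readable summary that can be printed at program start-up.
inline std::string describe_build() {
    std::string s = built_with_optimizations()
        ? "optimized"
        : "not optimized (or unknown compiler)";
    s += asserts_disabled() ? ", NDEBUG set" : ", NDEBUG not set";
    return s;
}
```

Printing `describe_build()` once before rendering makes it obvious when a profile was taken on a debug build by mistake.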
from bvh.
So, having a quick look at fcpw, plugging it into my path tracer, I get the following trace:
Scene loaded in 1380 ms (651259 vertices, 781184 triangles).
BVH constructed in 623 ms (0 nodes).
Average frame time: 85 ms.
Average frame time: 86 ms.
Average frame time: 87 ms.
Average frame time: 88 ms.
Average frame time: 88 ms.
For reference, the combination bvh::LocallyOrderedClustering + bvh::LeafCollapser of this library gives:
Scene loaded in 1355 ms (651259 vertices, 781184 triangles).
BVH constructed in 174 ms (889959 nodes).
Average frame time: 59 ms.
Average frame time: 58 ms.
Average frame time: 59 ms.
Average frame time: 59 ms.
Average frame time: 59 ms.
Embree (avx2) gives:
Scene loaded in 1405 ms (651259 vertices, 781184 triangles).
BVH constructed in 68 ms (0 nodes).
Average frame time: 42 ms.
Average frame time: 41 ms.
Average frame time: 41 ms.
Average frame time: 42 ms.
Average frame time: 42 ms.
This is of course with optimizations on and machine-specific flags (-O3 -march=native with gcc). As you can see, this library is faster. The claim that fcpw is "only" 0.8x slower than Embree seems to also be incorrect, in this scenario at least. Maybe they only benchmarked coherent rays, or maybe something was wrong in their configuration. In a realistic usage scenario like running a path tracer to generate the picture of the front page, fcpw is in fact twice as slow as Embree for ray intersections, and 10 times as slow for data structure construction.
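The ratios quoted above can be recomputed from the traces. The sketch below simply averages the logged frame times and divides; the helper names are hypothetical and the numbers are copied from the fcpw and Embree logs in this comment.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Arithmetic mean of a list of timings in milliseconds.
inline double mean(const std::vector<double>& xs) {
    return std::accumulate(xs.begin(), xs.end(), 0.0) /
           static_cast<double>(xs.size());
}

// How many times slower `a` is than `b`, i.e. a / b.
inline double slowdown(double a, double b) {
    assert(b > 0.0);
    return a / b;
}

// Average frame times taken from the traces above.
inline double fcpw_frame_ms()   { return mean({85.0, 86.0, 87.0, 88.0, 88.0}); }
inline double embree_frame_ms() { return mean({42.0, 41.0, 41.0, 42.0, 42.0}); }

// BVH construction times taken from the traces above.
inline double fcpw_build_ms()   { return 623.0; }
inline double embree_build_ms() { return 68.0; }
```

With these inputs, `slowdown(fcpw_frame_ms(), embree_frame_ms())` comes out at roughly 2.1 and `slowdown(fcpw_build_ms(), embree_build_ms())` at roughly 9.2, in line with the "twice as slow" and "10 times as slow" figures.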
Because this library can also generate higher-quality BVHs at a slight cost in build times, here's the result of running bvh::ParallelReinsertionOptimization in between bvh::LocallyOrderedClustering and bvh::LeafCollapser:
Scene loaded in 1354 ms (651259 vertices, 781184 triangles).
BVH constructed in 523 ms (739621 nodes).
Average frame time: 54 ms.
Average frame time: 54 ms.
Average frame time: 54 ms.
Average frame time: 54 ms.
Average frame time: 54 ms.
So, in conclusion, this library is faster than fcpw for both ray intersection queries and build times. For single-ray traversal, vectorization does not gain you that much (it's essentially the difference between this library and Embree: around 30%). Better BVH layouts and careful traversal loop design are more important. I'm not planning to add this library to the chart. In general, and as a rule for future benchmark submissions, I'll only consider benchmarking a library if:
- I have the time (which is getting more and more difficult),
- The library to compare with is within 30% of Embree,
- The library to compare with generates correct results.
Edit: Please ignore the 0 nodes for Embree and fcpw. I just changed that number to 0 for those libraries since they do not expose this information.
from bvh.
Thanks for the super fast response and the detailed benchmark and analysis.
I completely agree that how you allocate your free time to this project is up to you.
Personally, I think you should factor in a library's popularity when deciding whether to add a benchmark to the plot (which is probably why Fast-BVH is in there, and a good argument against fcpw, at least for now ;)).
from bvh.