About the Project | Getting Started | Usage | Contributing | License | Contact
stdgpu is an open-source library which provides several generic GPU data structures for fast and reliable data management. stdgpu lets you write more complex algorithms rapidly that look like sequential CPU code but are executed in parallel on the GPU.
-
Productivity. Previous libraries such as thrust, VexCL, ArrayFire or Boost.Compute focus on the fast and efficient implementation of various algorithms for contiguously stored data to enhance productivity. stdgpu follows an orthogonal approach and focuses on fast and reliable data management to enable the rapid development of more general and flexible GPU algorithms just like their CPU counterparts.
-
Interoperability. Instead of providing yet another ecosystem, stdgpu is designed to be a lightweight container library. Therefore, a core feature of stdgpu is its interoperability with previous established frameworks, i.e. the thrust library, to enable a seamless integration into new as well as existing projects.
-
Maintainability. Following the trend in recent C++ standards of providing functionality for safer and more reliable programming, the philosophy of stdgpu is to provide clean and familiar functions with strong guarantees that encourage users to write more robust code while giving them full control to achieve a high performance.
stdgpu has been developed as part of the SLAMCast live telepresence system which performs real-time, large-scale 3D scene reconstruction from RGB-D camera images as well as real-time data streaming between a server and an arbitrary number of remote clients. We hope to foster further developments towards unified CPU and GPU computing and welcome contributions from the community.
To compile the library, please make sure to fulfill the build requirements and execute the respective build scripts.
The following components (or newer versions) are required to build the library:
- C++14 compiler (one of the following):
- Ubuntu 18.04:
- GCC 7:
sudo apt install build-essential
- Clang 6:
sudo apt install clang
- GCC 7:
- Windows:
- MSVC 19.2x (Visual Studio 2019)
- Ubuntu 18.04:
- CMake 3.15:
https://apt.kitware.com/
- Doxygen 1.8.13 (optional):
sudo apt install doxygen
- lcov 1.13 (optional):
sudo apt install lcov
Depending on the used backend, additional components (or newer versions) are required:
- CUDA Backend:
- CUDA 10.0:
https://developer.nvidia.com/cuda-downloads
- CUDA 10.0:
- OpenMP Backend:
- OpenMP 2.0
- thrust 1.9.3: included in CUDA, but also available at
https://github.com/thrust/thrust
Older compiler versions may also work but are currently considered experimental and untested.
For convenience, we provide several cross-platform scripts to build the project. Note that some scripts depend on the build type, so there are scripts for both debug
and release
builds.
Command | Effect |
---|---|
sh scripts/setup_<build_type>.sh |
Performs a full clean build of the project. Removes old build, configures the project (build path: ./build ), builds the project, and runs the unit tests. |
sh scripts/build_<build_type>.sh |
(Re-)Builds the project. Requires that project is configured (or set up). |
sh scripts/run_tests_<build_type>.sh |
Runs the unit tests. Requires that project is built. |
sh scripts/create_documentation.sh |
Builds the documentation locally. Requires doxygen and that project is configured (or set up). |
sh scripts/run_coverage.sh |
Builds a test coverage report locally. Requires lcov and that project is configured (or set up). |
sh scripts/install_<build_type>.sh |
Installs the project at the configured install path (default: ./bin ). |
In the following, we show some examples on how the library can be integrated into and used in a project.
To use the library in your project, you can either install it externally first and then include it using find_package
:
find_package(stdgpu 1.0.0 REQUIRED)
add_library(foo ...)
target_link_libraries(foo PUBLIC stdgpu::stdgpu)
Or you can embed it into your project and build it from a subdirectory:
# Exclude the examples from the build
set(STDGPU_BUILD_EXAMPLES OFF CACHE INTERNAL "")
# Exclude the tests from the build
set(STDGPU_BUILD_TESTS OFF CACHE INTERNAL "")
add_subdirectory(stdgpu)
add_library(foo ...)
target_link_libraries(foo PUBLIC stdgpu::stdgpu)
To configure the library, two sets of options are provided. The following build options control the build process:
Build Option | Effect | Default |
---|---|---|
STDGPU_BACKEND |
Device system backend | STDGPU_BACKEND_CUDA |
STDGPU_SETUP_COMPILER_FLAGS |
Constructs the compiler flags | ON if standalone, OFF if included via add_subdirectory |
STDGPU_BUILD_SHARED_LIBS |
Builds the project as a shared library, if set to ON , or as a static library, if set to OFF |
BUILD_SHARED_LIBS |
STDGPU_BUILD_EXAMPLES |
Build the examples | ON |
STDGPU_BUILD_TESTS |
Build the unit tests | ON |
STDGPU_BUILD_TEST_COVERAGE |
Build a test coverage report | OFF |
In addition, the implementation of some functionality can be controlled via configuration options:
Configuration Option | Effect | Default |
---|---|---|
STDGPU_ENABLE_AUXILIARY_ARRAY_WARNING |
Enable warnings when auxiliary arrays are allocated in memory API | OFF |
STDGPU_ENABLE_CONTRACT_CHECKS |
Enable contract checks | OFF if CMAKE_BUILD_TYPE equals Release or MinSizeRel , ON otherwise |
STDGPU_ENABLE_MANAGED_ARRAY_WARNING |
Enable warnings when managed memory is initialized on the host side but should be on device in memory API | OFF |
STDGPU_USE_32_BIT_INDEX |
Use 32-bit instead of 64-bit signed integer for index_t |
ON |
STDGPU_USE_FAST_DESTROY |
Use fast destruction of allocated arrays (filled with a default value) by omitting destructor calls in memory API | OFF |
STDGPU_USE_FIBONACCI_HASHING |
Use Fibonacci Hashing instead of Modulo to compute hash bucket indices | ON |
This library offers flexible interfaces to reliably perform complex tasks on the GPU such as the creation of the update set at the server component of the related SLAMCast system:
namespace stdgpu
{
template <>
struct hash<short3>
{
inline STDGPU_HOST_DEVICE std::size_t
operator()(const short3& key) const
{
return key.x * 73856093
^ key.y * 19349669
^ key.z * 83492791;
}
};
}
// Spatial hash map for voxel block management
using block_map = stdgpu::unordered_map<short3, voxel*>;
__global__ void
compute_update_set(const short3* blocks,
const stdgpu::index_t n,
const block_map tsdf_block_map,
stdgpu::unordered_set<short3> mc_update_set)
{
// Global thread index
stdgpu::index_t i = blockIdx.x * blockDim.x + threadIdx.x;
if (i >= n) return;
short3 b_i = blocks[i];
// Neighboring candidate blocks for the update
short3 mc_blocks[8]
= {
short3(b_i.x - 0, b_i.y - 0, b_i.z - 0),
short3(b_i.x - 1, b_i.y - 0, b_i.z - 0),
short3(b_i.x - 0, b_i.y - 1, b_i.z - 0),
short3(b_i.x - 0, b_i.y - 0, b_i.z - 1),
short3(b_i.x - 1, b_i.y - 1, b_i.z - 0),
short3(b_i.x - 1, b_i.y - 0, b_i.z - 1),
short3(b_i.x - 0, b_i.y - 1, b_i.z - 1),
short3(b_i.x - 1, b_i.y - 1, b_i.z - 1),
};
for (stdgpu::index_t j = 0; j < 8; ++j)
{
// Only consider existing neighbors
if (tsdf_block_map.contains(mc_blocks[j]))
{
mc_update_set.insert(mc_blocks[j]);
}
}
}
On the other hand, simpler tasks such as the integration of a range of values can be expressed very conveniently:
class stream_set
{
public:
void
add_blocks(const short3* blocks,
const stdgpu::index_t n)
{
set.insert(stdgpu::make_device(blocks),
stdgpu::make_device(blocks + n));
}
// Further functions
private:
stdgpu::unordered_set<short3> set;
// Further members
};
More examples can be found in the examples
directory.
For detailed information on how to contribute, see CONTRIBUTING
.
Distributed under the Apache 2.0 License. See LICENSE
for more information.
If you use this library in one of your projects, please also cite the following publications:
@UNPUBLISHED{stotko2019stdgpu,
author = {Stotko, Patrick},
title = {{stdgpu: Efficient STL-like Data Structures on the GPU}},
year = {2019},
month = aug,
note = {arXiv:1908.05936},
url = {https://arxiv.org/abs/1908.05936}
}
@article{stotko2019slamcast,
author = {Stotko, P. and Krumpen, S. and Hullin, M. B. and Weinmann, M. and Klein, R.},
title = {{SLAMCast: Large-Scale, Real-Time 3D Reconstruction and Streaming for Immersive Multi-Client Live Telepresence}},
journal = {IEEE Transactions on Visualization and Computer Graphics},
volume = {25},
number = {5},
pages = {2102--2112},
year = {2019}
}
Patrick Stotko - [email protected]