stdgpu: Efficient STL-like Data Structures on the GPU

About the Project

stdgpu is an open-source library which provides several generic GPU data structures for fast and reliable data management. stdgpu lets you write more complex algorithms rapidly that look like sequential CPU code but are executed in parallel on the GPU.

Productivity. Previous libraries such as thrust, VexCL, ArrayFire or Boost.Compute focus on the fast and efficient implementation of various algorithms for contiguously stored data to enhance productivity. stdgpu follows an orthogonal approach and focuses on fast and reliable data management to enable the rapid development of more general and flexible GPU algorithms just like their CPU counterparts.
Interoperability. Instead of providing yet another ecosystem, stdgpu is designed to be a lightweight container library. Therefore, a core feature of stdgpu is its interoperability with previous established frameworks, i.e. the thrust library, to enable a seamless integration into new as well as existing projects.
Maintainability. Following the trend in recent C++ standards of providing functionality for safer and more reliable programming, the philosophy of stdgpu is to provide clean and familiar functions with strong guarantees that encourage users to write more robust code while giving them full control to achieve a high performance.

stdgpu has been developed as part of the SLAMCast live telepresence system which performs real-time, large-scale 3D scene reconstruction from RGB-D camera images as well as real-time data streaming between a server and an arbitrary number of remote clients. We hope to foster further developments towards unified CPU and GPU computing and welcome contributions from the community.

Getting Started

To compile the library, please make sure to fulfill the build requirements and execute the respective build scripts.

Prerequisites

The following components (or newer versions) are required to build the library:

C++14 compiler (one of the following):
- Ubuntu 18.04:
  - GCC 7: sudo apt install build-essential
  - Clang 6: sudo apt install clang
- Windows:
  - MSVC 19.2x (Visual Studio 2019)
CMake 3.15: https://apt.kitware.com/
Doxygen 1.8.13 (optional): sudo apt install doxygen
lcov 1.13 (optional): sudo apt install lcov

Depending on the used backend, additional components (or newer versions) are required:

CUDA Backend:
- CUDA 10.0: https://developer.nvidia.com/cuda-downloads
OpenMP Backend:
- OpenMP 2.0
- thrust 1.9.3: included in CUDA, but also available at https://github.com/thrust/thrust

Older compiler versions may also work but are currently considered experimental and untested.

Building

For convenience, we provide several cross-platform scripts to build the project. Note that some scripts depend on the build type, so there are scripts for both debug and release builds.

Command	Effect
`sh scripts/setup_<build_type>.sh`	Performs a full clean build of the project. Removes old build, configures the project (build path: `./build`), builds the project, and runs the unit tests.
`sh scripts/build_<build_type>.sh`	(Re-)Builds the project. Requires that project is configured (or set up).
`sh scripts/run_tests_<build_type>.sh`	Runs the unit tests. Requires that project is built.
`sh scripts/create_documentation.sh`	Builds the documentation locally. Requires doxygen and that project is configured (or set up).
`sh scripts/run_coverage.sh`	Builds a test coverage report locally. Requires lcov and that project is configured (or set up).
`sh scripts/install_<build_type>.sh`	Installs the project at the configured install path (default: `./bin`).

Usage

In the following, we show some examples on how the library can be integrated into and used in a project.

CMake Integration

To use the library in your project, you can either install it externally first and then include it using find_package:

find_package(stdgpu 1.0.0 REQUIRED)

add_library(foo ...)

target_link_libraries(foo PUBLIC stdgpu::stdgpu)

Or you can embed it into your project and build it from a subdirectory:

# Exclude the examples from the build
set(STDGPU_BUILD_EXAMPLES OFF CACHE INTERNAL "")

# Exclude the tests from the build
set(STDGPU_BUILD_TESTS OFF CACHE INTERNAL "")

add_subdirectory(stdgpu)

add_library(foo ...)

target_link_libraries(foo PUBLIC stdgpu::stdgpu)

CMake Options

To configure the library, two sets of options are provided. The following build options control the build process:

Build Option	Effect	Default
`STDGPU_BACKEND`	Device system backend	`STDGPU_BACKEND_CUDA`
`STDGPU_SETUP_COMPILER_FLAGS`	Constructs the compiler flags	`ON` if standalone, `OFF` if included via `add_subdirectory`
`STDGPU_BUILD_SHARED_LIBS`	Builds the project as a shared library, if set to `ON`, or as a static library, if set to `OFF`	`BUILD_SHARED_LIBS`
`STDGPU_BUILD_EXAMPLES`	Build the examples	`ON`
`STDGPU_BUILD_TESTS`	Build the unit tests	`ON`
`STDGPU_BUILD_TEST_COVERAGE`	Build a test coverage report	`OFF`

In addition, the implementation of some functionality can be controlled via configuration options:

Configuration Option	Effect	Default
`STDGPU_ENABLE_AUXILIARY_ARRAY_WARNING`	Enable warnings when auxiliary arrays are allocated in memory API	`OFF`
`STDGPU_ENABLE_CONTRACT_CHECKS`	Enable contract checks	`OFF` if `CMAKE_BUILD_TYPE` equals `Release` or `MinSizeRel`, `ON` otherwise
`STDGPU_ENABLE_MANAGED_ARRAY_WARNING`	Enable warnings when managed memory is initialized on the host side but should be on device in memory API	`OFF`
`STDGPU_USE_32_BIT_INDEX`	Use 32-bit instead of 64-bit signed integer for `index_t`	`ON`
`STDGPU_USE_FAST_DESTROY`	Use fast destruction of allocated arrays (filled with a default value) by omitting destructor calls in memory API	`OFF`
`STDGPU_USE_FIBONACCI_HASHING`	Use Fibonacci Hashing instead of Modulo to compute hash bucket indices	`ON`

Examples

This library offers flexible interfaces to reliably perform complex tasks on the GPU such as the creation of the update set at the server component of the related SLAMCast system:

namespace stdgpu
{
template <>
struct hash<short3>
{
    inline STDGPU_HOST_DEVICE std::size_t
    operator()(const short3& key) const
    {
        return key.x * 73856093
             ^ key.y * 19349669
             ^ key.z * 83492791;
    }
};
}

// Spatial hash map for voxel block management
using block_map = stdgpu::unordered_map<short3, voxel*>;


__global__ void
compute_update_set(const short3* blocks,
                   const stdgpu::index_t n,
                   const block_map tsdf_block_map,
                   stdgpu::unordered_set<short3> mc_update_set)
{
    // Global thread index
    stdgpu::index_t i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    short3 b_i = blocks[i];

    // Neighboring candidate blocks for the update
    short3 mc_blocks[8]
    = {
        short3(b_i.x - 0, b_i.y - 0, b_i.z - 0),
        short3(b_i.x - 1, b_i.y - 0, b_i.z - 0),
        short3(b_i.x - 0, b_i.y - 1, b_i.z - 0),
        short3(b_i.x - 0, b_i.y - 0, b_i.z - 1),
        short3(b_i.x - 1, b_i.y - 1, b_i.z - 0),
        short3(b_i.x - 1, b_i.y - 0, b_i.z - 1),
        short3(b_i.x - 0, b_i.y - 1, b_i.z - 1),
        short3(b_i.x - 1, b_i.y - 1, b_i.z - 1),
    };

    for (stdgpu::index_t j = 0; j < 8; ++j)
    {
        // Only consider existing neighbors
        if (tsdf_block_map.contains(mc_blocks[j]))
        {
            mc_update_set.insert(mc_blocks[j]);
        }
    }
}

On the other hand, simpler tasks such as the integration of a range of values can be expressed very conveniently:

class stream_set
{
public:
    void
    add_blocks(const short3* blocks,
               const stdgpu::index_t n)
    {
        set.insert(stdgpu::make_device(blocks),
                   stdgpu::make_device(blocks + n));
    }

    // Further functions

private:
    stdgpu::unordered_set<short3> set;
    // Further members
};

More examples can be found in the examples directory.

Contributing

For detailed information on how to contribute, see CONTRIBUTING.

License

Distributed under the Apache 2.0 License. See LICENSE for more information.

If you use this library in one of your projects, please also cite the following publications:

@UNPUBLISHED{stotko2019stdgpu,
    author = {Stotko, Patrick},
     title = {{stdgpu: Efficient STL-like Data Structures on the GPU}},
      year = {2019},
     month = aug,
      note = {arXiv:1908.05936},
       url = {https://arxiv.org/abs/1908.05936}
}

@article{stotko2019slamcast,
    author = {Stotko, P. and Krumpen, S. and Hullin, M. B. and Weinmann, M. and Klein, R.},
     title = {{SLAMCast: Large-Scale, Real-Time 3D Reconstruction and Streaming for Immersive Multi-Client Live Telepresence}},
   journal = {IEEE Transactions on Visualization and Computer Graphics},
    volume = {25},
    number = {5},
     pages = {2102--2112},
      year = {2019}
}

Contact

Patrick Stotko - [email protected]

tide999 / stdgpu Goto Github PK

stdgpu's Introduction

stdgpu: Efficient STL-like Data Structures on the GPU

About the Project

Getting Started

Prerequisites

Building

Usage

CMake Integration

CMake Options

Examples

Contributing

License

Contact

stdgpu's People

Contributors

Watchers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent