owensgroup / rxmesh

GPU-accelerated triangle mesh processing

License: BSD 2-Clause "Simplified" License

Languages: CMake 1.77%, Cuda 66.76%, C++ 31.08%, Shell 0.32%, C 0.03%, Batchfile 0.03%
Topics: mesh, data-structure, gpu, cuda, parallel-computing, 3d-graphics, 3d, geometry-processing, mesh-processing, surface-mesh

rxmesh's Introduction

RXMesh



About

RXMesh is a surface triangle mesh data structure and programming model for processing static meshes on the GPU. RXMesh aims to provide a high-performance, generic, and compact data structure that can handle meshes regardless of their quality (e.g., non-manifold). The programming model helps to hide the complexity of the data structure and provides an intuitive access model for different use cases. For more details, please check out our paper and GTC talk:

The library also features a sparse and dense matrix infrastructure that is tightly coupled with the mesh data structure. We expose various cuSolver, cuSparse, and cuBlas operations through the sparse and dense matrices, tailored for geometry processing applications.

This repository provides 1) source code to reproduce the results presented in the paper (git tag v0.1.0) and 2) ongoing development of RXMesh.

Compilation

The code can be compiled on Ubuntu, Windows, and WSL provided that CUDA (>=11.1.0) is installed. To run the executable(s), an NVIDIA GPU should be installed on the machine.

Dependencies

  • OpenMesh to verify the applications against a reference CPU implementation
  • RapidJson to report the results in JSON file(s)
  • GoogleTest for unit tests
  • spdlog for logging
  • glm for small vector and matrix operations
  • Eigen for small vector and matrix operations
  • Polyscope for visualization
  • cereal for serialization

All the dependencies are installed automatically! To compile the code:

> git clone https://github.com/owensgroup/RXMesh.git
> cd RXMesh
> mkdir build 
> cd build 
> cmake ../

Depending on the system, this will generate either a .sln project on Windows or a Makefile on Linux.
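
The project can then be built from the same build directory using CMake's generic build driver, e.g.,

> cmake --build . --config Release

After a successful build, the unit-test executable (RXMesh_test, referenced in the issues below) and the application executables from apps/ can be run from the build tree; the exact output path depends on the generator.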

Organization

RXMesh is a CUDA/C++ header-only library. All unit tests are under the tests/ folder. This includes unit tests for some basic functionalities along with unit tests for the query operations. All applications are under the apps/ folder.

Programming Model

The goal of defining a programming model is to make it easy to write applications using RXMesh without getting into the nuances of the data structure. Applications written using RXMesh are composed of one or more of the high-level building blocks defined under Computation. To use these building blocks, the user would have to interact with data structures specific to RXMesh discussed under Structures. Finally, RXMesh integrates Polyscope as a mesh Viewer which the user can use to render their final results or for debugging purposes.

Structures

  • Attributes are the metadata (geometry information) attached to vertices, edges, or faces. Allocation of the attributes is done on a per-patch basis and managed internally by RXMesh. The allocation can be done on the host, device, or both. Allocating attributes on the host is only beneficial for I/O operations or for initializing attributes and then eventually moving them to the device.

    • Example: allocation
      RXMeshStatic rx("input.obj");
      auto vertex_color = 
        rx.add_vertex_attribute<float>("vColor", //Unique name 
                                       3,        //Number of attribute per vertex 
                                       DEVICE,   //Allocation place 
                                       SoA);     //Memory layout (SoA vs. AoS)                                 
      
    • Example: reading from std::vector
      RXMeshStatic rx("input.obj");
      std::vector<std::vector<float>> face_color_vector;
      //....
      
      auto face_color = 
        rx.add_face_attribute<float>(face_color_vector,//Input attribute where the number of attributes per face is inferred 
                                   "fColor",         //Unique name                                
                                   SoA);             //Memory layout (SoA vs. AoS)                                  
    • Example: move, reset, and copy
      //By default, attributes are allocated on both host and device     
      auto edge_attr = rx.add_edge_attribute<float>("eAttr", 1);  
      //Initialize edge_attr on the host 
      // ..... 
      
      //Move attributes from host to device 
      edge_attr.move(HOST, DEVICE);
      
      //Reset all entries to zero
      edge_attr.reset(0, DEVICE);
      
      auto edge_attr_1 = rx.add_edge_attribute<float>("eAttr1", 1);  
      
      //Copy from another attribute. 
      //Here, what is on the host side of edge_attr will be copied into the device side of edge_attr_1
      edge_attr_1.copy_from(edge_attr, HOST, DEVICE);
  • Handles are the unique identifiers for vertices, edges, and faces. They are usually populated internally by RXMesh (by concatenating the patch ID and the mesh element index within the patch). Handles can be used to access attributes, in for_each operations, and in query operations.

    • Example: Setting vertex attribute using vertex handle
      auto vertex_color = ...    
      VertexHandle vh; 
      //...
      
      vertex_color(vh, 0) = 0.9;
      vertex_color(vh, 1) = 0.5;
      vertex_color(vh, 2) = 0.6;
  • Iterators are used during query operations to iterate over the output of the query operation. The type of iterator defines the type of mesh element iterated over, e.g., VertexIterator iterates over vertices, which are the output of VV, EV, or FV query operations. Since query operations are only supported on the device, iterators can only be used inside the GPU kernel. Iterators are usually populated internally.

    • Example: Iterating over faces
      FaceIterator f_iter; 
      //...
      
      for (uint32_t f = 0; f < f_iter.size(); ++f) {	
        FaceHandle fh = f_iter[f];
        //do something with fh ....
      }

Computation

  • for_each runs a computation over all vertices, edges, or faces without requiring information from neighboring mesh elements. The computation that runs on each mesh element is defined as a lambda function that takes a handle as an input. The lambda function can run on the host, the device, or both. On the host, we parallelize the computation using OpenMP. Care must be taken with lambda functions on the device since they need to be annotated with __device__ and can only capture by value. More about lambda functions in CUDA can be found here

    • Example: using for_each to initialize attributes
      RXMeshStatic rx("input.obj");
      auto vertex_pos   = rx.get_input_vertex_coordinates();                   //vertex position 
      auto vertex_color = rx.add_vertex_attribute<float>("vColor", 3, DEVICE); //vertex color 
      
      //This function will be executed on the device 
      rx.for_each_vertex(
          DEVICE,
          [vertex_color, vertex_pos] __device__(const VertexHandle vh) {
              vertex_color(vh, 0) = 0.9;
              vertex_color(vh, 1) = vertex_pos(vh, 1);
              vertex_color(vh, 2) = 0.9;
          });
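
    The same computation can run on the host (where RXMesh parallelizes it with OpenMP) by simply switching the location flag. A minimal sketch, assuming the attributes above were also allocated on the HOST:

      //This lambda will be executed on the host using OpenMP
      rx.for_each_vertex(
          HOST,
          [&](const VertexHandle vh) {
              vertex_color(vh, 0) = 0.9f;
              vertex_color(vh, 1) = vertex_pos(vh, 1);
              vertex_color(vh, 2) = 0.9f;
          });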

    Alternatively, for_each operations can be written the same way as query operations (see below). This might be useful if the user would like to combine a for_each with query operations in the same kernel. For more examples, check out the ForEach unit test.

  • Query operations supported by RXMesh are listed below, along with a description of each

    Query Description
    VV For vertex V, return its adjacent vertices
    VE For vertex V, return its incident edges
    VF For vertex V, return its incident faces
    EV For edge E, return its incident vertices
    EF For edge E, return its incident faces
    FV For face F, return its incident vertices
    FE For face F, return its incident edges
    FF For face F, return its adjacent faces

    Queries are only supported on the device. RXMesh API for queries takes a lambda function along with the type of query. The lambda function defines the computation that will be run on the query output.

    • Example: vertex normal computation
      template <typename T, uint32_t blockThreads>
      __global__ void vertex_normal(Context            context,
                                    VertexAttribute<T> coords,
                                    VertexAttribute<T> normals)
      {
          auto compute_vn = [&](const FaceHandle face_id, const VertexIterator& fv) {
              //This thread is assigned to face_id

              //get the face's three vertices coordinates
              vec3<T> c0(coords(fv[0], 0), coords(fv[0], 1), coords(fv[0], 2));
              vec3<T> c1(coords(fv[1], 0), coords(fv[1], 1), coords(fv[1], 2));
              vec3<T> c2(coords(fv[2], 0), coords(fv[2], 1), coords(fv[2], 2));

              //compute the face normal
              vec3<T> n = cross(c1 - c0, c2 - c0);

              //add the face's normal to its vertices
              for (uint32_t v = 0; v < 3; ++v)      // for every vertex in this face
                  for (uint32_t i = 0; i < 3; ++i)  // for the vertex's 3 coordinates
                      atomicAdd(&normals(fv[v], i), n[i]);
          };

          //Query must be called by all threads in the block. Thus, we create this cooperative_group
          //that uses all threads in the block and pass it to the Query
          auto block = cooperative_groups::this_thread_block();

          Query<blockThreads> query(context);

          //Query will first perform the query and store the results in shared memory. ShmemAllocator is
          //passed to the function to make sure we don't over-allocate or overwrite user-allocated shared
          //memory
          ShmemAllocator shrd_alloc;

          //Finally, we run the user-defined computation i.e., compute_vn
          query.dispatch<Op::FV>(block, shrd_alloc, compute_vn);
      }
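
    To launch this kernel from the host, RXMesh computes the grid size and the dynamic shared memory needed by the requested queries. A minimal sketch, assuming the LaunchBox helper and prepare_launch_box API used by the applications under apps/ (exact member names may differ):

      constexpr uint32_t blockThreads = 256;
      
      RXMeshStatic rx("input.obj");
      auto coords  = rx.get_input_vertex_coordinates();
      auto normals = rx.add_vertex_attribute<float>("vNormal", 3, DEVICE);
      
      //Ask RXMesh for the grid size, block size, and dynamic shared memory the FV query needs
      LaunchBox<blockThreads> launch_box;
      rx.prepare_launch_box({Op::FV}, launch_box, (void*)vertex_normal<float, blockThreads>);
      
      vertex_normal<float, blockThreads>
          <<<launch_box.blocks, launch_box.num_threads, launch_box.smem_bytes_dyn>>>(
              rx.get_context(), *coords, *normals);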

    To save computation, query.dispatch can be run on a subset of the input mesh elements, i.e., an active set. The user can define the active set using a lambda function that returns true if the input mesh element is in the active set.

    • Example: defining active set
      template <uint32_t blockThreads>
      __global__ void active_set_query (Context context){
          auto active_set = [&](FaceHandle face_id) -> bool { 
              // ....         
          };
      
          auto computation = [&](const FaceHandle face_id, const VertexIterator& fv) {          
              // ....         
          };
      
          auto block = cooperative_groups::this_thread_block();
      
          Query<blockThreads> query(context);
      
          ShmemAllocator shrd_alloc;
      
          //The active_set lambda is passed as the last argument to dispatch
          query.dispatch<Op::FV>(block, shrd_alloc, computation, active_set);
      } 
  • Reduction operations apply a binary associative operation on the input attributes. RXMesh provides dot products between two attributes (of the same type), the L2 norm of an input attribute, and user-defined reduction operations on an input attribute. For a user-defined reduction operation, the user needs to pass a binary reduction functor with member __device__ T operator()(const T &a, const T &b) or use one of CUB's thread operators, e.g., cub::Max(). Reduction operations require allocation of temporary buffers, which we abstract away using ReduceHandle.

    • Example: dot product, L2 norm, user-defined reduction
      RXMeshStatic rx("input.obj");
      auto vertex_attr1 = rx.add_vertex_attribute<float>("v_attr1", 3, DEVICE);
      auto vertex_attr2 = rx.add_vertex_attribute<float>("v_attr2", 3, DEVICE);
      
      // Populate vertex_attr1 and vertex_attr2 
      //....
      
      //Reduction handle 
      ReduceHandle reduce(vertex_attr1);
      
      //Dot product between two attributes. Results are returned on the host 
      float dot_product = reduce.dot(vertex_attr1, vertex_attr2);
      
      cudaStream_t stream; 
      //init stream 
      //...
      
      //Reduction operations can be performed on a specific attribute and using a specific stream 
      float l2_norm = reduce.norm2(vertex_attr1, //input attribute 
                                   1,            //attribute ID. If not specified, reduction is run on all attributes 
                                   stream);      //stream used for reduction 
      
      
      //User-defined reduction operation (here, the maximum value)
      float max_val = reduce.reduce(vertex_attr1,                          //input attribute 
                                    cub::Max(),                            //binary reduction functor 
                                    std::numeric_limits<float>::lowest()); //initial value 

Viewer

Starting with v0.2.1, RXMesh integrates Polyscope as a mesh viewer. To use it, make sure to turn on the CMake parameter USE_POLYSCOPE, i.e.,

> cd build 
> cmake -DUSE_POLYSCOPE=True ../

By default, the parameter is set to True. RXMesh implements the necessary functionality to pass attributes to Polyscope, thanks to its data adaptors. However, attributes need to be moved to the host first before passing them to Polyscope. For more information about Polyscope's different visualization options, please check out Polyscope's Surface Mesh documentation.

  • Example: render vertex color
    RXMeshStatic rx("dragon.obj");
    
    //vertex color attribute 
    auto vertex_color = rx.add_vertex_attribute<float>("vColor", 3);
    
    //Populate vertex color on the device
    //....
    
    //Move vertex color to the host 
    vertex_color.move(DEVICE, HOST);
    
    //polyscope instance associated with rx 
    auto polyscope_mesh = rx.get_polyscope_mesh();
    
    //pass vertex color to polyscope 
    polyscope_mesh->addVertexColorQuantity("vColor", vertex_color);
    
    //render 
    polyscope::show();
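
  • Example: render a per-face scalar (a minimal sketch, assuming a one-component face attribute; addFaceScalarQuantity is Polyscope's API for per-face scalars)
    //face scalar attribute (allocated on both host and device by default)
    auto face_scalar = rx.add_face_attribute<float>("fScalar", 1);
    
    //Populate face_scalar on the device
    //....
    
    //Move it to the host before handing it to Polyscope
    face_scalar.move(DEVICE, HOST);
    
    //pass the face scalar to polyscope and render
    polyscope_mesh->addFaceScalarQuantity("fScalar", face_scalar);
    polyscope::show();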


Matrices and Vectors

  • Large Matrices: RXMesh has built-in support for large sparse and dense matrices, built on top of cuSparse and cuBlas, respectively. For example, attributes can be converted to dense matrices as follows:
RXMeshStatic rx("input.obj");

//Input mesh coordinates as VertexAttribute
std::shared_ptr<VertexAttribute<float>> x = rx.get_input_vertex_coordinates();

//Convert the attributes to a (#vertices x 3) dense matrix 
std::shared_ptr<DenseMatrix<float>> x_mat = x->to_matrix();

//do something with x_mat
//....

//Populate the VertexAttribute coordinates back with the content of the dense matrix
x->from_matrix(x_mat.get());

Dense matrices can be accessed using the usual row and column indices or via the mesh element handle (Vertex/Edge/FaceHandle) as a row index. This allows for easy access to the correct row associated with a specific vertex, edge, or face. Dense matrices support various operations such as absolute sum, AXPY, dot products, norm2, scaling, and swapping.
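
For instance, the VertexHandle can be used directly as the row index. A minimal sketch building on x_mat above, following the capture-by-value style of the for_each example; that DenseMatrix can be captured and indexed this way inside a device lambda is an assumption based on how the applications pass it to kernels:

//Scale the y-coordinate of every vertex; the vertex handle selects the matrix row
DenseMatrix<float> X = *x_mat;

rx.for_each_vertex(DEVICE, [X] __device__(const VertexHandle vh) {
    X(vh, 1) = 2.0f * X(vh, 1);
});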

RXMesh supports sparse matrices, where the sparsity pattern matches the query operations. For example, it is often necessary to build a sparse matrix of size #V x #V with non-zero values at (i, j) only if the vertex corresponding to row i is connected by an edge to the vertex corresponding to column j. Currently, we only support the VV sparsity pattern, but we are working on expanding to all other types of queries.

The sparse matrix can be used to solve a linear system via Cholesky, LU, or QR factorization (relying on cuSolver). The solver offers two APIs. The high-level API reorders the input sparse matrix (to reduce non-zero fill-in after matrix factorization) and allocates the additional memory needed to solve the system. Repeated calls to this API will reorder the matrix and allocate/deallocate the temporary memory with each call. For scenarios where the matrix remains unchanged but multiple right-hand sides need to be solved, users can utilize the low-level API, which splits the solve method into pre_solve() and solve(). The former reorders the matrix and allocates temporary memory only once. The low-level API is currently only supported for Cholesky-based factorization. Check out the MCF application for an example of how to set up and use the solver.

Similar to dense matrices, sparse matrices also support accessing the matrix using the VertexHandle and multiplication by dense matrices.
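
A minimal sketch of the VV pattern described above: the sparse matrix is constructed from the RXMeshStatic instance (as in the MCF app), passed by value to a kernel, and filled through a VV query using vertex handles as the row and column indices. The constructor-from-rx and pass-by-value details are assumptions based on the MCF app; the solver calls are omitted:

//On the host: SparseMatrix<float> A(rx);  then launch the kernel below via prepare_launch_box

template <typename T, uint32_t blockThreads>
__global__ void setup_uniform_laplacian(Context context, SparseMatrix<T> A)
{
    //Fill the VV-pattern matrix with a uniform Laplacian-like stencil
    auto fill = [&](const VertexHandle& v, const VertexIterator& vv) {
        for (uint32_t i = 0; i < vv.size(); ++i)
            A(v, vv[i]) = T(-1);
        A(v, v) = static_cast<T>(vv.size());
    };

    auto block = cooperative_groups::this_thread_block();

    Query<blockThreads> query(context);

    ShmemAllocator shrd_alloc;

    query.dispatch<Op::VV>(block, shrd_alloc, fill);
}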

  • Small Matrices: It is often necessary to perform operations on small matrices as part of geometry processing applications, such as computing the SVD of a 3x3 matrix or normalizing a 1x3 vector. For this purpose, RXMesh attributes can be converted into glm or Eigen matrices, as demonstrated in the vertex_normal example above. Both glm and Eigen support small matrix operations inside the GPU kernel.
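
    For instance, inside a device lambda, attribute components can be packed into a glm vector and manipulated with glm's functions. A minimal sketch, assuming a 3-component vertex normal attribute (normals) allocated on the DEVICE:

    //Normalize per-vertex normals in place using glm
    rx.for_each_vertex(DEVICE, [normals] __device__(const VertexHandle vh) {
        glm::vec3 n(normals(vh, 0), normals(vh, 1), normals(vh, 2));
    
        n = glm::normalize(n);
    
        normals(vh, 0) = n.x;
        normals(vh, 1) = n.y;
        normals(vh, 2) = n.z;
    });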

Replicability

This repo was awarded the replicability stamp by the Graphics Replicability Stamp Initiative (GRSI) 🎉. Visit git tag v0.1.0 for more information about replicability scripts.

Bibtex

@article{Mahmoud:2021:RAG,
  author       = {Ahmed H. Mahmoud and Serban D. Porumbescu and John D. Owens},
  title        = {{RXM}esh: A {GPU} Mesh Data Structure},
  journal      = {ACM Transactions on Graphics},
  year         = 2021,
  volume       = 40,
  number       = 4,
  month        = aug,
  issue_date   = {August 2021},
  articleno    = 104,
  numpages     = 16,
  pages        = {104:1--104:16},
  url          = {https://escholarship.org/uc/item/8r5848vp},
  full_talk    = {https://youtu.be/Se_cNAol4hY},
  short_talk   = {https://youtu.be/V_SHMXnCVws},
  doi          = {10.1145/3450626.3459748},
  acmauthorize = {https://dl.acm.org/doi/10.1145/3450626.3459748?cid=81100458295},
  acceptance   = {149/444 (33.6\%)},
  ucdcite      = {a140}
}

rxmesh's People

Contributors

ahdhn, ericyja, sachinkishan


rxmesh's Issues

ERROR On RTX3080 laptop with CUDA SDK 12.3

Hi!
Thanks for providing such a super cool project. I'm trying to run the tests on my PC. I can successfully run Polyscope, but RXMesh_test gives this error:
[error screenshot]

The unit test has determined the machine model:
[screenshot]

Here is my current CUDA runtime situation:
[screenshot]

I wonder whether it is caused by the high CUDA SDK version I am currently using (the latest, 12.3).

Any suggestions or comments would help a lot!

Development status

Hi team,

Searching for a robust GPU-based mesh data structure, I found RXMesh and wanted to inquire about its development status. I can see some work being done in dedicated branches but am not sure how advanced it is and whether the library is ready for any production use.

I'm interested in mesh booleans, hole-filling, generating mesh contours, etc.

Would be great to hear about your plans!

How to access multiple relations within one kernel?

Hi,

I am trying to implement an XPBD cloth simulation with the bilateral bending constraint model using RXMesh.
To compute the bending constraint for an edge, I need to first access its two adjacent faces, and then assign the "bending force" to the vertices of those two faces.
This procedure needs both E-F and F-V relations within a kernel.
I think I need to use the query_block_dispatcher API to launch an Op::EF query, right? However, I wonder how I can get the VertexIterator associated with the faces accessed by the E-F relation.

Here's the pseudo-code to explain the computation logic better:

// bending constraint kernel
auto bending_lambda = [&](uint32_t edge_id, RXMeshIterator& ef) {
  if (ef.size() == 2) {
    uint32_t v1 = ev(edge_id, 0); // question: how to get E-V ?
    uint32_t v2 = ev(edge_id, 1);
    uint32_t v3 = 0;
    uint32_t v4 = 0;

    for (int i = 0; i < 3; ++i) {
      if (fv(ef[0], i) != v1 && fv(ef[0], i) != v2) { // question: how to get F-V ?
        v3 = fv(ef[0], i);
      }
    }
    for (int i = 0; i < 3; ++i) {
      if (fv(ef[1], i) != v1 && fv(ef[1], i) != v2) {
        v4 = fv(ef[1], i);
      }
    }
  }
  // do other computations ...
}

// dispatch E-F as the document described
query_block_dispatcher<Op::EF, blockThreads>(context, bending_lambda);

Thank you very much for releasing your great work. Any support would be much appreciated.

Do algorithms like mesh hole filling or simplification work well?

Dear Ahdhn, I want to ask you again whether this GPU data structure works well and can be conveniently used to implement algorithms like mesh hole filling or simplification, which change the topology of the original mesh. I do not need to change the original mesh itself or the related patch information in real time while the algorithm is running; I just want to save the resulting mesh with changed topology in an object of class RXMESH::RXMeshAttribute. After this algorithm, I would then partition the topology-changed mesh again to run another algorithm. I have noticed the limitation mentioned in your paper that dynamic scenes are not appropriate, but I just want to confirm it again. Thank you.

Why can I not launch multiple threads for bilateral filtering?

Dear author, I recently ran the code of this project, but no matter how many times I tried, the Filtering project could not launch multiple threads for mesh bilateral denoising with OpenMesh, specifically in the function filtering_openmesh, which is called in the filtering.cu file. In the body of filtering_openmesh, it clearly uses OpenMP to run multithreaded via: #pragma omp parallel for schedule(static) num_threads(num_omp_threads) reduction(max : max_neighbour_size).

However, I attempted many times to observe 1) the time consumption compared with a single thread (by deleting the #pragma), 2) the CPU performance, and 3) the threads monitored in the "Process Explorer" software. All these observations indicate that it runs in a single thread. I do not know why it cannot launch multiple threads for this job, and I am sure I have enabled OpenMP, the macro OMP_NUM_THREADS, and all other configurations I know of. In your paper, you compared OpenMesh implementations of MCF and geodesic distance; I do not know how you switch between single-core and multi-core. Finally, my hardware running the project is, briefly: Windows 10, Intel i7 10th gen with 16 threads, a 2080S GPU, CUDA 11.1, VS2019, and the project is release version v0.1.0. I hope to receive your answer, thank you.

Feature Request: Python API

Would love to be able to utilize this easily in Python. Bonus points if you can set it up as a backend to an already well-developed mesh package (e.g. Trimesh, PyVista, or OpenMesh).

Large (>512 faces) isolated patches cannot be separated in Lloyd

If there are patches made of large isolated components (>512 and <1024 faces), then the Lloyd algorithm will enter an infinite loop.
The solution could be one of two kinds:

  1. Set more than 2 seeds for large isolated components.
  2. If a face is on the mesh boundary, then mark the boundary in the patch.
