mdarray's Introduction

Kokkos

Kokkos: Core Libraries

Kokkos Core implements a programming model in C++ for writing performance portable applications targeting all major HPC platforms. For that purpose it provides abstractions for both parallel execution of code and data management. Kokkos is designed to target complex node architectures with N-level memory hierarchies and multiple types of execution resources. It currently can use CUDA, HIP, SYCL, HPX, OpenMP and C++ threads as backend programming models with several other backends in development.

Kokkos Core is part of the Kokkos C++ Performance Portability Programming Ecosystem.

Kokkos is a Linux Foundation project.

Learning about Kokkos

To start learning about Kokkos:

Obtaining Kokkos

The latest release of Kokkos can be obtained from the GitHub releases page.

The current release is 4.3.00.

curl -OJ -L https://github.com/kokkos/kokkos/archive/refs/tags/4.3.00.tar.gz
# Or with wget
wget https://github.com/kokkos/kokkos/archive/refs/tags/4.3.00.tar.gz

To clone the latest development version of Kokkos from GitHub:

git clone -b develop https://github.com/kokkos/kokkos.git

Building Kokkos

To build Kokkos, you will need to have a C++ compiler that supports C++17 or later. All requirements including minimum and primary tested compiler versions can be found here.

Building and installation instructions are described here.

You can also install Kokkos using Spack: spack install kokkos. Available configuration options can be displayed using spack info kokkos.

For the complete documentation: kokkos.org/kokkos-core-wiki/

Support

For questions find us on Slack: https://kokkosteam.slack.com or open a GitHub issue.

For non-public questions send an email to: crtrott(at)sandia.gov

Contributing

Please see this page for details on how to contribute.

Citing Kokkos

Please see the following page.

License

Under the terms of Contract DE-NA0003525 with NTESS, the U.S. Government retains certain rights in this software.

The full license statement used in all headers is available here or here.

mdarray's People

Contributors

amklinv-nnl, anabelsmruggiero, brycelelbach, crtrott, dalg24, mhoemmen

mdarray's Issues

P1684: Better explain why mdarray is not a container

This comes from 1684R2 LEWG review on 2022/04/19.

The review expressed a preference for the current "container adapter" (but see below) design. Those not preferring that design expressed the concern documented as #15. Regardless, participants wanted us to explain better in the paper the trade-offs between the current "container adapter" (but see below) design, and an mdarray-as-container design. Here is a draft of that explanation.

(R0 design had a "container policy" to do automatic switching. R1 went away from that.)

Benefits of mdarray-as-container design

  • Allocator directly visible, instead of hidden in Container
  • No need to specify Container type requirements

Drawbacks of mdarray-as-container design

  • Container design calls for two different array types ("static_mdarray" vs "dynamic_mdarray") to handle all-static-extents vs. some-dynamic-extents cases, just like array vs. vector
  • But, 2 types is not consistent with mdspan design, which uses a single class for both cases
  • One class for both cases is consistent with other libraries, like Kokkos::View and Eigen
  • (but contrast with Boost uBLAS)

Benefits of "container adapter" design

  • Allocator not part of type if it doesn't need to be
  • Cost of move is exposed as Container, not implicit in Extents
    • (array move is more expensive than vector move)
  • Users can customize allocation and access (as with mdspan's Accessor)

One reviewer pointed out that it's not necessarily accurate to call the current design a "container adapter." This is because the dynamic or static nature of the extents is separate from what's contained. The properties are customizable, unlike those of stack or queue. This is more of an issue for wording than for design, but it's still something to keep in mind.

P1684: Consider adding a method to move the container out of the mdarray

This comes from 1684R2 LEWG review on 2022/04/19.

One reviewer suggested adding a method to move the container out of an mdarray. This would permit use cases like reclaiming and reusing an allocation for a sequence of mdarray.

Another reviewer pointed out that this might leave the mdarray in an invalid state, since its extents and mapping would no longer correctly describe the amount of available storage. This would violate the policy that a moved-from object be left in a valid but possibly unspecified state.

@AnabelSMRuggiero has some ideas for an interface that would "sever" the mdarray from its storage. It would return both the container, and an mdspan viewing the container's storage. The result would ensure that the mdarray was in a safely usable moved-from state. For example, it could move the layout to the resulting mdspan, and assign zero to all dynamic extents. The only issue would be if the mdarray has all static extents, but the container has a move behavior more like vector than array. This would make it impossible for the "severed" mdarray to have a state usable for anything other than destruction or copy assignment. That's probably OK, but we would need to think about whether we would want mdarray ever to be in that state.
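The "sever" idea can be sketched with a toy type. This is only an illustration under stated assumptions: toy_mdarray2d, severed, and sever() are hypothetical names, not part of P1684 or of this repository. To keep the sketch self-contained, it returns the extents alongside the container instead of an mdspan. It covers only the easy case (all-dynamic extents with vector storage), where zeroing the extents leaves the object in a usable state.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical toy type, not P1684's mdarray: a rank-2 owning array with
// dynamic extents, row-major layout, and std::vector storage.
template <class T>
class toy_mdarray2d {
  std::vector<T> container_;
  std::size_t rows_ = 0;
  std::size_t cols_ = 0;

public:
  // What sever() hands back: the storage plus the extents that describe it.
  struct severed {
    std::vector<T> container;
    std::size_t rows;
    std::size_t cols;
  };

  toy_mdarray2d(std::size_t rows, std::size_t cols)
      : container_(rows * cols), rows_(rows), cols_(cols) {}

  std::size_t size() const { return rows_ * cols_; }
  T& operator()(std::size_t i, std::size_t j) { return container_[i * cols_ + j]; }

  // Move the storage out, then reset the extents so that they still
  // describe the (now empty) storage left behind.
  severed sever() {
    severed out{std::move(container_), rows_, cols_};
    container_.clear();  // make the moved-from vector definitely empty
    rows_ = 0;
    cols_ = 0;
    return out;
  }
};

inline void sever_example() {
  toy_mdarray2d<int> a(2, 3);
  a(1, 2) = 42;
  auto parts = a.sever();
  assert(parts.container.size() == 6 && parts.container[5] == 42);
  assert(a.size() == 0);  // extents and storage agree after severing
}
```

With all-static extents there is nothing to zero, which is exactly the problematic case described above.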

P1684: Specify requirements on Container

This comes from 1684R2 LEWG review on 2022/04/19.

We have no experience (do we?) with Container types other than vector and array. Should we restrict to those two types, or do the hard work of wording for other types like static_vector (fixed capacity that bounds the size)?

For example, the current wording only distinguishes between array and every other type. However, some other types, like static_vector, may default-construct nonempty, just like array. The question relates to whether the container can be constructed with a run-time size, vs. whether the container is "constructed nonempty" without a size.

P1684: Avoid users needing to specify required span size twice

This comes from 1684R2 LEWG review on 2022/04/19.

If users want to use std::array as mdarray's Container, they currently need to specify the size as part of the Container's type. If the size is too small, it's UB. For common use cases (layout_{left,right} with all compile-time extents), mdarray can compute the correct minimum size (via required_span_size()) at compile time. Users shouldn't have to repeat themselves (Don't Repeat Yourself principle).

One suggested fix was to provide a template type alias that deduces the correct std::array type from ElementType and SizeTypes..., at least for common layouts (perhaps layout_{left,right} only). A more comprehensive fix would make the default value of Container a function of all the previous template parameters. There would be no need to make that policy customizable, as users could always explicitly specify the Container type (and write their own policy to deduce it).
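A minimal sketch of the alias approach follows. The names static_extent_product and default_static_container are hypothetical, and the sketch assumes a layout (such as layout_left or layout_right) whose required_span_size() equals the product of the extents when all extents are static.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical helper: product of all-static extents (1 for rank 0).
template <std::size_t... Extents>
inline constexpr std::size_t static_extent_product = (std::size_t{1} * ... * Extents);

// Hypothetical alias: deduces the std::array type so that users do not
// have to spell out the size a second time.
template <class ElementType, std::size_t... Extents>
using default_static_container =
    std::array<ElementType, static_extent_product<Extents...>>;

static_assert(std::is_same_v<default_static_container<float, 3, 4>,
                             std::array<float, 12>>);

inline void alias_example() {
  default_static_container<int, 2, 5> storage{};
  assert(storage.size() == 10);
}
```

The more comprehensive fix (a default Container computed from all previous template parameters) would apply the same computation inside mdarray's default template argument.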

pmr::polymorphic_allocator, scoped_allocator_adaptor, and uses_allocator

Mark had mentioned earlier that supporting the memory resources in std::pmr was a hurdle that mdarray needed to get over; std::pmr::polymorphic_allocator follows the same scoped allocator model that is supported by std::scoped_allocator_adaptor and std::uses_allocator. Unless I'm missing something, the main things needed to fully support the scoped allocator model (when container_type::allocator_type exists) are:

  • Add a constructor to basic_mdarray that accepts an allocator and passes it to container_policy::create()
  • Support T(std::allocator_arg, alloc, size) as a possible container constructor
  • Partially specialize std::uses_allocator to be based on basic_mdarray::container_type

I should be able to piece together most of a PR implementing uses allocator construction over the weekend.

Edit: got a bit ahead of myself and forgot that the relevant policy was specifically for vectors; supporting uses allocator construction for other containers would be the responsibility of the relevant policy.
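As a rough illustration of the uses_allocator part, here is a sketch with a hypothetical toy_mdarray wrapper (not basic_mdarray, and with no container policy). It assumes a container that accepts (size, allocator), as std::vector does, and forwards std::uses_allocator to the container type.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <memory_resource>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for basic_mdarray: owns a container, nothing more.
template <class Container>
class toy_mdarray {
  Container container_;

public:
  using container_type = Container;

  explicit toy_mdarray(std::size_t n) : container_(n) {}

  // Allocator-extended constructor using the leading-allocator convention.
  template <class Alloc>
  toy_mdarray(std::allocator_arg_t, const Alloc& alloc, std::size_t n)
      : container_(n, alloc) {}

  const Container& container() const { return container_; }
};

// toy_mdarray uses an allocator exactly when its container does.
namespace std {
template <class Container, class Alloc>
struct uses_allocator<toy_mdarray<Container>, Alloc>
    : uses_allocator<Container, Alloc> {};
}  // namespace std

inline void uses_allocator_example() {
  std::pmr::monotonic_buffer_resource pool;
  std::pmr::polymorphic_allocator<int> alloc(&pool);
  toy_mdarray<std::pmr::vector<int>> a(std::allocator_arg, alloc, 8);
  assert(a.container().get_allocator().resource() == &pool);
}
```

In the actual design, the allocator would go through the container policy rather than straight to the container's constructor.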

P1684: Consider not requiring default-constructible elements, because ranges::to construction should always work

This comes from 1684R2 LEWG review on 2022/04/19.

One reviewer suggests: iota + cartesian_product should mean that we can construct a range over the input mdarray or mdspan, and thus should be able to use ranges::to construction. Thus, we shouldn't have to preallocate an empty container with default-constructed elements. This would remove the requirement that elements be default constructible.

However, see #13 (if we need to use a particular parallel ExecutionPolicy for iterating over elements, then the current ranges::to interface will not suffice).

The question is whether we should permit implementations to separate container allocation (with default construction of elements if they are not trivially default constructible) from copying elements into the container. This permission would let conforming implementations use a parallel algorithm to copy elements into the container, rather than (essentially) requiring ranges::to. In turn, this permission would make it possible to write mdarray in Standard C++ without requiring default constructibility of the elements. (Note that vector does not require default constructibility of its elements, unless you invoke one of the vector constructors that default-constructs elements.)
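The parenthetical about vector can be illustrated with a small sketch. Here no_default and container_from_range are hypothetical names, and vector's iterator-pair constructor stands in for ranges::to: the container is built directly from the input elements, so no element is ever default-constructed.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical element type with no default constructor.
struct no_default {
  int value;
  explicit no_default(int v) : value(v) {}
};
static_assert(!std::is_default_constructible_v<no_default>);

// Hypothetical helper: builds the storage directly from the input range,
// instead of allocating default-constructed elements and then assigning.
template <class T, class InputIt>
std::vector<T> container_from_range(InputIt first, InputIt last) {
  return std::vector<T>(first, last);
}

inline void range_construction_example() {
  const int source[] = {1, 2, 3};
  auto storage = container_from_range<no_default>(source, source + 3);
  assert(storage.size() == 3 && storage[2].value == 3);
}
```

A parallel copy into preallocated storage, by contrast, needs some way to obtain storage for elements before they are constructed, which is where default constructibility comes back in.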

P1684: Permit array size to be larger than needed

This comes from 1684R2 LEWG review on 2022/04/19.

Currently, P1684 requires array to have size() equal to required_span_size(). Relax this to let array's size() be greater than or equal to required_span_size(). This would permit e.g., SIMD-aligned access.
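A sketch of the relaxed precondition, under assumed numbers: a 3x3 array of float padded up to a multiple of a 4-lane SIMD width. The names round_up, padded_storage, and container_fits are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: round n up to the next multiple of width.
constexpr std::size_t round_up(std::size_t n, std::size_t width) {
  return (n + width - 1) / width * width;
}

constexpr std::size_t required = 3 * 3;  // required_span_size() for a 3x3 layout_right
constexpr std::size_t simd_width = 4;    // assumed number of SIMD lanes

// Storage padded to a whole number of SIMD vectors: 12 elements, not 9.
using padded_storage = std::array<float, round_up(required, simd_width)>;

// Relaxed check: size() >= required_span_size() rather than ==.
template <class Container>
constexpr bool container_fits(const Container& c, std::size_t required_span_size) {
  return c.size() >= required_span_size;
}

inline void padded_storage_example() {
  padded_storage storage{};
  assert(storage.size() == 12);
  assert(container_fits(storage, required));
}
```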

P1684: mdarray iteration over elements needs a parallel execution policy

In P1684R2, mdarray needs to iterate over elements in its conversion constructors (from mdarray or mdspan). (I'm not counting whatever iteration over elements the mdarray's container already does, e.g., on construction with a nonzero size, or destruction.)

This iteration currently uses no execution policy. This is not good if the mdarray stores elements in a memory space that needs a matching execution policy for correctness or performance (e.g., GPU memory, or NUMA allocations with a particular distribution). This is analogous to why Kokkos::View has both a memory space (for allocations) and an execution space (for iteration).

There are potentially two different execution policies: the now-being-constructed mdarray's preferred policy, and the input's preferred policy. (An input mdspan doesn't define a way to get its preferred execution policy, though its accessor could define that implicitly.) Standard C++ doesn't have an idea of execution policy compatibility (i.e., inaccessible memory spaces), so we can just pick the policy at hand, from the input. (I'm presuming that the instance of the policy matters, which is a bit of a generalization from the current execution policies in the Standard.)

One way to fix this would be to have a customization point function that takes an mdarray or mdspan, and returns its preferred execution policy instance. One complication is that ranges::to doesn't currently take an execution policy (none of the ranges algorithms do). This would hinder constructing the new mdarray's container from the input (using ranges things like iota and cartesian_product to iterate over the input).
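One possible shape for such a customization point is sketched below. Everything here is hypothetical: the policy types are local stand-ins rather than the std::execution policies, and the execution_policy() member is an assumed opt-in hook. The function returns a policy instance, falling back to a sequential policy for types that do not opt in.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical policy types (stand-ins, not the std::execution policies).
struct sequential_policy {};
struct device_policy { int device_id; };

namespace detail {
// Detects an optional execution_policy() member on the array type.
template <class T, class = void>
struct has_member_policy : std::false_type {};
template <class T>
struct has_member_policy<
    T, std::void_t<decltype(std::declval<const T&>().execution_policy())>>
    : std::true_type {};
}  // namespace detail

// Hypothetical customization point: returns the preferred policy *instance*.
template <class T>
auto preferred_execution_policy(const T& x) {
  if constexpr (detail::has_member_policy<T>::value) {
    return x.execution_policy();
  } else {
    return sequential_policy{};
  }
}

// Two toy array types: one with no preference, one tied to a device.
struct host_array {};
struct device_array {
  int device_id;
  device_policy execution_policy() const { return {device_id}; }
};

inline void policy_example() {
  assert(preferred_execution_policy(device_array{2}).device_id == 2);
}
```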

The array.data() and array.view().data() are inaccessible.

This means there is no way to access the container underneath linearly. Deal breaker when you do file IO.

Example code:

TEST_CASE("experimental/mdarray")
{
  auto array = std::experimental::mdarray<float, std::experimental::dynamic_extent, std::experimental::dynamic_extent, std::experimental::dynamic_extent>(3, 3, 3);

  auto count = 0;
  for (auto x = 0; x < 3; ++x)
    for (auto y = 0; y < 3; ++y)
      for (auto z = 0; z < 3; ++z)
        array(x, y, z) = count++;

  auto recount = 0;
  for (auto i = 0; i < array.size(); ++i)
    REQUIRE(array.data()[i] == recount++); // Compile error.
}

Error: C2039 'data': is not a member of 'std::experimental::__mdarray_version_0::vector_container_policy<T,std::allocator>' array_test C:\development\source\cpp\particle_tracer\build\vcpkg\installed\x64-windows\include\experimental\__p1684_bits\basic_mdarray.hpp 387

When changed to array.view().data()[i] error becomes: C2248 'std::experimental::__mdarray_version_0::basic_mdarray<float,std::experimental::extents<-1,-1,-1>,std::experimental::layout_right,std::experimental::__mdarray_version_0::vector_container_policy<T,std::allocator>>::map': cannot access private member declared in class 'std::experimental::__mdarray_version_0::basic_mdarray<float,std::experimental::extents<-1,-1,-1>,std::experimental::layout_right,std::experimental::__mdarray_version_0::vector_container_policy<T,std::allocator>>' array_test C:\development\source\cpp\particle_tracer\build\vcpkg\installed\x64-windows\include\experimental\__p1684_bits\basic_mdarray.hpp 75

P1684: Consider changing proposal not to require storing a pointer in mdarray

An important use case for mdarray is small arrays that users would normally pass around by value. For example, 3x1, 4x1, 3x3, or 4x4 arrays of very small integer or floating-point types are common in computer graphics. Such arrays might be small enough to fit in a pointer, so storing an explicit pointer to the data in mdarray (e.g., by storing an mdspan inside the mdarray) might cost enough to make mdarray undesirable for these applications.

Acceptance criteria:

  1. An mdarray with entirely compile-time extents, and suitable layout and accessor types, should take no more storage space than an std::array<T, N>, where T is the element type and N is the product of the extents.
  2. It should be possible to express overaligned access of types (e.g., 4*sizeof(T) for a 4x1 array of T) with entirely compile-time extents. It's OK if this requires a custom accessor, but it must not require additional storage (see (1)).
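Both criteria can be written down as compile-time checks. The sketch below uses hypothetical toy types (static_matrix and aligned_vec4), not P1684's mdarray. It stores only a std::array and expresses overalignment with alignas on the type rather than through a custom accessor. The size equalities hold on common implementations, though the standard does not strictly guarantee them.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical all-static-extents 2D array whose only data member is the
// storage itself: no pointer, no stored extents.
template <class T, std::size_t Rows, std::size_t Cols>
class static_matrix {
  std::array<T, Rows * Cols> storage_{};

public:
  constexpr T& operator()(std::size_t r, std::size_t c) {
    return storage_[r * Cols + c];
  }
  static constexpr std::size_t size() { return Rows * Cols; }
};

// Criterion 1: no more storage than std::array<T, N>.
static_assert(sizeof(static_matrix<float, 3, 3>) == sizeof(std::array<float, 9>));

// Criterion 2: a 4x1 array of T aligned to 4 * sizeof(T), with no extra storage.
template <class T>
struct alignas(4 * sizeof(T)) aligned_vec4 {
  std::array<T, 4> values{};
};
static_assert(sizeof(aligned_vec4<float>) == sizeof(std::array<float, 4>));
static_assert(alignof(aligned_vec4<float>) == 4 * sizeof(float));

inline void storage_example() {
  static_matrix<int, 2, 2> m;
  m(1, 0) = 7;
  assert(m(1, 0) == 7);
}
```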
