
redorav / hlslpp


Math library using hlsl syntax with SSE/NEON support

License: MIT License

C++ 64.62% C 33.06% Batchfile 0.02% Lua 2.28% Shell 0.01%
hlsl math cpp shaders game-development sse sse41 neon vector matrix

hlslpp's Introduction


HLSL++

Small header-only math library for C++ with the same syntax as the hlsl shading language. It supports any SSE (x86/x64 devices like PC, Mac, PS4/5, Xbox One/Series) and NEON (ARM devices like Android, iOS, Switch) platforms. It features swizzling and all the operators and functions from the hlsl documentation. The library is aimed mainly at game developers as it's meant to ease the C++ to shader bridge by providing common syntax, but can be used for any application requiring fast, portable math. It also adds some functionality that hlsl doesn't natively provide, such as convenient matrix functions, quaternions and extended vectors such as float8 (8-component float) that take advantage of wide SSE registers.

Example

hlsl++ allows you to be as expressive in C++ as when programming in the shader language. Constructs such as the following are possible.

float4 foo4 = float4(1, 2, 3, 4);
float3 bar3 = foo4.xzy;
float2 logFoo2 = log(bar3.xz);
foo4.wx = logFoo2.yx;
float4 baz4 = float4(logFoo2, foo4.zz);
float4x4 fooMatrix4x4 = float4x4( 1, 2, 3, 4,
                                  5, 6, 7, 8,
                                  8, 7, 6, 5,
                                  4, 3, 2, 1);
float4 myTransformedVector = mul(fooMatrix4x4, baz4);
int2 ifoo2 = int2(1, 2);
int4 ifoo4 = int4(1, 2, 3, 4) + ifoo2.xyxy;
float4 fooCast4 = ifoo4.wwyx;

float8 foo8 = float8(1, 2, 3, 4, 5, 6, 7, 8);
float8 bar8 = float8(1, 2, 3, 4, 5, 6, 7, 8);
float8 add8 = foo8 + bar8;

The natvis files provided for Visual Studio debugging allow you to see both vectors and the result of the swizzling in the debugging window in a programmer-friendly way.


Requirements

The only required features are a C++ compiler supporting anonymous unions, and SSE or NEON depending on your target platform. If your target platform does not have SIMD support, it can also fall back to a scalar implementation. As a curiosity it also includes an Xbox 360 implementation.

How to use

// The quickest way, expensive in compile times but good for fast iteration
#include "hlsl++.h"

// If you care about your compile times in your cpp files
#include "hlsl++_vector_float.h"
#include "hlsl++_matrix_float.h"

// If you only need type information (e.g. in header files) and don't use any functions
#include "hlsl++_vector_float_type.h"
#include "hlsl++_quaternion_type.h"
  • Remember to add an include path to "hlslpp/include"
  • Windows has defines for min and max so if you're using this library and the <windows.h> header remember to #define NOMINMAX before including it
  • To force the scalar version of the library, define HLSLPP_SCALAR globally. The scalar library is only different from the SIMD version in its use of regular floats to represent vectors. It should only be used if your platform (e.g. embedded) does not have native SIMD support. It can also be used to compare performance
  • To enable the transforms feature, define HLSLPP_FEATURE_TRANSFORM globally
  • The f32 members of float4 and the [ ] operators make use of the union directly, so the generated code is up to the compiler. Use with care

Features

  • SSE/AVX/AVX2, NEON, Xbox360, and scalar versions
  • float1, float2, float3, float4, float8
  • int1, int2, int3, int4
  • uint1, uint2, uint3, uint4
  • double1, double2, double3, double4
  • floatNxM
  • quaternion
  • Conversion construction and assignment, e.g. float4(float2, float2) and int4(float2, int2)
  • Efficient swizzling for all vector types
  • Basic operators +, *, -, / for all vector and matrix types
  • Per-component comparison operators ==, !=, >, <, >=, <= (no ternary operator as overloading is disallowed in C++)
  • hlsl vector functions: abs, acos, all, any, asin, atan, atan2, ceil, clamp, cos, cosh, cross, degrees, distance, dot, floor, fmod, frac, exp, exp2, isfinite, isinf, isnan, length, lerp, log, log2, log10, max, mad, min, modf, normalize, pow, radians, reflect, refract, round, rsqrt, saturate, sign, sin, sincos, sinh, smoothstep, sqrt, step, trunc, tan, tanh
  • Additional matrix functions: determinant, transpose, inverse (not in hlsl but very useful)
  • Matrix multiplication for all NxM matrix combinations
  • Transformation matrices for scale, rotation and translation, as well as world-to-view look_at and view-to-projection orthographic/perspective coordinate transformations. These static functions are optionally available for matrix types float2x2, float3x3, float4x4 when hlsl++.h is compiled with HLSLPP_FEATURE_TRANSFORM definition.
  • Native visualizers for Visual Studio (.natvis files) which correctly parse with both MSVC and Clang in Windows

Missing/planned:

  • boolN types

hlslpp's People

Contributors

benualdo, denys-liubushkin, egorodet, new-cassellito, plekakis, redorav


hlslpp's Issues

HLSL++ structs do not support move-semantics

HLSL++ vector and matrix structs have a user-defined copy constructor, which breaks the "rule of zero", but they do not define copy assignment, move constructor or move assignment operators, so they do not follow the "rule of five" either. The result is missing support for move semantics and lower performance when the types are used in STL containers like std::vector.

It seems that HLSL++ types do not need a user-defined copy constructor. Removing the user-defined copy constructors will let the compiler generate correct noexcept copy/move constructors and noexcept assignment operators, enabling efficient memory management in modern C++.
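The effect described above can be checked at compile time. The following is a minimal sketch using two hypothetical stand-in structs (VecWithUserCopy and VecRuleOfZero are illustrative names, not hlslpp's actual definitions): a user-declared copy constructor suppresses the implicit move operations, while a type that declares none of the special members gets trivial, noexcept ones.

```cpp
#include <type_traits>

// Hypothetical stand-in for a vector type with a hand-written copy constructor.
struct VecWithUserCopy
{
    float v[4];

    VecWithUserCopy() : v{0.0f, 0.0f, 0.0f, 0.0f} {}

    // Declaring this suppresses the implicit move constructor and move
    // assignment, so "moves" silently fall back to this (non-noexcept) copy.
    VecWithUserCopy(const VecWithUserCopy& other)
    {
        for (int i = 0; i < 4; ++i) { v[i] = other.v[i]; }
    }
};

// Hypothetical stand-in that follows the rule of zero.
struct VecRuleOfZero
{
    float v[4];
};

static_assert(!std::is_trivially_copyable<VecWithUserCopy>::value,
              "user-provided copy constructor makes the type non-trivial");
static_assert(!std::is_nothrow_move_constructible<VecWithUserCopy>::value,
              "move falls back to the copy constructor, which is not noexcept");

static_assert(std::is_trivially_copyable<VecRuleOfZero>::value,
              "compiler-generated members are trivial");
static_assert(std::is_nothrow_move_constructible<VecRuleOfZero>::value,
              "compiler-generated move constructor is noexcept");
static_assert(std::is_nothrow_move_assignable<VecRuleOfZero>::value,
              "compiler-generated move assignment is noexcept");
```

Containers such as std::vector can only use the cheapest relocation paths when these traits hold, which is the performance concern raised in the issue.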

vceilq_f32

hlslpp_inline float32x4_t vceilq_f32(float32x4_t x)
{
	float32x4_t trnc = vcvtq_f32_s32(vcvtq_s32_f32(x));				// Truncate
	float32x4_t gt = vcgtq_f32(trnc, x);							// Check if truncation was greater or smaller (i.e. was negative or positive number)
	uint32x4_t shr = vshrq_n_u32(vreinterpretq_u32_f32(gt), 31);	// Shift to leave a 1 or a 0
	float32x4_t result = vaddq_f32(trnc, vcvtq_f32_u32(shr));		// Add to truncated value
	return result;
}

The line "float32x4_t gt = vcgtq_f32(trnc, x);" should be changed to "float32x4_t gt = vcgtq_f32(x, trnc);", so that 1 is added only when x is greater than its truncated value.
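To see why the operand order matters, here is a scalar model of a single lane of the routine. It is only a sketch: the function names are made up for illustration, and like the NEON code it is valid only while x fits in a 32-bit integer.

```cpp
// Scalar model of one lane with the corrected comparison (x > trunc(x)).
inline float ceil_lane_fixed(float x)
{
    float trnc = static_cast<float>(static_cast<int>(x)); // truncate toward zero
    float add = (x > trnc) ? 1.0f : 0.0f;                 // 1 only for positive non-integers
    return trnc + add;
}

// Scalar model of one lane with the comparison as originally reported (trunc(x) > x).
inline float ceil_lane_reported(float x)
{
    float trnc = static_cast<float>(static_cast<int>(x)); // truncate toward zero
    float add = (trnc > x) ? 1.0f : 0.0f;                 // 1 for negative non-integers instead
    return trnc + add;
}
```

With the reported comparison, 1.5 yields 1 and -1.5 yields 0; with the corrected one, 1.5 yields 2 and -1.5 yields -1.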

Matrix comparison operators

Matrices currently do not have any comparison operators defined for them. I can get around it by writing my own operators manually but it would be nice if these were built-in in hlslpp.

Build output

1>C:\Personal\ElectronicJonaJoy\src\EngineTests\math.tests.cpp(67,1): error C2678: binary '==': no operator found which takes a left-hand operand of type 'hlslpp::float4x4' (or there is no acceptable conversion)
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_float.h(1122,23): message : could be 'hlslpp::float1 hlslpp::operator ==(const hlslpp::float1 &,const hlslpp::float1 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_float.h(1123,23): message : or       'hlslpp::float2 hlslpp::operator ==(const hlslpp::float2 &,const hlslpp::float2 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_float.h(1124,23): message : or       'hlslpp::float3 hlslpp::operator ==(const hlslpp::float3 &,const hlslpp::float3 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_float.h(1125,23): message : or       'hlslpp::float4 hlslpp::operator ==(const hlslpp::float4 &,const hlslpp::float4 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_int.h(568,21): message : or       'hlslpp::int1 hlslpp::operator ==(const hlslpp::int1 &,const hlslpp::int1 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_int.h(569,21): message : or       'hlslpp::int2 hlslpp::operator ==(const hlslpp::int2 &,const hlslpp::int2 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_int.h(570,21): message : or       'hlslpp::int3 hlslpp::operator ==(const hlslpp::int3 &,const hlslpp::int3 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_int.h(571,21): message : or       'hlslpp::int4 hlslpp::operator ==(const hlslpp::int4 &,const hlslpp::int4 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_uint.h(573,22): message : or       'hlslpp::uint1 hlslpp::operator ==(const hlslpp::uint1 &,const hlslpp::uint1 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_uint.h(574,22): message : or       'hlslpp::uint2 hlslpp::operator ==(const hlslpp::uint2 &,const hlslpp::uint2 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_uint.h(575,22): message : or       'hlslpp::uint3 hlslpp::operator ==(const hlslpp::uint3 &,const hlslpp::uint3 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_uint.h(576,22): message : or       'hlslpp::uint4 hlslpp::operator ==(const hlslpp::uint4 &,const hlslpp::uint4 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_double.h(1079,24): message : or       'hlslpp::double1 hlslpp::operator ==(const hlslpp::double1 &,const hlslpp::double1 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_double.h(1080,24): message : or       'hlslpp::double2 hlslpp::operator ==(const hlslpp::double2 &,const hlslpp::double2 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_double.h(1081,24): message : or       'hlslpp::double3 hlslpp::operator ==(const hlslpp::double3 &,const hlslpp::double3 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_vector_double.h(1090,24): message : or       'hlslpp::double4 hlslpp::operator ==(const hlslpp::double4 &,const hlslpp::double4 &)' [found using argument-dependent lookup]
1>c:\personal\electronicjonajoy\external\hlslpp\include\hlsl++_quaternion.h(227,24): message : or       'hlslpp::float4 hlslpp::operator ==(const hlslpp::quaternion &,const hlslpp::quaternion &)' [found using argument-dependent lookup]
1>C:\Personal\ElectronicJonaJoy\src\EngineTests\math.tests.cpp(67,1): message : while trying to match the argument list '(hlslpp::float4x4, hlslpp::float4x4)'

Wrong value when flooring Y Component.

The code snippet below shows the behaviour I currently see.

float1 y{ -0.01f };
float uf = hlslpp::floor(y); // returns -1 : ok
			
float3 broken{ -11.15f,-0.1f,-15.0f };
// Accessing the Y component is correct
float yVal = broken.y;
// next statement returns -12.0f -> Floor of x component
float actualf = hlslpp::floor(broken.y);

The floor function seems to be flooring my x component and returning that value instead of the Y component.

Add refract

I think the function refract is missing. The following code is stolen from here. I is the incident vector, N is the normal vector, and eta is the ratio of indices of refraction.

k = 1.0 - eta * eta * (1.0 - dot(N, I) * dot(N, I));
if (k < 0.0)
    R = floatN(0.0);
else
    R = eta * I - (eta * dot(N, I) + sqrt(k)) * N;

Please add it, thank you.
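As a rough illustration of the formula above, here is a self-contained scalar sketch. Vec3, dot3 and refract3 are hypothetical names introduced for this example and are not part of hlslpp; I and N are assumed to be normalized.

```cpp
#include <cmath>

// Hypothetical plain 3-component vector, used only for this sketch.
struct Vec3 { float x, y, z; };

inline float dot3(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// I: incident vector, N: surface normal, eta: ratio of indices of refraction.
inline Vec3 refract3(const Vec3& I, const Vec3& N, float eta)
{
    const float NdotI = dot3(N, I);
    const float k = 1.0f - eta * eta * (1.0f - NdotI * NdotI);
    if (k < 0.0f)
    {
        return Vec3{0.0f, 0.0f, 0.0f}; // total internal reflection
    }
    const float s = eta * NdotI + std::sqrt(k);
    return Vec3{eta * I.x - s * N.x, eta * I.y - s * N.y, eta * I.z - s * N.z};
}
```

For a ray hitting the surface head-on with eta = 1 the result equals the incident vector, and a grazing ray with eta = 1.5 returns the zero vector.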

Better inclusion manual for usage in existing project

It would be nice to have a manual on how to incorporate the library into an existing Visual Studio solution, so that people can quickly start using this awesome library. For example, which settings need to be checked for a successful compilation, or what else to pay attention to, because simply including the headers doesn't work.

I think that would be great, because it would let users inexperienced with C++ or the Visual Studio pipeline quickly try out and use the library.

Add modulo operator

Apparently float also accepts the modulo operator so it needs to be added to every type

Optimize double vectors using AVX

This is already halfway done, but here for keeping track. Takes advantage of AVX support to pack double3 and double4 into __m256d instead of two __m128d

Add overloads for float

It is ambiguous to do things like hlslpp::radians(0.3f) because float can be implicitly converted to floatN. Even if it's not the purpose of hlsl++ to provide scalar versions of these functions it's probably not hard and makes it more complete

Vector comparison operators are not available for doubles and uints

The operators ==, !=, <, <=, >, >= are not implemented for the double1, double2, double3, double4 types, and the implementation is commented out for uint1, uint2, uint3, uint4. Meanwhile, these operators are properly implemented for floats and ints.

I'm trying to use vector types in my template wrapper class Point<T, N>, which is used with floats, ints, uints and doubles. It currently fails to compile for T=uint32_t and T=double because of this asymmetry in the underlying vector type implementations.

Would it be possible to implement these comparison operators for all vector types?

Broken lerp

Lerp seems to be broken. I had to revert to the path marked as slower in _hlslpp_lerp_ps to get it working.

Add 'any' and 'all' syntax to branch according to vector comparison result

Hi,
I would find it very useful to have the 'any'/'all' HLSL syntax for branching on the result of a vector comparison.

For example:

void CommandList::setViewport(const uint4 & _viewport)
{
    if ( any( _viewport != m_viewport ) )
    {
        bindViewport(_viewport);
        m_viewport = _viewport;
    }
}

Operators like intN operator != could return boolN to make it even clearer to use.

Thanks,
Benoît.
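A standalone sketch of the requested semantics, for illustration only. UInt4 and Bool4 are hypothetical mock types written for this example and do not reflect how hlslpp stores or compares its vectors.

```cpp
#include <cstdint>

// Hypothetical mock types, used only to illustrate the semantics.
struct UInt4 { std::uint32_t x, y, z, w; };
struct Bool4 { bool x, y, z, w; };

// Per-component comparison returning a boolean vector.
inline Bool4 operator!=(const UInt4& a, const UInt4& b)
{
    return Bool4{a.x != b.x, a.y != b.y, a.z != b.z, a.w != b.w};
}

// True if at least one component is true.
inline bool any(const Bool4& b) { return b.x || b.y || b.z || b.w; }

// True only if every component is true.
inline bool all(const Bool4& b) { return b.x && b.y && b.z && b.w; }
```

With these in place, a check such as if (any(a != b)) reads the same way in C++ as it does in a shader.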

_hlslpp_sel_ps NEON

I think there is a mistake in the NEON definition of _hlslpp_sel_ps.
The SSE definition is
#define _hlslpp_sel_ps(x, y, mask) _mm_blendv_ps((x), (y), (mask))
which is correct: when mask is 1, y is selected, otherwise x.
In NEON the definition is
#define _hlslpp_sel_ps(x, y, mask) vbslq_f32((mask), (x), (y))
which should be
#define _hlslpp_sel_ps(x, y, mask) vbslq_f32((mask), (y), (x))
because in vbslq_f32, when mask is 1 the second argument is selected, otherwise the third.
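The argument order can be checked with a scalar model of one 32-bit lane. This is a sketch with made-up function names; it assumes the mask lane is either all ones or all zeros, as produced by a comparison.

```cpp
#include <cstdint>

// One lane of _mm_blendv_ps(x, y, mask): the sign bit of mask picks y, otherwise x.
inline std::uint32_t blendv_lane(std::uint32_t x, std::uint32_t y, std::uint32_t mask)
{
    return (mask & 0x80000000u) ? y : x;
}

// One lane of vbslq(mask, a, b): bitwise select, a where mask bits are set, b elsewhere.
inline std::uint32_t bsl_lane(std::uint32_t mask, std::uint32_t a, std::uint32_t b)
{
    return (mask & a) | (~mask & b);
}

// NEON select as reported in the issue: x and y are passed in the SSE order.
inline std::uint32_t sel_lane_reported(std::uint32_t x, std::uint32_t y, std::uint32_t mask)
{
    return bsl_lane(mask, x, y);
}

// NEON select with the proposed fix: y and x swapped.
inline std::uint32_t sel_lane_fixed(std::uint32_t x, std::uint32_t y, std::uint32_t mask)
{
    return bsl_lane(mask, y, x);
}
```

Only the swapped version agrees with the SSE behaviour for both an all-ones and an all-zeros mask.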

GCC build error and MSVC warning on invalid cast of an rvalue

The latest version of HLSL++ does not build with GCC and MSVC at the maximum warning level:

  • GCC errors example: hlsl++_sse.h:792:41: error: invalid cast of an rvalue expression of type ‘__m128’ {aka ‘__vector(4) float’} to type ‘const n128i&’ {aka ‘const __vector(2) long long int&’}
      792 | x = (const n128i&)_mm_load_ss((float*)p);
  • MSVC warnings example: hlsl++_sse.h(792,20): warning C4238: nonstandard extension used: class rvalue used as lvalue

Add operator [ ]

For vectors and matrices. Vectors return a float1, matrices a float4

Add scalar version of library

Add a non-vectorized version of the library. This would allow mixing and matching on platforms (like 32-bit NEON) that don't have vectorized double types but may still want to use the math library. It can also help in future comparisons between vectorized and scalar code.

Optimize NEON shuffles

They're too generic currently and inefficient. We can probably specialize most combinations using constructs such as

vcombine_f32(vget_high_f32(x), vget_low_f32(y))
vrev64q_f32(x)

etc.

Add load

I noticed there's store() but no load(). There is a section specified as "Float Store/Load" but load is missing. Just making sure it's not forgotten. Would be handy.

Add SSE2 fallbacks

Seems simple enough, it's these functions:

// Float
_mm_blend_ps
_mm_blendv_ps
_mm_trunc_ps
_mm_round_ps
_mm_ceil_ps

// Int
_mm_blend_epi16
_mm_mullo_epi32
_mm_mul_epi32
_mm_max_epi32
_mm_min_epi32

// Double
_mm_blend_pd

Compare with similar libraries

It would be reasonable to compare this library with similar libraries, such as glm and hlml.

It would be interesting to see something like a feature table of hlslpp vs glm vs hlml.

type size and alignment

Hi,

hlslpp looks great. The only thing preventing me from using it is the size and alignment of its types.
float1/2/3 are 16 bytes, and every floatN(xM) in hlslpp has an alignment of 16 bytes (rather than 4).
That's very different from hlsl, so we can't share some code between C++ and hlsl, such as buffer struct definitions.

Any thoughts about it? Thanks.
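The layout difference can be made concrete with a small sketch. All the types below are hypothetical stand-ins (SimdFloat3 mimics a 3-component vector stored in a 16-byte SIMD register, PackedFloat3 is three tightly packed floats), and the sketch assumes a 4-byte float.

```cpp
#include <cstddef>

// Hypothetical stand-in for a 3-component vector backed by a 16-byte SIMD register.
struct alignas(16) SimdFloat3 { float lanes[4]; };

// Hypothetical tightly packed equivalent, 12 bytes with 4-byte alignment.
struct PackedFloat3 { float x, y, z; };

static_assert(sizeof(float) == 4, "sketch assumes a 4-byte float");
static_assert(sizeof(SimdFloat3) == 16 && alignof(SimdFloat3) == 16, "SIMD-backed layout");
static_assert(sizeof(PackedFloat3) == 12 && alignof(PackedFloat3) == 4, "packed layout");

// The same hypothetical buffer struct declared with each vector type.
struct LightSimd   { SimdFloat3 position;   float intensity; };
struct LightPacked { PackedFloat3 position; float intensity; };

// With the SIMD-backed type, intensity lands at offset 16 and the struct grows to 32 bytes.
static_assert(offsetof(LightSimd, intensity) == 16, "intensity follows a full 16-byte vector");
static_assert(sizeof(LightSimd) == 32, "padded up to the 16-byte alignment");

// With the packed type, intensity shares the first 16 bytes with position.
static_assert(offsetof(LightPacked, intensity) == 12, "intensity directly follows position");
static_assert(sizeof(LightPacked) == 16, "fits in a single 16-byte slot");
```

A common workaround is to keep separate packed structs for data shared with shaders and convert to the SIMD types only for computation; whether that suits a given codebase is a judgment call.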

Move constructors are not auto-generated for matrix types anymore

Hi @redorav,
I've noticed that in one of the latest commits you added manual implementations of the copy constructors and copy assignment operators to the matrix types. As a result, the C++ compiler no longer generates move constructors and move assignment operators automatically for these types, and my static analysis system reported a number of issues regarding std::move(matrix) calls and other std::move(...) calls for types that have matrix fields in Methane Kit. This can be fixed either by removing the manual implementations of the copy constructors and assignment operators, letting the compiler auto-generate them properly, or by implementing both copy and move constructors and assignment operators (according to the rule of five). Also be sure to make the move constructors and assignment operators noexcept, according to the standard. I suggested this before in issue #40, which was fixed by removing the manual implementations. Is there any reason to keep these manual implementations? Are they different from the auto-generated ones?

Fast affine inverse support

There are lots of cases in animation that require computing the inverse of affine matrices, and there are many assumptions one can make when a 4x4 matrix is an affine transformation. Any chance something like that would be considered?

(p.s. this is Bryan from PG ;))
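One way such a routine could look is sketched below. It is not hlslpp code: Mat4 and affine_inverse are hypothetical names, and the sketch assumes a row-major matrix that transforms column vectors, with the linear part A in the upper-left 3x3 block, the translation t in the last column, and a last row of (0, 0, 0, 1). Under those assumptions the inverse is [A⁻¹, -A⁻¹t], so only a 3x3 inverse is needed.

```cpp
// Hypothetical row-major 4x4 matrix, indexed as m[row][col].
struct Mat4 { float m[4][4]; };

// Inverts an affine matrix [A t; 0 0 0 1]. Returns false if A is singular.
inline bool affine_inverse(const Mat4& a, Mat4& out)
{
    // Cofactors of the first row of A, also used for the determinant.
    const float c00 = a.m[1][1] * a.m[2][2] - a.m[1][2] * a.m[2][1];
    const float c01 = a.m[1][2] * a.m[2][0] - a.m[1][0] * a.m[2][2];
    const float c02 = a.m[1][0] * a.m[2][1] - a.m[1][1] * a.m[2][0];
    const float det = a.m[0][0] * c00 + a.m[0][1] * c01 + a.m[0][2] * c02;
    if (det == 0.0f) { return false; }
    const float inv = 1.0f / det;

    // Inverse of A is the transposed cofactor matrix divided by the determinant.
    float r[3][3];
    r[0][0] = c00 * inv;
    r[1][0] = c01 * inv;
    r[2][0] = c02 * inv;
    r[0][1] = (a.m[0][2] * a.m[2][1] - a.m[0][1] * a.m[2][2]) * inv;
    r[1][1] = (a.m[0][0] * a.m[2][2] - a.m[0][2] * a.m[2][0]) * inv;
    r[2][1] = (a.m[0][1] * a.m[2][0] - a.m[0][0] * a.m[2][1]) * inv;
    r[0][2] = (a.m[0][1] * a.m[1][2] - a.m[0][2] * a.m[1][1]) * inv;
    r[1][2] = (a.m[0][2] * a.m[1][0] - a.m[0][0] * a.m[1][2]) * inv;
    r[2][2] = (a.m[0][0] * a.m[1][1] - a.m[0][1] * a.m[1][0]) * inv;

    for (int i = 0; i < 3; ++i)
    {
        for (int j = 0; j < 3; ++j) { out.m[i][j] = r[i][j]; }
        // Inverse translation is -A^-1 * t.
        out.m[i][3] = -(r[i][0] * a.m[0][3] + r[i][1] * a.m[1][3] + r[i][2] * a.m[2][3]);
        out.m[3][i] = 0.0f;
    }
    out.m[3][3] = 1.0f;
    return true;
}
```

If the linear part is known to be a pure rotation, the 3x3 inverse reduces to a transpose, which is cheaper still.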
