Comments (6)

redorav avatar redorav commented on September 26, 2024

Hi @mlangerak, this is expected and by design. The reason is that even if the physical layout changes, the logical layout still works. Are you finding inconsistencies when using these?

from hlslpp.

mlangerak avatar mlangerak commented on September 26, 2024

Conceptually a float4x3 consists of 4 rows (or columns for column major) of float3's so you'd expect that the matrix internally has 4 SIMD vector elements, since that would be the most natural and efficient mapping to matrix operations like constructing from 4 rows (or columns for column major), accessing a row (or column for column major), etc.

redorav avatar redorav commented on September 26, 2024

That is true; I went with the other model because it is more space-efficient. Would you expect a 4x2 matrix to work the same way and also be 4 SIMD vectors? Changing it now to work how you need would be a huge amount of work, even if I understand the rationale for it. All the matrix operations are written with this layout in mind.

redorav avatar redorav commented on September 26, 2024

I've been looking at the code. There seems to be no "right" solution for representing non-4x4 matrices with fixed-width vectors. The advantages and disadvantages I can see of a hypothetical change are:

Advantages
· Matrices are "castable" to and from the origin matrices, for example a float4x4 can become a float4x3 without issues (it can already be cast to a float3x4)
· You can poke at matrix rows via [ ] operators and assign float3 or float4

Disadvantages
· Matrices take a lot more space. In the extreme case, a float4x1 takes up four SIMD vectors if we want to make it all consistent
· It biases a physical row layout over columns
· float2x2 will become bloated as well (it's a single simd vector at the moment)

This of course ignores the large disadvantage of having to change the inner workings of all these matrices, verify results, etc. My unit test coverage is not as large as I would like, since testing was originally more manual.

I think I can see both sides of this. It took me a lot of time to get the matrix code working and its results verified (even if there are bugs), and changing it now would cause a lot more issues than it solves. If you have other ideas or suggestions, I'm open to seeing your perspective or how you'd solve it.
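
To make the trade-off above concrete, here is a minimal sketch of the two candidate layouts for a float4x3. Everything in it is an assumption for illustration: `Simd4`, `Float4x3PerRow`, `Float4x3Compact` and `getRow` are hypothetical names, not hlslpp types, and the compact form is assumed to store one column per register.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Stand-in for one 128-bit SIMD register (hypothetical, not hlslpp's type).
struct alignas(16) Simd4 { float lane[4]; };

// Layout A: one register per row, each float3 row padded to four lanes.
struct Float4x3PerRow { Simd4 row[4]; };

// Layout B: three registers of four lanes (assumed: one register per column).
struct Float4x3Compact { Simd4 column[3]; };

static_assert(sizeof(Float4x3PerRow) == 64, "four 16-byte registers");
static_assert(sizeof(Float4x3Compact) == 48, "three 16-byte registers");

// In layout A a row is the first three lanes of a single register.
inline std::array<float, 3> getRow(const Float4x3PerRow& m, std::size_t r)
{
    assert(r < 4);
    return { m.row[r].lane[0], m.row[r].lane[1], m.row[r].lane[2] };
}

// In layout B a row has to be gathered, one lane from each register.
inline std::array<float, 3> getRow(const Float4x3Compact& m, std::size_t r)
{
    assert(r < 4);
    return { m.column[0].lane[r], m.column[1].lane[r], m.column[2].lane[r] };
}
```

The per-row layout makes row access and row assignment trivial but costs an extra register; the compact layout saves 16 bytes and pays for it with a gather on every row access.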

mlangerak avatar mlangerak commented on September 26, 2024

There are pros and cons to either layout, but I would argue for using the layout that causes the least surprise, even if it is less optimal in some respects. By least surprise I mean the layout that is more common in practice. For instance, I think (I did not double check this) that when sharing constant data with the GPU, the DXC compiler will use 4 float3's for a float4x3 in row-major mode. In that case, it would be "least surprise" if you could memcpy a hlslpp::float4x3 directly when pushing constant data to the GPU.

The float2x2 case is fortunate, since then DXC uses the same packing into a single float4 as hlslpp does already. So in that case it is already consistent with DXC.

[edit]
Supporting both row and column major does seem to make this more complicated. If you really needed that flexibility you could have separate declarations for the matrix type from the matrix layout. E.g., something like:

struct Matrix4x3{...} // storage for 4 float3's, implicitly row major
struct Matrix3x4{...} // storage for 3 float4's, implicitly row major
using float4x3 = Matrix4x3; // when compiled with row major matrices
using float4x3 = Matrix3x4; // when compiled with column major matrices

It would get confusing really quickly though, so unless there is a good reason to support both row and column major, I would default to row major always since it is more natural IMO; indeed, I am using hlslpp and DXC in row major mode. Incidentally, I also need to support Metal Shading Language, which is sadly column major, and there seems to be no shader compiler switch to change that either. To compensate, I implemented a mul intrinsic for MSL which hides this column/row major distinction:

template<typename T, int RowN, int ColN, int N>
inline decltype(auto) mul(matrix<T, ColN, RowN> m, vec<T, N> v)
{
    return v * m;  // Note reversed order of matrix-vector product; Metal has column major matrix layout.
}

This works well to hide the row/column major confusion, and so far I've been able to write all my MSL pretending it is row-major.
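
For anyone wondering why swapping the operands is enough, here is a small CPU-side sketch in plain C++ (not MSL). It assumes that a column-major reader sees row-major memory as the transposed matrix, so M * v over the row-major view equals v * M^T over the column-major view. The names `kData`, `mulRowMajor` and `mulVectorTimesColumnMajor` are made up for this illustration.

```cpp
#include <array>
#include <cstddef>

// Six floats that the author thinks of as a row-major 2x3 matrix:
//   [ 1 2 3 ]
//   [ 4 5 6 ]
constexpr std::array<float, 6> kData = { 1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f };

// Row-major reading: M * v, with M being 2x3 and v having 3 components.
inline std::array<float, 2> mulRowMajor(const std::array<float, 6>& m,
                                        const std::array<float, 3>& v)
{
    std::array<float, 2> out{};
    for (std::size_t r = 0; r < 2; ++r)
        for (std::size_t c = 0; c < 3; ++c)
            out[r] += m[r * 3 + c] * v[c];
    return out;
}

// Column-major reading of the same memory: two columns of three contiguous
// floats, i.e. the 3x2 transpose. Computes the row vector v times that matrix.
inline std::array<float, 2> mulVectorTimesColumnMajor(const std::array<float, 3>& v,
                                                      const std::array<float, 6>& m)
{
    std::array<float, 2> out{};
    for (std::size_t col = 0; col < 2; ++col)
        for (std::size_t row = 0; row < 3; ++row)
            out[col] += v[row] * m[col * 3 + row];
    return out;
}
```

Both functions walk the same memory in the same order, which is the whole point: the wrapper only changes which side the vector is written on, not the data.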

redorav avatar redorav commented on September 26, 2024

There is a common misunderstanding that this library is meant to interface with HLSL when uploading to the GPU. I'll refer you to #58 for more discussion.

While that would be convenient, there are all sorts of cases where this doesn't happen. Just a few examples:

cbuffer ExampleCB
{
    float3 a;     // Offset: 0
    float b;      // Offset: 12
    float2x2 m;   // Offset: 16
    float4 v;     // Offset: 48
}
  1. HLSL considers float3 to be 12 bytes, and if you append a float3 and a float it will put them in the same 16-byte line. HLSL++ won't be able to do that since all SIMD vectors are 16 bytes, i.e.
HLSL:   [ a.x ][ a.y ][ a.z ][  b  ]

HLSL++: [ a.x ][ a.y ][ a.z ][ a.w ]
        [  b  ][      padding      ]

The same applies to float1 and float2, with their respective paddings.

  2. HLSL considers float2x2 to be two float4s, contrary to what you say above. The next variable v starts 32 bytes after m (at offset 48), not 16 bytes after it as one would expect from a single float4.
HLSL:   [ m_00 ][ m_01 ][    padding   ]
        [ m_10 ][ m_11 ][    padding   ]

HLSL++: [ m_00 ][ m_01 ][ m_10 ][ m_11 ]

  3. float3x4 and float4x3 can vary in HLSL depending on whether you specify row_major or column_major. The size changes depending on how you specify this parameter, between 48 bytes and 64, as you'd expect, but there is no fixed meaning to float3x4 or float4x3: it can mean 3 rows and 4 columns or the other way around. This isn't something that HLSL++ can express in any way really. A row_major float3x4 maps to float3x4, and a column_major float4x3 maps to float4x3, but the other combinations will fail.
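
The offsets in ExampleCB can be reproduced with a very small model of the packing rules. This is only a sketch under simplifying assumptions: 32-bit components, a vector may not straddle a 16-byte register, and a matrix starts on a register boundary with one register per stored row or column. `CBufferPacker` and its methods are hypothetical names; for real layouts, trust the compiler's reflection data instead.

```cpp
#include <cassert>
#include <cstddef>

// Simplified model of HLSL constant buffer packing (illustrative only).
struct CBufferPacker
{
    std::size_t cursor = 0; // next free byte in the buffer

    // Places a vector of 1 to 4 32-bit components and returns its offset.
    // The vector is pushed to the next register if it would straddle one.
    std::size_t addVector(std::size_t components)
    {
        assert(components >= 1 && components <= 4);
        const std::size_t size = components * 4;
        const std::size_t usedInRegister = cursor % 16;
        if (usedInRegister + size > 16)
        {
            cursor += 16 - usedInRegister; // skip to the next register
        }
        const std::size_t offset = cursor;
        cursor += size;
        return offset;
    }

    // Places a matrix stored as `registers` vectors of `components` values,
    // each in its own 16-byte register, and returns its offset.
    std::size_t addMatrix(std::size_t registers, std::size_t components)
    {
        assert(registers >= 1 && components >= 1 && components <= 4);
        cursor = (cursor + 15) / 16 * 16; // matrices start on a register boundary
        const std::size_t offset = cursor;
        cursor += (registers - 1) * 16 + components * 4;
        return offset;
    }
};
```

Feeding it `addVector(3)`, `addVector(1)`, `addMatrix(2, 2)`, `addVector(4)` in that order gives 0, 12, 16 and 48, matching the offsets in the cbuffer above.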

The main takeaway is that there is no least-surprise behavior: these things are surprising no matter what you do. Even if you had no SIMD vectors, C++'s packing rules can come in and do unexpected things. It is not HLSL++'s aim to solve these problems.

That said, I use a little interface to cater for this use case in my codebase. As it is still incomplete it hasn't made its way here yet, but it is possible to do without that much work using the store() family of functions. The other approach I have seen in lots of codebases is to just declare aligned types in general, i.e. don't use anything other than float4, maybe row_major float3x4 and float4x4.
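
As a rough sketch of the "aligned types only" approach, the CPU-side mirror of a constant buffer could look like the following. The struct names are hypothetical plain C++ types, not hlslpp ones, and the layout assumes the shader declares a float4, a row_major float3x4 and a float4x4 in the same order.

```cpp
#include <cstddef>

// Every member is a whole number of 16-byte registers, so there is no
// packing ambiguity between the C++ side and the shader side.
struct alignas(16) Float4 { float x, y, z, w; };
struct Float3x4 { Float4 row[3]; }; // pairs with `row_major float3x4`
struct Float4x4 { Float4 row[4]; };

struct ExampleConstants
{
    Float4   tint;     // offset 0
    Float3x4 world;    // offset 16
    Float4x4 viewProj; // offset 64
};

// Compile-time checks that the C++ layout is what the shader is assumed to expect.
static_assert(offsetof(ExampleConstants, world) == 16, "world starts on register 1");
static_assert(offsetof(ExampleConstants, viewProj) == 64, "viewProj starts on register 4");
static_assert(sizeof(ExampleConstants) == 128, "eight registers in total");
```

With a struct like this, a plain memcpy into the mapped buffer is safe, and filling the fields from SIMD types is the only place where a conversion has to happen.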

Hopefully that helps
