Comments (6)
Hi @mlangerak, this is expected and by design. The reason is that even if the physical layout changes, the logical layout still works. Are you finding inconsistencies when using these?
from hlslpp.
Conceptually, a float4x3 consists of 4 rows (or columns, for column-major) of float3s, so you'd expect the matrix to internally hold 4 SIMD vector elements, since that would be the most natural and efficient mapping for matrix operations like constructing from 4 rows (or columns), accessing a row (or column), etc.
from hlslpp.
That is true; the reason I went with the other model is that it is more space-efficient. Would you apply the same rule to a 4x2 matrix, making it 4 SIMD vectors as well? Changing it now to work the way you need would be a huge amount of work, even if I understand the rationale for it. All the matrix operations are written with this in mind.
from hlslpp.
I've been looking at the code. There seems to be no "right" solution for representing non-4x4 matrices with fixed-width vectors. The advantages and disadvantages I can see in a hypothetical change are:
Advantages
· Matrices become "castable" to and from the original matrices; for example, a float4x4 can become a float4x3 without issues (it can already be cast to a float3x4)
· You can poke at matrix rows via the [] operator and assign a float3 or float4
Disadvantages
· Matrices take up a lot more space. In the extreme case, a float4x1 takes up four SIMD vectors if we want everything to be consistent
· It biases a physical row layout over columns
· float2x2 becomes bloated as well (it's a single SIMD vector at the moment)
This of course ignores the large disadvantage of changing the inner workings of all these matrices, verifying results, etc. I don't have unit test coverage as large as I would like; testing was originally more manual.
I can see both sides of this. It took me a long time to get the matrix code working and to verify results (even if there are bugs), and changing it now would cause more issues than it solves. If you have other ideas or suggestions, I'm open to hearing your perspective or how you'd solve it.
from hlslpp.
There are pros and cons to either layout, but I would argue for the layout that causes the least surprise, even if it is less optimal in some respects. By least surprise I mean whichever layout is more common in practice. For instance, I think (I did not double-check this) that when sharing constant data with the GPU, the DXC compiler uses 4 float3s for a float4x3 in row-major mode. In that case, it would be "least surprise" if you could memcpy an hlslpp::float4x3 directly when pushing constant data to the GPU.
The float2x2 case is fortunate, since DXC uses the same packing into a single float4 as hlslpp already does, so that case is already consistent with DXC.
[edit]
Supporting both row- and column-major does seem to make this more complicated. If you really need that flexibility, you could separate the declaration of the matrix type from the matrix layout, e.g. something like:
struct Matrix4x3 { ... };   // storage for 4 float3's, implicitly row major
struct Matrix3x4 { ... };   // storage for 3 float4's, implicitly row major
using float4x3 = Matrix4x3; // when compiled with row major matrices
using float4x3 = Matrix3x4; // when compiled with column major matrices
It would get confusing really quickly, though, so unless there is a good reason to support both, I would default to row-major, since it is more natural IMO; indeed, I am using hlslpp and DXC in row-major mode. Incidentally, I also need to support the Metal shading language, which is sadly column-major, and there seems to be no shader compiler switch to change it. To compensate, I implemented a mul intrinsic for MSL that hides the column/row-major distinction:
template<typename T, int RowN, int ColN, int N>
inline decltype(auto) mul(matrix<T, ColN, RowN> m, vec<T, N> v)
{
    return v * m; // Note reversed order of matrix-vector product; Metal has column-major matrix layout.
}
This works well to hide the row/column major confusion, and so far I've been able to write all my MSL pretending it is row-major.
from hlslpp.
There is a common misunderstanding that this library is meant to match HLSL's memory layout when uploading data to the GPU; I'll refer you to #58 for some more discussion.
While that would be convenient, there are all sorts of cases where the layouts don't match. As a few examples, consider:
cbuffer ExampleCB
{
    float3   a; // Offset: 0
    float    b; // Offset: 12
    float2x2 m; // Offset: 16
    float4   v; // Offset: 48
}
- HLSL considers float3 to be 12 bytes, and if a float3 is followed by a float, it packs both into the same 16-byte line. HLSL++ can't do that, since all its SIMD vectors are 16 bytes, i.e.:
  HLSL:   [ a.x ][ a.y ][ a.z ][ b ]
  HLSL++: [ a.x ][ a.y ][ a.z ][ a.w ]
          [ b ][ padding ]
The same applies to float1 and float2, with their respective paddings.
- HLSL considers float2x2 to be two float4s, contrary to what you say above: the next variable v sits 32 bytes after m, not 16 as one would expect from a single float4.
  HLSL:   [ m_00 ][ m_01 ][ padding ]
          [ m_10 ][ m_11 ][ padding ]
  HLSL++: [ m_00 ][ m_01 ][ m_10 ][ m_11 ]
- float3x4 and float4x3 in HLSL vary depending on whether you specify row_major or column_major. The size changes with this parameter between 48 and 64 bytes, as you'd expect, but there is no fixed meaning to float3x4 or float4x3: it can mean 3 rows and 4 columns or the other way around. This isn't something HLSL++ can really express. A row_major float3x4 maps to float3x4, and a column_major float4x3 maps to float4x3, but the other combinations will fail.
The main takeaway is that there is no least-surprise behavior; these things are surprising no matter what you do. Even without SIMD vectors, C++'s packing rules can step in and do unexpected things. It is not HLSL++'s aim to solve these problems.
That said, I use a little interface that caters to this use case in my codebase. As it is still incomplete it hasn't made its way here yet, but it is possible to do without much work using the store() family of functions. The other approach I have seen in lots of codebases is to just declare aligned types in general, i.e. don't use anything other than float4, and maybe row_major float3x4 and float4x4.
Hopefully that helps
from hlslpp.