crud89 / litefx

Modern, flexible computer graphics and rendering engine, written in C++23 with support for Vulkan and DirectX 12.

Home Page: https://litefx.crudolph.io/

License: MIT License

CMake 5.23% C++ 93.71% HLSL 1.02% PowerShell 0.04%
computer-graphics vulkan-engine directx-12 fluent-api vulkan directx-12-engine cpp20 rendering rendering-engine rendering-3d-graphics

litefx's People

Contributors

crud89



litefx's Issues

Allow descriptor sets to be built from shader reflection

Instead of explicitly defining descriptor sets and vertex input assembly formats when creating a pipeline layout, the builder could load the required information from a shader reflection interface. This would simplify setting up render pipelines, but it can also cause issues, since the application's buffer format is not obvious from the shader alone.

Some references:

Support embedding shaders into binary resources.

Shaders can currently only be copied into a pre-defined directory. It would be nice if they were first built into an intermediate directory and then copied there. Alternatively, they could be embedded into the binary as a resource.

Embedded resources are not trivial to manage using CMake, so we will probably need to invoke the resource compiler manually. Furthermore, this might introduce another dependency on MSVC, since other compilers treat resources differently.

Add support for mip-map generation.

Mip maps can already be used, as long as they are part of the image memory. We should implement support for mip map generation.

Decouple descriptors from resources

Currently a resource is initialized with the descriptor layout it has been created from. We should decouple this and instead only create a thin descriptor object that can be updated with any resource that fits the layout. This has two major advantages:

  • It would allow binding the same resource to different descriptor types (e.g. writable/read-only descriptors, uniform or storage buffers).
  • It would allow using a resource in different pipeline configurations.

This would also nicely fit with the work done in #51.
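A minimal sketch of such a thin descriptor object could look like the following. All names here are hypothetical and only illustrate the decoupling; they are not the engine's actual API.

```cpp
#include <cassert>

// Hypothetical sketch: a thin descriptor that only knows its binding point and
// type, and can be re-pointed at any resource that fits the layout.
enum class DescriptorType { Uniform, Storage, Texture };

struct Resource {
    unsigned size;
};

class Descriptor {
public:
    Descriptor(unsigned binding, DescriptorType type) noexcept :
        m_binding(binding), m_type(type) { }

    // Update the descriptor with any compatible resource; the resource itself
    // no longer stores the descriptor layout it has been created from.
    void update(const Resource& resource) noexcept { m_resource = &resource; }

    unsigned binding() const noexcept { return m_binding; }
    DescriptorType type() const noexcept { return m_type; }
    const Resource* boundResource() const noexcept { return m_resource; }

private:
    unsigned m_binding;
    DescriptorType m_type;
    const Resource* m_resource = nullptr;
};
```

The same descriptor can then be updated with different resources over its lifetime, which is exactly what the coupling in the current design prevents.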

Implement Raytracing support.

Preliminary issue to track feature support for ray tracing.

Similar to compute pipelines, raytracing may need its own pipeline type, layout and shader programs.


Since this is a large-scale feature, the following bullets approximately track the progress:

  • Acceleration structures
    • Support GPU virtual addresses in Vulkan and expose them to the buffer interfaces.
    • BLAS and TLAS structures.
    • Instance and Geometry flags.
    • Barrier support
    • Copying and compaction
    • Allow multiple acceleration structures to be stored in a single buffer.
    • Allow acceleration structure buffer release (backing buffers can be moved out of their acceleration structures, buffers for building can be released on demand).
  • Ray tracing pipeline
    • Barrier support
    • Shader binding tables
    • Pipeline implementation
      • Vulkan
      • DirectX 12
  • General
    • Matrix type
    • Inline Ray Tracing (Ray Queries)
    • Runtime option (instead of compile-time option)

Make resource barriers explicit.

We should introduce an explicit resource barrier and resource state type. Currently resource transitions are done implicitly, which is not very efficient. Instead, it should be possible for the application to define a set of transitions within one barrier. This way it is possible to:

  1. Barrier multiple resources to copy dest.
  2. Copy all resources.
  3. Barrier all resources back to shader resource.
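The batching described above could be modeled roughly as follows. This is a standalone sketch with made-up type names; in a real backend, executing the barrier would translate into a single vkCmdPipelineBarrier or ResourceBarrier call containing all transitions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch: an explicit barrier that collects multiple resource
// state transitions and applies them in one batch.
enum class ResourceState { ShaderResource, CopyDestination };

struct Resource {
    ResourceState state = ResourceState::ShaderResource;
};

class Barrier {
public:
    // Record a transition; nothing happens until the barrier is executed.
    void transition(Resource& resource, ResourceState targetState) {
        m_transitions.push_back({ &resource, targetState });
    }

    // Apply all recorded transitions at once.
    void execute() {
        for (auto& transition : m_transitions)
            transition.resource->state = transition.targetState;

        m_transitions.clear();
    }

private:
    struct Transition {
        Resource* resource;
        ResourceState targetState;
    };

    std::vector<Transition> m_transitions;
};
```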

Interop DXGI swap chain in Vulkan, if DX12 backend is enabled.

Describe your problem

Re-using a window between different backends is not possible if the window has been used by a flip model swap chain. It is not possible to revert to BitBlt, which leaves the window unusable from Vulkan, as Vulkan uses GDI to present images.

Describe your proposed solution

If the DirectX 12 backend is enabled, we can interop with the DXGI swap chain to allow Vulkan to present its images using the flip model, too. This solves the problem in an elegant fashion, whilst also improving performance. An example interop can be found here.

Additional context

This should also allow us to use proper HDR from Vulkan.

Create installation examples.

Basically there are two (and a half) ways we should support installation:

  1. Manual builds should be consumable from CMake.
    1.5 This also means it would be possible (in theory) to create a vcpkg port.
  2. Binary distributions using CPack.

Create fullscreen sample.

We should create a sample on how to use fullscreen mode (both: exclusive and fullscreen window). This might also require some tweaks to how the swap chain is created.

Add support for push constants.

Push constants are small buffers that can be efficiently updated between individual draw calls. They are handy when providing transform buffers or material properties.

In DirectX 12, they are called Root Constants.
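One constraint worth keeping in mind: Vulkan only guarantees 128 bytes of push constant storage (the maxPushConstantsSize device limit), so a portable range type should validate against that lower bound. A sketch with hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical sketch of a push constant range, validated against the 128
// bytes of push constant storage the Vulkan spec guarantees as a minimum.
constexpr std::size_t GuaranteedPushConstantSize = 128;

struct PushConstantRange {
    std::size_t offset;
    std::size_t size;

    PushConstantRange(std::size_t offset, std::size_t size) :
        offset(offset), size(size) {
        if (offset + size > GuaranteedPushConstantSize)
            throw std::invalid_argument("The push constant range exceeds the guaranteed limit.");
    }
};
```

Larger per-draw data (e.g. full material blocks) would still have to go through regular uniform/constant buffers.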

Remove samples from default feature list.

With the introduction of the vcpkg.json manifest file, the engine can be built based on features, which comes in handy for creating a vcpkg port. However, switching off CMake variables (i.e. BUILD_EXAMPLES) requires a port file. Thus I've temporarily added the samples feature to the list of default features. Otherwise, the stb dependency will not be resolved when building from a workflow.

I need to figure out a way to properly define the features that are used when building the project. However, this might also be a misunderstanding of how the manifest system works.

Test builds before merging pull requests.

We should run test builds before merging pull requests. assimp uses a nice system where first-time contributors cannot automatically invoke builds. We could use something similar (and possibly only allow manual builds).

Add support for render target blending.

A render target should expose a blend state. This should include:

  • Alpha and color blend operations (if both are NONE, blending is disabled altogether).
  • The color write mask (i.e. which channels to write to after blending).
  • Source alpha and color blend factors.
  • Destination alpha and color blend factors.

Furthermore, the pipeline should expose two blend properties:

  • Per channel blend constants (a 4-D vector)
  • Logical operation to apply (optional) - if NONE, the logical operation will be disabled.
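The per-target state above could be sketched as a plain struct. Names and defaults here are illustrative only, not the engine's actual API:

```cpp
#include <cassert>

// Hypothetical sketch of the per-render-target blend state described above.
enum class BlendOperation { None, Add, Subtract, Min, Max };
enum class BlendFactor { Zero, One, SrcAlpha, OneMinusSrcAlpha };

struct BlendState {
    BlendOperation colorOperation = BlendOperation::None;
    BlendOperation alphaOperation = BlendOperation::None;
    BlendFactor sourceColorFactor = BlendFactor::One;
    BlendFactor destinationColorFactor = BlendFactor::Zero;
    BlendFactor sourceAlphaFactor = BlendFactor::One;
    BlendFactor destinationAlphaFactor = BlendFactor::Zero;
    unsigned char colorWriteMask = 0xF;     // One bit per channel (RGBA).

    // Blending is disabled altogether if both operations are None.
    bool enabled() const noexcept {
        return colorOperation != BlendOperation::None ||
               alphaOperation != BlendOperation::None;
    }
};
```

For example, standard alpha blending would set the color operation to Add with SrcAlpha/OneMinusSrcAlpha factors.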

Use Timeline Semaphores for synchronization.

Timeline semaphores provide a more straightforward way to synchronize. They behave more like fences in DirectX and should thus allow us to get rid of the Vulkan-specific command buffer submit logic.

This should furthermore allow us to get rid of the semaphore-overload for present calls. 1

1 Unfortunately, timeline semaphores are not (yet?) supported in present commands (see question 7 in linked article).
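The core semantics — a monotonically increasing 64-bit counter, as with DirectX 12 fences — can be modeled as follows; a real implementation would block in a wait call until the target value is reached, rather than just checking it.

```cpp
#include <cassert>
#include <cstdint>

// Minimal model of timeline semaphore semantics: a monotonically increasing
// 64-bit counter that work can signal and wait on.
class TimelineSemaphore {
public:
    // Signaling can only move the counter forward, never backwards.
    void signal(std::uint64_t value) noexcept {
        if (value > m_value)
            m_value = value;
    }

    // True if the counter has reached `value`; a real wait would block here.
    bool isCompleted(std::uint64_t value) const noexcept {
        return m_value >= value;
    }

    std::uint64_t value() const noexcept { return m_value; }

private:
    std::uint64_t m_value = 0;
};
```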

Invert relation between render pass and pipeline

A render pass should contain the pipeline, not the other way around. This makes construction more straightforward:

m_pipeline = m_device->buildRenderPass()
    .attachPresentTarget(true)
    .definePipeline()
        .withLayout()
            .setRasterizer()
                .withPolygonMode(PolygonMode::Solid)
                .withCullMode(CullMode::BackFaces)
                .withCullOrder(CullOrder::ClockWise)
                .withLineWidth(1.f)
                .go()
            .setInputAssembler()
                .withTopology(PrimitiveTopology::TriangleList)
                .withIndexType(IndexType::UInt16)
                .addVertexBuffer(sizeof(Vertex), 0)
                    .addAttribute(0, BufferFormat::XYZ32F, offsetof(Vertex, Position))
                    .addAttribute(1, BufferFormat::XYZW32F, offsetof(Vertex, Color))
                    .go()
                .go()
            .setShaderProgram()
                .addVertexShaderModule("shaders/deferred_shading.vert.spv")
                .addFragmentShaderModule("shaders/deferred_shading.frag.spv")
                .addDescriptorSet(DescriptorSets::PerFrame, ShaderStage::Vertex | ShaderStage::Fragment)
                    .addUniform(0, sizeof(CameraBuffer))
                    .go()
                .addDescriptorSet(DescriptorSets::PerInstance, ShaderStage::Vertex)
                    .addUniform(0, sizeof(TransformBuffer))
                    .go()
                .go()
            .addViewport()
                .withRectangle(RectF(0.f, 0.f, static_cast<Float>(m_device->getBufferWidth()), static_cast<Float>(m_device->getBufferHeight())))
                .addScissor(RectF(0.f, 0.f, static_cast<Float>(m_device->getBufferWidth()), static_cast<Float>(m_device->getBufferHeight())))
                .go()
            .go()
        .go()
    .go();

This should also be implemented in order to support render-pass dependencies (see issue #4).

Separate transfer and rendering.

Currently, transfer calls are issued on the graphics queue of the underlying device. This is not optimal, since it may increase frame times if large amounts of memory need to be uploaded to VRAM. We should instead use a separate queue (that only supports transfer) for transferring memory. We could then use a separate thread to load buffers and issue transfer calls.

Decouple frame buffer from render pass.

We should move the frame buffer into a separate object that stores the render targets and is separate from (yet owned by) the render pass. This should make mapping between render passes easier, for example when re-creating frame buffers on resize.

Expose depth/stencil state.

Currently the depth/stencil state can only be enabled or disabled. This should be enhanced by exposing the depth and stencil states on a pipeline.

The depth state should expose:

  • Bias.
  • Compare operation. †
  • Test enable/disable. †
  • Write enable/disable. †
  • Depth boundaries.
  • Boundaries test enable/disable. †

The stencil state should expose:

  • Test enable/disable. †
  • Compare mask.
  • Write mask.
  • Reference value.

† Part of the static state, i.e. if modified, the pipeline needs to be recreated.
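The proposed states could be sketched as plain structs (names and defaults are illustrative only); the static/dynamic split from the lists above is annotated per member:

```cpp
#include <cassert>

// Hypothetical sketch of the proposed state objects. Members marked "static"
// require the pipeline to be recreated when modified.
enum class CompareOperation { Never, Less, Equal, LessEqual, Greater, NotEqual, GreaterEqual, Always };

struct DepthState {
    float bias = 0.f;                                            // Dynamic.
    CompareOperation compareOperation = CompareOperation::Less;  // Static.
    bool testEnabled = true;                                     // Static.
    bool writeEnabled = true;                                    // Static.
    float minBound = 0.f, maxBound = 1.f;                        // Dynamic depth boundaries.
    bool boundsTestEnabled = false;                              // Static.
};

struct StencilState {
    bool testEnabled = false;                                    // Static.
    unsigned int compareMask = 0xFF;
    unsigned int writeMask = 0xFF;
    unsigned int referenceValue = 0;
};
```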

Combine `IImage` and `ITexture` into one interface.

Describe your problem
Currently, the only real difference between textures and images is that textures support multi-sampling and mip-map generation. Mip-maps are also supported by images, but generating mip-maps is only allowed for textures.

Describe your proposed solution
The differentiation between images and textures appears somewhat arbitrary. We should remove the ITexture interface and use the common IImage interface for both. This would make the API design much clearer.

Add compute pipelines.

Compute pipelines are used to execute compute shaders. Compute shaders must be executed outside the typical render pass/frame buffer scheme and require external synchronization with graphics work. It is not valid to invoke a compute shader while a render pass is executing.

Support lost allocations

In Vulkan, it is valid to keep certain objects (like textures) in memory and let them run out of date. The driver might then re-allocate the memory and move it from VRAM to DRAM. The Vulkan Memory Allocator has built-in support for this scenario. So instead of relying on custom streaming, we should implement support for lost allocations. Instead of counting active references, a streaming implementation could then be a chain of fallbacks:

  • When required, transfer a resource to the VRAM.
  • When not required any longer, VMA may move the resource out of VRAM.
  • If the resource is requested again, try to transfer it back to the VRAM (fail if not possible)
  • If the DRAM pressure is too high, release lost allocations.
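The fallback chain above could be sketched like this — a standalone model of the residency logic, not VMA's actual API:

```cpp
#include <cassert>

// Hypothetical model of the fallback chain: a resource either resides in VRAM
// or has been moved out (lost) by the allocator.
enum class Residency { InVram, Lost };

struct StreamedResource {
    Residency residency = Residency::Lost;
};

// Request a resource for use: if it is lost, try to transfer it back to VRAM
// and fail if that is not possible (e.g. because memory pressure is too high).
bool request(StreamedResource& resource, bool vramAvailable) {
    if (resource.residency == Residency::InVram)
        return true;

    if (!vramAvailable)
        return false;

    resource.residency = Residency::InVram;
    return true;
}
```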

Create workflow to publish releases.

We should provide a separate script that runs on every commit to the main branch, builds the documentation and deploys it to a litefx-docs repository. Furthermore, we should forbid direct commits to the main branch, so that each change needs a branch.

This nicely fits with issue #30.

Refactor shader builds.

There's already a branch that started refactoring the shader build script. The script should be a standalone script that can be installed (see issue #27) alongside the distributions in order to compile shaders for applications. The general interface for a shader definition could look like this:

ADD_SHADER_MODULE(vertex_shader
    SOURCE vs.hlsl
    LANGUAGE HLSL
    COMPILE_AS DXIL
    SHADER_MODEL 6_3
    TYPE VERTEX)

ADD_SHADER_MODULE(pixel_shader
    SOURCE ps.hlsl
    LANGUAGE HLSL
    COMPILE_AS DXIL
    SHADER_MODEL 6_3
    TYPE PIXEL)

TARGET_LINK_SHADERS(my_target SHADERS vertex_shader pixel_shader)

We then need to figure out how to handle installs for shader targets.

Allow fences to be made explicit.

Similar to #55 it could be useful to have a manual way of specifying fences and wait for them.

For Vulkan this can be implemented by #24.

This would allow us to store a fence on each frame buffer that can be reset when the parent render pass begins.

Use one command buffer per swap chain image.

The current implementation uses one command buffer per render pass, which has a fence that waits for the previous pass to be finished before beginning a new pass. This is problematic, since it basically waits for the previous swap chain image to be rendered before starting to record draw calls for the next image.

To circumvent this, each swap chain image should have its own command buffer that backs it. Only when an image starts to be drawn again should it wait for the earlier pass to be finished.

Further information: Keeping your GPU fed
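The per-image scheme could be sketched as follows. Type and method names are hypothetical; a real implementation would block on the tracked fence value before re-recording, instead of just storing it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: one command buffer per swap chain image, each tracking
// the fence value of its last submission. A frame only needs to wait when its
// image (and hence its buffer) comes around again.
struct ImageCommandBuffer {
    std::uint64_t lastSubmittedFence = 0;
    bool recording = false;
};

class SwapChainCommandBuffers {
public:
    explicit SwapChainCommandBuffers(std::size_t imageCount) :
        m_buffers(imageCount) { }

    // Acquire the command buffer backing `imageIndex`. A real implementation
    // would wait on `lastSubmittedFence` here before re-recording.
    ImageCommandBuffer& begin(std::size_t imageIndex) {
        auto& buffer = m_buffers[imageIndex];
        buffer.recording = true;
        return buffer;
    }

    void submit(std::size_t imageIndex, std::uint64_t fenceValue) {
        auto& buffer = m_buffers[imageIndex];
        buffer.recording = false;
        buffer.lastSubmittedFence = fenceValue;
    }

private:
    std::vector<ImageCommandBuffer> m_buffers;
};
```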

Add support for debug objects.

To enhance debugging, it would be nice to use the APIs' debug object interfaces to provide more meaningful information, such as object names. Internally, this could be done using an IDebugObject interface that (when compiled in debug config) stores the name and provides a use method that sets the debug info.

  • This tutorial shows how to do this in Vulkan.
  • In DX12, this can be done using ID3D12Object::SetName.

Another feature would be debug markers, however, I am not quite sure how they work in DirectX.

Re-use command list over multiple render passes.

Currently, we bind the global descriptor heaps with each render pass, which is redundant and inefficient, since it might cause GPU flushes. We should instead re-use the same command list over all render passes in the DirectX 12 backend.

Support multiple render passes

Currently a pipeline only contains one render pass, which basically only allows implementing forward rendering. We should extend the library to support multiple render passes. This involves the following adjustments:

  • Move the shader program into the render pass.
  • Support initializing multiple render passes when building up a pipeline.
  • Besides beginFrame/endFrame, there should also be a method that advances from one render pass to the next.

Note that this basic implementation only supports simple render pass lists, whilst Vulkan (in theory) also supports sub-passes with more complex dependency layouts.

Allow multiple command buffers per frame buffer.

Currently each render pass records every draw call (and even more) into a single command buffer, that is stored in the active frame buffer. We should change this in order to allow for more efficient multi-threaded command recording.

The advantage of command buffers is that they are executed asynchronously. This means, once submitted, another command buffer can be recorded in parallel. Subsequent submissions are executed in order. However, recording the whole scene within a render pass into a single command buffer might be inefficient, since it may cause the GPU to stall and wait for the CPU to submit a new command buffer [1][2].

So instead we should host a pool of command buffers in each frame buffer instance. The number of command buffers should be configurable (depending on the number of threads). Furthermore, each render pass instance should host two command buffers: One for recording render pass begin work, and one for recording end work. Ending the render pass should check if all frame buffer command buffers are closed (only in debug mode) and then submit the command lists as a batch.

The DirectX 12 multi-threading example [3] should give a good introduction.

[1] https://developer.nvidia.com/dx12-dos-and-donts#worksubmit
[2] https://gpuopen.com/wp-content/uploads/2016/03/GDC_2016_D3D12_Right_On_Queue_final.pdf
[3] https://github.com/microsoft/DirectX-Graphics-Samples/blob/master/Samples/Desktop/D3D12Multithreading/

Improve storage buffers.

Currently buffers cannot be written to easily. The Vulkan storage buffer model does not map well to the DirectX UAV model. We should drop storage buffers as a form of "dynamically sized" constant buffers and make textures and buffers optionally writable instead.

Implement depth bounds testing.

This is a follow up to issue #17.

We should add support for depth bounds. The static pipeline state controls whether to use depth bounds or not, whilst the bounds themselves are part of the dynamic state. Support for this feature is optional and must be checked: in Vulkan, the VkPhysicalDeviceFeatures::depthBounds property is used for this; in DirectX 12, CheckFeatureSupport must be called. If the feature is not available, the static state must be disabled.

To implement this...

... in DX12:

  • Enable the depth bounds test in the pipeline state (the DepthBoundsTestEnable property of D3D12_DEPTH_STENCIL_DESC1).
  • Call OMSetDepthBounds on the command list to set the bounds.

In Vulkan:

  • Set depthBoundsTestEnable on the pipeline static state.
  • Use VK_DYNAMIC_STATE_DEPTH_BOUNDS/vkCmdSetDepthBounds to set dynamic state.

We could extend the DepthStencilState to cover this.

Support importing shader targets in application builds.

We should create a sample that uses the latest vcpkg registry to build a demo application. We should furthermore check, if shaders and resources are properly installed alongside the application when building using vcpkg.

More consistent property accessors.

We should redesign the property accessors to use a more consistent language. The following example uses the word RenderPass to demonstrate different usage scenarios:

  • getRenderPass() should be used if a copy of a member is created, for example when a pointer to an internal member is exposed.
  • setRenderPass() should be used if a member is exchanged.
  • renderPass() should be used if a reference to an internal member is acquired. We should limit this to members that are passed to/initialized in the constructor and are guaranteed to be initialized during the lifetime of the object.

If an object requires late initialization, it should expose an initialize method to which all required parameters are passed. This method can be called from a builder's go() function. All parameters must then be stored within the builder instance until they are passed to the actual object.

For example, the IRenderPipeline interface should look like this:

class LITEFX_RENDERING_API IRenderPipeline {
public:
    virtual ~IRenderPipeline() noexcept = default;
    
    // Required to create the object.
public:
    virtual const IRenderPass& renderPass() const noexcept = 0;

    // Lazy initialization (optional, if initialization can be done from the constructor).
public:
    virtual void initialize(UniquePtr<IRenderPipelineLayout>&& layout) = 0;
    virtual bool isInitialized() const noexcept = 0;

    // Properties that are available after initialization.
public:
    virtual const IRenderPipelineLayout* getLayout() const noexcept = 0;
    virtual IRenderPipelineLayout* getLayout() noexcept = 0;
};

Add support for texture arrays.

Currently, each texture resource is assumed to contain a single texture. We should implement support for texture arrays.

To support this, we should define methods in the descriptor set and graphics factory classes that facilitate multiple textures/samplers instead of individual ones. Also, we should add an overload to the descriptor set to bind multiple textures/samplers at once.

Add support for cube maps.

Cube maps are a special view on texture resources, which are currently unsupported. The support should contain:

  • Loading and sampling cube maps.
  • Drawing to individual sides of a cube map as a render target.

Use dynamic viewport states.

In Vulkan there are two ways of updating a viewport. Currently, only static viewports that are directly passed to the pipeline during creation are supported. The problem is that a resize then requires all pipelines to be recreated, which can be inefficient. Dynamic viewport states do not require the pipeline itself to be recreated, but only the swap chain. In this case, a vkCmdSetViewport call is made to set the viewport. The drawback of this approach, however, is that it might not be as efficient at runtime.

Since there are tradeoffs with both approaches, we should support them both.

Review Vulkan samples.

There are two issues with the current Vulkan samples:

  1. A minor issue causes the deferred rendering sample to draw the wrong colors.
  2. The textures sample crashes. If validation layers are activated, they report that a descriptor set is not initialized.

Implement descriptor arrays.

Currently only scalar descriptors can be used. We should implement support for binding arrays.

  • In DirectX 12, this is done by changing the root descriptor table in the pipeline layout.
  • In Vulkan, this can be done in the descriptor set layout.

Support back-end switching.

Currently the backend needs to be pre-defined when initializing the application. We should support initializing multiple backends in parallel, so that a user can switch backends on demand. Ideally, we can take over states or design the interface to be agnostic towards the underlying backend. Furthermore, we should create a sample that demonstrates a basic implementation.

Allow custom attachment mappings.

Currently the order of shader input and output attachments is fixed by the order they are defined. A render target should expose a location property, that allows for it to be explicitly ordered. The default behavior could remain to automatically increment the location when adding a render target to a list.

For input attachments, a RenderPassDependency object could be introduced that holds a reference to the previous render pass, as well as a map from render targets of previous render passes to input attachments of the current render pass. In order to do this, we could evaluate whether the render target actually belongs to an (indirect) dependency. This way, it would be possible to use render targets from multiple render passes as input attachments.
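A sketch of such a dependency object (all names hypothetical) could simply map render target locations of the source pass to input attachment locations of the current pass:

```cpp
#include <cassert>
#include <map>

struct RenderPass;  // The producing render pass (opaque here).

// Hypothetical sketch of the proposed RenderPassDependency: maps render target
// locations of a previous pass to input attachment locations of the current one.
struct RenderPassDependency {
    const RenderPass* source = nullptr;
    std::map<unsigned int, unsigned int> locationMap;   // Render target -> input attachment.

    // Returns the input attachment fed by a source render target, or -1 if the
    // target is not consumed by the current pass.
    int inputAttachmentFor(unsigned int renderTargetLocation) const {
        auto match = locationMap.find(renderTargetLocation);
        return match == locationMap.end() ? -1 : static_cast<int>(match->second);
    }
};
```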

Allow shaders to be embedded into binaries.

Following #41, we should allow loading shaders from resource streams and also provide a way to build shaders such that they are embedded into a binary. This would, for example, allow us to embed the blit shader into the DirectX 12 backend shared library.

Decouple pipelines from render passes.

Currently a render pass hosts exactly one pipeline instance. This should be changed so that a render pass can create multiple pipeline instances. This is important if one render pass draws objects that use different shader programs, which are closely coupled to a pipeline and a pipeline layout. The draw loop can then bind the pipeline based on whatever is required. A draw loop would then bind the following objects with increasing frequency:

  • Render Pass
  • Pipeline(s)
  • Descriptor Sets
  • Vertex/Index Buffers
  • Push Constants (if supported at some point)

Since a lot of data is shared between pipeline instances (e.g. viewports, descriptor set layouts and input assembler/rasterizer configurations), one approach would be to store those properties on the render pass and make them overridable when creating a pipeline. A more simplistic approach could simply provide a Clone function for pipeline instances.

Support resource aliasing.

Resource aliasing is basically supported by the memory allocators for both backends. However, currently there is no way of using it.
