Coder Social home page Coder Social logo

Comments (10)

floooh avatar floooh commented on August 15, 2024 1

Deleted my previous comment because I wrote rubbish :D But maybe the tex3d sample (https://github.com/floooh/sokol-samples/blob/master/sapp/tex3d-sapp.c) can still be helpful even if it doesn't dynamically upload data each frame, but maybe it can still be a good playground to investigate the issue.

One thing that's different from other 3D backends is that there's separate code to populate a new texture, or update an existing texture. For new textures this is the important code:

sokol/sokol_gfx.h

Lines 10408 to 10434 in c54523c

_SOKOL_PRIVATE void _sg_d3d11_fill_subres_data(const _sg_image_t* img, const sg_image_data* data) {
const int num_faces = (img->cmn.type == SG_IMAGETYPE_CUBE) ? 6:1;
const int num_slices = (img->cmn.type == SG_IMAGETYPE_ARRAY) ? img->cmn.num_slices:1;
int subres_index = 0;
for (int face_index = 0; face_index < num_faces; face_index++) {
for (int slice_index = 0; slice_index < num_slices; slice_index++) {
for (int mip_index = 0; mip_index < img->cmn.num_mipmaps; mip_index++, subres_index++) {
SOKOL_ASSERT(subres_index < (SG_MAX_MIPMAPS * SG_MAX_TEXTUREARRAY_LAYERS));
D3D11_SUBRESOURCE_DATA* subres_data = &_sg.d3d11.subres_data[subres_index];
const int mip_width = _sg_miplevel_dim(img->cmn.width, mip_index);
const int mip_height = _sg_miplevel_dim(img->cmn.height, mip_index);
const sg_range* subimg_data = &(data->subimage[face_index][mip_index]);
const size_t slice_size = subimg_data->size / (size_t)num_slices;
const size_t slice_offset = slice_size * (size_t)slice_index;
const uint8_t* ptr = (const uint8_t*) subimg_data->ptr;
subres_data->pSysMem = ptr + slice_offset;
subres_data->SysMemPitch = (UINT)_sg_row_pitch(img->cmn.pixel_format, mip_width, 1);
if (img->cmn.type == SG_IMAGETYPE_3D) {
// FIXME? const int mip_depth = _sg_miplevel_dim(img->depth, mip_index);
subres_data->SysMemSlicePitch = (UINT)_sg_surface_pitch(img->cmn.pixel_format, mip_width, mip_height, 1);
} else {
subres_data->SysMemSlicePitch = 0;
}
}
}
}
}

...while for updating an existing texture this code is used:

sokol/sokol_gfx.h

Lines 11448 to 11483 in c54523c

D3D11_MAPPED_SUBRESOURCE d3d11_msr;
for (int face_index = 0; face_index < num_faces; face_index++) {
for (int slice_index = 0; slice_index < num_slices; slice_index++) {
for (int mip_index = 0; mip_index < img->cmn.num_mipmaps; mip_index++, subres_index++) {
SOKOL_ASSERT(subres_index < (SG_MAX_MIPMAPS * SG_MAX_TEXTUREARRAY_LAYERS));
const int mip_width = _sg_miplevel_dim(img->cmn.width, mip_index);
const int mip_height = _sg_miplevel_dim(img->cmn.height, mip_index);
const int src_pitch = _sg_row_pitch(img->cmn.pixel_format, mip_width, 1);
const sg_range* subimg_data = &(data->subimage[face_index][mip_index]);
const size_t slice_size = subimg_data->size / (size_t)num_slices;
const size_t slice_offset = slice_size * (size_t)slice_index;
const uint8_t* slice_ptr = ((const uint8_t*)subimg_data->ptr) + slice_offset;
hr = _sg_d3d11_Map(_sg.d3d11.ctx, img->d3d11.res, subres_index, D3D11_MAP_WRITE_DISCARD, 0, &d3d11_msr);
_sg_stats_add(d3d11.num_map, 1);
if (SUCCEEDED(hr)) {
// FIXME: need to handle difference in depth-pitch for 3D textures as well!
if (src_pitch == (int)d3d11_msr.RowPitch) {
memcpy(d3d11_msr.pData, slice_ptr, slice_size);
} else {
SOKOL_ASSERT(src_pitch < (int)d3d11_msr.RowPitch);
const uint8_t* src_ptr = slice_ptr;
uint8_t* dst_ptr = (uint8_t*) d3d11_msr.pData;
for (int row_index = 0; row_index < mip_height; row_index++) {
memcpy(dst_ptr, src_ptr, (size_t)src_pitch);
src_ptr += src_pitch;
dst_ptr += d3d11_msr.RowPitch;
}
}
_sg_d3d11_Unmap(_sg.d3d11.ctx, img->d3d11.res, subres_index);
_sg_stats_add(d3d11.num_unmap, 1);
} else {
_SG_ERROR(D3D11_MAP_FOR_UPDATE_IMAGE_FAILED);
}
}
}
}

...both loop over face_index, slice_index and mip_index.

It could be that the bug is in both places, or only in the update-image code, maybe it makes sense to share a bit of code between those two places (not sure yet if that makes sense though).

In the Metal and WebGPU backends, creating and updating a texture uses a common helper function:

  • Metal:

    sokol/sokol_gfx.h

    Lines 12288 to 12334 in c54523c

    _SOKOL_PRIVATE void _sg_mtl_copy_image_data(const _sg_image_t* img, __unsafe_unretained id<MTLTexture> mtl_tex, const sg_image_data* data) {
    const int num_faces = (img->cmn.type == SG_IMAGETYPE_CUBE) ? 6:1;
    const int num_slices = (img->cmn.type == SG_IMAGETYPE_ARRAY) ? img->cmn.num_slices : 1;
    for (int face_index = 0; face_index < num_faces; face_index++) {
    for (int mip_index = 0; mip_index < img->cmn.num_mipmaps; mip_index++) {
    SOKOL_ASSERT(data->subimage[face_index][mip_index].ptr);
    SOKOL_ASSERT(data->subimage[face_index][mip_index].size > 0);
    const uint8_t* data_ptr = (const uint8_t*)data->subimage[face_index][mip_index].ptr;
    const int mip_width = _sg_miplevel_dim(img->cmn.width, mip_index);
    const int mip_height = _sg_miplevel_dim(img->cmn.height, mip_index);
    // special case PVRTC formats: bytePerRow and bytesPerImage must be 0
    int bytes_per_row = 0;
    int bytes_per_slice = 0;
    if (!_sg_mtl_is_pvrtc(img->cmn.pixel_format)) {
    bytes_per_row = _sg_row_pitch(img->cmn.pixel_format, mip_width, 1);
    bytes_per_slice = _sg_surface_pitch(img->cmn.pixel_format, mip_width, mip_height, 1);
    }
    /* bytesPerImage special case: https://developer.apple.com/documentation/metal/mtltexture/1515679-replaceregion
    "Supply a nonzero value only when you copy data to a MTLTextureType3D type texture"
    */
    MTLRegion region;
    int bytes_per_image;
    if (img->cmn.type == SG_IMAGETYPE_3D) {
    const int mip_depth = _sg_miplevel_dim(img->cmn.num_slices, mip_index);
    region = MTLRegionMake3D(0, 0, 0, (NSUInteger)mip_width, (NSUInteger)mip_height, (NSUInteger)mip_depth);
    bytes_per_image = bytes_per_slice;
    // FIXME: apparently the minimal bytes_per_image size for 3D texture is 4 KByte... somehow need to handle this
    } else {
    region = MTLRegionMake2D(0, 0, (NSUInteger)mip_width, (NSUInteger)mip_height);
    bytes_per_image = 0;
    }
    for (int slice_index = 0; slice_index < num_slices; slice_index++) {
    const int mtl_slice_index = (img->cmn.type == SG_IMAGETYPE_CUBE) ? face_index : slice_index;
    const int slice_offset = slice_index * bytes_per_slice;
    SOKOL_ASSERT((slice_offset + bytes_per_slice) <= (int)data->subimage[face_index][mip_index].size);
    [mtl_tex replaceRegion:region
    mipmapLevel:(NSUInteger)mip_index
    slice:(NSUInteger)mtl_slice_index
    withBytes:data_ptr + slice_offset
    bytesPerRow:(NSUInteger)bytes_per_row
    bytesPerImage:(NSUInteger)bytes_per_image];
    }
    }
    }
    }
  • WebGPU:

    sokol/sokol_gfx.h

    Lines 14484 to 14528 in c54523c

    _SOKOL_PRIVATE void _sg_wgpu_copy_image_data(const _sg_image_t* img, WGPUTexture wgpu_tex, const sg_image_data* data) {
    WGPUTextureDataLayout wgpu_layout;
    _sg_clear(&wgpu_layout, sizeof(wgpu_layout));
    WGPUImageCopyTexture wgpu_copy_tex;
    _sg_clear(&wgpu_copy_tex, sizeof(wgpu_copy_tex));
    wgpu_copy_tex.texture = wgpu_tex;
    wgpu_copy_tex.aspect = WGPUTextureAspect_All;
    WGPUExtent3D wgpu_extent;
    _sg_clear(&wgpu_extent, sizeof(wgpu_extent));
    const int num_faces = (img->cmn.type == SG_IMAGETYPE_CUBE) ? 6 : 1;
    for (int face_index = 0; face_index < num_faces; face_index++) {
    for (int mip_index = 0; mip_index < img->cmn.num_mipmaps; mip_index++) {
    wgpu_copy_tex.mipLevel = (uint32_t)mip_index;
    wgpu_copy_tex.origin.z = (uint32_t)face_index;
    int mip_width = _sg_miplevel_dim(img->cmn.width, mip_index);
    int mip_height = _sg_miplevel_dim(img->cmn.height, mip_index);
    int mip_slices;
    switch (img->cmn.type) {
    case SG_IMAGETYPE_CUBE:
    mip_slices = 1;
    break;
    case SG_IMAGETYPE_3D:
    mip_slices = _sg_miplevel_dim(img->cmn.num_slices, mip_index);
    break;
    default:
    mip_slices = img->cmn.num_slices;
    break;
    }
    const int row_pitch = _sg_row_pitch(img->cmn.pixel_format, mip_width, 1);
    const int num_rows = _sg_num_rows(img->cmn.pixel_format, mip_height);
    if (_sg_is_compressed_pixel_format(img->cmn.pixel_format)) {
    mip_width = _sg_roundup(mip_width, 4);
    mip_height = _sg_roundup(mip_height, 4);
    }
    wgpu_layout.offset = 0;
    wgpu_layout.bytesPerRow = (uint32_t)row_pitch;
    wgpu_layout.rowsPerImage = (uint32_t)num_rows;
    wgpu_extent.width = (uint32_t)mip_width;
    wgpu_extent.height = (uint32_t)mip_height;
    wgpu_extent.depthOrArrayLayers = (uint32_t)mip_slices;
    const sg_range* mip_data = &data->subimage[face_index][mip_index];
    wgpuQueueWriteTexture(_sg.wgpu.queue, &wgpu_copy_tex, mip_data->ptr, mip_data->size, &wgpu_layout, &wgpu_extent);
    }
    }
    }

...which is a lot cleaner (but Metal and WebGPU are different in that the texture object is created without data, and the data is copied into the object after creation), but as I said above, not sure how much the two code paths can be unified in the D3D11 backend.

from sokol.

jakubtomsu avatar jakubtomsu commented on August 15, 2024 1

Thanks for the example. In my case it's about uploading dynamic texture every few frames but it's still handy.

I just implemented the code for uploading the texture directly with d3d11, and it works correctly now. But I found out update_image works fine with size 128^3 and up, but didn't work for size 32^3.

This is the uploading code I ended up with, note the difference between dst_offset and src_offset. I believe update_image should have something like src_depth_pitch.

        // FIXME:
        // sg.update_image(
        //     ren.blood_volume_img,
        //     sg.Image_Data{subimage = {0 = {0 = {ptr = &game.blood, size = size_of(game.blood)}}}},
        // )

        img := sg.d3d11_query_image_info(ren.blood_volume_img)
        res := transmute(^d3d11.IResource)img.res

        msr: d3d11.MAPPED_SUBRESOURCE
        hr := d3d11_device_context->Map(res, 0, .WRITE_DISCARD, {}, &msr)
        fmt.println(hr)
        fmt.println(msr)
        fmt.println(img)
        if hr == 0 {
            defer d3d11_device_context->Unmap(res, 0)

            // NOTE: this code assumes a pixel is one byte
            for z in 0 ..< BLOOD_BOUNDS_Z {
                for y in 0 ..< BLOOD_BOUNDS_Y {
                    dst_offset := uintptr(z * int(msr.DepthPitch) + y * int(msr.RowPitch))
                    src_offset := uintptr(z * BLOOD_BOUNDS_X * BLOOD_BOUNDS_Y + y * BLOOD_BOUNDS_X)
                    runtime.mem_copy(
                        rawptr(uintptr(msr.pData) + dst_offset),
                        rawptr(uintptr(&game.blood) + src_offset),
                        BLOOD_BOUNDS_X * size_of(u8),
                    )
                }
            }
        }

from sokol.

floooh avatar floooh commented on August 15, 2024 1

Okidoki, thanks a lot for the investigation and sample code! I'll try to set aside some time next week for a new sample/regression-test and fixing the issue.

In general, the idea for uploading texture data is that the incoming data should be tightly packed (e.g. no pitch-related gaps between rows and slices - because different backends have different requirements for those anyway), and sokol_gfx.h will take care of 'unwrapping' the data, even if that means doing small per-row memcpy's.

At least that's the idea for now.

At some point in the future I want to make the resource update functions more orthogonal, first by providing more orthogonal functions, at least:

  • copy-buffer-to-buffer
  • write-mem-to-buffer
  • copy-image-to-image
  • write-mem-to-image
  • ...?

...and ideally also allowing to update only parts of the resource, and with custom row/slice pitch - and implementing "fast paths" if those things match the backend API expectations (e.g. doing a single big copy instead of many small copies).

...it'll be a while until I get to that though, probably after the computer-shader stuff...

from sokol.

floooh avatar floooh commented on August 15, 2024 1

Ok, I can reproduce the issue, and also confirm that it only seems affects the D3D11 backend (not 100% sure though because the texture size where the problem manifests seems to differ between hardware configs, see below).

Interestingly, on my Intel GPU laptop, the cutoff size is 32x32 (with RGBA8 pixels). A RGBA8 3D texture of size 32x32x3 still works (e.g. one depth slice is 32 * 32 * sizeof(uint32_t) = 4096 bytes. The required depth-pitch 4096 matches the slice size.

However going down to a 16x16x3 textures breaks. D3D11 reports a required depth-pitch of 2048, while one slice is 1024 bytes.

I'll work on a fix over the next few evenings (had to squeeze an unexpected PR and fix for sokol-shdc earlier today).

from sokol.

floooh avatar floooh commented on August 15, 2024 1

Ok, I extended the sample to test all sorts of texture sizes and also dynamic vs immutable 3D textures (this is recorded on Mac with the Metal backend, but I also tested on my Windows laptop with Intel GPU, and gaming PC with nvidia rtx2070. Merge and website update incoming (I also stumbled over an unrelated problem in the WebGPU backend, but that's not important enough to delay the merge: #1066)

Screen.Recording.2024-06-19.at.19.24.02.mov

from sokol.

floooh avatar floooh commented on August 15, 2024 1

Ok, merged. Next up is merging the new sample and updating the sample web pages.

Thanks for the bug report!

from sokol.

jakubtomsu avatar jakubtomsu commented on August 15, 2024 1

Fantastic! Thank you so much for your work:)

from sokol.

jakubtomsu avatar jakubtomsu commented on August 15, 2024

Sounds good, thank you! Fixing this would be great. And more modern uploading and writing functions would be nice to have as well, but I agree the compute stuff is higher priority.

from sokol.

floooh avatar floooh commented on August 15, 2024

...starting to look into this now. Might take a couple of days because I'll build a new sample around it.

from sokol.

floooh avatar floooh commented on August 15, 2024

Ok, fixed in #1063, but before merging I want to make the new sample in sokol-samples a bit more flexible, and also do a bit more testing (ETA tomorrow or the day after).

from sokol.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.