Comments (5)

djkoloski commented on September 27, 2024

This is now released in version 0.4.0.

from rend.

djkoloski commented on September 27, 2024

This looks like an issue with the way that chars of non-native endianness are stored. During the test, an invalid char (a byte-swapped value that is not a valid Unicode scalar value) is created between construction and evaluation, which is likely the source of this issue. To fix this, we'll have to change LittleEndian<char> and BigEndian<char> to store the underlying value as a u32. Unfortunately, this will also inhibit any niching optimizations.
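As a rough illustration of the idea (not the actual rend implementation), here is a minimal sketch of a wrapper that keeps the byte-swapped value as a u32 and only converts back to char after swapping again. The name `SwappedChar` and the helper `niche_sizes` are hypothetical. The second half shows the niching cost mentioned above: `Option<char>` fits in 4 bytes, while an `Option` of the u32-backed wrapper needs a separate discriminant.

```rust
/// Hypothetical wrapper (not rend's API): holds a char's scalar value
/// byte-swapped, stored as a plain u32 so no invalid char ever exists.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(transparent)]
struct SwappedChar(u32);

impl SwappedChar {
    fn new(c: char) -> Self {
        SwappedChar((c as u32).swap_bytes())
    }

    fn get(self) -> char {
        // Swapping back restores the original scalar value, so the checked
        // conversion succeeds for anything built through `new`.
        char::from_u32(self.0.swap_bytes()).expect("value was built from a valid char")
    }
}

/// Sizes of Option<char> and Option<SwappedChar>, to show the lost niche.
fn niche_sizes() -> (usize, usize) {
    (
        std::mem::size_of::<Option<char>>(),
        std::mem::size_of::<Option<SwappedChar>>(),
    )
}

fn main() {
    let stored = SwappedChar::new('🎉');
    println!("round trip: {}", stored.get());
    println!("Option sizes (char, SwappedChar): {:?}", niche_sizes());
}
```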

nc7s commented on September 27, 2024

Ran some tests, see below:

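fn main() {
    let c = '🎉';
    dbg!(unsafe {
        (
            // 1: raw u32 value of the char
            std::mem::transmute::<_, [u8; 4]>(c as u32),
            // 2: the same value passed through char
            std::mem::transmute::<_, [u8; 4]>(core::char::from_u32_unchecked(c as u32)),
            // 3: byte-swapped u32
            std::mem::transmute::<_, [u8; 4]>((c as u32).swap_bytes()),
            // 4: byte-swapped value passed through char; the swapped value is
            // not a valid char, so this call is undefined behavior
            std::mem::transmute::<_, [u8; 4]>(core::char::from_u32_unchecked(
                (c as u32).swap_bytes()
            )),
            // 5: double-swapped u32 (back to the original value)
            std::mem::transmute::<_, [u8; 4]>((c as u32).swap_bytes().swap_bytes()),
            // 6: double-swapped value passed through char
            std::mem::transmute::<_, [u8; 4]>(core::char::from_u32_unchecked(
                (c as u32).swap_bytes().swap_bytes()
            )),
        )
    });
}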

The output on arm64 is:

[test.rs:3] unsafe {
    (std::mem::transmute::<_, [u8; 4]>(c as u32),
        std::mem::transmute::<_,
                [u8; 4]>(core::char::from_u32_unchecked(c as u32)),
        std::mem::transmute::<_, [u8; 4]>((c as u32).swap_bytes()),
        std::mem::transmute::<_,
                [u8; 4]>(core::char::from_u32_unchecked((c as
                            u32).swap_bytes())),
        std::mem::transmute::<_,
                [u8; 4]>((c as u32).swap_bytes().swap_bytes()),
        std::mem::transmute::<_,
                [u8; 4]>(core::char::from_u32_unchecked((c as
                                u32).swap_bytes().swap_bytes())))
} = (
    [ 137, 243, 1, 0, ],
    [ 137, 243, 1, 0, ],
    [ 0, 1, 243, 137, ],
    [ 0, 1, 243, 137, ],
    [ 137, 243, 1, 0, ],
    [ 137, 243, 1, 0, ],
)

But on riscv64 it's:

[test.rs:3] unsafe {
    (std::mem::transmute::<_, [u8; 4]>(c as u32),
     std::mem::transmute::<_,
                           [u8; 4]>(core::char::from_u32_unchecked(c as u32)),
     std::mem::transmute::<_, [u8; 4]>((c as u32).swap_bytes()),
     std::mem::transmute::<_,
                           [u8; 4]>(core::char::from_u32_unchecked((c as
                                                                        u32).swap_bytes())),
     std::mem::transmute::<_, [u8; 4]>((c as u32).swap_bytes().swap_bytes()),
     std::mem::transmute::<_,
                           [u8; 4]>(core::char::from_u32_unchecked((c as
                                                                        u32).swap_bytes().swap_bytes())))
} = (
    [ 137, 243, 1, 0, ],
    [ 137, 243, 1, 0, ],
    [ 0, 1, 243, 137, ],
    [ 0, 1, 243, 0, ],
    [ 137, 243, 1, 0, ],
    [ 137, 243, 1, 0, ],
)

In the 4th case (u32::swap_bytes() followed by char::from_u32_unchecked()), the 4th byte is lost on riscv64: it reads 0 instead of 137. The double-swapped case is unaffected.
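For reference, the checked conversion makes it easy to see that the swapped value in the 4th case is not a valid char at all, which is why passing it through char::from_u32_unchecked() is undefined behavior. A small sketch (the helper name `swapped_is_valid_char` is made up for illustration):

```rust
/// Hypothetical helper: true if the byte-swapped scalar value of `c`
/// is itself a valid char according to the checked conversion.
fn swapped_is_valid_char(c: char) -> bool {
    char::from_u32((c as u32).swap_bytes()).is_some()
}

fn main() {
    let c = '🎉';
    // The swapped value is far above char::MAX (0x10FFFF).
    println!("swapped: {:#010x}", (c as u32).swap_bytes());
    println!("valid char after swap: {}", swapped_is_valid_char(c));
}
```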

Environments:

  1. rustc installed through rustup on macOS arm64
    rustc 1.63.0 (4b91a6ea7 2022-08-08)
    binary: rustc
    commit-hash: 4b91a6ea7258a947e59c6522cd5898e7c0a6a88f
    commit-date: 2022-08-08
    host: aarch64-apple-darwin
    release: 1.63.0
    LLVM version: 14.0.5

  2. rustc installed from Debian unstable, in a chroot running under qemu-user-riscv64 on amd64
    rustc 1.59.0 # 1.59.0+dfsg1-2
    binary: rustc
    commit-hash: unknown
    commit-date: unknown
    host: riscv64gc-unknown-linux-gnu
    release: 1.59.0
    LLVM version: 13.0.1

djkoloski commented on September 27, 2024

Thanks for running these tests, they confirm my suspicions. I think the best approach would be to introduce a new trait with an associated type that defines the storage type for LittleEndian and BigEndian on non-native platforms. For char, this would be u32, so that the swapped value never passes through a char and causes UB. This will involve new bounds on the type parameter and so will require a version bump.
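To make the proposal concrete, here is a minimal sketch of what such a trait could look like. All names here (`Swappable`, `Storage`, `Swapped`) are hypothetical and chosen for illustration; the actual change in rend may be structured differently.

```rust
/// Hypothetical trait: maps a primitive to the type that holds its
/// byte-swapped representation.
trait Swappable: Copy {
    type Storage: Copy;
    fn to_swapped(self) -> Self::Storage;
    fn from_swapped(storage: Self::Storage) -> Self;
}

impl Swappable for u32 {
    type Storage = u32;
    fn to_swapped(self) -> u32 {
        self.swap_bytes()
    }
    fn from_swapped(storage: u32) -> u32 {
        storage.swap_bytes()
    }
}

impl Swappable for char {
    // Stored as u32: the swapped bits are never interpreted as a char.
    type Storage = u32;
    fn to_swapped(self) -> u32 {
        (self as u32).swap_bytes()
    }
    fn from_swapped(storage: u32) -> char {
        char::from_u32(storage.swap_bytes()).expect("storage was built from a valid char")
    }
}

/// Hypothetical wrapper; the `T: Swappable` bound is the kind of new
/// bound on the type parameter that makes this a breaking change.
struct Swapped<T: Swappable>(T::Storage);

impl<T: Swappable> Swapped<T> {
    fn new(value: T) -> Self {
        Swapped(value.to_swapped())
    }
    fn value(&self) -> T {
        T::from_swapped(self.0)
    }
}

fn main() {
    let c = Swapped::new('🎉');
    let n = Swapped::new(1u32);
    println!("{} {}", c.value(), n.value());
}
```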

djkoloski commented on September 27, 2024

Fixed by 6f75ee6, tested on riscv64gc-unknown-linux-gnu through cargo cross.
