memmap2-rs's Introduction

memmap2-rs's People

Contributors

01mf02, adamreichold, adsteele916, alexanderkjall, badboy, bugaevc, cberner, censoredusername, cldershem, danburkert, dekellum, diwic, gabrfarina, ho-229, jesse-bakker, jonas-schievink, jsgf, lingman, lvella, mkeeter, mripard, mrmaxmeier, newpavlov, nyurik, phantomical, razrfalcon, saethlin, samueltardieu, simonsapin, timvisee

memmap2-rs's Issues

Adding support for MAP_LOCK and mlock

Around the same time as this fork, another fork called mapr was started, which implemented MAP_PRIVATE, MAP_LOCK and mlock. All use cases for MAP_PRIVATE are now covered by memmap2 through function calls. Thanks for pointing that out at #50 (comment).

What memmap2 is still missing from that fork is MAP_LOCK and mlock. Before I open another PR, I'd like to check what the best approach would be, hence this issue.

The only use case for MAP_LOCK we currently have in our code base is in combination with map_anon. Should it only be added there? Or should it be added to all methods (similar to MAP_POPULATE, as I did in #50)?

I think for adding mlock() there aren't any questions.

error[E0433]: failed to resolve: could not find `Advice` in `memmap2`

I'm trying to build a Rust project on Windows 11, but the build fails with an error.

I've never built anything in Rust before, so I don't know what I should troubleshoot.

cargo version: cargo 1.74.1 (ecb9851af 2023-10-18)

rustup 1.26.0 (5af9b9484 2023-04-05)
info: This is the version for the rustup toolchain manager, not the rustc compiler.
info: The currently active `rustc` version is `rustc 1.74.1 (a28077b28 2023-12-04)`
error[E0433]: failed to resolve: could not find `Advice` in `memmap2`
  --> src\main.rs:83:30
   |
83 |         mmap.advise(memmap2::Advice::HugePage).unwrap();
   |                              ^^^^^^ could not find `Advice` in `memmap2`

error[E0599]: no method named `advise` found for struct `memmap2::Mmap` in the current scope
  --> src\main.rs:83:14
   |
83 |         mmap.advise(memmap2::Advice::HugePage).unwrap();
   |              ^^^^^^ method not found in `Mmap`

Some errors have detailed explanations: E0433, E0599.
For more information about an error, try `rustc --explain E0433`.
error: could not compile `parser` (bin "parser") due to 2 previous errors

When should I use memmap2?

I am hoping the discussion we have here could make it into the project's README eventually, so I'll try to keep it general rather than specific to my use case.

The problem: I keep coming back to memmap2 as an option for my use case, but I remain unsure whether it fits.

The current situation is as follows:

Problem 1: There is a "primary process" which generates information that we want to keep, but keeping it all on the RAM is not feasible.

Question 1: does Problem 1 have the right shape for memmap2 to be considered? If not, what is the right shape of problem for which memmap should be considered? After all, here's a non-memmap solution:

Solution 1 (file):

  • keep a buffer of the generated data

  • and when the buffer is full, flush the data via mpsc channels to a writer process. The writer process has a handle to an open file, into which it writes received data to the hard disk using <some binary format> + serde, while the primary process keeps generating data.
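As a rough illustration of Solution 1, here is a minimal sketch that uses only the standard library. Several things in it are assumptions rather than part of the question: the function name `generate_and_write` is made up, the "writer process" is modelled as a thread, the generated data is a plain sequence of `u64` values, and raw little-endian bytes stand in for "<some binary format> + serde".

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::{Path, PathBuf};
use std::sync::mpsc;
use std::thread;

/// Generates `total` values, collects them into buffers of `batch` values and
/// hands each full buffer to a writer thread over an mpsc channel.
/// Returns the number of bytes the writer thread wrote to `path`.
fn generate_and_write(path: &Path, total: u64, batch: usize) -> io::Result<u64> {
    let (tx, rx) = mpsc::channel::<Vec<u64>>();
    let path: PathBuf = path.to_path_buf();

    // Writer thread: owns the file handle and drains the channel until
    // every sender has been dropped.
    let writer = thread::spawn(move || -> io::Result<u64> {
        let mut out = BufWriter::new(File::create(path)?);
        let mut written = 0u64;
        for chunk in rx {
            for value in chunk {
                // Assumption: fixed-width little-endian encoding.
                out.write_all(&value.to_le_bytes())?;
                written += 8;
            }
        }
        out.flush()?;
        Ok(written)
    });

    // Primary thread: keeps generating while the writer works on older buffers.
    let mut buffer = Vec::with_capacity(batch);
    for value in 0..total {
        buffer.push(value);
        if buffer.len() >= batch {
            // Swap in a fresh buffer and send the full one to the writer.
            let full = std::mem::replace(&mut buffer, Vec::with_capacity(batch));
            if tx.send(full).is_err() {
                break; // the writer stopped early; its error is returned below
            }
        }
    }
    if !buffer.is_empty() {
        let _ = tx.send(buffer);
    }
    drop(tx); // closing the channel lets the writer loop end

    writer.join().expect("writer thread panicked")
}
```

With an unbounded channel like this one, the primary thread never blocks on `send`, so a slow disk shows up as growing memory use in the channel rather than as waiting time. A bounded `mpsc::sync_channel` would trade that for back-pressure.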

On the other hand, here is a memmap2 solution:

Solution 2 (memmap):

  • keep generated information in a memory mapped structure.

Question 2: Is the following true? "The benefit of Solution 2 (memmap) over Solution 1 (file) is that we do not have to deal with the overhead of inter-thread communication. Put differently, the primary process does not have to wait for a buffer flush + send to complete before continuing to generate data."

Question 3: Does Solution 2 make sense if you have a hard disk with slower write speed than the rate at which the primary process generates data?

One could also imagine the following solution:

Solution 3 (memmap, parallel):

  • keep a buffer of the generated data

  • and have a main, memory mapped structure which will hold all the generated data

  • this main memory mapped structure is kept by a writer process, which is sent information by the primary process using mpsc channels, which it then "appends" to the data in the memory mapped structure it is holding

Question 4: Is the following statement true? "An advantage of Solution 3 is that, if the hard disk's write speed is slower than the rate at which the primary process generates data, Solution 3 essentially covers up this issue and replaces the cost with that of waiting for a buffer flush + send to complete."

Question 5: Is the following statement true? "The main benefit of memory mapping is to avoid the cost of <binary format> encoding/decoding."

(Thank you for your time.)

Implement AsRawFd, IntoRawFd, AsRawHandle, IntoRawHandle and Into<Stdio> for Mmap

I would like to use memory mapped files to communicate with a child process stdin/stdout. I've already tested the idea with the crate memfile and the benchmark results are promising. However that crate does not offer Windows support, which is also something that I'm looking for.
To use the (anonymous) memory mapped file as stdin for a child process with the memfile crate, I can use the From<MemFile> trait impl for Stdio. It would be great if this library added this impl too. There are multiple traits that could be implemented:

  • Into<Stdio>
  • AsRawFd / IntoRawFd
  • AsRawHandle / IntoRawHandle

Example usage with this feature implemented:

fn write_using_memmap2(buf: &[u8]) {
    let mut mmap = MmapOptions::new().len(buf.len()).map_anon().unwrap();
    mmap.copy_from_slice(buf);
    let mmap = mmap.make_read_only().unwrap();
    
    Command::new("nul")
        .stderr(Stdio::null())
        .stdout(Stdio::null())
        .stdin(Stdio::from(mmap)) // <-- Using the proposed From<Mmap> impl
        .spawn().unwrap()
        .wait().unwrap();
}

Merging changes with memmapix

Hi, I noticed that @al8n has made some changes in their fork https://github.com/al8n/memmapix, while this repo has also done some diverging work. This repo clearly has far more downloads and a larger community following, so I wonder if it would be possible to merge some or all of the memmapix changes into this repo. Otherwise, what are your future plans for this repo?

Both efforts clearly benefit the community and are highly appreciated. Thank you all!

P.S. The memmapix repo doesn't have an issues tab for some reason, so I couldn't ask there.

PermissionDenied, message: "Access is denied." when removing a file after read on GitHub Actions Windows

I'm trying to test a Rust project on GitHub Actions, but it yields an error only on Windows.

Everything works on my desktop and on the GitHub Actions Linux/macOS runners, so I don't know what I should troubleshoot.

    #[test]
    fn file_remove() {
        let tempdir = tempfile::tempdir().unwrap();
        let path = tempdir.path().join("mmap");

        let mut file = OpenOptions::new()
            .read(true)
            .write(true)
            .create(true)
            .open(path.clone())
            .unwrap();
        file.set_len(128).unwrap();

        let mmap = unsafe { Mmap::map(&file) };

        assert!(mmap.is_ok());

        let remove_res = fs::remove_file(path.clone());
        if remove_res.is_err() {
            println!("remove_res: {:?}", remove_res);
        }
        assert!(remove_res.is_ok());
    }

https://github.com/LokiSharp/memmap2-rs/blob/00de205e006aabb69e03172c04da869c18cd776f/src/lib.rs#L1575

---- test::file_remove stdout ----
remove_res: Err(Os { code: 5, kind: PermissionDenied, message: "Access is denied." })
thread 'test::file_remove' panicked at src\lib.rs:1595:9:
assertion failed: remove_res.is_ok()
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

https://github.com/LokiSharp/memmap2-rs/actions/runs/7223354238/job/19682434953

macOS support

It is not mentioned in the repo, but cargo build and cargo test run OK on macOS.

So is macOS officially supported without issues?

Mapping beyond 4GB offset is broken on 32-bit glibc (Linux)

This is a direct consequence of this issue in Rust's libc crate. When Rust targets 32-bit glibc, libc::off_t is 32 bits wide, which breaks this code if you try to map beyond 4 GB in a file:

            let ptr = libc::mmap(
                ptr::null_mut(),
                map_len as libc::size_t,
                prot,
                flags,
                file,
                aligned_offset as libc::off_t,
            );

I had a similar issue with the nix crate, where posix_fallocate() also relied on libc::off_t for file sizes. The fix was to use the rustix crate, which calls the system call directly. Maybe a similar workaround could be applied here?
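To make the failure mode concrete, here is a small sketch of what an `as` cast to a 32-bit signed type does to a large offset. It assumes `i32` as a stand-in for a 32-bit `libc::off_t`. The helper names are hypothetical and nothing here calls into libc.

```rust
use std::convert::TryFrom;

/// What a plain `as` cast does: the upper 32 bits are silently discarded.
/// `i32` stands in for a 32-bit `off_t` (assumption for illustration).
fn truncating_off32(offset: u64) -> i32 {
    offset as i32
}

/// A checked conversion that reports offsets the 32-bit type cannot hold
/// instead of mapping the wrong part of the file.
fn checked_off32(offset: u64) -> Option<i32> {
    i32::try_from(offset).ok()
}
```

For example, an offset of 5 GiB silently becomes 1 GiB after the cast, whereas the checked variant returns `None` and lets the caller fail with an error.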

Composite memory maps currently impossible

I want to create a thread-local memory map (could be file-based or anonymous, I'd prefer anonymous) in each of N threads, where the first M bytes of each thread-local mapping are mapped to the same file (which is itself mapped into memory). I know this allocation scheme is possible using plain mmap in Linux, but I can't find a way to do it in memmap2-rs. That is, there is no way to map the buffer of an MmapMut into the buffer of another Mmap* struct.

Am I wrong to believe this or is it truly impossible under the current API?

Better terminology re: safety

The docs for MmapRaw::as_ptr() and MmapRaw::as_mut_ptr() both say:

Safety

To safely dereference this pointer, you need to make sure that the file has not been truncated since the memory map was created.

However, this isn't true in the Rust sense of the word safety: if the file is truncated, the memory mapping itself is not changed. So accessing that memory will simply cause the process to be killed with SIGBUS(1) or equivalent. That's not actually a memory safety issue as no memory will actually be read or written.

To avoid confusion, I think it'd be good to keep this warning, but rename the section to something else and describe why truncation isn't directly a memory safety issue.

  1. Modulo the rather advanced topic of catching SIGBUS.

memmap2 `advise` is unsound

I think this only applies to the read-only version, but advise can conceptually perform a write and should require mutable access. Here's a simple reproducer.

use memmap2::*;

fn main() {
    let mut a = MmapMut::map_anon(4096).unwrap();
    a.as_mut().fill(255);
    let a = a.make_read_only().unwrap();

    bar(&a, a.as_ref());
}

fn bar(map: &Mmap, slic: &[u8]) {
    let a = slic[0];
    map.advise(Advice::DontNeed);
    let b = slic[0];

    println!("{} {}", a, b);
}

In debug, this prints out 255 0, as DontNeed has freed the pages. In release, this prints out 255 255. Needless to say, this program has UB because it changes the memory an immutable reference points to while that reference is active.

Why the fork?

I'm wondering if there's a reason for the fork, and upon what basis I should choose to use memmap v1 versus v2?

Support POSIX MAP_FIXED

Would it be possible to support this flag? Currently the much older crate mmap supports it, but it hasn't been updated in 6 years (!).

`make_exec` should clear icache

On a Mac with an M1 processor, reusing an anonymous memory map can execute the previous code, resulting in seemingly-impossible failure modes.

Here's a minimal reproduction case, alternating between

mov x0 #1
ret

or

mov x0 #2
ret

and asserting that they return the correct value:

fn main() {
    let mov_x0_1: u32 = 0x200080D2; // MOV x0, #1
    let mov_x0_2: u32 = 0x400080D2; // MOV x0, #2
    let ret: u32 = 0xC0035FD6; // RET
    let mut mmap = memmap2::MmapMut::map_anon(8).unwrap();
    mmap[4..8].copy_from_slice(&ret.to_be_bytes());

    for i in 0..100 {
        // Build and execute a function that should return 1
        mmap[0..4].copy_from_slice(&mov_x0_1.to_be_bytes());
        let mmap_exec = mmap.make_exec().unwrap();
        let f: unsafe extern "C" fn() -> u32 = unsafe {
            std::mem::transmute(mmap_exec.as_ptr())
        };
        assert_eq!(unsafe { f() }, 1);
        mmap = mmap_exec.make_mut().unwrap();

        // Build and execute a function that should return 2
        mmap[0..4].copy_from_slice(&mov_x0_2.to_be_bytes());
        let mmap_exec = mmap.make_exec().unwrap();
        let f: unsafe extern "C" fn() -> u32 = unsafe {
            std::mem::transmute(mmap_exec.as_ptr())
        };
        assert_eq!(unsafe { f() }, 2);
        mmap = mmap_exec.make_mut().unwrap();
    }
}

This fails because the icache and dcache are separated, so the modified program written to the buffer will not be used during evaluation. ARM has a 2013 blog post explaining the issue, which makes me suspect this is not unique to the M1 and will be an issue on any ARM processor.

On macOS, this cache is flushed by sys_icache_invalidate. For reference, here's a fixed version of this program:

use std::ffi::c_void;

#[link(name = "c")]
extern "C" {
    fn sys_icache_invalidate(start: *const std::ffi::c_void, size: usize);
}

fn main() {
    let mov_x0_1: u32 = 0x200080D2; // MOV x0, #1
    let mov_x0_2: u32 = 0x400080D2; // MOV x0, #2
    let ret: u32 = 0xC0035FD6; // RET
    let mut mmap = memmap2::MmapMut::map_anon(8).unwrap();
    mmap[4..8].copy_from_slice(&ret.to_be_bytes());

    for i in 0..100 {
        // Build and execute a function that should return 1
        mmap[0..4].copy_from_slice(&mov_x0_1.to_be_bytes());
        unsafe {
            sys_icache_invalidate(mmap.as_ptr() as *const c_void, mmap.len())
        };

        let mmap_exec = mmap.make_exec().unwrap();
        let f: unsafe extern "C" fn() -> u32 =
            unsafe { std::mem::transmute(mmap_exec.as_ptr()) };
        assert_eq!(unsafe { f() }, 1);
        mmap = mmap_exec.make_mut().unwrap();

        // Build and execute a function that should return 2
        mmap[0..4].copy_from_slice(&mov_x0_2.to_be_bytes());
        unsafe {
            sys_icache_invalidate(mmap.as_ptr() as *const c_void, mmap.len())
        };

        let mmap_exec = mmap.make_exec().unwrap();
        let f: unsafe extern "C" fn() -> u32 =
            unsafe { std::mem::transmute(mmap_exec.as_ptr()) };
        assert_eq!(unsafe { f() }, 2);
        mmap = mmap_exec.make_mut().unwrap();
    }
}

I'm not totally sure of the right fix here!

Adding a call to sys_icache_invalidate (and the Linux equivalent) will fix the issue, but that is not the recommended way of doing W^X on macOS:

Porting Just-In-Time Compilers to Apple Silicon suggests creating the region with both PROT_WRITE, PROT_EXEC, and the MAP_JIT flag, then using pthread_jit_write_protect_np to modify write-protection on a per-thread (?!) basis.

This would be a significant change from the common unix codebase!

Alignment issue on Windows with empty file

The Windows implementation has a problem when trying to create a mmap for an empty file.

Normally, Windows creates a mmap that is aligned in memory. But there's a catch. An empty file gets a pointer to address 1 (0x00000001), which is not aligned.

In Qdrant this was problematic, because we always expected an aligned mmap. We therefore handle this edge case, returning a dangling pointer that is aligned. This is fine because there's no data at the pointer, the byte slice is empty.
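The workaround described above could look roughly like the following sketch. It is not the Qdrant or memmap2 code: the 4096-byte alignment, the `PageAligned` marker type and both helper names are assumptions chosen for illustration.

```rust
use std::ptr::NonNull;
use std::slice;

/// Zero-sized marker type whose only purpose is to carry the alignment the
/// caller expects (assumption: 4096 bytes).
#[repr(align(4096))]
struct PageAligned([u8; 0]);

/// Keeps `ptr` for non-empty mappings and substitutes a dangling but aligned
/// pointer for empty ones.
fn aligned_ptr_or_dangling(ptr: *const u8, len: usize) -> *const u8 {
    if len == 0 {
        NonNull::<PageAligned>::dangling().as_ptr() as *const u8
    } else {
        ptr
    }
}

/// Builds an empty byte slice on top of the aligned dangling pointer.
fn empty_aligned_slice<'a>() -> &'a [u8] {
    let ptr = aligned_ptr_or_dangling(1 as *const u8, 0);
    // SAFETY: a zero-length slice only requires a non-null, aligned pointer;
    // no memory is ever read through it.
    unsafe { slice::from_raw_parts(ptr, 0) }
}
```

Because the slice has length zero, the pointer is never dereferenced, which is why a dangling value is acceptable here.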

Can I submit a PR to implement a similar fix; return an aligned pointer on Windows? Or is the current behavior desirable, and should I not expect to get an aligned pointer?


I'm unsure why this happens on Windows, and couldn't find anything about this in the Windows API documentation. I assume it is an error code, where returning a null pointer is not an option so it picks the next address. This is very easily reproducible however.

This only happens on Windows. Unix-like platforms don't have this problem and return an aligned pointer.

use memmap2::Mmap;
use std::fs::File;

fn main() {
    let file = File::options()
        .read(true)
        .write(true)
        .create(true)
        .open("./empty-file.txt")
        .unwrap();
    file.set_len(0).unwrap();

    let mmap = unsafe { Mmap::map(&file).unwrap() };

    println!("Mmap pointer: {:?}", mmap.as_ptr());
}

# Windows
$ cargo run
Mmap pointer: 0x000000000001

# Linux, macOS
$ cargo run
Mmap pointer: 0x7f51653c1000

Related: qdrant/qdrant#1873
CC: @Jesse-Bakker

mmap'd anon pages fail in io_uring IOSQE_BUFFER_SELECT usage

This is half a question, and half a bug report.

I have some test code that goes through several permutations of how to execute io_uring operations using various buffer strategies. One strategy (my default, actually) kept failing with -14 Bad Address / EFAULT errors.

Tracking it down, it seemed that only buffers from mmap'd pages caused this. Digging further, it was only mmap'd pages that had been allocated via memmap2(?)

    let adjusted_buf_size = size.checked_next_power_of_two().unwrap();
    let buf_layout = std::alloc::Layout::from_size_align(adjusted_buf_size, PAGE_ALIGN).unwrap();
    let buf_total = buf_layout.size().checked_mul(count).unwrap();

    let buf_ptr = if OLD_SKOOL {
        let buf_ptr = libc::mmap(
            std::ptr::null_mut(),
            buf_total,
            libc::PROT_WRITE | libc::PROT_READ,
            libc::MAP_ANON | libc::MAP_POPULATE | libc::MAP_PRIVATE,
            -1,
            0,
        );
        if buf_ptr == libc::MAP_FAILED {
            panic!("mmap failed");
        }
        buf_ptr.cast()
    } else {
        let mut opts = MmapOptions::new();
        let mut buf = opts.populate().len(buf_total).map_anon().unwrap();
        buf.as_mut_ptr()
    };
    // belt and suspenders
    std::ptr::write_bytes(buf_ptr, 0, buf_total);

When OLD_SKOOL is true, everything works as planned. When OLD_SKOOL is false, all reads via io_uring fail with -14 Bad Address.

Before the registration of each buffer, I assert that they are aligned to 4096. I am baffled as to what the difference in approaches might be.
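One difference worth ruling out, offered as an assumption rather than a confirmed diagnosis: in the memmap2 branch, the mapping bound to `buf` goes out of scope at the end of the `else` block, and memmap2 mappings are unmapped when they are dropped, so `buf_ptr` may end up pointing at memory that is no longer mapped. The raw `libc::mmap` branch has no owner and is never unmapped. The sketch below shows the scoping effect with a hypothetical stand-in type (`FakeMap`) instead of a real mapping.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical stand-in for an owned mapping: it flips `mapped` to `false`
/// when dropped, mimicking a type that releases its memory in `Drop`.
struct FakeMap {
    mapped: Rc<Cell<bool>>,
}

impl FakeMap {
    fn new(mapped: Rc<Cell<bool>>) -> Self {
        mapped.set(true);
        FakeMap { mapped }
    }
}

impl Drop for FakeMap {
    fn drop(&mut self) {
        self.mapped.set(false);
    }
}

/// Same shape as the `else` branch: the owner lives only inside the block,
/// so anything derived from it (such as a raw pointer) outlives the mapping.
fn owner_dropped_in_block(mapped: &Rc<Cell<bool>>) -> bool {
    {
        let _buf = FakeMap::new(Rc::clone(mapped));
        // A pointer taken from `_buf` here would escape the block.
    }
    mapped.get()
}

/// Returning the owner keeps the mapping alive while the caller holds it.
fn owner_returned(mapped: &Rc<Cell<bool>>) -> FakeMap {
    FakeMap::new(Rc::clone(mapped))
}
```

If this is indeed the cause, keeping the `MmapMut` value alive for as long as the buffers are registered (for example by storing it next to the pointer) would be the thing to try.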

advise_writes_unsafely_to_part_of_map test fails on powerpc64le-unknown-linux-gnu

I'm the maintainer of the package for this crate in Fedora Linux, and with the update to version 0.9.0, I noticed a new test failure on powerpc64le compared to v0.7.1:

---- test::advise_writes_unsafely_to_part_of_map stdout ----
thread 'test::advise_writes_unsafely_to_part_of_map' panicked at 'assertion failed: `(left == right)`
  left: `0`,
 right: `255`', src/lib.rs:1978:9

The error message points to this assertion:
https://github.com/RazrFalcon/memmap2-rs/blob/v0.9.0/src/lib.rs#L1978

I don't know how or why powerpc64le is different here exactly ... the only thing I can think of right now is that the default page size is 64KB on powerpc64le, while it's 4KB on most other architectures.

All tests pass on our other supported architectures ({x86_64,i686,aarch64,s390x}-unknown-linux-gnu).


Test environment:

  • Fedora Linux 40, 39, 38 / Red Hat Enterprise Linux 9
  • Rust 1.74 (Fedora) , Rust 1.71 (RHEL)
  • powerpc64le-unknown-linux-gnu (page size: 64K)

WebAssembly Build Support

Hey, I was trying to build my crate (tokei), which depends deep down on memmap, but memmap2's MmapInner currently doesn't compile on WebAssembly platforms. Obviously memory maps don't work on WebAssembly, but it would be nice if the crate still compiled and failed at runtime instead, like std's File.
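A "compiles everywhere, fails at runtime" backend could be sketched as below. This is not memmap2's implementation: `map_unsupported` and `map_anon_sketch` are hypothetical names, and a `Vec<u8>` stands in for a real mapping on supported targets.

```rust
use std::io;

/// Stub backend: always compiles, always reports failure at runtime.
fn map_unsupported(_len: usize) -> io::Result<Vec<u8>> {
    Err(io::Error::new(
        io::ErrorKind::Unsupported,
        "memory maps are not supported on this platform",
    ))
}

/// On WebAssembly targets the stub is selected at compile time.
#[cfg(target_family = "wasm")]
fn map_anon_sketch(len: usize) -> io::Result<Vec<u8>> {
    map_unsupported(len)
}

/// Elsewhere a real backend would be used; a zeroed `Vec` stands in for it.
#[cfg(not(target_family = "wasm"))]
fn map_anon_sketch(len: usize) -> io::Result<Vec<u8>> {
    Ok(vec![0u8; len])
}
```

Callers then see an ordinary `io::Error` on unsupported targets and can fall back to reading the file normally.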

Using MmapMut from multiple threads for simultaneous writing

I posted a Stack Overflow question asking how to use this library to write to a file from multiple threads without any synchronization. I'm re-posting it here for visibility in case the author (thx!) or someone from the wonderful community can help.

I need to create a 40+ GB file using multiple threads. The file is used as a giant vector of u64 values. Threads do not need any kind of synchronization -- each thread's output will be unique to that thread, but each thread does NOT get its own slice. Rather, the nature of the data ensures each thread will generate a set of unique positions in the file to write to. Simple example -- each thread writes to a position [ind / thread_count], where ind goes to millions. For thread_count = 2, one thread writes to odd positions, and the other to even.
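One way to express interleaved, non-overlapping writes in safe Rust is to treat every slot as an atomic and use relaxed stores. The sketch below makes several assumptions: a `Vec<AtomicU64>` stands in for the mapped file, the function name `fill_interleaved` is made up, and the written values are arbitrary. Reinterpreting the bytes of a real mapping as atomics would need unsafe code and an alignment check, which is not shown.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Fills `len` slots from `thread_count` threads. Thread `t` writes only the
/// positions t, t + thread_count, t + 2 * thread_count, ...
fn fill_interleaved(len: usize, thread_count: usize) -> Vec<u64> {
    // Stand-in for the mapped file: one atomic per u64 slot.
    let cells: Vec<AtomicU64> = (0..len).map(|_| AtomicU64::new(0)).collect();

    thread::scope(|s| {
        for t in 0..thread_count {
            let cells = &cells;
            s.spawn(move || {
                for ind in (t..len).step_by(thread_count) {
                    // Relaxed is enough: no two threads touch the same slot,
                    // and the scope joins all threads before we read.
                    cells[ind].store((ind as u64) * 10 + t as u64, Ordering::Relaxed);
                }
            });
        }
    });

    cells.into_iter().map(AtomicU64::into_inner).collect()
}
```

With `thread_count = 2`, thread 0 writes the even positions and thread 1 the odd ones, matching the example in the question.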

no-std support

I wonder whether or not we can make this crate no-std?

AFAICT, at least the anonymous mapping API (map_anon) should not depend on any std features.

This might not help much on UNIX, since the crate depends on libc there regardless. But on Windows, making this crate no-std should reduce link targets and/or executable size.

If you find this change proper, I could start working on this and (hopefully) make a PR soon.

Using memmap2 with RawFd

Even though files are the most obvious target for an mmap, we might want to call mmap on a file descriptor that isn't backed by a File. Buffers on Linux can be allocated that way, for example.

The current API, which takes a File, doesn't really allow using the crate in such a case.

Something that would work would be

let buffer_fd = allocate_buffer(whatever);
let file = unsafe { File::from_raw_fd(buffer_fd) };
let mut mmap = unsafe {
    MmapOptions::new().map_mut(&file)?
};

However, FromRawFd takes ownership of the FD, which makes it fairly hard to integrate into a larger buffer management codebase.

We could switch the API from File to AsRawFd, but I'm not sure whether that change in the API would be welcome?
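Until the API changes, one possible workaround on Unix is to wrap the temporary File in ManuallyDrop so that its destructor never closes the descriptor. This is only a sketch using the standard library. `with_borrowed_file` is a hypothetical helper, not part of memmap2.

```rust
use std::fs::File;
use std::mem::ManuallyDrop;
use std::os::unix::io::{FromRawFd, RawFd};

/// Runs `f` with a temporary `&File` view of `fd` without taking ownership.
/// `ManuallyDrop` prevents the `File` destructor from closing the descriptor.
///
/// # Safety
/// `fd` must be an open, valid descriptor for the duration of the call.
unsafe fn with_borrowed_file<R>(fd: RawFd, f: impl FnOnce(&File) -> R) -> R {
    let file = ManuallyDrop::new(unsafe { File::from_raw_fd(fd) });
    f(&file)
}
```

The closure would be the place to create the mapping from the borrowed `&File`. The descriptor stays owned by whatever allocated it.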

Adding range support for `madvise()`

madvise() seems to support specifying a memory range, but the current implementation offers no way to pass one and always applies the advice to the whole mapping.

Is there any reason for this?
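One detail a range-based wrapper would have to handle is alignment: on Linux, `madvise()` rejects a start address that is not page-aligned. A possible helper for rounding a byte range outwards to page boundaries is sketched below. The name `page_align_range` and its signature are assumptions for illustration, not an existing memmap2 function.

```rust
/// Expands the byte range `[offset, offset + len)` to page boundaries.
/// Returns `(aligned_offset, aligned_len)`, or `None` if `page_size` is not a
/// power of two or the arithmetic would overflow.
fn page_align_range(offset: usize, len: usize, page_size: usize) -> Option<(usize, usize)> {
    if !page_size.is_power_of_two() {
        return None;
    }
    let mask = page_size - 1;
    // Round the start down and the end up to the nearest page boundary.
    let start = offset & !mask;
    let end = offset.checked_add(len)?;
    let aligned_end = end.checked_add(mask)? & !mask;
    Some((start, aligned_end - start))
}
```

Note that rounding outwards means the advice can also affect bytes next to the requested range, which matters for advice values that discard data.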

Trouble compiling on aarch64 for xtensa-esp32-espidf

I'm having issues when compiling in a dev container on aarch64 for a target of xtensa-esp32-espidf. It's a std application for an ESP32, and unfortunately the example is for no_std, so I'm trying to convert it for my use. There are a total of 35 errors, shown below.

I don't know if the issue is with compiling on my M1 Mac in a Debian Docker container or with the feature selection. Any advice would be appreciated. Let me know if any other info is required. I don't have my code hosted anywhere yet, but I would be happy to upload and share it if that would help. I should note that I can compile the libc crate just fine, and I have tried compiling it in a couple of different ways: standard features, no defaults, and "std".

   Compiling hash32 v0.2.1
error[E0425]: cannot find value `MAP_FAILED` in crate `libc`
  --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:93:29
   |
93 |             if ptr == libc::MAP_FAILED {
   |                             ^^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `PROT_READ` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:108:19
    |
108 |             libc::PROT_READ,
    |                   ^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `MAP_SHARED` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:109:19
    |
109 |             libc::MAP_SHARED | populate,
    |                   ^^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `PROT_READ` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:119:19
    |
119 |             libc::PROT_READ | libc::PROT_EXEC,
    |                   ^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `PROT_EXEC` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:119:37
    |
119 |             libc::PROT_READ | libc::PROT_EXEC,
    |                                     ^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `MAP_SHARED` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:120:19
    |
120 |             libc::MAP_SHARED | populate,
    |                   ^^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `PROT_READ` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:130:19
    |
130 |             libc::PROT_READ | libc::PROT_WRITE,
    |                   ^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `PROT_WRITE` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:130:37
    |
130 |             libc::PROT_READ | libc::PROT_WRITE,
    |                                     ^^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `MAP_SHARED` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:131:19
    |
131 |             libc::MAP_SHARED | populate,
    |                   ^^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `PROT_READ` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:141:19
    |
141 |             libc::PROT_READ | libc::PROT_WRITE,
    |                   ^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `PROT_WRITE` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:141:37
    |
141 |             libc::PROT_READ | libc::PROT_WRITE,
    |                                     ^^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `MAP_PRIVATE` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:142:19
    |
142 |             libc::MAP_PRIVATE | populate,
    |                   ^^^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `PROT_READ` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:157:19
    |
157 |             libc::PROT_READ,
    |                   ^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `MAP_PRIVATE` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:158:19
    |
158 |             libc::MAP_PRIVATE | populate,
    |                   ^^^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `PROT_READ` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:169:19
    |
169 |             libc::PROT_READ | libc::PROT_WRITE,
    |                   ^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `PROT_WRITE` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:169:37
    |
169 |             libc::PROT_READ | libc::PROT_WRITE,
    |                                     ^^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `MAP_PRIVATE` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:170:19
    |
170 |             libc::MAP_PRIVATE | libc::MAP_ANON | stack,
    |                   ^^^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `MAP_ANON` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:170:39
    |
170 |             libc::MAP_PRIVATE | libc::MAP_ANON | stack,
    |                                       ^^^^^^^^ not found in `libc`

error[E0425]: cannot find function `msync` in crate `libc`
    --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:181:28
     |
181  |             unsafe { libc::msync(self.ptr.offset(offset), len as libc::size_t, libc::MS_SYNC) };
     |                            ^^^^^ help: a function with a similar name exists: `fsync`
     |
    ::: /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/libc-0.2.134/src/unix/mod.rs:1009:5
     |
1009 |     pub fn fsync(fd: ::c_int) -> ::c_int;
     |     ------------------------------------ similarly named function `fsync` defined here

error[E0425]: cannot find value `MS_SYNC` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:181:86
    |
181 |             unsafe { libc::msync(self.ptr.offset(offset), len as libc::size_t, libc::MS_SYNC) };
    |                                                                                      ^^^^^^^ help: a constant with a similar name exists: `O_SYNC`
    |
   ::: /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/libc-0.2.134/src/unix/newlib/mod.rs:393:1
    |
393 | pub const O_SYNC: ::c_int = 8192;
    | ------------------------- similarly named constant `O_SYNC` defined here

error[E0425]: cannot find function `msync` in crate `libc`
    --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:194:28
     |
194  |             unsafe { libc::msync(self.ptr.offset(offset), len as libc::size_t, libc::MS_ASYNC) };
     |                            ^^^^^ help: a function with a similar name exists: `fsync`
     |
    ::: /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/libc-0.2.134/src/unix/mod.rs:1009:5
     |
1009 |     pub fn fsync(fd: ::c_int) -> ::c_int;
     |     ------------------------------------ similarly named function `fsync` defined here

error[E0425]: cannot find value `MS_ASYNC` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:194:86
    |
194 |             unsafe { libc::msync(self.ptr.offset(offset), len as libc::size_t, libc::MS_ASYNC) };
    |                                                                                      ^^^^^^^^ not found in `libc`

error[E0425]: cannot find function `mprotect` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:208:22
    |
208 |             if libc::mprotect(ptr, len, prot) == 0 {
    |                      ^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `PROT_READ` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:217:29
    |
217 |         self.mprotect(libc::PROT_READ)
    |                             ^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `PROT_READ` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:221:29
    |
221 |         self.mprotect(libc::PROT_READ | libc::PROT_EXEC)
    |                             ^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `PROT_EXEC` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:221:47
    |
221 |         self.mprotect(libc::PROT_READ | libc::PROT_EXEC)
    |                                               ^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `PROT_READ` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:225:29
    |
225 |         self.mprotect(libc::PROT_READ | libc::PROT_WRITE)
    |                             ^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `PROT_WRITE` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:225:47
    |
225 |         self.mprotect(libc::PROT_READ | libc::PROT_WRITE)
    |                                               ^^^^^^^^^^ not found in `libc`

error[E0425]: cannot find function `madvise` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:245:22
    |
245 |             if libc::madvise(self.ptr, self.len, advice as i32) != 0 {
    |                      ^^^^^^^ not found in `libc`

error[E0425]: cannot find value `_SC_PAGESIZE` in crate `libc`
   --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:297:58
    |
297 |             let page_size = unsafe { libc::sysconf(libc::_SC_PAGESIZE) as usize };
    |                                                          ^^^^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `MADV_NORMAL` in crate `libc`
  --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/advice.rs:13:20
   |
13 |     Normal = libc::MADV_NORMAL,
   |                    ^^^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `MADV_RANDOM` in crate `libc`
  --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/advice.rs:19:20
   |
19 |     Random = libc::MADV_RANDOM,
   |                    ^^^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `MADV_SEQUENTIAL` in crate `libc`
  --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/advice.rs:26:24
   |
26 |     Sequential = libc::MADV_SEQUENTIAL,
   |                        ^^^^^^^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `MADV_WILLNEED` in crate `libc`
  --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/advice.rs:32:22
   |
32 |     WillNeed = libc::MADV_WILLNEED,
   |                      ^^^^^^^^^^^^^ not found in `libc`

error[E0425]: cannot find value `MADV_DONTNEED` in crate `libc`
  --> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/advice.rs:63:22
   |
63 |     DontNeed = libc::MADV_DONTNEED,
   |                      ^^^^^^^^^^^^^ not found in `libc`

For more information about this error, try `rustc --explain E0425`.
error: could not compile `memmap2` due to 35 previous errors

CI workflow doesn't test toolchain targets correctly

This build step doesn't work as expected because rustup does not manage rustc's default target:

- name: Install toolchain
  uses: actions-rs/toolchain@v1
  with:
    toolchain: stable
    profile: minimal
    target: ${{ matrix.target }}
    override: true

- name: Run tests
  run: cargo test

The tests fail when the target is set explicitly:

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 380f1d0..b3e4c16 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -49,7 +49,7 @@ jobs:
         override: true
 
     - name: Run tests
-      run: cargo test
+      run: cargo test --target ${{ matrix.target }}
 
   test-macos-catalina:
     runs-on: macos-10.15
@@ -82,7 +82,7 @@ jobs:
         override: true
 
     - name: Run tests
-      run: cargo test
+      run: cargo test --target ${{ matrix.target }}

i686-unknown-linux-musl seems to run into a 32-bit truncation issue:

---- test::map_offset stdout ----
thread 'test::map_offset' panicked at 'attempt to subtract with overflow', src/lib.rs:179:23
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

and i686-unknown-linux-gnu fails during linking probably because 32-bit glibc isn't installed or something.

Here's a sample run (not sure if it'll survive force-pushes): https://github.com/Mrmaxmeier/memmap2-rs/runs/3251483567 (xref #21)

Apple IOS CI is broken

The following platforms are unsupported by newer rust versions:

  • armv7-apple-ios
  • armv7s-apple-ios
  • i386-apple-ios

Question: Saving copy-on-write mmap changes back to file

I'm writing a hex editor, and I want to add memory mapped file support to it.
Writable memory maps write changes directly into a file, which is not ideal, since the user might not want to save their changes.
The changes should only be written when the user saves.
I discovered that memmap2-rs supports copy-on-write memory maps, which sounds ideal for my use case.
However, I'm unsure how to safely write the changes back to the file on save.
Calling .flush() does not write the changed data back to the file.

I see two options, both of which involve keeping track of the parts of the file that changed:

  1. Just use the underlying File handle to write back the changes right from the mmap.
    Is this sound? Or is this undefined behavior?
  2. Right before saving:
    1. Copy all the changes into buffers that I maintain
    2. Drop the mmap
    3. Write the changes from my buffers using the File handle
    4. Reopen the mmap
      This is more convoluted, but this should definitely be sound. Or at least I hope so.

Which one should I use? Or is there a better alternative?

Thank you, and sorry for using issues for questions, but I couldn't find good resources on this, and I hope there are some mmap experts here that can help answer.
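Whichever option is chosen, the write-back step itself comes down to copying each tracked range to the same offset in the file. A minimal sketch of that step, using only std: `write_back`, `data` and `dirty` are hypothetical names, `data` stands in for the bytes taken from the mapping (or from the editor's own buffers in option 2), and the sketch does not settle the soundness question raised for option 1.

```rust
use std::io::{self, Seek, SeekFrom, Write};
use std::ops::Range;

/// Writes only the modified byte ranges of `data` to `out`, each at the
/// offset it occupies in `data`.
///
/// `out` is anything seekable and writable, for example a `std::fs::File`.
pub fn write_back<W: Write + Seek>(
    out: &mut W,
    data: &[u8],
    dirty: &[Range<usize>],
) -> io::Result<()> {
    for range in dirty {
        // Reject ranges that fall outside the buffer instead of panicking.
        let chunk = data.get(range.clone()).ok_or_else(|| {
            io::Error::new(io::ErrorKind::InvalidInput, "dirty range out of bounds")
        })?;
        out.seek(SeekFrom::Start(range.start as u64))?;
        out.write_all(chunk)?;
    }
    out.flush()
}
```

Because the function is generic over `Write + Seek`, it can be exercised against an in-memory `std::io::Cursor<Vec<u8>>` before pointing it at a real file.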

`map_anon()` can unsoundly create overlarge slices in safe code

On i686-unknown-linux-gnu, mmap() with MAP_ANONYMOUS is able to serve requests of 0x80000000 or more bytes. MmapMut::map_anon() and <MmapMut as Deref>::deref() can create a slice of this length in safe code:

/*
[dependencies]
memmap2 = "=0.5.4"

cargo run --target i686-unknown-linux-gnu
*/

use memmap2::MmapMut;

fn main() {
    let map: MmapMut = MmapMut::map_anon(0x80000000).unwrap();
    let slice: &[u8] = &*map;
    println!("{:#x}", slice.len()); // 0x80000000
}

However, <MmapMut as Deref>::deref() uses std::slice::from_raw_parts(), which has this safety precondition:

Behavior is undefined if any of the following conditions are violated: [...]

  • The total size len * mem::size_of::<T>() of the slice must be no larger than isize::MAX. See the safety documentation of pointer::offset.

Since isize::MAX == 0x7fffffff on i686-unknown-linux-gnu and other 32-bit targets, it is undefined behavior to create this 0x80000000-byte slice using std::slice::from_raw_parts(). Compiler optimizations can behave erratically if this precondition is violated.

The most straightforward solution would be to add assertions to MmapOptions::map(), MmapOptions::map_exec(), MmapOptions::map_mut(), MmapOptions::map_copy(), MmapOptions::map_copy_read_only(), and MmapOptions::map_anon() that the provided or computed length is no greater than isize::MAX. MmapOptions::map_raw() does not need this assertion, since MmapRaw does not provide slices in safe code. (It may also be worthwhile to add a method that produces an anonymous MmapRaw not checked against isize::MAX.)
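The length check described above could look roughly like this. It is a sketch only: `checked_slice_len` is a hypothetical helper, not part of memmap2, and it returns an `io::Error` rather than asserting, which is one possible design choice.

```rust
use std::io;

/// Returns `len` as a `usize` only if a byte slice of that length satisfies
/// the `isize::MAX` precondition of `std::slice::from_raw_parts`.
pub fn checked_slice_len(len: u64) -> io::Result<usize> {
    // `isize::MAX` fits in a `u64` on both 32-bit and 64-bit targets.
    if len > isize::MAX as u64 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "memory map length overflows isize",
        ));
    }
    Ok(len as usize)
}
```

On a 32-bit target this rejects the 0x80000000-byte request from the reproduction, since `isize::MAX` is 0x7fffffff there.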

Support std::io::{Read, Write, Seek}

Hello! Thank you for a very useful crate!

Before Rust I wrote Python. Python has its own mmap implementation, and there I can work with an mmap the same way as with a file.
I propose implementing the std::io::{Read, Write, Seek} traits. If nobody minds, I'm ready to create a PR.
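For comparison, std already offers file-like access over any byte slice through `std::io::Cursor`. Since a mutable mapping dereferences to `[u8]`, a borrowed `&mut [u8]` from it could presumably be wrapped the same way. A minimal sketch over a plain slice; `write_then_read` is a hypothetical helper and no memmap2 API is involved:

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom, Write};

/// Writes `payload` at `offset` in `buf`, then seeks back and reads it again,
/// all through the `Read`, `Write` and `Seek` impls of `Cursor<&mut [u8]>`.
pub fn write_then_read(buf: &mut [u8], offset: u64, payload: &[u8]) -> io::Result<Vec<u8>> {
    let mut cursor = Cursor::new(buf);
    cursor.seek(SeekFrom::Start(offset))?;
    // Fails with `WriteZero` if the payload does not fit in the slice.
    cursor.write_all(payload)?;
    cursor.seek(SeekFrom::Start(offset))?;
    let mut out = vec![0u8; payload.len()];
    cursor.read_exact(&mut out)?;
    Ok(out)
}
```

The cursor cannot grow the slice, so writes past the end return an error instead of extending anything.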

Support `mremap(2)`

Hi! I have a use case in which I would like to use mremap(2) to resize a memory mapping created using this crate. I would be happy to submit a PR to implement this but figured that I would open an issue first.

Detaching from parent fork

To my knowledge, GitHub doesn't support searching for a project if it is a fork. Would it make sense to remove the "forked from" link (it might require an email to GitHub support, as there is no setting for it AFAIK)? Seems like this project is much more alive than the parent (thx!!!)

Safe API for RO memmap construction

I'd like to be able to call a safe version of MmapRaw::from(unsafe { Mmap::map(&file) }), as a sibling API for map_raw.

Maybe it should be possible to pass flags for read/write/exec to MmapOptions::map_raw?

Support multiple advice

A common pattern with mmap is to provide multiple advice at once:

madvise(addr, size, MADV_WILLNEED | MADV_SEQUENTIAL);

This is not currently possible with the Advice implementation. Would you accept a PR overloading the Or operator to provide equivalent functionality?
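One thing worth checking before overloading `|`: on Linux the `MADV_*` values in `<sys/mman.h>` are small consecutive integers rather than bit flags, so OR-ing two of them yields another single advice value. The sketch below redefines two constants locally (assumption: Linux values, so no `libc` dependency is needed) and shows a hypothetical `advise_all` helper that issues one call per advice instead. The closure stands in for whatever performs the actual call, such as the crate's `advise` method.

```rust
// Linux values from <sys/mman.h>, redefined here for illustration only.
pub const MADV_SEQUENTIAL: i32 = 2;
pub const MADV_WILLNEED: i32 = 3;

/// Applies each advice value in turn through `advise`, stopping at the first
/// error. `advise` is a stand-in for the real madvise call.
pub fn advise_all<E>(
    advices: &[i32],
    mut advise: impl FnMut(i32) -> Result<(), E>,
) -> Result<(), E> {
    for &advice in advices {
        advise(advice)?;
    }
    Ok(())
}
```

With these values, `MADV_WILLNEED | MADV_SEQUENTIAL` evaluates to 3, which is just `MADV_WILLNEED`, so separate calls are the only way to deliver both hints.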

Supporting zero-size Mmap?

This test shows that mapping an empty file returns an error:

memmap2-rs/src/lib.rs

Lines 978 to 992 in 27ece76

/// Checks that a 0-length file will not be mapped.
#[test]
fn map_empty_file() {
    let tempdir = tempdir::TempDir::new("mmap").unwrap();
    let path = tempdir.path().join("mmap");
    let file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .open(&path)
        .unwrap();
    let mmap = unsafe { Mmap::map(&file) };
    assert!(mmap.is_err());
}

I assume this is because the underlying OS APIs do not support creating a zero-size mapping but it is unsatisfying, since in some cases it forces callers to handle that special case. What do you think of having the Mmap and MmapMut structs handle this case by not creating an actual mapping but not erroring when the length is zero, and instead have Deref return &[]?
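The idea can be sketched as a wrapper that holds either nothing or a real mapping and dereferences to an empty slice in the first case. `MaybeMapped` is a hypothetical name, and the type is generic over any `Deref<Target = [u8]>` so the sketch does not depend on memmap2 itself:

```rust
use std::ops::Deref;

/// Either no mapping at all (for zero-length input) or a real mapping `M`.
pub enum MaybeMapped<M> {
    Empty,
    Mapped(M),
}

impl<M: Deref<Target = [u8]>> Deref for MaybeMapped<M> {
    type Target = [u8];

    fn deref(&self) -> &[u8] {
        match self {
            // No OS mapping exists, so hand out a static empty slice.
            MaybeMapped::Empty => &[],
            MaybeMapped::Mapped(m) => &**m,
        }
    }
}
```

A caller would construct `MaybeMapped::Empty` when the file length is zero and only attempt the real mapping otherwise, so code that consumes the bytes never sees the special case.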

How can I append?

pub fn write(&mut self, data: &[u8]) {
    let m_mut = &mut self.writer; // this is a &mut MmapMut

    info!("mmap size:{},data size:{}", m_mut.len(), data.len());
    if m_mut.len() > data.len() {
        m_mut.deref_mut().write_all(data).unwrap();
        return;
    }
}

Written this way, each write overwrites the previous one. Do I have to keep a variable that records how much has already been written and then pass [pre_size..]? Is that the right way to do it?
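Tracking the write position explicitly is one way to get append behaviour, since `deref_mut()` always yields the slice from its start. A minimal sketch over a plain `&mut [u8]`, which is what a mutable mapping dereferences to. `AppendWriter` and its methods are hypothetical names, not memmap2 API:

```rust
use std::io;

/// Appends into a fixed-size byte buffer, remembering how much is in use.
pub struct AppendWriter<'a> {
    buf: &'a mut [u8],
    written: usize,
}

impl<'a> AppendWriter<'a> {
    pub fn new(buf: &'a mut [u8]) -> Self {
        Self { buf, written: 0 }
    }

    /// Copies `data` right after the previously written bytes.
    pub fn append(&mut self, data: &[u8]) -> io::Result<()> {
        let end = self
            .written
            .checked_add(data.len())
            .filter(|&end| end <= self.buf.len())
            .ok_or_else(|| {
                io::Error::new(io::ErrorKind::WriteZero, "not enough room left in the buffer")
            })?;
        self.buf[self.written..end].copy_from_slice(data);
        self.written = end;
        Ok(())
    }

    /// Number of bytes appended so far.
    pub fn written(&self) -> usize {
        self.written
    }
}
```

The buffer cannot grow, so a caller that runs out of room would have to create a larger mapping and continue from `written()`.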

MSRV and edition

Would it be worth it to update to 2021 and some later MSRV? 1.36 is 4 years old... seems rather dated

Migrate to `safer_owning_ref`

After the maintainer of owning_ref had been unresponsive for a long time, I decided to publish my fixes for owning_ref's known soundness issues as a new crate, safer_owning_ref.

Please migrate to safer_owning_ref.

How do I do it for seek_mut()?

like this:

use memmap2::MmapMut;

let mut mmap = MmapMut::open_path(file_path, memmap2::Protection::ReadWrite)?;
mmap.seek_mut(10)?; 

`.populate()` not implemented for `map_anon`

.populate() is currently ignored for map_anon() mappings. Linux supports it and it's useful, so this crate should as well. Currently I have to work around this by mapping /dev/zero.

Doesn't compile for freebsd

From my mac m1:

$ rustup target add x86_64-unknown-freebsd
$ cargo check  --target x86_64-unknown-freebsd
    Checking libc v0.2.151
    Checking memmap2 v0.9.1 (/Users/marcoieni/tmp/memmap2-rs)
error[E0425]: cannot find value `MAP_HUGETLB` in crate `libc`
    --> src/unix.rs:30:40
     |
30   | const MAP_HUGETLB: libc::c_int = libc::MAP_HUGETLB;
     |                                        ^^^^^^^^^^^ help: a constant with a similar name exists: `MFD_HUGETLB`
     |
    ::: /Users/marcoieni/.cargo/registry/src/index.crates.io-6f17d22bba15001f/libc-0.2.151/src/unix/bsd/freebsdlike/freebsd/mod.rs:4503:1
     |
4503 | pub const MFD_HUGETLB: ::c_uint = 0x00000004;
     | ------------------------------- similarly named constant `MFD_HUGETLB` defined here

error[E0425]: cannot find value `MAP_HUGE_MASK` in crate `libc`
    --> src/unix.rs:33:42
     |
33   | const MAP_HUGE_MASK: libc::c_int = libc::MAP_HUGE_MASK;
     |                                          ^^^^^^^^^^^^^ help: a constant with a similar name exists: `MFD_HUGE_MASK`
     |
    ::: /Users/marcoieni/.cargo/registry/src/index.crates.io-6f17d22bba15001f/libc-0.2.151/src/unix/bsd/freebsdlike/freebsd/mod.rs:4504:1
     |
4504 | pub const MFD_HUGE_MASK: ::c_uint = 0xFC000000;
     | --------------------------------- similarly named constant `MFD_HUGE_MASK` defined here

error[E0425]: cannot find value `MAP_HUGE_SHIFT` in crate `libc`
  --> src/unix.rs:36:43
   |
36 | const MAP_HUGE_SHIFT: libc::c_int = libc::MAP_HUGE_SHIFT;
   |                                           ^^^^^^^^^^^^^^ not found in `libc`

For more information about this error, try `rustc --explain E0425`.
error: could not compile `memmap2` (lib) due to 3 previous errors

I got the same issue here.

One weird thing: I added the freebsd check in my CI and it's compiling there :/
