razrfalcon / memmap2-rs
This project forked from danburkert/memmap-rs
cross-platform Rust API for memory mapped IO
License: Apache License 2.0
After the maintainer of owning_ref has been unresponsive for a long time, I decided to publish my fix for owning_ref's known soundness issues as a new crate, safer_owning_ref.
Please migrate to safer_owning_ref.
To my knowledge, GitHub doesn't support searching a project if it is a fork. Would it make sense to remove the "forked from" status (it might require an email to GitHub support, as there is no setting for it AFAIK)? This project seems much more alive than the parent (thanks!).
I'd like to be able to call a safe version of MmapRaw::from(unsafe { Mmap::map(&file) }), as a sibling API for map_raw.
Maybe it should be possible to pass flags for read/write/exec to MmapOptions::map_raw?
I'm having issues when compiling in a dev container on aarch64 for a target of xtensa-esp32-idf. It's a std application for an esp32, and unfortunately the example is for no_std, so I'm trying to convert it for my use. There are a total of 35 errors, shown below:
I don't know if the issue is with compiling it on my M1 Mac in a Debian Docker container or if the issue is the feature selection. Any advice would be appreciated. Let me know if any other info is required. I don't have my code hosted anywhere yet, but I would be happy to upload and share it if that would help. I should note that I can compile the libc crate just fine, and I have tried compiling it in a couple of different ways: standard features, and no defaults plus "std".
Compiling hash32 v0.2.1
error[E0425]: cannot find value `MAP_FAILED` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:93:29
|
93 | if ptr == libc::MAP_FAILED {
| ^^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `PROT_READ` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:108:19
|
108 | libc::PROT_READ,
| ^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `MAP_SHARED` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:109:19
|
109 | libc::MAP_SHARED | populate,
| ^^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `PROT_READ` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:119:19
|
119 | libc::PROT_READ | libc::PROT_EXEC,
| ^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `PROT_EXEC` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:119:37
|
119 | libc::PROT_READ | libc::PROT_EXEC,
| ^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `MAP_SHARED` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:120:19
|
120 | libc::MAP_SHARED | populate,
| ^^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `PROT_READ` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:130:19
|
130 | libc::PROT_READ | libc::PROT_WRITE,
| ^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `PROT_WRITE` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:130:37
|
130 | libc::PROT_READ | libc::PROT_WRITE,
| ^^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `MAP_SHARED` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:131:19
|
131 | libc::MAP_SHARED | populate,
| ^^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `PROT_READ` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:141:19
|
141 | libc::PROT_READ | libc::PROT_WRITE,
| ^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `PROT_WRITE` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:141:37
|
141 | libc::PROT_READ | libc::PROT_WRITE,
| ^^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `MAP_PRIVATE` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:142:19
|
142 | libc::MAP_PRIVATE | populate,
| ^^^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `PROT_READ` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:157:19
|
157 | libc::PROT_READ,
| ^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `MAP_PRIVATE` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:158:19
|
158 | libc::MAP_PRIVATE | populate,
| ^^^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `PROT_READ` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:169:19
|
169 | libc::PROT_READ | libc::PROT_WRITE,
| ^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `PROT_WRITE` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:169:37
|
169 | libc::PROT_READ | libc::PROT_WRITE,
| ^^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `MAP_PRIVATE` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:170:19
|
170 | libc::MAP_PRIVATE | libc::MAP_ANON | stack,
| ^^^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `MAP_ANON` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:170:39
|
170 | libc::MAP_PRIVATE | libc::MAP_ANON | stack,
| ^^^^^^^^ not found in `libc`
error[E0425]: cannot find function `msync` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:181:28
|
181 | unsafe { libc::msync(self.ptr.offset(offset), len as libc::size_t, libc::MS_SYNC) };
| ^^^^^ help: a function with a similar name exists: `fsync`
|
::: /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/libc-0.2.134/src/unix/mod.rs:1009:5
|
1009 | pub fn fsync(fd: ::c_int) -> ::c_int;
| ------------------------------------ similarly named function `fsync` defined here
error[E0425]: cannot find value `MS_SYNC` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:181:86
|
181 | unsafe { libc::msync(self.ptr.offset(offset), len as libc::size_t, libc::MS_SYNC) };
| ^^^^^^^ help: a constant with a similar name exists: `O_SYNC`
|
::: /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/libc-0.2.134/src/unix/newlib/mod.rs:393:1
|
393 | pub const O_SYNC: ::c_int = 8192;
| ------------------------- similarly named constant `O_SYNC` defined here
error[E0425]: cannot find function `msync` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:194:28
|
194 | unsafe { libc::msync(self.ptr.offset(offset), len as libc::size_t, libc::MS_ASYNC) };
| ^^^^^ help: a function with a similar name exists: `fsync`
|
::: /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/libc-0.2.134/src/unix/mod.rs:1009:5
|
1009 | pub fn fsync(fd: ::c_int) -> ::c_int;
| ------------------------------------ similarly named function `fsync` defined here
error[E0425]: cannot find value `MS_ASYNC` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:194:86
|
194 | unsafe { libc::msync(self.ptr.offset(offset), len as libc::size_t, libc::MS_ASYNC) };
| ^^^^^^^^ not found in `libc`
error[E0425]: cannot find function `mprotect` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:208:22
|
208 | if libc::mprotect(ptr, len, prot) == 0 {
| ^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `PROT_READ` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:217:29
|
217 | self.mprotect(libc::PROT_READ)
| ^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `PROT_READ` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:221:29
|
221 | self.mprotect(libc::PROT_READ | libc::PROT_EXEC)
| ^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `PROT_EXEC` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:221:47
|
221 | self.mprotect(libc::PROT_READ | libc::PROT_EXEC)
| ^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `PROT_READ` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:225:29
|
225 | self.mprotect(libc::PROT_READ | libc::PROT_WRITE)
| ^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `PROT_WRITE` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:225:47
|
225 | self.mprotect(libc::PROT_READ | libc::PROT_WRITE)
| ^^^^^^^^^^ not found in `libc`
error[E0425]: cannot find function `madvise` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:245:22
|
245 | if libc::madvise(self.ptr, self.len, advice as i32) != 0 {
| ^^^^^^^ not found in `libc`
error[E0425]: cannot find value `_SC_PAGESIZE` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/unix.rs:297:58
|
297 | let page_size = unsafe { libc::sysconf(libc::_SC_PAGESIZE) as usize };
| ^^^^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `MADV_NORMAL` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/advice.rs:13:20
|
13 | Normal = libc::MADV_NORMAL,
| ^^^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `MADV_RANDOM` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/advice.rs:19:20
|
19 | Random = libc::MADV_RANDOM,
| ^^^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `MADV_SEQUENTIAL` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/advice.rs:26:24
|
26 | Sequential = libc::MADV_SEQUENTIAL,
| ^^^^^^^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `MADV_WILLNEED` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/advice.rs:32:22
|
32 | WillNeed = libc::MADV_WILLNEED,
| ^^^^^^^^^^^^^ not found in `libc`
error[E0425]: cannot find value `MADV_DONTNEED` in crate `libc`
--> /home/esp/.cargo/registry/src/github.com-1ecc6299db9ec823/memmap2-0.5.7/src/advice.rs:63:22
|
63 | DontNeed = libc::MADV_DONTNEED,
| ^^^^^^^^^^^^^ not found in `libc`
For more information about this error, try `rustc --explain E0425`.
error: could not compile `memmap2` due to 35 previous errors
This test shows that mapping an empty file returns an error:
Lines 978 to 992 in 27ece76
I assume this is because the underlying OS APIs do not support creating a zero-size mapping, but it is unsatisfying, since in some cases it forces callers to handle that special case. What do you think of having the Mmap and MmapMut structs handle this case by not creating an actual mapping and not erroring when the length is zero, and instead having Deref return &[]?
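Until something like this exists in the crate, a caller-side wrapper can get a similar effect. The following is only a sketch: MaybeEmpty and its new constructor are hypothetical names, not part of memmap2, and the wrapper is generic over any mapping type that derefs to [u8] so that it does not depend on a particular crate version.

```rust
use std::ops::Deref;

/// Either a real mapping or a stand-in for a zero-length file.
/// `M` is any mapping type that derefs to `[u8]` (for example `memmap2::Mmap`).
/// Hypothetical helper, not part of memmap2.
enum MaybeEmpty<M> {
    Empty,
    Mapped(M),
}

impl<M: Deref<Target = [u8]>> MaybeEmpty<M> {
    /// Calls `map` only when `len` is non-zero, so the OS never sees a
    /// zero-size mapping request.
    fn new<E>(len: u64, map: impl FnOnce() -> Result<M, E>) -> Result<Self, E> {
        if len == 0 {
            Ok(MaybeEmpty::Empty)
        } else {
            map().map(MaybeEmpty::Mapped)
        }
    }
}

impl<M: Deref<Target = [u8]>> Deref for MaybeEmpty<M> {
    type Target = [u8];

    fn deref(&self) -> &[u8] {
        match self {
            // No mapping exists, so hand out an empty slice.
            MaybeEmpty::Empty => &[],
            MaybeEmpty::Mapped(m) => &**m,
        }
    }
}
```

With memmap2, the closure passed to new would presumably wrap unsafe { Mmap::map(&file) } and len would come from the file's metadata.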
I would like to use memory mapped files to communicate with a child process's stdin/stdout. I've already tested the idea with the crate memfile and the benchmark results are promising. However, that crate does not offer Windows support, which is also something that I'm looking for.
To be able to use the (anonymous) memory mapped file as stdin for a child process with the memfile crate, I can use the From<MemFile> trait impl for Stdio. It would be great if this library added this impl too. There are multiple traits that can be implemented:
Into<Stdio>
AsRawFd / IntoRawFd
AsRawHandle / IntoRawHandle
Example usage with this feature implemented:
fn write_using_memmap2(buf: &[u8]) {
    let mut mmap = MmapOptions::new().len(buf.len()).map_anon().unwrap();
    mmap.copy_from_slice(buf);
    let mmap = mmap.make_read_only().unwrap();
    Command::new("nul")
        .stderr(Stdio::null())
        .stdout(Stdio::null())
        .stdin(Stdio::from(mmap)) // <-- Using the From<Mmap> trait
        .spawn().unwrap()
        .wait().unwrap();
}
Hi, I noticed that @al8n has made some changes in their fork https://github.com/al8n/memmapix, while this repo has also done some diverging work. This repo clearly has far more downloads and a larger community following, so I wonder if it would be possible to merge some or all of the memmapix changes into this repo, or what your future plans for this repo are?
Both efforts clearly benefit the community and are highly appreciated, thank you all!
P.S. The memmapix repo doesn't have an issues tab for some reason, so I couldn't ask there.
At $dayjob for technical reasons, we've modified this to use rustix instead of libc. Would you be willing to accept this change as a pull request?
like this:
use memmap2::MmapMut;
let mut mmap = MmapMut::open_path(file_path, memmap2::Protection::ReadWrite)?;
mmap.seek_mut(10)?;
MmapMut::map_mut is an unsafe function, but there is no documentation about why it is unsafe and which invariants the user has to uphold.
Would it be possible to support this flag? Currently the much older crate mmap supports it, but it hasn't been updated in 6 years (!).
Related (unsolved) discussion: danburkert#68
Would it be possible to support setting linux hints, similar to how this C++ code was modified - https://github.com/osmcode/libosmium/pull/292/files
Or are there some issues with wrapping this kind of tech? Thanks!
The following platforms are unsupported by newer rust versions:
armv7-apple-ios
armv7s-apple-ios
i386-apple-ios
On i686-unknown-linux-gnu, mmap() with MAP_ANONYMOUS is able to serve requests of 0x80000000 or more bytes. MmapMut::map_anon() and <MmapMut as Deref>::deref() can create a slice of this length in safe code:
/*
[dependencies]
memmap2 = "=0.5.4"

cargo run --target i686-unknown-linux-gnu
*/
use memmap2::MmapMut;

fn main() {
    let map: MmapMut = MmapMut::map_anon(0x80000000).unwrap();
    let slice: &[u8] = &*map;
    println!("{:#x}", slice.len()); // 0x80000000
}
However, <MmapMut as Deref>::deref() uses std::slice::from_raw_parts(), which has this safety precondition:
Behavior is undefined if any of the following conditions are violated: [...]
- The total size len * mem::size_of::<T>() of the slice must be no larger than isize::MAX. See the safety documentation of pointer::offset.
Since isize::MAX == 0x7fffffff on i686-unknown-linux-gnu and other 32-bit targets, it is undefined behavior to create this 0x80000000-byte slice using std::slice::from_raw_parts(). Compiler optimizations can behave erratically if this precondition is violated.
The most straightforward solution would be to add assertions to MmapOptions::map(), MmapOptions::map_exec(), MmapOptions::map_mut(), MmapOptions::map_copy(), MmapOptions::map_copy_read_only(), and MmapOptions::map_anon() that the provided or computed length is no greater than isize::MAX. MmapOptions::map_raw() does not need this assertion, since MmapRaw does not provide slices in safe code. (It may also be worthwhile to add a method that produces an anonymous MmapRaw not checked against isize::MAX.)
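As a rough illustration of the proposed check, here is a minimal sketch. The function name validate_slice_len is hypothetical, and it returns an io::Error rather than asserting, which is a design choice of this sketch and not necessarily what the crate would do.

```rust
use std::io;

/// Rejects lengths that cannot be exposed as a Rust slice.
/// `std::slice::from_raw_parts` requires the total size to fit in `isize`.
/// Hypothetical helper, not part of memmap2.
fn validate_slice_len(len: usize) -> io::Result<usize> {
    if len > isize::MAX as usize {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "memory map length overflows isize",
        ));
    }
    Ok(len)
}
```

On a 32-bit target this rejects the 0x80000000-byte request from the example above, because isize::MAX is 0x7fffffff there.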
Having some options to reduce page faults would be great. On Linux the MAP_HUGETLB and MAP_POPULATE options would be the most interesting imo.
Around the same time as this fork, another fork called mapr was started, which implemented MAP_PRIVATE, MAP_LOCK and mlock. All use cases for MAP_PRIVATE are now covered by memmap2 through function calls. Thanks for pointing that out at #50 (comment).
What is left from that fork, and what memmap2 is missing, is MAP_LOCK and mlock. Before I do another PR, I'd like to check what the best way would be, hence this issue.
The only use case for MAP_LOCK we currently have in our code base is in combination with map_anon. Should it only be added there? Or should it be added to all methods (similar to MAP_POPULATE, as I did in #50)?
I think for adding mlock() there aren't any questions.
I'm trying to test a Rust project on GitHub Actions, but it yields an error only on Windows.
It works fine on my desktop and on the GitHub Actions Linux/macOS runners, so I don't know what I should troubleshoot.
#[test]
fn file_remove() {
    let tempdir = tempfile::tempdir().unwrap();
    let path = tempdir.path().join("mmap");
    let mut file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .open(path.clone())
        .unwrap();
    file.set_len(128).unwrap();
    let mmap = unsafe { Mmap::map(&file) };
    assert!(mmap.is_ok());
    let remove_res = fs::remove_file(path.clone());
    if remove_res.is_err() {
        println!("remove_res: {:?}", remove_res);
    }
    assert!(remove_res.is_ok());
}
---- test::file_remove stdout ----
remove_res: Err(Os { code: 5, kind: PermissionDenied, message: "Access is denied." })
thread 'test::file_remove' panicked at src\lib.rs:1595:9:
assertion failed: remove_res.is_ok()
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
https://github.com/LokiSharp/memmap2-rs/actions/runs/7223354238/job/19682434953
I wonder whether or not we can make this crate no-std?
AFAIC, at least the anonymous mapping API (map_anon) should not depend on any std features.
This might not help a lot on UNIX since it depends on libc anyway. But on Windows, making this crate no-std should reduce link targets and/or executable size.
If you find this change proper, I could start working on this and (hopefully) make a PR soon.
pub fn write(&mut self, data: &[u8]) {
    let m_mut = &mut self.writer; // this is a &mut MmapMut
    info!("mmap size: {}, data size: {}", m_mut.len(), data.len());
    if m_mut.len() > data.len() {
        m_mut.deref_mut().write_all(data).unwrap();
        return;
    }
}
Written this way, each write overwrites the previous one. Do I have to keep a variable that records how many bytes were already written and then write into [pre_size..]? Is that the intended way to do this?
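The overwrite happens because deref_mut() produces a fresh &mut [u8] starting at offset 0 on every call, so write_all always begins at the start of the map. Tracking the offset explicitly is one way around it. The sketch below assumes a hypothetical AppendWriter wrapper (not part of memmap2) that is generic over any buffer dereferencing to [u8], which includes MmapMut.

```rust
use std::io;
use std::ops::DerefMut;

/// Wraps any mutable byte buffer (a `MmapMut` derefs to `[u8]`) and remembers
/// how many bytes have been written so far. Hypothetical helper.
struct AppendWriter<B> {
    buf: B,
    pos: usize,
}

impl<B: DerefMut<Target = [u8]>> AppendWriter<B> {
    fn new(buf: B) -> Self {
        AppendWriter { buf, pos: 0 }
    }

    /// Copies `data` after the previously written bytes, or fails without
    /// writing anything if the remaining space is too small.
    fn append(&mut self, data: &[u8]) -> io::Result<()> {
        let end = self
            .pos
            .checked_add(data.len())
            .filter(|&end| end <= self.buf.len())
            .ok_or_else(|| io::Error::new(io::ErrorKind::WriteZero, "buffer is full"))?;
        self.buf[self.pos..end].copy_from_slice(data);
        self.pos = end;
        Ok(())
    }
}
```

In the write method above, self.writer would then be an AppendWriter<MmapMut> and the body would reduce to a call to append(data).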
This is half a question, and half a bug report.
I have some test code that goes through several permutations of how to execute io_uring operations using various buffer strategies. One strategy (my default, actually) kept failing with -14 Bad Address / EFAULT errors.
Tracking it down, it seemed to be only buffers from mmap'd pages that cause this, digging further.. it was only mmap'd pages that had been allocated via memmap2(?)
let adjusted_buf_size = size.checked_next_power_of_two().unwrap();
let buf_layout = std::alloc::Layout::from_size_align(adjusted_buf_size, PAGE_ALIGN).unwrap();
let buf_total = buf_layout.size().checked_mul(count).unwrap();
let buf_ptr = if OLD_SKOOL {
    let buf_ptr = libc::mmap(
        std::ptr::null_mut(),
        buf_total,
        libc::PROT_WRITE | libc::PROT_READ,
        libc::MAP_ANON | libc::MAP_POPULATE | libc::MAP_PRIVATE,
        -1,
        0,
    );
    if buf_ptr == libc::MAP_FAILED {
        panic!("mmap failed");
    }
    buf_ptr.cast()
} else {
    let mut opts = MmapOptions::new();
    let mut buf = opts.populate().len(buf_total).map_anon().unwrap();
    buf.as_mut_ptr()
};
// belt and suspenders
std::ptr::write_bytes(buf_ptr, 0, buf_total);
When OLD_SKOOL is true, everything works as planned. When OLD_SKOOL is false, all reads via io_uring fail with -14 Bad Address.
Before the registration of each buffer, I assert that they are aligned to 4096. I am baffled as to what the difference in approaches might be.
It is not mentioned in the repo, but cargo build and cargo test run fine on OSX.
So is OSX officially supported without issues?
Even though files are the most obvious target for an mmap, we might want to call mmap on a file descriptor that isn't backed by a File. Buffers on Linux can be allocated that way for example.
The current API, which takes a File, doesn't really allow using the crate in such a case.
Something that would work would be
let buffer_fd = allocate_buffer(whatever);
let file = unsafe { File::from_raw_fd(buffer_fd) };
let mut mmap = unsafe {
    MmapOptions::new().map_mut(&file)?
};
However, FromRawFd takes ownership of the FD, which makes it fairly hard to integrate into a larger buffer management codebase.
We could switch the API from File to AsRawFd, but I'm not sure whether that change in the API would be welcome?
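As a caller-side stopgap with the File-based API, the descriptor can be viewed as a File without handing over ownership by wrapping it in ManuallyDrop. This is only a sketch for Unix targets; borrow_fd_as_file is a hypothetical helper name and is not provided by memmap2.

```rust
use std::fs::File;
use std::mem::ManuallyDrop;
use std::os::unix::io::{FromRawFd, RawFd};

/// Views `fd` as a `File` without taking ownership: the `ManuallyDrop`
/// wrapper prevents the descriptor from being closed when the view goes away.
/// Hypothetical helper, not part of memmap2.
///
/// # Safety
/// `fd` must be an open descriptor that stays open for as long as the
/// returned value is used.
unsafe fn borrow_fd_as_file(fd: RawFd) -> ManuallyDrop<File> {
    ManuallyDrop::new(unsafe { File::from_raw_fd(fd) })
}
```

The returned value derefs to File, so in the snippet above &*file could be passed to map_mut while the buffer manager keeps ownership of buffer_fd.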
I'm trying to build a Rust project on Windows 11, but it yields an error.
I've never built anything in Rust before, so I don't know what I should troubleshoot.
cargo version: cargo 1.74.1 (ecb9851af 2023-10-18)
rustup 1.26.0 (5af9b9484 2023-04-05)
info: This is the version for the rustup toolchain manager, not the rustc compiler.
info: The currently active `rustc` version is `rustc 1.74.1 (a28077b28 2023-12-04)`
error[E0433]: failed to resolve: could not find `Advice` in `memmap2`
--> src\main.rs:83:30
|
83 | mmap.advise(memmap2::Advice::HugePage).unwrap();
| ^^^^^^ could not find `Advice` in `memmap2`
error[E0599]: no method named `advise` found for struct `memmap2::Mmap` in the current scope
--> src\main.rs:83:14
|
83 | mmap.advise(memmap2::Advice::HugePage).unwrap();
| ^^^^^^ method not found in `Mmap`
Some errors have detailed explanations: E0433, E0599.
For more information about an error, try `rustc --explain E0433`.
error: could not compile `parser` (bin "parser") due to 2 previous errors
This build step doesn't work as expected because rustup does not manage rustc's default target:
- name: Install toolchain
  uses: actions-rs/toolchain@v1
  with:
    toolchain: stable
    profile: minimal
    target: ${{ matrix.target }}
    override: true
- name: Run tests
  run: cargo test
The tests fail when the target is set explicitly:
diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 380f1d0..b3e4c16 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -49,7 +49,7 @@ jobs:
override: true
- name: Run tests
- run: cargo test
+ run: cargo test --target ${{ matrix.target }}
test-macos-catalina:
runs-on: macos-10.15
@@ -82,7 +82,7 @@ jobs:
override: true
- name: Run tests
- run: cargo test
+ run: cargo test --target ${{ matrix.target }}
i686-unknown-linux-musl seems to run into a 32-bit truncation issue:
---- test::map_offset stdout ----
thread 'test::map_offset' panicked at 'attempt to subtract with overflow', src/lib.rs:179:23
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
and i686-unknown-linux-gnu fails during linking, probably because 32-bit glibc isn't installed or something.
Here's a sample run (not sure if it'll survive force-pushes): https://github.com/Mrmaxmeier/memmap2-rs/runs/3251483567 (xref #21)
I am hoping the discussion we have here could make it into the project's README eventually, so I'll try to keep it general rather than specific to my use case.
The problem: I keep returning to consider memmap2 for my use case, but I remain unsure.
The current situation is as follows:
Problem 1: There is a "primary process" which generates information that we want to keep, but keeping it all in RAM is not feasible.
Question 1: does Problem 1 have the right shape for memmap2 to be considered? If not, what is the right shape of problem for which memmap should be considered? After all, here's a non-memmap solution:
Solution 1 (file):
keep a buffer of the generated data
and when the buffer is full, flush the data via mpsc channels to a writer process. The writer process has a handle to an open file, into which it writes received data to the hard disk using <some binary format> + serde, while the primary process keeps generating data.
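To make Solution 1 concrete, here is a minimal sketch of the buffer-and-channel pattern. It uses threads rather than processes (std's mpsc channels only work within one process), writes raw bytes instead of a serde-encoded format, and the function name produce_and_write and the placeholder data are assumptions of this sketch.

```rust
use std::io::Write;
use std::sync::mpsc;
use std::thread;

/// Generates `total` bytes, batching them into buffers of `batch` bytes that
/// are handed to a writer thread over an mpsc channel. Returns the sink once
/// every batch has been written.
fn produce_and_write<W: Write + Send + 'static>(mut sink: W, total: usize, batch: usize) -> W {
    let (tx, rx) = mpsc::channel::<Vec<u8>>();

    // Writer thread: drains the channel until the sender is dropped.
    let writer = thread::spawn(move || {
        for chunk in rx {
            sink.write_all(&chunk).expect("write failed");
        }
        sink
    });

    // Primary thread: keeps generating while earlier batches are written.
    let mut buf = Vec::with_capacity(batch);
    for i in 0..total {
        buf.push((i % 256) as u8); // placeholder for real generated data
        if buf.len() == batch {
            let full = std::mem::replace(&mut buf, Vec::with_capacity(batch));
            tx.send(full).expect("writer hung up");
        }
    }
    if !buf.is_empty() {
        tx.send(buf).expect("writer hung up");
    }
    drop(tx); // closes the channel so the writer loop ends

    writer.join().expect("writer panicked")
}
```

In a real program the sink would be a File (possibly behind a BufWriter); the channel here is unbounded, so a slow disk shows up as growing memory use rather than as the producer blocking.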
On the other hand, here is a memmap2 solution:
Solution 2 (memmap):
Question 2: Is the following true? "The benefit of Solution 2 (memmap) over Solution 1 (file) is that we do not have to deal with the overhead of inter-thread communication. Put differently, the primary process does not have to wait for a buffer flush + send to complete before continuing to generate data."
Question 3: Does Solution 2 make sense if you have a hard disk with slower write speed than the rate at which the primary process generates data?
One could also imagine the following solution:
Solution 3 (memmap, parallel):
keep a buffer of the generated data
and have a main, memory mapped structure which will hold all the generated data
this main memory mapped structure is kept by a writer process, which is sent information by the primary process using mpsc channels, which it then "appends" to the data in the memory mapped structure it is holding
Question 4: Is the following statement true? "An advantage of Solution 3 is that if we have a hard disk with slower write speed than the rate at which the primary process is generating data, then Solution 3 essentially covers up this issue and replaces the cost instead with that of waiting for a buffer flush + send to complete."
Question 5: Is the following statement true? "The main benefit of memory mapping is to avoid the cost of <binary format> encoding/decoding."
(Thank you for your time.)
I guess I'm asking if Mmap invalidates the buffer when it is dropped.
The docs for MmapRaw::as_ptr() and MmapRaw::as_mut_ptr() both say:
Safety
To safely dereference this pointer, you need to make sure that the file has not been truncated since the memory map was created.
However, this isn't true in the Rust sense of the word safety: if the file is truncated, the memory mapping itself is not changed. So accessing that memory will simply cause the process to be killed with SIGBUS(1) or equivalent. That's not actually a memory safety issue as no memory will actually be read or written.
To avoid confusion, I think it'd be good to keep this warning, but rename the section to something else and describe why truncation isn't directly a memory safety issue.
I'm wondering if there's a reason for the fork, and upon what basis I should choose to use memmap v1 versus v2?
Hey, I was trying to build my crate (tokei), which depends deep down on memmap; however, memmap2's MmapInner currently doesn't compile on WebAssembly platforms. Obviously memory maps don't work on WebAssembly, but it would be nice if it still compiled and failed at runtime instead, like std's File.
Would it be worth it to update to the 2021 edition and some later MSRV? 1.36 is 4 years old... seems rather dated.
On a Mac with an M1 processor, reusing an anonymous memory map can execute the previous code, resulting in seemingly-impossible failure modes.
Here's a minimal reproduction case, alternating between
mov x0 #1
ret
or
mov x0 #2
ret
and asserting that they return the correct value:
fn main() {
    let mov_x0_1: u32 = 0x200080D2; // MOV x0, #1
    let mov_x0_2: u32 = 0x400080D2; // MOV x0, #2
    let ret: u32 = 0xC0035FD6; // RET
    let mut mmap = memmap2::MmapMut::map_anon(8).unwrap();
    mmap[4..8].copy_from_slice(&ret.to_be_bytes());
    for i in 0..100 {
        // Build and execute a function that should return 1
        mmap[0..4].copy_from_slice(&mov_x0_1.to_be_bytes());
        let mmap_exec = mmap.make_exec().unwrap();
        let f: unsafe extern "C" fn() -> u32 = unsafe {
            std::mem::transmute(mmap_exec.as_ptr())
        };
        assert_eq!(unsafe { f() }, 1);
        mmap = mmap_exec.make_mut().unwrap();
        // Build and execute a function that should return 2
        mmap[0..4].copy_from_slice(&mov_x0_2.to_be_bytes());
        let mmap_exec = mmap.make_exec().unwrap();
        let f: unsafe extern "C" fn() -> u32 = unsafe {
            std::mem::transmute(mmap_exec.as_ptr())
        };
        assert_eq!(unsafe { f() }, 2);
        mmap = mmap_exec.make_mut().unwrap();
    }
}
This fails because the icache and dcache are separated, so the modified program written to the buffer will not be used during evaluation. ARM has a 2013 blog post explaining the issue, which makes me suspect this is not unique to the M1 and will be an issue on any ARM processor.
On macOS, this cache is flushed by sys_icache_invalidate. For reference, here's a fixed version of this program:
use std::ffi::c_void;

#[link(name = "c")]
extern "C" {
    fn sys_icache_invalidate(start: *const std::ffi::c_void, size: usize);
}

fn main() {
    let mov_x0_1: u32 = 0x200080D2; // MOV x0, #1
    let mov_x0_2: u32 = 0x400080D2; // MOV x0, #2
    let ret: u32 = 0xC0035FD6; // RET
    let mut mmap = memmap2::MmapMut::map_anon(8).unwrap();
    mmap[4..8].copy_from_slice(&ret.to_be_bytes());
    for i in 0..100 {
        // Build and execute a function that should return 1
        mmap[0..4].copy_from_slice(&mov_x0_1.to_be_bytes());
        unsafe {
            sys_icache_invalidate(mmap.as_ptr() as *const c_void, mmap.len())
        };
        let mmap_exec = mmap.make_exec().unwrap();
        let f: unsafe extern "C" fn() -> u32 =
            unsafe { std::mem::transmute(mmap_exec.as_ptr()) };
        assert_eq!(unsafe { f() }, 1);
        mmap = mmap_exec.make_mut().unwrap();
        // Build and execute a function that should return 2
        mmap[0..4].copy_from_slice(&mov_x0_2.to_be_bytes());
        unsafe {
            sys_icache_invalidate(mmap.as_ptr() as *const c_void, mmap.len())
        };
        let mmap_exec = mmap.make_exec().unwrap();
        let f: unsafe extern "C" fn() -> u32 =
            unsafe { std::mem::transmute(mmap_exec.as_ptr()) };
        assert_eq!(unsafe { f() }, 2);
        mmap = mmap_exec.make_mut().unwrap();
    }
}
I'm not totally sure of the right fix here!
Adding a call to sys_icache_invalidate (and the Linux equivalent) will fix the issue, but that is not the recommended way of doing W^X on macOS:
Porting Just-In-Time Compilers to Apple Silicon suggests creating the region with PROT_WRITE, PROT_EXEC, and the MAP_JIT flag, then using pthread_jit_write_protect_np to modify write-protection on a per-thread (?!) basis.
This would be a significant change from the common unix codebase!
Hi, I got this bug report: zkat/cacache-rs#32
It seems to be a regression from memmap and I'm not really sure what's going on.
The Windows implementation has a problem when trying to create a mmap for an empty file.
Normally, Windows creates a mmap that is aligned in memory. But there's a catch. An empty file gets a pointer to address 1 (0x00000001), which is not aligned.
In Qdrant this was problematic, because we always expected an aligned mmap. We therefore handle this edge case, returning a dangling pointer that is aligned. This is fine because there's no data at the pointer, the byte slice is empty.
Can I submit a PR to implement a similar fix; return an aligned pointer on Windows? Or is the current behavior desirable, and should I not expect to get an aligned pointer?
I'm unsure why this happens on Windows, and couldn't find anything about this in the Windows API documentation. I assume it is an error code, where returning a null pointer is not an option so it picks the next address. This is very easily reproducible however.
This only happens on Windows. Unix-like platforms don't have this problem and return an aligned pointer.
use memmap2::Mmap;
use std::fs::File;

fn main() {
    let file = File::options()
        .read(true)
        .write(true)
        .create(true)
        .open("./empty-file.txt")
        .unwrap();
    file.set_len(0).unwrap();
    let mmap = unsafe { Mmap::map(&file).unwrap() };
    println!("Mmap pointer: {:?}", mmap.as_ptr());
}
# Windows
$ cargo run
Mmap pointer: 0x000000000001
# Linux, macOS
$ cargo run
Mmap pointer: 0x7f51653c1000
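The workaround described above (substituting an aligned dangling pointer when the mapping is empty) could look roughly like the following. This is a sketch, not Qdrant's or memmap2's actual code; aligned_data_ptr is a hypothetical name and the alignment is passed in by the caller.

```rust
/// Returns `ptr` unchanged for non-empty mappings; for an empty mapping
/// returns a non-null dangling pointer aligned to `align` instead.
/// `align` must be a non-zero power of two (for example the page size).
/// Hypothetical helper, not part of memmap2.
fn aligned_data_ptr(ptr: *const u8, len: usize, align: usize) -> *const u8 {
    assert!(align.is_power_of_two());
    if len == 0 {
        // Never dereferenced: a zero-length slice only needs a non-null,
        // suitably aligned address.
        align as *const u8
    } else {
        ptr
    }
}
```

Since the resulting slice has length zero, no memory behind the substituted address is ever read or written.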
Related: qdrant/qdrant#1873
CC: @Jesse-Bakker
From my mac m1:
$ rustup target add x86_64-unknown-freebsd
$ cargo check --target x86_64-unknown-freebsd
Checking libc v0.2.151
Checking memmap2 v0.9.1 (/Users/marcoieni/tmp/memmap2-rs)
error[E0425]: cannot find value `MAP_HUGETLB` in crate `libc`
--> src/unix.rs:30:40
|
30 | const MAP_HUGETLB: libc::c_int = libc::MAP_HUGETLB;
| ^^^^^^^^^^^ help: a constant with a similar name exists: `MFD_HUGETLB`
|
::: /Users/marcoieni/.cargo/registry/src/index.crates.io-6f17d22bba15001f/libc-0.2.151/src/unix/bsd/freebsdlike/freebsd/mod.rs:4503:1
|
4503 | pub const MFD_HUGETLB: ::c_uint = 0x00000004;
| ------------------------------- similarly named constant `MFD_HUGETLB` defined here
error[E0425]: cannot find value `MAP_HUGE_MASK` in crate `libc`
--> src/unix.rs:33:42
|
33 | const MAP_HUGE_MASK: libc::c_int = libc::MAP_HUGE_MASK;
| ^^^^^^^^^^^^^ help: a constant with a similar name exists: `MFD_HUGE_MASK`
|
::: /Users/marcoieni/.cargo/registry/src/index.crates.io-6f17d22bba15001f/libc-0.2.151/src/unix/bsd/freebsdlike/freebsd/mod.rs:4504:1
|
4504 | pub const MFD_HUGE_MASK: ::c_uint = 0xFC000000;
| --------------------------------- similarly named constant `MFD_HUGE_MASK` defined here
error[E0425]: cannot find value `MAP_HUGE_SHIFT` in crate `libc`
--> src/unix.rs:36:43
|
36 | const MAP_HUGE_SHIFT: libc::c_int = libc::MAP_HUGE_SHIFT;
| ^^^^^^^^^^^^^^ not found in `libc`
For more information about this error, try `rustc --explain E0425`.
error: could not compile `memmap2` (lib) due to 3 previous errors
I got the same issue here.
One weird thing: I added the freebsd check in my CI and it's compiling there :/
I want to create a thread-local memory map (could be file-based or anonymous, I'd prefer anonymous) in each of N threads where the first M bytes of each thread-local mapping are mapped to the same file (which is itself mapped into memory). I know this allocation scheme is possible using plain mmap in Linux but I can't find a way to do it in memmap2-rs. That is, there is no way to map the buffer of a MmapMut into the buffer of another Mmap* struct.
Am I wrong to believe this or is it truly impossible under the current API?
Most other ffi wrappers provide a safe interface. Why is this unsafe?
.populate() is currently ignored for map_anon() mappings. Linux supports it and it's useful, so this crate should as well. Currently I have to work around this by mapping /dev/zero.
madvise() seems to support specifying a memory range, but the current implementation cannot specify a memory range.
Is there any reason for this?
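For context, a range-based call has to hand the kernel a page-aligned start address. The sketch below shows one way the arguments could be computed; `page_align_range` is a hypothetical helper, not an existing memmap2 function, and the page size is passed in rather than queried from the system.

```rust
/// Hypothetical helper: widens `(offset, len)` so that the start is
/// page-aligned, as `madvise` requires for its address argument.
/// Returns `(aligned_offset, adjusted_len)`.
fn page_align_range(offset: usize, len: usize, page_size: usize) -> (usize, usize) {
    assert!(page_size.is_power_of_two());
    // Round the start down to the beginning of its page.
    let aligned_offset = offset & !(page_size - 1);
    // Grow the length by the bytes added in front of the original start.
    let adjusted_len = len + (offset - aligned_offset);
    (aligned_offset, adjusted_len)
}
```

With a 4096-byte page, a request for 100 bytes at offset 5000 becomes 1004 bytes at offset 4096.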
I'm writing a hex editor, and I want to add memory mapped file support to it.
Writable memory maps write changes directly into a file, which is not ideal, since the user might not want to save their changes.
The changes should only be written when the user saves.
I discovered that memmap2-rs supports copy-on-write memory maps, which sounds ideal for my use case.
However, I'm unsure how to safely write the changes back to the file on save.
Calling .flush() does not write the changed data back to the file.
I see two options, both of which involve keeping track of the parts of the file that changed:
- [...] File handle to write back the changes right from the mmap.
- [...] File handle
Which one should I use? Or is there a better alternative?
Thank you, and sorry for using issues for questions, but I couldn't find good resources on this, and I hope there are some mmap experts here that can help answer.
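A rough sketch of the "track the changed parts and write them back through a File handle" idea, under stated assumptions: `data` stands in for the bytes of the copy-on-write mapping, `dirty` is a list of modified byte ranges that the editor maintains itself, and `write_back` is a hypothetical name. Whether an existing private mapping observes such writes afterwards is platform-dependent, so remapping after a save may be necessary.

```rust
use std::fs::File;
use std::io::{self, Seek, SeekFrom, Write};
use std::ops::Range;

/// Writes only the modified byte ranges of `data` back to `file`.
/// `data` stands in for the contents of a copy-on-write mapping.
fn write_back(file: &mut File, data: &[u8], dirty: &[Range<usize>]) -> io::Result<()> {
    for range in dirty {
        // Position the file cursor at the start of the changed region.
        file.seek(SeekFrom::Start(range.start as u64))?;
        // Copy just that region from the in-memory view to the file.
        file.write_all(&data[range.clone()])?;
    }
    // Ask the OS to persist the written data before reporting success.
    file.sync_data()
}
```

The file has to be opened with write access, and the dirty ranges must lie within `data`, otherwise the slice indexing panics.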
This is a direct consequence of this Rust libc issue. On Rust targets using 32-bit glibc, libc::off_t is 32 bits wide, which breaks this code if you are trying to map beyond 4 GB in a file:
let ptr = libc::mmap(
    ptr::null_mut(),
    map_len as libc::size_t,
    prot,
    flags,
    file,
    aligned_offset as libc::off_t,
);
I had a similar issue with the nix crate, where posix_fallocate() also relied on libc::off_t for file sizes. The fix was to use the rustix crate, which calls the system call directly. Maybe a similar workaround could be applied here?
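To illustrate the failure mode, here is a small sketch that models a 32-bit off_t as `i32` (an assumption for illustration; the real type comes from the libc crate). The function names are hypothetical. The `as` cast silently keeps only the low 32 bits, whereas a checked conversion would at least surface the problem.

```rust
use std::convert::TryFrom;

/// Mimics `offset as libc::off_t` on a target where `off_t` is 32 bits wide
/// (modelled here as `i32`, an assumption for illustration).
fn cast_offset_unchecked(offset: u64) -> i32 {
    offset as i32 // silently keeps only the low 32 bits
}

/// Checked alternative: refuses offsets that do not fit in the narrow type.
fn cast_offset_checked(offset: u64) -> Option<i32> {
    i32::try_from(offset).ok()
}
```

An offset of 5 GiB passes through the unchecked cast as 1 GiB, so the mapping would silently start at the wrong place, while the checked version returns `None`.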
Hello! Thank you for a very useful crate!
Before Rust I wrote Python. Python has its own mmap implementation, and there I can work with an mmap as if it were a file.
I propose implementing the traits std::io::{Read, Write, Seek}. If nobody minds, I'm ready to create a PR.
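As a point of comparison, the standard library's `std::io::Cursor` already provides `Read`, `Write` and `Seek` over byte slices, so file-like access is possible for anything that dereferences to `[u8]`. The sketch below uses plain slices as a stand-in for a mapping; `read_at` and `write_at` are hypothetical helper names.

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom, Write};

/// Reads `len` bytes starting at `pos`, treating `bytes` like a file.
fn read_at(bytes: &[u8], pos: u64, len: usize) -> io::Result<Vec<u8>> {
    let mut cursor = Cursor::new(bytes);
    cursor.seek(SeekFrom::Start(pos))?;
    let mut out = vec![0u8; len];
    cursor.read_exact(&mut out)?;
    Ok(out)
}

/// Overwrites bytes starting at `pos`; fails if the write would run past the end.
fn write_at(bytes: &mut [u8], pos: u64, data: &[u8]) -> io::Result<()> {
    let mut cursor = Cursor::new(bytes);
    cursor.seek(SeekFrom::Start(pos))?;
    cursor.write_all(data)
}
```

Because a fixed-size slice cannot grow, writes past the end return an error instead of extending the buffer, which differs from how a regular file behaves.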
I'm the maintainer of the package for this crate in Fedora Linux, and with the update to version 0.9.0, I noticed a new test failure on powerpc64le compared to v0.7.1:
---- test::advise_writes_unsafely_to_part_of_map stdout ----
thread 'test::advise_writes_unsafely_to_part_of_map' panicked at 'assertion failed: `(left == right)`
left: `0`,
right: `255`', src/lib.rs:1978:9
The error message points to this assertion:
https://github.com/RazrFalcon/memmap2-rs/blob/v0.9.0/src/lib.rs#L1978
I don't know how or why powerpc64le is different here exactly ... the only thing I can think of right now is that the default page size is 64KB on powerpc64le, while it's 4KB on most other architectures.
All tests pass on our other supported architectures ({x86_64,i686,aarch64,s390x}-unknown-linux-gnu).
Test environment:
I think this only applies to the read-only version, but advise can conceptually perform a write and should require mutable access. Here's a simple reproducer.
use memmap2::*;

fn main() {
    let mut a = MmapMut::map_anon(4096).unwrap();
    a.as_mut().fill(255);
    let a = a.make_read_only().unwrap();
    bar(&a, a.as_ref());
}

fn bar(map: &Mmap, slic: &[u8]) {
    let a = slic[0];
    map.advise(Advice::DontNeed);
    let b = slic[0];
    println!("{} {}", a, b);
}
In debug, this prints out 255 0, as DontNeed has freed the pages. In release, this prints out 255 255. Needless to say, this program has UB because it changes the memory an immutable reference points to while it is active.
First, thank you for keeping alive this very useful crate.
I was wondering if there is any plan to support async_std types, especially in MmapOptions.
As mentioned in the doc https://docs.rs/async-std/1.10.0/async_std/fs/index.html those types are just the async version of the ones in the standard library, so I guess it should be quite easy.
Thanks
I posted a stackoverflow question -- looking for some help on how to use this library to write to a file from multiple threads without any synchronization. Re-posting it here for visibility in case the author (thx!) or someone from the wonderful community can help.
I need to create a 40+ GB file using multiple threads. The file is used as a giant vector of u64 values. Threads do not need any kind of synchronization -- each thread's output will be unique to that thread, but each thread does NOT get its own slice. Rather, the nature of the data ensures each thread will generate a set of unique positions in the file to write to. Simple example -- each thread writes to a position [ind / thread_count], where ind goes to millions. For thread_count = 2, one thread writes to odd positions, and the other to even.
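A minimal sketch of the interleaved-write pattern described above, under stated assumptions: a plain `&mut [u64]` stands in for the mapped file, and `SharedPtr` and `fill_interleaved` are hypothetical names. Each thread writes only the indices congruent to its own number modulo the thread count, so no two threads ever touch the same element.

```rust
use std::thread;

/// Raw pointer wrapper so the base address can be shared across threads.
#[derive(Clone, Copy)]
struct SharedPtr(*mut u64);

// SAFETY: callers guarantee that no two threads write the same index.
unsafe impl Send for SharedPtr {}
unsafe impl Sync for SharedPtr {}

impl SharedPtr {
    /// Writes `value` at element `index`.
    ///
    /// # Safety
    /// `index` must be in bounds and written by at most one thread.
    unsafe fn write(self, index: usize, value: u64) {
        unsafe { self.0.add(index).write(value) };
    }
}

/// Fills `buf[i]` with `2 * i`, splitting the indices across threads by `i % thread_count`.
fn fill_interleaved(buf: &mut [u64], thread_count: usize) {
    assert!(thread_count > 0);
    let len = buf.len();
    let base = SharedPtr(buf.as_mut_ptr());
    // Scoped threads are joined before `scope` returns, so `buf` outlives them.
    thread::scope(|s| {
        for t in 0..thread_count {
            s.spawn(move || {
                for i in (t..len).step_by(thread_count) {
                    // SAFETY: `i < len`, and only thread `t` visits this index.
                    unsafe { base.write(i, i as u64 * 2) };
                }
            });
        }
    });
}
```

The soundness of this approach rests entirely on the index sets being disjoint; if the data cannot guarantee that, atomics or per-thread chunks would be the safer choice.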
The API currently allows us to madvise memory maps. Would there be any interest in also providing mbind functionality?
Hi! I have a use case in which I would like to use mremap(2) to resize a memory mapping created using this crate. I would be happy to submit a PR to implement this but figured that I would open an issue first.
A common pattern with mmap is to provide multiple advice at once:
madvise(addr, size, MADV_WILLNEED | MADV_SEQUENTIAL);
This is not currently possible with the Advice implementation. Would you accept a PR overloading the Or operator to provide equivalent functionality?
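One caveat worth checking before adding such an operator: on Linux the common advice values are consecutive integers rather than independent bit flags, so OR-ing them may not combine anything. The sketch below hard-codes the Linux values of the two constants (other platforms may differ) and shows what the bitwise OR from the C snippet evaluates to; `combined_advice` is a hypothetical name.

```rust
// Linux values from <sys/mman.h>; other platforms may use different numbers.
const MADV_SEQUENTIAL: i32 = 2;
const MADV_WILLNEED: i32 = 3;

/// What a bitwise OR of the two values would actually pass to the kernel.
fn combined_advice() -> i32 {
    MADV_WILLNEED | MADV_SEQUENTIAL
}
```

With these values the result equals `MADV_WILLNEED` alone, so the sequential hint is lost. Issuing one advise call per hint is an alternative that does not depend on the numeric encoding.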