Rustify Snapshot Module about firecracker HOT 7 OPEN

zulinx86 commented on June 5, 2024
Rustify Snapshot Module

from firecracker.

Comments (7)

zulinx86 commented on June 5, 2024

Hello @beamandala 👋 Yes, feel free to work on this issue! Thank you so much for your interest in this issue!

roypat commented on June 5, 2024

Hi @beamandala,
sorry for the late response, it was holiday season where our team is based. The general structure of what you posted above looks pretty much exactly like what I had in mind. My only feedback right now is that the check for the magic value of the header should probably be moved into SnapshotHdr::deserialize, together with the version check. Please also feel free to open a draft PR if you want some more early feedback! :)

I also think @ShadowCurse's comment about specializing the generics to &[u8] and Vec<u8> is valid, although we can also keep that as a separate issue/PR (I'm also not 100% convinced on the "presenting mmap as &[u8]", that sounds like a slippery slope to undefined behavior)
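
For illustration, the suggestion about validating the magic value and the version in one place could look roughly like the sketch below. It is not the Firecracker implementation: it avoids serde/bincode so that it is self-contained, and the 12-byte little-endian layout, the value of SNAPSHOT_MAGIC_ID, the SUPPORTED_MAJOR constant, the major/minor fields, and the InvalidFormatVersion variant are all assumptions made up for this example.

```rust
use std::io::Read;

// Hypothetical constants, chosen only for this sketch.
const SNAPSHOT_MAGIC_ID: u64 = 0x0123_4567_89AB_CDEF;
const SUPPORTED_MAJOR: u16 = 1;

#[derive(Debug, PartialEq)]
enum SnapshotError {
    Io(String),
    InvalidMagic(u64),
    // Hypothetical variant for an unsupported major version.
    InvalidFormatVersion(u16),
}

/// Simplified header: a magic value followed by a major/minor version pair.
#[derive(Debug, PartialEq)]
struct SnapshotHdr {
    magic: u64,
    major: u16,
    minor: u16,
}

fn read_u64<R: Read>(reader: &mut R) -> Result<u64, SnapshotError> {
    let mut buf = [0u8; 8];
    reader
        .read_exact(&mut buf)
        .map_err(|err| SnapshotError::Io(err.to_string()))?;
    Ok(u64::from_le_bytes(buf))
}

fn read_u16<R: Read>(reader: &mut R) -> Result<u16, SnapshotError> {
    let mut buf = [0u8; 2];
    reader
        .read_exact(&mut buf)
        .map_err(|err| SnapshotError::Io(err.to_string()))?;
    Ok(u16::from_le_bytes(buf))
}

impl SnapshotHdr {
    /// Reads a header and rejects it unless both the magic value and the
    /// major version match, so callers never see an unvalidated header.
    fn deserialize<R: Read>(reader: &mut R) -> Result<Self, SnapshotError> {
        let magic = read_u64(reader)?;
        if magic != SNAPSHOT_MAGIC_ID {
            return Err(SnapshotError::InvalidMagic(magic));
        }
        let major = read_u16(reader)?;
        let minor = read_u16(reader)?;
        if major != SUPPORTED_MAJOR {
            return Err(SnapshotError::InvalidFormatVersion(major));
        }
        Ok(Self { magic, major, minor })
    }
}
```

With the checks inside the header type, a function like Snapshot::load_unchecked would only need to call SnapshotHdr::deserialize and could drop its own comparison against SNAPSHOT_MAGIC_ID.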

beamandala commented on June 5, 2024

Hi, can I work on this issue?

beamandala commented on June 5, 2024

@zulinx86 I've refactored the snapshot module code based on the pattern provided and wanted to check in with you to make sure I'm heading in the right direction. Let me know if there's anything I need to change or if everything looks good.

/// Firecracker snapshot header
#[derive(Debug, Serialize, Deserialize)]
struct SnapshotHdr {
    /// magic value
    magic: u64,
    /// Snapshot data version
    version: Version,
}

impl SnapshotHdr {
    fn new(version: Version) -> Self {
        Self {
            magic: SNAPSHOT_MAGIC_ID,
            version,
        }
    }

    fn load<R: Read>(reader: &mut R) -> Result<Self, SnapshotError> {
        let hdr: SnapshotHdr = deserialize(reader)?;

        Ok(hdr)
    }

    fn store<W: Write>(&self, writer: &mut W) -> Result<(), SnapshotError> {
        serialize(writer, self)
    }
}

#[derive(Debug, Serialize, Deserialize)]
pub struct Snapshot<Data> {
    // The snapshot version we can handle
    version: Version,
    data: Data,
}

/// Helper function to serialize an object to a writer
pub fn serialize<T, O>(writer: &mut T, data: &O) -> Result<(), SnapshotError>
where
    T: Write,
    O: Serialize + Debug,
{
    bincode::serialize_into(writer, data).map_err(|err| SnapshotError::Serde(err.to_string()))
}

/// Helper function to deserialize an object from a reader
pub fn deserialize<T, O>(reader: &mut T) -> Result<O, SnapshotError>
where
    T: Read,
    O: DeserializeOwned + Debug,
{
    // flags below are those used by default by bincode::deserialize_from, plus `with_limit`.
    bincode::DefaultOptions::new()
        .with_limit(VM_STATE_DESERIALIZE_LIMIT)
        .with_fixint_encoding()
        // `allow_trailing_bytes` is needed because the header and the snapshot are deserialized
        // from the same file, so trailing bytes remain after the header has been read.
        .allow_trailing_bytes()
        .deserialize_from(reader)
        .map_err(|err| SnapshotError::Serde(err.to_string()))
}

impl<Data: for<'a> Deserialize<'a>> Snapshot<Data> {
    pub fn load_unchecked<R: Read>(reader: &mut R) -> Result<Self, SnapshotError>
    where
        Data: DeserializeOwned + Debug,
    {
        let hdr: SnapshotHdr = deserialize(reader)?;
        if hdr.magic != SNAPSHOT_MAGIC_ID {
            return Err(SnapshotError::InvalidMagic(hdr.magic));
        }

        let data: Data = deserialize(reader)?;
        Ok(Self {
            version: hdr.version,
            data,
        })
    }

    pub fn load<R: Read>(reader: &mut R, snapshot_len: usize) -> Result<Self, SnapshotError>
    where
        Data: DeserializeOwned + Debug,
    {
        let mut crc_reader = CRC64Reader::new(reader);

        // Fail-fast if the snapshot length is too small
        let raw_snapshot_len = snapshot_len
            .checked_sub(std::mem::size_of::<u64>())
            .ok_or(SnapshotError::InvalidSnapshotSize)?;

        // Read everything apart from the CRC.
        let mut snapshot = vec![0u8; raw_snapshot_len];
        crc_reader
            .read_exact(&mut snapshot)
            .map_err(|ref err| SnapshotError::Io(err.raw_os_error().unwrap_or(libc::EINVAL)))?;

        // Since the reader updates the checksum as bytes are being read from it, the order of
        // these 2 statements is important: we first get the checksum computed on the read bytes,
        // then read the stored checksum.
        let computed_checksum = crc_reader.checksum();
        let stored_checksum: u64 = deserialize(&mut crc_reader)?;
        if computed_checksum != stored_checksum {
            return Err(SnapshotError::Crc64(computed_checksum));
        }

        // `&[u8]` implements `Read`, so an immutable view of the buffer is enough here.
        let mut snapshot_slice: &[u8] = snapshot.as_slice();
        Self::load_unchecked(&mut snapshot_slice)
    }
}

impl<Data: Serialize + Debug> Snapshot<Data> {
    pub fn save<W: Write>(&self, mut writer: &mut W) -> Result<(), SnapshotError> {
        // Write magic value and snapshot version
        serialize(&mut writer, &SnapshotHdr::new(self.version.clone()))?;
        // Write data
        serialize(&mut writer, &self.data)
    }

    pub fn save_with_crc<W: Write>(&self, writer: &mut W) -> Result<(), SnapshotError> {
        let mut crc_writer = CRC64Writer::new(writer);
        self.save(&mut crc_writer)?;

        // Now write CRC value
        let checksum = crc_writer.checksum();
        serialize(&mut crc_writer, &checksum)
    }
}

zulinx86 commented on June 5, 2024

@roypat @bchalios Could you please check if the above code looks legit to you?

ShadowCurse commented on June 5, 2024

I have been looking through our usage of snapshots and I think we can remove the Write/Read traits and use a plain old &[u8] for load and just return a Vec<u8> for save. This will simplify the loading/saving logic, because we will be working with known types, and it will remove the need for CRC64Reader/CRC64Writer. We also will not need a snapshot_len parameter for load, as we will be able to pass a slice with a known size.
Additionally, if we play it smart, we can avoid a memcpy of the snapshot data during the loading stage (the crc_reader.read_exact(&mut snapshot) part) if we mmap the file so it can be presented as &[u8].
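
A minimal sketch of what the slice-based variant could look like, under several assumptions: the function names save_with_crc and load_checked are hypothetical, the payload stands in for the already serialized header and data, and the checksum is a simple bitwise CRC-64 (ECMA-182 polynomial) written inline so the example needs no crates. It may not match the CRC variant the existing CRC64Reader/CRC64Writer compute.

```rust
/// Stand-in checksum: bitwise CRC-64 with the ECMA-182 polynomial,
/// zero initial value, processed most significant bit first.
fn crc64(bytes: &[u8]) -> u64 {
    const POLY: u64 = 0x42F0_E1EB_A9EA_3693;
    let mut crc = 0u64;
    for &byte in bytes {
        crc ^= u64::from(byte) << 56;
        for _ in 0..8 {
            crc = if crc & (1u64 << 63) != 0 {
                (crc << 1) ^ POLY
            } else {
                crc << 1
            };
        }
    }
    crc
}

#[derive(Debug, PartialEq)]
enum SnapshotError {
    InvalidSnapshotSize,
    Crc64(u64),
}

/// Returns `payload` followed by its checksum as 8 little-endian bytes.
fn save_with_crc(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(payload.len() + 8);
    out.extend_from_slice(payload);
    out.extend_from_slice(&crc64(payload).to_le_bytes());
    out
}

/// Verifies the trailing checksum and returns the payload as a sub-slice
/// of the input, so no copy of the snapshot data is made.
fn load_checked(snapshot: &[u8]) -> Result<&[u8], SnapshotError> {
    // The slice carries its own length, so no `snapshot_len` parameter is needed.
    let payload_len = snapshot
        .len()
        .checked_sub(8)
        .ok_or(SnapshotError::InvalidSnapshotSize)?;
    let (payload, stored) = snapshot.split_at(payload_len);

    let mut stored_bytes = [0u8; 8];
    stored_bytes.copy_from_slice(stored);
    let stored_checksum = u64::from_le_bytes(stored_bytes);

    let computed_checksum = crc64(payload);
    if computed_checksum != stored_checksum {
        return Err(SnapshotError::Crc64(computed_checksum));
    }
    Ok(payload)
}
```

Because the checksum is computed over a slice that is already in memory, the ordering concern from the reader-based version (compute first, then read the stored value) disappears; the two values come from disjoint halves of the same slice.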

ShadowCurse commented on June 5, 2024

@roypat The mmap thing is basically:

  • open snapshot file
  • get the length (file.metadata().size())
  • mmap the file with correct length

To keep this within Rust's safety rules, we can create a wrapper that is constructed from a &File, so that it cannot outlive the borrow of the file itself. Snippet of a potential solution:

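use std::fs::File;
use std::ops::Deref;
use std::os::unix::io::AsRawFd;

struct FileBytes<'a> {
  slice: &'a [u8],
}

impl<'a> FileBytes<'a> {
  fn new(file: &'a File) -> std::io::Result<Self> {
    let length = file.metadata()?.len() as usize;
    // Safe as file fd and length are valid.
    // PROT_READ and MAP_PRIVATE are assumptions; the original snippet elided these arguments.
    let ptr = unsafe {
      libc::mmap(std::ptr::null_mut(), length, libc::PROT_READ, libc::MAP_PRIVATE, file.as_raw_fd(), 0)
    };
    if ptr == libc::MAP_FAILED {
      return Err(std::io::Error::last_os_error());
    }
    // Safe as we just created a mapping. This just converts it to
    // more convenient Rust type
    let slice = unsafe { std::slice::from_raw_parts(ptr.cast::<u8>(), length) };
    Ok(Self { slice })
  }
}

// For convenient usage
impl<'a> Deref for FileBytes<'a> {
  type Target = [u8];

  fn deref(&self) -> &[u8] {
    self.slice
  }
}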
