Coder Social home page Coder Social logo

Comments (6)

coldfunction avatar coldfunction commented on June 12, 2024 1

I find a possible way to eliminate the gfn assert.

If we checked the dirty page tracked in each epoch by kvmft_assert_ram_hash_and_dlist, we would find that the gfn assert occurs.
The scenario is that the VM boot until entering login screen (don’t do anything), and then open FT mode. Ensure the system is already in FT mode, then type login account and password. At this time, the gfn assert will occur.

To solve this issue, the virtio-blk.c I changed below:

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index c019d2d..848eb1a 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -218,7 +218,7 @@ static void virtio_blk_req_complete(VirtIOBlockReq *req, unsigned char status)
    }

    stb_p(&req->in->status, status);
-    virtqueue_push(req->vq, &req->elem, req->in_len);
+    virtqueue_push(req->vq, &req->elem, iov_size(req->elem.in_sg, req->elem.in_num));
    if (s->dataplane_started && !s->dataplane_disabled) {
        virtio_blk_data_plane_notify(s->dataplane, req->vq);
    } else {

Since the req->in_len is always 0 in virtio_blk_handle_request when entering FT mode in master node, it will cause the incorrect result from virtqueue_push.

However, clear the gfn assert away, but the sector_num and type problems are still here.

from cuju.

PJTPJT avatar PJTPJT commented on June 12, 2024 1

@coldfunction Great!
But I think we need add this function to confirm_req_read_memory_mapped

--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -45,6 +45,7 @@ static void confirm_req_read_memory_mapped(VirtIOBlockReq *req)
         req->in = (void *)in_iov[in_num - 1].iov_base
               + in_iov[in_num - 1].iov_len
               - sizeof(struct virtio_blk_inhdr);
+                req->in_len = iov_size(in_iov, in_num);
     } else {
         printf("%s %p already mapped\n", __func__, req);
         abort();

Because we set req->in_len = 0 on virtio_blk_handle_request function
So we need restore it on confirm_req_read_memory_mapped function

from cuju.

PJTPJT avatar PJTPJT commented on June 12, 2024 1

vring_desc_read

static void vring_desc_read(VirtIODevice *vdev, VRingDesc *desc,
                            hwaddr desc_pa, int i)
{       
    address_space_read(&address_space_memory, desc_pa + i * sizeof(VRingDesc), MEMTXATTRS_UNSPECIFIED, (void *)desc, sizeof(VRingDesc));

Master:
vdev = 0x7ffff50e2410, desc = 0x7fffffff7c30, desc_pa = 899092480, i = 52
caller : virtqueue_pop
&desc->addr = 8896512 &desc->len= 912 &desc->flags= 4 &desc->next= 20 
vdev = 0x7ffff50e2410, desc = 0x7fffffff7c30, desc_pa = 8896512, i = 0
caller : virtqueue_pop
&desc->addr = 899692216 &desc->len= 16 &desc->flags= 1 &desc->next= 1

Slave:
virtio_blk_load_device handle head 3/30: 52
vdev = 0x7ffff50e2410, desc = 0x7fffffff78e0, desc_pa = 899092480, i = 52
caller : virtqueue_get
&desc->addr = 8896512 &desc->len= 912 &desc->flags= 4 &desc->next= 20 
vdev = 0x7ffff50e2410, desc = 0x7fffffff78e0, desc_pa = 8896512, i = 0
caller : virtqueue_get
&desc->addr = 18446612132323116032 &desc->len= 16 &desc->flags= 1 &desc->next= 1

Same desc_pa i and desc is buffer, why the output &desc->addr is different?
The only difference is address_space_memory?

from cuju.

sklin avatar sklin commented on June 12, 2024 1

Because guest OS free and re-allocate the descriptor table entry of the requests that have been done, we cannot let slave re-execute those already-done requests.
Synchronize requests in record list and descriptor table so that slave will not re-execute requests that have been done.

@@ -151,8 +154,16 @@ static void virtio_blk_complete_head(VirtIOBlockReq *req)
 {
     VirtIOBlock *s = global_virtio_block;
     ReqRecord *rec = req->record;
+    int i;
 
     if (rec != NULL) {
+        for (i=0 ; i<rec->len ; ++i) {
+            if (rec->reqs[i] == req) {
+                rec->completed[i] = true;
+                break;
+            }
+        }
+        assert(i<rec->len);
         if (--rec->left == 0) {
             QTAILQ_REMOVE(&s->record_list, rec, node);
             g_free(rec);
@@ -1110,12 +1122,17 @@ static void virtio_blk_save_device(VirtIODevice *vdev, QEMUFile *f)

     // send temp_list and record_list to slave.
     QTAILQ_FOREACH(rec, &s->record_list, node) {
+        int nsend = 0; // debugging
         qemu_put_sbyte(f, 3);
-        qemu_put_be32(f, rec->len);
+        qemu_put_be32(f, rec->left);
         for (i = 0; i < rec->len; i++) {
+            if (rec->completed[i])
+                continue;
             qemu_put_be32(f, rec->list[i]);
             qemu_put_be32(f, rec->idx[i]);
+            nsend++;
         }
+        assert(nsend==rec->left);
     }
     if (s->temp_list && s->temp_list->len > 0) {
         qemu_put_sbyte(f, 3);

from cuju.

meatybobby avatar meatybobby commented on June 12, 2024

Some case caused crash
Master:
backup
Backup:
master
The incorrect address of req->elem.out_sg in Master node caused Backup node crash when failover. Backup node can deal with previous reqs which have correct address, and crash when deal with incorrect reqs.

from cuju.

PJTPJT avatar PJTPJT commented on June 12, 2024

If save_device has wrong type, defer virtio_blk_free_request() can solve this problem

from cuju.

Related Issues (14)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    πŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. πŸ“ŠπŸ“ˆπŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❀️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.