Seg fault running Assise as local FS (assise, OPEN)

ut-osa commented on July 20, 2024
Seg fault running Assise as local FS

from assise.

Comments (4)

simpeter commented on July 20, 2024

wreda commented on July 20, 2024

I've added myself as a watcher, so I should be getting notifications.

@hayley-leblanc: There's no need to disable the DISTRIBUTED flag, as it has been deprecated. The steps you followed in the README should be sufficient. Since you've modified the storage configuration, I'd first double-check that you rebuilt both LibFS and KernFS and reran mkfs.sh successfully.

If you already did that, I'll likely need more context to know what might be causing this. Can you rerun KernFS in gdb and share the stack trace? You will need to first recompile KernFS with the -g flag.

hayley-leblanc commented on July 20, 2024

I double-checked that I cleaned and rebuilt LibFS and KernFS, ran change_dev_size.py, reran mkfs.sh, etc. with the new configuration, but I'm still running into the issue. Here's the output from running KernFS in gdb:

Starting program: /usr/bin/numactl -N0 -m0 kernfs
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
process 3005 is executing new program: /home/novavm/vmshare/assise/kernfs/tests/kernfs
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
initialize file system
dev-dax engine is initialized: dev_path /dev/dax0.0 size 3072 MB
[New Thread 0x7fff371ff700 (LWP 3009)]
[New Thread 0x7fff369fe700 (LWP 3010)]
[New Thread 0x7fff361fd700 (LWP 3011)]
[New Thread 0x7fff359fc700 (LWP 3012)]
[New Thread 0x7fff351fb700 (LWP 3013)]
[New Thread 0x7fff349fa700 (LWP 3014)]
[New Thread 0x7fff341f9700 (LWP 3015)]
[New Thread 0x7fff339f8700 (LWP 3016)]
[New Thread 0x7fff331f7700 (LWP 3017)]
Reading root inode with inum: 1fetching node's IP address..
Process pid is 3005
ip address on interface 'lo' is 127.0.0.1
cluster settings:
--- node 0 - ip:127.0.0.1
[New Thread 0x7fff329f6700 (LWP 3020)]
MLFS cluster initialized
[Local-Server] Listening on port 12345 for connections. interrupt (^C) to exit.
Adding connection with sockfd: 0
[New Thread 0x7fff321f5700 (LWP 3031)]
Adding connection with sockfd: 1
RECV <-- MSG_INIT [pid 0]
[New Thread 0x7fff319f4700 (LWP 3032)]
[add_peer_socket():80] Peer connected (ip: 127.0.0.1, pid: 3025)
[add_peer_socket():98] Established connection with 127.0.0.1 on sock:0 of type:0 and peer:0x7fff30e0f000
RECV <-- MSG_INIT [pid 2]
Adding connection with sockfd: 2
SEND --> MSG_SHM [paths: /shm_recv_0|/shm_send_0]
start shmem_poll_loop for sockfd 0
[add_peer_socket():98] Established connection with 127.0.0.1 on sock:1 of type:2 and peer:0x7fff30e0f000
SEND --> MSG_SHM [paths: /shm_recv_1|/shm_send_1]
start shmem_poll_loop for sockfd 1
[New Thread 0x7fff30bff700 (LWP 3033)]
RECV <-- MSG_INIT [pid 1]
[add_peer_socket():98] Established connection with 127.0.0.1 on sock:2 of type:1 and peer:0x7fff30e0f000
SEND --> MSG_SHM [paths: /shm_recv_2|/shm_send_2]
start shmem_poll_loop for sockfd 2
00000000000000000000000000000001
[New Thread 0x7fff2ffff700 (LWP 3034)]
[New Thread 0x7fff2f7fe700 (LWP 3035)]
Adding connection with sockfd: 3
[New Thread 0x7fff2effd700 (LWP 3048)]
Adding connection with sockfd: 4
RECV <-- MSG_INIT [pid 0]
[add_peer_socket():98] Established connection with 127.0.0.1 on sock:3 of type:0 and peer:0x7fff30e0f000
[New Thread 0x7fff2e7fc700 (LWP 3049)]
SEND --> MSG_SHM [paths: /shm_recv_3|/shm_send_3]
Adding connection with sockfd: 5
RECV <-- MSG_INIT [pid 2]
start shmem_poll_loop for sockfd 3
[New Thread 0x7fff2dbff700 (LWP 3050)]
[add_peer_socket():98] Established connection with 127.0.0.1 on sock:4 of type:2 and peer:0x7fff30e0f000
SEND --> MSG_SHM [paths: /shm_recv_4|/shm_send_4]
start shmem_poll_loop for sockfd 4
RECV <-- MSG_INIT [pid 1]
[add_peer_socket():98] Established connection with 127.0.0.1 on sock:5 of type:1 and peer:0x7fff30e0f000
SEND --> MSG_SHM [paths: /shm_recv_5|/shm_send_5]
start shmem_poll_loop for sockfd 5
00000000000000000000000000000011
[New Thread 0x7fff2cdff700 (LWP 3051)]
[New Thread 0x7fff2c5fe700 (LWP 3052)]

Thread 17 "kernfs" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff2effd700 (LWP 3048)]
0x00007ffff7f4b9dd in init_replication (remote_log_id=remote_log_id@entry=2, peer=0x7ffff746d0c0, begin=begin@entry=644609, size=size@entry=906753, addr=addr@entry=0, end=0x7fff2dc0f020) at ./global/mem.h:36
36		return calloc(1, size);

And the stack trace:

#0  0x00007ffff7f4b9dd in init_replication (
    remote_log_id=remote_log_id@entry=2, peer=0x7ffff746d0c0, 
    begin=begin@entry=644609, size=size@entry=906753, addr=addr@entry=0, 
    end=0x7fff2dc0f020) at ./global/mem.h:36
#1  0x00007ffff7f4d24b in register_peer_log (peer=0x7fff30e0f000, 
    find_id=<optimized out>) at distributed/peer.c:271
#2  0x00007ffff7f57d31 in signal_callback (msg=0x7ffff789f008) at fs.c:2389
#3  0x00007ffff7b11e09 in shmem_poll_loop (sockfd=sockfd@entry=3)
    at shmem_ch.c:106
#4  0x00007ffff7b121a6 in local_server_thread (arg=<optimized out>)
    at shmem_ch.c:339
#5  0x00007ffff7d18609 in start_thread (arg=<optimized out>)
    at pthread_create.c:477
#6  0x00007ffff7e54293 in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

wreda commented on July 20, 2024

It seems your segfault was due to an outdated mkdir_user script. It was calling init_fs() explicitly, which is not needed in the case of Assise (since this function is called automatically by LibFS). I've introduced a patch that addresses this.

Please pull and rebuild LibFS, KernFS, and the tests directory. Let me know if you're still having issues.
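
For readers hitting a similar double-initialization problem, one defensive pattern is to make the init entry point idempotent, so an extra explicit call becomes a no-op. The sketch below is illustrative only: every name in it (example_init_fs, example_do_init, example_init_runs) is hypothetical, it is not Assise's init_fs(), and nothing in this thread indicates the patch works this way.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical names throughout; this is not Assise's init_fs(). */
static atomic_flag fs_init_started = ATOMIC_FLAG_INIT;
static int fs_init_runs = 0;   /* counts how often the real setup ran */

/* Stand-in for the one-time setup work (assumed, for illustration). */
static void example_do_init(void)
{
    assert(fs_init_runs == 0);  /* setup must never run twice */
    fs_init_runs++;
}

/*
 * Idempotent entry point: the first caller runs the setup, later callers
 * return immediately. Caveat: a concurrent second caller does not wait for
 * the first to finish; real code would need pthread_once or a lock for that.
 */
void example_init_fs(void)
{
    if (!atomic_flag_test_and_set(&fs_init_started))
        example_do_init();
}

/* Exposed only so the behaviour can be checked. */
int example_init_runs(void)
{
    return fs_init_runs;
}
```

With a guard like this, a test program that calls example_init_fs() after the library has already done so would leave the state untouched instead of re-running the setup.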
