diod's People

Contributors

bjornaxis, blochl, doughdemon, dthadi3, dverite, ebfe, eugmes, garlick, joeydumont, lfourquaux, nkichukov, pfpacket, pipcet, sevki, snogge, tharre, vwax

diod's Issues

diod/ops.c:845: undefined reference to `makedev'

From Hongzhi, Song [email protected], reported to v9fs-users:

Error:
diod/ops.c:845: undefined reference to `makedev'

Fixed:
Since v2.28, glibc no longer includes sys/sysmacros.h (which defines makedev) from sys/types.h. [Commit ID: e16deca62e16f]

glibc now recommends including <sys/sysmacros.h> directly if the code needs it.

Signed-off-by: Hongzhi.Song <[email protected]>
---
 diod/ops.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/diod/ops.c b/diod/ops.c
index 7812420..28ec80d 100644
--- a/diod/ops.c
+++ b/diod/ops.c
@@ -74,6 +74,7 @@
 #include <pthread.h>
 #include <errno.h>
 #include <sys/types.h>
+#include <sys/sysmacros.h>

 #ifdef __FreeBSD__
 #if !__BSD_VISIBLE
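
For illustration, here is a minimal sketch of the pattern the patch enables, assuming Linux with glibc. The helper name pack_dev is hypothetical and does not appear in diod; it only shows that makedev(), major() and minor() are available once <sys/sysmacros.h> is included explicitly.

```c
#include <sys/types.h>
#include <sys/sysmacros.h>  /* makedev(), major(), minor(); no longer pulled in by sys/types.h on glibc >= 2.28 */

/* Hypothetical helper (not part of diod): pack a major/minor pair into a dev_t. */
static dev_t pack_dev(unsigned int maj, unsigned int min)
{
    return makedev(maj, min);
}
```

Round-tripping through major() and minor() should give back the original pair, for example 8 and 1 for pack_dev(8, 1).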

Windows support?

Is there any chance diod could be made to run on Windows? I'm not asking for the devs to do this work, just wondering how feasible it would be. If not, is anyone aware of a userspace 9P server that runs on Windows (and preferably Linux as well)?

Thanks!

[Question] Performance bottleneck for large files

Where is the performance bottleneck for reading large files through diod?

Here is a micro-benchmark:
Linux v9fs: 300Mb/s
diodcat (default msize): 400Mb/s
direct tcp (netcat): 900Mb/s
This is for reading a 1GB file through a 1Gb/s Ethernet connection (latency 0.4ms, mtu 9000).

Why is diod that much slower at transferring data than netcat? Would it be possible to increase the max msize on the server and would it help?
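
One way to reason about the msize question is to count request/response exchanges. The sketch below is back-of-the-envelope arithmetic, not a measurement. It assumes that "1GB" means 2^30 bytes, that each read reply carries at most msize minus a 24-byte header (the value of P9_IOHDRSZ in the Linux client), that 0.4 ms is the round-trip time, and that only one request is in flight at a time. None of these assumptions is confirmed by the report, and both function names are made up for the example.

```c
#include <stdint.h>

/* Number of read exchanges needed for file_size bytes when each reply
 * carries at most (msize - hdr) bytes of payload. Returns 0 if msize
 * leaves no room for payload. */
static uint64_t read_round_trips(uint64_t file_size, uint32_t msize, uint32_t hdr)
{
    if (msize <= hdr)
        return 0;
    uint64_t payload = (uint64_t)msize - hdr;
    return (file_size + payload - 1) / payload;   /* ceiling division */
}

/* Time spent purely waiting on latency, in milliseconds, assuming the
 * client issues one request at a time (an assumption, not a fact about diod). */
static double serial_latency_ms(uint64_t round_trips, double rtt_ms)
{
    return (double)round_trips * rtt_ms;
}
```

Under those assumptions a 2^30-byte file at msize 65536 needs 16391 exchanges, which at 0.4 ms each is about 6.6 seconds of waiting on top of the transfer itself. A larger msize or more requests in flight would shrink that term, if latency is indeed the bottleneck.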

diod crashed at shutdown

Diod sometimes crashes during shutdown. The symptom varies: sometimes an assert is hit and sometimes it is an ordinary segfault.

The probability for crashing is much higher if diod has open connections accessing files when the shutdown is initiated.

The main cause of this crash is that all the connection threads are detached and not actively shut down from the main thread. When the main thread exits it deallocates all its data structures, and if one of the connection threads is still doing some processing it might access this data afterwards and thus cause the crash.

The deallocation is done by these lines in diod/diod.c:
diod_fini (ss.srv);
np_srv_destroy (ss.srv);
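
A generic sketch of the alternative ordering, for discussion: keep the threads joinable, signal shutdown, join every thread, and only then free the shared state. This is not diod code. struct shared_state, worker and run_and_shutdown are hypothetical names, and the shared state merely stands in for whatever diod_fini() and np_srv_destroy() release.

```c
#include <pthread.h>
#include <sched.h>
#include <stdlib.h>

#define NWORKERS 4  /* illustrative worker count, not a diod setting */

/* Stand-in for the server state released at shutdown. */
struct shared_state {
    pthread_mutex_t lock;
    int shutting_down;
    long *work_done;   /* heap data the workers keep touching */
};

static struct shared_state *state;

static void *worker(void *arg)
{
    int id = *(int *)arg;
    int stop = 0;

    while (!stop) {
        pthread_mutex_lock(&state->lock);
        state->work_done[id]++;       /* would be a use-after-free if freed early */
        stop = state->shutting_down;
        pthread_mutex_unlock(&state->lock);
        sched_yield();
    }
    return NULL;
}

/* Start joinable workers, request shutdown, join them, and only then free.
 * Returns the number of workers joined, or -1 on allocation failure. */
static int run_and_shutdown(void)
{
    pthread_t tid[NWORKERS];
    int ids[NWORKERS];
    int started = 0, joined = 0;

    state = calloc(1, sizeof(*state));
    if (!state)
        return -1;
    state->work_done = calloc(NWORKERS, sizeof(*state->work_done));
    if (!state->work_done) {
        free(state);
        return -1;
    }
    pthread_mutex_init(&state->lock, NULL);

    for (int i = 0; i < NWORKERS; i++) {
        ids[i] = i;
        if (pthread_create(&tid[started], NULL, worker, &ids[i]) != 0)
            break;
        started++;
    }

    pthread_mutex_lock(&state->lock);
    state->shutting_down = 1;         /* ask workers to stop */
    pthread_mutex_unlock(&state->lock);

    for (int i = 0; i < started; i++)
        if (pthread_join(tid[i], NULL) == 0)
            joined++;

    /* No worker can touch the state any more, so teardown is safe. */
    pthread_mutex_destroy(&state->lock);
    free(state->work_done);
    free(state);
    state = NULL;
    return joined;
}
```

The point of the sketch is only the ordering: the free() calls come after the last pthread_join(), so no thread can observe freed memory.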

test rdma transport

Diod has code for rdma but it's not really been tested.

Some benchmarks versus IPoIB would be useful too (what does it gain us?)

"munge cred decode: Socket communication error" on server side

Clients attempting to diodmount from a centos-6 server using munge auth fail. Without authentication diodmount succeeds.

The munge realm works across all participating machines: both the local test (munge -n | unmunge) and the remote one (munge -n | ssh host unmunge) succeed.

I have tested with diod-1.0.24 and diod-1.0.23. No luck.

Client-side (debian 9.1):

client # munge -n | ssh 10.d.e.f unmunge
STATUS:           Success (0)
ENCODE_HOST:      ??? (0.0.0.0)
ENCODE_TIME:      2017-12-02 01:06:18 -0500 (1512194778)
DECODE_TIME:      2017-12-02 01:08:24 -0500 (1512194904)
TTL:              300
CIPHER:           aes128 (4)
MAC:              sha1 (3)
ZIP:              none (0)
UID:              root (0)
GID:              root (0)
LENGTH:           0

client # diodmount -o ro,noatime 10.d.e.f:/mnt /mnt
diodmount: attach: Operation not permitted

Server side (centos 6):

server # diod  --nwthreads 1 --foreground --export /mnt  -d 1
diod: P9_TVERSION tag 65535 msize 65536 version '9P2000.L'
diod: P9_RVERSION tag 65535 msize 65536 version '9P2000.L'
diod: P9_TAUTH tag 0 afid 0 uname '' aname '/mnt' n_uname 0
diod: P9_RAUTH tag 0 qid (0000000000000000 0 'A')
diod: P9_TWRITE tag 0 fid 0 offset 0 count 127
4d554e47 453a4177 51444141 446e5431 7154577a 5245482f 6d747565 6e48324f
79424749 41557250 51347043 49427836 56726174 582b756d 62526536 75477330
diod: P9_RWRITE tag 0 count 127
diod: P9_TATTACH tag 0 fid 1 afid 0 uname '' aname '/mnt' n_uname 0
diod: checkauth([email protected]:/mnt): munge cred decode: Socket communication error
diod: attach([email protected]:/mnt): checkauth: Operation not permitted
diod: P9_RLERROR tag 0 ecode 1
diod: P9_TCLUNK tag 0 fid 0
diod: P9_RCLUNK tag 0

Any suggestions what could be broken?

Thank you!

kernel postmark test has a typo

Problem: tests/kern/t33 fails due to a typo in the test script

t33.diff:> ../../tests/kern/t33: line 13: popd: /tmp/tmp.dagRQES0tH: invalid argument
t33.diff:> popd: usage: popd [-n] [+N | -N]

This fixes it

diff --git a/tests/kern/t33 b/tests/kern/t33
index d1f5fc5..7cdebea 100755
--- a/tests/kern/t33
+++ b/tests/kern/t33
@@ -10,5 +10,5 @@ show
 run
 quit
 EOT
-popd $PATH_MNTDIR >/dev/null
+popd >/dev/null
 ) >t33.postmark

kernel fstest fails

Problem: tests/kern/t31 fails on kernel 6.9.3 (ubuntu 22.04)

Test Summary Report
-------------------
tests/chown/00.t   (Wstat: 0 Tests: 171 Failed: 19)
  Failed tests:  34-37, 68-69, 83-84, 116-118, 122-124, 129-130
                135-137
tests/chown/05.t   (Wstat: 0 Tests: 15 Failed: 3)
  Failed tests:  5-6, 10
tests/open/06.t    (Wstat: 0 Tests: 72 Failed: 3)
  Failed tests:  9-11
Result: FAIL

Release

Wouldn't a new release be appropriate?

The last one was in 2014, and many distros package only the latest release, no matter what.
In some, software is even dropped from the repo when there is no new version.

curses header files

To compile diod on RHEL 6.x, you also need the ncurses header files.
Please add to the Red Hat compilation instructions:
yum install ncurses-devel ncurses
I still couldn't compile after that. I tried to add it to the Makefile but did something wrong. I gave up (not much time to tinker around) and compiled the older diod-1.0.23 without problems.

test failures on ubuntu14.04.1 LTS

/tmp is part of root, an ext4 file system.

FAIL: t33 # postmark
t33 line 13: popd /tmp/tmp.UhT00nfcTx: invalid argument

FAIL: t37 # fsx
fsx: main: filesystem does not support fallocate punch hole, disabling: Operation not supported

This is under kernel 3.13.0-35-generic on x86_64 (in a VMware VM)

IPv6 IPs in server:/path specifications / use of URIs

Hi,
diod appears to work fine with IPv6, but diodmount doesn't when an IPv6 IP is used instead of a hostname; IPv6 IPs contain colons, and we split on the first rather than the last colon character to separate hostname/aname or hostname/port.

That's trivial to "fix" by splitting on the last colon (master...pipcet:last-colon), but of course that will fail in the hypothetical case of an aname that includes a colon itself.

However, I do wonder why we're not using URLs? We have an official IANA-assigned port number, so that sounds like a pretty good basis for requesting a URI scheme. (However, the first character of a URI scheme must not be a digit, so we'd have to choose something different from 9pfs://)
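
As a middle ground short of full URIs, one option is the bracket convention that URLs use for IPv6 literals. The sketch below assumes a hypothetical spec syntax of [addr]:aname alongside the existing host:aname form. It is not what diodmount currently does, and split_spec is an illustrative name.

```c
#include <stddef.h>
#include <string.h>

/* Split "host:aname" or "[ipv6]:aname" into its two parts.
 * Unbracketed specs are split on the FIRST colon, so an aname may
 * contain colons; IPv6 literals must be bracketed (assumed syntax).
 * Returns 0 on success, -1 on a malformed spec or too-small buffer. */
static int split_spec(const char *spec, char *host, size_t hostlen,
                      char *aname, size_t anamelen)
{
    const char *hstart, *hend, *rest;

    if (spec[0] == '[') {
        hstart = spec + 1;
        hend = strchr(hstart, ']');
        if (!hend || hend[1] != ':')
            return -1;
        rest = hend + 2;              /* skip "]:" */
    } else {
        hstart = spec;
        hend = strchr(spec, ':');
        if (!hend)
            return -1;
        rest = hend + 1;              /* skip ":" */
    }

    size_t hlen = (size_t)(hend - hstart);
    size_t alen = strlen(rest);
    if (hlen == 0 || hlen >= hostlen || alen >= anamelen)
        return -1;

    memcpy(host, hstart, hlen);
    host[hlen] = '\0';
    memcpy(aname, rest, alen + 1);    /* includes the terminating NUL */
    return 0;
}
```

With this rule "[fe80::1]:/mnt" yields host fe80::1 and aname /mnt, while "server:/a:b" still yields server and /a:b, so neither the IPv6 case nor the colon-in-aname case is lost.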

Wrong fsid

On CentOS 7 x64 the fsid passed to the remote client is wrong. It seems that f_fsid.__val[0] and f_fsid.__val[1] are 64-bit instead of 32-bit, and the upper half is not 0 (it is actually FFFFFFFF). The attached patch repairs that. I don't know if it is needed for the __FreeBSD__ branch.
ops.c.patch.txt
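
The attached patch is not reproduced here, so the following is only a guess at the kind of fix involved. It assumes the stray FFFFFFFF comes from sign extension when a negative 32-bit half is widened to 64 bits. fsid_to_u64 is a hypothetical helper name.

```c
#include <stdint.h>

/* Combine the two halves of an fsid into one 64-bit value.
 * Each half is first truncated to 32 unsigned bits so that a negative
 * value cannot sign-extend into the upper word. */
static uint64_t fsid_to_u64(int val0, int val1)
{
    uint64_t lo = (uint64_t)(uint32_t)val0;
    uint64_t hi = (uint64_t)(uint32_t)val1;
    return lo | (hi << 32);
}
```

For comparison, widening a negative int directly, as in (uint64_t)val0, sets all of the upper 32 bits, which would match the FFFFFFFF seen in the report.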

kernel fcntl test fails

Problem: tests/kern/t19 fails on linux-6.9.3 (ubuntu 22.04)


$ diff -u t19.exp t19.out
--- t19.exp	2022-02-22 21:08:52.895878041 -0800
+++ t19.out	2024-07-06 07:24:47.000683222 -0700
@@ -72,16 +72,7 @@
 tfcntl2: fd: write-locked 1 byte
 tfcntl2: child forked
 tfcntl2: fd2: open (child)
-tfcntl2: fd2: write-locked rest of file
-tfcntl2: child exited normally
-tfcntl2: fd: closed
-tfcntl2: 5. Non-conflicting read and write locks CAN be held by two processes
-tfcntl2: fd: open
-tfcntl2: fd: write-locked 1 byte
-tfcntl2: child forked
-tfcntl2: fd2: open (child)
-tfcntl2: fd2: read-locked rest of file
-tfcntl2: child exited normally
-tfcntl2: fd: closed
+tfcntl2: fd2: write-lock rest of file failed: Resource temporarily unavailable
+tfcntl2: child exited with 1, aborting
 kconjoin: t19 exited with rc=0
 kconjoin: diod exited with rc=0

t23 also fails:

$ diff -u t23.exp t23.out
--- t23.exp	2022-02-22 21:08:52.895878041 -0800
+++ t23.out	2024-07-06 07:24:47.061684733 -0700
@@ -72,56 +72,6 @@
 tfcntl3: fd: open O_RDWR
 tfcntl3: fd: read-locked
 tfcntl3: fd2: open O_RDWR
-tfcntl3: fd2: write-locked
-tfcntl3: fd2: closed
-tfcntl3: fd: closed
-tfcntl3: 3. Upgrade byte range of write lock (one fd)
-tfcntl3: fd: open O_RDWR
-tfcntl3: fd: write-locked 1 byte
-tfcntl3: fd: write-locked entire file
-tfcntl3: fd: closed
-tfcntl3: 4. Downgrade byte range of write lock (one fd)
-tfcntl3: fd: open O_RDWR
-tfcntl3: fd: write-locked entire file
-tfcntl3: fd: write-locked 1 byte
-tfcntl3: fd: closed
-tfcntl3: 5. Downgrade write lock to read lock (one fd)
-tfcntl3: fd: open O_RDWR
-tfcntl3: fd: write-locked
-tfcntl3: fd: read-locked
-tfcntl3: fd: closed
-tfcntl3: 6. Read lock with O_RDONLY should succeed
-tfcntl3: fd: open O_RDONLY
-tfcntl3: fd: read-locked
-tfcntl3: fd: closed
-tfcntl3: 7. Read lock with O_WRONLY should fail
-tfcntl3: fd: open O_WRONLY
-tfcntl3: fd: fcntl F_SETLK rdlock failed: Bad file descriptor
-tfcntl3: fd: closed
-tfcntl3: 8. Write lock with O_RDONLY should fail
-tfcntl3: fd: open O_RDONLY
-tfcntl3: fd: fcntl F_SETLK wrlock failed: Bad file descriptor
-tfcntl3: fd: closed
-tfcntl3: 9. Write lock with O_WRONLY should succeed
-tfcntl3: fd: open O_WRONLY
-tfcntl3: fd: write-locked
-tfcntl3: fd: closed
-tfcntl3: 10. Write lock is not inherited across a fork
-tfcntl3: fd: open O_RDWR
-tfcntl3: fd: write-locked
-tfcntl3: child forked
-tfcntl3: fd: read-lock failed (child): Resource temporarily unavailable
-tfcntl3: child exited normally
-tfcntl3: fd: closed
-tfcntl3: 11. Write lock is dropped if another fd to same file is closed
-tfcntl3: fd: open O_RDWR
-tfcntl3: fd: write-locked
-tfcntl3: fd2: open O_RDWR
-tfcntl3: fd2: closed
-tfcntl3: child forked
-tfcntl3: fd: closed (child)
-tfcntl3: fd2: open O_RDWR (child)
-tfcntl3: fd2: read-locked (child)
-tfcntl3: child exited normally
+tfcntl3: fd2: fcntl F_SETLK wrlock failed: Resource temporarily unavailable
 kconjoin: t23 exited with rc=0
 kconjoin: diod exited with rc=0

diod wiki pages from google code?

diod had extensive wiki pages when it was hosted on Google Code. Now these are inaccessible and there is little documentation available.

Could someone who has a copy of them port them over here, maybe to Github pages or this Github wiki?

RLERROR doesn't work across architectures

Hi, I'm not sure if this is the right place to file this - this is a spec issue, is this upstream for the spec?

RLERROR transmits the numeric errno, which isn't consistent across all Linux architectures. For instance, errno 95 is EOPNOTSUPP on most architectures including x86_64, but is, e.g., ENOTSOCK on MIPS. So if I'm running a MIPS guest connecting to qemu's 9p server on an x86_64 host, and I mount with 9p2000.L, executing any file gives me "Socket operation on non-socket" because the kernel wants to look up the security.capability xattr, and the server is returning errno 95.

Adding if (err == -95) err = -EOPNOTSUPP; to two spots in the Linux 9p client causes it to start working, but I'm not sure that's the right solution. (So does mounting with 9p2000.u.)

I see that the FreeBSD support commit in diod also needed to hard-code errno 95. Probably one approach would be to specify that errnos are interpreted following Linux's include/uapi/asm-generic/errno.h, which is used on most architectures, and require those that don't (Linux alpha, mips, parisc, sparc, and other OSes like FreeBSD) to translate them to/from the asm-generic version.
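
A rough sketch of what such a translation layer could look like. The wire values are the asm-generic numbers, and the host side uses the platform's own symbolic constants. The table holds only a few entries for illustration; a real one would have to cover every errno, and the function names are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/* host: this platform's errno value; wire: Linux asm-generic value. */
struct errno_pair { int host; int wire; };

static const struct errno_pair errno_map[] = {
    { ENOENT,      2 },
    { EAGAIN,     11 },
    { ENOTSOCK,   88 },
    { EOPNOTSUPP, 95 },
};

#define ERRNO_MAP_LEN (sizeof(errno_map) / sizeof(errno_map[0]))

/* Translate a host errno to its wire value before sending Rlerror.
 * Unknown values pass through unchanged, which is only acceptable in a sketch. */
static int errno_host_to_wire(int host)
{
    for (size_t i = 0; i < ERRNO_MAP_LEN; i++)
        if (errno_map[i].host == host)
            return errno_map[i].wire;
    return host;
}

/* Translate a wire errno from Rlerror back to the host's numbering. */
static int errno_wire_to_host(int wire)
{
    for (size_t i = 0; i < ERRNO_MAP_LEN; i++)
        if (errno_map[i].wire == wire)
            return errno_map[i].host;
    return wire;
}
```

On x86_64 both directions are the identity for these entries. On an architecture with different numbering, wire value 95 would come back as that platform's EOPNOTSUPP rather than whatever its errno 95 happens to be.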

There may be an inconsistency or mistake in the description of `fsync` in 9P2000.L

In the protocol description in this project, fsync is formatted like this:

size[4] Tfsync tag[2] fid[4]

However, in linux and qemu, it should be like this:

size[4] Tfsync tag[2] fid[4] datasync[4]

Here a datasync[4] value of 0 means sync both content and metadata, and 1 means sync content only.

The implementation in Linux's 9p client is below:
[image]

I also ran experiments against qemu's 9pfs backend and confirmed that the datasync field is necessary.

Is there a reason for this inconsistency?
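
To make the layout concrete, here is a small encoder for the five-field form used by Linux and qemu. It assumes the usual 9P conventions: little-endian integers, a one-byte message type after the size field, and P9_TFSYNC equal to 50 as in the Linux headers. encode_tfsync is a name made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define P9_TFSYNC 50  /* message type value used by the Linux 9p client */

static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)(v >> 24);
}

/* Encode size[4] Tfsync tag[2] fid[4] datasync[4] into buf.
 * Returns the message length (15), or 0 if buf is too small. */
static size_t encode_tfsync(uint8_t *buf, size_t buflen,
                            uint16_t tag, uint32_t fid, uint32_t datasync)
{
    const uint32_t size = 4 + 1 + 2 + 4 + 4;

    if (buflen < size)
        return 0;
    put_le32(buf, size);
    buf[4] = P9_TFSYNC;
    put_le16(buf + 5, tag);
    put_le32(buf + 7, fid);
    put_le32(buf + 11, datasync);   /* 0: data and metadata, 1: data only */
    return size;
}
```

Without the datasync field the message would be 11 bytes, so a server expecting the 15-byte form would read past the end of a message encoded the shorter way.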

add an example to the home page about authentication

The example on the diod home page disables authentication.

But that leaves first time users like myself wondering how authentication is supposed to work. I looked through the diod documentation, 9p documentation, and googled around a bit, and didn't see much that was relevant. It looks like there is another channel or service involved, but I couldn't figure out anything more than that.

So, I think it would be great if you could add an example of how authenticated exports are supposed to work.

renameat and unlinkat are commented out.

It seems like these calls were disabled for some reason (perhaps to support old kernels). They might be worth adding again?

I do wonder what benefit they provided compared to walk and remove - it seems slightly unusual to include '/' in names when the original operations did not.

kernel fsx test fails

Problem: tests/kern/t37 fails on kernel 6.9.3

fsx actually succeeds

kconjoin: diodmount exited with rc=0
mapped writes DISABLED
fsx: main: filesystem does not support fallocate punch hole, disabling: Operation not supported
truncating to largest ever: 0x3645e
truncating to largest ever: 0x3f958
truncating to largest ever: 0x3f9e9
All operations completed A-OK!
kconjoin: t37 exited with rc=0
kconjoin: diod exited with rc=0

but the unsupported message is unexpected by the test script

diod does not work properly with RHEL 7 systemd

I don't know how this works yet, but although RHEL 7 systemd thinks it can start/stop services with system V init scripts like diod, and can start diod, it can't stop it.
systemctl stop diod does not complain but doesn't seem to have an effect.

Reportedly other llnl tools that share a common origin for their sysv init scripts have the same problem.

Probably we should provide a systemd service unit and be done with it.

support open-unlink-fstat

Original issue reported here then discussed on v9fs-developer beginning 12 April 2015.

We need a regression test for this and to verify that diod correctly handles the server end.

kernel setfacl test fails

Problem: tests/kern/t43 fails on kernel 6.9.3 (ubuntu 22.04)

The test aborts with

setfacl: testfile: Operation not supported

path->ioctx == NULL crash for rename after create

starting diod [-f -l 127.0.0.1:34321 -e /run/user/1000/TestDotlRename3873657423/001 -d 1 -n -U ac]
diod: P9_TVERSION tag 65535 msize 4096 version '9P2000.L'
diod: P9_RVERSION tag 65535 msize 4096 version '9P2000.L'
diod: P9_TATTACH tag 0 fid 0 afid -1 uname 'ac' aname '/run/user/1000/TestDotlRename3873657423/001' n_uname P9_NONUNAME
diod: P9_RATTACH tag 0 qid (000000000017e13c 0 'd')
diod: P9_TWALK tag 0 fid 0 newfid 1 nwname 0
diod: P9_RWALK tag 0 nwqid 0 
diod: P9_TLCREATE tag 0 fid 1 name 'x' flags 0x0 mode 00 gid 0
diod: P9_RLCREATE tag 0 qid (000000000017e13d 0 '') iounit 0
diod: P9_TRENAME tag 0 fid 1 dfid 0 name 'y'
diod: assertion failure: ioctx.c:448: path->ioctx == NULL

In this test I create a new file with LCREATE, then immediately try to rename it with TRENAME.
The protocol documentation says after LCREATE the fid points to the new item, but this causes a failure.

I think most existing clients tend to walk a new fid and avoid combining these operations hence we don't normally see this problem.

split kernel tests out to a new project

The diod kernel tests require root to run and the results are dependent on which 9P kernel client is available in the test environment, what kind of file system backs /tmp, etc. As such they don't really belong in an in-tree "make check" test suite.

They might be pulled out into another project and generalized.

Very slow over SSH port forwarding

Throughput is very slow (~9 MiB/s) with SSH port forwarding. Combined CPU utilization of diod and ssh is only ~35%.
Can I do anything to optimize the speed? With scp I can get ~ 120 MiB/s (AES-NI on both sides).

My idea was that it could be caused by the large default MTU size (65536) of the loopback interface on the server side, so I used iptables to rewrite the MSS value of the initial SYN packet:
iptables -A OUTPUT -o lo -p tcp --dport 564 --syn -j TCPMSS --set-mss 1412
It does rewrite the initial MSS, but doesn't help to improve performance.

Also keeping the Nagle algorithm doesn't help, but decreases performance to ~ 2 MiB/s
https://unix.stackexchange.com/questions/434825/ssh-speed-greatly-improved-via-proxycommand-but-why

Ubuntu 18.04.2 with HWE-Kernel 4.18

Client Configuration:
ssh -L 1564:localhost:564 -NT -o StrictHostKeyChecking=accept-new servername &
mount -t 9p -n -o aname=/,access=client,cache=loose,msize=65536,port=1564 127.0.0.1 /mnt

Server Configuration:
/etc/diod.conf
listen = { "127.0.0.1:564" }
auth_required = 0
exports = { "ctl", "/", "/var" }

config file support is silently disabled if lua not found

If liblua is not detected at configure time, diod's config file support is disabled, which could be surprising, for example if the reason that liblua is not detected is due to unintentional circumstances, like bug #72.

We need the capability to build diod without config file support for embedded use cases. As I recall the original request was from busybox/buildroot developers. However, lua is widely available in other environments, and so we should enable config file support, unless the user specifically opts out.

Let's add a --disable-config-file configure option for the embedded use case. If this option is not provided, configure should fail if lua is not detected. The config file parsing code should be conditionally compiled not based on lua availability but based on a flag set by this option. Same for the tests.

tests/kern: make check lock

make check works for both the misc and user tests but hangs
during kernel testing (from t05) when run as root with the latest linux-next
kernel and Debian 8 on an ext4 partition.
The problem appears for all tests (Ctrl-C is needed to quit),
in runtest when calling kconjoin.

IMHO t00 also has the same problem:
cd tests/kern
make check TESTS=t00
works because it only executes bash, but after typing exit
it hangs as well.

end of t05.diod before CTRL-C:

diod: P9_TWALK tag 1 fid 1 newfid 2 nwname 1 '.Trash'
diod: P9_RLERROR tag 1 ecode 2
diod: P9_TWALK tag 1 fid 1 newfid 2 nwname 1 '.Trash-1000'
diod: P9_RLERROR tag 1 ecode 2

t05.out
kconjoin: diodmount exited with rc=0
kconjoin: t05 exited with rc=0

./configure: 2581: Syntax error: word unexpected (expecting ")")

On FreeBSD 13.1-RELEASE :

mario@marietto:/home/marietto/Desktop/Downloads/diod-master # ./autogen.sh

Running aclocal ...
fatal: not a git repository (or any of the parent directories): .git
configure.ac:15: error: AC_INIT should be called with package and version arguments
/usr/local/share/aclocal-1.16/init.m4:29: AM_INIT_AUTOMAKE is expanded from...
configure.ac:15: the top level
autom4te2.71: error: /usr/local/bin/gm4 failed with exit status: 1
aclocal: error: autom4te failed with exit status: 1
Running autoheader ...
fatal: not a git repository (or any of the parent directories): .git
autoheader2.71: error: error: AC_CONFIG_HEADERS not found in configure.ac
Running automake ...
configure.ac: error: no proper invocation of AM_INIT_AUTOMAKE was found.
configure.ac: You should verify that configure.ac invokes AM_INIT_AUTOMAKE,
configure.ac: that aclocal.m4 is present in the top-level directory,
configure.ac: and that aclocal.m4 was recently regenerated (using aclocal)
configure.ac:9: installing 'config/config.guess'
configure.ac:9: installing 'config/config.sub'
Makefile.am:11: error: ENABLE_TESTS does not appear in AM_CONDITIONAL
diod/Makefile.am: installing 'config/depcomp'
/usr/local/share/automake-1.16/am/depend2.am: error: am__fastdepCC does not appear in AM_CONDITIONAL
/usr/local/share/automake-1.16/am/depend2.am: The usual way to define 'am__fastdepCC' is to add 'AC_PROG_CC'
/usr/local/share/automake-1.16/am/depend2.am: to 'configure.ac' and run 'aclocal' and 'autoconf' again
/usr/local/share/automake-1.16/am/depend2.am: error: AMDEP does not appear in AM_CONDITIONAL
/usr/local/share/automake-1.16/am/depend2.am: The usual way to define 'AMDEP' is to add one of the compiler tests
/usr/local/share/automake-1.16/am/depend2.am: AC_PROG_CC, AC_PROG_CXX, AC_PROG_OBJC, AC_PROG_OBJCXX,
/usr/local/share/automake-1.16/am/depend2.am: AM_PROG_AS, AM_PROG_GCJ, AM_PROG_UPC
/usr/local/share/automake-1.16/am/depend2.am: to 'configure.ac' and run 'aclocal' and 'autoconf' again
libdiod/Makefile.am:21: error: RDMATRANS does not appear in AM_CONDITIONAL
libnpfs/Makefile.am:25: error: USE_IMPERSONATION_LINUX does not appear in AM_CONDITIONAL
libnpfs/Makefile.am:28: error: USE_IMPERSONATION_GANESHA does not appear in AM_CONDITIONAL
libnpfs/Makefile.am:35: error: RDMATRANS does not appear in AM_CONDITIONAL
tests/kern/Makefile.am:17: error: DBENCH does not appear in AM_CONDITIONAL
parallel-tests: installing 'config/test-driver'
/usr/local/share/automake-1.16/am/check2.am: error: am__EXEEXT does not appear in AM_CONDITIONAL
/usr/local/share/automake-1.16/am/check2.am: error: am__EXEEXT does not appear in AM_CONDITIONAL
/usr/local/share/automake-1.16/am/check2.am: error: am__EXEEXT does not appear in AM_CONDITIONAL
utils/Makefile.am:13: error: ENABLE_DIODMOUNT does not appear in AM_CONDITIONAL
utils/Makefile.am:58: error: ENABLE_DIODMOUNT does not appear in AM_CONDITIONAL
Running autoconf ...
fatal: not a git repository (or any of the parent directories): .git
configure.ac:62: warning: The macro `AC_HEADER_STDC' is obsolete.
configure.ac:62: You should run autoupdate.
./lib/autoconf/headers.m4:704: AC_HEADER_STDC is expanded from...
configure.ac:62: the top level
configure.ac:250: warning: AC_C_BIGENDIAN should be used with AC_CONFIG_HEADERS
configure.ac:10: error: possibly undefined macro: X_AC_EXPAND_INSTALL_DIRS
If this token and others are legitimate, please use m4_pattern_allow.
See the Autoconf documentation.
configure.ac:15: error: possibly undefined macro: AM_INIT_AUTOMAKE
configure.ac:20: error: possibly undefined macro: AM_SILENT_RULES
configure.ac:21: error: possibly undefined macro: AM_CONFIG_HEADER
configure.ac:22: error: possibly undefined macro: AM_MAINTAINER_MODE
configure.ac:31: error: possibly undefined macro: AM_PROG_CC_C_O
configure.ac:49: error: possibly undefined macro: AC_MSG_ERROR
configure.ac:88: error: possibly undefined macro: X_AC_CHECK_PTHREADS
configure.ac:89: error: possibly undefined macro: X_AC_WRAP
configure.ac:90: error: possibly undefined macro: X_AC_CHECK_COND_LIB
configure.ac:92: error: possibly undefined macro: X_AC_TCMALLOC
configure.ac:93: error: possibly undefined macro: X_AC_RDMATRANS
configure.ac:201: error: possibly undefined macro: AM_CONDITIONAL
Cleaning up ...
mv: rename aclocal.m4 to config/aclocal.m4: No such file or directory
Now run ./configure to configure diod for your environment.

mario@marietto:/home/marietto/Desktop/Downloads/diod-master # ./configure
checking build system type... amd64-unknown-freebsd13.1
checking host system type... amd64-unknown-freebsd13.1
./configure: X_AC_EXPAND_INSTALL_DIRS: not found
./configure: 2581: Syntax error: word unexpected (expecting ")")

Tgetattr/Tsetattr semantics on opened fids

It's a bit unclear how Tgetattr/Tsetattr should operate when called with fids that have been opened.
protocol.md says "setattr sets attributes of a file system object referenced by fid". In the case of an already opened fid I would expect that to be the corresponding fd. The code however only operates on the filename f->path (e.g. invoking stat(2) on the path that was used to open the fid). This causes some problems when the client expects the former behaviour (see #19 for example).

Would you be open to changes that switches diod_{get,set}attr to use fstat, fchmod, ftruncate... when called on an opened fid?
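
A sketch of the proposed behaviour, to make the question concrete. struct fid_sketch and fid_getattr are hypothetical and much simpler than diod's real fid and path structures. The demo function shows the case where the two approaches diverge: a file that is unlinked while still open.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical, simplified view of a fid. */
struct fid_sketch {
    const char *path;  /* path the fid was walked to */
    int fd;            /* open descriptor, or -1 if the fid is not open */
};

/* Prefer the open descriptor when there is one; fall back to the path. */
static int fid_getattr(const struct fid_sketch *f, struct stat *sb)
{
    if (f->fd >= 0)
        return fstat(f->fd, sb);
    return lstat(f->path, sb);
}

/* Returns 0 if, after unlink, the fd-based lookup still succeeds while
 * the path-based lookup fails; -1 otherwise. */
static int demo_open_unlink_getattr(void)
{
    char path[] = "/tmp/fid_sketch_XXXXXX";
    struct stat sb;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);                     /* file now has no name but stays open */

    struct fid_sketch open_fid = { path, fd };
    struct fid_sketch path_fid = { path, -1 };
    int by_fd = fid_getattr(&open_fid, &sb);
    int by_path = fid_getattr(&path_fid, &sb);

    close(fd);
    return (by_fd == 0 && by_path == -1) ? 0 : -1;
}
```

The same split would presumably apply on the setattr side, with fchmod, fchown and ftruncate used in place of their path-based counterparts when a descriptor is available.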

disappearing 9p mount - troubleshooting ideas

I recognize the kernel module is ultimately responsible for the filesystem mount, but can you share a suggested way to troubleshoot client issues, where diodmount/9p mountpoints started to disappear on some machines?

The amount of traffic processed does not seem to play a role here. On some machines things work fine for a random amount of time, then all further operations return Input/output error. On others, running the same kernel, the issue is never triggered.

Background:
Recently some of the clients started losing mountpoints. I suspect factors external to diod/9p play a role, Meltdown and Spectre mitigation being one, but I'd like to figure out whether others are encountering this issue as well, and if not, the best way to easily reproduce it for kernel folks.

Server: diod server on RH 6.6, built from 1.0.24 release using
./configure --enable-rdmatrans --with-ncurses

Clients:

  • Debian stretch, kernel 4.9.0-6-amd64
  • stock diod package provided with stretch (1.0.24-3+b1)
  • Infiniband

Timeline:

  • diodmount succeeds and 9p mount is established on client
  • minutes to hours pass before 9p mount fails
  • standalone diodls and diodcat continue to work
  • IB fabric passes rudimentary tests between clients where this issue occurred and the corresponding server

Thanks!

Support OFD locks?

It seems Linux now supports private file locks: https://www.gnu.org/software/libc/manual/html_node/Open-File-Description-Locks.html

These essentially solve the big problem diod has implementing POSIX locks (they are per pid, which is essentially impossible to implement). It seems we would need to extend the protocol to add these new lock types and teach the v9fs driver about them.

I think adding the new flags would be backwards compatible, old servers and clients would likely just return an error.
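
For reference, this is roughly what taking an OFD lock looks like from user space on Linux. The sketch assumes glibc on Linux 3.15 or later; the fallback #define uses the Linux value of F_OFD_SETLK in case the header does not expose it. ofd_write_lock and demo_ofd_conflict are illustrative names, and nothing here touches the 9P protocol itself.

```c
#define _GNU_SOURCE         /* asks glibc's <fcntl.h> to expose F_OFD_SETLK */
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#ifndef F_OFD_SETLK
#define F_OFD_SETLK 37      /* Linux value, used only if the header hides it */
#endif

/* Try to take a non-blocking whole-file write lock owned by the open
 * file description behind fd, not by the calling process. */
static int ofd_write_lock(int fd)
{
    struct flock fl = {0};

    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;           /* 0 means "to end of file" */
    fl.l_pid = 0;           /* must be zero for OFD lock commands */
    return fcntl(fd, F_OFD_SETLK, &fl);
}

/* Open the same file twice in ONE process and lock through both
 * descriptors. Returns 0 if the second attempt is refused. */
static int demo_ofd_conflict(void)
{
    char path[] = "/tmp/ofd_sketch_XXXXXX";
    int fd1 = mkstemp(path);

    if (fd1 < 0)
        return -1;
    int fd2 = open(path, O_RDWR);
    unlink(path);
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }

    int first = ofd_write_lock(fd1);
    int second = ofd_write_lock(fd2);
    int second_errno = errno;

    close(fd2);
    close(fd1);
    return (first == 0 && second == -1 &&
            (second_errno == EAGAIN || second_errno == EACCES)) ? 0 : -1;
}
```

With classic POSIX locks the second attempt would succeed, because both descriptors belong to the same process. With OFD locks it conflicts, which is the per-open-file ownership a server holding many clients' files in one process would need.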

configure does not find lua

As noted in #71, lua detection is not working properly.

Example, on Ubuntu 20.04 LTS, with liblua5.1-0-dev installed:

checking lua.h usability... no
checking lua.h presence... no
checking for lua.h... no
checking lualib.h usability... no
checking lualib.h presence... no
checking for lualib.h... no
checking for lua.h... (cached) no
checking for lualib.h... (cached) no

POSIX ACL support?

The code seems to suggest that both Diod and the v9fs driver support POSIX ACLs, but I can't figure out how to properly get them to work. On the remote machine, ls -l identifies the files in question as having ACLs, and getfacl correctly reports them, but they seem to be completely ignored as far as file accesses are concerned.

On files where the ACL grants access but traditional unix permissions wouldn't, I get Permission denied errors. On files where the ACL should deny access but traditional unix permissions would grant it, I can successfully read the contents of the file. (The latter especially surprised me, as the user definitely cannot read the file in question locally, and I would have expected Diod's use of setfs[ug]id to prevent access even if the client machine were ignoring ACLs.)

I have CONFIG_9P_FS_POSIX_ACL enabled in the client kernel, and I tried adding -o posixacl to my diodmount command without any apparent effect.

Am I missing a step somewhere, or is this not supported?

Thanks.

Client directory listing causes 'Bad file descriptor' errors for each child file/directory

Hello,

When you install diod server from 'master' (the issue is not present on 1.0.24 version) and try to list a directory on the client that contains other directories or files, 'ls' complains that it cannot access it with a Bad file descriptor error.

Here is a sample setup:

Server:
diod -f -n -d 3 -H -l 10.0.0.1:564 -e /mnt/cryptdata/9p/

Client:
mount -t 9p 10.0.0.1 /mnt/test -oaname=/mnt/cryptdata/9p

Client errors:

# ls /mnt/test
ls: cannot access '/mnt/test/testdir': Bad file descriptor
ls: cannot access '/mnt/test/testfile': Bad file descriptor
testdir  testfile

Server debug output for the mount and the ls command:

diod: P9_TVERSION tag 65535 msize 8192 version '9P2000.L'
diod: P9_RVERSION tag 65535 msize 8192 version '9P2000.L'
diod: P9_TATTACH tag 0 fid 0 afid -1 uname 'nobody' aname '/mnt/cryptdata/9p' n_uname P9_NONUNAME
diod: user lookup: 65534
diod: P9_RATTACH tag 0 qid (0000000005d80001 0 'd')
diod: P9_TGETATTR tag 0 fid 0 request_mask 0x7ff
diod: P9_RGETATTR tag 0 valid 0x7ff qid (0000000005d80001 0 'd') mode 040755 uid 0 gid 0 nlink 3 rdev 0 size 4096 blksize 4096 blocks 8 atime Fri Oct  8 14:26:30 2021 mtime Fri Oct 15 11:24:59 2021 ctime Fri Oct 15 11:24:59 2021 btime X gen X data_version X
diod: P9_TATTACH tag 0 fid 1 afid -1 uname '' aname '/mnt/cryptdata/9p' n_uname 1000
diod: user lookup: 1000
diod: P9_RATTACH tag 0 qid (0000000005d80001 0 'd')
diod: P9_TWALK tag 0 fid 1 newfid 2 nwname 1 '.Trash'
diod: P9_RLERROR tag 0 ecode 2
diod: P9_TWALK tag 0 fid 1 newfid 2 nwname 1 '.Trash-1000'
diod: P9_RLERROR tag 0 ecode 2
diod: P9_TATTACH tag 0 fid 2 afid -1 uname '' aname '/mnt/cryptdata/9p' n_uname 0
diod: user lookup: 0
diod: P9_RATTACH tag 0 qid (0000000005d80001 0 'd')
diod: P9_TGETATTR tag 0 fid 2 request_mask 0x3fff
diod: P9_RGETATTR tag 0 valid 0x3fff qid (0000000005d80001 0 'd') mode 040755 uid 0 gid 0 nlink 3 rdev 0 size 4096 blksize 4096 blocks 8 atime Fri Oct  8 14:26:30 2021 mtime Fri Oct 15 11:24:59 2021 ctime Fri Oct 15 11:24:59 2021 btime 0 gen 0 data_version 0
diod: P9_TWALK tag 0 fid 2 newfid 3 nwname 0
diod: P9_RWALK tag 0 nwqid 0 
diod: P9_TLOPEN tag 0 fid 3 flags 0304000
diod: P9_RLOPEN tag 0 qid (0000000005d80001 0 'd') iounit 0
diod: P9_TGETATTR tag 0 fid 3 request_mask 0x3fff
diod: P9_RGETATTR tag 0 valid 0x3fff qid (0000000005d80001 0 'd') mode 040755 uid 0 gid 0 nlink 3 rdev 0 size 4096 blksize 4096 blocks 8 atime Fri Oct  8 14:26:30 2021 mtime Fri Oct 15 11:24:59 2021 ctime Fri Oct 15 11:24:59 2021 btime 0 gen 0 data_version 0
diod: P9_TREADDIR tag 0 fid 3 offset 0 count 8168
diod: P9_RREADDIR tag 0 count 114
80000000 000200d8 05000000 00476b8c dba80059 4c040700 74657374 64697200 
00000000 0400d805 00000000 1325360e 5dab7c55 08080074 65737466 696c6580 
diod: P9_TREADDIR tag 0 fid 3 offset 9223372036854775807 count 8168
diod: P9_RREADDIR tag 0 count 0
diod: P9_TWALK tag 0 fid 3 newfid 4 nwname 1 'testdir'
diod: diod_clone [email protected]:/mnt/cryptdata/9p: Bad file descriptor
diod: P9_RLERROR tag 0 ecode 9
diod: P9_TWALK tag 0 fid 3 newfid 4 nwname 1 'testfile'
diod: diod_clone [email protected]:/mnt/cryptdata/9p: Bad file descriptor
diod: P9_RLERROR tag 0 ecode 9
diod: P9_TREADDIR tag 0 fid 3 offset 9223372036854775807 count 8168
diod: P9_RREADDIR tag 0 count 0
diod: P9_TCLUNK tag 0 fid 3
diod: P9_RCLUNK tag 0
^C

Reverting 6749db3 fixes the problem.

Client and server both running:
This is GNU/Gentoo Linux, arch amd64, kernel 5.14.12
sys-apps/coreutils-8.32-r1
sys-libs/glibc-2.33-r7
sys-devel/binutils-2.37_p1

make check hangs in tests/kern

Running make check as root down in tests/kern hangs at test t05.
My kernel is 5.4.0-7634-generic (ubuntu 20.04 LTS).

This may be a dup of #23 which was against linux-next in 2015, but wanted to open up a new bug until that is confirmed.
