distributed-system-analysis / smallfile
distributed metadata-intensive workload generator for POSIX-like filesystems
License: Apache License 2.0
How would you restrict the maximum number of directories that can exist at any given time?
The logic seems to revolve around a maximum number of files, and when that maximum is 10k-20k the number of directories is far too large. The script spends a lot of time creating those directories first, so I would like to restrict the maximum number of dirs.
Interested in incorporating these options.
dirs_per_dir
maximum_dirs
Any suggestions?
This is new behavior (version : 3.1) that doesn't seem correct. I've never had to have the target directory mounted on the driver used to run smallfile tests on remote clients.
# ./smallfile_cli.py --host-set "gprfc001" --response-times N --same-dir N --stonewall N --top /mnt/cephfs/smf/ --network-sync-dir /smfnfs/sync/ --threads 1 --files 256 --files-per-dir 1000 --file-size 4 --file-size-distribution exponential --operation create
ERROR: you must ensure that shared directory /mnt/cephfs/smf is accessible from this host and every remote host in test
on baremetal RHEL 7.5 (3.10.0-862)
Can this test tool be used to test non-distributed filesystems, like ext4 or btrfs?
I noticed today when running huge tests that the pickle files output by each thread are enormous, and they cause smallfile_cli.py to blow up in size to many GB. These pickle files are used to return per-thread result data to the master process that initiated the test. The only explanation I have for this is that response times are saved in a python list and are thus put into the pickle file. But there is no need for this - if the user requests response time data, it should be saved in separate .csv files in network_shared/ and so there's no need to return it in the pickle file. If this is correct, then it should be a simple fix.
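The proposed fix — writing response times to per-thread .csv files in network_shared/ and dropping them from the pickled result — could be sketched like this (the function names and result layout here are hypothetical, not smallfile's actual API):

```python
import csv
import pickle


def save_rsp_times_csv(thread_id, rsp_times, shared_dir="network_shared"):
    # write response times to a per-thread CSV so they need not be pickled
    path = "%s/rsptimes_%s.csv" % (shared_dir, thread_id)
    with open(path, "w", newline="") as f:
        w = csv.writer(f)
        for (op, t) in rsp_times:
            w.writerow([op, "%9.6f" % t])
    return path


def pickle_thread_result(result, rsp_times_attr="rsp_times"):
    # drop the (potentially huge) response-time list before pickling,
    # so the pickle file stays small no matter how many files were run
    slim = {k: v for k, v in result.items() if k != rsp_times_attr}
    return pickle.dumps(slim)
```

With this split, the master process still gets counters and status from the pickle, while response-time histograms are post-processed from the CSVs.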
I tried the latest master branch on a Windows client. It returned the error below:
adding time for Windows synchronization
Traceback (most recent call last):
File "C:\Users\administrator.GCHILD\Desktop\smallfile-master\smallfile-master\
smallfile_cli.py", line 288, in <module>
run_workload()
File "C:\Users\administrator.GCHILD\Desktop\smallfile-master\smallfile-master\
smallfile_cli.py", line 279, in run_workload
return multi_thread_workload.run_multi_thread_workload(params)
File "C:\Users\administrator.GCHILD\Desktop\smallfile-master\smallfile-master\
multi_thread_workload.py", line 121, in run_multi_thread_workload
sync_files.write_sync_file(sg, 'hi there')
File "C:\Users\administrator.GCHILD\Desktop\smallfile-master\smallfile-master\
sync_files.py", line 16, in write_sync_file
os.rename(fpath+notyet, fpath)
WindowsError: [Error 32] The process cannot access the file because it is being
used by another process
But the branch "smallfile-dir-share-wi-clnts" works perfectly on a single client.
Run launch_host_smf on the secondary node and start a create or mkdir operation on the main node; the test aborts on the main node.
Logs placed at https://drive.google.com/drive/folders/0B8kUR8TCDdh3Mk1rclVvNHBfaGc?usp=sharing
The Python argparse module automates a lot of what is in parse.py and elsewhere. Nick Dokos turned me on to it. Should we convert to it? How much code would it save? How much would it change existing syntax?
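A rough sketch of what a conversion might look like, keeping smallfile's Y/N boolean syntax (option names are taken from the CLI shown elsewhere on this page; the defaults and type handling are assumptions):

```python
import argparse


def boolean(s):
    # preserve smallfile-style Y/N booleans so existing syntax keeps working
    if s.lower() in ("y", "yes", "true", "1"):
        return True
    if s.lower() in ("n", "no", "false", "0"):
        return False
    raise argparse.ArgumentTypeError("expected Y or N, got %r" % s)


def parse(argv=None):
    p = argparse.ArgumentParser(description="small-file workload generator")
    p.add_argument("--operation", default="create")
    p.add_argument("--threads", type=int, default=2)
    p.add_argument("--files", type=int, default=200)
    p.add_argument("--file-size", type=int, default=64)
    p.add_argument("--files-per-dir", type=int, default=100)
    p.add_argument("--stonewall", type=boolean, default=True)
    p.add_argument("--top", default="/var/tmp/smf")
    return p.parse_args(argv)
```

argparse would also generate the usage message and type errors for free, which would have avoided the `usage` NameError reported in another issue below.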
I was trying to re-check whether python itself is becoming a bottleneck for smallfile, and specifically I wanted to see if pypy3 was better than python3 and could keep up with the "find" utility for reading directory trees. Initially pypy3 would not run because of a new dependency on PyYAML, but when I commented out the YAML parsing option in smallfile (just 2 lines), pypy3 worked fine.
So I created a tree containing 1 million 0-byte (empty) files using a single smallfile thread, and then tried doing the smallfile readdir operation both with pypy and with python3, and then compared it to the same result with the "find" utility. No cache-dropping was done, so that all the metadata could be memory-resident. While this may seem an unfair comparison, NVDIMM-N memories can provide low response times similar to this, and in addition cached-storage performance is something we need to measure. So the 3 commands were:
python3 ./smallfile_cli.py --threads 1 --file-size 0 --files 1048576 --operation readdir
pypy3 ./smallfile_cli.py --threads 1 --file-size 0 --files 1048576 --operation readdir
find /var/tmp/smf/file_srcdir/bene-laptop/thrd_00 -type f | wc -l
The results were:
test thousands of files/sec
---- ---------------------------
python3 160
pypy3 352
find 1000
Are all 3 benchmarks doing the same system calls? When I used strace to compare, smallfile was originally at a disadvantage because it was doing system calls to see if other threads had finished. Specifically it was looking for stonewall.tmp in the shared network directory every 100 files. This is not a big deal when doing actual file reads/writes, but for readdir this is a significant increase in the number of system calls. Here's what smallfile was doing per directory:
5693 openat(AT_FDCWD, "/var/tmp/smf/file_srcdir/bene-laptop/thrd_00/d_001/d_002/d_008", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 8
5693 fstat(8, {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
5693 getdents(8, /* 112 entries */, 32768) = 5168
5693 getdents(8, /* 0 entries */, 32768) = 0
5693 close(8) = 0
5693 stat("/var/tmp/smf/network_shared/stonewall.tmp", 0x7ffc19b899e0) = -1 ENOENT (No such file or directory)
5693 stat("/var/tmp/smf/network_shared/stonewall.tmp", 0x7ffc19b899e0) = -1 ENOENT (No such file or directory)
5693 stat("/var/tmp/smf/network_shared/stonewall.tmp", 0x7ffc19b899e0) = -1 ENOENT (No such file or directory)
5693 stat("/var/tmp/smf/network_shared/stonewall.tmp", 0x7ffc19b899e0) = -1 ENOENT (No such file or directory)
5693 stat("/var/tmp/smf/network_shared/stonewall.tmp", 0x7ffc19b899e0) = -1 ENOENT (No such file or directory)
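The pattern above — a stat() of the stonewall file every so many completed files — can be sketched with a simple counter like this (class and parameter names are illustrative, not smallfile's actual code):

```python
import os


class StonewallChecker:
    # check for the stonewall file only once every check_interval files,
    # so most files cost zero extra system calls
    def __init__(self, path, check_interval=100):
        self.path = path
        self.check_interval = check_interval
        self.files_done = 0

    def should_stop(self):
        self.files_done += 1
        if self.files_done % self.check_interval != 0:
            return False  # no stat() on this iteration
        return os.path.exists(self.path)  # one stat() per interval
```

Raising the interval (or checking wall-clock time instead of file count) trades stonewall precision for fewer system calls, which matters most for metadata-only operations like readdir.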
By adding the parameter "--stonewall Y" we get rid of the excess stat() system calls and the sequence per directory then becomes optimal. Rerunning the tests with this parameter, we get:
test thousands of files/sec
---- ---------------------------
python3 172 (was 160)
pypy3 380 (was 352)
find 1000
Here's the system call pattern for the "find" command:
fcntl(9, F_DUPFD_CLOEXEC, 0) = 4
newfstatat(9, "d_009", {st_mode=S_IFDIR|0775, st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0
openat(9, "d_009", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_NOFOLLOW|O_CLOEXEC|O_DIRECTORY) = 6
fstat(6, {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
fcntl(6, F_GETFL) = 0x38800 (flags O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_NOFOLLOW|O_DIRECTORY)
fcntl(6, F_SETFD, FD_CLOEXEC) = 0
newfstatat(9, "d_009", {st_mode=S_IFDIR|0775, st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0
fcntl(6, F_DUPFD_CLOEXEC, 3) = 10
getdents(6, /* 102 entries */, 32768) = 4848
getdents(6, /* 0 entries */, 32768) = 0
close(6) = 0
So the find utility is actually making more system calls.
pypy3 and python3 are both using 100% CPU according to "top", which means they are using up a whole core and can't go any faster. Most of their time is spent in user space not system space. Yet even pypy3 is 1/3 the speed of "find".
So what conclusion do we draw from this? Perhaps smallfile and other utilities will need to be rewritten in a compiled language for greater speed, in order to keep up with modern storage hardware.
should not do this if any exception happens in a worker thread
The README has:
Copyright [2012] [Ben England]
Licensed under the Apache License, Version 2.0 (the "License"); you may not use files except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
It'd be great to add this as a LICENSE file in the repo, so it's parsed by GitHub.
The smallfile tool is being used, version-pinned, as a plugin for Arcaflow. We are currently using the GitHub tag v1.1 to pull the latest release, which we use for the histogram post-processing features. As GitHub tags can be unreliable, we would like to point to a release build instead, per arcalot/arcaflow-plugins-incubator#22.
Also see related issue #34
I am running a series of tests; the create operation, which is performed first, never reaches 100% of files processed and almost always ends at 87.5%. This happens no matter what number of files/threads/fsync/pause options I use, and of course causes subsequent operations to reach 62%, 80%, 25%, etc.
It seems to me there is a timeout somewhere that stops the operation no matter how many files I want to process.
I am running the test on 9 clients with 24 cores each, for instance:
files/thread : 200
threads : 2
record size (KB, 0 = maximum) : 0
file size (KB) : 64
file size distribution : fixed
files per dir : 100
dirs per dir : 10
threads share directories? : N
filename prefix : fsync_
filename suffix :
hash file number into dir.? : N
fsync after modify? : Y
pause between files (microsec) : 300
finish all requests? : Y
stonewall? : Y
measure response times? : N
verify read? : Y
verbose? : False
log to stderr? : False
ext.attr.size : 0
ext.attr.count : 0
permute host directories? : N
example output:
total threads = 16
total files = 2800
total IOPS = 2800
total data = 0.171 GiB
87.50% of requested files processed, minimum is 90.00
elapsed time = 3.961
files/sec = 706.826175
IOPS = 706.826175
MiB/sec = 44.176636
not enough total files processed, change test parameters
What is the meaning of this error?
ERROR: not enough total files processed, change test parameters
The error message "record size cannot exceed file size" is correct and informative but it looks like the usage statement has a slight problem (undefined).
Traceback (most recent call last):
File "./smallfile_cli.py", line 284, in
run_workload()
File "./smallfile_cli.py", line 265, in run_workload
params = parse.parse()
File "/root/smallfile/parse.py", line 192, in parse
usage('record size cannot exceed file size')
NameError: global name 'usage' is not defined
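A minimal fix would be to define the missing helper in parse.py (or call whatever error routine the module actually uses — this sketch is an assumption about what `usage` was meant to do):

```python
import sys


def usage(msg):
    # print the error plus a short usage summary, then exit non-zero
    print("ERROR: %s" % msg, file=sys.stderr)
    print("usage: smallfile_cli.py --operation op --files N [options]",
          file=sys.stderr)
    sys.exit(1)
```

With this defined, "record size cannot exceed file size" would print cleanly instead of raising a NameError.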
If I run "python smallfile.py" the unit test passes, but if I run "pypy smallfile.py" it fails with "NameError: name 'xattr' is not defined", but python*pyxattr packages are installed.
This feature makes it easier to integrate smallfile with various test harnesses and CIs, such as Ceph Teuthology and Jenkins pipelines. The proposed branch name is "yaml". This will be merged soon if no objections are heard.
Smallfile should define its own exception classes instead of throwing generic Exception. This allows more specific exception handling and easier debugging if there is a problem.
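One possible shape for such a hierarchy, using error messages that appear elsewhere on this page (the class names are suggestions, not existing smallfile code):

```python
class SmallfileError(Exception):
    """Base class for all smallfile-specific errors."""


class StartingGateTimeout(SmallfileError):
    """Raised when threads or hosts fail to reach the starting gate in time."""

    def __init__(self, kind, timeout):
        super().__init__(
            "%s did not reach starting gate within %d sec" % (kind, timeout))
        self.timeout = timeout


class NotEnoughFilesProcessed(SmallfileError):
    """Raised when fewer than the minimum fraction of files completed."""
```

Callers could then catch SmallfileError (or a specific subclass) instead of a bare Exception, and carry structured data like the timeout value for debugging.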
[root@localhost smallfile-master]# python smallfile_cli.py --top /nfs/test/ --operation create --host-set 172.16.149.42 --network-sync-dir /tmp/smallfile-master/test/
smallfile version 3.0
hosts in test : ['172.16.149.42']
top test directory(s) : ['/nfs/test']
operation : create
files/thread : 200
threads : 2
record size (KB, 0 = maximum) : 0
file size (KB) : 64
file size distribution : fixed
files per dir : 100
dirs per dir : 10
threads share directories? : N
filename prefix :
filename suffix :
hash file number into dir.? : N
fsync after modify? : N
pause between files (microsec) : 0
finish all requests? : Y
stonewall? : Y
measure response times? : N
verify read? : Y
verbose? : False
log to stderr? : False
ext.attr.size : 0
ext.attr.count : 0
permute host directories? : N
remote program directory : /tmp/smallfile-master
network thread sync. dir. : /tmp/smallfile-master/test/
Exception seen in thread 00 host 172.16.149.42 (tail /var/tmp/invoke_logs-00.log)
Exception seen in thread 01 host 172.16.149.42 (tail /var/tmp/invoke_logs-01.log)
Traceback (most recent call last):
File "/tmp/smallfile-master/smallfile_remote.py", line 48, in
run_workload()
File "/tmp/smallfile-master/smallfile_remote.py", line 39, in run_workload
return multi_thread_workload.run_multi_thread_workload(params)
File "/tmp/smallfile-master/multi_thread_workload.py", line 107, in run_multi_thread_workload
% startup_timeout)
Exception: threads did not reach starting gate within 11 sec
thread ssh-thread:172.16.149.42:256:ssh -x -o StrictHostKeyChecking=no 172.16.149.42 "python /tmp/smallfile-master/smallfile_remote.py --network-sync-dir /tmp/smallfile-master/test/ --as-host 172.16.149.42" has died
Traceback (most recent call last):
File "smallfile_cli.py", line 280, in
run_workload()
File "smallfile_cli.py", line 270, in run_workload
return run_multi_host_workload(params)
File "smallfile_cli.py", line 182, in run_multi_host_workload
'within %d seconds' % host_timeout)
Exception: hosts did not reach starting gate within 17 seconds
For large tests we need threads to start slowly, so that all threads can get into the mix and one thread doesn't pull way ahead of the others. This is particularly important for tests with large numbers of threads (i.e. lots of client hosts). fio does this with its "ramp_time" and "startdelay" parameters. I would like something more automatic for smallfile, but we could start out this way.
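A simple per-thread stagger, similar in spirit to fio's startdelay, could be sketched as follows (the function and parameter names are hypothetical, not an existing smallfile feature):

```python
import time


def ramped_start_delay(thread_index, total_threads, ramp_time=5.0):
    # spread thread start times evenly across ramp_time seconds so no
    # single thread pulls far ahead before the others are running
    if total_threads <= 1:
        return 0.0
    return ramp_time * thread_index / (total_threads - 1)


def start_thread(thread_index, total_threads, workload, ramp_time=5.0):
    # sleep for this thread's share of the ramp, then run the workload
    time.sleep(ramped_start_delay(thread_index, total_threads, ramp_time))
    workload()
```

A more automatic variant might scale ramp_time with the thread count, but an explicit parameter would be a reasonable first step.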
self.remote_cmd = '%s %s "%s"' % \
    (self.ssh_prefix, self.remote_host, remote_cmd_in)
The command is not prefixed with 'python', so the ssh <remote_host> command does not get executed on the remote system. Changing it to:
self.remote_cmd = '%s %s "python %s"' % \
    (self.ssh_prefix, self.remote_host, remote_cmd_in)
fixed the issue for me.
While trying to do some I/O on a mount point, I got this:
sudo python smallfile_cli.py --operation create --threads 8 --file-size 1024 --files 2048 --top /mnt/mpoint/
version : 3.1
hosts in test : None
top test directory(s) : ['/mnt/cephfs']
operation : create
files/thread : 2048
threads : 8
record size (KB, 0 = maximum) : 0
file size (KB) : 1024
file size distribution : fixed
files per dir : 100
dirs per dir : 10
threads share directories? : N
filename prefix :
filename suffix :
hash file number into dir.? : N
fsync after modify? : N
pause between files (microsec) : 0
finish all requests? : Y
stonewall? : Y
measure response times? : N
verify read? : Y
verbose? : False
log to stderr? : False
ext.attr.size : 0
ext.attr.count : 0
Traceback (most recent call last):
File "smallfile_cli.py", line 279, in
run_workload()
File "smallfile_cli.py", line 264, in run_workload
params = parse.parse()
File "/home/cephuser/smallfile/parse.py", line 280, in parse
test_params.recalculate_timeouts()
AttributeError: smf_test_params instance has no attribute 'recalculate_timeouts'
Hi,
I used smallfile a couple of years ago to get metric on a glusterfs setup using up to 4 hosts-set, and it worked great!
Now I'm kind of replicating my metrics with a cephfs, but I'm not able to do any test with any other host than the localhost?
I mean, lets use this simple command:
./smallfile_cli.py --operation create --top $MY_MOUNTPOINT --host-set $MY_HOSTNAME
Changing the values in this command, I get this:
If $MY_HOSTNAME is only the localhost, it works.
If $MY_HOSTNAME contains any other host (e.g. c7), it fails with:
Traceback (most recent call last):
File "/imatge/imagen/smf-benchmark/smallfile/smallfile_remote.py", line 48, in <module>
run_workload()
File "/imatge/imagen/smf-benchmark/smallfile/smallfile_remote.py", line 39, in run_workload
return multi_thread_workload.run_multi_thread_workload(params)
File "/imatge/imagen/smf-benchmark/smallfile/multi_thread_workload.py", line 143, in run_multi_thread_workload
% prm.host_startup_timeout)
Exception: starting signal not seen within 7 seconds
ERROR: ssh thread for host c7 completed with status 256
pickle file /mnt/cephfs/smf/network_shared/c7_result.pickle not found
no pickled invokes read, so no results
Changing $MY_MOUNTPOINT returns the same results.
The ssh works fine without a password on any host.
We're using Debian-Jessie ssh.
Do you have any idea where the problem could be?
Hi,
I don't understand why I get the error below.
[root@host1 smallfile]# python smallfile_cli.py --top /mnt/gluster/gv0_ganesha_nfs/smallfile --host-set host1,host2,host3,host4,host6,host7 --threads 8 --file-size 4 --files 10 --response-times Y --operation create
version : 3.1
hosts in test : ['host1', 'host2', 'host3', 'host4', 'host6', 'host7']
top test directory(s) : ['/mnt/gluster/gv0_ganesha_nfs/smallfile']
operation : create
files/thread : 10
threads : 8
record size (KB, 0 = maximum) : 0
file size (KB) : 4
file size distribution : fixed
files per dir : 100
dirs per dir : 10
threads share directories? : N
filename prefix :
filename suffix :
hash file number into dir.? : N
fsync after modify? : N
pause between files (microsec) : 0
finish all requests? : Y
stonewall? : Y
measure response times? : Y
verify read? : Y
verbose? : False
log to stderr? : False
ext.attr.size : 0
ext.attr.count : 0
permute host directories? : N
remote program directory : /root/smallfile
network thread sync. dir. : /mnt/gluster/gv0_ganesha_nfs/smallfile/network_shared
starting all threads by creating starting gate file /mnt/gluster/gv0_ganesha_nfs/smallfile/network_shared/starting_gate.tmp
Traceback (most recent call last):
File "/root/smallfile/smallfile_remote.py", line 48, in <module>
run_workload()
File "/root/smallfile/smallfile_remote.py", line 39, in run_workload
return multi_thread_workload.run_multi_thread_workload(params)
File "/root/smallfile/multi_thread_workload.py", line 143, in run_multi_thread_workload
% prm.host_startup_timeout)
Exception: starting signal not seen within 11 seconds
Traceback (most recent call last):
File "/root/smallfile/smallfile_remote.py", line 48, in <module>
run_workload()
File "/root/smallfile/smallfile_remote.py", line 39, in run_workload
return multi_thread_workload.run_multi_thread_workload(params)
File "/root/smallfile/multi_thread_workload.py", line 143, in run_multi_thread_workload
% prm.host_startup_timeout)
Exception: starting signal not seen within 11 seconds
ERROR: ssh thread for host host2 completed with status 256
Traceback (most recent call last):
File "/root/smallfile/smallfile_remote.py", line 48, in <module>
run_workload()
File "/root/smallfile/smallfile_remote.py", line 39, in run_workload
return multi_thread_workload.run_multi_thread_workload(params)
File "/root/smallfile/multi_thread_workload.py", line 143, in run_multi_thread_workload
% prm.host_startup_timeout)
Exception: starting signal not seen within 11 seconds
ERROR: ssh thread for host host3 completed with status 256
Traceback (most recent call last):
File "/root/smallfile/smallfile_remote.py", line 48, in <module>
run_workload()
File "/root/smallfile/smallfile_remote.py", line 39, in run_workload
return multi_thread_workload.run_multi_thread_workload(params)
File "/root/smallfile/multi_thread_workload.py", line 143, in run_multi_thread_workload
% prm.host_startup_timeout)
Exception: starting signal not seen within 11 seconds
ERROR: ssh thread for host host4 completed with status 256
Traceback (most recent call last):
File "/root/smallfile/smallfile_remote.py", line 48, in <module>
run_workload()
File "/root/smallfile/smallfile_remote.py", line 39, in run_workload
return multi_thread_workload.run_multi_thread_workload(params)
File "/root/smallfile/multi_thread_workload.py", line 143, in run_multi_thread_workload
% prm.host_startup_timeout)
Exception: starting signal not seen within 11 seconds
ERROR: ssh thread for host host6 completed with status 256
ERROR: ssh thread for host host7 completed with status 256
pickle file /mnt/gluster/gv0_ganesha_nfs/smallfile/network_shared/host2_result.pickle not found
pickle file /mnt/gluster/gv0_ganesha_nfs/smallfile/network_shared/host3_result.pickle not found
pickle file /mnt/gluster/gv0_ganesha_nfs/smallfile/network_shared/host4_result.pickle not found
pickle file /mnt/gluster/gv0_ganesha_nfs/smallfile/network_shared/host6_result.pickle not found
pickle file /mnt/gluster/gv0_ganesha_nfs/smallfile/network_shared/host7_result.pickle not found
host = host1,thr = 00,elapsed = 1.292002,files = 10,records = 10,status = ok
host = host1,thr = 01,elapsed = 1.314542,files = 10,records = 10,status = ok
host = host1,thr = 02,elapsed = 1.298001,files = 10,records = 10,status = ok
host = host1,thr = 03,elapsed = 1.308788,files = 10,records = 10,status = ok
host = host1,thr = 04,elapsed = 1.302692,files = 10,records = 10,status = ok
host = host1,thr = 05,elapsed = 1.300747,files = 10,records = 10,status = ok
host = host1,thr = 06,elapsed = 1.293828,files = 10,records = 10,status = ok
host = host1,thr = 07,elapsed = 1.299923,files = 10,records = 10,status = ok
total threads = 8
total files = 80
total IOPS = 80
total data = 0.000 GiB
WARNING: failed to get some responses from workload generators
100.00% of requested files processed, minimum is 90.00
elapsed time = 1.315
files/sec = 60.857695
IOPS = 60.857695
MiB/sec = 0.237725
And FYI, I have edited the ssh_thread.py script as follows for passwordless authentication:
ssh_prefix = 'ssh -i /var/lib/glusterd/nfs/secret.pem -x -o StrictHostKeyChecking=no '