ceph / dmclock

Code that implements the dmclock distributed quality of service algorithm. See "mClock: Handling Throughput Variability for Hypervisor IO Scheduling" by Gulati, Merchant, and Varman.

License: Other

CMake 2.75% C++ 86.38% Shell 1.01% Python 1.56% Roff 8.30%

dmclock's Issues

How to configure the mclock_client queue in ceph mimic? How to distinguish different clients?

I am now trying to run tests to see how the mclock_client queue works on mimic. But when I tried to configure the tags (r, w, l) for each client, I found there are no options to distinguish different clients.
All I got were the following options for mclock_opclass, which are used to distinguish different types of operations:

[root@ceph-node1 ~]# ceph daemon osd.0 config show | grep mclock
"osd_op_queue": "mclock_opclass",
"osd_op_queue_mclock_client_op_lim": "100.000000",
"osd_op_queue_mclock_client_op_res": "100.000000",
"osd_op_queue_mclock_client_op_wgt": "500.000000",
"osd_op_queue_mclock_osd_subop_lim": "0.000000",
"osd_op_queue_mclock_osd_subop_res": "1000.000000",
"osd_op_queue_mclock_osd_subop_wgt": "500.000000",
"osd_op_queue_mclock_recov_lim": "0.001000",
"osd_op_queue_mclock_recov_res": "0.000000",
"osd_op_queue_mclock_recov_wgt": "1.000000",
"osd_op_queue_mclock_scrub_lim": "100.000000",
"osd_op_queue_mclock_scrub_res": "100.000000",
"osd_op_queue_mclock_scrub_wgt": "500.000000",
"osd_op_queue_mclock_snap_lim": "0.001000",
"osd_op_queue_mclock_snap_res": "0.000000",
"osd_op_queue_mclock_snap_wgt": "1.000000",

I am wondering whether ceph mimic provides any configuration interface for the mclock_client queue. Has anyone set up the mclock_client queue successfully?

Segmentation fault encountered during rados/basic testing

Branch:
https://github.com/ceph/ceph/tree/wip_dmc2_limits

Testing job:

description: rados:basic/{rados.yaml clusters/{fixed-2.yaml openstack.yaml} fs/btrfs.yaml
    mon_kv_backend/leveldb.yaml msgr/random.yaml msgr-failures/few.yaml tasks/repair_test.yaml}

Symptom:
There are a lot of crashes; see below:

2016-08-31T14:19:49.977 INFO:tasks.ceph.osd.0.plana120.stderr:*** Caught signal (Segmentation fault) **
2016-08-31T14:19:49.978 INFO:tasks.ceph.osd.0.plana120.stderr: in thread 7fb12d04c700 thread_name:ceph-osd
2016-08-31T14:19:49.979 INFO:tasks.ceph.osd.0.plana120.stderr: ceph version v11.0.0-1779-g94ce34c (94ce34ccc4f3ee62a95e902e6c5658e6eb04c671)
2016-08-31T14:19:49.981 INFO:tasks.ceph.osd.0.plana120.stderr: 1: (()+0x8865e2) [0x5622ea1ea5e2]
2016-08-31T14:19:49.982 INFO:tasks.ceph.osd.0.plana120.stderr: 2: (()+0x10340) [0x7fb1339b5340]
2016-08-31T14:19:49.983 INFO:tasks.ceph.osd.0.plana120.stderr: 3: (std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release()+0x1e) [0x5622e9d06e7e]
2016-08-31T14:19:49.984 INFO:tasks.ceph.osd.0.plana120.stderr: 4: (crimson::dmclock::PriorityQueueBase<ceph::mClockOpClassQueue::osd_op_type_t, std::pair<boost::intrusive_ptr<PG>, PGQueueable>, 2u>::do_clean()+0x729) [0x5622e9f6dfb9]
2016-08-31T14:19:49.986 INFO:tasks.ceph.osd.0.plana120.stderr: 5: (crimson::RunEvery::run()+0xd2) [0x5622ea1ebd8a]
2016-08-31T14:19:49.987 INFO:tasks.ceph.osd.0.plana120.stderr: 6: (()+0xb1a60) [0x7fb132bb4a60]
2016-08-31T14:19:49.988 INFO:tasks.ceph.osd.0.plana120.stderr: 7: (()+0x8182) [0x7fb1339ad182]
2016-08-31T14:19:49.989 INFO:tasks.ceph.osd.0.plana120.stderr: 8: (clone()+0x6d) [0x7fb13231c47d]
2016-08-31T14:19:49.990 INFO:tasks.ceph.osd.0.plana120.stderr:2016-08-31 14:19:49.973384 7fb12d04c700 -1 *** Caught signal (Segmentation fault) **
2016-08-31T14:19:49.991 INFO:tasks.ceph.osd.0.plana120.stderr: in thread 7fb12d04c700 thread_name:ceph-osd
2016-08-31T14:19:49.992 INFO:tasks.ceph.osd.0.plana120.stderr:
2016-08-31T14:19:49.993 INFO:tasks.ceph.osd.0.plana120.stderr: ceph version v11.0.0-1779-g94ce34c (94ce34ccc4f3ee62a95e902e6c5658e6eb04c671)
2016-08-31T14:19:49.994 INFO:tasks.ceph.osd.0.plana120.stderr: 1: (()+0x8865e2) [0x5622ea1ea5e2]
2016-08-31T14:19:49.995 INFO:tasks.ceph.osd.0.plana120.stderr: 2: (()+0x10340) [0x7fb1339b5340]
2016-08-31T14:19:49.998 INFO:tasks.ceph.osd.0.plana120.stderr: 3: (std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release()+0x1e) [0x5622e9d06e7e]
2016-08-31T14:19:49.999 INFO:tasks.ceph.osd.0.plana120.stderr: 4: (crimson::dmclock::PriorityQueueBase<ceph::mClockOpClassQueue::osd_op_type_t, std::pair<boost::intrusive_ptr<PG>, PGQueueable>, 2u>::do_clean()+0x729) [0x5622e9f6dfb9]
2016-08-31T14:19:50.000 INFO:tasks.ceph.osd.0.plana120.stderr: 5: (crimson::RunEvery::run()+0xd2) [0x5622ea1ebd8a]
2016-08-31T14:19:50.002 INFO:tasks.ceph.osd.0.plana120.stderr: 6: (()+0xb1a60) [0x7fb132bb4a60]
2016-08-31T14:19:50.003 INFO:tasks.ceph.osd.0.plana120.stderr: 7: (()+0x8182) [0x7fb1339ad182]
2016-08-31T14:19:50.004 INFO:tasks.ceph.osd.0.plana120.stderr: 8: (clone()+0x6d) [0x7fb13231c47d]
2016-08-31T14:19:50.004 INFO:tasks.ceph.osd.0.plana120.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2016-08-31T14:19:50.006 INFO:tasks.ceph.osd.0.plana120.stderr:
2016-08-31T14:19:50.040 INFO:tasks.ceph.osd.0.plana120.stderr:     0> 2016-08-31 14:19:49.973384 7fb12d04c700 -1 *** Caught signal (Segmentation fault) **
2016-08-31T14:19:50.042 INFO:tasks.ceph.osd.0.plana120.stderr: in thread 7fb12d04c700 thread_name:ceph-osd
2016-08-31T14:19:50.043 INFO:tasks.ceph.osd.0.plana120.stderr:
2016-08-31T14:19:50.044 INFO:tasks.ceph.osd.0.plana120.stderr: ceph version v11.0.0-1779-g94ce34c (94ce34ccc4f3ee62a95e902e6c5658e6eb04c671)
2016-08-31T14:19:50.045 INFO:tasks.ceph.osd.0.plana120.stderr: 1: (()+0x8865e2) [0x5622ea1ea5e2]
2016-08-31T14:19:50.046 INFO:tasks.ceph.osd.0.plana120.stderr: 2: (()+0x10340) [0x7fb1339b5340]
2016-08-31T14:19:50.047 INFO:tasks.ceph.osd.0.plana120.stderr: 3: (std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release()+0x1e) [0x5622e9d06e7e]
2016-08-31T14:19:50.048 INFO:tasks.ceph.osd.0.plana120.stderr: 4: (crimson::dmclock::PriorityQueueBase<ceph::mClockOpClassQueue::osd_op_type_t, std::pair<boost::intrusive_ptr<PG>, PGQueueable>, 2u>::do_clean()+0x729) [0x5622e9f6dfb9]
2016-08-31T14:19:50.050 INFO:tasks.ceph.osd.0.plana120.stderr: 5: (crimson::RunEvery::run()+0xd2) [0x5622ea1ebd8a]
2016-08-31T14:19:50.051 INFO:tasks.ceph.osd.0.plana120.stderr: 6: (()+0xb1a60) [0x7fb132bb4a60]
2016-08-31T14:19:50.052 INFO:tasks.ceph.osd.0.plana120.stderr: 7: (()+0x8182) [0x7fb1339ad182]
2016-08-31T14:19:50.053 INFO:tasks.ceph.osd.0.plana120.stderr: 8: (clone()+0x6d) [0x7fb13231c47d]
2016-08-31T14:19:50.054 INFO:tasks.ceph.osd.0.plana120.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2016-08-31T14:19:50.055 INFO:tasks.ceph.osd.0.plana120.stderr:
2016-08-31T14:19:50.249 INFO:tasks.ceph.osd.0.plana120.stderr:daemon-helper: command crashed with signal 11

Where is gtest/gtest_prod.h?

Hi, I'm very interested in dmclock.
I was just compiling the wip_dmclock2 branch, but I ran into this error:

In file included from ./common/mClockPriorityQueue.h:26:0,
from ./osd/mClockOpClassQueue.h:26,
from osd/OSD.h:52,
from osd/PG.cc:23:
./dmclock/src/dmclock_server.h:45:30: fatal error: gtest/gtest_prod.h: No such file or directory
#include "gtest/gtest_prod.h"

What can be done to fix it? Thank you very much!

Will the dmclock queue get stuck?

As I read the code, I find there are at most three opportunities for a request to trigger schedule_request: after a new request is added, after a request completes, and when a non-ready request becomes ready. My question is: will the dmclock queue get stuck if can_handle_fn frequently returns false and wastes all of the opportunities to call schedule_request?
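For readers tracing the same question, below is a minimal, self-contained C++ model of the push-style scheduling described above. It is not dmclock's actual code; the names (can_handle, handle, schedule_request, on_became_ready) are placeholders that only mirror the callbacks mentioned in the question, to make the three trigger points concrete.

// Simplified model of the three schedule_request() trigger points
// (illustrative only, not dmclock's implementation).
#include <functional>
#include <queue>

struct PushQueueModel {
  std::queue<int> ready;               // requests whose tags are ready
  std::function<bool()> can_handle;    // e.g. "is a worker thread free?"
  std::function<void(int)> handle;     // hand a request to the server

  // Trigger 1: a new request is added.
  void add_request(int r) { ready.push(r); schedule_request(); }

  // Trigger 2: the server reports a completion back to the queue.
  void request_completed() { schedule_request(); }

  // Trigger 3: a timer notices a previously not-ready request is now ready.
  void on_became_ready() { schedule_request(); }

  void schedule_request() {
    // If can_handle() returns false here, the request simply stays queued
    // until one of the three triggers fires again; the queue never polls
    // on its own.
    while (!ready.empty() && can_handle()) {
      handle(ready.front());
      ready.pop();
    }
  }
};

Under this model a false can_handle_fn only defers dispatch as long as some later trigger (typically a completion of in-flight work) still fires; whether a state can be reached where no further trigger ever fires is exactly the question being asked.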

How to configure reject_threshold?

I don't know how to configure reject_threshold; the check is tag.limit > time + reject_threshold. What is the meaning of tag.limit?
Can you please explain this?
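For reference, here is a small illustrative sketch of the rejection test quoted above (tag.limit > time + reject_threshold). The names and semantics are assumptions for illustration, not dmclock's exact code: the limit tag is read as the earliest time at which the client's configured limit would allow the request to run, and it advances by 1/limit seconds per request.

// Illustrative sketch of the reject_threshold test (assumed semantics,
// not the exact dmclock code). All times are in seconds.
bool should_reject(double limit_tag, double now, double reject_threshold) {
  // limit_tag: earliest time the client's limit permits this request.
  // If that moment lies more than reject_threshold in the future, the
  // request would wait too long in the queue, so it is rejected up front.
  return limit_tag > now + reject_threshold;
}

Worked example under these assumptions: with limit = 100 IOPS the limit tag advances by 0.01 s per request, so a client roughly 200 requests ahead of its budget carries a tag about 2 s beyond now; with reject_threshold = 1.0 the next request is rejected, while with reject_threshold = 3.0 it is still queued.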

Confusing error while compiling "dmclock-tests"

@ivancich Hi sir, I'm interested in dmclock and tried to use it.

[root@tana75 dmclock]# cmake . -DCMAKE_BUILD_TYPE=Debug -DPROFILE=yes -DDO_NOT_DELAY_TAG_CALC=yes
-- Boost version: 1.53.0
-- Configuring done
-- Generating done
-- Build files have been written to: /home/yanjun/git/dmclock
[root@tana75 dmclock]# make
[ 33%] Building CXX object src/CMakeFiles/dmclock.dir/dmclock_util.cc.o
[ 66%] Building CXX object src/CMakeFiles/dmclock.dir/__/support/src/run_every.cc.o
Linking CXX static library libdmclock.a
[ 66%] Built target dmclock
[100%] Building CXX object support/test/CMakeFiles/test_intru_heap.dir/test_intrusive_heap.cc.o
Linking CXX executable test_intru_heap
[100%] Built target test_intru_heap
[root@tana75 dmclock]# 

But I encountered problems when I tried to compile the dmclock-tests or dmclock-sims targets, as follows:

[root@tana75 dmclock]# make dmclock-tests
Linking CXX executable ../../test/data-struct-tests
/usr/bin/ld: cannot find -lPUBLIC
collect2: error: ld returned 1 exit status
make[3]: *** [test/data-struct-tests] Error 1
make[2]: *** [support/test/CMakeFiles/data-struct-tests.dir/all] Error 2
make[1]: *** [test/CMakeFiles/dmclock-tests.dir/rule] Error 2
make: *** [dmclock-tests] Error 2
[root@tana75 dmclock]# 

It seems that PUBLIC is not a normal library; why would we link against it? Is there anything I missed?

How to configure the r, w, l parameters for IO?

I want to test dmclock in a Ceph cluster (luminous, with mclock_client), but I have two troubles:
1. The r, w, l parameters are based on IOPS, and different block sizes correspond to different IOPS; how do I configure this? (See the sketch after this question.)
2. I want to reduce the performance of client IO by about 10% in the recovery scenario. How can I adjust these parameters?
@ivancich
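Not an authoritative answer, but for orientation: in the dmclock library itself the r/w/l triple for a class of clients is carried in a ClientInfo value returned by the client-info callback, and all three values are per request (ops per second), so they do not account for block size by themselves. A hedged sketch follows; the constructor argument order (reservation, weight, limit) is taken from the dmclock headers, while the op_class_t enum, the numbers, and the use of 0.0 to mean "no limit" are illustrative assumptions.

// Hedged sketch: a client-info helper that gives recovery roughly one
// tenth of the client-IO weight. Values and the enum are assumptions.
#include "dmclock_server.h"

namespace dmc = crimson::dmclock;

enum class op_class_t { client_io, recovery };

dmc::ClientInfo client_info_for(op_class_t c) {
  switch (c) {
  case op_class_t::client_io:
    // 50 ops/s reserved, weight 10, 0.0 assumed to mean "no limit"
    return dmc::ClientInfo(50.0, 10.0, 0.0);
  case op_class_t::recovery:
  default:
    // no reservation, ~10% of the client weight, capped at 100 ops/s
    return dmc::ClientInfo(0.0, 1.0, 100.0);
  }
}

In Ceph itself these values come from the osd_op_queue_mclock_* options shown in the first issue above rather than from code.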

"limit" is out of control ?

I used the following configuration, which disables allow_limit_break and sets the limit to a low value.

[global]
server_groups = 1
client_groups = 1
server_random_selection = 0
server_soft_limit = 0                <== disable allow_limit_break

[client.0]
client_count = 100
client_wait = 0
client_total_ops = 1000
client_server_select_range = 10
client_iops_goal = 100
client_outstanding_ops = 100
client_reservation = 20.0
client_limit = 30.0                  <== set limit to 30
client_weight = 1.0

[server.0]
server_count = 100
server_iops = 80
server_threads = 1

The results are shown below; we expected the IOPS to be about 30 per client, but they greatly exceed this limit.

simulation completed in 30495 millisecs
==== Client Data ====
     client:       0       1       2      97      98      99  total
        t_0:   55.00   55.00   56.00   44.00   42.00   51.50 5307.00    ==> expect to be about 30.00 per client?
        t_1:   71.50   74.50   66.00   85.50   86.50   39.00 6153.00
        t_2:   18.50   19.50   23.00   23.00   12.50   39.50 2937.50
        t_3:   15.00   11.50   15.00   10.50   13.00   34.00 1857.00
        t_4:   34.00   33.00   34.00   37.00   39.50   31.00 3276.50
        t_5:   29.50   28.50   30.00   24.00   29.50   63.00 3712.00
        t_6:   71.50   32.00   61.00   31.00   30.50   10.50 4150.00
        t_7:   10.00   61.00   10.00   43.50   60.00   17.00 2593.50
        t_8:   10.00   10.00   29.50   43.50   10.00   36.50 2604.00
        t_9:   40.00   31.00   25.50   10.50   33.00   21.00 2463.50
       t_10:   20.00   24.00   24.50   36.50   22.00   32.00 2612.00
       t_11:   29.00   22.50   29.50   20.00   23.00   30.00 2975.50
       t_12:   28.50   30.50   65.00   23.00   32.00   59.50 4108.00
       t_13:   62.50   49.50   20.50   32.50   59.50   10.50 3423.50
       t_14:    0.00   12.50    5.50   35.50    7.00   25.00 1436.50
       t_15:    0.00    0.00    0.00    0.00    0.00    0.00   28.50
       t_16:    0.00    0.00    0.00    0.00    0.00    0.00    0.00
    res_ops:     642     633     653     621     640     671   64620
   prop_ops:     358     367     347     379     360     329   35380
total time to track responses: 418389197 nanoseconds;
    count: 100000;
    average: 4183.89 nanoseconds per request/response
total time to get request parameters: 554717345 nanoseconds;
    count: 100000;
    average: 5547.17 nanoseconds per request/response

I found this in the ceph wip_dmclock2 branch and tried to verify it here. Could it be related to inaccurate delta/rho values?
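For context on why delta/rho accuracy matters here: the dmClock tag formulas (paraphrased from the Gulati, Merchant, and Varman paper; notation approximate) advance each client's limit tag by \delta/l per request, where \delta accounts for the client's requests served at other servers since its previous request to this server:

R_i^r = \max\left(R_i^{r-1} + \frac{\rho_i^r}{r_i},\ t\right), \qquad
L_i^r = \max\left(L_i^{r-1} + \frac{\delta_i^r}{l_i},\ t\right), \qquad
P_i^r = \max\left(P_i^{r-1} + \frac{\delta_i^r}{w_i},\ t\right)

where r_i, w_i, l_i are client i's reservation, weight, and limit, and t is the current time. If \delta is under-reported, the limit tag L advances too slowly and more than l_i requests per second look eligible, which would be consistent with the over-limit throughput shown above.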

There are "out of order op" errors while testing qa suites

@ivancich Hi sir, we saw the following errors in the qa testing, which seem to be out-of-order ops. Could you take a look at this:

2016-09-07T13:41:05.288 INFO:tasks.rados.rados.0.plana133.stdout:2473:  finishing write tid 4 to ---plana1338849-11 
2016-09-07T13:41:05.291 INFO:tasks.rados.rados.0.plana133.stderr:Error: finished tid 4 when last_acked_tid was 5
2016-09-07T13:41:05.293 INFO:teuthology.orchestra.run.plana133:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph osd unset noscrub'
2016-09-07T13:41:05.303 INFO:tasks.ceph.osd.4.plana138.stderr:2016-09-07 13:41:05.298396 7ff185546700 -1 received  signal: Hangup from  PID: 10335 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 4  UID: 0
2016-09-07T13:41:05.410 INFO:tasks.ceph.osd.2.plana133.stderr:2016-09-07 13:41:05.398355 7fd7d2de1700 -1 received  signal: Hangup from  PID: 7952 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 2  UID: 0
2016-09-07T13:41:05.501 INFO:tasks.ceph.osd.4.plana138.stderr:2016-09-07 13:41:05.499323 7ff185546700 -1 received  signal: Hangup from  PID: 10335 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 4  UID: 0
2016-09-07T13:41:05.606 INFO:tasks.ceph.osd.2.plana133.stderr:2016-09-07 13:41:05.605954 7fd7d2de1700 -1 received  signal: Hangup from  PID: 7952 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 2  UID: 0
2016-09-07T13:41:05.644 INFO:teuthology.orchestra.run.plana138:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 30 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-osd.3.asok dump_ops_in_flight'
2016-09-07T13:41:05.702 INFO:tasks.ceph.osd.3.plana138.stderr:2016-09-07 13:41:05.702094 7f0d1cd2b700 -1 received  signal: Hangup from  PID: 28456 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 3  UID: 0
2016-09-07T13:41:05.706 INFO:tasks.rados.rados.0.plana133.stderr:/srv/autobuild-ceph/gitbuilder.git/build/out~/ceph-11.0.0-2034-gd9ad852/src/test/osd/RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)' thread 7f2e2f7fe700 time 2016-09-07 13:41:05.250253
2016-09-07T13:41:05.709 INFO:tasks.rados.rados.0.plana133.stderr:/srv/autobuild-ceph/gitbuilder.git/build/out~/ceph-11.0.0-2034-gd9ad852/src/test/osd/RadosModel.h: 866: FAILED assert(0)
2016-09-07T13:41:05.722 INFO:tasks.rados.rados.0.plana133.stderr: ceph version v11.0.0-2034-gd9ad852 (d9ad852b9b7f54c4993b63f6651063b479c4de2c)
2016-09-07T13:41:05.724 INFO:tasks.rados.rados.0.plana133.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x56002930651b]
2016-09-07T13:41:05.727 INFO:tasks.rados.rados.0.plana133.stderr: 2: (WriteOp::_finish(TestOp::CallbackInfo*)+0x4b8) [0x5600292f3238]
2016-09-07T13:41:05.728 INFO:tasks.rados.rados.0.plana133.stderr: 3: (write_callback(void*, void*)+0x19) [0x560029305a89]
2016-09-07T13:41:05.730 INFO:tasks.rados.rados.0.plana133.stderr: 4: (librados::C_AioSafe::finish(int)+0x1d) [0x7f2e4d2d292d]
2016-09-07T13:41:05.731 INFO:tasks.rados.rados.0.plana133.stderr: 5: (Context::complete(int)+0x9) [0x7f2e4d2b68b9]
2016-09-07T13:41:05.733 INFO:tasks.rados.rados.0.plana133.stderr: 6: (()+0x142cd6) [0x7f2e4d363cd6]
2016-09-07T13:41:05.734 INFO:tasks.rados.rados.0.plana133.stderr: 7: (()+0x8182) [0x7f2e4c7d0182]
2016-09-07T13:41:05.736 INFO:tasks.rados.rados.0.plana133.stderr: 8: (clone()+0x6d) [0x7f2e4b56247d]
2016-09-07T13:41:05.738 INFO:tasks.rados.rados.0.plana133.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2016-09-07T13:41:05.803 INFO:tasks.ceph.osd.2.plana133.stderr:2016-09-07 13:41:05.802800 7fd7d2de1700 -1 received  signal: Hangup from  PID: 7952 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 2  UID: 0

The branch:
https://github.com/ceph/ceph/tree/wip_dmclock2

and the test case:

description: rados/thrash-erasure-code/{leveldb.yaml rados.yaml clusters/{fixed-2.yaml
    openstack.yaml} fs/btrfs.yaml msgr-failures/fastclose.yaml thrashers/default.yaml
    workloads/ec-rados-plugin=jerasure-k=2-m=1.yaml}

I switched the default op_queue to mclock_opclass in src/common/config_opts.h

-OPTION(osd_op_queue, OPT_STR, "wpq")
+OPTION(osd_op_queue, OPT_STR, "mclock_opclass") 
