Understanding Soft TKOs (mcrouter issue, closed)

facebook commented on May 3, 2024
Understanding Soft Tkos

from mcrouter.

Comments (1)

alikhtarov commented on May 3, 2024

Unfortunately, the TKO logic is being actively improved and changed, so the documentation will be out of date :)
Please also make sure you run a recent version, as there were a few important bug fixes to TKO logic.

As of the most recent version, --latency-threshold-us is deprecated and has no effect.

You also need to specify the timeout for requests. The easiest way right now is -t (or --server-timeout); the value is in ms. mc_res_timeout in the log means that an individual request had a round trip exceeding this timeout.
--timeouts-until-tko specifies how many 'soft errors' (meaning timeouts) need to happen in a row for the host to be marked down.
'Hard errors' (e.g. connection closed by peer) need to happen only once, and the host will be marked as 'hard TKO'.

The distinction between soft and hard TKO is that any number of hosts can be marked as hard TKO, while there is a limit on the number of hosts that can be soft TKO at the same time. The rationale is to keep the site operational under heavy load: once the limit is reached, the remaining hosts are kept up even though many individual requests to them will time out.
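To make the soft/hard bookkeeping above concrete, here is a toy model in Python. It is only a sketch of the rules as described in this comment, not mcrouter's actual code: the class name, method names, and the max_soft_tkos parameter are made up for illustration, and the real limit and reset behaviour may differ.

```python
from dataclasses import dataclass, field


@dataclass
class TkoTracker:
    """Toy model of soft/hard TKO bookkeeping (NOT mcrouter's real code)."""
    timeouts_until_tko: int   # mirrors the --timeouts-until-tko option
    max_soft_tkos: int        # hypothetical cap on simultaneous soft TKOs
    consecutive_timeouts: dict = field(default_factory=dict)
    soft_tko: set = field(default_factory=set)
    hard_tko: set = field(default_factory=set)

    def on_success(self, host):
        # A successful reply resets the timeout streak and unmarks the host.
        self.consecutive_timeouts[host] = 0
        self.soft_tko.discard(host)
        self.hard_tko.discard(host)

    def on_timeout(self, host):
        # Soft error: only a run of N timeouts in a row marks the host down.
        streak = self.consecutive_timeouts.get(host, 0) + 1
        self.consecutive_timeouts[host] = streak
        if host in self.soft_tko or host in self.hard_tko:
            return
        # The cap applies to soft TKOs only; over the cap the host stays up.
        if streak >= self.timeouts_until_tko and len(self.soft_tko) < self.max_soft_tkos:
            self.soft_tko.add(host)

    def on_hard_error(self, host):
        # Hard error (e.g. connection closed by peer): once is enough, no cap.
        self.soft_tko.discard(host)
        self.hard_tko.add(host)

    def is_tko(self, host):
        return host in self.soft_tko or host in self.hard_tko


tracker = TkoTracker(timeouts_until_tko=3, max_soft_tkos=1)
for _ in range(3):
    tracker.on_timeout("a")
    tracker.on_timeout("b")
tracker.on_hard_error("c")
print(tracker.is_tko("a"), tracker.is_tko("b"), tracker.is_tko("c"))
```

With a cap of one soft TKO, host "a" is marked down after three timeouts in a row, host "b" stays up despite the same streak because the cap is already reached, and host "c" goes down on its first hard error.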

The options --probe-timeout-initial and --probe-timeout-max only take effect after a host has been marked TKO; no regular requests are sent to it during that time. These options control the initial interval between probes (version\r\n commands) and the maximum interval; the interval grows exponentially from the initial value until it reaches the maximum. Probes are sent in the background until one gets a successful reply, which unmarks the host.
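As a rough illustration of how such a probe schedule behaves, the sketch below generates the sequence of intervals between probes. The doubling factor and the example values are assumptions made for the example; the comment above only says the interval changes exponentially between the initial and maximum values, and the function name is hypothetical.

```python
from itertools import islice


def probe_intervals(initial, maximum, factor=2):
    """Yield successive probe intervals, capped at `maximum`.

    `factor` is an assumed multiplier; the real growth rate is not
    stated in the discussion above.
    """
    interval = initial
    while True:
        yield min(interval, maximum)
        interval = min(interval * factor, maximum)


# Example values are arbitrary; units are whatever the two options use.
print(list(islice(probe_intervals(3000, 60000), 7)))
# prints [3000, 6000, 12000, 24000, 48000, 60000, 60000]
```

Once the cap is reached, probes keep going out at the maximum interval until one of them succeeds.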

For debugging, something that might help is the 'stats servers' command (echo 'stats servers' | nc mcrouter_host mcrouter_port). It lists error counts (including timeouts) for every server in the config, for example:

STAT [1.2.3.4]:11111:TCP:ascii-1000 avg_latency_us:5447.932 pending_reqs:0 inflight_reqs:0 new:4; found:9543 notfound:4448 stored:5820

avg_latency_us is an exponentially weighted average of the round-trip time and should tell you if something is slow for that server. A timeout: counter, if present, shows the number of timeouts for that destination; normally it should be 0.
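If you want to watch these numbers from a script, a line like the one above can be split into its counters. This is a minimal sketch that assumes every line has the same shape as the example (STAT, then the server key, then key:value tokens, some followed by a semicolon); the function name is made up and the format may vary between mcrouter versions.

```python
def parse_stat_line(line):
    """Split one 'stats servers' line into (server_key, counters).

    Assumes the layout of the example line in this thread; not an
    official parser for mcrouter's output.
    """
    parts = line.split()
    if len(parts) < 2 or parts[0] != "STAT":
        raise ValueError("not a STAT line: %r" % line)
    server = parts[1]
    counters = {}
    for token in parts[2:]:
        # Tokens look like 'found:9543' or 'new:4;' (note the stray ';').
        key, sep, value = token.rstrip(";").rpartition(":")
        if sep:
            counters[key] = float(value)
    return server, counters


line = ("STAT [1.2.3.4]:11111:TCP:ascii-1000 avg_latency_us:5447.932 "
        "pending_reqs:0 inflight_reqs:0 new:4; found:9543 notfound:4448 stored:5820")
server, counters = parse_stat_line(line)
print(server, counters["avg_latency_us"], counters.get("timeout", 0))
```

A missing timeout key is treated as zero here, which matches the healthy case described above.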
