RFC: more advanced app clustering (uwsgi issue, CLOSED)

unbit commented on August 19, 2024
RFC: more advanced app clustering

from uwsgi.

Comments (15)

unbit commented on August 19, 2024

I am really interested in that, and I have a customer constantly asking me for such a feature. The only thing I am not sure about is whether it is better to follow a master-locker approach (like the one you described) or a fully distributed one, like Paxos: http://en.wikipedia.org/wiki/Paxos_%28computer_science%29

edevil commented on August 19, 2024

Why not use a Zookeeper ensemble for that? That's what I do.

I'm afraid someday uWSGI will become sentient with all this new functionality. :)

prymitive commented on August 19, 2024

I'm not very familiar with ZooKeeper, can You post more details on how You use it for cron tasks?

prymitive commented on August 19, 2024

Let me clarify what I would like uWSGI to handle:

  • provide runtime environment (env settings, namespace/chroot)
  • app wide resource management - I want to enforce limits for app as whole, including cron tasks (cgroup)
  • execution tracking - was cron executed successfully? maybe it died due to memory limits and OOM kill
  • execution scheduling - on which node should given cron task run next? We could take load, cpu usage, free memory (either system or in cgroup) into account for better node selection

What I would "outsource" to other tools:

  • cluster wide locking
  • advanced clustering with quorum (needed for locks)
  • shared data storage

Bonus features:

  • pausing - ability to pause cron tasks (selected or all), useful for maintenance
  • storing execution history (return code, any stdout/err messages generated during cron task execution)
  • alerts on errors (using uWSGI alert plugins)

uWSGI provides a single environment for both web apps and the cron tasks those apps require to run. What we need is a way to have those cron tasks executed in a coordinated way across a cluster of uWSGI nodes.
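
The "execution scheduling" item above could be sketched roughly as follows. This is only an illustration of picking a node by load and free memory; the NodeStats record, the pick_node helper and the min_free_mb threshold are hypothetical names, not part of uWSGI.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    # Hypothetical per-node metrics; how they are collected is out of scope.
    name: str
    load: float          # e.g. 1-minute load average
    free_memory_mb: int  # free memory, system-wide or within the cgroup


def pick_node(nodes, min_free_mb=0):
    """Return the least loaded node with enough free memory, or None."""
    eligible = [n for n in nodes if n.free_memory_mb >= min_free_mb]
    if not eligible:
        return None
    # Lowest load wins; more free memory breaks ties.
    return min(eligible, key=lambda n: (n.load, -n.free_memory_mb))


nodes = [
    NodeStats("node1", load=0.9, free_memory_mb=2048),
    NodeStats("node2", load=0.2, free_memory_mb=256),
    NodeStats("node3", load=0.4, free_memory_mb=1024),
]
chosen = pick_node(nodes, min_free_mb=512)
print(chosen.name)
```

Here node2 has the lowest load but is skipped because it has too little free memory for the task.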

edevil commented on August 19, 2024

What you said you would "outsource" is what I recommend Zookeeper for. It's fault tolerant and provides CAS operations and shared storage. Perfect for locks, coordination or just shared configuration.

As for running cron jobs, I use cron. :) Of course, I have to embed functionality in these scripts to talk with Zookeeper and decide what to do.

I guess I just find it weird that what I consider an excellent web server/application server can also be an excellent tool for everything you describe. Seems like a whole different functionality. But I don't doubt that uWSGI does that job just as well. :)

prymitive commented on August 19, 2024

Zookeeper does look like the right tool for this job, I'll look into it.

prymitive commented on August 19, 2024

It will require a lot of work, so I will aim for 1.6 with that. Hopefully this will keep me from falling into winter sleep.

unbit commented on August 19, 2024

Ok, I have started investigating better integration with various clustering infrastructures/products.

Currently the exposed API is the following (take it as a draft, even though I have implemented all of it in my company's private repository):

uwsgi.cluster_members() -> returns the list of cluster members
uwsgi.cluster_lord() -> returns a boolean, True if the calling instance is the master/lord (or whatever term is used)
uwsgi.cluster_quorum() -> returns a boolean, True if the cluster has quorum (for backends supporting it)
uwsgi.cluster_lock(resource) -> returns nothing, blocks on a clustered resource
uwsgi.cluster_unlock(resource) -> returns nothing, unlocks a previously locked clustered resource
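
To make the draft easier to picture, here is a minimal sketch of how an app might call it. The cluster_* functions are only the draft described above and are not available in a released uwsgi module, so the sketch uses a hypothetical in-process stand-in (FakeCluster) with the same five names; the cluster_locked helper is also an assumption added for illustration.

```python
import threading
from contextlib import contextmanager


class FakeCluster:
    """Hypothetical in-process stand-in for the draft cluster API."""

    def __init__(self, members, lord):
        self._members = list(members)
        self._lord = lord
        self._locks = {}
        self._guard = threading.Lock()

    def cluster_members(self):
        return list(self._members)

    def cluster_lord(self):
        return self._lord

    def cluster_quorum(self):
        # A real backend would ask the cluster; the stand-in always has quorum.
        return True

    def cluster_lock(self, resource):
        # Blocks until the named resource is free, returns nothing.
        with self._guard:
            lock = self._locks.setdefault(resource, threading.Lock())
        lock.acquire()

    def cluster_unlock(self, resource):
        self._locks[resource].release()


@contextmanager
def cluster_locked(api, resource):
    """Hold a clustered lock for the duration of a with-block."""
    api.cluster_lock(resource)
    try:
        yield
    finally:
        api.cluster_unlock(resource)


api = FakeCluster(["10.0.0.1", "10.0.0.2"], lord=True)
ran = []
if api.cluster_lord() and api.cluster_quorum():
    with cluster_locked(api, "item001"):
        ran.append("item001")
```

Wrapping lock/unlock in a context manager means the resource is released even if the protected code raises.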

The only developed backend is for the Red Hat Cluster Suite (cman/dlm).

I plan to add zookeeper and pacemaker soon after. Both the legion and the cluster subsystems will be available as backends, even if they need some work as they could overlap.

Any other projects to look at in this area?

prymitive commented on August 19, 2024

IMHO most people (including me) do not need anything more sophisticated than a redis server (and with redis-sentinel it should be easy to have decent HA in place now), and I think the goal in the first place should be to create something that is both easy to use and "good enough" for non-critical tasks.
AFAIK cman is limited to 16 nodes, other tools might have other limits, and they all add complexity that might not pay off if all You need to do is run harmless cron tasks. In the case of my apps nothing will break if something goes wrong and a cron is re-executed too fast, so I would rather have a less reliable solution than have to debug cman from time to time.

I would rather use legion and storage plugins to do this job. Legion can talk over multicast, so there is no need to have a full list of nodes and no configuration You need to recreate every time You add or remove a node. And with a storage plugin I can share some data between nodes (like the last run time). Legion just needs a few more touches, like a way of handling nodes with the same valor.
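
One possible way of handling nodes with the same valor is a deterministic tie-break, so every node reaches the same answer on its own. The sketch below assumes each node is known by a (uuid, valor) pair and breaks ties by comparing uuids; the elect_lord helper is hypothetical, and this is not a description of how legion actually resolves ties.

```python
def elect_lord(nodes):
    """nodes: iterable of (uuid, valor) tuples. Return the lord's uuid, or None."""
    nodes = list(nodes)
    if not nodes:
        return None
    # Highest valor wins; on equal valor the highest uuid (string order) wins,
    # so the result does not depend on the order nodes were discovered in.
    uuid, _valor = max(nodes, key=lambda n: (n[1], n[0]))
    return uuid


print(elect_lord([("a", 10), ("c", 98), ("b", 98)]))
```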

unbit commented on August 19, 2024

Let me show you an example of the proposed cluster api using redis:

uwsgi --cluster-engine redis --cluster 192.168.0.17:4000 --ini cluster://foobar.ini

that means:

Join (see below) the cluster managed by the redis server at 192.168.0.17, port 4000, and get the instance configuration from the object foobar.ini (which in redis could be a simple key).

uWSGI configures its internals (redis-specific), then gets the instance configuration and (if all goes well) starts accepting requests. An additional thread (in the master) starts waiting for redis events (read: redis-sentinel events) and checks for deadlocks (see below).

The app wants to lock a shared resource and calls uwsgi.cluster_lock("item001"). Internally the request is mapped to a series of redis calls (as described here: http://redis.io/commands/setnx). When finished, it calls uwsgi.cluster_unlock("item001").
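
A rough sketch of what a SETNX-based lock can look like is shown below. It is not the plugin's code: FakeRedis is a hypothetical in-memory stand-in exposing only setnx/get/delete so the example runs without a server, and unlike the draft API (which returns nothing) these helpers return a token so that only the holder can unlock.

```python
import time
import uuid


class FakeRedis:
    """Hypothetical in-memory stand-in; only the calls used below."""

    def __init__(self):
        self._data = {}

    def setnx(self, key, value):
        # Set the key only if it does not exist yet.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        return 1 if self._data.pop(key, None) is not None else 0


def cluster_lock(client, resource, retry_delay=0.05, timeout=None):
    """Spin on SETNX until the lock is taken; return a token, or None on timeout."""
    token = uuid.uuid4().hex
    deadline = None if timeout is None else time.monotonic() + timeout
    while not client.setnx("lock:" + resource, token):
        if deadline is not None and time.monotonic() >= deadline:
            return None
        time.sleep(retry_delay)
    return token


def cluster_unlock(client, resource, token):
    """Release the lock only if we still hold it."""
    key = "lock:" + resource
    # Not atomic: acceptable for a sketch, a real backend would need an
    # atomic check-and-delete.
    if client.get(key) == token:
        client.delete(key)
        return True
    return False


r = FakeRedis()
token = cluster_lock(r, "item001")
second = cluster_lock(r, "item001", retry_delay=0.01, timeout=0.05)  # None
released = cluster_unlock(r, "item001", token)
```

The key has no expiry here, so a holder that dies leaves the lock stuck; that is the kind of situation the deadlock check in the master thread is meant to catch.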

The address of the redis instance is kept in a shared memory area managed by the master thread. If the sentinel reports a changed master, the value is updated accordingly and pending locks (or deadlocks) are handled (whenever a core cluster-locks something, a global table is filled, so the master thread can check it).

This is for the locking part. Regarding routers (fastrouter, httprouter...), when the instance joins the cluster an internal list is filled with the instance address, and the router can use that list (asynchronously managed by the master thread) for load balancing.

Regarding cron locking, I believe the best approach would be improving the legion subsystem, as by the nature of the system it is better to ensure a single node "holds" a resource (the cron task) indefinitely.

prymitive commented on August 19, 2024

> uwsgi --cluster-engine redis --cluster 192.168.0.17:4000 --ini cluster://foobar.ini
>
> that means:
>
> join (see below) the cluster managed by the redis server 192.168.0.17 port 4000 and get instance configuration using the object foobar.ini (that in redis could be a simple key)
>
> uWSGI configure its internals (redis-specific) and then gets the instance configurations and (if all goes well) starts accepting requests. An additional thread (in the master) starts waiting for redis event (read: redis-sentinel events) and checks for deadlocks (see below).

Isn't this the emperor with a redis backend? What advantages does it offer over the emperor? The only difference would be reading the node list from redis rather than from the subscription table (?)

unbit commented on August 19, 2024

Configuration download is only there to be backward-compatible with the old (current) clustering subsystem; I do not think there are cases where the emperor is not the best choice. The same goes for the subscription system (even if storing node data in a different storage could be interesting).

The main purpose (for me) of the new infrastructure will be node synchronization (the linux kernel dlm was the reason for working on the redhat cluster suite before the others) and "revamping" the old cluster-messaging attempt (I think there is still documentation on the old site) that never worked as expected.

So, to sum up:

redhat (config, nodes, locking, messaging)
corosync (config, nodes, messaging)
zookeeper (config, locking)
redis (config, nodes, locking)
uwsgi-multicast/old clustering (config, nodes, messaging)

will be the backends exporting various features via the clustering API.

Currently I needed only the redhat plugin for one customer and the redis setnx support for another (this second one is still in development).

I suppose you are suggesting a more modular approach where we simply have "cluster-locking" plugins and "emperor-storage", and simply drop messaging (of little use, as no one asked for it in the past :P). Node info is irrelevant to the choice, as I think the subscription system is versatile enough to make everyone happy.

prymitive commented on August 19, 2024

With the legion subsystem and this toy plugin one can have such crons, but it needs more work to be really usable. This will probably not happen in 1.5 since there are other things cooking there.
We would still need a persistent store. The simplest solution would be to use the uWSGI cache with the cache-store option, but to make it work with more than one node we would need a way to sync the cache from other nodes on startup. This is already possible with cache-sync, but You need to specify a working node. To make it a really usable feature we would need a way to fetch cache data from the oldest node connected to the cluster, so first we need to get the list of nodes (from the legion cluster or the fastrouter?) and then pick the oldest one (since other nodes might also be syncing at the moment, unless syncing is blocking so that only fully synced nodes are connected to the cluster).
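
The "pick the oldest node" step could look roughly like this. It is a sketch under assumptions: the ClusterNode record with joined_at and synced fields and the pick_sync_source helper are hypothetical, and how the node list is actually obtained (legion, fastrouter) is left open.

```python
from dataclasses import dataclass


@dataclass
class ClusterNode:
    # Hypothetical node record, not an existing uWSGI structure.
    address: str
    joined_at: float  # unix timestamp of when the node joined the cluster
    synced: bool      # False while the node is still syncing its own cache


def pick_sync_source(nodes, self_address):
    """Return the oldest fully synced node other than ourselves, or None."""
    candidates = [n for n in nodes if n.synced and n.address != self_address]
    if not candidates:
        return None
    return min(candidates, key=lambda n: n.joined_at)


nodes = [
    ClusterNode("10.0.0.1:3031", joined_at=1000.0, synced=False),
    ClusterNode("10.0.0.2:3031", joined_at=1500.0, synced=True),
    ClusterNode("10.0.0.3:3031", joined_at=1200.0, synced=True),
]
source = pick_sync_source(nodes, self_address="10.0.0.4:3031")
print(source.address)
```

The oldest node overall (10.0.0.1) is skipped because it is still syncing, which is the case the comment above worries about.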

prymitive commented on August 19, 2024

Update: the --legion-cron option was added by Roberto yesterday. It runs all cron tasks only on the node which is lord for the given legion (you can use multiple legions and spread cron tasks manually).

I need more features for crons, so I'm working on a plugin that will implement them (storing job history somewhere, spreading jobs across all nodes based on a few factors, and so on).

prymitive commented on August 19, 2024

The legion subsystem was added in 1.9 and current master has legion-cron, so we have a working solution for this issue. Closing.
