

Mintel Kubernetes (Mostly) Elasticsearch Image

This repository contains the Mintel Docker image that we use, mostly, within Kubernetes.

Build Details

The image builds on top of the official Elasticsearch image and customizes the following:

  • It only runs as the unprivileged user elasticsearch (uid: 1000 / gid: 1000)
  • The indexes are stored in /data. If attaching a volume onto /data/, make sure the elasticsearch user can read and write to it
  • jq is added to the image
  • The elasticsearch-py Python client library is added to the image
  • The run.sh entrypoint is customized to support a series of options and functionalities (see below)
  • Some management scripts are added to enable rolling restarts on Kubernetes (see below). Those scripts are meant to be used as lifecycle hooks in Kubernetes

run.sh customizations

  • On graceful stop of the Java process, force a sleep of a few seconds so that other nodes trying to connect to the now-stopped pod get a connection refused. This is extremely important when stopping the active master - you can read more here. The time to sleep can be customized by exporting the POST_TERM_WAIT environment variable (defaults to 15s)
  • Install plugins at boot by specifying a comma-separated list in the ES_PLUGINS_INSTALL environment variable
  • Support defining a custom SHARD_ALLOCATION_AWARENESS - TODO: to be improved soon with support for Kubernetes cloud zones
  • Customization of the network DNS caching TTL. By default the Java configuration from upstream caches positive lookups forever and negative lookups for a while ( TODO: how long? )
    • Set the NETWORK_ADDRESS_CACHE_TTL environment variable to define positive caching in seconds (defaults to 3s)
    • Set the NETWORK_ADDRESS_CACHE_NEGATIVE_TTL environment variable to define negative caching in seconds (defaults to 10s)
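
As a rough illustration of the options above, the sketch below shows one way the environment variables could be translated into plugin install commands and JVM DNS-cache options. It is not the actual run.sh logic: the helper names are hypothetical, and both the use of `elasticsearch-plugin install --batch` and the `es.networkaddress.cache.*` system properties are assumptions about how the entrypoint applies these values. The defaults mirror the ones documented above.

```python
import os


def plugin_install_commands(env):
    """Build one install command per plugin named in ES_PLUGINS_INSTALL."""
    raw = env.get("ES_PLUGINS_INSTALL", "")
    # Split on commas, trim whitespace, and skip empty entries.
    plugins = [name.strip() for name in raw.split(",") if name.strip()]
    # Assumption: plugins are installed non-interactively with --batch.
    return [["elasticsearch-plugin", "install", "--batch", name] for name in plugins]


def dns_cache_java_opts(env):
    """Return JVM flags for the positive and negative DNS cache TTLs (seconds)."""
    ttl = int(env.get("NETWORK_ADDRESS_CACHE_TTL", "3"))
    negative_ttl = int(env.get("NETWORK_ADDRESS_CACHE_NEGATIVE_TTL", "10"))
    # Assumption: the TTLs are passed to the JVM as system properties.
    return [
        f"-Des.networkaddress.cache.ttl={ttl}",
        f"-Des.networkaddress.cache.negative.ttl={negative_ttl}",
    ]


if __name__ == "__main__":
    for command in plugin_install_commands(os.environ):
        print(" ".join(command))
    print(" ".join(dns_cache_java_opts(os.environ)))
```

Both helpers take the environment as a plain mapping, so they can be exercised with a dictionary instead of the real process environment.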

lifecycle scripts

A Python script to manage various aspects of the Elasticsearch lifecycle is provided in /manage-es.py.

A simple set of sh scripts to be used in Kubernetes as lifecycle hooks is also provided:

pre-stop hook

/stop-data-node.sh

python /manage-es.py pre-stop-data

post-start hook

/start-data-node.sh

python /manage-es.py persitent-settings
python /manage-es.py post-start-data

See the minikube example for a working definition of those hooks.
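
For orientation only, the fragment below sketches how the two scripts could be wired into a container spec. It builds the `lifecycle` section as a Python dictionary and prints it as JSON, which Kubernetes accepts as well as YAML. The `lifecycle`, `postStart`, `preStop`, and `exec.command` fields are standard Kubernetes fields; invoking the scripts through `/bin/sh` is an assumption, and the minikube example remains the reference for the real definition.

```python
import json


def lifecycle_hooks(shell="/bin/sh"):
    """Return a container `lifecycle` section that runs the data-node scripts."""
    return {
        # Runs right after the container is created.
        "postStart": {"exec": {"command": [shell, "/start-data-node.sh"]}},
        # Runs before the container receives its termination signal.
        "preStop": {"exec": {"command": [shell, "/stop-data-node.sh"]}},
    }


if __name__ == "__main__":
    print(json.dumps({"lifecycle": lifecycle_hooks()}, indent=2))
```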

The following actions are supported by the Python script.

NOTE: In some cases running these scripts as lifecycle hooks can lead to a cluster that cannot start up. For example, all data actions expect to be able to contact the cluster masters before proceeding. If the MASTER and DATA functions are on the same node, this will never work.

  • pre-stop-data - Hook for Pre-Stop of a data node

    if mode.upper() == "ALLOCATION":
        # Sequence:
        # - Disable Shard Allocation
        # - Perform a Synced Flush
    elif mode.upper() == "DRAIN":
        # Sequence:
        # - set recovery settings
        # - drain Node
        # - Wait for 0 shards in relocating or initializing status

  • post-start-data - Hook for Post-Start of a data node

    if mode.upper() == "ALLOCATION":
        # Sequence:
        # - wait for node to join cluster
        # - set recovery settings
        # - enable shards allocation
        # - Wait for 0 Initializing or Relocating Shards ( Unassigned shards should be ok if this is cold startup of an elasticsearch cluster )
        # - remove temporary recovery settings
    elif mode.upper() == "DRAIN":
        # Sequence:
        # - wait for node to join cluster
        # - set recovery settings
        # - undrain Node
        # - Wait for 0 Initializing or Relocating Shards ( Unassigned shards should be ok if this is cold startup of an elasticsearch cluster )
        # - remove temporary recovery settings

  • pre-stop-master - Hook for Pre-Stop of a master node (not implemented yet)
  • post-start-master - Hook for Post-Start of a master node (not implemented yet)
  • persitent-settings - Set some Elasticsearch persistent settings from a provided file
    If the path to a JSON persistent settings file is provided, push the persistent settings to the cluster
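
The pre-stop sequences listed above can be pictured roughly as follows. This is a simplified sketch, not the code in /manage-es.py: the function names are hypothetical, `client` is assumed to be an elasticsearch-py client from a release that still provides `indices.flush_synced()` (6.x or 7.x), and draining by excluding the node name through `cluster.routing.allocation.exclude._name` is an assumption about how the drain is implemented. The recovery-settings step is omitted for brevity.

```python
import time


def disable_allocation_body():
    """Transient settings that stop shards from being allocated."""
    return {"transient": {"cluster.routing.allocation.enable": "none"}}


def drain_body(node_name):
    """Transient settings that move every shard off the named node."""
    # Assumption: the node is excluded by name; the real script may differ.
    return {"transient": {"cluster.routing.allocation.exclude._name": node_name}}


def wait_for_no_moving_shards(client, timeout_s, poll_s=5):
    """Poll cluster health until no shard is relocating or initializing."""
    deadline = time.monotonic() + timeout_s
    while True:
        health = client.cluster.health()
        if health["relocating_shards"] == 0 and health["initializing_shards"] == 0:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


def pre_stop_data(client, mode, node_name, timeout_s=1800):
    """Run the pre-stop sequence for the given maintenance mode."""
    if mode.upper() == "ALLOCATION":
        client.cluster.put_settings(body=disable_allocation_body())
        client.indices.flush_synced()
    elif mode.upper() == "DRAIN":
        client.cluster.put_settings(body=drain_body(node_name))
        return wait_for_no_moving_shards(client, timeout_s)
    # Any other mode: nothing to do before the pod is stopped.
    return True
```

Keeping the request bodies in small pure functions makes the sequence easy to inspect without a running cluster.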

Environment Variables

Startup Environment variables

  • CLUSTER_NAME
  • NODE_NAME
  • NODE_MASTER
  • NODE_DATA
  • NETWORK_HOST
  • HTTP_CORS_ENABLE
  • HTTP_CORS_ALLOW_ORIGIN
  • NUMBER_OF_MASTERS
  • ES_GCLOG_FILE_COUNT - Number of GC log files to keep in the rotation.
  • ES_GCLOG_FILE_PATH - Location of main GC file (e.g. data/gc.log).
  • ES_GCLOG_FILE_SIZE - Max size of each rolled GC log (e.g. 64m).
  • ES_JAVA_OPTS
  • ES_PLUGINS_INSTALL - comma-separated list of Elasticsearch plugins to be installed. Example: ES_PLUGINS_INSTALL="repository-gcs,x-pack"
  • SHARD_ALLOCATION_AWARENESS
  • SHARD_ALLOCATION_AWARENESS_ATTR
  • KUBERNETES_SHARD_ALLOCATION_AWARENESS - Enable shard allocation awareness for a Kubernetes cluster using labels
    • server = worker node name
    • zone = Kubernetes failure domain zone (only for cloud)
  • MEMORY_LOCK - memory locking control; enable to prevent swapping (default = true).
  • REPO_LOCATIONS - list of registered repository locations, for example "/backup" (default = []). The value of REPO_LOCATIONS is automatically wrapped within [], so the brackets should not be included in the variable declaration. To specify multiple repository locations, use a comma-separated string, for example "/backup", "/backup2".
  • PROCESSORS - allows Elasticsearch to optimize for the actual number of available CPUs (must be an integer, default = 1)
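
To make the REPO_LOCATIONS wrapping concrete, the sketch below mimics it in Python: the raw value is placed between `[` and `]` and the result is read as a list. Parsing the wrapped string as JSON and the helper name are illustrative assumptions; how the entrypoint actually consumes the value is not shown here.

```python
import json


def repo_locations(raw):
    """Wrap the raw REPO_LOCATIONS value in brackets and parse it as a list."""
    return json.loads("[" + raw + "]")


if __name__ == "__main__":
    print(repo_locations('"/backup", "/backup2"'))  # ['/backup', '/backup2']
    print(repo_locations(""))                       # []
    # Including the brackets yourself produces a nested list, which is
    # why they should be left out of the variable declaration.
    print(repo_locations('["/backup"]'))            # [['/backup']]
```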

Elasticsearch settings environment variables

Transient settings set during Maintenance

  • WAIT_FOR_NODE_IN_CLUSTER - default: 180 - Seconds to wait for NODE to rejoin cluster
  • WAIT_FOR_NO_SHARDS_RELOCATING - default: 1800 - Seconds to wait for the cluster to have neither Relocating nor Initializing shards
  • NODE_CONCURRENT_INCOMING_RECOVERIES - default: ES version default - sets cluster.routing.allocation.node_concurrent_incoming_recoveries
  • NODE_CONCURRENT_OUTGOING_RECOVERIES - default: ES version default - sets cluster.routing.allocation.node_concurrent_outgoing_recoveries
  • NODE_INITIAL_PRIMARIES_RECOVERIES - default: ES version default - sets cluster.routing.allocation.node_initial_primaries_recoveries
  • CLUSTER_CONCURRENT_REBALANCE - default: ES version default - sets cluster.routing.allocation.cluster_concurrent_rebalance
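
One way to picture how these variables map onto cluster settings is the sketch below. It builds a transient-settings body containing only the variables that are set, so unset ones fall back to the Elasticsearch defaults, and a second body that assigns null to the same keys to remove them after maintenance. The function names and the exact shape of the requests are assumptions, not taken from /manage-es.py.

```python
SETTING_BY_ENV = {
    "NODE_CONCURRENT_INCOMING_RECOVERIES":
        "cluster.routing.allocation.node_concurrent_incoming_recoveries",
    "NODE_CONCURRENT_OUTGOING_RECOVERIES":
        "cluster.routing.allocation.node_concurrent_outgoing_recoveries",
    "NODE_INITIAL_PRIMARIES_RECOVERIES":
        "cluster.routing.allocation.node_initial_primaries_recoveries",
    "CLUSTER_CONCURRENT_REBALANCE":
        "cluster.routing.allocation.cluster_concurrent_rebalance",
}


def recovery_settings_body(env):
    """Build the transient settings to apply while maintenance is running."""
    transient = {}
    for name, setting in SETTING_BY_ENV.items():
        value = env.get(name)
        if value:  # unset or empty: keep the Elasticsearch default
            transient[setting] = int(value)
    return {"transient": transient}


def clear_recovery_settings_body(env):
    """Build the body that removes the temporary settings again."""
    # Assigning null (None in Python) to a transient setting resets it.
    applied = recovery_settings_body(env)["transient"]
    return {"transient": {setting: None for setting in applied}}
```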

Persistent Settings set by every node at startup

  • PERSITENT_SETTINGS_FILE_PATH - path to a JSON file of persistent settings
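
As a hedged sketch of what "persistent settings from a file" could look like, the helper below reads the file named by PERSITENT_SETTINGS_FILE_PATH and wraps its contents in a `persistent` block suitable for the cluster settings API. It assumes the file holds a flat JSON object of settings; the real file layout and the helper name are not confirmed by this README.

```python
import json
import os


def load_persistent_settings(path):
    """Read a JSON settings file and wrap it as a persistent-settings body."""
    if not path or not os.path.isfile(path):
        # No file configured (or file missing): nothing to push.
        return None
    with open(path) as handle:
        # Assumption: the file contains the settings object itself.
        return {"persistent": json.load(handle)}


if __name__ == "__main__":
    body = load_persistent_settings(os.environ.get("PERSITENT_SETTINGS_FILE_PATH"))
    print(json.dumps(body, indent=2))
```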

Other Startup Settings

  • DISCOVERY_SERVICE - Elasticsearch discovery URL
  • MAINTENANCE_MODE
    • None (default) - no management of maintenance mode; the pod will just be stopped by Kubernetes
    • Drain (drain local node) - the node will be drained (moving all shards) before proceeding with the stop. NOTE: this needs to finish before the grace period expires
    • Allocation (disable shard allocation) - this will disable shard allocation as described in https://www.elastic.co/guide/en/elasticsearch/reference/current/rolling-upgrades.html
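
Because the Drain mode has to complete within the pod's termination grace period, a quick arithmetic check can help when choosing `terminationGracePeriodSeconds`. The sketch below assumes the worst case is the WAIT_FOR_NO_SHARDS_RELOCATING timeout plus the POST_TERM_WAIT sleep, using the defaults documented in this README. The function name is hypothetical, and a real shutdown may need additional headroom.

```python
def fits_grace_period(grace_period_s, wait_for_shards_s=1800, post_term_wait_s=15):
    """Return True if the worst-case drain plus the post-stop sleep fits."""
    return wait_for_shards_s + post_term_wait_s <= grace_period_s


if __name__ == "__main__":
    print(fits_grace_period(30))    # Kubernetes default grace period: False
    print(fits_grace_period(1830))  # True
```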

run in minikube example

see example here

Contributors

fciocchetti, primeroz, bcbrockway, nickmintel, nabadger, chrlwrd
