
postgres-ha

With this role, you can transform a standalone PostgreSQL server into an N-node postgres cluster with automated failover. You only need one working PostgreSQL server; the other hosts can be clean CentOS 7 or CentOS 6 minimal installs.

Alternatively, this role can create a database cluster for you from scratch. If no postgres database is detected, it will be created.

What it will do:

  • install the cluster software stack (pcs, corosync, pacemaker)
  • add IPs of cluster hosts to /etc/hosts files
  • create a pcs cluster from all play hosts
  • install database binaries if needed
  • init master database if needed
  • alter postgresql configuration if needed
  • sync slave databases from master host
  • make sure the DB replication is working
  • create cluster resources for database, floating IP and constraints
  • check again that everything is working as expected

Automated failover is set up using the PAF pacemaker module: https://github.com/dalibo/PAF
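
Under the hood, the role ends up defining pacemaker resources roughly like the following sketch. The resource names (postgres, postgres-ha, pg-vip) are the role's defaults; the bindir/pgdata paths and the VIP are example values taken from elsewhere in this README, so treat every value as an assumption. You normally never run these by hand; the role does it for you:

```shell
# Sketch of the resources the role creates (default names; values are examples).
# pgsqlms is PAF's OCF resource agent.
pcs resource create postgres ocf:heartbeat:pgsqlms \
    bindir=/usr/pgsql-9.6/bin pgdata=/var/lib/pgsql/9.6/data
pcs resource master postgres-ha postgres \
    master-max=1 master-node-max=1 clone-max=3 clone-node-max=1 notify=true
pcs resource create pg-vip ocf:heartbeat:IPaddr2 ip=10.10.10.10 cidr_netmask=24
```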

What you should know

  • The role is idempotent. I've added many checks to allow running it multiple times without breaking things. You can run it again safely even if the role fails. The only thing you need to check before each run is the postgres_ha_cluster_master_host variable. But don't worry: if the specified host is not the master database, the role will fail gracefully without disrupting things.

  • During the run, the role will alter your postgresql.conf and pg_hba.conf to enable replication. You can review the changes to postgresql.conf in defaults/main.yml (postgres_ha_postgresql_conf_vars variable). In pg_hba.conf, the host ACL statements will be added for every cluster node. They will be added before all previously existing host ACL statements.

  • The postgres replication is asynchronous by default. If you want synchronous replication, alter the postgres_ha_postgresql_conf_vars variable by adding the synchronous_standby_names parameter. Please see the PostgreSQL manual for more info. Also note that if the last synchronous replica disconnects from the master, the master database will stop serving requests.
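
    For illustration, an override sketch. The list-of-key/value-entries shape matches the role's configuration variable; the synchronous_standby_names value below is an assumption, so check the PostgreSQL docs for the exact syntax supported by your version:

```yaml
# Sketch: extend the role's default postgres_ha_postgresql_conf_vars
# (copy the default entries from defaults/main.yml, then append this one).
postgres_ha_postgresql_conf_vars:
  # ... the role's default entries go here ...
  - key: synchronous_standby_names
    value: "'*'"    # '*' = any connected standby may act as the synchronous one
```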

  • You should have at least a basic understanding of clustering and of how to work with the pcs command. If the role fails for some reason, it is relatively easy to recover, provided you understand what the logs are trying to say and/or how to run the appropriate recovery actions. See the cleanup section for more info.

  • You need to alter firewall settings before running this role. The cluster members need to communicate with each other to form a cluster and to replicate the postgres DB. I recommend running a firewall-configuration role before the postgres-ha role.
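
    For example, a firewalld-based pre-task sketch for CentOS 7 (not part of the role; the built-in "high-availability" firewalld service covers the pcsd/corosync/pacemaker ports, and 5432/tcp is assumed to be the default postgres port):

```yaml
pre_tasks:
  - name: open cluster ports
    firewalld:
      service: high-availability
      permanent: true
      immediate: true
      state: enabled
  - name: open postgres replication/client port
    firewalld:
      port: 5432/tcp
      permanent: true
      immediate: true
      state: enabled
```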

  • If the master datadir is empty on the first run, the role will init an empty datadir, and the slave nodes will then download this empty database. If the datadir is not empty, the initdb will be skipped. This means you can run this role on clean CentOS installs that don't have any postgresql database installed; the result will be a fully working empty database cluster.

  • On the first run, the datadirs on slave nodes will be deleted without prompt. Please make sure you specify the correct postgres_ha_cluster_master_host at least for this first run (slave datadirs will NEVER be deleted after first initial sync is done).

  • If you plan to apply the role to a larger number of servers (7+), please be aware that the servers download RPM packages simultaneously. This can be identified as a DDoS, and some repository providers may refuse your downloads, causing the role to fail. I recommend setting up your own repository mirror in such cases.

  • Please don't change the cluster resource name parameters after the role has been applied. On the next run, this would result in an attempt to create new, colliding resources.

  • Fencing is not configured by this role. If you need fencing, you have to configure it manually after running the role.

Requirements

This role works on CentOS 6 and 7. RHEL was not tested but should work without problems. If you need support for another distribution, I can help; post an issue.

The postgresql binaries on your primary server should be installed from the official repository:

https://yum.postgresql.org/repopackages.php

Note: If you have binaries from another repo, you need to modify the postgres_ha_repo_url variable to change the postgres repository source, and possibly also the bindir and datadir paths in other role variables. If you need to change the installed package name(s), you need to directly modify the install pg* task in the tasks/postgresql_sync.yml file.

Role Variables

For all variables and their descriptions, see defaults/main.yml

Variables that must be changed:

  • postgres_ha_cluster_master_host - the master database host (WARNING: please make sure you fill this correctly, otherwise you may lose data!)
  • postgres_ha_cluster_vip - a floating IP address that travels with master database
  • postgres_ha_pg_repl_pass - password for replicating postgresql data
  • postgres_ha_cluster_ha_password - password for cluster config replication
  • postgres_ha_cluster_ha_password_hash - password hash of postgres_ha_cluster_ha_password

The password hash can be generated, for example, with this command:

python -c 'import crypt; print(crypt.crypt("my_cluster_ha_password", crypt.mksalt(crypt.METHOD_SHA512)))'
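
If you want a quick sanity check that the value you paste into postgres_ha_cluster_ha_password_hash at least has the right shape, the SHA-512 crypt format is $6$[rounds=N$]salt$digest. This is a hypothetical helper, not part of the role:

```python
def looks_like_sha512_crypt(h: str) -> bool:
    """Return True if h has the $6$[rounds=N$]salt$digest shape of a SHA-512 crypt hash."""
    if not h.startswith("$6$"):
        return False
    fields = h.split("$")  # ['', '6', optional 'rounds=N', salt, digest]
    return len(fields) in (4, 5) and all(fields[1:])

print(looks_like_sha512_crypt("$6$mHeZ7/LD1y$0abcDEFsample"))  # True (well-formed shape)
print(looks_like_sha512_crypt("my_cluster_ha_password"))       # False (plaintext, not a hash)
```

Note this only validates the format; it cannot tell you whether the digest actually matches postgres_ha_cluster_ha_password.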

Dependencies

No other roles are required as dependencies. However, you can combine this role with another role that installs a postgresql database.

Example Playbook

The usage is relatively simple: install minimal CentOS systems, set the variables and run the role.

Two settings are required:

  • gather_facts=True - we need to know the IP addresses of cluster nodes
  • any_errors_fatal=True - ensures that an error on any node stops the whole ansible run, because it doesn't make sense to continue when you lose some of your cluster nodes along the way.

    - name: install PG HA
      hosts: db?
      gather_facts: True
      any_errors_fatal: True
      vars:
        postgres_ha_cluster_master_host: db1
        postgres_ha_cluster_vip: 10.10.10.10
        postgres_ha_pg_repl_pass: MySuperSecretDBPass
        postgres_ha_cluster_ha_password: AnotherSuperSecretPass1234
        postgres_ha_cluster_ha_password_hash: '$6$mHeZ7/LD1y.........7VJYu.'
      pre_tasks:
        - name: disable firewall
          service: name=firewalld state=stopped enabled=no
      roles:
         - postgres-ha
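
An inventory sketch to match the play above (hostnames are assumptions; the hosts: db? pattern matches any inventory host named "db" plus one more character, e.g. db1 and db2):

```ini
# inventory.ini -- illustrative only; use your real hostnames
db1
db2
```

You would then run something like ansible-playbook -i inventory.ini pg-ha.yml (the playbook filename is an assumption).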

Cleanup after failure

If the role fails repeatedly and you want to run it fresh, as if for the first time, you need to clean up a few things. Please note that the default resource names are used here. If you changed them using variables, you need to adjust these commands accordingly.

  • RUN ON ANY NODE:
pcs resource delete pg-vip
pcs resource delete postgres
#pcs resource delete postgres-ha   # probably not needed
#pcs resource cleanup postgres     # probably not needed

# Make sure no (related) cluster resources are defined.
  • RUN ON ALL SLAVE NODES:
systemctl stop postgresql-9.6
# Make sure no postgres db is running.
systemctl status postgresql-9.6
ps aux | grep postgres
rm -rf /var/lib/pgsql/9.6/data
rm -f /var/lib/pgsql/9.6/recovery.conf.pgcluster.pcmk
rm -f /var/lib/pgsql/9.6/.*_constraints_processed   # name generated from postgres_ha_cluster_pg_res_name
  • RUN ONLY ON MASTER NODE:
systemctl stop postgresql-9.6
rm -f /var/lib/pgsql/9.6/recovery.conf.pgcluster.pcmk
rm -f /var/lib/pgsql/9.6/.*_constraints_processed
rm -f /var/lib/pgsql/9.6/data/recovery.conf
rm -f /var/lib/pgsql/9.6/data/.synchronized
# Make sure no postgres db is running.
ps aux | grep postgres
systemctl start postgresql-9.6
systemctl status postgresql-9.6
# Check postgres db functionality.
  • START AGAIN
# Check variables & defaults and run ansible role again.
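
After a successful re-run, a few health checks you can do by hand (a sketch: resource names are the role defaults, and the VIP is the one from the example playbook above):

```shell
pcs status          # all nodes online; Master/Slave set postgres-ha started
pcs resource show   # pg-vip started on the master node
psql -h 10.10.10.10 -U postgres -c 'SELECT pg_is_in_recovery();'  # 'f' on the master
```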

License

BSD

Author Information

Created by YanChi.

Originally part of the Danube Cloud project (https://github.com/erigones/esdc-ce).

Contributors

anayrat, benmccown, dan-hill2802, yanchii


ansible-role-postgres-ha's Issues

Gui and UDPU support

Jan,

what do you think about adding support for enabling the GUI and an option for unicast (UDPU) vs multicast (UDP)?

frank

Error in "init DB dir on master if necessary" step for Postgresql 10

I'm trying to use this role to deploy Postgres 10 after using it for quite a few 9.6 clusters over the last couple months.

The postgresql_sync.yml play fails on the "init DB dir on master if necessary" step due to a change in the filename for postgres version 10 (and, I'm guessing, above).

In the bin dir for postgres 9.6:

[root@sol-usgp2-postgres-service-01-d bin]# pwd
/usr/pgsql-9.6/bin
[root@sol-usgp2-postgres-service-01-d bin]# ls postgres*
postgres  postgresql96-check-db-dir  postgresql96-setup

Notice the "postgresql96-setup" script

Now in postgres 10's bin dir:

[root@sol-usgp2-postgres-test-01-d bin]# pwd
/usr/pgsql-10/bin
[root@sol-usgp2-postgres-test-01-d bin]# ls postgres*
postgres  postgresql-10-check-db-dir  postgresql-10-setup

Notice the added dash after "postgresql" in "postgresql-10-setup"

This seems to cause some obvious problems with the playbook.

Planning to use Ubuntu

I want to adapt this role to work on an Ubuntu distro. I know little about the system-level differences so I'm not sure how hard this will be. Can you provide any guidance?

not able to run on centos 6

I know you said it should run on CentOS 7, and I am trying to get it to run on CentOS 6.
None of the tasks that create or modify files are working, and it seems that any task that includes the postgres_ha_cluster_master_host variable does not work either.

any help would be greatly appreciated

frank

PostgreSQL 11, PAF 2.2.1 + geo-patch = playbook succeeds but master flip flops

on fresh run on CentOS 7.

I managed to catch the log from the start (as soon as the DB data is initialized on the master):

[root@prometheus-dev01 ~]# tail -f /var/lib/pgsql/11/data/log/postgresql-Wed.log 
2019-02-13 00:59:19.503 UTC [28138] LOG:  database system was shut down at 2019-02-13 00:59:15 UTC
2019-02-13 00:59:19.508 UTC [28136] LOG:  database system is ready to accept connections
2019-02-13 00:59:24.375 UTC [28136] LOG:  received SIGHUP, reloading configuration files
2019-02-13 00:59:32.239 UTC [28136] LOG:  received fast shutdown request
2019-02-13 00:59:32.240 UTC [28136] LOG:  aborting any active transactions
2019-02-13 00:59:32.241 UTC [28136] LOG:  background worker "logical replication launcher" (PID 28144) exited with exit code 1
2019-02-13 00:59:32.242 UTC [28139] LOG:  shutting down
2019-02-13 00:59:32.423 UTC [28136] LOG:  database system is shut down

2019-02-13 00:59:40.780 UTC [29677] LOG:  database system was shut down at 2019-02-13 00:59:32 UTC
2019-02-13 00:59:40.781 UTC [29677] LOG:  entering standby mode
2019-02-13 00:59:40.783 UTC [29677] LOG:  consistent recovery state reached at 0/3000098
2019-02-13 00:59:40.783 UTC [29677] LOG:  invalid record length at 0/3000098: wanted 24, got 0
2019-02-13 00:59:40.784 UTC [29673] LOG:  database system is ready to accept read only connections
2019-02-13 00:59:43.847 UTC [29682] FATAL:  could not connect to the primary server: could not connect to server: No route to host
                Is the server running on host "192.168.30.183" and accepting
                TCP/IP connections on port 5432?
2019-02-13 00:59:44.670 UTC [29677] LOG:  received promote request
2019-02-13 00:59:44.670 UTC [29779] FATAL:  terminating walreceiver process due to administrator command
2019-02-13 00:59:44.670 UTC [29677] LOG:  redo is not required
2019-02-13 00:59:44.763 UTC [29677] LOG:  selected new timeline ID: 2
2019-02-13 00:59:45.045 UTC [29677] LOG:  archive recovery complete
2019-02-13 00:59:45.256 UTC [29673] LOG:  database system is ready to accept connections

2019-02-13 01:00:01.894 UTC [29673] LOG:  received fast shutdown request
2019-02-13 01:00:01.894 UTC [29673] LOG:  aborting any active transactions
2019-02-13 01:00:01.896 UTC [29673] LOG:  background worker "logical replication launcher" (PID 29846) exited with exit code 1
2019-02-13 01:00:01.896 UTC [29679] LOG:  shutting down
2019-02-13 01:00:01.916 UTC [29673] LOG:  database system is shut down
2019-02-13 01:00:02.042 UTC [30825] LOG:  database system was shut down at 2019-02-13 01:00:01 UTC
2019-02-13 01:00:02.043 UTC [30825] LOG:  entering standby mode
2019-02-13 01:00:02.045 UTC [30825] LOG:  consistent recovery state reached at 0/30002C0
2019-02-13 01:00:02.045 UTC [30825] LOG:  invalid record length at 0/30002C0: wanted 24, got 0
2019-02-13 01:00:02.046 UTC [30823] LOG:  database system is ready to accept read only connections
2019-02-13 01:00:02.049 UTC [30830] FATAL:  pg_hba.conf rejects replication connection for host "192.168.30.181", user "replicator", SSL off
2019-02-13 01:00:02.049 UTC [30829] FATAL:  could not connect to the primary server: FATAL:  pg_hba.conf rejects replication connection for host "192.168.30.181", user "replicator", SSL off
2019-02-13 01:00:02.053 UTC [30832] FATAL:  pg_hba.conf rejects replication connection for host "192.168.30.181", user "replicator", SSL off
2019-02-13 01:00:02.053 UTC [30831] FATAL:  could not connect to the primary server: FATAL:  pg_hba.conf rejects replication connection for host "192.168.30.181", user "replicator", SSL off
2019-02-13 01:00:02.438 UTC [30823] LOG:  received fast shutdown request
2019-02-13 01:00:02.451 UTC [30823] LOG:  aborting any active transactions
2019-02-13 01:00:06.922 UTC [30884] FATAL:  the database system is shutting down
2019-02-13 01:00:07.457 UTC [30826] LOG:  shutting down
2019-02-13 01:00:07.462 UTC [30823] LOG:  database system is shut down
2019-02-13 01:00:07.848 UTC [30911] LOG:  database system was shut down in recovery at 2019-02-13 01:00:07 UTC
2019-02-13 01:00:07.848 UTC [30911] LOG:  entering standby mode

Error: You must set meta parameter notify=true for your master resource

I am testing further with Postgres 10 (using postgres_ha_paf_geo_patch: true). The playbook runs alright until it gets to the "check if all slaves are connected" step, which fails all of its retries.

Checking the servers, it appears the pcs resources failed to start. Running "pcs resource debug-start postgres" I get the following output:

[root@postgres-test-01 ~]# pcs resource debug-start postgres                                       
Operation start for postgres:0 (ocf:heartbeat:pgsqlms) returned: 'not installed' (5)                             
 >  stderr: ocf-exit-reason:You must set meta parameter notify=true for your master resource      

[root@postgres-test-01 ~]# pcs resource show                                                                                          
 pg-vip (ocf::heartbeat:IPaddr2):       Stopped                                                                                                                       
 Master/Slave Set: postgres-ha [postgres]                                                                                                         
     Stopped: [ postgres-test-01.example.net postgres-test-02.example.net postgres-test-03.example.net ]

So it appears that the "create master DB resource" step is having some kind of problem getting the notify=true parameter to stick.

I ran the following manually, which seemed to fix it:
pcs resource update postgres-ha notify=true

I am wondering if there is some kind of problem with the pcs_resource module? Here's the output of that part of the playbook (below). No errors resulted, just a "changed" status. But something is being lost in translation.

{
    "_ansible_parsed": true,
    "invocation": {
        "module_args": {
            "operations": null,
            "group": null,
            "name": "postgres-ha",
            "resource_id": "postgres-ha",
            "disabled": true,
            "ms_name": "postgres",
            "command": "master",
            "type": null,
            "options": "master-node-max=\"1\" master-max=\"1\" clone-max=\"3\" notify=\"True\" clone-node-max=\"1\""
        }
    },
    "changed": true,
    "_ansible_no_log": false,
    "msg": "Running cmd: pcs resource master postgres-ha postgres master-node-max=\"1\" master-max=\"1\" clone-max=\"3\" notify=\"True\" clone-node-max=\"1\" --disabled"
}

Any ideas I could try?

Thanks for your help.

Regards,
Ben

old postgres_ha_cluster_master_host stuck when trying new cluster

When I run the role against a new cluster group, it fails:
[root@AnsibleServer ~]# ansible-playbook --ask-pass sds-postgres-ha.yml --flush-cache
SSH password:

PLAY [install PG/Redis HA] *********************************************************************************************************

TASK [Gathering Facts] *************************************************************************************************************
ok: [sds02.prodea-int.net]
ok: [sds01.prodea-int.net]

TASK [disable firewall] ************************************************************************************************************
ok: [sds01.prodea-int.net]
ok: [sds02.prodea-int.net]

TASK [postgres-ha : debug] *********************************************************************************************************
ok: [sds01.prodea-int.net] => {
"msg": "MASTER NODE SET TO dbs03.prodea-int.net"
}

TASK [postgres-ha : verify postgres_ha_cluster_master_host] ************************************************************************
fatal: [sds01.prodea-int.net]: FAILED! => {"changed": false, "failed": true, "msg": "CRITICAL: defined master host (dbs03.prodea-int.net) is not in host list ([u'sds01.prodea-int.net', u'sds02.prodea-int.net'])"}
fatal: [sds02.prodea-int.net]: FAILED! => {"changed": false, "failed": true, "msg": "CRITICAL: defined master host (dbs03.prodea-int.net) is not in host list ([u'sds01.prodea-int.net', u'sds02.prodea-int.net'])"}
to retry, use: --limit @/root/sds-postgres-ha.retry

PLAY RECAP *************************************************************************************************************************
sds01.prodea-int.net : ok=3 changed=0 unreachable=0 failed=1
sds02.prodea-int.net : ok=2 changed=0 unreachable=0 failed=1

[root@AnsibleServer ~]#

bad pre-task

CentOS 7 does not install iptables; it uses firewalld.
The following does not work:
pre_tasks:
- name: disable firewall
service: name=iptables state=stopped enabled=no

Stonith settings cannot be altered

Hi all,

when running the minimal example on fresh CentOS 7 instances provided I get the following error:

TASK [ansible-role-postgres-ha : alter stonith settings] *************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************** fatal: [*******]: FAILED! => {"msg": "Unable to import pcs_property due to Missing parentheses in call to 'print'. Did you mean print(\"absent?=?\")?"}

Postgres 10 synchronize slave databases fails due to pg_basebackup change

std_err output:

    "/usr/pgsql-10/bin/pg_basebackup: unrecognized option '--xlog-method=stream'",
    "Try \"pg_basebackup --help\" for more information."

For some reason, the --xlog-method command switch was removed from pg_basebackup in Postgres 10.

Note the docs for 9.6:
https://www.postgresql.org/docs/9.6/app-pgbasebackup.html

-X method
--xlog-method=method

Both options are listed

For Postgres 10
-X method
--wal-method=method

The below line leads me to believe xlog was renamed to wal for this utility:
"The write-ahead log files are written to a separate file named pg_wal.tar (if the server is a version earlier than 10, the file will be named pg_xlog.tar)."

Changing to -X will be compatible with both old and new versions.

So instead of:
--xlog-method=stream

use:
-X stream

Installation on RHEL 7.4 fails

Executing the role on a RHEL 7.4 system leads to dependency conflicts – something like:

Requires dependency: centos-release

This is related to the first task of postgres-ha/tasks/postgresql_sync.yml. I had a look at the repository and found a related RHEL package that can be installed more smoothly. Replacing pgdg-centos with pgdg-redhat resolved the installation incompatibilities:

- name: 'import pg{{ postgres_ha_pg_version | replace(".", "") }} repo'
  yum:
    name: 'https://download.postgresql.org/pub/repos/yum/{{ postgres_ha_pg_version }}/redhat/rhel-7-x86_64/pgdg-redhat{{ postgres_ha_pg_version | replace(".", "") }}-{{ postgres_ha_pg_version }}-3.noarch.rpm'
    state: installed

Is there a simple way to make this kind of differentiation in this task?
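
One possible answer, sketched and untested: branch the package flavour on the ansible_distribution fact, which Ansible reports as "RedHat" on RHEL and "CentOS" on CentOS. Everything else is taken from the task above:

```yaml
- name: 'import pg{{ postgres_ha_pg_version | replace(".", "") }} repo'
  yum:
    name: "https://download.postgresql.org/pub/repos/yum/{{ postgres_ha_pg_version }}/redhat/rhel-7-x86_64/pgdg-{{ 'redhat' if ansible_distribution == 'RedHat' else 'centos' }}{{ postgres_ha_pg_version | replace('.', '') }}-{{ postgres_ha_pg_version }}-3.noarch.rpm"
    state: installed
```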

first run, fresh vm's

[root@AnsibleServer ~]# ansible-playbook --ask-pass dbs3-postgres-ha.yml
SSH password:

PLAY [install PG HA] ***************************************************************************************************************

TASK [Gathering Facts] *************************************************************************************************************
ok: [dbs04.prodea-int.net]
ok: [dbs03.prodea-int.net]

TASK [disable firewall] ************************************************************************************************************
changed: [dbs03.prodea-int.net]
changed: [dbs04.prodea-int.net]

TASK [postgres-ha6 : debug] ********************************************************************************************************
ok: [dbs03.prodea-int.net] => {
"msg": "MASTER NODE SET TO dbs03.prodea-int.net"
}

TASK [postgres-ha6 : verify postgres_ha_cluster_master_host] ***********************************************************************
skipping: [dbs03.prodea-int.net]
skipping: [dbs04.prodea-int.net]

TASK [postgres-ha6 : identify the OS] **********************************************************************************************
ok: [dbs03.prodea-int.net]
ok: [dbs04.prodea-int.net]

TASK [postgres-ha6 : debug] ********************************************************************************************************
ok: [dbs03.prodea-int.net] => {
"msg": "cluster_members=[u'dbs03.prodea-int.net', u'dbs04.prodea-int.net']"
}

TASK [postgres-ha6 : install cluster pkgs] *****************************************************************************************
changed: [dbs04.prodea-int.net]
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : install additional cluster pkgs for centos6] ******************************************************************
changed: [dbs03.prodea-int.net] => (item=[u'pacemaker', u'libselinux-python'])
changed: [dbs04.prodea-int.net] => (item=[u'pacemaker', u'libselinux-python'])

TASK [postgres-ha6 : Build hosts file] *********************************************************************************************
changed: [dbs03.prodea-int.net] => (item=dbs03.prodea-int.net)
changed: [dbs04.prodea-int.net] => (item=dbs03.prodea-int.net)
changed: [dbs03.prodea-int.net] => (item=dbs04.prodea-int.net)
changed: [dbs04.prodea-int.net] => (item=dbs04.prodea-int.net)

TASK [postgres-ha6 : service pcsd start] *******************************************************************************************
changed: [dbs03.prodea-int.net]
changed: [dbs04.prodea-int.net]

TASK [postgres-ha6 : setup hacluster password] *************************************************************************************
changed: [dbs04.prodea-int.net]
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : setup cluster auth] *******************************************************************************************
skipping: [dbs03.prodea-int.net]
skipping: [dbs04.prodea-int.net]

TASK [postgres-ha6 : create cluster] ***********************************************************************************************
skipping: [dbs03.prodea-int.net]

TASK [postgres-ha6 : create cluster] ***********************************************************************************************
changed: [dbs04.prodea-int.net]
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : join cluster nodes] *******************************************************************************************
skipping: [dbs03.prodea-int.net] => (item=dbs04.prodea-int.net)

TASK [postgres-ha6 : start cluster] ************************************************************************************************
changed: [dbs04.prodea-int.net]
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : alter stonith settings] ***************************************************************************************
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : alter cluster policy settings] ********************************************************************************
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : alter cluster transition settings] ****************************************************************************
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : verify cluster configuration] *********************************************************************************
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : enable cluster autostart] *************************************************************************************
changed: [dbs03.prodea-int.net]
changed: [dbs04.prodea-int.net]

TASK [postgres-ha6 : create virtual IP resource] ***********************************************************************************
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
skipping: [dbs04.prodea-int.net]
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : import pg96 repo] *********************************************************************************************
changed: [dbs03.prodea-int.net]
changed: [dbs04.prodea-int.net]

TASK [postgres-ha6 : install pg96] *************************************************************************************************
changed: [dbs04.prodea-int.net]
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : init DB dir on master if necessary (centos 7)] ****************************************************************
skipping: [dbs03.prodea-int.net]
skipping: [dbs04.prodea-int.net]

TASK [postgres-ha6 : init DB dir on master if necessary (centos 6)] ****************************************************************
skipping: [dbs04.prodea-int.net]
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : check if DB was synchronized before] **************************************************************************
ok: [dbs03.prodea-int.net]
ok: [dbs04.prodea-int.net]

TASK [postgres-ha6 : alter clustering-related settings in postgresql.conf] *********************************************************
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
skipping: [dbs04.prodea-int.net] => (item={'key': u'hot_standby', 'value': u'on'})
skipping: [dbs04.prodea-int.net] => (item={'key': u'listen_addresses', 'value': u"''"})
skipping: [dbs04.prodea-int.net] => (item={'key': u'wal_level', 'value': u'hot_standby'})
skipping: [dbs04.prodea-int.net] => (item={'key': u'wal_log_hints', 'value': u'on'})
skipping: [dbs04.prodea-int.net] => (item={'key': u'max_wal_senders', 'value': u'4'})
skipping: [dbs04.prodea-int.net] => (item={'key': u'max_replication_slots', 'value': u'4'})
changed: [dbs03.prodea-int.net] => (item={'key': u'hot_standby', 'value': u'on'})
changed: [dbs03.prodea-int.net] => (item={'key': u'listen_addresses', 'value': u"''"})
changed: [dbs03.prodea-int.net] => (item={'key': u'wal_level', 'value': u'hot_standby'})
changed: [dbs03.prodea-int.net] => (item={'key': u'wal_log_hints', 'value': u'on'})
changed: [dbs03.prodea-int.net] => (item={'key': u'max_wal_senders', 'value': u'4'})
changed: [dbs03.prodea-int.net] => (item={'key': u'max_replication_slots', 'value': u'4'})

RUNNING HANDLER [postgres-ha6 : restart postgresql] ********************************************************************************
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : alter DB ACL in pg_hba.conf] **********************************************************************************
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}" or db_prevsync_file.stat.exists
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}" or db_prevsync_file.stat.exists
skipping: [dbs04.prodea-int.net] => (item=dbs03.prodea-int.net)
skipping: [dbs04.prodea-int.net] => (item=dbs04.prodea-int.net)
changed: [dbs03.prodea-int.net] => (item=dbs03.prodea-int.net)
changed: [dbs03.prodea-int.net] => (item=dbs04.prodea-int.net)

TASK [postgres-ha6 : alter DB replication ACL in pg_hba.conf] **********************************************************************
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}" or db_prevsync_file.stat.exists
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}" or db_prevsync_file.stat.exists
skipping: [dbs04.prodea-int.net] => (item=dbs03.prodea-int.net)
skipping: [dbs04.prodea-int.net] => (item=dbs04.prodea-int.net)
changed: [dbs03.prodea-int.net] => (item=dbs03.prodea-int.net)
changed: [dbs03.prodea-int.net] => (item=dbs04.prodea-int.net)

RUNNING HANDLER [postgres-ha6 : reload postgresql] *********************************************************************************
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : setup DB cluster auth (master IP)] ****************************************************************************
changed: [dbs03.prodea-int.net]
changed: [dbs04.prodea-int.net]

TASK [postgres-ha6 : setup .pgpass replication auth for master IP] *****************************************************************
changed: [dbs03.prodea-int.net]
changed: [dbs04.prodea-int.net]

TASK [postgres-ha6 : setup .pgpass replication auth for other IPs] *****************************************************************
changed: [dbs04.prodea-int.net] => (item=dbs03.prodea-int.net)
changed: [dbs03.prodea-int.net] => (item=dbs03.prodea-int.net)
changed: [dbs04.prodea-int.net] => (item=dbs04.prodea-int.net)
changed: [dbs03.prodea-int.net] => (item=dbs04.prodea-int.net)

TASK [postgres-ha6 : check if master host "dbs03.prodea-int.net" is really a DB master] ********************************************
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
skipping: [dbs04.prodea-int.net]
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : mark master DB] ***********************************************************************************************
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
skipping: [dbs04.prodea-int.net]
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : check if DB is running (failure is ok)] ***********************************************************************
changed: [dbs03.prodea-int.net]
fatal: [dbs04.prodea-int.net]: FAILED! => {"changed": true, "cmd": "/usr/pgsql-9.6/bin/pg_ctl -D /var/lib/pgsql/9.6/data status", "delta": "0:00:00.042861", "end": "2017-08-28 23:38:32.862478", "failed": true, "rc": 4, "start": "2017-08-28 23:38:32.819617", "stderr": "pg_ctl: directory \"/var/lib/pgsql/9.6/data\" is not a database cluster directory", "stderr_lines": ["pg_ctl: directory \"/var/lib/pgsql/9.6/data\" is not a database cluster directory"], "stdout": "", "stdout_lines": []}
...ignoring

TASK [postgres-ha6 : check if DB is running in cluster (failure is OK)] ************************************************************
fatal: [dbs03.prodea-int.net]: FAILED! => {"changed": true, "cmd": "pcs constraint location show resources \"postgres-ha\" | grep -q Enabled", "delta": "0:00:00.344149", "end": "2017-08-28 23:38:34.271540", "failed": true, "rc": 1, "start": "2017-08-28 23:38:33.927391", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
...ignoring
fatal: [dbs04.prodea-int.net]: FAILED! => {"changed": true, "cmd": "pcs constraint location show resources \"postgres-ha\" | grep -q Enabled", "delta": "0:00:00.330268", "end": "2017-08-28 23:38:34.298727", "failed": true, "rc": 1, "start": "2017-08-28 23:38:33.968459", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
...ignoring

TASK [postgres-ha6 : start master DB if necessary (without cluster)] ***************************************************************
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: (inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}") and (db_resource_exists|failed) and (db_running|failed)
skipping: [dbs03.prodea-int.net]
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: (inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}") and (db_resource_exists|failed) and (db_running|failed)
skipping: [dbs04.prodea-int.net]

TASK [postgres-ha6 : start master DB if necessary (in cluster)] ********************************************************************
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: (inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}") and (db_resource_exists|succeeded) and (db_running|failed)
skipping: [dbs03.prodea-int.net]
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: (inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}") and (db_resource_exists|succeeded) and (db_running|failed)
skipping: [dbs04.prodea-int.net]

TASK [postgres-ha6 : setup DB replication auth] ************************************************************************************
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
skipping: [dbs04.prodea-int.net]
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : check if DB sync is required] *********************************************************************************
ok: [dbs04.prodea-int.net]
ok: [dbs03.prodea-int.net]

TASK [postgres-ha6 : stop slave DB] ************************************************************************************************
skipping: [dbs03.prodea-int.net]
skipping: [dbs04.prodea-int.net]

TASK [postgres-ha6 : remove slave DB datadir before sync] **************************************************************************
skipping: [dbs03.prodea-int.net]
changed: [dbs04.prodea-int.net]

TASK [postgres-ha6 : synchronize slave databases] **********************************************************************************
skipping: [dbs03.prodea-int.net]
changed: [dbs04.prodea-int.net]

TASK [postgres-ha6 : start slave DBs] **********************************************************************************************
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: (inventory_hostname !=
"{{ postgres_ha_cluster_master_host }}") and (db_resource_exists|failed)
skipping: [dbs03.prodea-int.net]
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: (inventory_hostname !=
"{{ postgres_ha_cluster_master_host }}") and (db_resource_exists|failed)
changed: [dbs04.prodea-int.net]

TASK [postgres-ha6 : check if slaves are connected] ********************************************************************************
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
skipping: [dbs04.prodea-int.net]
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : select proper PAF package (centos6)] **************************************************************************
skipping: [dbs03.prodea-int.net]
skipping: [dbs04.prodea-int.net]

TASK [postgres-ha6 : select proper PAF package (centos6)] **************************************************************************
ok: [dbs03.prodea-int.net]
ok: [dbs04.prodea-int.net]

TASK [postgres-ha6 : copy PAF rpm to hosts] ****************************************************************************************
changed: [dbs03.prodea-int.net]
changed: [dbs04.prodea-int.net]

TASK [postgres-ha6 : install PAF DB failover agent] ********************************************************************************
changed: [dbs04.prodea-int.net]
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : prepare DB recovery config] ***********************************************************************************
changed: [dbs03.prodea-int.net]
changed: [dbs04.prodea-int.net]

TASK [postgres-ha6 : stop database for clustering] *********************************************************************************
changed: [dbs04.prodea-int.net]
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : create database cluster resource] *****************************************************************************
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
skipping: [dbs04.prodea-int.net]
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : create master DB resource] ************************************************************************************
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
skipping: [dbs04.prodea-int.net]
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : test constraints presence] ************************************************************************************
ok: [dbs04.prodea-int.net]
ok: [dbs03.prodea-int.net]

TASK [postgres-ha6 : setting VIP location constraints] *****************************************************************************
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
skipping: [dbs04.prodea-int.net]
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : setting DB location constraints] ******************************************************************************
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
skipping: [dbs04.prodea-int.net]
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : setting resources colocation group 1] *************************************************************************
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
skipping: [dbs04.prodea-int.net]
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : setting resources start order] ********************************************************************************
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
skipping: [dbs04.prodea-int.net]
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : setting resources stop order] *********************************************************************************
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
skipping: [dbs04.prodea-int.net]
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : marking constraints as processed] *****************************************************************************
changed: [dbs03.prodea-int.net]
changed: [dbs04.prodea-int.net]

TASK [postgres-ha6 : enable database cluster resource] *****************************************************************************
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
skipping: [dbs04.prodea-int.net]
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : refresh database cluster resource] ****************************************************************************
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
skipping: [dbs04.prodea-int.net]
changed: [dbs03.prodea-int.net]

TASK [postgres-ha6 : check if all slaves are connected] ****************************************************************************
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname ==
"{{ postgres_ha_cluster_master_host }}"
skipping: [dbs04.prodea-int.net]
FAILED - RETRYING: check if all slaves are connected (16 retries left).
FAILED - RETRYING: check if all slaves are connected (15 retries left).
FAILED - RETRYING: check if all slaves are connected (14 retries left).
changed: [dbs03.prodea-int.net]

PLAY RECAP *************************************************************************************************************************
dbs03.prodea-int.net : ok=54 changed=46 unreachable=0 failed=0
dbs04.prodea-int.net : ok=30 changed=24 unreachable=0 failed=0

[root@AnsibleServer ~]#
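The repeated `[WARNING]` lines in the run above are harmless but easy to silence: `when:` is already evaluated as a Jinja2 expression, so wrapping the variable in `{{ }}` is redundant. A minimal sketch of the corrected form (the task body here is a placeholder, not one of the role's real tasks):

```yaml
# Before (triggers the warning):
#   when: inventory_hostname == "{{ postgres_ha_cluster_master_host }}"
# After (bare variable reference inside the expression):
- name: run only on the designated master host
  command: /bin/true  # placeholder task body
  when: inventory_hostname == postgres_ha_cluster_master_host
```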

[root@localhost ~]# pcs status
Cluster name: pgcluster
Stack: cman
Current DC: dbs04.prodea-int.net (version 1.1.15-5.el6-e174ec8) - partition with quorum
Last updated: Mon Aug 28 23:50:05 2017 Last change: Mon Aug 28 23:39:18 2017 by root via crm_attribute on dbs03.prodea-int.net

2 nodes and 3 resources configured

Online: [ dbs03.prodea-int.net dbs04.prodea-int.net ]

Full list of resources:

pg-vip (ocf::heartbeat:IPaddr2): Started dbs03.prodea-int.net
Master/Slave Set: postgres-ha [postgres]
Slaves: [ dbs03.prodea-int.net dbs04.prodea-int.net ]

Daemon Status:
cman: active/disabled
corosync: active/disabled
pacemaker: active/enabled
pcsd: active/enabled
[root@localhost ~]#

separate role out

A while ago we chatted about splitting the role up so it could be more modular for other applications one might want to cluster. A few examples:
apache
mysql
redis
haproxy

My thought was 3 roles:

  1. deploy cluster
  2. deploy application
  3. deploy and define cluster resources and constraints
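As a rough sketch, the split proposed above could look like the play below (all three role names are hypothetical; these roles don't exist yet):

```yaml
- hosts: dbcluster
  roles:
    - role: ha-cluster      # 1. deploy cluster (pcs, corosync, pacemaker)
    - role: postgres        # 2. deploy application (or apache, mysql, redis, haproxy)
    - role: ha-resources    # 3. deploy and define cluster resources and constraints
```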

How to replace a node using role's playbook?

Hi,
Thanks for this role. I now have a 3-node cluster; one node's OS has been reinstalled and its hard drive replaced with a new one. How can I recover this node with the playbooks? Thanks.

Add rule in pg_hba.conf to forbid self-replication

Hello,

Thanks for this role!

As explained in PAF's documentation, you should add a rule to pg_hba.conf to forbid self-replication:

Moreover, if you rely on Pacemaker to move an IP resource on the node hosting the master role of PostgreSQL, make sure to add rules on the pg_hba.conf file of each instance to forbid self-replication.
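A minimal sketch of such a rule as an Ansible task (the `replication` user name is an assumption, not the role's default; the data directory path matches the one shown in the log above):

```yaml
# Hypothetical task: insert a reject rule for the node's own address
# before the permissive replication entries in pg_hba.conf.
- name: forbid self-replication
  lineinfile:
    path: /var/lib/pgsql/9.6/data/pg_hba.conf
    insertbefore: BOF
    line: "host replication replication {{ inventory_hostname }} reject"
```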

Regards,
