
SAP HANA System Replication Resource Agent for Pacemaker Cluster

License: GNU General Public License v2.0

Makefile 0.45% Python 46.04% Shell 22.82% Perl 13.49% CSS 0.26% Roff 16.93%

saphanasr's Introduction

SAPHanaSR-angi - SAP HANA System Replication
A New Generation Interface

The SUSE resource agents to control the SAP HANA database in system replication setups


Introduction

SAPHanaSR-angi provides automatic failover between SAP HANA nodes that have HANA System Replication configured. Currently, scale-up setups are targeted.

CIB attributes are not backward compatible between SAPHanaSR-angi and SAPHanaSR, so there is currently no easy migration path.

This technology is included in the SUSE Linux Enterprise Server for SAP Applications 15 (as technology preview), via the RPM package with the same name.

System replication replicates the database data from one node to another node in order to compensate for database failures. In this mode of operation, the internal SAP HANA high-availability (HA) mechanisms and the Linux cluster have to work together.

The SAPHanaController resource agent performs the actual check of the SAP HANA database instances and is configured as a promotable multi-state resource. Managing the two SAP HANA instances means that the resource agent controls the start/stop of the instances. In addition, the resource agent is able to monitor the SAP HANA databases at the landscape host configuration level.

For this monitoring the resource agent relies on interfaces provided by SAP.

As long as the HANA landscape status is not "ERROR", the Linux cluster will not act. The main purpose of the Linux cluster is to handle the takeover to the other site.

The Linux cluster will act only if the HANA landscape status indicates that HANA cannot recover from the failure and the replication is in sync.

An important task of the resource agent is to check the synchronisation status of the two SAP HANA databases. If the synchronisation status is not "SOK", the cluster avoids taking over to the secondary side if the primary fails. This preserves data consistency.

For more information, refer to "Supported High Availability Solutions by SLES for SAP Applications" and all the manual pages shipped with the package.

File structure of installed package

  • /usr/share/SAPHanaSR-angi/doc contains readme and license;
  • /usr/share/man and its subdirectories contain manual pages;
  • /usr/lib/ocf/resource.d/suse contains the actual resource agents, SAPHanaController and SAPHanaTopology;
  • /usr/lib/SAPHanaSR-angi contains the libraries for the resource agents;
  • /usr/share/SAPHanaSR-angi contains SAP HA/DR provider hook scripts;
  • /usr/share/SAPHanaSR-angi/samples contains examples for global ini configuration and various additional stuff;
  • /usr/bin contains tools;

License

See the LICENSE file for license rights and limitations.

Contributing

If you are interested in contributing to this project, read the CONTRIBUTING.md for more information.

Feedback

Do you have suggestions for improvement? Let us know!

Go to Issues, create a new issue and describe what you think could be improved.

Feedback is always welcome!

Development and Branches

Please read development.md for more information.

saphanasr's People

Contributors

alex-burlakov, angelabriel, diegoakechi, fdanapfel, fmherschel, gereonvey, imanyugin, lpinne, mguertler, oalbrigt, peterpitterling, thr3d


saphanasr's Issues

ra/SAPHanaRA* - HANA_CALL - use defined execution shell

Currently the sidadm's login shell could be sh, bash, or csh. Because of the different default settings (e.g., expand_aliases) and shell behaviors, coding and testing are unnecessarily complicated. The su command allows specifying the shell to execute. Providing a fixed shell (sh or bash) would always give a defined runtime environment.

su - SIDadm -s /bin/sh -c '(cd $DIR_INSTANCE/exe/python_support; python landscapeHostConfiguration.py)'
su - SIDadm -s /bin/bash -c '(cd $DIR_INSTANCE/exe/python_support; python landscapeHostConfiguration.py)'

https://man7.org/linux/man-pages/man1/su.1.html

-s, --shell=shell
Run the specified shell instead of the default. The shell to
run is selected according to the following rules, in order:

Add hdbindexserver to list of monitored services on local instance?

Currently the list of monitored services in the 'saphana_check_local_instance()' function in the SAPHana resource agent only includes the 'hdbnameserver' and 'hdbdaemon' processes:

local MONITOR_SERVICES="hdbnameserver|hdbdaemon" # TODO: PRIO1: exact list of Services

In a support case a customer reported that even though HANA is able to restart an hdbindexserver process after it has crashed, it can take up to 40 minutes before the HANA instance is fully active again. During this time HANA apparently still reports that System Replication is working as expected, and since the status of the hdbindexserver isn't monitored, no takeover is performed.

The customer, however, would prefer a takeover to be performed in case the hdbindexserver process fails, so that they don't have to wait for the restart of the hdbindexserver process to finish before business can continue.

Are there any objections to extending the list of processes in the 'MONITOR_SERVICES' parameter in the 'saphana_check_local_instance()' function to also include the 'hdbindexserver' process, so that the SAPHana resource agent can detect when it has failed and initiate a takeover?
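
For illustration, the proposed change would boil down to something like this (sketch only; the variable and its surrounding matching logic are taken from the RA excerpt above):

local MONITOR_SERVICES="hdbnameserver|hdbdaemon|hdbindexserver" # extended list, hdbindexserver added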

Cost optimized scenario

Hello,
I already opened a SUSE SR, but expect better help from here:

I'm using a two-node HANA cluster, but for cost optimization on non-critical environments the second node is powered off (it can be activated on demand to reduce downtime during maintenance operations, or for testing purposes before going to production):

  • first node has HSR setup with no replication partner
  • second node is in standby mode in cluster configuration and power off

The issue is that when the first node is restarted, the cluster fails to start HANA on startup; it has to be done manually. I'm trying to understand why, and whether something should be improved in my cluster configuration to prevent this behaviour.

Error in crm status output:

Failed Resource Actions:

  • rsc_SAPHana_S90_start_0 on szxdh1hant1 'not running' (7): call=36, status=complete, exitreason='',

Output of hdbnsutil -sr_state command:

System Replication State


online: true

mode: primary
operation mode: primary
site id: 1
site name: TESTHANT1


is source system: true
is secondary/consumer system: false
has secondaries/consumers attached: false
is a takeover active: false

Host Mappings:
~~~~~~~~~~~~~~

szxdh1hant1 -> [TESTHANT1] szxdh1hant1


Site Mappings:
~~~~~~~~~~~~~~
TESTHANT1 (primary/primary)

Tier of TESTHANT1: 1

Replication mode of TESTHANT1: primary

Operation mode of TESTHANT1: primary

done.

And cluster configuration:


node 1: szxdh1hant1 \
        attributes lpa_s90_lpt=1604587629 hana_s90_vhost=szxdh1hant1 hana_s90_site=TESTHANT1 hana_s90_srmode=sync hana_s90_remoteHost=szxdh1hant2 standby=off hana_s90_op_mode=logreplay
node 2: szxdh1hant2 \
        attributes lpa_s90_lpt=10 hana_s90_op_mode=logreplay hana_s90_vhost=szxdh1hant2 hana_s90_remoteHost=szxdh1hant1 hana_s90_site=TEST1HANT2 hana_s90_srmode=sync standby=on
primitive STONITH-primary stonith:external/gcpstonith \
        op monitor interval=300s timeout=300s on-fail=restart \
        op start interval=0 timeout=60s on-fail=restart \
        params instance_name=szxdh1hant1 gcloud_path="/usr/bin/gcloud" logging=yes \
        meta is-managed=true
primitive STONITH-secondary stonith:external/gcpstonith \
        op monitor interval=300s timeout=300s on-fail=restart \
        op start interval=0 timeout=60s on-fail=restart \
        params instance_name=szxdh1hant2 gcloud_path="/usr/bin/gcloud" logging=yes \
        meta is-managed=true
primitive rsc_SAPHanaTopology_S90 ocf:suse:SAPHanaTopology \
        operations $id=rsc_sap2_S90-operations \
        op monitor interval=10 timeout=600 \
        op start interval=0 timeout=600 \
        op stop interval=0 timeout=300 \
        params SID=S90 InstanceNumber=00
primitive rsc_SAPHana_S90 ocf:suse:SAPHana \
        operations $id=rsc_sap_S90-operations \
        op start interval=0 timeout=3600 \
        op stop interval=0 timeout=3600 \
        op promote interval=0 timeout=3600 \
        op monitor interval=60 role=Master timeout=700 \
        op monitor interval=61 role=Slave timeout=700 \
        params SID=S90 InstanceNumber=00 PREFER_SITE_TAKEOVER=true DUPLICATE_PRIMARY_TIMEOUT=7200 AUTOMATED_REGISTER=false
primitive rsc_vip_gcp-primary ocf:gcp:alias \
        op monitor interval=300s timeout=300s \
        op start interval=0 timeout=300 \
        op stop interval=0 timeout=180s \
        params alias_ip="<my_ip>/32" hostlist="szxdh1hant1 szxdh1hant2" gcloud_path="/usr/bin/gcloud" logging=yes \
        meta priority=10
primitive rsc_vip_int-primary IPaddr2 \
        params ip=<my_ip> nic=eth0 cidr_netmask=32 \
        op monitor interval=10s
group gcp-ip-primary rsc_vip_int-primary rsc_vip_gcp-primary
ms msl_SAPHana_S90 rsc_SAPHana_S90 \
        meta clone-max=2 clone-node-max=1 interleave=true
clone cln_SAPHanaTopology_S90 rsc_SAPHanaTopology_S90 \
        meta clone-node-max=1 interleave=true
location LOC_STONITH_primary STONITH-primary -inf: szxdh1hant1
location LOC_STONITH_secondary STONITH-secondary -inf: szxdh1hant2
location cli-prefer-msl_SAPHana_S90 msl_SAPHana_S90 role=Started inf: szxdh1hant1
colocation col_saphana_ip_primary 4000: gcp-ip-primary:Started msl_SAPHana_S90:Master
order ord_SAPHana_S90 Optional: cln_SAPHanaTopology_S90 msl_SAPHana_S90
property cib-bootstrap-options: \
        maintenance-mode=false \
        no-quorum-policy=ignore \
        have-watchdog=false \
        dc-version="1.1.19+20180928.0d2680780-1.8-1.1.19+20180928.0d2680780" \
        cluster-infrastructure=corosync \
        cluster-name=S90cluster \
        stonith-enabled=true \
        stonith-action=off \
        stonith-timeout=420s \
        last-lrm-refresh=1604586788
rsc_defaults rsc-options: \
        resource-stickiness=1000 \
        migration-threshold=5000 \
        failure-timeout=3600
op_defaults op-options: \

Thanks for your help.

ra/* - hdbMap usage should be revised

SAPHana - hdbMap is not used at all and can be removed:

hdbMap="hdbnsutil -sr_state"

SAPHanaTopology - hdbMap is only used for HANA >= 1.00.111, where it gets overwritten, so 'hdbnsutil -sr_state' is never actually called:

hdbMap="hdbnsutil -sr_state"

    hdbMap="hdbnsutil -sr_state"
    if version "$hdbver" ">=" "1.00.111"; then
        hdbState="hdbnsutil -sr_stateConfiguration"
        hdbMap="hdbnsutil -sr_stateHostMapping"
    fi

    if version "$hdbver" ">=" "1.00.111"; then
        hdbANSWER=$(HANA_CALL --timeout "$HANA_CALL_TIMEOUT" --cmd "$hdbMap --sapcontrol=1" 2>/dev/null)
    fi

"we didn't expect node_status to be: DUMP <00000000 0a |.|#01200000001>"

Whenever we see the output "we didn't expect node_status to be: DUMP" it is always (>99%) exactly this output:

"we didn't expect node_status to be: DUMP <00000000 0a |.|#01200000001>"

<00000000 0a |.|#01200000001>
is created by the hexdump -C call

SAPHanaSR/ra/SAPHana

Lines 1276 to 1282 in 83ddf57

* )
super_ocf_log err "ACT: check_for_primary: we didn't expect node_status to be: <$node_status>"
dump=$( echo $node_status | hexdump -C );
super_ocf_log err "ACT: check_for_primary: we didn't expect node_status to be: DUMP <$dump>"
((i++))
super_ocf_log debug "DEC: check_for_primary: loop=$i: node_status=$node_status"
# TODO: PRIO1: Maybe we need to keep the old value for P/S/N, if hdbnsutil just crashes

|.| just means "non-printable char"
#012 just means \n (caused by rsyslog, which is used by super_ocf_log)
https://stackoverflow.com/questions/24606757/rsyslogd-and-characters-012-and-015
The characters \n and \r are dumped as #012 and #015 in the log file

Therefore the DUMP output above just means "empty string+LF"
00000000 \n
00000001

This special case should be treated separately instead of printing the DUMP.
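
A minimal sketch of what such special-casing could look like (purely illustrative; it reuses the variable and logging names from the excerpt above and is not the agent's actual code):

if [ -z "$node_status" ]; then
    # empty answer from hdbnsutil - log it explicitly instead of hexdumping it
    super_ocf_log warn "ACT: check_for_primary: node_status is empty"
else
    dump=$( echo "$node_status" | hexdump -C )
    super_ocf_log err "ACT: check_for_primary: we didn't expect node_status to be: DUMP <$dump>"
fi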

FULL_SR_STATUS changed from 60s to 5s

I would like to ask whether we can use a variable instead of a hard-coded value. Second, in our shop production systems are getting SFAIL due to the lower value. Can we keep 60s, or why has this been changed from 60s to 5s?

  • FULL_SR_STATUS=$(HANA_CALL --timeout 60 --cmd "systemReplicationStatus.py $siteParam" 2>/dev/null); srRc=$?
  • FULL_SR_STATUS=$(HANA_CALL --timeout 5 --cmd "systemReplicationStatus.py $siteParam" 2>/dev/null); srRc=$?
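
A configurable variant could look like this (sketch only; it reuses the HANA_CALL wrapper and the $siteParam name from the lines above and assumes the RA's existing $HANA_CALL_TIMEOUT variable):

FULL_SR_STATUS=$(HANA_CALL --timeout "$HANA_CALL_TIMEOUT" --cmd "systemReplicationStatus.py $siteParam" 2>/dev/null); srRc=$?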

Thanks, Viney

Adjust default operation timeouts in SAPHana and SAPHanaTopology resource agents?

Currently the defaults for the operation timeouts for both resource agents are set to values that are low compared to the values given as recommendations in all the documentation from SUSE, Red Hat and others:

  • defaults in SAPHana resource agent:
    <action name="start"   timeout="180" />
    <action name="stop"    timeout="240" />
    <action name="status"  timeout="60" />
    <action name="monitor" depth="0" timeout="60" interval="120" />
    <action name="monitor" depth="0" timeout="60" interval="121" role="Slave" />
    <action name="monitor" depth="0" timeout="60" interval="119" role="Master" />
    <action name="promote" timeout="320" />
    <action name="demote"  timeout="320" />
  • defaults defined SAPHanaTopology resource agent:
    <action name="start" timeout="180" />
    <action name="stop" timeout="60" />
    <action name="status" timeout="60" />
    <action name="monitor" depth="0" timeout="60" interval="60" />

The values given in the documentation are usually much higher. For example, here are the values given by Microsoft when setting up HANA System Replication HA on SLES on Azure:

sudo crm configure primitive rsc_SAPHanaTopology_HN1_HDB03 ocf:suse:SAPHanaTopology \
  operations \$id="rsc_sap2_HN1_HDB03-operations" \
  op monitor interval="10" timeout="600" \
  op start interval="0" timeout="600" \
  op stop interval="0" timeout="300" \
  params SID="HN1" InstanceNumber="03"
sudo crm configure primitive rsc_SAPHana_HN1_HDB03 ocf:suse:SAPHana \
  operations \$id="rsc_sap_HN1_HDB03-operations" \
  op start interval="0" timeout="3600" \
  op stop interval="0" timeout="3600" \
  op promote interval="0" timeout="3600" \
  op monitor interval="60" role="Master" timeout="700" \
  op monitor interval="61" role="Slave" timeout="700" \
  params SID="HN1" InstanceNumber="03" PREFER_SITE_TAKEOVER="true" \
  DUPLICATE_PRIMARY_TIMEOUT="7200" AUTOMATED_REGISTER="false"

We've seen a number of cases now where customers ran into problems because they forgot to explicitly define the values to use during resource creation. The cluster then used the defaults defined in the resource agents, which led to cluster failures since the responses from the HANA instances took much longer than the timeouts currently set as defaults in the resource agents.

Also, we've received questions from customers about why the values recommended in the documentation differ from the defaults defined in the resource agents.

My suggestion therefore would be to change the default values for the operation timeouts in both resource agents to the values given as recommendations in customer-facing documentation from SUSE, Red Hat and others.

In addition, at least in the SAPHana resource agent there are some default timeouts defined for operations which are never actually used:

    <action name="monitor" depth="0" timeout="60" interval="120" />
    <action name="methods" timeout="5" />

The 'monitor' operation is only used for standard resources, but in case of master/slave resources it is not needed, since there are separate monitor operations for the master and the slave resource.

'methods' seems to refer to an internal function of the resource agent, which as far as I can see is never called from outside as an operation. Also I haven't seen any other resource agents where a timeout for a 'methods' operation is defined.

Therefore I'd suggest removing these two definitions from the resource agent (the 'methods' timeout will most likely also have to be removed from the SAPHanaTopology resource agent).

HANA_CALL - handle all timeout return codes, not only 124

https://man7.org/linux/man-pages/man1/timeout.1.html

Exit status:
124 if COMMAND times out, and --preserve-status is not specified
125 if the timeout command itself fails
126 if COMMAND is found but cannot be invoked
127 if COMMAND cannot be found
137 if COMMAND (or timeout itself) is sent the KILL (9) signal (128+9)
- the exit status of COMMAND otherwise

SAPHanaSR/ra/SAPHana

Lines 692 to 697 in 8db3d75

#
# on timeout ...
#
if [ $rc -eq 124 ]; then
super_ocf_log warn "RA: HANA_CALL timed out after $timeOut seconds running command '$cmd'"
fi
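
A sketch of more complete handling could look like this (illustrative only; it keeps the existing $rc, $timeOut and $cmd names from the excerpt and maps the remaining timeout(1) exit codes to warnings):

case "$rc" in
    124|137) super_ocf_log warn "RA: HANA_CALL timed out after $timeOut seconds running command '$cmd'";;
    125)     super_ocf_log warn "RA: HANA_CALL: the timeout command itself failed for command '$cmd'";;
    126)     super_ocf_log warn "RA: HANA_CALL: command '$cmd' was found but could not be invoked";;
    127)     super_ocf_log warn "RA: HANA_CALL: command '$cmd' could not be found";;
esac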

Install srHook

According to the HANA admin guide, loading the srHook does not require a HANA restart.

All scripts are loaded during the start up phase of the name server, alternatively, to avoid the need for a restart, run the following command to reload the scripts immediately:

hdbnsutil -reloadHADRProviders

However, if no replication synchronization status change event happens after reloading the srHook, the srHook attribute will not be created. If a call of systemReplicationStatus.py then times out for whatever reason, the sync status is set to SFAIL.

Now if the primary HANA crashes, the get_SRHOOK function will pick up the SFAIL status, as the srHook attribute was never created, and thus the RA prefers a local HANA restart instead of a failover. Am I understanding this correctly?

The reason to open this issue is to confirm whether a HANA restart can be avoided when installing the srHook.

Add a check to identify the status of the primary node and set the "hana__roles" attribute during probe in SAPHana Resource agent

Issue:
On a fully operational cluster, when the cluster is put into maintenance mode and the Pacemaker/cluster service is restarted, then after removing the cluster from maintenance mode the DB on the primary is stopped and started again, which results in an outage for the customers.

Recreate the issue with the steps below:

  1. Make sure the cluster is fully operational with one Promoted and one Demoted node, and HSR is in sync.
  2. Put the cluster into maintenance mode ( crm configure property maintenance-mode=true )
  3. Stop the cluster service on both nodes ( crm cluster stop )
  4. Start the cluster service on both nodes ( crm cluster start )
  5. Remove the cluster from maintenance mode ( crm configure property maintenance-mode=false )

After step 5, the DB on the primary will be restarted, or sometimes a failover is triggered.

Reason:
This is happening because, if you attempt to start cluster services on a node while the cluster or node is in maintenance mode, Pacemaker will initiate a single one-shot monitor operation (a "probe") for every resource to evaluate which resources are currently running on that node. However, it will take no further action other than determining the resources' status.

So after step 4, a probe is initiated for the SAPHana and SAPHanaTopology resources.

In SAPHanaTopology, when the operation is identified as a probe in the monitor clone function, it only checks and sets the attribute for the HANA version; it does not do any check of the current cluster state. Because of this, the "hana_roles" and "master-rsc_SAPHana_HDB42" attributes are not set on the cluster primary.

Also, the SAPHana resource agent tries to get the status of the roles attribute (which is not set by that time) and sets the score to 5 during the probe. Later, when the cluster is removed from maintenance mode, the resource agent checks the roles attribute and its score; as those values are not as expected, the agent tries to fix the cluster and a DB stop/start happens.

Resolution:
If we add a check to identify the status of the primary node and set the "hana__roles" attribute during the probe, then when the cluster is removed from maintenance, the cluster will not try to stop and start the DB or trigger a failover, as it will see the operational primary node.

I have already modified the code and tested multiple scenarios; cluster functionality is not disturbed and the mentioned issue is resolved. I don't think these changes to the SAPHana resource agent will cause additional issues because, during the probe, we set the attributes only if we identify the primary node. But I need your expertise to check and finalize whether this approach can be used, or to suggest any other alternative/fix to overcome the above-mentioned issue.
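
For illustration only, the general idea could look roughly like this (this is not the tested patch; ocf_is_probe and crm_attribute are standard tools, but the helper, variable and attribute names below are placeholders):

if ocf_is_probe; then
    # Placeholder logic: detect a locally running primary via the RA's
    # existing landscape / sr_state checks, then publish the roles attribute
    # so the post-maintenance monitor finds the expected values.
    if local_instance_is_primary; then          # hypothetical helper
        crm_attribute -N "$NODENAME" -n "hana_${sid}_roles" -v "$local_roles" -l reboot
    fi
fi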

Created this new issue as suggested by @PeterPitterling

Provide more consistent man page examples within SAPHanaSR_maintenance_examples

  1. Problem
    Sometimes there are real commands written for the maintenance task examples, but sometimes only the tasks are shown.
    This man page is a very important one, as the maintenance tasks are done wrong by many admins.

  2. Proposal
    Be more consistent with examples for the shown tasks.
    It's clear that this does not fit everywhere, but it would help the users.

  3. Maybe a further way would be the new Smart Docs, which describe a task step by step.

ra/SAPHanaRA* - HANA_CALL reason for inner timeout?

Commit 2cc0fba introduced an inner timeout call for the command to be executed.

Unfortunately, the commit comment does not provide more insight.

The reason for this clarification request is the current modification to get rid of the HDBSettings.sh call. The inner timeout does not like 'cdpy' as the first command; see #121.

An additional 'true' as the first command would work around this, but the question is whether this second timeout could be removed altogether.

No Role information collected

We have installed a distributed HANA system (2+1) using non-shared storage and the installation completed successfully. Now we are facing an issue with the SAPHanaTopology resource agent: the agent is active but does not collect information regarding host roles, while the HANA version and host details are collected correctly. We are using Red Hat 8.1.

Consistent copyright entries?

Angela just changed the "Copyright" entries in the package SPEC file:

# Copyright (c) 2013-2014 SUSE Linux Products GmbH, Nuernberg, Germany.
# Copyright (c) 2014-2016 SUSE Linux GmbH, Nuernberg, Germany.
# Copyright (c) 2017-2019 SUSE LLC.

Should we also change the copyright notices in the resource agents, scripts, and example Python hooks?

Run "SAPHanaSR-showAttr " Show score = -INFINITY

I ran the SAPHanaSR-showAttr command and the slave node has an issue. It shows:
Hosts clone_state lpa_hdp_lpt node_state op_mode remoteHost roles score site srmode sync_state version vhost

NODE1 PROMOTED 1619661320 online logreplay_readaccess omsapdb2 4:P:default:master1:master:worker 150 SiteA sync PRIM 2.00.036.00.1547699771 NODE1
NODE2 WAITING4PRIM 30 online logreplay omsapdb1 4:S:master1:master:worker:master -INFINITY NODE2 sync SOK 2.00.036.00.1547699771 omsapdb2
"
I notice that the node2(slave node) score = infinity
I want to know what problem happened .
And what should i do to resovle it.

Best Regards

About Hook SAPHanaSR

Hello, I am configuring the SAPHanaSR hook on SLES 12 SP3 with SAP HANA 1.00.122.20.1536720790.
The following sections have been added to global.ini:

[ha_dr_provider_SAPHanaSR]
provider = SAPHanaSR
path = /usr/share/SAPHanaSR
execution_order = 1

[trace]
ha_dr_saphanasr = info

The following line is in the sudoers file:
h1padm ALL=(ALL) NOPASSWD: /usr/sbin/crm_attribute -n hana_h1p_site_srHook_*

When I try to start the HANA instance, it ends in a loop with these messages:

[75888]{-1}[-1/-1] 2020-01-31 15:03:43.061605 i ha_dr_provider   PythonProxyImpl.cpp(01052) : calling HA/DR provider SAPHanaSR.hookDRConnectionChanged(hostname=lnxhdb120pri, port=30001, database=SYSTEMDB, status=12, database_status=12, system_status=12, timestamp=2020-01-31T15:03:43.061563+01:00, is_in_sync=0, reason=Starting)
[75888]{-1}[-1/-1] 2020-01-31 15:03:43.061744 i ha_dr_SAPHanaSR  SAPHanaSR.py(00086) : SAPHanaSR (0.161.0) SAPHanaSR.srConnectionChanged method called with Dict={'status': 12, 'is_in_sync': False, 'database': 'SYSTEMDB', 'timestamp': '2020-01-31T15:03:43.061563+01:00', 'hostname': 'lnxhdb120pri', 'system_status': 12, 'reason': 'Starting', 'database_status': 12, 'port': '30001'}
[75888]{-1}[-1/-1] 2020-01-31 15:03:43.061787 e ha_dr_provider   PythonProxyImpl.cpp(01099) : SAPHanaSR/SAPHanaSR:srConnectionChanged() failed with python error: 'siteName'
  SAPHanaSR.py(94): mySite = ParamDict["siteName"]
[75443]{-1}[-1/-1] 2020-01-31 15:03:43.061827 e sr_nameserver    DisasterRecoveryPrimaryCallbackImpl.cpp(00181) : Could not send connectionEstablished event to HA/DR Provider. Retry in 5 seconds.

Is this a known issue? Am I doing something wrong?

Regards,
Marco

SAPHanaTopology - DIR_PROFILE usage ?

The function header claims DIR_PROFILE is set. What is its purpose, as this variable is not used in this RA?

# function: sht_init - initialize variables for the resource agent
# globals:  DIR_EXECUTABLE(w), SAPSTARTSRV(w), SAPCONTROL(w), DIR_PROFILE(w), SAPSTARTPROFILE(w), LD_LIBRARY_PATH(w), PATH(w), nodelist(w)

if [ -z "$OCF_RESKEY_DIR_PROFILE" ]
then
    DIR_PROFILE="/usr/sap/$SID/SYS/profile"
else
    DIR_PROFILE="$OCF_RESKEY_DIR_PROFILE"
fi

ra/SAPHana HANA_CALL - inner timeout prevents proper logging in hdbnsutil trace file

ra/SAPHanaTopology is properly traced

CommandUtil.cpp(00065) : command: hdbnsutil -sr_stateConfiguration --sapcontrol=1
CommandUtil.cpp(00098) : called by user 'tqladm' with UID: 1001 (parent process command line '-sh -c (: [8792]; hdbnsutil -sr_stateConfiguration --sapcontrol=1 > /run/SAPHanaSR_tql/HANA_CALL_CMD_TOP_OUT_1710875260638426633_tqladm) >& /run/SAPHanaSR_tql/HANA_CALL_CMD_TOP_1710875260638426633_tqladm ' with PID: 8882) (parent process executable /bin/bash')

while ra/SAPHana is not; here the hdbnsutil trace contains the full command line of the parent process, which is "timeout" in this case:

CommandUtil.cpp(00065) : command: hdbnsutil -sr_stateConfiguration --sapcontrol=1
CommandUtil.cpp(00098) : called by user 'tqladm' with UID: 1001 (parent process command line 'timeout -s 9 60 hdbnsutil -sr_stateConfiguration --sapcontrol=1 ' with PID: 8108) (parent process executable /usr/bin/timeout')

All the important information, like the PID and the file names, is missing.

The inner timeout has no real effect anyway, as it has the same timeout value as the outer timeout, which will trigger earlier.

Why is there a hardcoded timeout of 5s for calls of systemReplicationStatus.py?

With 7c66a3b the option to change the default value for HANA_CALL_TIMEOUT was (re-)introduced.

But for calls of systemReplicationStatus.py a hardcoded value of 5s is still used:
https://github.com/SUSE/SAPHanaSR/blob/master/ra/SAPHana#L1312

Is there a specific reason for this?

The reason I ask is because we received the following from a customer who ran into issues with their cluster setup due to this hardcoded timeout (the analysis actually seems to have been done by SAP):
"
Mar 11 06:00:48 SAPHana(SAPHana_SPR_00)[3828054]: INFO: RA ==== begin action monitor_clone (0.154.0) ====
Mar 11 06:00:48 SAPHana(SAPHana_SPR_00)[3828054]: INFO: RA: SRHOOK1=PRIM
Mar 11 06:00:48 SAPHana(SAPHana_SPR_00)[3828054]: INFO: RA: SRHOOK3=PRIM
Mar 11 06:00:57 SAPHana(SAPHana_SPR_00)[3828054]: INFO: RA: SRHOOK1=PRIM
Mar 11 06:00:57 SAPHana(SAPHana_SPR_00)[3828054]: INFO: RA: SRHOOK3=PRIM
Mar 11 06:01:03 SAPHana(SAPHana_SPR_00)[3828054]: WARNING: HANA_CALL timed out after 5 seconds running command 'systemReplicationStatus.py --site=Site2'
Mar 11 06:01:03 SAPHana(SAPHana_SPR_00)[3828054]: INFO: DEC analyze_hana_sync_statusSRS systemReplicationStatus.py (to site 'Site2')-> 124
Mar 11 06:01:03 SAPHana(SAPHana_SPR_00)[3828054]: INFO: ACT site=Site1, setting SFAIL for secondary (2) - srRc=124
Mar 11 06:01:03 SAPHana(SAPHana_SPR_00)[3828054]: INFO: RA ==== end action monitor_clone with rc=8 (0.154.0) (18s)====

When executed manually directly in the O/S, this command typically takes around 3 – 4 seconds to return depending on HANA load. It is possible under high HANA load that this command may take slightly longer to execute. For example, when I executed this command manually in SPR today at 13:00 AEDT it responded in 6.7 seconds, which would be enough to cause the cluster to indicate a failure, as it's "overrun" by 1.7 seconds.

This then caused the cluster to failover.

It is possible that the HANA system cannot respond within the hard-coded 5 seconds in all load situations, which falsely causes the cluster to raise a failure as HANA has not responded as expected. Setting the parameter HANA_CALL_TIMEOUT has no effect here, as the timeout is hardcoded to 5 seconds."

Edit: I found 2cc0fba where the hardcoded timeout was lowered from 60s to 5s, but there is no clear explanation on the reason behind this.

Use 'actual_mode' instead of 'mode' when global.ini is used as fallback to detect replication mode

During the investigation of a customer issue where the customer deliberately triggered the fallback of retrieving the current state of System Replication from global.ini (by removing access to the 'hdb*' binaries), it was discovered that the initial takeover after the fault works, but a takeover back to the initial node did not, because the resource agents were not able to determine that the currently active HANA instance was actually the primary site.

When investigating the issue together with SAP, it was discovered that the second takeover didn't work because it is actually not correct to use the 'mode' parameter in global.ini to determine the current state of System Replication of a HANA instance.

Here's the information provided by the colleagues who were working with the customer and SAP in debugging the issue:


The SAPHana resource agent uses the system_replication/mode attribute from global.ini as a fallback if the $hdbState command fails. The expectation is that a takeover event updates the mode parameter so that it's usually a valid representation of which node is currently primary.

However, mode is a static parameter that does not change with a takeover event. Instead, the takeover updates the actual mode parameter.

The resource agent needs to be updated to query the correct parameter in the event that $hdbState fails. This way, it can respond more appropriately to edge-case situations like missing hdb* binaries.

Adapted from an SAP engineer in a support collaboration email:

# Before takeover
# node1 is primary, node2 is secondary
	
global.ini on node1:
  mode  = primary
  actual mode = primary
  operation_mode = logreplay

global.ini on node2:
  mode  = sync
  actual mode = sync
  operation_mode = logreplay

hdbnsutil -sr_state on node1:
  mode: primary
  operation mode: primary

hdbnsutil -sr_state on node2:
  mode: sync
  operation mode: logreplay

# After takeover/failover
# node1 is secondary, node2 is primary

global.ini on node1:
  mode  = primary
  actual mode = sync
  operation_mode = logreplay

global.ini on node2:
  mode  = sync
  actual mode = primary
  operation_mode = logreplay

hdbnsutil -sr_state on node1:
  mode: sync
  operation mode: logreplay

hdbnsutil -sr_state on node2:
  mode: primary
  operation mode: primary

Just have a look at how the parameter values change depending on the source – global.ini or hdbnsutil -sr_state. I highlighted the major differences.

This behavior doesn't depend on removing binaries; it's a normal HANA parameter change after the takeover / re-registering secondaries to the new primary. Confirmation is provided in SAP Note 1999880 - FAQ: SAP HANA System Replication:

  1. Why do I see deviating values in the system replication mode parameter?

The value in parameter global.ini -> [system_replication] -> mode depends on the original role of the site:

· If site was originally configured as primary site: mode = 'primary'

· If site was originally configured as secondary / tertiary site: mode = 'sync', 'async', ... (dependent on the system replication mode)

As a consequence the mode value can be different in two identically configured systems if a takeover happened in one system, but not in the other.


Is there a specific reason why the 'mode' parameter is checked instead of 'actual_mode'? If not I would like to update the SAPHana resource agent to get this fixed.
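
For illustration, a fallback that reads 'actual_mode' could be as simple as something like this (sketch only; the global.ini path and the parsing are assumptions, not the RA's current code):

gini="/usr/sap/$SID/SYS/global/hdb/custom/config/global.ini"
# take the value of 'actual_mode' (section [system_replication]) instead of 'mode'
srmode=$(awk -F' *= *' '$1 == "actual_mode" {print $2}' "$gini")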

HANA_CALL - show {output,suErr,cmdErr} in any error case, not only for RC=1

SAPHanaSR/ra/SAPHana

Lines 679 to 691 in 8db3d75

output=$(if [ -f "$cmd_out_log" ]; then cat "$cmd_out_log"; rm -f "$cmd_out_log"; fi)
suErr=$(if [ -f "$su_err_log" ]; then cat "$su_err_log"; rm -f "$su_err_log"; else echo "NA"; fi)
cmdErr=$(if [ -f "$cmd_err_log" ]; then cat "$cmd_err_log"; rm -f "$cmd_err_log"; else echo "NA"; fi)
super_ocf_log debug "DBG: RA ==== action HANA_CALL (cmd is '$cmd', rc is '$rc', stderr from su is '$suErr', stderr from cmd is '$cmdErr') ===="
# on rc=1 - retry to improve the behavior in AD environments
if [ $rc -eq 1 ]; then
    super_ocf_log warn "RA: HANA_CALL stderr from command '$pre_cmd' is '$suErr', stderr from command '$cmd' is '$cmdErr'"
    if [ "$cmdErr" == "NA" ]; then
        # seems something was going wrong with the 'pre_cmd' (su)
        super_ocf_log warn "DEC: HANA_CALL returned '1' for command '$pre_cmd'. Retry once."
        output=$(timeout --foreground -s 9 "$timeOut" $pre_cmd "$pre_script; timeout -s 9 $timeOut $cmd"); rc=$?
    fi
fi
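
A sketch of how the logging could be widened to any non-zero return code (illustrative only, reusing the names from the excerpt above):

if [ $rc -ne 0 ]; then
    super_ocf_log warn "RA: HANA_CALL rc=$rc, stderr from '$pre_cmd' is '$suErr', stderr from '$cmd' is '$cmdErr'"
fi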

HANA_CALL_TIMEOUT can't be configured in cluster configuration

With commits c618d8f and 4d7057b the option to make the internal timeout used for some of the calls of SAP HANA commands configurable was introduced.

Even though there is a description of the HANA_CALL_TIMEOUT parameter in the metadata of the resource agents, the parameter currently can't be set via the cluster configuration, since there is no 'link' between OCF_RESKEY_HANA_CALL_TIMEOUT and the internal HANA_CALL_TIMEOUT variable. In fact, HANA_CALL_TIMEOUT still gets hardcoded to 60 seconds.

Is this intentional?

If yes, it might be better to remove the description of the HANA_CALL_TIMEOUT parameter from the metadata, to avoid customers trying to set it and then wondering why it doesn't work.
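
If it is not intentional, the missing 'link' would presumably be a one-liner along these lines (sketch, assuming the current default of 60 seconds is kept):

HANA_CALL_TIMEOUT="${OCF_RESKEY_HANA_CALL_TIMEOUT:-60}"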

Hardcoded timeout for call of getParameter.py to get operation mode?

Hi,

while the timeouts for most calls of HANA binaries and Python scripts were made configurable with commit 7c66a3b, the call of getParameter.py to get the operation mode still uses a hardcoded timeout of 10 seconds:

https://github.com/SUSE/SAPHanaSR/blob/maintenance-classic/ra/SAPHana#L2664

For other calls of getParameter.py in the resource agents, however, the $HANA_CALL_TIMEOUT variable is used to provide a configurable timeout.

Is there a specific reason why a hardcoded timeout is used for the getParameter.py call to get the operation mode, or would it be possible to also make the timeout configurable by using the $HANA_CALL_TIMEOUT variable instead?

ra/SAPHANA* - remove redundant HDBSettings.sh calls

Currently all HANA_CALL executions are prefixed by "su - sidadm".

The user's login shell already executes HDBSettings.sh, and therefore the additional call '/usr/sap/XXX/HDB00/HDBSettings.sh' is redundant.
HDBSettings.sh sources and calls additional scripts, which all do their initial executions. As these calls are executed almost every 2-3 s within a cluster environment, this creates a lot of unnecessary filesystem activity.

su - xxxadm -c '(true; timeout -s 9 10 /usr/sap/XXX/HDB00/HDBSettings.sh HDB version'
su - xxxadm -c '(true; timeout -s 9 120 /usr/sap/XXX/HDB00/HDBSettings.sh hdbnsutil -sr_stateConfiguration'
su - xxxadm -c '(true; timeout -s 9 120 /usr/sap/XXX/HDB00/HDBSettings.sh landscapeHostConfiguration.py'
su - xxxadm -c '(true; timeout -s 9 5 /usr/sap/XXX/HDB00/HDBSettings.sh systemReplicationStatus.py --site=SiteB'
and more

As long as 'su - sidadm' is the default anyway and cannot be changed by script parameters, simply remove the redundant call.
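
For illustration, the stripped-down call could then look like this (hypothetical example values, mirroring the lines above without the HDBSettings.sh prefix):

su - xxxadm -c 'timeout -s 9 120 hdbnsutil -sr_stateConfiguration'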

Unexpected DB outage when cluster is removed from maintenance mode after cluster service restart.

Issue:
On a fully operational cluster, when the cluster is put into maintenance mode and the Pacemaker/cluster service is restarted, then after removing the cluster from maintenance mode the DB on the primary is stopped and started again, which results in an outage for the customers.

Recreate the issue with the steps below:

  1. Make sure the cluster is fully operational with one Promoted and one Demoted node, and HSR is in sync.
  2. Put the cluster into maintenance mode ( crm configure property maintenance-mode=true )
  3. Stop the cluster service on both nodes ( crm cluster stop )
  4. Start the cluster service on both nodes ( crm cluster start )
  5. Remove the cluster from maintenance mode ( crm configure property maintenance-mode=false )

After step 5, the DB on the primary will be restarted, or sometimes a failover is triggered.

Reason:
This is happening because, if you attempt to start cluster services on a node while the cluster or node is in maintenance mode, Pacemaker will initiate a single one-shot monitor operation (a "probe") for every resource to evaluate which resources are currently running on that node. However, it will take no further action other than determining the resources' status.

So after step 4, a probe is initiated for the SAPHana and SAPHanaTopology resources.

In SAPHanaTopology, when the operation is identified as a probe in the monitor clone function, it only checks and sets the attribute for the HANA version; it does not do any check of the current cluster state. Because of this, the "hana_roles" and "master-rsc_SAPHana_HDB42" attributes are not set on the cluster primary.

Also, the SAPHana resource agent tries to get the status of the roles attribute (which is not set by that time) and sets the score to 5 during the probe. Later, when the cluster is removed from maintenance mode, the resource agent checks the roles attribute and its score; as those values are not as expected, the agent tries to fix the cluster and a DB stop/start happens.

Resolution:
To overcome this issue, if we add a check to identify the status of the primary node and set the "hana__roles" attribute during the probe, then when the cluster is removed from maintenance, the cluster will not try to stop and start the DB or trigger a failover, as it will see the operational primary node.

I have already modified the code and tested multiple scenarios; cluster functionality is not disturbed and the mentioned issue is resolved. I don't think these changes to the SAPHana resource agent will cause additional issues because, during the probe, we set the attributes only if we identify the primary node. But I need your expertise to check and finalize whether this approach can be used, or to suggest any other alternative/fix to overcome the above-mentioned issue.

man page - Message Types section requires review

  • currently only included in SAPHanaTopology but not SAPHana
  • LPA not described
  • TOP described, but not used any longer

Message Types: [ act | dbg | dec | flow | top ] - Default value: ra-act-dec-lpa
ACT: Action. Start, stop, sr_takeover and others. See also section SUPPORTED ACTIONS.
DBG: Debugging info. Usually not needed at customer site. See SUSE TID 7022678 for maximum RA tracing.
DEC: Decision taken by the RA.
FLOW: Function calls and the respective return codes.
TOP: Topology. Messages related to HANA SR topology, like site name and remote site.

About WAITING4LPA

I'd like to share a behaviour I found while playing around with a test cluster on SLES 15 SP1 for SAP Applications.

nodea and nodeb are running fine.

The hook wrote its properties into the CIB:

property SAPHanaSR: \
        hana_ph1_site_srHook_nodea=PRIM \
        hana_ph1_site_srHook_nodeb=SOK

On nodeb I shut down Pacemaker:
systemctl stop pacemaker
The hook modified its properties:

property SAPHanaSR: \
        hana_ph1_site_srHook_nodea=PRIM \
        hana_ph1_site_srHook_nodeb=SFAIL

HANA is still active.

Let's suppose nodeb is broken and can't be put online.
A maintenance operation on nodea needs a restart.

After the restart, the cluster does not bring the HANA instance online.
The command SAPHanaSR-showAttr reports that nodea is in status WAITING4LPA.

Is this the correct behaviour? If this happens on a production system, are we forced to (temporarily) disable the SAPHanaSR hook in order to restart the HANA instance within the cluster?
