Coder Social home page Coder Social logo

vitabaks / pgbackrest_auto Goto Github PK

View Code? Open in Web Editor NEW
55.0 55.0 16.0 275 KB

pgbackrest_auto: Automatic Restore and Validate for physical and logical database corruption (PostgreSQL)

License: MIT License

Shell 100.00%
automate pgbackrest postgresql restore validate

pgbackrest_auto's People

Contributors

artemsafiyulin avatar jwpit avatar vitabaks avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar

pgbackrest_auto's Issues

Checksum does not match

Starting from two days ago, we noticed that the pg restore failes with a checksum verification.

On both the PG server and on the pgbackrest node were no changes made aka patching etc.

2023-11-15 00:01:14 INFO: [STEP 1]: Starting
2023-11-15 00:01:14 INFO: Starting. Restore Type: Full PostgreSQL Restore FROM Stanza: postgres-cluster --> TO Directory: /var/lib/postgresql/15/backup
2023-11-15 00:01:14 INFO: Starting. Restore Settings: immediate   
2023-11-15 00:01:14 INFO: Starting. Run settings: Log: /var/log/pgbackrest/pgbackrest_auto_postgres-cluster.log
2023-11-15 00:01:14 INFO: Starting. Run settings: Lock run: /tmp/pgbackrest_auto_postgres-cluster.lock
2023-11-15 00:01:14 INFO: Starting. PostgreSQL version: 15
2023-11-15 00:01:14 INFO: Starting. PostgreSQL data directory: /var/lib/postgresql/15/backup
2023-11-15 00:01:14 INFO: Starting. PostgreSQL Database Validation: yes
2023-11-15 00:01:14 INFO: Starting. Clear Data Directory after restore: yes
2023-11-15 00:01:14  [33mWARN: [0m Restoring to /var/lib/postgresql/15/backup Waiting 30 seconds. The directory will be overwritten. If mistake, press ^C
2023-11-15 00:01:44 INFO: [STEP 2]: Stopping PostgreSQL
2023-11-15 00:01:44 INFO: attempt: 1/3600
2023-11-15 00:01:44 INFO: PostgreSQL check status
2023-11-15 00:01:44 INFO: PostgreSQL instance not running
2023-11-15 00:01:44 INFO: [STEP 3]: Restoring from backup
2023-11-15 00:01:44 INFO: See detailed log in the file /var/log/pgbackrest/postgres-cluster-restore.log
2023-11-15 00:01:44 INFO: Restore from backup started. Type: Full PostgreSQL Restore pgbackrest --config=/tmp/pgbackrest.conf --stanza=postgres-cluster --pg1-path=/var/lib/postgresql/15/backup  --type=immediate --repo1-path=/var/lib/pgbackrest --repo1-host=localhost --repo1-host-user=postgres --delta restore --process-max=4 --log-level-console=error --log-level-file=detail --recovery-option=recovery_target_action=promote --tablespace-map-all=/var/lib/postgresql/15/backup_remapped_tablespaces
ERROR: [026]: raised from local-3 protocol: error restoring '/var/lib/postgresql/15/backup/base/16415/1249': actual checksum '0756e5e6a05bcb6d9926780f9fa6192425f22069' does not match expected checksum '67f3f5282b004deeb3089e04c0c3fd1cf3adf92b'
       [ChecksumError] on retry after 47ms: [same message]
       [ChecksumError] on retry after 15078ms: [same message]
2023-11-15 00:02:01  [91mERROR: [0m Restore from backup failed

Any hint what might caused this error?

Thanks

tablespace can not be located inside PGDATA

Hello

i try to check restore database with external tablespace mapped to other catalog.
when I run pgbackrest_auto got the error in log:

-------------------PROCESS START-------------------
2022-07-06 09:55:52.049 P00   INFO: restore command begin 2.37: --config=/tmp/pgbackrest_auto.conf --delta --exec-id=8695-ceb72d05 --log-level-console=error --log-level-file=detail --pg1-path=/mnt/test/xx --process-max=4 --recovery-option=recovery_target_action=promote --repo1-host=localhost --repo1-host-user=postgres --stanza=main --tablespace-map-all=/mnt/test/xx/remapped_tablespaces --type=immediate
2022-07-06 09:55:52.538 P00   INFO: repo1: restore backup set 20220112-154850F_20220706-063003D, recovery will start at 2022-07-06 06:30:03
2022-07-06 09:55:52.538 P00   INFO: remap data directory to '/mnt/test/xx'
2022-07-06 09:55:52.538 P00   INFO: map tablespace 'pg_tblspc/16393' to '/mnt/test/xx/remapped_tablespaces/charges'
2022-07-06 09:55:52.538 P00  ERROR: [070]: link 'pg_tblspc/16393' destination '/mnt/test/xx/remapped_tablespaces/charges' is in PGDATA
2022-07-06 09:55:52.538 P00   INFO: restore command end: aborted with exception [070]

PostgreSQL will warn in this case:

	/* Warn if the tablespace is in the data directory. */
	if (path_is_prefix_of_path(DataDir, location))
		ereport(WARNING,
				(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
				 errmsg("tablespace location should not be inside the data directory")));

Failed to run amcheck with Postgresql 15.2 on Debian

When running the command to create extension amcheck the script fail and return
ERROR: CREATE EXTENSION amcheck failed

The command is in this line: https://github.com/vitabaks/pgbackrest_auto/blob/master/pgbackrest_auto#L594

When I run by hand:

$ psql -v "ON_ERROR_STOP" -p "${PGPORT}" -h 127.0.0.1 -U postgres -d "postgres" -tAXc "CREATE EXTENSION if not exists amcheck"

I get the following PostgreSQL error:

ERROR: could not access file "$libdir/pgaudit": No such file or director

It seem that to be able to install amcheck pgaudit is required, so I've installed pgaudit via sudo apt-get install postgresql-15-pgaudit.

After that, running the command to create extension again, I get:

ERROR: pgaudit must be loaded via shared_preload_libraries

I see that postgresql is started at https://github.com/vitabaks/pgbackrest_auto/blob/master/pgbackrest_auto#L465

The solution would be to set shared_preload_libraries config for this instance that is loaded.

pg_dump: error: query failed: ERROR: out of shared memory

Hi Vitaliy,

I am getting the following error in the data validation step.

.
.
.
.
2023-03-08 08:58:11 INFO: Start data validation for database bkm_express
2023-03-08 08:58:11 INFO: starting pg_dump -p 5432 -h 127.0.0.1 -d bkm_express >> /dev/null
pg_dump: error: query failed: ERROR: out of shared memory
HINT: You might need to increase max_locks_per_transaction.
pg_dump: error: query was: LOCK TABLE bkmexpress.virtual_pos_log IN ACCESS SHARE MODE
2023-03-08 08:58:25 ERROR: Data validation in the database bkm_express - Failed
[postgres@dbtest01 data]$ cat postgresql.conf
listen_addresses = '*'
port = 5432
max_connections = 500
max_worker_processes = 24
max_prepared_transactions = 0
max_locks_per_transaction = 64
[postgres@dbtest01 data]$

I was able to solve it by increasing the max_locks_per_transaction value.
Default values are written to the conf file.

I took the following document as an example for optimization.
https://cloud.google.com/sql/docs/postgres/optimize-high-memory-usage

I used the following query for the minimum value.
SELECT ((SELECT count(*) as object_count
FROM pg_catalog.pg_class c
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind IN ('r', 'v', 'i', 'S', 's')) /
(SELECT current_setting('max_connections')::float))+1 as MIN_max_locks_per_transaction_VALUE;

How can we optimize this dynamically in your software?

unable to open missing file

A couple a days ago pgbackrest_auto stopped working. In the logs we see:

023-10-03 00:01:18 INFO: [STEP 1]: Starting
2023-10-03 00:01:18 INFO: Starting. Restore Type: Full PostgreSQL Restore FROM Stanza: postgres-cluster --> TO Directory: /var/lib/postgresql/15/backup
2023-10-03 00:01:18 INFO: Starting. Restore Settings: immediate   
2023-10-03 00:01:18 INFO: Starting. Run settings: Log: /var/log/pgbackrest/pgbackrest_auto_postgres-cluster.log
2023-10-03 00:01:18 INFO: Starting. Run settings: Lock run: /tmp/pgbackrest_auto_postgres-cluster.lock
2023-10-03 00:01:18 INFO: Starting. PostgreSQL version: 15
2023-10-03 00:01:18 INFO: Starting. PostgreSQL data directory: /var/lib/postgresql/15/backup
2023-10-03 00:01:18 INFO: Starting. PostgreSQL Database Validation: yes
2023-10-03 00:01:18 INFO: Starting. Clear Data Directory after restore: yes
2023-10-03 00:01:18 �[33mWARN:�[0m Restoring to /var/lib/postgresql/15/backup Waiting 30 seconds. The directory will be overwritten. If mistake, press ^C
2023-10-03 00:01:48 INFO: [STEP 2]: Stopping PostgreSQL
2023-10-03 00:01:48 INFO: attempt: 1/3600
2023-10-03 00:01:48 INFO: PostgreSQL check status
2023-10-03 00:01:48 INFO: [STEP 3]: Restoring from backup
2023-10-03 00:01:48 INFO: See detailed log in the file /var/log/pgbackrest/postgres-cluster-restore.log
2023-10-03 00:01:48 INFO: Restore from backup started. Type: Full PostgreSQL Restore
pgbackrest --config=/tmp/pgbackrest.conf --stanza=postgres-cluster --pg1-path=/var/lib/postgresql/15/backup  --type=immediate --repo1-path=/var/lib/pgbackrest --repo1-host=localhost --repo1-host-user=postgres --delta restore --process-max=4 --log-level-console=error --log-level-file=detail --recovery-option=recovery_target_action=promote --tablespace-map-all=/var/lib/postgresql/15/backup_remapped_tablespaces
WARN: --delta or --force specified but unable to find 'PG_VERSION' or 'backup.manifest' in '/var/lib/postgresql/15/backup' to confirm that this is a valid $PGDATA directory. --delta and --force have been disabled and if any files exist in the destination directories the restore will be aborted.
ERROR: [055]: raised from local-2 protocol: raised from remote-2 ssh protocol on 'localhost': unable to open missing file '/var/lib/pgbackrest/backup/postgres-cluster/20230930-000107F/pg_data/base/16416/29644.gz' for read
       [FileMissingError] on retry after 279ms: [same message]
       [FileMissingError] on retry after 15280ms: [same message]
2023-10-03 00:02:43 �[91mERROR:�[0m Restore from backup failed

pgbackrest_auto is started via cron job:

#=== pgbackrest - Backup PostgreSQL ====================

01 00 * * 6 postgres if pgbackrest --stanza=postgres-cluster --type=full backup; then pgbackrest_auto --from=postgres-cluster --to=/var/lib/postgresql/15/backup --checkdb --clear --report; fi
01 00 * * 0-5 postgres if pgbackrest --stanza=postgres-cluster --type=diff backup; then pgbackrest_auto --from=postgres-cluster --to=/var/lib/postgresql/15/backup  --checkdb --clear --report; fi

Any idea what might caused this issue?

WARN: sendemail could not be found. Please install the sendemail package

Hi,

I get this error when executing the script with the --report flag:

postgres@pgbackrest01:~$ pgbackrest_auto --from=postgres-cluster --to=/var/lib/postgresql/15/main/ --checkdb --report
2023-07-10 13:53:54 WARN: sendemail could not be found. Please install the sendemail package
postgres@pgbackrest01:~$
postgres@pgbackrest01:~$ command -v sendmail
/usr/sbin/sendmail
postgres@pgbackrest01:~$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin

Any idea why the script doesn't find sendmail?

/usr/bin/pgbackrest_auto: line 430: pg_isready: command not found

Hi Vitaliy ,

i'm trying to setting your script .
pgisready steps are giving same error as cycle.
Is pg_isready misspelled inside the function?

2023-01-18 09:32:47 INFO: Restore from backup done
2023-01-18 09:32:47 INFO: [STEP 4]: PostgreSQL Starting for recovery
2023-01-18 09:32:47 INFO: PostgreSQL start
/usr/bin/pgbackrest_auto: line 430: pg_isready: command not found
2023-01-18 09:32:49 INFO: attempt: 1/3600
/usr/bin/pgbackrest_auto: line 430: pg_isready: command not found
2023-01-18 09:32:49 WARN: PostgreSQL instance data no response
2023-01-18 09:32:50 INFO: attempt: 2/3600
/usr/bin/pgbackrest_auto: line 430: pg_isready: command not found
2023-01-18 09:32:50 WARN: PostgreSQL instance data no response
2023-01-18 09:32:51 INFO: attempt: 3/3600
/usr/bin/pgbackrest_auto: line 430: pg_isready: command not found
2023-01-18 09:32:51 WARN: PostgreSQL instance data no response

When pg_isready is changed to pgisready, it gives segment fault error as below.

Jan 18 09:39:12 dbtest01 pgbackrest_auto[1456198]: See detailed log in the file /var/log/pgbackrest/stanza-restore.log
Jan 18 09:39:21 dbtest01 pgbackrest_auto[1456461]: Restore from backup done
Jan 18 09:39:21 dbtest01 pgbackrest_auto[1456485]: [STEP 4]: PostgreSQL Starting for recovery
Jan 18 09:39:21 dbtest01 pgbackrest_auto[1456488]: PostgreSQL start
Jan 18 09:39:33 dbtest01 kernel: pgbackrest_auto[1455941]: segfault at 7ffd7f0c6fc8 ip 00007f8c0b727f39 sp 00007ffd7f0c6fa0 error 6 in libc-2.28.so[7f8c0b68d000+1bc000]
Jan 18 09:39:33 dbtest01 systemd-coredump[1457118]: Resource limits disable core dumping for process 1455941 (pgbackrest_auto).
Jan 18 09:39:33 dbtest01 systemd-coredump[1457118]: Process 1455941 (pgbackrest_auto) of user 53002 dumped core.

Thanks

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.