kartoza / docker-pg-backup
A cron job that will back up databases running in a docker postgres container
License: GNU General Public License v2.0
Thanks for this amazing work!
I noticed that there seems to be no way to skip the built-in cron schedule. In k8s we have the CronJob controller to schedule a job that runs and exits, so I think a RUN_ONCE option would be better and more k8s-friendly.
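A minimal sketch of what such a gate could look like in the start script. RUN_ONCE is the proposed variable, not one the image currently supports, and run_backup here is a stand-in for the real /backup-scripts/backups.sh:

```shell
#!/bin/bash
# RUN_ONCE is hypothetical: run a single backup and exit (k8s CronJob style)
# instead of installing the built-in cron schedule.
run_backup() { echo "backup ran"; }   # stand-in for /backup-scripts/backups.sh

if [ "${RUN_ONCE:-false}" = "true" ]; then
    run_backup
    exit 0
fi
echo "RUN_ONCE not set, falling through to cron scheduling"
```

With this in place, a k8s CronJob could simply set RUN_ONCE=true and let its own controller handle scheduling and restarts.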
In the Postgres env settings I'm using the '&' character. This causes issues because it doesn't get escaped in pgenv.sh.
Sourcing the pgenv.sh file then raises an error, which causes the backups not to run.
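One way the quoting could be fixed when pgenv.sh is generated (a sketch, not the image's actual code) is bash's printf %q, which escapes &, $, quotes, and spaces so the value survives being sourced:

```shell
#!/bin/bash
# Hypothetical fix sketch: write pgenv.sh with printf %q so special
# characters in credentials are escaped before the file is sourced.
POSTGRES_PASS='sup&r$ecret'
printf 'export PGPASSWORD=%q\n' "$POSTGRES_PASS" > /tmp/pgenv.sh

source /tmp/pgenv.sh
echo "$PGPASSWORD"   # prints: sup&r$ecret
```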
I was using the kartoza/postgis arm64 version and wanted this backup image in arm64 as well. Right now I'm building my own image for backup purposes, which takes quite some time; an official arm64 build would be helpful and would align with postgis.
Thanks!
Hi all, and thanks for the great repo!
Are there any plans to add license to this repo?
https://opensource.stackexchange.com/questions/1720/what-can-i-assume-if-a-publicly-published-project-has-no-license
Deploying the image (even with environment variables) deploys the backups-cron so that it backs up every minute.
I had to manually enter the container and change the "backups-cron" file so that it only runs at 23h.
Is this the intended behaviour, or is there some kind of env variable that I am missing?
It would be great if the container worked out of the box with no manual configuration after starting it.
The latest tag on Docker Hub was created two years ago.
Please update it to 12.0.
How do I restore all backups?
The kartoza postgis docker-compose allows the use of e.g. POSTGRES_PASS_FILE to pass secrets into the container. It would be nice to be able to provide secret credentials to the docker-pg-backup image in the same way.
The paths seem to have changed in v13 from / to /backup-scripts, which isn't reflected in the crontab... so the scripts are never called. See #43
Even though the 13-3.1 tag is explicitly mentioned in the readme, it doesn't exist (anymore):
We highly suggest that you use a tagged image that match the PostgreSQL image you are running i.e (kartoza/pg-backup:13-3.1 for backing up kartoza/postgis:13-3.1 DB). The latest tag may change and may not successfully back up your database.
Dockerfile has to be modified this way:
RUN apt-get -y update; apt-get -y --no-install-recommends install postgresql-client cron \
    && apt-get -y --purge autoremove \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
I have an app that uses multiple databases (one per customer). Those databases are spread out over multiple servers, with several customers' databases on each.
Obviously I will have to run an instance of docker-pg-backup on each server. I would love to just dump all of them in a single run, but it is only possible to specify a single database name.
Is it possible to support multiple databases?
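If it were supported, the loop could look something like this sketch, assuming a space-separated DBLIST variable (hypothetical here; the pg_dump invocation is echoed rather than executed so the snippet is self-contained):

```shell
#!/bin/bash
# Loop over a space-separated list of databases and dump each one to its
# own file; pg_dump is only printed here, not run.
DBLIST=${DBLIST:-"customer_a customer_b customer_c"}
DUMPPREFIX=${DUMPPREFIX:-PG}
for DB in $DBLIST; do
    FILENAME="/backups/${DUMPPREFIX}_${DB}.$(date +%d-%B-%Y).dmp"
    echo "pg_dump -Fc -f ${FILENAME} ${DB}"
done
```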
When the database is not ready at the time the docker-pg-backup container starts, it initialises with an empty DBLIST and thus never makes any backups.
If DBLIST is empty, it's probably better to exit with an error, so the container is restarted after a bit of a delay.
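As a sketch, the fail-fast check could look like this (wrapped in a function so it can be exercised with both an empty and a non-empty list):

```shell
#!/bin/bash
# Sketch: abort with a non-zero status when no databases were found,
# so `restart: on-failure` brings the container back up later.
check_dblist() {
    if [ -z "$1" ]; then
        echo "ERROR: DBLIST is empty - is the database up yet?" >&2
        return 1
    fi
}

check_dblist "" 2>/dev/null && echo "unexpected"   # fails: empty list
check_dblist "gis customers" && echo "DBLIST ok"   # prints: DBLIST ok
```

In the real start script the function would be followed by `exit 1` instead of `return 1`, letting Docker's restart policy retry after a delay.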
Hi,
I'm running a very simple version of the S3 docker-compose that you added:
version: '2.1'

volumes:
  db-data:

services:
  dbbackups:
    image: kartoza/pg-backup:11.0
    environment:
      - DUMPPREFIX=staging
      - POSTGRES_DBNAME=xx
      - POSTGRES_HOST=xx
      - POSTGRES_USER=xx
      - POSTGRES_PASS=xx
      - POSTGRES_PORT=5432
      - STORAGE_BACKEND="S3"
      - ACCESS_KEY_ID=xx
      - SECRET_ACCESS_KEY=xx
      - DEFAULT_REGION=eu-west-1
      - BUCKET=xx
      - HOST_BASE=
      - HOST_BUCKET=
      - SSL_SECURE=True
      - REMOVE_BEFORE=90
      - CRON_SCHEDULE='*/2 * * *'
      - DUMP_ARGS='-Fc'
    restart: on-failure
But it's just hanging, telling me this:
➜ pg-backups docker-compose up
Recreating pg-backups_dbbackups_1 ... done
Attaching to pg-backups_dbbackups_1
dbbackups_1 | Start script running with these environment options
dbbackups_1 | PG_ENV=/pgenv.sh
I'm not even sure whether this is an error or just a log line, but nothing shows up in my S3 bucket. Did I do something wrong in the docker-compose file?
Thanks a lot!
I was just working on a small PR to expose more env vars to control the databases to be backed up, and a few other things.
While testing, I realized that even the current master doesn't run the cron job configured in the Docker container, at least for me. I feel a bit stupid right now. I changed backups-cron to run the job every minute, but it would never run. I tried exec-ing into the container and starting cron (even though top showed me it was running). I tried adding a simple echo in backups-cron to be run every minute. Nothing ever worked.
However, I could run /backups.sh successfully in the container, also in the way cron would approximately run it: /bin/sh -c "(export PATH=/usr/bin:/bin; /backups.sh 2>&1)".
Really, the only thing that got it working was editing the crontab from within the container and adding the minutely executed backups.sh right there. Nothing else seemed to trigger the cron job.
Has anyone experienced this before? I've already sunk 3 hours into it and have no idea what's wrong with the container's cron setup; it actually all looks fine to me.
Hello, thank you for your amazing container ^^
I just can't make the S3 cleaner work. It seems that awk {'print ${S3_BUCKET}" "${DEL_DAYS}'} is a syntax error; playing with the command manually, I believe this works: awk '{print $S3_BUCKET" "$DEL_DAYS}'
Regards
Hi, now that v14 is merged, can you also update the Docker Hub image for Postgres 14?
Hi, I tried using the S3 bucket; the backup container is not running. I tested the AWS credentials and the bucket by other means, and the bucket is definitely reachable and working.
version: '3'

volumes:
  postgres-data:

services:
  node-docker:
    container_name: node_container
    build: .
    ports:
      - 3000:4000
  postgres:
    container_name: postgres_container
    image: postgis/postgis
    restart: always
    environment:
      - POSTGRES_DB=postgres
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=password
    ports:
      - '5432:5432'
    volumes:
      - postgres-data:/var/lib/postgresql/data
  dbbackups:
    container_name: dbbackups_container
    image: kartoza/pg-backup:${POSTGRES_MAJOR_VERSION:-14}-${POSTGIS_MAJOR_VERSION:-3}.${POSTGIS_MINOR_RELEASE:-1}
    environment:
      - DUMPPREFIX=backup
      - POSTGRES_HOST=postgres
      - POSTGRES_USER=postgres
      - POSTGRES_PASS=password
      - POSTGRES_PORT=5432
      - CRON_SCHEDULE="*/5 * * * *"
      - STORAGE_BACKEND="S3"
      - ACCESS_KEY_ID=MY_AWS_KEY_ID
      - SECRET_ACCESS_KEY=MY_AWS_SECRET_KEY
      - DEFAULT_REGION=eu-west-2
      - BUCKET=MY_AWS_BUCKET
      - SSL_SECURE=True
    restart: on-failure
    depends_on:
      postgres:
        condition: service_healthy
Hello,
Is there a way to encrypt/set a password for the backup?
Thanks in advance!
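The image doesn't document encryption, but as a sketch you could pipe the dump through a symmetric cipher. openssl is used here for illustration (gpg --symmetric works similarly); BACKUP_PASSPHRASE is a made-up variable, and the dump is a stand-in file so the round trip is self-contained:

```shell
#!/bin/bash
# Encrypt a (stand-in) dump with a passphrase, then decrypt it again.
export BACKUP_PASSPHRASE=changeme
echo "pretend this is pg_dump -Fc output" > /tmp/dump.dmp

openssl enc -aes-256-cbc -pbkdf2 -salt -pass env:BACKUP_PASSPHRASE \
    -in /tmp/dump.dmp -out /tmp/dump.dmp.enc

openssl enc -d -aes-256-cbc -pbkdf2 -pass env:BACKUP_PASSPHRASE \
    -in /tmp/dump.dmp.enc -out /tmp/dump.decrypted
```

In the real script the dump would be piped straight into the cipher so no plaintext copy ever lands on disk.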
Will provide a PR soon. Also, it is not very convenient that cron output/errors are hidden.
Going to change it to:
*/3 * * * * /backup-scripts/backups.sh > /var/log/cron.out 2>&1
Currently, the process in the container runs as the root user. For security reasons this is not recommended, and it is not even easily possible in some environments (e.g. OpenShift).
I tried to add a new user and change the corresponding permissions.
Unfortunately it still fails with the following error message:
cron: can't open or create /var/run/crond.pid: Permission denied
I didn't find any solution for the cron permission issue. Does anybody have an idea?
When the container starts, it reads the list of available databases and stores it in pgenv.sh; databases added to the upstream server afterwards are not backed up until the container is restarted.
We should find a way to query the DB and populate the list dynamically, or maybe poll the DB on the cron schedule and repopulate pgenv.sh.
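A sketch of the dynamic query (psql is stubbed out here so the snippet is self-contained; the SQL is the standard way to list non-template databases):

```shell
#!/bin/bash
# Rebuild the database list at backup time instead of once at startup.
# The psql function below is a stub standing in for a real server.
psql() { printf 'gis\nnew_customer_db\n'; }

DBLIST=$(psql -At -c "SELECT datname FROM pg_database WHERE NOT datistemplate AND datname <> 'postgres';")
for DB in $DBLIST; do
    echo "backing up ${DB}"
done
```

Running this at the top of the cron-invoked script, rather than once in start.sh, would pick up newly created databases automatically.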
I want to see in the container's logs when it starts creating a backup, when it finishes, what was created and, if it failed, some logs to understand the context of the failure. Timestamps on the log lines would also be helpful.
Whenever a backup file is missing for some reason, you have no idea what happened, because the container's logs stay empty.
Start script running with these environment options
DUMPPREFIX=PG_backup
PG_CONN_PARAMETERS='-h production_postgres -p 5432 -U postgres'
Hi,
I think I've found a bug in the cleanup process. Here are the traces:
root@7d14762e73e7:/backup-scripts# /backup-scripts/backups.sh
Bucket 's3://vault-backups/' created
upload: '<stdin>' -> 's3://vault-backups/globals.sql' [part 1 of -, 481B] [1 of 1]
481 of 481 100% in 0s 35.24 KB/s done
WARNING: Module python-magic is not available. Guessing MIME types based on file extensions.
upload: '/vault-backups/2023/March/PG_ts_vault.17-March-2023.dmp.gz' -> 's3://vault-backups/2023/March/PG_ts_vault.17-March-2023.dmp.gz' [1 of 1]
15364 of 15364 100% in 0s 2.16 MB/s done
Done. Uploaded 15364 bytes in 1.0 seconds, 15.00 KB/s.
awk: line 1: syntax error at or near {
awk: line 1: syntax error at or near {
awk: line 1: syntax error at or near }
date: invalid date ‘+%s’
date: invalid date ‘-vault-backups’
awk: line 1: syntax error at or near {
awk: line 1: syntax error at or near {
awk: line 1: syntax error at or near }
date: invalid date ‘+%s’
date: invalid date ‘-vault-backups’
It sounds like the function does not calculate proper dates:
# Cleanup S3 bucket
function clean_s3bucket() {
  S3_BUCKET=$1
  DEL_DAYS=$2
  s3cmd ls s3://${S3_BUCKET} --recursive | while read -r line; do
    createDate=$(echo $line | awk {'print ${S3_BUCKET}" "${DEL_DAYS}'})
    createDate=$(date -d"$createDate" +%s)
    olderThan=$(date -d"-${S3_BUCKET}" +%s)
    if [[ $createDate -lt $olderThan ]]; then
      fileName=$(echo $line | awk {'print $4'})
      echo $fileName
      if [[ $fileName != "" ]]; then
        s3cmd del "$fileName"
      fi
    fi
  done
}
should be changed to:
# Cleanup S3 bucket
function clean_s3bucket() {
  S3_BUCKET=$1
  DEL_DAYS=$2
  s3cmd ls s3://${S3_BUCKET} --recursive | while read -r line; do
    createDate=$(echo $line | awk '{print $1}')
    createDate=$(date -d "$createDate" +%s)
    olderThan=$(date -d "${DEL_DAYS} ago" +%s)
    if [[ $createDate -lt $olderThan ]]; then
      fileName=$(echo $line | awk '{print $4}')
      echo $fileName
      if [[ $fileName != "" ]]; then
        s3cmd del "$fileName"
      fi
    fi
  done
}
I think that will be better :)
In the docs I can't find a way to change the default backup time. Am I missing something, or can't it be changed?
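Other reports in this tracker show a CRON_SCHEDULE environment variable being used; assuming your image version supports it, a compose snippet along these lines should move the backup time (standard 5-field cron syntax):

```yaml
dbbackups:
  image: kartoza/pg-backup:13-3.1
  environment:
    # minute hour day-of-month month day-of-week
    - CRON_SCHEDULE=0 23 * * *   # daily at 23:00
```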
Outdated link: https://registry.hub.docker.com/u/kartoza/pg-backup/
Found by searching Docker Hub: https://registry.hub.docker.com/r/kartoza/pg-backup
When I create a database named "gis-server", configure docker-compose with the volume /data/gisdata:/var/lib/postgresql, and then run docker restart on the container, the logs show a problem:
ERROR: databases "gis" already exists
I cannot restart the docker container.
When I rename the database from "gis-server" to "dbserver", it restarts fine.
If restoring fresh or into another environment, you need to be able to restore globals (roles), so this needs to be added to the script:
pg_dumpall --globals -f globals.sql
Hello,
I'm getting the following error when trying to restore a backup.
Command
docker-compose exec postgres_backup /restore.sh
Error
TARGET_DB: sis_db
WITH_POSTGIS:
TARGET_ARCHIVE: backups/2020/June/LOCAL_sis_db.15-June-2020.dmp
Dropping target DB
Recreate target DB without POSTGIS
Restoring dump file
SET
SET
SET
psql:/backups/globals.sql:14: ERROR: role "postgres" already exists
ALTER ROLE
psql:/backups/globals.sql:16: ERROR: role "replicator" already exists
ALTER ROLE
psql:/backups/globals.sql:18: ERROR: role "sis" already exists
ALTER ROLE
pg_restore: error: one of -d/--dbname and -f/--file must be specified
macOS Catalina
Docker version 19.03.8, build afacb8b
docker-compose version 1.25.5, build 8a1c60f6
kartoza/postgis:12.1
kartoza/docker-pg-backup/tree/master
I am using Docker compose and have it set up like this:
dbbackup:
  image: kartoza/pg-backup:11.0
  hostname: pg-backups
  depends_on:
    - postgres
  volumes:
    - /mybackupspath:/backups
  environment:
    - DUMPPREFIX=myprefix
    - POSTGRES_USER=dev
    - POSTGRES_PASS=mypassword
    - POSTGRES_PORT=5432
    - POSTGRES_HOST=postgres
    - POSTGRES_DBNAME=mydatabasename
However, if I open a terminal inside the container and execute cat pgenv.sh, I get this:
export PGUSER=docker
export PGPASSWORD=mypassword
export PGPORT=5432
export PGHOST=postgres
export PGDATABASE=mydatabasename
export DUMPPREFIX=myprefix
export ARCHIVE_FILENAME=
I expected export PGUSER=dev but got export PGUSER=docker.
I haven't been able to figure out why this happens, unfortunately.
Hi everyone,
I'm wondering why the backups.sh script works fine when run manually from the container but not on the cron schedule. I'm running the 13-3.1 version, and here is my backups-cron file:
# Run the backups at 11pm each night
0 2 * * * /backup-scripts/backups.sh > /var/log/cron.out 2>&1
# We need a blank line here for it to be a valid cron file
As we can see in the /var/log/cron.out file:
psql: error: could not translate host name "db" to address: Temporary failure in name resolution
pg_dumpall: error: could not connect to database "template1": could not translate host name "db" to address: Temporary failure in name resolution
Thank you for your help.
Used in for example https://github.com/kartoza/docker-pg-backup/blob/master/scripts/restore.sh#L6
Using version v12.0:
$ docker logs -f --tail 1000 postgis-dbbackups-1
psql: error: could not connect to server: could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
Start script running with these environment options
DUMPPREFIX=PG_db
PG_ENV=/pgenv.sh
Hey, I read the GitHub readme and the backup script, and I don't really understand whether it works when your backup target is S3 object storage. It isn't mentioned anywhere.
Hi,
I am having an issue trying to automate backing up my database using your method.
I edited backups-cron to run every minute and ran docker-compose up dbbackup.
After a while I came back to check, just to discover no .dmp files had been created 😦 so I ran:
docker-compose run dbbackup /bin/bash
and found no pgenv.sh had been created. Which is weird, because the log from above did print all the environment parameters specified in start.sh.
Then I manually ran ./start.sh and waited for a while. pgenv.sh was created then, but there was still no .dmp in the backups directory 😦
Only running backups.sh manually resulted in a .dmp.
The pg_backup container uses increasingly more memory over time, reaching gigabytes.
Version: 14-3.3
We run pg_backup within Docker Swarm and use volumes to store backups.
I'm running the kartoza/pg-backup:14-3.1 image with docker-compose like this:
postgres-backup:
  image: kartoza/pg-backup:13-3.1
  restart: always
  volumes:
    - postgres-backup:/backups
  environment:
    - POSTGRES_HOST=postgres
    - POSTGRES_USER=postgres
    - POSTGRES_PASS=example
    - REMOVE_BEFORE=15
    - CRON_SCHEDULE='0 0 * * *'
When I exec into the container, all variables are accessible; for example POSTGRES_HOST is correctly postgres instead of db.
Also, when I manually run /backup-scripts/backups.sh, the backup is created as expected.
But for some reason, the env variables are not passed to cron. I could additionally confirm this by running a test script:
#!/bin/bash
source /backup-scripts/env-data.sh
env
Via cron:
POSTGRES_USER=docker
POSTGRES_PASS=docker
Manually:
POSTGRES_USER=postgres
POSTGRES_PASS=example
pgenv.sh has:
export PGUSER=docker
export PGPASSWORD=docker
export PGPORT=5432
export PGHOST=db
export PGDATABASE=gis
export DUMPPREFIX=PG
export ARCHIVE_FILENAME=
while the documentation has other variable names
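A common workaround (a sketch, not the image's actual mechanism): snapshot the relevant variables at container start into a file and source it at the top of the cron-invoked script, since cron runs jobs with an almost empty environment. Values containing spaces or quotes would need stronger escaping than this sed:

```shell
#!/bin/bash
# Simulate container start: compose-provided values are in the environment.
export POSTGRES_USER=postgres POSTGRES_PASS=example

# Snapshot POSTGRES_* into a file that the cron job can source later.
printenv | grep '^POSTGRES_' | sed 's/^/export /' > /tmp/cron-env.sh

# Simulate the cron job: environment lost, then restored from the snapshot.
unset POSTGRES_USER POSTGRES_PASS
source /tmp/cron-env.sh
echo "${POSTGRES_USER} ${POSTGRES_PASS}"   # prints: postgres example
```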
Hello,
is this image good for backing up a plain Postgres database (instead of PostGIS)?
If yes, it would be good to state that explicitly in the README.
This repository comes up at the top of Google results for "docker postgres automatic backup".
Hi all
Thanks for the great repos!
I'm trying to set up postgres and postgres-backup and was wondering if postgres-backup also supports the POSTGRES_PASS_FILE environment variable to set a secret.
Kind regards