
Comments (3)

Istador commented on June 18, 2024

How long does the backup take approximately?

Why do you have a BACKUP_CMD variable, if you just execute it one line later once?
Wouldn't it be easier to just execute the command directly without the variable?


I'd suggest cleanly stopping the docker container for a short moment (so that the database files on the filesystem are in a consistent state), creating a snapshot of the filesystem, starting the docker container again so that the server is running, backing up the database files from the snapshot, and then removing the snapshot.

That way the downtime is very short. The snapshot needs some extra space to track all file changes between its creation and deletion, but unless you are doing an osm2pgsql append at the same time, that shouldn't grow too big.
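
The sequence above could be sketched roughly like this, assuming the database files live on an LVM logical volume. The container name, volume group, logical volume, snapshot size, mount point and target directory are all hypothetical placeholders, and the script only prints the commands unless DRY_RUN=0 is set:

```shell
#!/usr/bin/env bash
# Sketch only: every name below is a placeholder, not taken from the original script.
CONTAINER="${CONTAINER:-osm-tile-server}"  # hypothetical container name
VG="${VG:-vg0}"                            # hypothetical LVM volume group
LV="${LV:-docker-data}"                    # hypothetical logical volume with the database files
SNAP="${SNAP:-osm-backup-snap}"            # hypothetical snapshot name
SNAP_SIZE="${SNAP_SIZE:-20G}"              # space for changes made while the snapshot exists
MNT="${MNT:-/mnt/osm-backup-snap}"         # hypothetical mount point
DEST="${DEST:-/backup/osm-db}"             # hypothetical backup target
DRY_RUN="${DRY_RUN:-1}"                    # 1 = only print the commands

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

snapshot_backup() {
  # Downtime starts here: stop the container so the files are consistent.
  run docker stop "$CONTAINER" || return 1
  run lvcreate --snapshot --size "$SNAP_SIZE" --name "$SNAP" "/dev/$VG/$LV"
  local snap_status=$?
  # Downtime ends here: restart the server even if the snapshot failed.
  run docker start "$CONTAINER" || return 1
  [ "$snap_status" -eq 0 ] || return 1

  # Copy from the read-only snapshot while the server is running again.
  # (On XFS the mount usually also needs -o nouuid.)
  run mkdir -p "$MNT" "$DEST" &&
    run mount -o ro "/dev/$VG/$SNAP" "$MNT" &&
    run cp -a "$MNT/." "$DEST/"
  local copy_status=$?

  # Always try to clean up, then report whether the copy succeeded.
  run umount "$MNT"
  run lvremove --yes "/dev/$VG/$SNAP"
  return "$copy_status"
}

snapshot_backup
```

With the defaults this just lists the steps in order, so the plan can be checked before running it for real as root with DRY_RUN=0.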


There are probably some compression algorithms that benefit from multithreading.
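
As a rough sketch of what that could look like: pipe tar into a multithreaded compressor if one is installed, and fall back to plain gzip otherwise. The function names are made up for this example, and which compressor is actually fastest for this data is not something I have measured:

```shell
#!/usr/bin/env bash
# Pick a multithreaded compressor if one is installed, otherwise use gzip.
pick_compressor() {
  if command -v zstd >/dev/null 2>&1; then
    echo "zstd -T0"   # -T0: use as many threads as there are detected cores
  elif command -v pigz >/dev/null 2>&1; then
    echo "pigz"       # parallel gzip, uses all cores by default
  else
    echo "gzip"       # single-threaded, same as tar -z
  fi
}

# compress_dir SRC_DIR OUT_FILE
# Archives SRC_DIR with tar and compresses the stream with the chosen tool.
compress_dir() {
  local src="$1" out="$2" compressor
  compressor="$(pick_compressor)"
  # $compressor is intentionally unquoted so that "zstd -T0" splits
  # into the command and its flag.
  tar -cf - -C "$src" . | $compressor > "$out"
}
```

Usage would be something like `compress_dir /path/to/database /backup/database.tar.zst` (both paths are placeholders; pick the file extension to match the compressor that gets selected).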

Usually, I try to avoid compression when dealing with backups (storage is cheap) and instead keep exact mirrors (plural), compressing only the transfer of them to different machines (validated via checksums).

That way the contents can be compared and verified fast (without decompressing), only the changes need to be transferred, and data corruption only has a local effect on the data and doesn't render the whole backup corrupt.
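
A minimal sketch of the checksum part, assuming GNU coreutils and findutils: record a SHA-256 manifest for the source directory and check each mirror against it. The function names are hypothetical; the transfer itself could be done with something like `rsync -a --delete --compress`, which is not shown here.

```shell
#!/usr/bin/env bash
# write_manifest DIR MANIFEST
# Records a SHA-256 checksum for every file under DIR.
# MANIFEST must be an absolute path outside of DIR.
write_manifest() {
  local dir="$1" manifest="$2"
  ( cd "$dir" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$manifest"
}

# verify_mirror MIRROR_DIR MANIFEST
# Checks every file in the mirror against the recorded checksums.
# Returns non-zero and names the affected files if any of them differ.
verify_mirror() {
  local mirror="$1" manifest="$2"
  ( cd "$mirror" && sha256sum --quiet -c "$manifest" )
}
```

If a single file in a mirror gets corrupted, only that file fails the check, and it can be fetched again from another mirror without touching the rest.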

(Modern compression formats can usually store redundant recovery data (error correction codes) to lower the risk that a few errors corrupt the whole compressed file; gzip, as used by tar -z, does not! For backups I still find this too risky to rely on, unless you can tune the percentage of that redundancy to your own liking, e.g. to 20-50% or more. Some tools/algorithms offer that option.)


Edit: PostgreSQL: Documentation: 10: Chapter 25. Backup and Restore

from openstreetmap-tile-server.

stevo01 commented on June 18, 2024

How long does the backup take approximately?

Around 20 hours. That's faster than an import of the planet, but too slow for a backup process.

Why do you have a BACKUP_CMD variable, if you just execute it one line later once?
Wouldn't it be easier to just execute the command directly without the variable?

Yes, it's just a copy from another (more complex) script, and I forgot to adapt it.

I'd suggest cleanly stopping the docker container for a short moment (so that the database files on the filesystem are in a consistent state), creating a snapshot of the filesystem, starting the docker container again so that the server is running, backing up the database files from the snapshot, and then removing the snapshot.

I did not find a snapshot function for docker volumes. Can you provide me with details?

There are probably some compression algorithms that benefit from multithreading.

Thanks for the hint. I found this article with some benchmarks:
https://www.peterdavehello.org/2015/02/use-multi-threads-to-compress-files-when-taring-something/

Edit: PostgreSQL: Documentation: 10: Chapter 25. Backup and Restore

I think that's a good approach for the database content, and it allows a backup while the container is running. It would also allow reusing an existing docker container solution, e.g. https://hub.docker.com/r/prodrigestivill/postgres-backup-local
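
A rough sketch of how such a dump could be taken from the running container with pg_dump. The container name is a placeholder, and the database user `renderer` and database name `gis` are assumptions that need to be checked against the actual setup. By default the command is only printed; set DRY_RUN=0 to run it:

```shell
#!/usr/bin/env bash
CONTAINER="${CONTAINER:-osm-tile-server}"  # hypothetical container name
DB_USER="${DB_USER:-renderer}"             # assumption, check your setup
DB_NAME="${DB_NAME:-gis}"                  # assumption, check your setup
DRY_RUN="${DRY_RUN:-1}"                    # 1 = only print the command

# dump_database OUT_FILE
# Writes a custom-format dump (-Fc) to OUT_FILE on the host.
# The server keeps running; the dump is restored with pg_restore.
dump_database() {
  local out="$1"
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ docker exec $CONTAINER pg_dump -U $DB_USER -Fc $DB_NAME > $out"
  else
    docker exec "$CONTAINER" pg_dump -U "$DB_USER" -Fc "$DB_NAME" > "$out"
  fi
}

dump_database "gis-$(date +%F).dump"
```

Whether this ends up faster than the 20 hours of the file-level backup is something I have not measured.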


Istador commented on June 18, 2024

I did not find a snapshot function for docker volumes. Can you provide me with details?

Snapshots are not a function of docker, but might be offered by your underlying volume manager (LVM) or file system (e.g. Btrfs, NSS, NTFS) on your host.

