ddebeau / zfs_uploader
Simple program for backing up full and incremental ZFS snapshots to Amazon S3.
License: MIT License
Support sending encrypted (raw) snapshots.
logfmt is easier to read than the current format and is very easy to add information to.
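For illustration, a logfmt record is just a single line of key=value pairs, e.g. (the field names here are made up, not the program's actual output):
time=2021-06-07T02:00:00Z level=INFO event=backup_started filesystem=zfspool snapshot=20210607_020000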
ZFSjob should fail early if any of the required arguments are None.
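A minimal sketch of what that fail-early check could look like (the argument names are illustrative, not necessarily ZFSjob's real signature):

def __init__(self, bucket_name, access_key, secret_key, file_system):
    # Fail at construction time so a misconfigured job is caught immediately,
    # not hours later when the cron trigger fires.
    required = {'bucket_name': bucket_name, 'access_key': access_key,
                'secret_key': secret_key, 'file_system': file_system}
    missing = [name for name, value in required.items() if value is None]
    if missing:
        raise ValueError(f'Missing required arguments: {", ".join(missing)}')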
For file systems with many changes, the incremental backups will continue to grow in size until a new full backup is taken. We should add the ability to take more than one full backup and prune full backups.
Error:
cat 20210607_020000.inc | zfs receive <filesystem>@20210607_020000
cannot receive incremental stream: destination <filesystem> has been modified
since most recent snapshot
zfsup restore <filesystem> 20210607_020000
doesn't return anything. If you wrap the restore command in a try/except statement you'll get a BrokenPipeError. We'll want to catch that and let the user know what the problem likely is. The -F option forces a rollback to the most recent snapshot. We should add an option for that.
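A rough sketch of how the restore path could surface this, assuming the downloaded stream is written into zfs receive through a pipe (names are illustrative, not the actual job code):

import subprocess

def receive_stream(chunks, filesystem, backup_time, force_rollback=False):
    # -F rolls the destination back to its most recent snapshot before receiving.
    cmd = ['zfs', 'receive']
    if force_rollback:
        cmd.append('-F')
    cmd.append(f'{filesystem}@{backup_time}')
    zfs = subprocess.Popen(cmd, stdin=subprocess.PIPE)
    try:
        for chunk in chunks:
            zfs.stdin.write(chunk)
        zfs.stdin.close()
    except BrokenPipeError:
        # zfs receive exited early, most likely because the destination was
        # modified since its most recent snapshot.
        raise RuntimeError('zfs receive rejected the stream. The destination '
                           'has probably been modified since the most recent '
                           'snapshot; retry with the rollback (-F) option.')
    zfs.wait()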
A basic CLI should be added that has the following functions:
First of all, thank you very much for sharing this project! I'm a newcomer to ZFS and I was able to get started in minutes thanks to the excellent documentation!
As I said before, I'm fairly new to ZFS. I've managed to put together a server with the following datasets:
$ zfs list
NAME USED AVAIL REFER MOUNTPOINT
zfspool 244G 5.09T 467M /zfspool
zfspool/bulkstorage 145G 5.09T 145G /zfspool/bulkstorage
zfspool/vm-100-disk-0 3M 5.09T 120K -
zfspool/vm-100-disk-1 33.0G 5.12T 3.30G -
zfspool/vm-102-disk-0 66.0G 5.13T 20.6G -
zfspool/vm-102-disk-1 3M 5.09T 120K -
My config file looks like this (with redacted info styled <like this>):
[DEFAULT]
bucket_name = <bucket>
region = <region>
access_key = <access key>
secret_key = <secret key>
storage_class = STANDARD
endpoint = <endpoint>
[zfspool]
cron = 0 2 * * *
max_snapshots = 7
max_incremental_backups_per_full = 6
max_backups = 7
I naively expected zfs_uploader to recursively snapshot and upload backups of each dataset within zfspool; however, it only did so for the data stored directly in the zfspool mount point that was not part of any of the children.
Is this supported by zfs_uploader, or do I need to specify each dataset manually? Ex. [zfspool/bulkstorage], [zfspool/vm-100-disk-0], etc.
Also related, I did find that ZFS supports recursive snapshots, but I haven't tried it yet: https://docs.oracle.com/cd/E19253-01/819-5461/gdfdt/index.html
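If per-dataset sections are the way to go, the config would presumably just repeat the pattern above for each child dataset, e.g. (untested sketch):

[zfspool/bulkstorage]
cron = 0 2 * * *
max_snapshots = 7
max_incremental_backups_per_full = 6
max_backups = 7

[zfspool/vm-100-disk-0]
cron = 0 3 * * *
max_snapshots = 7
max_incremental_backups_per_full = 6
max_backups = 7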
The following traceback occurs when uploading large snapshots:
botocore.exceptions.ClientError: An error occurred (InvalidArgument) when calling the UploadPart operation: Part number must be an integer between 1 and 10000, inclusive
The error is caused when the part number limit (10,000) is reached. We'll need to adjust part size ourselves instead of letting Boto do it.
https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html
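A sketch of how the part size could be computed up front with boto3's TransferConfig, assuming the (approximate) snapshot size is known before the upload starts (the helper name is made up):

import math
from boto3.s3.transfer import TransferConfig

MAX_PARTS = 10000              # S3 allows at most 10,000 parts per multipart upload.
MIN_PART_SIZE = 5 * 1024 ** 2  # S3's minimum part size is 5 MiB.

def transfer_config_for(file_size):
    # Choose a part size large enough that the whole stream fits in MAX_PARTS parts.
    part_size = max(MIN_PART_SIZE, math.ceil(file_size / MAX_PARTS))
    return TransferConfig(multipart_chunksize=part_size)

The resulting config can then be passed to the upload call via its Config argument.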
Build package using GitHub Actions. We'll also want to establish minimum package versions.
Hi @ddebeau, I am seeing in my AWS console that the incrementals created are based on the full backup rather than on the previous incremental.
If you look at my backups, they are getting bigger and bigger, duplicating a lot of data. I can't create a new full backup since it needs to be stored for 6 months due to Deep Archive constraints. I am open to creating a PR to solve this, but I wanted your opinion on the subject.
This affects:
The savings would be massive over time.
Thanks
Hello!
Are there any plans to implement encryption of snapshots for unencrypted pools / data sources before uploading to S3?
Example:
zfs send pool/dataset@snapshot-name | gpg --symmetric --cipher-algo AES256 -o /path/to/encrypted-snapshot.gpg
It would improve the readability and organization of the code to move the backup_info functions to their own object. It would also allow us to better support cross-compatibility with old and new backup_info formats.
zfs_uploader/zfs_uploader/job.py
Lines 149 to 179 in ccfa4b6
Hi, I'm the author of https://github.com/andaag/zfs-to-glacier/ and I occasionally go hunting for alternative versions, to see if someone has invested more in this problem than me and I can stop maintaining this.
One thing I see you are missing is choosing the storage class depending on size. I have a fairly ugly hack here: https://github.com/andaag/zfs-to-glacier/blob/main/src/main.rs#L95 to adjust the storage class to standard in cases where the file size is too small for Glacier. (In these cases you pay a premium, and it's actually cheaper with standard storage.)
Glacier's minimum billable size is 128 KB. Quite a lot of incremental backups can be 1 KB, so Standard storage is a very clear winner!
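That logic is small enough to sketch directly; in Python it might look like this (the 128 KB threshold is the one mentioned above, and the names are illustrative):

GLACIER_MIN_BILLABLE = 128 * 1024  # bytes

def storage_class_for(size, preferred='DEEP_ARCHIVE'):
    # Objects below the minimum billable size are billed as 128 KB in the
    # Glacier classes, so plain STANDARD is cheaper for tiny incrementals.
    return 'STANDARD' if size < GLACIER_MIN_BILLABLE else preferred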
Convention is to use None for optional arguments and set the default values in code.
zfs_uploader/zfs_uploader/job.py
Lines 74 to 76 in ccfa4b6
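In other words, something along these lines (the values are illustrative only):

def __init__(self, file_system, max_snapshots=None):
    # Optional arguments default to None in the signature; the real default
    # lives here, so callers can pass None to mean "use the default".
    self._max_snapshots = 7 if max_snapshots is None else max_snapshots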
misfire_grace_time is currently set to 2 hours, which means that if a job takes longer than 2 hours, other jobs may not run on schedule. It should be set to None so that late jobs always run. We should always err on the side of backing up early.
zfs_uploader/zfs_uploader/__main__.py
Line 63 in 585e3ae
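With APScheduler this amounts to passing misfire_grace_time=None when the job is added, e.g. (scheduler setup abbreviated; job.start stands in for the ZFSjob method):

from apscheduler.schedulers.blocking import BlockingScheduler

scheduler = BlockingScheduler()
# misfire_grace_time=None means a job that misses its slot is still run
# as soon as possible instead of being skipped.
scheduler.add_job(job.start, 'cron', hour=2, minute=0, misfire_grace_time=None)
scheduler.start()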
Code like snapshot_name = f'{self._file_system}@{backup_time}' and list_snapshots() should really be part of a greater SnapshotDB object so we can document and reuse code efficiently. The Backup object should also have a snapshot parameter so we can easily restore from backup.
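A rough sketch of what such an object could look like (the method names are hypothetical):

import subprocess

class SnapshotDB:
    """Thin wrapper around the zfs snapshot/list commands for one file system."""

    def __init__(self, file_system):
        self._file_system = file_system

    def create_snapshot(self, backup_time):
        name = f'{self._file_system}@{backup_time}'
        subprocess.run(['zfs', 'snapshot', name], check=True)
        return name

    def snapshot_names(self):
        out = subprocess.run(
            ['zfs', 'list', '-H', '-o', 'name', '-t', 'snapshot',
             '-r', self._file_system],
            check=True, capture_output=True, text=True)
        return out.stdout.splitlines()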
Are there any plans to add a sort of pruning schedule? Similar to borg's prune command with the --keep-hourly, --keep-daily, etc. options.
When setting up zfs_uploader against Scaleway Object Storage (specifically their GLACIER tier), everything worked as expected except for one caveat: the max part number on Scaleway is 1,000 rather than the 10,000 used by AWS.
This resulted in an error when uploading with the default setup, since it calculated the part sizes based on 10,000 parts and eventually failed due to exceeding Scaleway's limit of 1,000 parts.
I resolved this for my use case by simply modifying a number in job.py (Erisa@20ed42f); however, I feel that going forward it would be a good idea to allow configuration of this value in the zfs_uploader configuration file and to document it in the README.
You could also detect and change the value based on predefined provider limits, but it would still be nice to have it in a user-configurable place.
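For example, a hypothetical option in the [DEFAULT] section (the name max_multipart_parts is made up here, not an existing setting):

[DEFAULT]
max_multipart_parts = 1000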
Implement flake8 testing so we can enforce style with every merge.
When restoring an incremental backup we shouldn't restore the full backup if the snapshot used for the full backup still exists on the system.
zfs_uploader/zfs_uploader/job.py
Lines 126 to 130 in fd89459
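The check could be fairly small, e.g. (attribute and helper names here are illustrative, not the actual job code):

def restore(self, backup):
    # Only restore the full backup first if the snapshot it was taken from is
    # no longer on the system; otherwise receive the incremental directly.
    full_snapshot = f'{self._file_system}@{backup.dependency}'
    if full_snapshot not in self._list_snapshot_names():
        self._restore_full(backup.dependency)
    self._restore_incremental(backup)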
CLI errors like "No configuration file found." should just print the statement and exit 1.
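i.e. roughly (config_path is a stand-in for however the CLI locates the file):

import sys

if config_path is None:
    print('No configuration file found.')
    sys.exit(1)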
We should be able to run the test suite in the standard GitHub Actions Linux runner. zpool can create pools from files.
The CLI entrypoint is executable-name instead of zfsup, and the command returns this traceback:
executable-name --version
Traceback (most recent call last):
File "<redacted>/.env/zfs_uploader/bin/executable-name", line 5, in <module>
from zfs_uploader.__main__ import main
ImportError: cannot import name 'main' from 'zfs_uploader.__main__' (<redacted>/.env/zfs_uploader/lib/python3.7/site-packages/zfs_uploader/__main__.py)
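Both symptoms point at the console-script entry point in the packaging metadata; it presumably needs the zfsup name and the callable suffix, e.g. in setup.py (sketch, not the repo's actual file):

from setuptools import setup, find_packages

setup(
    name='zfs_uploader',
    packages=find_packages(),
    entry_points={
        'console_scripts': [
            # The entry must name the callable: 'name = module:function'.
            'zfsup = zfs_uploader.__main__:main',
        ],
    },
)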
If the dataset being backed up has a compression property set to anything other than off, the default behaviour of zfs send is to decompress on the fly and send the full uncompressed dataset.
Simply by adding the -c, --compressed flag to zfs send, the stream is instead sent compressed and takes up significantly less space on the remote. In my case this reduced a full backup of a PostgreSQL database from 56 GB to 24 GB.
I added this flag to my personal fork in Erisa@c192333 and noticed no regressions or repercussions; however, since users may not always have their dataset set to compress, or may not want this behaviour to change across versions, I believe the best way forward would be to add a zfs_uploader config variable that enables this compressed flag.
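A sketch of how the send side could honour such a config variable (the send_compressed name is hypothetical):

import subprocess

def open_send_stream(snapshot_name, send_compressed=False):
    # 'zfs send -c' sends blocks as they are stored on disk, so compressed
    # datasets are not decompressed and re-sent uncompressed.
    cmd = ['zfs', 'send']
    if send_compressed:
        cmd.append('-c')
    cmd.append(snapshot_name)
    return subprocess.Popen(cmd, stdout=subprocess.PIPE)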
Error Message:
Run time of job "ZFSjob.start (trigger: cron[month='*', day='*', day_of_week='*', hour='6', minute='2'], next run at: 2022-02-07 06:02:00 UTC)" was missed by 0:00:28.825414
Setting misfire_grace_time to None for all jobs may fix the issue.
zfs_uploader/zfs_uploader/__main__.py
Line 62 in af5868f
Restoring to a new filesystem would help with testing.
We should move the S3 upload code to BackupDB so that it works like SnapshotDB in that it can create the objects it references. The job code should handle the complicated stuff that involves backups and snapshots.
zfs_uploader/zfs_uploader/job.py
Lines 137 to 152 in 74e469d
We should log upload progress and speed so we'll know if the upload is stalled or not.
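boto3's upload calls accept a Callback that is invoked with the number of bytes transferred in each chunk, which is enough to log progress and a rough average speed, e.g. (sketch):

import time

class UploadProgressLogger:
    """boto3 upload callback that logs bytes sent and average speed."""

    def __init__(self, logger, total_size):
        self._logger = logger
        self._total = total_size
        self._sent = 0
        self._start = time.monotonic()

    def __call__(self, bytes_transferred):
        self._sent += bytes_transferred
        elapsed = max(time.monotonic() - self._start, 1e-6)
        speed = self._sent / elapsed / 1024 ** 2
        self._logger.info('uploaded %d/%d bytes (%.1f MiB/s)',
                          self._sent, self._total, speed)

An instance can then be passed to the upload call via its Callback argument.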
Can this be installed/used on FreeBSD?
ERROR: For req: zfs_uploader. Invalid script entry point: <ExportEntry zfsup = zfs_uploader.__main__:None []> - A callable suffix is required. Cf https://packaging.python.org/specifications/entry-points/#use-for-scripts for more information.
Right now only the backup command is logging.