danisztls / arbie

ARBie is a friendly, automatic, robust backup script. It integrates battle-tested backup tools to provide encrypted, redundant cloud archiving.

License: Other

Shell 89.55% Makefile 10.45%
borg rclone backup gocryptfs sync

arbie's Introduction

Automatic Robust Backup

Automatic Robust Backup, or A.R.B., is an archiving and synchronization tool with automation, encryption, redundancy and performance as its goals. It is fast to deploy and provides pre-built use cases and sensible defaults. It is declarative and easy to customize.

About

Goals

  • Automation: after configuration it should not require intervention.
  • Encryption: neither a man-in-the-middle nor the server should be able to read the data.
  • Redundancy: data should not be lost even if a catastrophic failure happens.
  • Performance: tasks should be done in a timely manner and conserve limited resources, such as storage space, when possible.

Dependencies

  • Borg, the most popular chunk-based deduplication backup manager for home users.
  • Gocryptfs, the spiritual successor to Ecryptfs that is mature, audited and actively developed.
  • Rclone is a mature command line cloud storage manager that supports most if not all common providers.
  • Rsync is the best way to sync a directory to another local or network directory.
  • Git is the most popular version control system.
  • Pass, the standard UNIX password manager, or gopass, its actively developed rewrite in Go.

Features

  • Save space via Borg's fast and effective deduplication and compression.
  • Create daily, or even more frequent, archives and mount any of them to inspect the repository as it was at that point in time.
  • Set complex retention policies to control repository size while preserving time span coverage.
  • Ensure privacy and integrity of data stored in the cloud through Gocryptfs on-the-fly encryption.
  • Sync your data to over 40 cloud storage services, including all major providers with free tiers (Google Drive, Dropbox, OneDrive, Mega, etc).
  • Store secrets encrypted with a GPG key and versioned with Git.
  • Archive system configuration files and package list.
  • Archive your personal files and whatever files you wish.
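
The retention policies mentioned in the list above are expressed with borg prune. The following is a minimal sketch, assuming the repository is exported as BORG_REPO; the function name prune_archives, the DRY_RUN switch and the keep counts are illustrative and are not taken from arbie itself.

```sh
#!/bin/sh
# Hypothetical helper: prune old archives while keeping time-span coverage.
# The keep counts are example values, not arbie's defaults.
prune_archives() (
    daily=${1:-7} weekly=${2:-4} monthly=${3:-6}
    set -- borg prune --list \
        --keep-daily "$daily" --keep-weekly "$weekly" --keep-monthly "$monthly"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        # Default: print the command instead of running it.
        echo "$*"
    else
        "$@"
    fi
)
```

Calling prune_archives 7 4 6 prints the command; setting DRY_RUN=0 runs it. On Borg 1.2 and later, disk space is only released after a subsequent borg compact.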

Install

Packages

Make

make
make install

Configure

Pipelines

Edit .config/arbie/config to set up pipelines. Instructions and examples are included in the file.

Init

Some of the tools require manual initialization or configuration. In the future there will be a tool to partially automate those.

Pass

Init a password repository

gopass setup

Note: A GPG key is needed.

Generate a long secure password

pass generate $secret_name

Insert a password manually

pass insert $secret_name

Note: They will be needed later for encryption.

System

Init Git in System repository

git -C $repo_path init

Borg

Init Borg repository

borg init -e none $repo_path

Note: Encryption is done by gocryptfs.

Gocryptfs

Init reverse mode encryption in a dir

gocryptfs -extpass pass -extpass $secret_name -init -reverse $repo_path

Note: Reverse mode presents a plaintext directory as an encrypted view, with encrypted file contents and encrypted file and directory names, which is ideal for storing in the cloud.
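
Once the directory is initialized, the encrypted view has to be mounted before it can be synced. The sketch below assumes the same pass entry is used as in the init step; the function name mount_encrypted_view, the DRY_RUN switch and the argument order are hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: mount an encrypted view of a plaintext directory.
# Arguments: secret name in pass, plaintext directory, mount point.
mount_encrypted_view() (
    secret_name=$1 plain_dir=$2 mount_point=$3
    set -- gocryptfs -reverse -extpass pass -extpass "$secret_name" \
        "$plain_dir" "$mount_point"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        # Default: print the command instead of running it.
        echo "$*"
    else
        mkdir -p "$mount_point" && "$@"
    fi
)
```

With DRY_RUN=0 the view is mounted for real; fusermount -u followed by the mount point unmounts it again.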

Rclone

Configure streams

rclone config

Service

Enable the systemd timer as user

systemctl --user enable arbie.timer

By default it will try to run daily at midnight, and it will run immediately after login if a run was missed. You can edit the timer to make it run whenever you want, using a cron-like syntax.

systemctl --user edit arbie.timer

More information is available on the Arch Wiki: Systemd/Timers
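
As an illustration of what such an edit might contain, the sketch below writes a drop-in file by hand. It assumes the standard systemd user drop-in location; the function name write_timer_override and the 03:00 default are made up for the example, and the settings of the packaged arbie.timer are not reproduced here.

```sh
#!/bin/sh
# Hypothetical helper: write a drop-in that reschedules arbie.timer.
# $1: OnCalendar expression, $2: systemd user unit directory.
write_timer_override() (
    on_calendar=${1:-'*-*-* 03:00:00'}
    unit_dir=${2:-"${XDG_CONFIG_HOME:-$HOME/.config}/systemd/user"}
    dropin_dir=$unit_dir/arbie.timer.d
    mkdir -p "$dropin_dir" || exit 1
    cat > "$dropin_dir/override.conf" <<EOF
[Timer]
# An empty assignment clears the schedule inherited from the original unit.
OnCalendar=
OnCalendar=$on_calendar
# Catch up at the next opportunity if the machine was off at that time.
Persistent=true
EOF
)
```

After writing the file, systemctl --user daemon-reload makes systemd pick up the change.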

Maintenance

Borg

Before anything, export the repository path.

export BORG_REPO="$repo_path"

Show repository info

borg info

List archives

borg list

Mount an archive with FUSE

borg mount ::archiveName mountPoint
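
The list and mount steps can be combined to mount the most recent archive without typing its name. This is only a sketch: latest_archive and mount_latest are hypothetical helper names, and BORG_REPO is assumed to be exported as shown earlier.

```sh
#!/bin/sh
# Hypothetical helper: print the name of the most recent archive.
latest_archive() {
    borg list --last 1 --format '{archive}{NL}'
}

# Hypothetical helper: mount the most recent archive at the given mount point.
mount_latest() (
    mount_point=$1
    name=$(latest_archive) || exit 1
    [ -n "$name" ] || { echo "no archives found" >&2; exit 1; }
    mkdir -p "$mount_point" && borg mount "::$name" "$mount_point"
)
```

When done, borg umount followed by the mount point releases the FUSE mount.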

Caveats

Security

Security is a big concern. Rclone and Borg have their own encryption features, but following the principle of doing one thing and doing it well, encryption is delegated to Gocryptfs, which is exclusively an audited encrypted file system.

Note: The repeated and thus predictable header pattern of Borg files may be a vector for a sophisticated attack.

To each their own

There is no ideal backup method. But for most users their data can be classified in an ABC fashion: few files that they really can't lose; data with average volume and importance; voluminous but not important data. And each of these categories will have their own ideal methods.

Cloud synchronization

Cloud providers are a cheap way to have an off-site copy replicated in data centers globally. Some people have a limited Internet connection and may find it useful to sync a secondary archive, with higher compression and heavy use of exclusion patterns, to the cloud while keeping a full archive on premises.
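
A secondary archive of that kind could be created with a second Borg repository, a higher compression level and an exclude file. The sketch below only prints the command; the function name secondary_archive_cmd, the zstd level and every path are placeholders rather than arbie settings.

```sh
#!/bin/sh
# Hypothetical helper: print a borg create command for a small, heavily
# compressed secondary archive. Arguments: repository, exclude file, paths.
secondary_archive_cmd() (
    repo=$1 exclude_file=$2
    shift 2
    # {hostname} and {now:...} are placeholders expanded by Borg itself.
    echo borg create --compression zstd,19 \
        --exclude-from "$exclude_file" \
        "$repo::{hostname}-{now:%Y-%m-%d}" "$@"
)
```

The exclude file holds one Borg pattern per line, for example directories with large media files that are only kept in the full local archive.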

Granularity

File-based syncing is simpler, but the controlled granularity of chunk-based syncing is ideal. Syncing a few large files would be a pain because any modification would require re-uploading the whole file. On the other hand, syncing a great number of small files directly would exhaust the API request quota. Borg allows tuning the chunk size and performs well.

System backup

Reinstalling is faster and saner than doing whole-disk backups. It's more practical to back up the system configuration and a list of installed packages. After a fresh minimal install the user can run a script to recover the system settings. The advantages are: no need to restart; instantly done; no voluminous disk images or tar archives; high-granularity history of system changes.
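
Recording the package list might look like the sketch below. The function name snapshot_packages and the file name packages.txt are hypothetical, and only two package managers are covered as examples; this is not arbie's own implementation.

```sh
#!/bin/sh
# Hypothetical helper: write the list of explicitly installed packages
# into a directory such as the System repository. $1: target directory.
snapshot_packages() (
    repo_path=$1
    if command -v pacman >/dev/null 2>&1; then
        # Arch-based systems: explicitly installed packages, names only.
        pacman -Qqe > "$repo_path/packages.txt"
    elif command -v apt-mark >/dev/null 2>&1; then
        # Debian-based systems: manually installed packages.
        apt-mark showmanual > "$repo_path/packages.txt"
    else
        echo "unsupported package manager" >&2
        exit 1
    fi
)
```

Committing packages.txt to the Git repository after each run is what gives the history of system changes.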

Task automation

Desktops generally don't stay on 24/7, so there's a need for a tool that will reschedule missed tasks. Anacron does that, but unfortunately it would require the scripts to run as root. Systemd, on the other hand, allows the scripts to run in the user environment and provides its own logging, readable through Journalctl. Also, many distros now ship with only Systemd installed.

arbie's People

Contributors

danisztls

Forkers

heartshare

arbie's Issues

Security

One of the goals of making encrypted backups is protecting the user data in the cloud from scanning and leakages. But I'm not a security expert so I can't guarantee that. Much of the trust is placed on gocryptfs.

Security by obscurity, as in hiding what you use, is tempting, as it's a real possibility that an attacker could find an exploit while exploring this repo. But the odds are as low as the odds of one making a return on such a time investment.

Cloud storage is not safe as is. And if current trends continue, it shouldn't be long until the vulnerability industry reaches retail level.

This script runs locally on the machine and does not require superuser privileges or open ports. As I see it, exploiting any vulnerability would require access to the local system or to the data in the cloud, which is pretty much game over anyway if you are doing nothing to protect them.

System: Automatically exclude protected files from rsync

When doing a system backup, rsync will fail when trying to sync files without read permissions, and I don't know of any way of making it ignore those automatically. The current workaround is to add those manually to an ignore file. That's just not practical; instead I want to automatically traverse /etc, find those files and update the rsync ignore file.
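
One possible way to do that traversal is sketched below. It assumes GNU find, since -readable and -printf are GNU extensions; the function names unreadable_excludes and sync_readable are hypothetical. File names containing rsync wildcard characters such as * or [ would be interpreted as patterns, so this is a starting point rather than a complete solution.

```sh
#!/bin/sh
# Hypothetical helper: list entries under $1 that the current user cannot
# read, as rsync exclude patterns anchored at the root of the transfer.
# Symlinks are skipped; unreadable directories are not descended into.
unreadable_excludes() {
    find "$1" -mindepth 1 ! -type l ! -readable -prune -printf '/%P\n'
}

# Hypothetical helper: sync $1 into $2, skipping everything listed above.
sync_readable() (
    src=${1%/} dest=$2
    ignore_file=$(mktemp) || exit 1
    unreadable_excludes "$src" > "$ignore_file"
    rsync -a --exclude-from="$ignore_file" "$src/" "$dest/"
    status=$?
    rm -f "$ignore_file"
    exit "$status"
)
```

For example, sync_readable /etc followed by a destination inside the System repository regenerates the ignore list on every run, so it never goes stale.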

Alternatively

Find a less fragile alternative to rsync.

Syncing performance

Note: Syncing is currently serialized. Shell is not the best language for writing a daemon with parallel asynchronous operations, and this is a glorified backup script that relies on good CLI tools like rclone to do much of the work.

I'm currently not satisfied with sync performance. By default, Rclone will attempt to sync to a stream 3 times in case of errors. Ideally it would skip to the next stream, move the problematic stream to the end of the queue, and also skip a stream with a low upload rate. When I wrote the script, Rclone did not support multi-streams, so this is currently thinly abstracted by the wrapper script.

  • Does Rclone support a feature similar to multi-streams?
  • Improve the syncing wrapper

The latter should be a proper daemon running decoupled from the main script and also able to support tools other than Rclone.
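
The skip-and-requeue behaviour described in this issue could be prototyped as below. This is a serialized sketch, not the proposed daemon: sync_streams, sync_one, SYNC_CMD, SRC_DIR and MAX_ATTEMPTS are hypothetical names, stream names are assumed to contain no whitespace, and the low-upload-rate check is left out.

```sh
#!/bin/sh
# Hypothetical default sync call: a single attempt per invocation, so that
# retrying is decided by the queue below instead of by Rclone.
sync_one() {
    rclone sync --retries 1 "$SRC_DIR" "$1"
}

# Try each stream in turn; a failing stream goes to the back of the queue
# and is dropped after MAX_ATTEMPTS failed tries.
sync_streams() (
    max=${MAX_ATTEMPTS:-3}
    queue=$*
    attempts=""     # one entry is appended per failed try
    failed=""
    while [ -n "$queue" ]; do
        set -- $queue
        stream=$1
        shift
        queue=$*
        if "${SYNC_CMD:-sync_one}" "$stream"; then
            echo "ok $stream"
            continue
        fi
        attempts="$attempts $stream"
        count=0
        for s in $attempts; do
            [ "$s" = "$stream" ] && count=$((count + 1))
        done
        if [ "$count" -lt "$max" ]; then
            queue="$queue $stream"      # retry after the other streams
        else
            echo "failed $stream"
            failed="$failed $stream"
        fi
    done
    # Succeed only if no stream was given up on.
    [ -z "$failed" ]
)
```

Setting SYNC_CMD to another function name is how a tool other than Rclone could be plugged in.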

Man pages

Is there a tool better than mandown? I want to write a hook to update man pages automatically, and if a manual review is needed to remove badly formatted parts, I prefer not to have man pages at all, as the markdown notes are currently enough. In the future, online documentation might be more useful, especially for new users evaluating the usefulness of the software.
