Issues of qubes-incremental-backup-poc from v6ak

Reconsider use of check_output for VM calls

A VM call can return a huge amount of data, DoSing dom0. With a faulty implementation (which would probably require an incorrect buffer operation, so it is not very likely), it could even overflow a buffer.
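One way to bound the damage is to read at most a fixed number of bytes from the VM call and abort when the limit is exceeded. This is only a minimal sketch, not the project's code: `bounded_check_output`, `OutputTooLarge` and the size limit are hypothetical names chosen for illustration.

```python
import subprocess


class OutputTooLarge(Exception):
    """Raised when the child writes more than the allowed number of bytes."""


def bounded_check_output(cmd, max_bytes):
    """Run cmd and return its stdout, refusing more than max_bytes bytes."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
        # Read one byte past the limit so that an overflow is detectable.
        data = proc.stdout.read(max_bytes + 1)
        if len(data) > max_bytes:
            proc.kill()
            raise OutputTooLarge("output exceeds %d bytes" % max_bytes)
    finally:
        proc.stdout.close()
        returncode = proc.wait()
    if returncode != 0:
        raise subprocess.CalledProcessError(returncode, cmd)
    return data
```

Unlike `subprocess.check_output`, this never holds more than `max_bytes + 1` bytes of untrusted output in memory.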

Review usage of utf-8

Two different encodings are used, ASCII and UTF-8. ASCII is much simpler, but UTF-8 might be needed in some cases (e.g. passphrase with national characters).

When UTF-8 is parsed from an untrusted source, it adds some attack surface. So I should review whether UTF-8 is used only where the input is not controlled by an attacker. (I believe so, but it should still be verified.)
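For data that comes from an untrusted VM, a strict ASCII decoder keeps the parsing surface small. The sketch below is an illustration under assumptions: the function name `decode_untrusted_ascii` and the length limit are hypothetical and not taken from the project.

```python
def decode_untrusted_ascii(raw, max_len=255):
    """Decode bytes from an untrusted source, accepting printable ASCII only."""
    if len(raw) > max_len:
        raise ValueError("input too long")
    # bytes.decode("ascii") raises UnicodeDecodeError for any byte >= 0x80.
    text = raw.decode("ascii")
    # Reject control characters as well (only 0x20..0x7E is accepted).
    if not all(32 <= ord(ch) < 127 for ch in text):
        raise ValueError("non-printable character in input")
    return text
```

UTF-8 decoding would then be reserved for trusted input such as a passphrase typed in dom0.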

Add versioning and release 0.1

Backup VM metadata

Firewall rules, netvm, storage size, template, …

Create integration tests for some basic backup and restore scenarios

Store basic metadata with each subbackup

e.g. master password salt, bcrypt parameters etc.

This metadata is public and should be kept with each backup, so that the backup can be restored without any further metadata.
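A possible shape for such public metadata is a small JSON document stored next to each subbackup. This is a sketch only: the field names (`version`, `kdf`, `kdf_params`, `salt`) and both function names are assumptions, not the project's actual format.

```python
import base64
import json


def dump_public_metadata(salt, kdf_name, kdf_params):
    """Serialise public (non-secret) key derivation metadata to JSON text."""
    doc = {
        "version": 1,
        "kdf": kdf_name,
        "kdf_params": kdf_params,
        # The salt is binary, so it is stored as base64 text.
        "salt": base64.b64encode(salt).decode("ascii"),
    }
    return json.dumps(doc, sort_keys=True)


def load_public_metadata(text):
    """Parse the JSON text back into (salt, kdf_name, kdf_params)."""
    doc = json.loads(text)
    return base64.b64decode(doc["salt"]), doc["kdf"], doc["kdf_params"]
```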

Make password prompts more friendly

A bad password does not trigger a new prompt
A new password is not requested twice for confirmation
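Both points could be handled by small retry loops around the prompt. The sketch below assumes hypothetical helper names and a hypothetical `check` callback that verifies a passphrase; the `prompt` parameter defaults to `getpass.getpass` and is injectable so the logic can be tested without a terminal.

```python
import getpass


def ask_new_passphrase(prompt=getpass.getpass, max_attempts=3):
    """Ask for a new passphrase twice and return it once both entries match."""
    for _ in range(max_attempts):
        first = prompt("New passphrase: ")
        second = prompt("Repeat passphrase: ")
        if first and first == second:
            return first
        print("Passphrases are empty or do not match, try again.")
    raise ValueError("too many failed attempts")


def ask_existing_passphrase(check, prompt=getpass.getpass, max_attempts=3):
    """Ask again after a bad passphrase instead of failing immediately."""
    for _ in range(max_attempts):
        candidate = prompt("Passphrase: ")
        if check(candidate):
            return candidate
        print("Wrong passphrase, try again.")
    raise ValueError("too many failed attempts")
```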

Provide instructions for how to use this for “production” (practical) use

I would like to use this to back up my personal laptop. How can I do that?

Make the code more asynchronous

The main benefit I can see is starting the DVM while entering the password.

Study Python parallelism model (and memory model)
Study Python Futures
Make it async where reasonable
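As a starting point for the Futures item, the DVM start could be submitted to a thread pool while the main thread waits for the passphrase. This is a sketch under assumptions: `start_dvm` and `ask_passphrase` are hypothetical callables standing in for the real operations.

```python
from concurrent.futures import ThreadPoolExecutor


def run_overlapped(start_dvm, ask_passphrase):
    """Start the DVM in the background while the user types the passphrase."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        dvm_future = pool.submit(start_dvm)  # runs in a worker thread
        passphrase = ask_passphrase()        # blocks the main thread only
        dvm = dvm_future.result()            # waits for the DVM if needed
    return dvm, passphrase
```

The total waiting time is then roughly the longer of the two operations instead of their sum.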

Implement backup agent for DVM

Sync in more universal way

Calling sync in the VM's shell is not very universal. Theoretically, it can also do something completely different. So we should use some RPC endpoint for FS sync when it is implemented.

This is a follow-up to #13.

Support storing backups in multiple stores

This can currently be achieved by having separate backups handled through separate BackupStorageVMs, but that is not very convenient.

Should we use one BackupStorageVM, or multiple BackupStorageVMs?
Consider Tahoe-LAFS. I don't know how suitable it is.

Related issue in another project: duplicati/duplicati#1265, duplicati/duplicati#479

Elaborate: Global exclude lists

When ignoring some files, it would be useful to ignore them in:

a. all VMs
b. all VMs based on a particular template

It would be useful to have global exclude lists that could do this instead of hardcoding such files in scripts.
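The two scopes could be represented as one configuration structure that is resolved per VM. The layout below (keys `all` and `by_template`) and the function name are assumptions made for this sketch, not an existing format.

```python
def effective_excludes(config, template_name):
    """Combine global and per-template exclude patterns for one VM."""
    patterns = list(config.get("all", []))
    patterns += config.get("by_template", {}).get(template_name, [])
    # Drop duplicates while keeping the original order.
    seen = set()
    result = []
    for pattern in patterns:
        if pattern not in seen:
            seen.add(pattern)
            result.append(pattern)
    return result
```

For example, with `{"all": ["*.cache"], "by_template": {"fedora": ["/var/tmp"]}}`, a VM based on the `fedora` template gets both patterns and any other VM gets only `*.cache`.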

Add support for qvm-backup

This should not be hard and should handle various edge cases of VMs we cannot currently back up.

Currently, we use Duplicity. It was not carefully chosen as the best option; I simply have some experience with it, even though I originally chose it for quite a different scenario. So I am collecting information about backup backends in order to decide well: https://docs.google.com/spreadsheets/d/1rUXn8VkR5nrrtDhywKBpNu2zuTHzOHDX6F053ynBSjw/edit?usp=sharing

Legend for features:

Green feature: great
Red feature: deal-breaker
Orange feature: somewhat suboptimal
Blue feature: not yet known
Grey feature: not evaluated because the backend DNF (it has some serious issue mentioned in another column)

Legend for first column:

Green backend: usable
Orange backend: some suspicion it is not usable
Red backend: DNF

What do we want:

Don't map input files directly to files in the backup. This leaks too much metadata and might place too high demands on the storage backend (attributes, filenames, …).
Compression is good for reducing data volume, but it is also a potential side channel. One should be able to turn it off.
Encryption and authentication are useful in the short term. In the long term, we might want to use a Merkle tree in order to authenticate all the data together, so we will probably move responsibility for encryption and authentication off the backup backend. Nevertheless, I have still briefly evaluated encryption and authentication for the short term. I have not verified backup-volume-level malleability, because this does not look like a feasible attack surface.
We want an easy way to add a custom storage backend. It should require implementing as few methods as possible. The more complex it gets, the harder it will be to integrate.

We will want at least one file-based backend and one block-based backend (qvm-backup or similar).

If you can fill in some missing info or suggest another great backend, write it here, please!

Implement BackupStorageVM

Rewrite cryptopunk

Crypto is currently handled by the openssl CLI. (Except for scrypt, which calls Perl.) See the cryptopunk.py file. This is a high-latency solution that makes some assumptions (the attacker can't read /proc, which is justifiable in dom0) that I would like to get rid of.

Maybe the best available alternative would be nacl/sodium. For Python, we could use python3-pynacl or python3-libnacl. But I can't see those packages in Fedora 23. If Qubes 4.0 updates dom0 to Fedora 24 or newer (not sure), we can go this way. Until then, I'd wait. Or maybe someone will suggest a better solution that we can use without waiting for Qubes 4.0.

Restore on clean system is convenient

Long-term metatask.

Target scenario:

Install Qubes.
Install this backup system to dom0. (Even better: If this tool is integrated into Qubes, then this step would be unneeded.)
Create and install a backup-storage VM for your backup backend. This might require installing some software for the corresponding backup backend (e.g., Duplicity), but it should be as straightforward as possible.
(Maybe as a part of step 3.) Configure backup URL and credentials.
Run one simple command to restore it all. Minimal interaction should be required from the user at this point. Or choose what to restore now and restore the rest later.

Results of the procedure:

All VMs (AppVMs, TemplateVMs, HVMs, …) are restored.
All VM config (firewall routes, VM drive size, …) is restored.
User config of dom0 is restored. (Challenge, since we are restoring to a running system.)
TODO: specify behavior for dom0 drivers etc. Maybe those should be optional, as one might want to perform the restore on different hardware. Also, in some cases, drivers might be prerequisites for even starting the restore process.
TODO: Specify behavior for volume layout. Preserving it might be useful in some cases (e.g., laptop repaired), but undesirable in other cases (e.g., completely new laptop).

TODO: list relevant issues.

Allow arbitrary parameters for scrypt

Logging in BackupStorageVM

Currently, we set the logger to “some logger”, which does not exist. Think about what could be done better.

Also, we do some independent logging to /tmp, which is just a quick hack.

Elaborate file names

We should hide VM names from filenames. Options:

a. Encrypt.
b. Hash.
c. Create a table.

Maybe it would be handy to have direct access. This disqualifies encrypted names with explicit IVs.

Hashes obscure the VM name length (which is not the only way to obscure it, though), but are impractical if you cannot enumerate the VM names.

An encrypted file with a table seems to be a rather hard approach.
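Option b could be sketched with a keyed hash, so that the storage cannot test guessed VM names without the key. Using HMAC-SHA256 here is an assumption of this sketch, as are the function names and the dedicated `name_key`. The second function illustrates the enumeration problem: a hashed name can only be mapped back by trying every known VM name.

```python
import hashlib
import hmac


def hashed_vm_name(name_key, vm_name):
    """Return a fixed-length hex name that hides the VM name and its length."""
    return hmac.new(name_key, vm_name.encode("utf-8"), hashlib.sha256).hexdigest()


def resolve_vm_name(name_key, known_names, hashed):
    """Map a hashed name back by enumerating candidate VM names."""
    for candidate in known_names:
        if hmac.compare_digest(hashed_vm_name(name_key, candidate), hashed):
            return candidate
    return None  # the name cannot be recovered without enumerating it
```

The hashed name is deterministic, so it still allows direct access to the files of a known VM.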

Use various BDVMs

Qubes 4.0 will be able to create a DVM from any AppVM. By default, we should use the DVM of the template the VM is based on.

Use Merkle-tree-based storage

Why?

Backups can be authenticated as a whole, not just as individual VMs (at best).
Atomicity: When the computer crashes during a backup, the partial backup does not cause problems.
It can obscure which file belongs to which VM. (But data usage patterns can still leak it, at least to some degree.)
Allows buffering volumes. (Better performance, can potentially obscure data access patterns when upload is reordered.)
Allows some data checks on the remote side. It can check hashes, because they are the keys.

BackupStorageVM<->dom0 interface

A rather simple key-value interface:

get/set KDF params (operates with unauthenticated plaintext and simple KDF params format)
getRoot – returns signed root with timestamp
setRoot (or maybe compareAndSetRoot) – atomically stores a new signed root
getBlob – returns an encrypted blob by ciphertext hash
putBlob – uploads a new blob
deleteBlob – removes blob
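The operations listed above can be sketched as an in-memory store, which might also serve as a test double. This is an illustration only: the class name, the Python method names, and the choice of SHA-256 as the ciphertext hash are assumptions, and signing and encryption are left to the caller.

```python
import hashlib
import threading


class InMemoryBlobStore:
    """Toy key-value store mirroring the interface sketched in the issue."""

    def __init__(self):
        self._lock = threading.Lock()
        self._kdf_params = None
        self._root = None
        self._blobs = {}

    def get_kdf_params(self):
        return self._kdf_params

    def set_kdf_params(self, params):
        self._kdf_params = params

    def get_root(self):
        return self._root

    def compare_and_set_root(self, expected, new_root):
        """Atomically replace the root only if it still equals expected."""
        with self._lock:
            if self._root != expected:
                return False
            self._root = new_root
            return True

    def put_blob(self, ciphertext):
        """Store a blob under the hash of its ciphertext and return the key."""
        key = hashlib.sha256(ciphertext).hexdigest()
        self._blobs[key] = ciphertext
        return key

    def get_blob(self, key):
        ciphertext = self._blobs[key]
        # Because the key is the hash, a corrupted blob is detectable.
        if hashlib.sha256(ciphertext).hexdigest() != key:
            raise ValueError("blob does not match its key")
        return ciphertext

    def delete_blob(self, key):
        self._blobs.pop(key, None)
```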

dom0 <-> BDVM interface

The interface should be very similar to BackupStorageVM<->dom0, but dom0 has to verify the permissions and maybe handle encryption.

Directory structure

The directory structure would be implemented on top of the mentioned key-value storage as a Merkle tree.

What to decide

How to handle permissions to particular files without leaking any data?
How to buffer files in the target VM without leaking any data when maliciously modified? (When buffering, one has to send unauthenticated chunks.) Do we want this at all?
How to handle garbage collection? Maybe it will require parsing and traversing directory structure in dom0. OTOH, it should be a rather simple format.
When user removes a VM and creates a VM with the same name, should they be related in any way?

Allow changing password or key-stretching parameters

This has to be elaborated. It does not seem to be easy, if even possible.

Allow passing private img size when restoring

Until #6 is implemented, one should be able to pass private.img size.

Restore hangs when there is not enough space

Expected behavior: It exits with a proper error message.
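One possible way to get there is to check the free space before the restore starts. The sketch assumes that the required size is known in advance, which may not hold for every backup format; `ensure_free_space` is a hypothetical helper name.

```python
import shutil


def ensure_free_space(path, required_bytes):
    """Fail early with a clear message instead of hanging on a full disk."""
    free = shutil.disk_usage(path).free
    if free < required_bytes:
        raise RuntimeError(
            "not enough space in %s: %d bytes needed, %d bytes free"
            % (path, required_bytes, free)
        )
```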

Update toString for VmKeys

It should not expose the keys, because it could cause accidental exposure.
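In Python terms this means overriding `__repr__` and `__str__` so that the key material never reaches logs or tracebacks. The attributes below (`label`, `key`) are hypothetical; the real VmKeys class may look different.

```python
class VmKeys:
    """Holds secret key material for one VM (attribute names are illustrative)."""

    def __init__(self, label, key):
        self.label = label
        self.key = key

    def __repr__(self):
        # Deliberately omit the key so it cannot leak through logging.
        return "VmKeys(label=%r, key=<hidden>)" % (self.label,)

    __str__ = __repr__
```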

Log and process Duplicity output

Allow reusing derived key for multiple backups

Initial setup is convenient

TODO: define

Review documentation before 0.1 release

handle notes at https://groups.google.com/forum/#!topic/qubes-users/yTL6c8ArwbI
update outdated parts
document restore testing
consider splitting README.md in multiple parts

Implement backup script for dom0

There should be automated tests for backup and restore scenarios

Related to #14, but this time, we want something that will check the software quality itself.

Maybe we should have mostly integration tests, because we mostly integrate various pieces of software.

Restore: VM with proper name should appear only when finished

Currently, the VM is created with its final name and then the restore is performed. If the restore fails, a partially restored VM with the final name exists.

When a VM with a correct name appears, it should have correct content.

Proposed behavior:

Create VM with temporary name
Disable backups for the VM
Restore
Enable backups for the VM
Rename the temporary VM
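The proposed steps can be sketched as follows. The `ops` object and its methods (`create_vm`, `set_backups_enabled`, `restore_into`, `rename_vm`) as well as the temporary name prefix are hypothetical placeholders for the real VM operations.

```python
def restore_under_temporary_name(vm_name, ops, tmp_prefix="restore-in-progress-"):
    """Restore into a temporary VM and rename it only after success."""
    tmp_name = tmp_prefix + vm_name
    ops.create_vm(tmp_name)
    ops.set_backups_enabled(tmp_name, False)
    # If this raises, the final name is never created.
    ops.restore_into(tmp_name)
    ops.set_backups_enabled(tmp_name, True)
    ops.rename_vm(tmp_name, vm_name)
    return vm_name
```

After a failed restore, only the temporary VM is left behind, so a VM with the final name always has complete content.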

Sync filesystem before backup if the VM is running

Derive password for storage backend from master passphrase

This is a challenging task.

We could use the passphrase to derive the password directly. But this would skip the master secret derivation, essentially bypassing all custom-configured password-stretching parameters. This is bad in the long term, as it does not allow using better key-stretching parameters in the future without breaking compatibility. It also cannot be salted by anything other than the storage URL and username. Salting with the storage URL and username has some drawbacks (mostly the need for exactly the same URL and username, even if the backend tolerates some deviation such as letter case), but they are probably justifiable.

We could also download some public data from the backup storage (this can hardly be storage-agnostic) to get key derivation parameters. Those key stretching parameters have to be considered as untrusted. This implies:

If we don't include passphrase_test in the public parameters for key derivation, the backup storage can perform a downgrade attack*: It can provide weak key-stretching parameters and then brute-force the password. (Rainbow-table attacks can be avoided, though: We can add the URL and username to the salt.)
We can mitigate the downgrade attack by including passphrase_test in the public parameters for key derivation. However, if we include passphrase_test in the public parameters for key derivation, anyone (not just the backup storage administrators) can download it and brute-force the master password offline.

Another disadvantage: This can increase the practical value of shoulder-surfing attacks.

However, maybe the hassle with design and implementation and all the risks are simply not worth the enhancement.

*) Anyone who can attack the connection can also do this. So, the connection to the backup storage is a new weak point.

Review stderr in qrexec handlers

We should ensure it does not leak any sensitive data.

Be more careful when mounting filesystem

Ideas:

Restrict list of allowed filesystems
Configure udev not to probe the attached FS

Change suffix/prefix clone to something more specific

“Clone” is way too generic, so it can collide with some other software. We should use something more specific.

Add checks for backups

Various checks can be considered:

a. Backup can be restored without errors.
b. Compare data from backup and real system. (Challenge: exclusions.)
c. Perform some user-defined VM-specific test.

Consider noexec and nosuid when mounting the filesystem

(And possibly some other options.)

Pros:

hygiene

Cons:

FS compatibility
backup compatibility (maybe some files would be backed up without proper attributes)

Implement restore script for dom0

Since one can do this manually, I am postponing it beyond the MVP.

Disconnect the BDVM from network

The DVM that performs the backup (BDVM) needs no network access. According to the principle of least privilege, it should not have any.

Threats

A VM with restricted (e.g., Torified) network access (=Attacking VM, AVM) creates a malformed filesystem on its private partition. The BDVM has some vulnerability in its filesystem driver. As a result, the AVM is able to execute arbitrary code in the BDVM. Since the BDVM can be connected to the Internet, the AVM gets direct access to the Internet. This can lead to deanonymization.

Advantages

If the BDVM had no direct access to the Internet, the adversary would not be able to get Internet access and deanonymize the user this way. However, the advantage of a BDVM without Internet access is somewhat limited here. If the adversary has access to the backup storage, she can deanonymize the user anyway. Offloading encryption from the BDVM could help partially, but the attacker would still be able to observe backup sizes.

Implement dom0 key derivation

Implement scrypt
Implement hmac
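Both primitives are available in the Python standard library (`hashlib.scrypt` needs Python 3.6+ built against OpenSSL 1.1 or newer), which might be an option once dom0 ships a recent enough Python. The sketch below is not the project's derivation scheme: the function names, the default parameters, and the idea of deriving per-purpose subkeys with HMAC-SHA256 are assumptions.

```python
import hashlib
import hmac


def derive_master_key(passphrase, salt, n=2 ** 14, r=8, p=1):
    """Stretch the passphrase with scrypt into a 32-byte master key."""
    return hashlib.scrypt(
        passphrase.encode("utf-8"), salt=salt, n=n, r=r, p=p, dklen=32
    )


def derive_subkey(master_key, purpose):
    """Derive an independent key for one purpose using HMAC-SHA256."""
    return hmac.new(master_key, purpose.encode("utf-8"), hashlib.sha256).digest()
```

The scrypt parameters are plain keyword arguments, so arbitrary values can be passed in from stored metadata.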

Make the BackupStorageVM persistent

When installing BackupStorageVM tools to a VM that is neither standalone nor a template, the policy files do not survive reboot. Fix it.

Add support for TemplateVMs

It might be reasonable to back up just package lists and a few files instead of the whole filesystem.
