Comments (3)
After analyzing the actual recovery data and the local data, and reviewing the whole remote restore process, we found that the problem is caused by the order of the restore steps in uni restore, combined with the existing activation detection of ood-daemon.
Restore process
The current uni restore recovery steps are defined as follows:
CYFS/src/component/cyfs-backup-lib/src/backup/restore_status.rs
Lines 8 to 16 in 89110cc
As the definition shows, the restore step restores the key-data first, then the objects and chunks. The identity files (device.desc/device.sec) in the {cyfs}/etc/desc directory are part of the key-data, so over the whole restore process the files in etc/desc are recovered first.
Binding detection mechanism of ood-daemon
The binding detection logic of ood-daemon works as follows: if the identity files (device.desc/device.sec) in the {cyfs}/etc/desc directory are found to be missing, a monitor process is started to periodically check whether the identity files have been created. Once they appear, it pulls up the other services, including gateway.
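The detection described here can be pictured with a minimal sketch. It is not the ood-daemon code: the function names, the polling interval and the poll limit are assumptions made for illustration, and only the file names and the etc/desc location come from the description above.

```rust
use std::path::Path;
use std::thread;
use std::time::Duration;

/// Returns true once both identity files exist under `{cyfs_root}/etc/desc`.
pub fn identity_files_present(cyfs_root: &Path) -> bool {
    let desc_dir = cyfs_root.join("etc").join("desc");
    desc_dir.join("device.desc").is_file() && desc_dir.join("device.sec").is_file()
}

/// Hypothetical monitor loop: polls up to `max_polls` times and returns the
/// attempt number on which the identity files were first seen, or None.
/// A real daemon would start gateway and the other services at that point.
pub fn wait_for_binding(cyfs_root: &Path, interval: Duration, max_polls: u32) -> Option<u32> {
    for attempt in 1..=max_polls {
        if identity_files_present(cyfs_root) {
            return Some(attempt);
        }
        thread::sleep(interval);
    }
    None
}
```

Note that a check of this kind only looks at the identity files. It says nothing about whether objects and chunks are already on disk, which is the gap the rest of this comment describes.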
This is where the problem arises. When gateway is started, the remote restore process is still running, so the local objects and chunks are still incomplete. When gateway loads the root-state, loading the root can hit a case where the object meta exists but the corresponding object blob is missing. The object is then treated as nonexistent, and get_object returns None:
CYFS/src/component/cyfs-lib/src/prelude/named_object_cache.rs
Lines 142 to 148 in 89110cc
CYFS/src/component/cyfs-lib/src/prelude/named_object_cache.rs
Lines 261 to 281 in 89110cc
So gateway initializes a new global-state whose content is completely empty, resulting in the problem described above.
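The failure mode can be reproduced in a small model. This is a sketch under assumptions, not the named-object-cache implementation: `SketchObjectCache`, `RootState` and `load_root_state` are hypothetical names, and the only behavior taken from the description is that a meta entry without a blob is reported as a missing object.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for a cache that stores object meta and blobs separately.
#[derive(Default)]
pub struct SketchObjectCache {
    meta: HashSet<String>,
    blobs: HashMap<String, Vec<u8>>,
}

impl SketchObjectCache {
    pub fn insert_meta(&mut self, id: &str) {
        self.meta.insert(id.to_owned());
    }

    pub fn insert_blob(&mut self, id: &str, data: Vec<u8>) {
        self.blobs.insert(id.to_owned(), data);
    }

    /// A meta entry whose blob is missing is treated as "object not found".
    pub fn get_object(&self, id: &str) -> Option<&[u8]> {
        if !self.meta.contains(id) {
            return None;
        }
        self.blobs.get(id).map(|b| b.as_slice())
    }
}

#[derive(Debug, PartialEq)]
pub enum RootState {
    Loaded(Vec<u8>),
    NewEmpty,
}

/// Falls back to a fresh, empty state when the root object cannot be read.
pub fn load_root_state(cache: &SketchObjectCache, root_id: &str) -> RootState {
    match cache.get_object(root_id) {
        Some(data) => RootState::Loaded(data.to_vec()),
        None => RootState::NewEmpty,
    }
}
```

In this model, a half-restored cache (meta present, blob not yet written) is indistinguishable from a device that never had a root-state, so the loader silently starts from empty.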
from cyfs.
We are considering fixing this problem from the following two sides:
1. Adjust the steps of uni restore
Restore objects and chunks first and key-data last, so that {cyfs}/etc/desc is released only after objects and chunks have been restored. This avoids the problem above.
Within the key-data recovery itself, the order of the recovered files also needs to be improved: the db files should be recovered first and the etc directory last, so that all data has already been recovered by the time ood-daemon detects the identity files in the etc/desc dir.
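A minimal sketch of such an ordering, assuming key-data entries are available as relative paths. The ranking function, the `.db` suffix test and the example paths used with it are illustrative assumptions, not taken from cyfs-backup-lib.

```rust
/// Hypothetical ranking: db files first, other files next, the etc directory
/// after that, and etc/desc (the identity files) at the very end.
fn restore_rank(path: &str) -> u8 {
    if path.starts_with("etc/desc/") {
        3
    } else if path.starts_with("etc/") {
        2
    } else if path.ends_with(".db") {
        0
    } else {
        1
    }
}

/// Orders key-data entries for restore. `sort_by_key` is a stable sort, so
/// entries with the same rank keep their original relative order.
pub fn order_key_data(mut paths: Vec<String>) -> Vec<String> {
    paths.sort_by_key(|p| restore_rank(p));
    paths
}
```

With this ordering, the appearance of device.desc/device.sec acts as a completion marker for everything that was restored before them.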
2. Improve the binding detection logic of ood-daemon service
Since the remote restore is carried out inside ood-daemon, ood-daemon can stop detecting bindings while a remote restore task is running, wait until the restore task is completed, and then continue with the binding monitor logic.
However, this improvement is limited and can only play a supplementary role. If an external, independent process is used for the restore operation, ood-daemon has no way to know the progress of the restore.
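One way to picture this second option is a shared in-process flag that the restore task sets and the binding monitor consults. This is a sketch under assumptions: `RestoreGate` and `should_start_services` are hypothetical names, not part of ood-daemon.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical in-process flag shared between the restore task and the
/// binding monitor. Clones share the same underlying flag.
#[derive(Clone, Default)]
pub struct RestoreGate {
    in_progress: Arc<AtomicBool>,
}

impl RestoreGate {
    pub fn begin(&self) {
        self.in_progress.store(true, Ordering::SeqCst);
    }

    pub fn finish(&self) {
        self.in_progress.store(false, Ordering::SeqCst);
    }

    pub fn is_restoring(&self) -> bool {
        self.in_progress.load(Ordering::SeqCst)
    }
}

/// Services are started only when the identity files are present and no
/// restore task is running in this process.
pub fn should_start_services(gate: &RestoreGate, identity_present: bool) -> bool {
    identity_present && !gate.is_restoring()
}
```

The flag lives in the memory of one process, which illustrates the limitation noted above: a restore driven by a separate process never sets it.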
from cyfs.
The first option above has been adopted, with the following two major changes:
- In the uni restore logic, the restore order is adjusted to objects->chunks->key-data
- In the key-data restore logic, the etc/desc directory is restored last.
The above two points ensure that, even with the existing bind detection logic of ood-daemon, an OOD can be restored and bound in a single restore operation.
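The adjusted phase order can be sketched as a fixed sequence that stops at the first failure. The enum and the driver function are hypothetical and do not reproduce the definitions in restore_status.rs. Only the order objects->chunks->key-data comes from the change described above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RestorePhase {
    Objects,
    Chunks,
    KeyData,
}

/// Adjusted order: key-data (and with it etc/desc) comes last.
pub const RESTORE_ORDER: [RestorePhase; 3] = [
    RestorePhase::Objects,
    RestorePhase::Chunks,
    RestorePhase::KeyData,
];

/// Runs `step` for each phase in order and returns the phases that completed.
/// The first error aborts the run, so key-data is never written after a
/// failed objects or chunks phase.
pub fn run_restore<F>(mut step: F) -> Result<Vec<RestorePhase>, String>
where
    F: FnMut(RestorePhase) -> Result<(), String>,
{
    let mut done = Vec::new();
    for phase in RESTORE_ORDER.iter().copied() {
        step(phase)?;
        done.push(phase);
    }
    Ok(done)
}
```

Because the identity files are part of the last phase, a restore that is interrupted earlier leaves etc/desc empty and the binding monitor keeps waiting instead of starting gateway on partial data.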
See 3cd354a
from cyfs.