frugalos's Introduction

Frugalos

Frugal Object Storage


Frugalos is a distributed object storage written in Rust.
It is suitable for storing medium-sized BLOBs whose total volume reaches petabyte scale.

Documentation

Installation

You can install frugalos with the following command:

$ cargo install frugalos

Note: The current installation process requires automake, autoconf, and libtool to build liberasurecode internally. Please install them first if they are missing. (See also liberasurecode's prerequisites.)

You can also use the pre-built binaries from the releases page.

Simple Example

// Create a cluster.
$ frugalos create --id example --data-dir example/
Oct 26 13:42:06.244 INFO [START] create: local=Server { id: "example", seqno: 0, host: V4(127.0.0.1), port: 14278 }; data_dir.as_ref()="example/"; , server: example@127.0.0.1:14278, module: frugalos_config::cluster:121
Oct 26 13:42:06.245 INFO Creates data directry: "example/", server: example@127.0.0.1:14278, module: frugalos_config::cluster:113
Oct 26 13:42:06.256 INFO [START] LoadBallot: lump_id=LumpId("03000000000000000000000000000000"); , server: example@127.0.0.1:14278, module: frugalos_raft::storage::ballot:21
...
...

// Start a frugalos process in the background.
$ frugalos start --data-dir example/ &
Oct 26 13:46:16.046 INFO Local server info: Server { id: "example", seqno: 0, host: V4(127.0.0.1), port: 14278 }, module: frugalos_config::service:68
Oct 26 13:46:16.062 INFO [START] LoadBallot: lump_id=LumpId("03000000000000000000000000000000"); , module: frugalos_raft::storage::ballot:21
Oct 26 13:46:16.086 INFO Starts RPC server, server: 127.0.0.1:14278, module: fibers_rpc::rpc_server:221
...
...

// Add a device and a bucket to store objects.
$ DEVICE_JSON='{"file": {"id": "file0", "server": "example", "filepath": "example/file0.lusf"}}'
$ curl -XPUT -d "$DEVICE_JSON" http://localhost:3000/v1/devices/file0
{"file":{"id":"file0","seqno":0,"weight":"auto","server":"example","capacity":19556691462,"filepath":"example/file0.lusf"}}%

$ BUCKET_JSON='{"metadata": {"id": "bucket0", "device": "file0", "tolerable_faults": 1}}'
$ curl -XPUT -d "$BUCKET_JSON" http://localhost:3000/v1/buckets/bucket0
{"metadata":{"id":"bucket0","seqno":0,"device":"file0","segment_count":1,"tolerable_faults":1}}%

// PUT and GET an object.
$ curl -XPUT -d 'your_object_data' http://localhost:3000/v1/buckets/bucket0/objects/your_object_id
$ curl http://localhost:3000/v1/buckets/bucket0/objects/your_object_id
your_object_data

Please see REST API for details and other available APIs.

For Frugalos Developers

Please see Developer's Guide.

frugalos's People

Contributors

brly, dependabot[bot], dw-hkoba, koba-e964, kyos3, shinnya, sile, yoffy, yuezato


frugalos's Issues

`create` and `join` commands sometimes fail due to Monitor target aborted

Error:

Jan 11 15:16:56.587 INFO [FINISH] create, server: [email protected]:14278, module: frugalos_config::cluster:170
thread '<unnamed>' panicked at 'Error: Other (cause; Monitor target aborted)
HISTORY:
  [0] at /root/.cargo/registry/src/github.com-1ecc6299db9ec823/fibers_rpc-0.2.17/src/rpc_server.rs:404
  [1] at /root/.cargo/registry/src/github.com-1ecc6299db9ec823/fibers_rpc-0.2.17/src/rpc_server.rs:281
', /root/.cargo/registry/src/github.com-1ecc6299db9ec823/frugalos_config-0.3.0/src/cluster.rs:153:48

Improve handling of a cleared disk

Ideally, when a disk's contents are cleared, the Raft nodes using that disk should change their IDs and trigger a reconfiguration (concretely, by including cannyls's UUID in the node ID), so that they are recognized as different nodes at the Raft level. In theory, if a node restarts after a disk clear while keeping the same node ID, it can fall into an inconsistent state.
However, this approach is itself a hassle, so whether to actually do it needs further consideration.

Allow users to change `put_content_timeout` dynamically.

I would like to allow users to change put_content_timeout.

Background

frugalos automatically adds 60 seconds to the put_content_timeout specified by a user. This is inconvenient when debugging a repair operation because the timeout also affects the processing delay of the Synchronizer.
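For illustration, here is a minimal sketch of the behavior described above, assuming a fixed 60-second margin is added internally (the constant and function names are hypothetical, not frugalos's actual API):

use std::time::Duration;

// Hypothetical name for the fixed margin frugalos adds internally.
const PUT_CONTENT_TIMEOUT_MARGIN: Duration = Duration::from_secs(60);

// The effective timeout is the user-specified value plus the margin,
// which is why a user cannot currently shorten it below 60 seconds.
fn effective_put_content_timeout(user_timeout: Duration) -> Duration {
    user_timeout + PUT_CONTENT_TIMEOUT_MARGIN
}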

Install Snapshot keeps running forever after a restart

The problem is as stated in the title; the conditions and situation are described first for clarity.

How to reproduce (situation)

  1. Build a cluster of multiple frugalos instances.
  2. PUT enough data that the log gets snapshotted.
  3. Stop one of the frugalos instances in the cluster (hereafter called X).
    4a. Among X's files, delete all data files (lusf files) except cluster.lusf and local.dat.
    4b. In parallel with 4a, keep PUTting data.
  5. Restart X.

Note that after X is restarted in step 5, because the log was snapshotted in step 2 and the data files were deleted in step 4a (so X's log is back to its initial state), X receives a snapshot and runs the process that saves it.

Precondition for the problem

  • The read/write speed of the disk X uses has degraded, so X's snapshot-save processing takes an enormous amount of time:
    impl Future for SaveLogPrefix {
        type Item = ();
        type Error = Error;
        fn poll(&mut self) -> Poll<Self::Item, Self::Error> {
            while let Async::Ready(phase) = self.phase.poll()? {
                let next = match phase {
                    Phase5::A(index) => {
                        let index = index.unwrap_or(Range { start: 0, end: 0 });
                        self.old_prefix_index = index.clone();
                        let prefix = self.prefix.take().expect("Never fails");
                        let future =
                            track!(SaveLogPrefixBytes::new(self.handle.clone(), index, prefix))?;
                        Phase5::B(future)
                    }
                    Phase5::B(prefix_index) => {
                        let future =
                            track!(SaveLogPrefixIndex::new(self.handle.clone(), prefix_index))?;
                        Phase5::C(future)
                    }
                    Phase5::C(()) => {
                        let future = DeleteOldLogPrefixBytes::new(
                            self.handle.clone(),
                            self.old_prefix_index.clone(),
                        );
                        Phase5::D(future)
                    }
                    Phase5::D(()) => {
                        let future = track!(DeleteOldLogEntries::new(
                            self.handle.clone(),
                            self.old_entries.clone()
                        ))?;
                        Phase5::E(future)
                    }
                    Phase5::E(()) => {
                        info!(self.handle.logger, "[FINISH] SaveLogPrefix");
                        let event = Event::LogPrefixUpdated {
                            new_head: self.new_head,
                        };
                        let _ = self.event_tx.send(event);
                        return Ok(Async::Ready(()));
                    }
                };
                self.phase = next;
            }
            Ok(Async::NotReady)
        }
    }

The problem

Because (by the precondition) X's snapshot save takes a long time, the log advances in the meantime due to the PUTs of 4b.
Therefore, by the time the snapshot save finishes, X's log is already stale from the leader's point of view, and yet another snapshot has to be installed after the save completes.

This repeats forever.

Notes

  • Is this, by itself, a problem that leads to a critical situation?
  • Isn't it an unavoidable situation that cannot be worked around?

However, it is known that in a similar situation, combined with yet another factor, the critical problem #54 occurs.

Massive snapshot casting and high memory usage when restarting a frugalos instance

To describe this problem precisely, we start with a way to reproduce it.

How to Reproduce

  1. Build a frugalos cluster.
  2. Put enough data that the frugalos instances are required to take a snapshot.
  3. Stop a frugalos instance X in the cluster and delete all of X's lusf files except cluster.lusf and local.dat.
  4. Restart the frugalos instance X.
  5. Stop another frugalos instance Y that contains the leader raft node of a raft node belonging to X.

Assumption

X takes a long time to finish installing a snapshot, for some reason related to its device:

impl Future for SaveLogPrefix {
    type Item = ();
    type Error = Error;
    fn poll(&mut self) -> Poll<Self::Item, Self::Error> {
        while let Async::Ready(phase) = self.phase.poll()? {
            let next = match phase {
                Phase5::A(index) => {
                    let index = index.unwrap_or(Range { start: 0, end: 0 });
                    self.old_prefix_index = index.clone();
                    let prefix = self.prefix.take().expect("Never fails");
                    let future =
                        track!(SaveLogPrefixBytes::new(self.handle.clone(), index, prefix))?;
                    Phase5::B(future)
                }
                Phase5::B(prefix_index) => {
                    let future =
                        track!(SaveLogPrefixIndex::new(self.handle.clone(), prefix_index))?;
                    Phase5::C(future)
                }
                Phase5::C(()) => {
                    let future = DeleteOldLogPrefixBytes::new(
                        self.handle.clone(),
                        self.old_prefix_index.clone(),
                    );
                    Phase5::D(future)
                }
                Phase5::D(()) => {
                    let future = track!(DeleteOldLogEntries::new(
                        self.handle.clone(),
                        self.old_entries.clone()
                    ))?;
                    Phase5::E(future)
                }
                Phase5::E(()) => {
                    info!(self.handle.logger, "[FINISH] SaveLogPrefix");
                    let event = Event::LogPrefixUpdated {
                        new_head: self.new_head,
                    };
                    let _ = self.event_tx.send(event);
                    return Ok(Async::Ready(()));
                }
            };
            self.phase = next;
        }
        Ok(Async::NotReady)
    }
}

What happens?

After the reproduction steps above, massive snapshot casting from other nodes in the cluster to X's raft nodes occurs.
If a single snapshot is huge, this behavior leads to out-of-memory errors in the frugalos instances that send the snapshots, and eventually destroys the frugalos cluster.

Why do massive snapshot castings occur?

Here we fix a raft node N belonging to the instance X such that

  • for N, there is a leader node in Y (we call this node L).

After step 4, because of step 2, N receives a snapshot and installs it through the following code:
https://github.com/frugalos/frugalos/blob/master/frugalos_raft/src/storage/log_prefix/save.rs#L64-L110

By the assumption, N takes a long time to install the snapshot.
Furthermore, while this snapshot is being installed, because of step 5, N changes its state to Candidate due to a raft-level timeout caused by the absence of its leader L.

After that, N votes for a raft node and changes its state to Follower (more specifically, FollowerIdle):
https://github.com/frugalos/raftlog/blob/2e3cb4647d4ebf888f836bb9b1a209626a5c344b/src/node_state/follower/idle.rs#L13-L15

Now, N is installing a snapshot and is in FollowerIdle.
This state has the following properties:

  • Since N is installing a snapshot, N has not yet changed its log status (therefore, its log tail position is 0).
  • N is in FollowerIdle, and N's log is too old with respect to that of the leader.

So what happens when N receives a heartbeat from the (new) leader?

  1. N replies to the heartbeat with information indicating that N's log is too old.
  2. After the leader receives the reply, it casts a snapshot to N.
  3. N receives the snapshot, but it is still installing the old one. Therefore, N drops the cast.
  4. N receives another heartbeat; go to 1.

As a result, the leader quickly and repeatedly casts a snapshot to N until N finishes its slow snapshot installation.

Math: estimating how much data is sent from a frugalos instance Z

  • n: the size of the set { <x, z> : a raft node z of Z is the leader of a raft node x of X }
  • S: the byte size of a snapshot (sent to X from Z)
  • I: the interval in seconds between heartbeats sent from Z to X

Every I seconds, the frugalos instance Z sends n*S bytes of data.

For example, in our experiment environment, n = 500, S = 50 MB, and I = 5 secs.
(A num_of_cast graph was attached here; each block in the picture represents 10 seconds.)

Every 5 seconds, one frugalos instance tries to send 25 GB of data to X; this eventually exhausts its memory, and finally the frugalos instance stops.
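The arithmetic above can be reproduced with a small sketch (the numbers are the ones from the experiment; the function is ours, for illustration only):

fn snapshot_bytes_per_interval(n: u64, snapshot_bytes: u64) -> u64 {
    // Every I seconds, Z sends one snapshot of S bytes for each of the
    // n leader pairs <x, z>.
    n * snapshot_bytes
}

fn main() {
    let n = 500; // leader pairs <x, z>
    let s = 50 * 1000 * 1000; // S = 50 MB per snapshot
    println!("{} bytes every I = 5 secs", snapshot_bytes_per_interval(n, s)); // 25 GB
}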

When restoring data with erasure coding, the index pointing to a fragment position exceeds the maximum fragment count

Symptom

let missing_index = self

Because the computation of missing_index in the code above (and its subsequent use) does not take the number of fragments into account, restoring the data fails inside openstack/liberasurecode and an invalid memory reference occurs.

Note that no error occurs when putting data, because the minimum of the fragment count and the number of candidates is used.
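A hedged sketch of the fix direction this implies: clamp by the fragment count on the restore path as well, just as the PUT path takes the minimum of the fragment count and the number of candidates (the function below is illustrative, not frugalos's actual code):

// Only the first `fragment_count` candidates can point into the
// fragments array, so never index past that bound.
fn clamped_candidates<T>(candidates: &[T], fragment_count: usize) -> &[T] {
    &candidates[..candidates.len().min(fragment_count)]
}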

How to reproduce

Apply the following patch and run cargo test.

diff --git a/frugalos_segment/src/client/storage.rs b/frugalos_segment/src/client/storage.rs
index 9b2e425..7b84d20 100644
--- a/frugalos_segment/src/client/storage.rs
+++ b/frugalos_segment/src/client/storage.rs
@@ -836,4 +836,43 @@ mod tests {
 
         Ok(())
     }
+
+    #[test]
+    fn it_fails() -> TestResult {
+        // Set the fragment count to 5 (data_fragments = 4, parity_fragments = 1) and build a 6-node cluster.
+        let data_fragments = 5;
+        let mut system = System::new(data_fragments)?;
+        let (node_id, device_id, _) = system.make_node()?;
+        let mut members = Vec::new();
+
+        members.push(ClusterMember {
+            node: node_id,
+            device: device_id,
+        });
+
+        for _ in 0..data_fragments {
+            let (node, device, _) = system.make_node()?;
+            members.push(ClusterMember { node, device });
+        }
+
+        let storage_client = system.boot(members)?;
+
+        // Get version 6. (node_id is not originally responsible for storing this object.)
+        let version = ObjectVersion(6);
+
+        let _ = wait(storage_client.clone().put(
+            version.clone(),
+            vec![0x01],
+            Deadline::Infinity,
+            Span::inactive().handle(),
+        ))?;
+
+        // An invalid memory reference occurs here
+        let _ = wait(storage_client.clone().get_fragment(
+            node_id.clone(),
+            version.clone()
+        ))?;
+
+        Ok(())
+    }
 }

Break the precondition of `handle_committed`

Description

We can break the following precondition of the handle_committed method:

track_assert_eq!(self.next_commit, commit, ErrorKind::InvalidInput);

Reproduce

Use these files: https://gist.github.com/yuezato/9c0af68320935b342d0b152811f58cfc

Why is the precondition broken

In this while-loop:
https://github.com/frugalos/frugalos/blob/master/frugalos_mds/src/node/node.rs#L729-L738
we assume that the two raft events [ Event::SnapshotLoaded, Event::Committed ] arrive in this order.

First, we handle the Event::SnapshotLoaded:

E::SnapshotLoaded { new_head, snapshot } => {
    info!(
        self.logger,
        "New snapshot is loaded: new_head={:?}, bytes={}",
        new_head,
        snapshot.len()
    );
    let logger = self.logger.clone();
    let future = fibers_tasque::DefaultCpuTaskQueue.async_call(move || {
        let machine = track!(codec::decode_machine(&snapshot))?;
        let versions = machine.to_versions();
        info!(logger, "Snapshot decoded: {} bytes", snapshot.len());
        Ok((new_head, machine, versions))
    });
    self.decoding_snapshot = Some(future);
}

without updating self.next_commit.

Immediately after receiving Event::Committed, we reach this line:

track_assert_eq!(self.next_commit, commit, ErrorKind::InvalidInput);

Finally, the precondition is broken.

How to solve this

Once we encounter a SnapshotLoaded event,
we should defer handling the Committed events that follow it while the snapshot is being decoded.

Indeed, in this part (especially line 704)

match track!(self.decoding_snapshot.poll().map_err(Error::from))? {
    Async::NotReady => return Ok(Async::NotReady),
    Async::Ready(None) => {}
    Async::Ready(Some(result)) => {
        let (new_head, machine, versions) = track!(result)?;
        info!(self.logger, "Snapshot decoded: new_head={:?}", new_head);
        let delay = env::var("FRUGALOS_SNAPSHOT_REPAIR_DELAY")
            .ok()
            .and_then(|v| v.parse().ok())
            .unwrap_or(10);
        self.events.reserve_exact(machine.len());
        self.events
            .extend(versions.into_iter().map(|version| Event::Putted {
                version,
                put_content_timeout: Seconds(delay),
            }));
        self.next_commit = new_head.index;
        self.machine = machine;
        self.metrics.objects.set(self.machine.len() as f64);
        self.decoding_snapshot = None;
    }
}

we can correctly update self.next_commit, and this may solve the present issue.
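A minimal self-contained sketch of that ordering (an assumption about the fix, not the actual frugalos_mds code): while a snapshot is being decoded, Committed events are buffered and replayed only after next_commit has been updated from the decoded snapshot head.

enum RaftEvent {
    SnapshotLoaded, // decoding starts asynchronously
    Committed { index: u64 },
}

struct Node {
    next_commit: u64,
    decoding_snapshot: bool,
    deferred: Vec<RaftEvent>,
}

impl Node {
    fn handle_event(&mut self, event: RaftEvent) {
        match event {
            RaftEvent::SnapshotLoaded => self.decoding_snapshot = true,
            RaftEvent::Committed { index } if self.decoding_snapshot => {
                // next_commit is stale until decoding finishes, so defer.
                self.deferred.push(RaftEvent::Committed { index });
            }
            RaftEvent::Committed { index } => {
                // The precondition from the issue now holds.
                assert_eq!(self.next_commit, index);
                self.next_commit = index + 1;
            }
        }
    }

    fn on_snapshot_decoded(&mut self, new_head_index: u64) {
        self.next_commit = new_head_index;
        self.decoding_snapshot = false;
        // Replay the commits that arrived while decoding.
        for event in std::mem::take(&mut self.deferred) {
            self.handle_event(event);
        }
    }
}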

Functionality to support making a cluster

Currently there are no built-in functionalities or external tools that support creating/starting a frugalos cluster. Having such a built-in functionality would be convenient and safe.

I am considering the following idea. Do you have any comments?

Idea: Using a Configuration File

# YAML format
- name: instance_name1
  role: create
  rpc-addr: 192.168.0.1:14000
  http-addr: 192.168.0.1:3000
  files: path1/file1, path2/file2

- name: instance_name2
  role: join
  rpc-addr: 192.168.0.2:14000
  http-addr: 192.168.0.2:3000
  files: pathA/fileA, pathB/fileB, pathC/fileC

- name: instance_name3
  role: join
  rpc-addr: 192.168.0.3:14000
  http-addr: 192.168.0.3:3000
  files: pathX/fileX, pathY/fileY

- bucket_name: bucket
  type: dispersed
  data-fragment: 8
  parity-fragment: 4
  device: { type: virtual, instances: [instance_name1, instance_name2, instance_name3] }

Usage:

on 192.168.0.{1, 2, 3}: frugalos cluster-start config.file

Internally,

  1. instance_name1 gathers the config files from instance_name2 and instance_name3 and then checks that all three configuration files are equal.
  2. instance_name1 first executes frugalos create and then requests instance_name{2, 3} to execute frugalos join against instance_name1.
  3. Finally, instance_name1 makes the bucket on itself.
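As a sketch of how the configuration file above might be read (the crate choice and type names are assumptions, not an existing frugalos API), the heterogeneous entries could be modeled with serde:

use serde::Deserialize;

#[derive(Debug, Deserialize)]
#[serde(rename_all = "kebab-case")]
struct InstanceEntry {
    name: String,
    role: String, // "create" or "join"
    rpc_addr: String,
    http_addr: String,
    files: String, // comma-separated list of paths
}

#[derive(Debug, Deserialize)]
struct BucketEntry {
    bucket_name: String,
    #[serde(rename = "type")]
    kind: String, // e.g. "dispersed"
    #[serde(rename = "data-fragment")]
    data_fragment: u32,
    #[serde(rename = "parity-fragment")]
    parity_fragment: u32,
    // the `device` field is omitted here for brevity
}

// Each list element is either an instance or a bucket definition.
#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum ConfigEntry {
    Instance(InstanceEntry),
    Bucket(BucketEntry),
}

fn parse_config(yaml: &str) -> Result<Vec<ConfigEntry>, serde_yaml::Error> {
    serde_yaml::from_str(yaml)
}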

Consider how to prevent mistakenly deleting all objects via object prefix deletion

The use case of deleting all data is assumed to be almost nonexistent in real operation, so it might be good if losing all data to a program bug could be avoided.

Below is sile's idea:

Personally, I think it would be good to have frugalos take (for example) an allowlist file like the following at startup:

# YAML format for some reason (TOML would probably be better for Rust)
allow_prefix_delete:
  - bucket: "foo"                    # a bucket for which prefix deletion is allowed
    object_prefix: "xxxxx.timeshift."  # delete requests not containing this string as a prefix are rejected

Error in LoadLogPrefix after restarting a frugalos server that stopped abnormally

Where the error occurs

track!(protobuf::decode_log_prefix(&bytes))? at line 57 of:

Phase::B(bytes) => {
    if let Some(bytes) = bytes {
        let prefix = track!(protobuf::decode_log_prefix(&bytes))?;
        info!(
            self.handle.logger,
            "[FINISH] LoadLogPrefix: {}",
            dump!(prefix.tail, prefix.config, bytes.len())
        );
        return Ok(Async::Ready(Some(prefix)));
    } else {
        // The corresponding lump was not found.
        // => Most likely a new `LogPrefix` was installed during the load,
        //    so retry.
        info!(self.handle.logger, "[RETRY] LoadLogPrefix");
        Phase::A(LoadLogPrefixIndex::new(self.handle.clone()))
    }
}

Reproduce

  1. Make a frugalos cluster with multiple frugalos servers.
  2. Kill one server in the cluster (do NOT issue frugalos stop).
  3. Restart the server.

Log

There are three frugalos servers on 127.0.0.1:14278, 127.0.0.1:14279, and 127.0.0.1:14280.
The following log is produced by the server on 127.0.0.1:14279.
For the sake of explanation, I extract the part of the original log that relates to the node 401 on 127.0.0.1:14279.

[frugalos_raft/src/storage/log_prefix/save.rs:47] [START] SaveLogPrefix: prefix.tail=LogPosition { prev_term: Term(13), index: LogIndex(350) }; prefix.config=ClusterConfig { new: {No
deId("[email protected]:14280"), NodeId("[email protected]:14279"), NodeId("[email protected]:14278")}, old: {}, state: Stable }; prefix.snapshot.len()=22;  @ LocalNodeId("401")

[frugalos_raft/src/storage/log_prefix/load.rs:91] [START] LoadLogPrefixIndex: lump_id=LumpId("00000000000401020000000000000000");

[frugalos_raft/src/storage/log_prefix/load.rs:118] [FINISH] LoadLogPrefixIndex: index=Some(23..24);  @ LocalNodeId("401")

[frugalos_raft/src/storage/log_prefix/save.rs:130] [START] SaveLogPrefixBytes: prefix_index=24..25; prefix_bytes.len()=107;  @ LocalNodeId("401")

[frugalos_raft/src/storage/log_prefix/save.rs:165] [PROGRESS] SaveLogPrefixBytes: index=24; lump_id=LumpId("00000000000401030000000000000018"); bytes.len()=107;  @ LocalNodeId("401")

[frugalos_raft/src/storage/log_prefix/save.rs:153] [FINISH] SaveLogPrefixBytes @ LocalNodeId("401")

[frugalos_raft/src/storage/log_prefix/save.rs:201] [START] SaveLogPrefixIndex: index=24..25; bytes.len()=9; lump_id=LumpId("00000000000401020000000000000000");  @ LocalNodeId("401")

[frugalos_raft/src/storage/log_prefix/save.rs:224] [FINISH] SaveLogPrefixIndex @ LocalNodeId("401")

[frugalos_raft/src/storage/log_prefix/delete.rs:50] [PROGRESS] DeleteOldLogPrefixBytes: index=23; lump_id=LumpId("00000000000401030000000000000017");

[frugalos_raft/src/storage/log_prefix/save.rs:99] [FINISH] SaveLogPrefix @ LocalNodeId("401")

Here I killed the frugalos server and the following log is obtained after restarting the server:

[frugalos_raft/src/storage/log_prefix/load.rs:23] [START] LoadLogPrefix @ LocalNodeId("401")

[frugalos_raft/src/storage/log_prefix/load.rs:91] [START] LoadLogPrefixIndex: lump_id=LumpId("00000000000602020000000000000000");

[frugalos_raft/src/storage/log_prefix/load.rs:118] [FINISH] LoadLogPrefixIndex: index=Some(23..24);  @ LocalNodeId("401")

[frugalos_raft/src/storage/log_prefix/load.rs:141] [START] LoadLogPrefixBytes: prefix_index=23..24;  @ LocalNodeId("401")

[frugalos_raft/src/storage/log_prefix/load.rs:178] [FINISH] LoadLogPrefixBytes: bytes.len()=510;  @ LocalNodeId("401")

Jan 29 16:27:16.711 CRIT Node down: Other (cause; assertion failed: `left == right`; assertion failed: `(left == right)` (left: `2147550283`, right: `0`))
HISTORY:
  [0] at frugalos_raft/src/protobuf.rs:85
  [1] at frugalos_raft/src/protobuf.rs:23
  [2] at frugalos_raft/src/storage/log_prefix/load.rs:61
  [3] at frugalos_raft/src/storage/log.rs:86
  [4] at /Users/yuuya_uezato/.cargo/registry/src/github.com-1ecc6299db9ec823/raftlog-0.4.0/src/node_state/loader.rs:97
  [5] at /Users/yuuya_uezato/.cargo/registry/src/github.com-1ecc6299db9ec823/raftlog-0.4.0/src/node_state/loader.rs:23
  [6] at /Users/yuuya_uezato/.cargo/registry/src/github.com-1ecc6299db9ec823/raftlog-0.4.0/src/node_state/mod.rs:113
  [7] at /Users/yuuya_uezato/.cargo/registry/src/github.com-1ecc6299db9ec823/raftlog-0.4.0/src/replicated_log.rs:260 -- node=Node { id: NodeId("[email protected]:14279"), role: Follower, ballot: Ballot { term: Term(13), voted_for: NodeId("[email protected]:14278") } }
  [8] at frugalos_mds/src/node/node.rs:730
  [9] at frugalos_segment/src/service.rs:265
  [10] at frugalos_segment/src/service.rs:280
, node: 401, module: frugalos_segment::service:282

Explanation of the log

Let us check the following lines:

[frugalos_raft/src/storage/log_prefix/save.rs:201] [START] SaveLogPrefixIndex: index=24..25; bytes.len()=9; lump_id=LumpId("00000000000401020000000000000000");  @ LocalNodeId("401")

[frugalos_raft/src/storage/log_prefix/save.rs:224] [FINISH] SaveLogPrefixIndex @ LocalNodeId("401")

# after restarting

[frugalos_raft/src/storage/log_prefix/load.rs:91] [START] LoadLogPrefixIndex: lump_id=LumpId("00000000000602020000000000000000");

[frugalos_raft/src/storage/log_prefix/load.rs:118] [FINISH] LoadLogPrefixIndex: index=Some(23..24);  @ LocalNodeId("401")

The frugalos process put the range (24, 25) into 00000000000401020000000000000000;
however, after the restart we get the range (23, 24) from 00000000000401020000000000000000.
This is due to the presence of the journal memory buffer.
Note: LogPrefixIndex is put as an embedded put.
Therefore, after restarting the frugalos server, we unfortunately get the old range (23, 24).

Furthermore, since we issue the following delete:

[frugalos_raft/src/storage/log_prefix/delete.rs:50] [PROGRESS] DeleteOldLogPrefixBytes: index=23; lump_id=LumpId("00000000000401030000000000000017")

This may lead to an unknown value being written to the position in which the lump "00000000000401030000000000000017" had lived.

Finally, after restarting the frugalos server, there is an unknown value in LumpId("00000000000401030000000000000017").

How to fix this issue?

  1. Force a sync of the journal region to disk using this method:
    https://docs.rs/cannyls/0.9.2/cannyls/device/struct.DeviceRequest.html#method.journal_sync
  2. If we fail to decode the bytes here, simply return Ok(Async::Ready(None)).

I think the latter plan is superior in performance to the former one.
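A minimal sketch of the latter plan (the stub types stand in for the real ones; this is an assumption about the fix, not an actual patch): a decode failure is reported as "no prefix found" so that the caller retries, instead of propagating an error that takes the node down.

struct LogPrefix; // stub for the real frugalos_raft type
struct Error;     // stub error type

fn decode_log_prefix(_bytes: &[u8]) -> Result<LogPrefix, Error> {
    unimplemented!() // stands in for protobuf::decode_log_prefix
}

fn load_log_prefix(bytes: Option<&[u8]>) -> Result<Option<LogPrefix>, Error> {
    match bytes {
        Some(bytes) => match decode_log_prefix(bytes) {
            // Decoded successfully: the prefix is valid.
            Ok(prefix) => Ok(Some(prefix)),
            // Stale or partially overwritten bytes (the journal-buffer
            // scenario above): treat them as absent rather than failing.
            Err(_) => Ok(None),
        },
        None => Ok(None), // no lump found; the caller retries
    }
}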

Retrieve and visualize the cluster and bucket configuration

Add a command that outputs the cluster and bucket configuration, so that users can confirm that the cluster and buckets they created are configured as intended.

Structured output, such as HTML, would be even better.

Inconsistent use of frugalos_config::machine::Segment

track!(self.handle_patch_segment(bucket_no, segment_no, &groups[0]))?;

The above code implicitly assumes that there is only one DeviceGroup, but this contradicts the definition of frugalos_config::machine::Segment.

Here are some ideas:

  1. Fix the definition of Segment.
  2. Fix handle_patch_segment to accept multiple DeviceGroups.

Make various parameters loadable from a file

Currently, Frugalos takes the various parameters that affect its internal behavior as command-line arguments.
However, the number of parameters has grown considerably, so writing them all on the command line is cumbersome, and it should also be possible to tell afterwards which parameters a running frugalos instance was started with. For these reasons, we want to be able to load the parameters from a file.

Can't build a Docker image from `docker/hub/Dockerfile`.

I met the following error, caused by the Rust edition, when I built a Docker image from docker/hub/Dockerfile.

$ docker build -t foo/bar docker/hub
(omitted)
   Compiling rustracing v0.1.8
   Compiling thrift_codec v0.1.1
   Compiling sloggers v0.3.1
   Compiling rustracing_jaeger v0.1.9
error: Edition 2018 is unstable and only available for nightly builds of rustc.

error: Could not compile `rustracing_jaeger`.
warning: build failed, waiting for other jobs to finish...
error: failed to compile `frugalos v0.9.0`, intermediate artifacts can be found at `/tmp/cargo-install8fbnMT`

Caused by:
  build failed
The command '/bin/sh -c cargo install frugalos --version $FRUGALOS_VERSION' returned a non-zero code: 101

The error message says that rustracing_jaeger uses Edition 2018, but the version of the Rust compiler used in the Dockerfile is not compatible with Edition 2018.

I've found that this problem can be resolved by using rust:1.31.0-slim as the base image.

Possibility of High CPU utilization

cf. frugalos/raftlog#12

impl Future for DeleteOldLogEntries {
    type Item = ();
    type Error = Error;
    fn poll(&mut self) -> Poll<Self::Item, Self::Error> {
        // Passing the handle across threads is a hassle, so we log inside poll()
        track!(
            self.future
                .poll()
                .map(|result| result
                    .map(|_| info!(self.handle.logger, "[FINISH] DeleteOldLogEntries")))
        )
    }
}

The core part of self.future is the following:

.delete_range(deleted_range)

Run integration tests on travis-ci

It's desirable to execute integration tests automatically when a new PR is proposed or a PR is merged into the master branch.

In this context, I refer to it/testsuites/* as the integration tests.

Inconsistency occurs when a PUT to mds succeeds but the PUT to storage fails

This can happen because the PUT to mds and the PUT to storage are not atomic. For example, it often occurs when the client closes the connection during an HTTP PUT; it appears to be caused by fibers_http_server's Handler running only partway through when the connection is dropped. Note that when the HTTP client disconnects and the object is PUT only to mds, frugalos's log contains nothing indicating that the object state has become inconsistent.

As another case: when the fibers::sync::oneshot::Monitored of a request that has been proposed to the replicated_log and is waiting for commit is dropped, the PUT to mds is treated as failed. Depending on the timing, however, the commit of the dropped proposal (request) may have already completed; in that case the operation is still treated as having failed during mds processing, and the PUT to storage is never performed.

Add host information to `to_rpc_error`

Frugalos RPC servers kindly reply with error messages to the sender that called the RPC when internal errors occur:
https://github.com/frugalos/frugalos/blob/master/frugalos_mds/src/server.rs#L52-L61

However, such error messages do not contain information about the host where the internal error occurred.

I think it is useful to add information about which host raised the error,
and the following function is suitable for this purpose:
https://github.com/frugalos/frugalos/blob/master/frugalos_mds/src/error.rs#L124-L132

Deleting a massive number of objects using object_prefixes leads to some problems.

When we issue the following command against a frugalos cluster that has 50 million objects whose names start with frugalos:

time curl -XDELETE http://192.168.0.1:3000/v1/buckets/bucket0/object_prefixes/frugalos

we get the following result:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>502 Proxy Error</title>
</head><body>
<h1>Proxy Error</h1>
<p>The proxy server received an invalid
response from an upstream server.<br />
The proxy server could not handle the request <em><a href="http://192.168.0.1:3000/v1/buckets/bucket0/object_prefixes/frugalos">DELETE&nbsp;http://192.168.0.1:3000/v1/buckets/bucket0/object_prefixes/frugalos</a></em>.<p>
Reason: <strong>Error reading from remote server</strong></p></p>
</body></html>

real    1m0.075s
user    0m0.004s
sys     0m0.008s

I hope that the following Grafana image will be of some help.
(Grafana screenshot attached, captured 2019-01-22 20:44:34.)

Verify repair behavior

A long time has passed since repair was last verified, so verify its behavior again.
Also check whether a large volume of repairs interferes with the execution of normal processing.

Publish a new release to GitHub automatically.

https://docs.travis-ci.com/user/deployment/releases

I'm trying the process described in the document.

Plans

Deployment will be executed only when a git tag is pushed to GitHub and the tag starts with a semantic version (for example, 0.9.1).

Built binaries will be compiled under CentOS 7 to support a legacy environment and attached to a GitHub release as an asset.

Travis CI supports only Debian (trusty and precise), so Docker is used on Travis for building the binaries.

Constraints

  • The git tag attached to a released commit MUST satisfy the regular expression [0-9]\.[0-9].* (Travis CI syntax).

Build error

Congratulations on making this OSS!
I was trying to build frugalos.
However, I got an error message.
The version of rustc is 1.31.0-nightly.
With which versions does this work?
Maybe it only works on stable?

   Compiling regex v1.0.5                                                                                                                                                                                          
   Compiling jemalloc-ctl v0.2.0                                                                                                                                                                                   
   Compiling url v1.7.1                                                                                                                                                                                            
   Compiling slog-term v2.4.0                                                                                                                                                                                      
   Compiling slog-kvfilter v0.7.0                                                                                                                                                                                  
   Compiling trackable_derive v0.1.1                                                                                                                                                                               
   Compiling serde_derive v1.0.80                                                                                                                                                                                  
error: failed to run custom build command for `liberasurecode v1.0.2`                                                                                                                                              
process didn't exit successfully: `/home/utam0k/ghq/github.com/frugalos/frugalos/target/debug/build/liberasurecode-ae7bd1a0d8ba835c/build-script-build` (exit code: 101)
--- stdout
cargo:rerun-if-changed=build.rs

--- stderr
+ BUILD_DIR=/home/utam0k/ghq/github.com/frugalos/frugalos/target/debug/build/liberasurecode-df87a8d2fae3fa75/out/build
+ git clone https://github.com/ceph/gf-complete.git
Cloning into 'gf-complete'...
+ cd gf-complete/
+ git checkout a6862d1
Note: checking out 'a6862d1'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at a6862d1 Merge branch 'wip-signed-integer-overflow-cppcheck' into 'master'
+ ./autogen.sh
./autogen.sh: 2: ./autogen.sh: autoreconf: not found
thread 'main' panicked at './install_deps.sh failed: exit-code=Some(127)', /home/utam0k/.cargo/registry/src/github.com-1ecc6299db9ec823/liberasurecode-1.0.2/build.rs:28:17
note: Run with `RUST_BACKTRACE=1` for a backtrace.

warning: build failed, waiting for other jobs to finish...
error: build failed                                       

Want an RPC that GETs only the metadata

Currently in frugalos, a GET passes through the location below, and for storage without the metadata attribute it ends up reading the actual data:
https://github.o-in.dwango.co.jp/frugalfs/frugalos/blob/0c73c3ee095c464b728168de6f7f41057129d238/frugalos_segment/src/client/storage.rs#L82

Of course this behavior is fine for GET, but we want a GET_METADATA-like RPC that returns empty data when the storage is not metadata storage (it does not need to be a REST API).
The use case is limited, but the assumption is that a control flow different from GET's might be handy when running frugalos and investigating its behavior.

Support bucket deletion

The following are needed:

  • Remove the bucket from the cluster state stored by frugalos_config
  • Stop the segment nodes of the bucket (not mandatory)
  • Delete the bucket's data in storage (issuing cannyls's range delete to all devices should suffice)

Accessing the API at just the right timing while a frugalos process is being stopped causes an RPC error

Error message

{"kind":"Other","cause":"client service or server is unavailable","history":[{"module_path":"fibers_rpc::rpc_client","file":"/Users/shinya_yamaoka/.cargo/regi
stry/src/github.com-1ecc6299db9ec823/fibers_rpc-0.2.17/src/rpc_client.rs","line":152,"message":""},{"module_path":"fibers_rpc::client_side_handlers","file":"/
Users/shinya_yamaoka/.cargo/registry/src/github.com-1ecc6299db9ec823/fibers_rpc-0.2.17/src/client_side_handlers.rs","line":37,"message":""},{"module_path":"fr
ugalos_api::client","file":"/Users/shinya_yamaoka/.cargo/git/checkouts/frugalos_api-99a511e2400fd938/26f4432/src/client/mod.rs","line":27,"message":""},{"modu
le_path":"frugalos_api::client::mds","file":"/Users/shinya_yamaoka/.cargo/git/checkouts/frugalos_api-99a511e2400fd938/26f4432/src/client/mds.rs","line":226,"m
essage":"frugalos.mds.object.put"},{"module_path":"frugalos_segment::client::mds","file":"frugalos_segment/src/client/mds.rs","line":324,"message":"node=None"
},{"module_path":"frugalos::client","file":"src/client.rs","line":119,"message":""},{"module_path":"frugalosd::server","file":"frugalosd/src/server.rs","line"
:583,"message":""}]}

How to reproduce

# Set up the environment
cargo run -p frugalosd --bin frugalos -- create --id srv1 --data-dir /tmp/srv1 --addr 127.0.0.1:3201
cargo run -p frugalosd --bin frugalos -- join --id srv2 --data-dir /tmp/srv2 --addr 127.0.0.1:3202 --contact-server=127.0.0.1:3201
cargo run -p frugalosd --bin frugalos -- join --id srv3 --data-dir /tmp/srv3 --addr 127.0.0.1:3203 --contact-server=127.0.0.1:3201

# Start
cargo run -p frugalosd --bin frugalos -- start --data-dir /tmp/srv1 --http-server-bind-addr 127.0.0.1:3101
cargo run -p frugalosd --bin frugalos -- start --data-dir /tmp/srv2 --http-server-bind-addr 127.0.0.1:3102
cargo run -p frugalosd --bin frugalos -- start --data-dir /tmp/srv3 --http-server-bind-addr 127.0.0.1:3103

# Register devices
curl -X PUT -d '{"file": {"id": "dev0", "server": "srv1", "filepath":"/tmp/srv1/devices/dev0.lusf"}}' http://127.0.0.1:3101/v1/devices/dev0
curl -X PUT -d '{"file": {"id": "dev1", "server": "srv1", "filepath":"/tmp/srv1/devices/dev1.lusf"}}' http://127.0.0.1:3101/v1/devices/dev1
curl -X PUT -d '{"file": {"id": "dev2", "server": "srv1", "filepath":"/tmp/srv1/devices/dev2.lusf"}}' http://127.0.0.1:3101/v1/devices/dev2

curl -X PUT -d '{"file": {"id": "dev3", "server": "srv2", "filepath":"/tmp/srv2/devices/dev3.lusf"}}' http://127.0.0.1:3101/v1/devices/dev3
curl -X PUT -d '{"file": {"id": "dev4", "server": "srv2", "filepath":"/tmp/srv2/devices/dev4.lusf"}}' http://127.0.0.1:3101/v1/devices/dev4
curl -X PUT -d '{"file": {"id": "dev5", "server": "srv2", "filepath":"/tmp/srv2/devices/dev5.lusf"}}' http://127.0.0.1:3101/v1/devices/dev5

curl -X PUT -d '{"file": {"id": "dev6", "server": "srv3", "filepath":"/tmp/srv3/devices/dev6.lusf"}}' http://127.0.0.1:3101/v1/devices/dev6
curl -X PUT -d '{"file": {"id": "dev7", "server": "srv3", "filepath":"/tmp/srv3/devices/dev7.lusf"}}' http://127.0.0.1:3101/v1/devices/dev7
curl -X PUT -d '{"file": {"id": "dev8", "server": "srv3", "filepath":"/tmp/srv3/devices/dev8.lusf"}}' http://127.0.0.1:3101/v1/devices/dev8

curl -X PUT -d '{"virtual": {"id": "store01", "children": ["dev0", "dev1", "dev2"]}}' http://127.0.0.1:3101/v1/devices/store01
curl -X PUT -d '{"virtual": {"id": "store02", "children": ["dev3", "dev4", "dev5"]}}' http://127.0.0.1:3101/v1/devices/store02
curl -X PUT -d '{"virtual": {"id": "store03", "children": ["dev6", "dev7", "dev8"]}}' http://127.0.0.1:3101/v1/devices/store03
curl -X PUT -d '{"virtual": {"id": "root", "children": ["store01", "store02", "store03"]}}' http://127.0.0.1:3101/v1/devices/root

# Create a bucket
curl -sf -X PUT -d '{"dispersed": {"id": "vod_chunk", "device":"root", "tolerable_faults": 1, "data_fragment_count": 2}}' http://localhost:3101/v1/buckets/vod_chunk

Then stop one node and immediately run the following command:

$ for vid in `seq 10 100`; do; curl --dump-header - -X PUT http://localhost:3101/v1/buckets/vod_chunk/objects/sm$vid -d test; done

Support dynamic device configuration changes

Make already-registered devices updatable via PUT.
When that changes the mapping between devices and existing segments, also carry out data migration.

Allow beta and nightly build failures on Travis

Problem

With the current Travis configuration, the build fails if there is even one warning (RUSTFLAGS="-D warnings").
On the other hand, Rust's beta and nightly channels frequently introduce warning-level changes.
Keeping up with every such change so that stable and beta/nightly all produce zero warnings is difficult.

Proposed solution

Give stable higher priority than beta/nightly; as a premise, no warnings are tolerated on stable.
For beta/nightly, two options come to mind immediately:

  1. Allow warnings when building on beta/nightly, but still run the tests.
  2. If a warning occurs when building on beta/nightly, do not run the tests.

Validate the erasure coding configuration before using it.

Currently frugalos uses a configuration passed by a user without any validation (although the combination of data_fragments = 1 and parity_fragments = 1 is actually rejected). It is desirable to check the configuration, because an incorrect configuration causes internal errors.
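A hedged sketch of the kind of check suggested here (the function, error text, and rules are illustrative; frugalos does not currently expose such a function):

fn validate_fragments(data_fragments: u32, parity_fragments: u32) -> Result<(), String> {
    if data_fragments == 0 || parity_fragments == 0 {
        return Err("data_fragments and parity_fragments must be positive".to_string());
    }
    if data_fragments == 1 && parity_fragments == 1 {
        // The combination that is already rejected downstream today.
        return Err("data_fragments = 1 with parity_fragments = 1 is not supported".to_string());
    }
    Ok(())
}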

Run a synchronization process at restart

Even now, using Raft's mechanisms, a node that has been disconnected for a while synchronizes its state when it restarts.
For the object index held by a segment this is sufficient, but the synchronization of object data (the fragments produced by erasure coding) does not always work.
Specifically, if the leader node took a Raft snapshot between the node's stop and restart, the Raft log entries before that snapshot are never delivered directly to the restarted node, so the current approach of replaying the log at restart cannot synchronize the data of objects that were added or deleted while the node was down.
To solve this problem, at restart each segment node needs to compare the state of its object index against the state of its local storage, and then perform the necessary object-data additions (repairs) and deletions, as sketched below.
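A minimal sketch of that reconciliation (the names and types are ours, not frugalos's): diff the object index held by the segment against what local storage actually holds, then repair what is missing and delete what is extraneous.

use std::collections::HashSet;

// Returns (versions to repair, versions to delete).
fn reconcile(
    index_versions: &HashSet<u64>,  // what the object index says we should hold
    stored_versions: &HashSet<u64>, // what local storage actually holds
) -> (Vec<u64>, Vec<u64>) {
    let to_repair = index_versions.difference(stored_versions).copied().collect();
    let to_delete = stored_versions.difference(index_versions).copied().collect();
    (to_repair, to_delete)
}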

Unify the internal message format to Protocol Buffers

Currently Protocol Buffers and bincode are mixed, so we would like to unify them.
That said, we are not completely satisfied with Protocol Buffers either, so we are somewhat inclined to adopt a better format if one exists.
If we aim to replace the internal communication with gRPC in the future, Protocol Buffers is the only choice; if not, there may be no need to be particular about it.

Set an upper limit on the number of log entries sent at once

Sending a very large number of entries at once may increase the load on both the sender and the receiver, so we may want to be able to impose a limit, for example at most 100 entries per send, as sketched below.
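For illustration, a sketch of such a cap (the constant is the example value from above; the function is hypothetical):

const MAX_ENTRIES_PER_SEND: usize = 100;

// Take at most MAX_ENTRIES_PER_SEND entries for one send;
// the remainder goes out in later rounds.
fn next_batch<T>(pending: &[T]) -> &[T] {
    &pending[..pending.len().min(MAX_ENTRIES_PER_SEND)]
}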
