Comments (6)
This SPOF behavior only occurs if I do not enable caching as an argument to my volume mount command.
So the trade-off seems to be: with caching, HA but slower propagation of file changes; without caching, a SPOF but near-synchronous propagation of file changes.
from infinit.
Hi @tarlano!
Sorry it took me so long to reply. I just asked @mefyl about the consensus, he's the expert.
However, I'm concerned with the last error (the handshake). When you say "3 devices cluster", how did you bootstrap every single node? Did you use import / export or pull / fetch? It's virtually impossible to have the id collision unless you manually copied your infinit's home.
from infinit.
Hi @tarlano. Here's my suspicion from what I see: your replication factor might be 1, in which case every block only has one copy, and shutting down the node that has the root block will indeed make the network unusable. Can you please check with infinit network export $NAME, looking for the replication-factor key?
$ infinit network export infinit/company | jq '.consensus."replication-factor"'
3
Regarding the fact that it works with caching, I suppose you start the volume, it works, then you shut down the node with the root block, and it keeps working? That would be normal, since the other nodes cached the root block, so the volume will stay up until the cache is invalidated. If you start both other nodes without ever giving them access to the third one with the root block, I predict it would not work.
But those are just guesses, let's first check that replication factor :)
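To make the reasoning about the replication factor concrete, here is a minimal sketch. It assumes that a block stays available as long as a strict majority of its replicas is reachable, as in classic Paxos; whether Infinit's consensus behaves exactly like this is an assumption, and tolerated_failures is a hypothetical helper, not part of the infinit tooling.

```python
def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be offline while a strict majority is still reachable.

    Assumption: availability requires a majority quorum (classic Paxos).
    """
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    majority = replication_factor // 2 + 1
    return replication_factor - majority


for factor in (1, 2, 3, 5):
    print(
        f"replication-factor={factor}: "
        f"majority={factor // 2 + 1}, "
        f"tolerated failures={tolerated_failures(factor)}"
    )
```

Under that assumption a factor of 1 tolerates no failure at all, which would match a single node holding the root block being a SPOF, while a factor of 3 should survive the loss of one replica of any given block.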
from infinit.
Sorry for the late reply. I am not copying over infinit home and I am also not using the hub.
I currently have a 17 node cluster using kouncil and a caching setup.
The current caching setup is working with a replication factor of 3. Since I would like changes to propagate more synchronously, I will configure another cluster later today that uses kelips and no caching, to get the information you asked for.
Here are the details of my current caching cluster.
I am deploying four exported files via a deployment RPM. The exported files are for the user, silo, network, and volume respectively. I then import the four files with a post-install bash script in the RPM.
All devices are client/server devices, and all of them share the same user, an admin with read+write access, to get around any passport issues.
#!/bin/bash
# Export INFINIT_HOME and start from a clean slate.
#
export INFINIT_HOME="/usr/share/infinit"
rm -rf "$INFINIT_HOME"
mkdir -p "$INFINIT_HOME"
# Configure FUSE.
#
/bin/cp -f /usr/share/fuse/fuse.conf /etc/fuse.conf
# Import the default infinit user.
#
infinit user import --input /etc/infinit.d/infinit.user
# Import the default 40GB infinit storage silo.
#
infinit silo import --input /etc/infinit.d/infinit.silo
# Import the default infinit network and link it to the local silo.
#
infinit network import --as infinit-user --input /etc/infinit.d/infinit.network
infinit network link --as infinit-user --name infinit-network --storage infinit-silo
# Import the default infinit volume.
#
infinit volume import --as infinit-user --input /etc/infinit.d/infinit.volume
Here is the infinit.silo that I import on each device.
cat infinit.silo
{"capacity":40000000000,"name":"infinit-silo","path":"/usr/share/infinit/.local/share/infinit/filesystem/blocks/infinit-silo","type":"filesystem"}
Here is the infinit.volume that I import on each device.
cat infinit.volume
{"block_size":1024,"mount_options":{"fuse_options":["allow_other"]},"name":"infinit-user/infinit-volume","network":"infinit-user/infinit-network"}
After the deployment RPM is installed I start the node with the following config
environment=INFINIT_HOME="/usr/share/infinit",INFINIT_RDV=""
command=/opt/infinit/bin/infinit volume mount --as infinit-user --name infinit-volume --mountpoint /opt/ecs --allow-root-creation --port 13000 --cache --endpoints-file /etc/infinit.d/templates/endpoints --peer /etc/infinit.d/templates/peeraddresses.conf --port-file /etc/registrar/listenport
Note that the peer file (peeraddresses.conf), passed via the --peer argument, is generated using consul template and contains the IP addresses of all the other devices/peers in the network cluster. Here is an example with my IPs obfuscated.
cat /etc/infinit.d/templates/peeraddresses.conf
xxx.xxx.118.72:13000
xxx.xxx.118.73:13000
xxx.xxx.118.85:13000
xxx.xxx.118.86:13000
xxx.xxx.118.87:13000
xxx.xxx.117.220:13000
xxx.xxx.117.222:13000
xxx.xxx.117.223:13000
xxx.xxx.68.223:13000
xxx.xxx.68.224:13000
xxx.xxx.68.225:13000
xxx.xxx.68.226:13000
xxx.xxx.68.217:13000
xxx.xxx.68.218:13000
xxx.xxx.68.219:13000
xxx.xxx.68.220:13000
xxx.xxx.68.221:13000
The endpoints file (endpoints) just contains the local device's endpoints.
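Since the peer list is supposed to contain only the other devices, one sanity check is to confirm that none of its entries is also a local endpoint. The sketch below assumes both files hold one host:port entry per line; the peer file is shown in that form above, but the format of the endpoints file is an assumption here, and parse_endpoints, self_references, and the 192.0.2.x addresses are hypothetical illustrations.

```python
from __future__ import annotations


def parse_endpoints(text: str) -> list[tuple[str, int]]:
    """Parse 'host:port' lines, skipping blank lines and '#' comments."""
    endpoints = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        host, _, port = line.rpartition(":")
        if not host or not port.isdigit():
            raise ValueError(f"not a host:port entry: {line!r}")
        endpoints.append((host, int(port)))
    return endpoints


def self_references(peers_text: str, local_text: str) -> list[tuple[str, int]]:
    """Return peer entries that also appear among the local endpoints."""
    local = set(parse_endpoints(local_text))
    return [peer for peer in parse_endpoints(peers_text) if peer in local]


# Hypothetical data using documentation-range addresses.
peers = "192.0.2.10:13000\n192.0.2.11:13000\n192.0.2.12:13000\n"
local = "192.0.2.11:13000\n"
print(self_references(peers, local))
```

A non-empty result would mean the node is being told to connect to itself, which is one possible way to end up with a peer reporting the same id.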
Here is the output of the network export from one of the running devices. As you can see, the replication-factor is 3. I have also obfuscated the keys.
infinit network export --as infinit-user --name infinit-network | python -m json.tool
{
    "admin_keys": {
        "group_r": [],
        "group_w": [],
        "r": [],
        "w": [
            {
                "rsa": "MII..<SNIP>"
            }
        ]
    },
    "consensus": {
        "eviction-delay": "10min",
        "replication-factor": 3,
        "type": "paxos"
    },
    "encrypt_options": {
        "encrypt_at_rest": true,
        "encrypt_rpc": true,
        "validate_signatures": true
    },
    "name": "infinit-user/infinit-network",
    "overlay": {
        "eviction_delay": "200min",
        "rpc_protocol": "tcp",
        "type": "kouncil"
    },
    "owner": {
        "rsa": "MII....<SNIP>"
    },
    "peers": [],
    "version": "0.8.0"
}
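With 17 nodes it may be easier to check these settings with a script than by eye. The following is a rough sketch that reads an exported descriptor and pulls out the fields discussed in this thread; summarize_network is a hypothetical helper, and the embedded JSON is a trimmed copy of the export above, not a complete descriptor.

```python
import json


def summarize_network(descriptor_text: str) -> dict:
    """Extract the consensus and overlay settings from an exported descriptor."""
    descriptor = json.loads(descriptor_text)
    consensus = descriptor.get("consensus", {})
    overlay = descriptor.get("overlay", {})
    return {
        "name": descriptor.get("name"),
        "consensus_type": consensus.get("type"),
        "replication_factor": consensus.get("replication-factor"),
        "overlay_type": overlay.get("type"),
    }


# Trimmed copy of the exported descriptor, keeping only the relevant keys.
example = """
{
    "consensus": {"eviction-delay": "10min", "replication-factor": 3, "type": "paxos"},
    "name": "infinit-user/infinit-network",
    "overlay": {"eviction_delay": "200min", "rpc_protocol": "tcp", "type": "kouncil"},
    "version": "0.8.0"
}
"""
print(summarize_network(example))
```

Running this against the export from every device would show quickly whether any node ended up with a different replication factor or overlay type.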
Thanks for your help!
Tony
from infinit.
I want to update you both.
I didn't have the chance to switch the configuration, but even with the caching config I wrote about previously, I am still seeing the following error message in the error log for the infinit volume mount .... command detailed in my last reply.
[infinit.model.doughnut.Dock ] [dht::Dock::Connection(0x44a4700, 0x5c07897400 -> 0x45ad740200): tcp://172.17.0.1:13000] key exchange failed with 0x45ad74027ea7fdc02749965925f7685585fc152996d5000598e46437f3faca00: Handshake failed: incoming peer has same id as us: 0x5c0789748d5e58bd39aa288ea159a41b925a0bdc55d3d5ec5b129d957ac0b500
The interesting thing about 172.17.0.1 is that it's the bridge IP for the docker bridge interface. I will try to exclude this interface to see if that has any effect.
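To see how often this happens and on which addresses, the error lines can be scanned with a small script. This is only a sketch: the regular expression is fitted to the single log line quoted above, the 172.17.0.0/16 range is Docker's default bridge subnet (it can be reconfigured), and parse_same_id_error is a hypothetical helper.

```python
import ipaddress
import re

# Docker's default bridge network (docker0); an assumption if the daemon was reconfigured.
DOCKER_DEFAULT_BRIDGE = ipaddress.ip_network("172.17.0.0/16")

# Fitted to the one log line quoted in this thread; other log formats may differ.
SAME_ID_RE = re.compile(
    r"\w+://(?P<host>[0-9.]+):(?P<port>\d+)\]"
    r".*incoming peer has same id as us: (?P<node_id>0x[0-9a-f]+)"
)


def parse_same_id_error(line: str):
    """Return endpoint details for a 'same id as us' error, or None if no match."""
    match = SAME_ID_RE.search(line)
    if match is None:
        return None
    host = ipaddress.ip_address(match.group("host"))
    return {
        "host": str(host),
        "port": int(match.group("port")),
        "node_id": match.group("node_id"),
        "on_docker_bridge": host in DOCKER_DEFAULT_BRIDGE,
    }


log_line = (
    "[infinit.model.doughnut.Dock ] [dht::Dock::Connection(0x44a4700, "
    "0x5c07897400 -> 0x45ad740200): tcp://172.17.0.1:13000] key exchange failed "
    "with 0x45ad74027ea7fdc02749965925f7685585fc152996d5000598e46437f3faca00: "
    "Handshake failed: incoming peer has same id as us: "
    "0x5c0789748d5e58bd39aa288ea159a41b925a0bdc55d3d5ec5b129d957ac0b500"
)
print(parse_same_id_error(log_line))
```

If every match reports an address on the bridge subnet, that would support the idea that the node is reaching itself through the docker interface.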
Have you seen anything along these lines in the past?
Tony
from infinit.
Can you also provide some insight into whether the use of my general infinit-user could be the issue?
As I said, reading and writing files across the filesystem on the networks is completely functional, but maybe the issue is in the DHT/network layer?
Tony
from infinit.