storageos / storageos.github.io
Public documentation for StorageOS, persistent storage for Docker and Kubernetes
Home Page: https://docs.storageos.com
Hi,
I am trying to install StorageOS on my Kubernetes cluster, which runs on CoreOS.
Looking at the logs, the one line that stands out is: panic: runtime error: slice bounds out of range
time="2019-01-02T12:22:39Z" level=info msg="by using this product, you are agreeing to the terms of the StorageOS Ltd. End User Subscription Agreement (EUSA) found at: https://eusa.storageos.com" module=command
time="2019-01-02T12:22:39Z" level=info msg=starting address=********* hostname=clust1-worker-1 id=856a20dd-1e6f-aabf-a152-fa3c4f5dc5a5 join="*********,*********" module=command version="StorageOS 1.0.2 (13c3612), built: 2018-12-07T140018Z"
panic: runtime error: slice bounds out of range
goroutine 1 [running]:
code.storageos.net/storageos/control/vendor/github.com/aws/aws-sdk-go/aws/ec2metadata.(*EC2Metadata).Region(0xc4200c4650, 0x0, 0xc420296760, 0xc4208e9080, 0xc4204bcc20)
/go/src/code.storageos.net/storageos/control/vendor/github.com/aws/aws-sdk-go/aws/ec2metadata/api.go:122 +0xa3
code.storageos.net/storageos/control/integration/iaas/ec2.(*Provider).GetZone(0xc4202997e0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xffffffffffffffff, 0x20002)
/go/src/code.storageos.net/storageos/control/integration/iaas/ec2/provider.go:112 +0x59
code.storageos.net/storageos/control/controlplane.applyProviderMetadata(0xc4201df900, 0x1fc0600, 0xc4202997e0, 0x0, 0xc4204416b0)
/go/src/code.storageos.net/storageos/control/controlplane/node.go:234 +0x1ea
code.storageos.net/storageos/control/controlplane.applyMetadata(0xc4201df900, 0x9, 0x1f94b90)
/go/src/code.storageos.net/storageos/control/controlplane/node.go:229 +0x66
code.storageos.net/storageos/control/controlplane.writeNodeBootstrapConfig(0x1fc07c0, 0xc42037b620, 0xc420279900, 0xc42000e178, 0xc420440060)
/go/src/code.storageos.net/storageos/control/controlplane/node.go:194 +0xd60
code.storageos.net/storageos/control/controlplane.Create(0xc420279900, 0xc4204d0fe0, 0xc42000e178, 0x1f97820, 0xc4200c4a38, 0x0, 0x1f8a7c8, 0x7)
/go/src/code.storageos.net/storageos/control/controlplane/server.go:182 +0xb52
code.storageos.net/storageos/control/command/server.(*Command).Run(0xc4201968c0, 0xc4200ca010, 0x0, 0x0, 0xc4204d09c0)
/go/src/code.storageos.net/storageos/control/command/server/command.go:122 +0x8f4
code.storageos.net/storageos/control/vendor/github.com/mitchellh/cli.(*CLI).Run(0xc4201b0a00, 0xc4201b0a00, 0xc4204d0a00, 0x0)
/go/src/code.storageos.net/storageos/control/vendor/github.com/mitchellh/cli/cli.go:255 +0x1eb
main.realMain(0xc4200a6058)
/go/src/code.storageos.net/storageos/control/main.go:38 +0x126
main.main()
/go/src/code.storageos.net/storageos/control/main.go:26 +0x22
Any ideas what could be causing this?
Thanks,
Jamie
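One plausible reading of the trace above, not confirmed in this thread: the panic is raised inside the vendored ec2metadata Region call, reached through the EC2 provider's GetZone. A helper that trims the trailing zone letter with zone[:len(zone)-1] panics in exactly this way when the metadata lookup returns an empty string, as could happen on a host that is not an EC2 instance. The Go sketch below reproduces that failure mode and shows a guarded variant. The names unguardedRegion, regionFromZone and panicMessage are hypothetical and are not the StorageOS or AWS SDK code.

```go
package main

import (
	"errors"
	"fmt"
)

// unguardedRegion is a hypothetical helper (not the SDK source). It drops the
// trailing zone letter, e.g. "us-west-2a" -> "us-west-2". For an empty string
// the upper slice bound becomes -1, which makes the Go runtime panic.
func unguardedRegion(zone string) string {
	return zone[:len(zone)-1]
}

// regionFromZone is the guarded variant: it returns an error instead of
// panicking when the metadata lookup produced nothing.
func regionFromZone(zone string) (string, error) {
	if zone == "" {
		return "", errors.New("empty availability zone")
	}
	return zone[:len(zone)-1], nil
}

// panicMessage runs f and returns the recovered panic text, or "" if f
// returned normally.
func panicMessage(f func()) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	f()
	return msg
}

func main() {
	fmt.Println(regionFromZone("us-west-2a"))
	fmt.Println(regionFromZone(""))
	fmt.Println(panicMessage(func() { unguardedRegion("") }))
}
```

If this reading is right, the crash depends on what the metadata lookup returns on the node rather than on the Kubernetes setup itself.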
Hi all,
We are having an issue on more than one installation of StorageOS.
Installation
We followed this installation guide:
https://docs.storageos.com/docs/platforms/rancher/index
Infrastructure
Clean Rancher installation (2.2.7) on bare metal on both platforms. One has a single worker node, the other has 3 worker nodes. All the required network ports have been opened.
Configuration
Issue
After creating a PVC from the Rancher UI and using it for a while, I created a new one in a different namespace. Rancher could not finalise the deployment because the PVC was not ready. I checked the StorageOS UI and noticed that the volume had been created with status healthy.
I then removed the deployment from Rancher and deleted the volume from the StorageOS UI. Its status then changed to "decommissioned".
In a last attempt, I also rebooted the cluster. This made the first volume unavailable as well.
I tried to disable the encryption rule, but nothing changed. I checked the logs and I am flooded with these messages (at a rate of one per second):
time="2019-09-03T08:19:10.428363523Z" level=error msg="dataplane configuration failed, could not get volume crypto key" action=create error="secrets \"vol-key.d07c42aa-05e4-2a0a-1e59-6707614ef584\" not found" id=d07c42aa-05e4-2a0a-1e59-6707614ef584-162625 inode=162625 module=director-volume replicas="[]" revision=281506
time="2019-09-03T08:19:10.428469525Z" level=warning msg="partial dataplane configuration applied - will retry" error="diff process encoutered errors: secrets \"vol-key.d07c42aa-05e4-2a0a-1e59-6707614ef584\" not found" module=statesync
Do you have any idea how to troubleshoot or fix this?
Thanks a lot in advance,
Fabio
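Since the log is flooded with the same error, it may help to reduce it to the distinct secret names it reports as missing, so that each one can be checked in the cluster. The Go sketch below does that for lines shaped like the two entries above. The function name missingSecrets and the regular expression are illustrative assumptions based only on those entries; this is not a StorageOS tool.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
	"sort"
)

// secretNotFound matches the escaped form seen in the log lines above:
//   secrets \"vol-key.<id>\" not found
// The pattern is an assumption derived from those two entries only.
var secretNotFound = regexp.MustCompile(`secrets \\"([^"\\]+)\\" not found`)

// missingSecrets returns the distinct secret names reported as not found,
// in sorted order.
func missingSecrets(lines []string) []string {
	seen := map[string]bool{}
	for _, line := range lines {
		for _, m := range secretNotFound.FindAllStringSubmatch(line, -1) {
			seen[m[1]] = true
		}
	}
	names := make([]string, 0, len(seen))
	for name := range seen {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	// Read log lines from standard input, for example piped from a log file.
	var lines []string
	sc := bufio.NewScanner(os.Stdin)
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024) // allow long log lines
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	for _, name := range missingSecrets(lines) {
		fmt.Println(name)
	}
}
```

In the entries above, the name after the vol-key. prefix is the volume ID, so the output names the exact secret each affected volume is waiting for.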
I don't see where you create the StorageClass, and it is not being created in my cluster when I follow the documentation. What did I miss?
The quick install script https://docs.ondat.io/sh/deploy-etcd.sh for etcd on Kubernetes, found on https://docs.ondat.io/docs/prerequisites/etcd/, doesn't work on Kubernetes 1.20 because the annotation service.alpha.kubernetes.io/tolerate-unready-endpoints: "true" in the etcd service definition is deprecated (replaced by the spec field publishNotReadyAddresses: true).
The "init container" link on https://docs.storageos.com/docs/install/kubernetes/ links to the private repo.
On the page /_docs/install/docker.md, the link to the Consul Installation page is broken.
Perhaps instead of https://docs.storageos.com/docs/install/consul.html it should be https://docs.storageos.com/docs/install/kvstore.html?
After starting up a cluster of VMs using the supplied Vagrantfile, the VMs are unable to communicate properly, and the container restarts constantly because it is reported as unhealthy.
Enabling LIO as per the instructions at https://docs.storageos.com/docs/reference/os_support fixes things. Does this need to be added as part of the provisioning script?
Hello.
What happened:
storageos volume create test-volume
storageos volume mount test-volume /test
What you expected to happen:
I expected the StorageOS volume to work, or the StorageOS CLI to report what is wrong.
How to reproduce it (as minimally and precisely as possible):
I'm not sure since the problem is probably caused either by hardware, my Kubernetes cluster or another external factor. Please see below for more details about the environment.
Anything else we need to know?:
I run a 2-node on-premises Kubernetes cluster. The scheduler does not schedule any pods on the master node, so the only active node is a single worker node. Thus, the StorageOS DaemonSet contains one pod:
NAME        DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
storageos   1         1         1       1            1           <none>          2d
Also, when I run storageos version on the master node, it fails with the error Get http://storageos-cluster/version: failed to dial all known cluster members, (127.0.0.1:5705), because the master node itself does not run a StorageOS pod. So I run all storageos CLI commands from the worker node shell.
storageos cluster health does not display any issues:
root@node1 ~ # storageos cluster health
NODE    ADDRESS        CP_STATUS   DP_STATUS
node1   159.69.116.1   Healthy     Healthy
storageos volume ls displays the volume status as active:
root@node1 ~ # storageos volume ls
NAMESPACE/NAME   SIZE   MOUNT   SELECTOR   STATUS   REPLICAS   LOCATION
default/test1    15GB                      active   0/0        node1 (healthy)
While playing with the StorageOS interactive demos on Katacoda (http://play.storageos.com), I noted that StorageOS creates a folder in /var/lib/storageos/volumes for every created volume.
On my machine, this folder is empty. Note the drwxrwxrwx permissions on the volumes folder. I set them to rule out permissions as the reason StorageOS is unable to create a volume.
root@node1 ~ # ls -la /var/lib/storageos/volumes/
total 8
drwxrwxrwx 2 root root 4096 Aug 24 00:20 .
drwxr-xr-x 8 root root 4096 Aug 24 19:49 ..
Also, I found that after calling storageos volume create, /var/lib/storageos/logs/storageos.log contains a number of log entries, one of which has level=error:
time="2018-08-24T22:38:54Z" level=error msg="filesystem client: presentation create failed" action=create error="<nil>" module=statesync reason="Create refused by validator" volume_uuid=0a8f6232-c0f5-9ed1-a213-5669ed9533ae
time="2018-08-24T22:38:54Z" level=info msg="virtual bool FsConfig::PresentationEventSemantics::Validate(event_type): Not adding pr_filename '0a8f6232-c0f5-9ed1-a213-5669ed9533ae' for volume 238534 - already exists for volume 238534 category=fscfg level=warn" category=fscfg module=supervisor
time="2018-08-24T22:38:54Z" level=info msg="validator 'device_validator' rejected Event{type CREATE} category=libcfg level=warn" category=libcfg module=supervisor
I found a Stack Overflow discussion that seems relevant (https://stackoverflow.com/questions/51292759/rancher-kubernetes-and-storageos-persistent-storage-volume-mount-issue), although that thread does not contain a solution.
Environment:
Kubernetes version (use kubectl version):
root@master ~ # kubectl version
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.2", GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239", GitTreeState:"clean", BuildDate:"2018-08-07T23:08:19Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.2", GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239", GitTreeState:"clean", BuildDate:"2018-08-07T23:08:19Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
# A successful run is proof of mount propagation enabled
docker run -it --rm -v /mnt:/mnt:shared busybox sh -c /bin/date
OS (e.g. from /etc/os-release):
Debian GNU/Linux 9 (stretch) on both nodes.
Kernel (e.g. uname -a):
Linux master 4.9.0-7-amd64 #1 SMP Debian 4.9.110-1 (2018-07-05) x86_64 GNU/Linux
Linux node1 4.9.0-7-amd64 #1 SMP Debian 4.9.110-1 (2018-07-05) x86_64 GNU/Linux
Please let me know if any other details could be useful.
I also posted the same issue on the Kubernetes GitHub, in case the problem is caused by Kubernetes rather than StorageOS.
Thanks.