
Comments (13)

wallyqs commented on May 22, 2024

Thanks for filing the report; this should be fixed now via #59.

Also, here is an example of how it should now work with NFS, using the new volume declaration block:

stan:
  replicas: 2
  nats:
    url: "nats://my-nats:4222"

store:
  type: file

  # 
  # Fault tolerance group
  # 
  ft:
    group: foo

  # 
  # File storage settings.
  # 
  file:
    path: /data/stan/store

  # volume for EFS
  volume:
    mount: /data/stan
    storageSize: 1Gi
    storageClass: aws-efs
    accessModes: ReadWriteMany
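
For reference, putting the values above into a file (e.g. values.yaml, the file and release names here are just examples) should be equivalent to:

helm install stan-server nats/stan --namespace=nats -f values.yaml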


samstride commented on May 22, 2024

@wallyqs, we are on k8s 1.17.4, have the latest Helm (3.2.1), and did a helm repo update.

helm install stan-server nats/stan --namespace=nats \
  --set stan.nats.url=nats://nats-server:4222 \
  --set store.type=file \
  --set store.file.path=/var/mnt/stan/store \
  --set store.volume.mount=/var/mnt/stan \
  --set store.volume.storageSize=1Gi

NFS is mounted at /var/mnt/.

The directory /var/mnt/store was created instead of /var/mnt/stan/store. Was the stan in the path skipped? I tried a different path, e.g. /var/mnt/testing/store, and testing was skipped too.

[1] 2020/05/27 03:04:10.394385 [INF] STREAM: Starting nats-streaming-server[stan-server] version 0.17.0
[1] 2020/05/27 03:04:10.394419 [INF] STREAM: ServerID: AMm8uXPQWkFiWFGXAv6YgK
[1] 2020/05/27 03:04:10.394474 [INF] STREAM: Go version: go1.13.7
[1] 2020/05/27 03:04:10.394478 [INF] STREAM: Git commit: [f4b7190]
[1] 2020/05/27 03:04:10.403630 [INF] STREAM: Recovering the state...
[1] 2020/05/27 03:04:10.405697 [INF] STREAM: No recovered state
[1] 2020/05/27 03:04:10.657416 [INF] STREAM: Message store is FILE
[1] 2020/05/27 03:04:10.657428 [INF] STREAM: Store location: /var/mnt/testing/store
[1] 2020/05/27 03:04:10.657463 [INF] STREAM: ---------- Store Limits ----------
[1] 2020/05/27 03:04:10.657465 [INF] STREAM: Channels:                  100 *
[1] 2020/05/27 03:04:10.657468 [INF] STREAM: --------- Channels Limits --------
[1] 2020/05/27 03:04:10.657470 [INF] STREAM:   Subscriptions:          1000 *
[1] 2020/05/27 03:04:10.657472 [INF] STREAM:   Messages     :       1000000 *
[1] 2020/05/27 03:04:10.657474 [INF] STREAM:   Bytes        :     976.56 MB *
[1] 2020/05/27 03:04:10.657476 [INF] STREAM:   Age          :     unlimited *
[1] 2020/05/27 03:04:10.657479 [INF] STREAM:   Inactivity   :     unlimited *
[1] 2020/05/27 03:04:10.657481 [INF] STREAM: ----------------------------------
[1] 2020/05/27 03:04:10.657483 [INF] STREAM: Streaming Server is ready
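
To double-check what actually gets created on the volume, the mount can also be listed from inside the pod, e.g.:

kubectl exec stan-server-0 -n nats -c stan -- ls -la /var/mnt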

However, when I uninstall and try again in fault-tolerance mode:

helm install stan-server nats/stan --namespace=nats \
  --set stan.replicas=2
  --set stan.nats.url=nats://nats-server:4222 \
  --set store.type=file \
  --set store.ft.group=stan-ft-group-1 \
  --set store.file.path=/var/mnt/stan/store \
  --set store.volume.enabled=true \
  --set store.volume.mount=/var/mnt/stan \
  --set store.volume.storageSize=1Gi \
  --set store.volume.accessModes=ReadWriteMany

The chart deploys but the pod has this error:

kubectl logs -f stan-server-0 -n nats -c stan
Parse error on line 5: 'Expected a map value terminator "," or a map terminator "}", but got 's' instead.'

I cloned the repo and ran the chart locally, and still got the same error. Am I missing a mandatory value in FT mode?


wallyqs commented on May 22, 2024

I think the issue is that you are missing a backslash in the snippet you posted, so the required NATS URL is never filled in, which becomes a config error.

helm install stan-server nats/stan --namespace=nats \
  --set stan.replicas=2 # <-- missing backslash here
  --set stan.nats.url=nats://nats-server:4222 \

I've also just tried with the same settings and I'm not getting any errors...

helm install stan-server nats/stan \
  --set stan.replicas=2 \
  --set stan.nats.url=nats://my-nats:4222 \
  --set store.type=file \
  --set store.ft.group=stan-ft-group-1 \
  --set store.file.path=/var/mnt/stan/store \
  --set store.volume.enabled=true \
  --set store.volume.mount=/var/mnt/stan \
  --set store.volume.storageSize=100Gi \
  --set store.volume.storageClass=aws-efs \
  --set store.volume.accessModes=ReadWriteMany


samstride commented on May 22, 2024

@wallyqs, oh no, can't believe I missed that 😞. Thanks.

Sorry, another question around FT mode. I have an on-prem NFS share mounted at /var/mnt/ on our k8s cluster.

When I start in FT mode with a fresh NFS mount on k8s, I now run into this situation:

kubectl get pods -n nats
NAME              READY   STATUS    RESTARTS   AGE
nats-server-0     3/3     Running   0          41m
nats-server-box   1/1     Running   0          41m
stan-server-0     2/2     Running   0          30s
stan-server-1     0/2     Pending   0          2m13s

There is only 1 PV in the cluster but 2 PVCs are created, stan-server-pvc-stan-server-0 and stan-server-pvc-stan-server-1 (I was under the impression that the pods could share the 1 PVC). The standby server does not seem to start, and the logs show this message:

kubectl logs -f stan-server-0 -n nats -c stan        
[1] 2020/05/28 00:42:27.178032 [INF] STREAM: Starting nats-streaming-server[stan-server] version 0.17.0
[1] 2020/05/28 00:42:27.178055 [INF] STREAM: ServerID: Ay94ZUsAjbIqsdAEmmU0uj
[1] 2020/05/28 00:42:27.178057 [INF] STREAM: Go version: go1.13.7
[1] 2020/05/28 00:42:27.178059 [INF] STREAM: Git commit: [f4b7190]
[1] 2020/05/28 00:42:27.186069 [INF] STREAM: Starting in standby mode
[1] 2020/05/28 00:42:34.673856 [INF] STREAM: ft: unable to get store lock at this time, going back to standby
[1] 2020/05/28 00:42:42.097864 [INF] STREAM: ft: unable to get store lock at this time, going back to standby
[1] 2020/05/28 00:42:49.521823 [INF] STREAM: ft: unable to get store lock at this time, going back to standby
[1] 2020/05/28 00:42:56.945823 [INF] STREAM: ft: unable to get store lock at this time, going back to standby
[1] 2020/05/28 00:43:11.793883 [INF] STREAM: ft: unable to get store lock at this time, going back to standby
[1] 2020/05/28 00:43:34.065981 [INF] STREAM: ft: unable to get store lock at this time, going back to standby
[1] 2020/05/28 00:44:11.185989 [INF] STREAM: ft: unable to get store lock at this time, going back to standby
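
In case it helps, the pending claim can be inspected to see why it does not bind, e.g.:

kubectl describe pvc stan-server-pvc-stan-server-1 -n nats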

Are there any extra steps that I need to perform for an on-prem NFS setup?

Cheers.


wallyqs commented on May 22, 2024

@samstride, are you able to use --set store.volume.accessModes=ReadWriteMany with your NFS? Fault tolerance mode requires a shared filesystem that supports that access mode...
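
One way to confirm from the cluster side is to check the access modes reported on the volumes, e.g.:

kubectl get pv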


samstride commented on May 22, 2024

Yup, Rancher has that option:

[screenshot: Rancher volume settings showing the ReadWriteMany access mode option]


wallyqs commented on May 22, 2024

I see... maybe you need to set a special storage class? For example, on AWS for EFS you need to set:

--set store.volume.storageClass=aws-efs \
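
I'm not sure what the right class would be on Rancher, but the storage classes available in the cluster can be listed with:

kubectl get storageclass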


samstride commented on May 22, 2024

@wallyqs, just to be sure, is it normal that 2 PVCs get created? I only have 1 PV.

kubectl get pvc
NAME                            STATUS    VOLUME           CAPACITY   ACCESS MODES   STORAGECLASS   AGE
stan-server-pvc-stan-server-0   Bound     nats-streaming   1Gi        RWX                           3m15s
stan-server-pvc-stan-server-1   Pending                                                             3m13s


wallyqs commented on May 22, 2024

I think so... though I'm not sure about Rancher. For example, with EFS I'm getting a PVC per pod even though they all share the same EFS:

kubectl get pvc
NAME                              STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
efs                               Bound    pvc-fc5adf4b-a0b5-11ea-a784-0286c60df32e   1Mi        RWX            aws-efs        6m16s
stan-example-pvc-stan-example-0   Bound    pvc-54422d31-a0b6-11ea-9447-0a7c1e06ba6e   2Gi        RWX            aws-efs        3m49s


samstride commented on May 22, 2024

@wallyqs, I'm going to try a few things around FT and maybe even a clustered mode and see how things go.

I'd like to thank you for the assistance so far. Really appreciate it.

Cheers.


samstride commented on May 22, 2024

@wallyqs, hey, I finally got it to work. I have a question though: if a PVC is created per pod, how does replication in FT mode work? Or is there no replication?

Example: fire up stan in FT mode with 3 replicas. This seems to create 3 PVCs. Then send a message; the subject folder is created in the PVC corresponding to the active server. Then kubectl delete pod stan-server-0. stan-server-1 is now the active server, but the subject folder in the PVC corresponding to stan-server-0 is not replicated to the PVC of stan-server-1. When a subscriber comes online, the subject folder is created but no messages are replayed. Is this how it's supposed to work in FT mode?

_, _ = stanConnection.Subscribe("subject-1", func(message *stan.Msg) {
    // some processing
    if err != nil {
        log.Println(err)
    }
    _ = message.Ack()
}, stan.DurableName("durable-1"), stan.SetManualAckMode(), stan.MaxInflight(1))

Also, if I would like to contribute to the documentation of running stan on prem with an NFS server, would it be useful and how do I go about doing it?


wallyqs commented on May 22, 2024

Thank you @samstride, docs for an NFS K8S on-prem setup would be great to have. You can send a PR for that here: https://github.com/nats-io/nats.docs/tree/master/nats-on-kubernetes
There is no replication in fault tolerance mode, since it relies on the shared filesystem instead, so it should be possible for all 3 replicas (1 active and 2 standby) to share the messages on the same filesystem.
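
A quick sanity check here (using the mount path from your install) is that the replicas all see the same store directory on the shared volume, for example:

kubectl exec stan-server-0 -n nats -c stan -- ls /var/mnt/stan/store
kubectl exec stan-server-1 -n nats -c stan -- ls /var/mnt/stan/store

If the volume is really shared, both should list the same channel directories.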


samstride commented on May 22, 2024

@wallyqs, PR for the NFS docs submitted.

Ok, got it, no replication in FT mode. However, when subscribers are offline and the active server changes, messages are not being replayed. Example:

  • Publisher sends messages 1, 2, 3 when stan-server-0 is active
  • Subscriber consumes messages 1, 2, 3 and goes offline
  • Publisher sends messages 4, 5, 6 when stan-server-0 is active
  • An event causes stan-server-1 to become the active server; stan-server-0 is now in standby mode.
  • Subscriber comes online and does not receive messages 4, 5, 6

Am I doing something wrong during consumption or is this expected behaviour?

Thanks.

