Comments (28)

AFrangopoulos avatar AFrangopoulos commented on September 18, 2024 3

@sr258 I used both meteor-down and meteor-load-testing by allaning. I found the latter to be more consistent in our tests. Our main concern was 'writes' & having 20k+ concurrent users. Scaling horizontally just wasn't cutting it -- diminishing returns the more you scaled, so it was not cost-effective. Also, it seems RC has a LOT of packages and there is a continuous fight for CPU.

Our goal was to reach 1k writes/second through scaling, and it couldn't get there within reason ($/# boxes needed). I think somewhere on these forums pertaining to RC/Meteor I have a detailed writeup of the several setups we tried to get the best performance.

In the end, we concluded this product couldn't scale to what we needed. It seems RedisOplog would be a great candidate to fix the scaling issues. GL to you

from docs-old.

geekgonecrazy avatar geekgonecrazy commented on September 18, 2024 2

Just like with the MONGO_URL, you will want to add all nodes into the MONGO_OPLOG_URL in case a primary election happens. But it will only actually be tailing the oplog on the primary node.
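
As a rough illustration of listing every node in both variables, the sketch below builds the two URLs from one shared host list. The function name build_mongo_urls, the default port, and the example hosts and replica set name (mongo1..mongo3, rs0, as used elsewhere in this thread) are assumptions for illustration, not something Rocket.Chat provides.

```python
def build_mongo_urls(hosts, replica_set, database="rocketchat", port=27017):
    """Return (MONGO_URL, MONGO_OPLOG_URL) that both list every replica set member.

    Hypothetical helper: it only formats strings, it does not contact MongoDB.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    # Same seed list for both URLs, so either one survives a primary election.
    seed_list = ",".join(f"{host}:{port}" for host in hosts)
    mongo_url = f"mongodb://{seed_list}/{database}?replicaSet={replica_set}"
    # The oplog is read from the "local" database.
    oplog_url = f"mongodb://{seed_list}/local?replicaSet={replica_set}"
    return mongo_url, oplog_url


if __name__ == "__main__":
    mongo_url, oplog_url = build_mongo_urls(["mongo1", "mongo2", "mongo3"], "rs0")
    print("MONGO_URL=" + mongo_url)
    print("MONGO_OPLOG_URL=" + oplog_url)
```

Generating both values from one list is just a way to keep them from drifting apart when members are added later.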

geekgonecrazy avatar geekgonecrazy commented on September 18, 2024 1

Also, as far as the naming used in the connection string goes: you will want to make sure to use the same names that the nodes identify themselves as in the replicaset.

localhost, for example, is something I would avoid using when you have a multiple-node mongo replicaset.

If you attempt to connect and the primary is localhost, it will attempt to connect to localhost and address it as such. If internally it's referencing itself as something else, you will have issues.

Also, the other mongo nodes will try to look up localhost when trying to talk to this peer, which will of course always resolve to the node doing the lookup, so it will cause all kinds of issues.

tl;dr Good practice to always use a hostname that's reachable by all other nodes in the replicaset
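
A minimal sketch of how one might check a connection string for this before deploying. The helper names (hosts_in_mongo_url, loopback_hosts) and the set of names treated as loopback are assumptions; the parsing is deliberately simple and only handles the mongodb:// seed-list form quoted in this thread.

```python
# Names treated as "only reachable from the node itself" (an assumption).
LOOPBACK_NAMES = {"localhost", "127.0.0.1", "::1"}


def hosts_in_mongo_url(url):
    """Return the host names from the seed list of a mongodb:// URL."""
    prefix = "mongodb://"
    if not url.startswith(prefix):
        raise ValueError("expected a mongodb:// URL")
    rest = url[len(prefix):]
    # Keep only the part before the database path or the options.
    authority = rest.split("/", 1)[0].split("?", 1)[0]
    # Drop "user:password@" credentials if present.
    authority = authority.rsplit("@", 1)[-1]
    hosts = []
    for entry in authority.split(","):
        if entry.startswith("["):
            # Bracketed IPv6 literal, e.g. [::1]:27017
            hosts.append(entry[1:entry.index("]")])
        else:
            hosts.append(entry.split(":", 1)[0])
    return hosts


def loopback_hosts(url):
    """Return the seed-list hosts that other replica set members cannot reach."""
    return [h for h in hosts_in_mongo_url(url) if h.lower() in LOOPBACK_NAMES]


if __name__ == "__main__":
    url = ("mongodb://localhost:27017,mongochat02:27017,mongochat03:27017"
           "/rocketchat?replicaSet=001-rs&readPreference=primaryPreferred&w=majority")
    print(loopback_hosts(url))
```

This only inspects the string. It does not verify that the remaining names actually match what each member calls itself in the replica set configuration.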

richardwlu avatar richardwlu commented on September 18, 2024 1

@geekgonecrazy Thank you for clearing this up!

geekgonecrazy avatar geekgonecrazy commented on September 18, 2024 1

@richardwlu I don't know that it's absolutely necessary. But I typically always do this.

Looks like what I would use 👍

AFrangopoulos avatar AFrangopoulos commented on September 18, 2024 1

@geekgonecrazy I assume you mean to contact you via support? I think I shot an email out to sales and support a day or so ago. If there is a more fluid way of contacting RC that would be great -- maybe meet on your chat server for a brief discussion?

Load Testing: We started doing baseline load testing with no subscriptions, just sending messages, and we hit 100% CPU usage at around 60 rocketchat_messages/second. Writes/CPU usage seems to be the largest bottleneck -- the number of users doesn't seem to have much effect. We tested the same throughput with varying numbers of users and it had a negligible effect on the system. Our setup involves oplog tailing, and the hardware as of now is greater than what RC documented as "minimal specs". If we can have a brief discussion as stated above, we can get more specific if you have other questions.

Oplog Tailing vs RedisOplog: From what I have read, oplog tailing is a very expensive task when there are a lot of writes, and the redisOplog solution will relieve a lot of this CPU stress. Has Rocket Chat looked into this solution and, if so, do you have any data on the results? I am currently trying to set up redisOplog with oplog tailing disabled on another machine but am running into snags. Hopefully I get that running soon with Theodor's help.

Lastly, when you mentioned "node heap increased", did you mean the process memory limit, or maybe giving more RAM to mongo?

Update: The tests described above were just for a single instance. We are testing our server (12 cores) with 10 instances. Very early testing showed 150 messages (writes)/second taking 65% CPU. It peaked at over 75% during bursts. Is it normal for the application to make 6-7x more disk reads/writes than mongo does? Or is this a possible misconfig on my end?

Thanks,
Aleko

sr258 avatar sr258 commented on September 18, 2024 1

Have you had any conclusive results on how RC scales at your level of users @AFrangopoulos ? I'm evaluating RC for a use case with more than 1 million registered users and hundreds of thousands of concurrent users. Is there any documentation or experience on scaling RC to this level?

geekgonecrazy avatar geekgonecrazy commented on September 18, 2024 1

I know we talked a bit in the support channel. But I'm curious what specifically is pointing to oplog as the limiting factor here?

We are definitely experimenting with redis oplog and a few others trying to overall increase performance.

engelgabriel avatar engelgabriel commented on September 18, 2024

https://docs.mongodb.com/manual/reference/connection-string/

geekgonecrazy avatar geekgonecrazy commented on September 18, 2024

RocketChat/Rocket.Chat#3540 (comment) We need to ensure we mention keeping time in sync between instances also.

srihas619 avatar srihas619 commented on September 18, 2024

In order to run two instances of Rocket Chat on two VMs (web1 and web2) with a load balancer in front of them (haproxy1), do I need to set up shared storage for the session info? I have a separate MongoDB replica set on another three VMs, so if the session info is stored in mongo, I suppose I don't need any shared storage. Please suggest how I should proceed from this point.

geekgonecrazy avatar geekgonecrazy commented on September 18, 2024

If you have mongo set up in replicaset mode, you do not need anything else for shared session storage.

srihas619 avatar srihas619 commented on September 18, 2024

@geekgonecrazy thanks for your response.
So, do I need to enable the OpLog URL? What exactly does this OpLog do? Sorry, I am quite new to mongodb; I just want to set up a clustered Rocket Chat.

richardwlu avatar richardwlu commented on September 18, 2024

@geekgonecrazy And to add on to @srihas619's question, does the MONGO_OPLOG_URL need to point to the location of the Primary?

I currently have 4 instances of RC running on one server and 3 Mongo replicas and my
MONGO_OPLOG_URL is mongodb://localhost:27017/local
and
MONGO_URL is mongodb://localhost:27017,mongochat02:27017,mongochat03:27017/rocketchat?replicaSet=001-rs&readPreference=primaryPreferred&w=majority

However, I will have 8 instances of RC running across 2 servers while adding 2 additional replica set members on the new server(s) (to be mongochat04 and mongochat05), and I am trying to determine if I need to change MONGO_OPLOG_URL and MONGO_URL on either server.

srihas619 avatar srihas619 commented on September 18, 2024

@richardwlu I configured MONGO_OPLOG_URL as mongodb://mongo1:27017,mongo2:27017,mongo3:27017/local?rs0 (rs0 is my replicaset) and it worked. I still don't know its exact purpose, but when OpLog is enabled, the Rocket Chat web instances know that they are clustered (identified by observation).

richardwlu avatar richardwlu commented on September 18, 2024

@srihas619 Thanks for the tip. What is your value of MONGO_URL?

And the OpLog, according to the docs:

The oplog (operations log) is a special capped collection that keeps a rolling record of all operations that modify the data stored in your databases. MongoDB applies database operations on the primary and then records the operations on the primary’s oplog. The secondary members then copy and apply these operations in an asynchronous process. All replica set members contain a copy of the oplog, in the local.oplog.rs collection, which allows them to maintain the current state of the database.

https://docs.mongodb.com/manual/core/replica-set-oplog/

srihas619 avatar srihas619 commented on September 18, 2024

@richardwlu Thanks for the link to the docs. It explains things clearly. My MONGO_URL is configured as mongodb://mongo1:27017,mongo2:27017,mongo3:27017/rocketchat?rs0. Have I configured things correctly? I will be glad if you can review :)
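
One thing worth checking in URLs like the one above: in the MongoDB connection string format, options after the ? are written as name=value pairs, so the replica set is normally named with replicaSet=rs0, and a bare ?rs0 does not set that option. How a given driver treats the bare form is not established in this thread. The sketch below, with a hypothetical helper name replica_set_name, simply reports whether a URL carries the option.

```python
from urllib.parse import parse_qs


def replica_set_name(url):
    """Return the replicaSet option of a mongodb:// URL, or None if it is absent.

    Hypothetical helper: it only inspects the option string after "?".
    """
    _, _, query = url.partition("?")
    # Option names are compared case-insensitively here; entries without
    # "=value" (such as a bare "rs0") are dropped by parse_qs.
    options = {name.lower(): values for name, values in parse_qs(query).items()}
    values = options.get("replicaset")
    return values[0] if values else None


if __name__ == "__main__":
    print(replica_set_name(
        "mongodb://mongo1:27017,mongo2:27017,mongo3:27017/rocketchat?rs0"))
    print(replica_set_name(
        "mongodb://mongo1:27017,mongo2:27017,mongo3:27017/rocketchat?replicaSet=rs0"))
```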

richardwlu avatar richardwlu commented on September 18, 2024

@srihas619 I think we will need to bring in @geekgonecrazy to help verify the questions above for us :)

richardwlu avatar richardwlu commented on September 18, 2024

@geekgonecrazy Just to clarify, is it necessary to specify the replica set name by adding replicaSet=001-rs to the end of the MONGO_URL and MONGO_OPLOG_URL?

What I intend to have is:
MONGO_URL = "mongodb://den02vmchat01:27017,mongochat02:27017,mongochat03:27017,ash01vmchat01:27017,ash01vmchat02:27017/rocketchat?replicaSet=001-rs&readPreference=primaryPreferred&w=majority"

MONGO_OPLOG_URL = "mongodb://den02vmchat01:27017,mongochat02:27017,mongochat03:27017,ash01vmchat01:27017,ash01vmchat02:27017/local?replicaSet=001-rs"

richardwlu avatar richardwlu commented on September 18, 2024

@geekgonecrazy After editing the following values for each instance (I have 8 total instances running on 2 servers with 4 on each), we have noticed an issue where users are not receiving desktop notifications and alerts consistently (more failed than not). Would you happen to know why this would occur and if it is related to the mongodb config?

We are on version 0.55.0 (older I know), but everything was running fine prior to me adding 4 instances and editing the mongodb. The version has stayed the same.

MONGO_URL = "mongodb://den02vmchat01:27017,mongochat02:27017,mongochat03:27017,ash01vmchat01:27017,ash01vmchat02:27017/rocketchat?replicaSet=001-rs&readPreference=primaryPreferred&w=majority"

MONGO_OPLOG_URL = "mongodb://den02vmchat01:27017,mongochat02:27017,mongochat03:27017,ash01vmchat01:27017,ash01vmchat02:27017/local?replicaSet=001-rs"

geekgonecrazy avatar geekgonecrazy commented on September 18, 2024

@richardwlu sorry got a bit behind on github notifications 😁

If you are getting issues like that, it's typically because the instances cannot talk to each other. Usually it's something with the instance IP or firewall conditions. They need to be able to talk; otherwise, when an instance fires an event, only users on the same instance as the one that fired the event will receive it.
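
A quick way to test the "can the instances talk to each other" part is a plain TCP connection attempt from each instance host to every other instance. The sketch below uses only the Python standard library; the function names are hypothetical, and the peer list you pass in (whatever address and port each instance advertises to the others) is something you need to fill in from your own setup.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def unreachable_peers(peers, timeout=2.0):
    """Return the (host, port) pairs from peers that could not be reached."""
    return [(host, port) for host, port in peers
            if not can_connect(host, port, timeout)]
```

Run unreachable_peers from every instance host with the list of the other instances. An empty result only shows that the ports accept TCP connections, not that the instances are configured to use those addresses.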

richardwlu avatar richardwlu commented on September 18, 2024

@geekgonecrazy No worries, found out it was the INSTANCE_IP

AFrangopoulos avatar AFrangopoulos commented on September 18, 2024

@geekgonecrazy @Sing-Li @georgios Are you aware of any RC use-cases with very high concurrency (say, 100k+ concurrent users)? Most things I have read were much smaller in scale than this.
I have read the doc page on scaling by adding RC instances (using a mongo RS and an RP). In your research, have you found scaling like this to be linear, or is there a large drop-off at a certain point? Most things I have read were mostly formula-based, i.e. "After doing X, we were able to handle 4X the number of users". I have not been able to find a ballpark estimate of the number of users per instance, given some generic setup. Thank you for your time.

geekgonecrazy avatar geekgonecrazy commented on September 18, 2024

With ulimit increased, node heap increased, and good hardware, you can do some amazing stuff :). At that size of deployment I'd say reach out and talk to us.
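
The comment does not say which ulimit is meant. Assuming it is the open file descriptor limit (every client connection holds a socket), a small sketch like the one below can report what a process actually gets. It uses Python's standard resource module, which is only available on Unix-like systems; the helper names and any minimum you pass in are illustrative, not a Rocket.Chat recommendation.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open file descriptor limits of this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def soft_limit_at_least(minimum):
    """Return True if the soft open-file limit is unlimited or >= minimum."""
    soft, _hard = open_file_limits()
    return soft == resource.RLIM_INFINITY or soft >= minimum


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"open files: soft={soft} hard={hard}")
```

Run it as the same user, and from the same service manager or shell, that starts the application, since limits are inherited per process.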

AFrangopoulos avatar AFrangopoulos commented on September 18, 2024

Without getting redisOplog integrated, it can't scale to large numbers of concurrent users (we did not have the time to spend on adding this solution to RC, so we abandoned it). The more you scale horizontally, the more your gains diminish due to oplog tailing (and I imagine some other things). Also, from what I could tell, the number of packages is starving the app for resources.

I suspect that if RC were to integrate the redis-oplog feature successfully, it could scale to a very large number of concurrent users (unless there are other bottlenecks that appear after oplog tailing is removed from the equation). Lastly, if you do choose to use RC and try to scale, it will not be cost-effective -- and eventually you will hit a point where scaling stops and you will likely need to find a different solution.

This is all based on our load/performance testing. It is not fact per se, but I haven't seen anything out there that contradicts our conclusions. Hope this helps! I urge the RC devs to try to integrate the redis-oplog package to see this application's full potential. GL

sr258 avatar sr258 commented on September 18, 2024

@AFrangopoulos Thanks for your answer, even if it is not what I'd hoped. Are you willing / allowed to hand out the load-tests?
@geekgonecrazy Do you think we could talk about your assessment on how RC scales and what would need to be done for RC to work at the number of users we have?

sr258 avatar sr258 commented on September 18, 2024

@AFrangopoulos Can you provide some more detail on the load testing you've done (what techniques you've used etc.)? My institution is really interested in setting up their own load testing.

Rodriq avatar Rodriq commented on September 18, 2024

Docs for high availability here https://docs.rocket.chat/quick-start/deploying-rocket.chat/rapid-deployment-methods/docker-and-docker-compose/docker-containers/high-availability-install
