Comments (6)

RobThree commented on July 19, 2024

I cannot set different generatorId on each instance

Well, you're gonna have to 😅 Because that's exactly what it is meant for.

As documented here:

This library provides a basis for Id generation; it does not provide a service for handing out these Id's nor does it provide generator-id ('worker-id') coordination.

The generatorId doesn't have to be static, of course; you could consider any of these things:

  • Look into StatefulSets and try to use that
  • Build a simple 'coordinator'-service your project can ask for a (unique) generator-id on startup
  • Use some sort of a hash of the instance / hostname / pod name / whatever (but make sure it'll be unique across all instances!)
  • Pick a random number and YOLO it 😜

from idgen.

mehdihadeli commented on July 19, 2024

Thanks for your response

Look into StatefulSets and try to use that

I think this should work for the instances of a single service, but we may still get conflicts with the Ids of other microservices.

Build a simple 'coordinator'-service your project can ask for a (unique) generator-id on startup

Yes, this approach works completely, but I need to maintain a separate service

Use some sort of a hash of the instance / hostname / pod name / whatever (but make sure it'll be unique across all instances!)

How can I do this in C# and get an integer to use as the generatorId?

Pick a random number and YOLO it

Do you mean using random class?

What about using host MAC address? For example, NewID library uses this approach.

RobThree commented on July 19, 2024

Yes, this approach works completely, but I need to maintain a separate service

Something / someone will have to coordinate the worker-ids (generatorId). If you can't use some Kubernetes-assigned value then it'll have to come from elsewhere. Where / how you do it is completely up to you: there is an endless number of scenarios in which IdGen can be used, and that's exactly why it leaves worker-id coordination up to the user. You use Kubernetes, someone else uses X or Y, etc., and each time the requirements will differ, as will the implementation.

The service I suggested doesn't have to do much more than hand out, say, an incremental id for each worker, which either wraps around (assuming you're never going to have more than 2^10 = 1024 workers, the limit in the default configuration) or provides a way to declare a worker-id 'disposed' in some way. All you need to do is track which ids are in use and which aren't, and only on startup / shutdown of your (micro)service / application.
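To make the bookkeeping concrete, here is a minimal, language-neutral sketch in Python. The class name `GeneratorIdPool` and its methods are hypothetical and not part of IdGen; it keeps state in memory only, so a real coordinator would still need shared, persistent storage and some network endpoint in front of it.

```python
import threading


class GeneratorIdPool:
    """Hypothetical in-memory tracker of which generator-ids are in use."""

    def __init__(self, generator_bits=10):
        # 10 bits is the default generator-id width: 2^10 = 1024 ids.
        self._capacity = 1 << generator_bits
        self._in_use = set()
        self._lock = threading.Lock()

    def acquire(self):
        """Hand out the lowest free id; called once when a worker starts."""
        with self._lock:
            for candidate in range(self._capacity):
                if candidate not in self._in_use:
                    self._in_use.add(candidate)
                    return candidate
            raise RuntimeError("all generator-ids are in use")

    def release(self, generator_id):
        """Mark an id as disposed; called once when a worker shuts down."""
        with self._lock:
            self._in_use.discard(generator_id)
```

A worker would call `acquire()` on startup, pass the result as its generatorId, and call `release()` on shutdown. Handing out the lowest free id (rather than a counter that wraps around) is one possible design choice; it only works if workers reliably release their ids.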

How can I do this in C# and get an integer to use as the generatorId?

Again, many roads lead to Rome. You might simply call .GetHashCode() on the hostname (which is probably not a good idea), use some bits from a SHA1 hash of the hostname, or maybe make the hostname incremental (like host1, host2, ...) and simply take the number from that string. You can then pass that number as generatorId. It could be based on the hostname, some bits from the primary NIC's MAC address, the Kubernetes replica index, pod-id, ... anything. Just make sure it's unique (enough) and future proof (enough) so you don't get collisions.
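Two of these roads can be sketched as follows. This is an illustrative Python sketch rather than C#; the function names are hypothetical, and the same steps map onto .NET's string and SHA1 APIs. It assumes the default 10-bit generator-id. One reason a built-in string hash is a poor fit is that such hashes may differ between processes, whereas SHA-1 of the same hostname always gives the same bits.

```python
import hashlib
import re


def id_from_ordinal(hostname, generator_bits=10):
    """Parse a trailing number such as the 2 in 'host2' or the 3 in 'web-3'."""
    match = re.search(r"(\d+)$", hostname)
    if match is None:
        raise ValueError(f"no trailing number in {hostname!r}")
    ordinal = int(match.group(1))
    if ordinal >= 1 << generator_bits:
        raise ValueError(f"{ordinal} does not fit in {generator_bits} bits")
    return ordinal


def id_from_sha1(hostname, generator_bits=10):
    """Keep the top generator_bits bits of the SHA-1 digest of the hostname.

    Deterministic, but NOT guaranteed unique: two hostnames can share an id.
    """
    digest = hashlib.sha1(hostname.encode("utf-8")).digest()
    as_int = int.from_bytes(digest, "big")  # SHA-1 yields a 160-bit value
    return as_int >> (160 - generator_bits)
```

The ordinal variant is unique as long as the hostnames themselves are numbered uniquely; the SHA-1 variant is only stable, so it still needs a uniqueness check across all instances.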

Do you mean using random class?

I was joking 😉 The odds of a collision are much too high.

What about using host MAC address? For example, NewID library uses this approach.

You could use some bits from the MAC address, yes. But note that a MAC address has 48 bits (which are supposed to be unique, but aren't *) and the generator-id part is, by default, only 10 bits. You can reserve more bits, of course, but I don't recommend using all 48. And the fewer bits you use (say 10), the higher the chance of a collision.

* - Cloned VM's, cheap network cards, mistakes by manufacturers, spoofed mac-addresses, it's all fun and games until it's not. MAC addresses aren't as unique as people think. But as long as you have good / total control then, yes, it could be used.
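As an illustration of why truncating a MAC address is risky, here is a small Python sketch. The function name is hypothetical, and keeping the lowest bits is just one possible choice (the upper 24 bits are the manufacturer prefix, so they vary less between cards from the same vendor).

```python
def generator_id_from_mac(mac, generator_bits=10):
    """Keep the lowest generator_bits bits of a MAC like '00:1A:2B:3C:4D:5E'."""
    raw = int(mac.replace(":", "").replace("-", ""), 16)
    if raw >= 1 << 48:
        raise ValueError("a MAC address has at most 48 bits")
    return raw & ((1 << generator_bits) - 1)


if __name__ == "__main__":
    # Two different (made-up) addresses that share their lowest 10 bits.
    print(generator_id_from_mac("00:1A:2B:3C:4D:5E"))
    print(generator_id_from_mac("00:1A:2B:3C:51:5E"))
```

Both made-up addresses map to the same 10-bit value, even though the full 48-bit addresses differ.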

NewID has the 'luxury' of having a total of 128 bits available, in which case reserving 48 (plus a few more) bits for the generatorID is a lot easier. IdGen produces more compact Ids (64 bits, half the size), but the tradeoff is, indeed, that fewer bits are available for each of the parts that make up an IdGen-id.

I think this should work for the instances of a single service, but we may still get conflicts with the Ids of other microservices.

I think I'd give this option a shot first; it looks like the most manageable and usable one to me.

mehdihadeli commented on July 19, 2024

Thanks for all your explanations; I should check the options in my app :) Could we also use something like Guid.NewGuid().GetHashCode() (or Guid.NewGuid().GetDeterministicHashCode(), using Andrew Lock's approach for a deterministic GetHashCode)? I think it should also give a unique generatorId for all instances and all microservices.

RobThree commented on July 19, 2024

Could we also use something like this Guid.NewGuid().GetHashCode() (or Guid.NewGuid().GetDeterministicHashCode()

This works when you can use all 128 bits of entropy; however, the worker-id is (by default) only 10 bits, i.e. 1024 possibilities. The chance of getting a duplicate worker-id is therefore 1 in 1024 for the second worker, and it goes up rapidly with each new worker (see birthday paradox). The chance of a collision is much too high.
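The birthday-paradox arithmetic can be checked with a few lines of Python. This is a sketch that assumes every worker picks its id uniformly at random; the function name is made up for illustration.

```python
def collision_probability(workers, generator_bits=10):
    """Chance that at least two of `workers` uniformly random ids coincide."""
    slots = 1 << generator_bits
    if workers > slots:
        return 1.0  # pigeonhole: more workers than available ids
    p_all_distinct = 1.0
    for already_taken in range(workers):
        # The next worker must avoid every id that is already taken.
        p_all_distinct *= (slots - already_taken) / slots
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    for bits in (10, 16):
        for workers in (2, 10, 38, 100):
            print(bits, workers, f"{collision_probability(workers, bits):.4f}")
```

With 10 bits the probability passes 50% at roughly 38 workers; widening the generator-id to 16 bits pushes that point out to a few hundred workers, which delays the problem but does not remove it.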

You can get a hashcode of anything (be it a Guid or just a simple string); you're still stuck with the 10 bits you can use for a worker-id (maybe a few more if you adjust the structure of the ID a little). But at that point you may just as well generate a random 10-bit number. The chances of a collision, be it for a random number or for some hashcode of some value, are just too high for such a (relatively) small number of bits.

I would strongly recommend you do not rely on randomness or hashcodes but rather work towards an (across the board) deterministic way of assigning worker-ids, be it by assigning them an incremental number from, say, the Kubernetes instance id, or by implementing a coordinator service of some sort that keeps track of which worker-ids are in use or available.

A hashcode may be deterministic, but it still can't guarantee uniqueness, especially if you need to discard some bits of the hashcode because you can only use (by default) 10 bits of it for the worker-id. Actually, even a GUID can't guarantee uniqueness, but because of the immense space (128 bits of 'randomness'; 122 bits actually, because 6 bits are reserved for the version and variant) the chance of a collision is astronomically small (there are 2^122 ≈ 5.3 × 10^36 = 5316911983139663491615228241121378304 possible Guids) and therefore negligible. However, we only have (about) 10 bits available (you could crank that up a little), so the chances of a collision are much, much higher (2^10 = 1024; crank it up to, say, 16 bits and you still 'only' have 2^16 = 65536 possible generator IDs). You may get away with it for a while, but collisions are pretty much guaranteed to happen pretty quickly. And then things will snowball and spiral VERY quickly.

Again, I urge you not to rely solely on a hashcode, though it may be part of an algorithm to determine the final worker-id for any given worker. That's why I wrote: "make sure it'll be unique across all instances!"

mehdihadeli commented on July 19, 2024

Thanks for your complete answer; I will skip using hashcodes.
