Coder Social home page Coder Social logo

Comments (21)

norpan avatar norpan commented on August 15, 2024 3

So, the issue is not so much the period or the internal state size as the initialization. If you have a 32 bit seed you can have as long a period as you like, but you still only have 2^32 different random series. So if you have a few hundred thousand users generating uuids independently, a collision is still possible.

from elm-uuid.

Zinggi avatar Zinggi commented on August 15, 2024 2

I published it now: http://package.elm-lang.org/packages/Zinggi/elm-uuid/latest

from elm-uuid.

ZimbiX avatar ZimbiX commented on August 15, 2024 1

For anyone else who reaches here and is disheartened to see that @Zinggi's fork is not compatible with Elm 0.19.1 (and won't be), check out NoRedInk/elm-uuid =)

Probably best to update the fork link in the readme here

from elm-uuid.

danyx23 avatar danyx23 commented on August 15, 2024

Hey norpan! I am not very deeply versed in pseudo random number generators, but when the Random.PCG library is used correctly (by using generators instead of re-seeding for every UUID) then AFAIK what is primarily important for UUID collisions is not the bit count of the initial base seed but the period of the PRNG. IIRC this is significantly higher than 2^32 for PCG implementations (it seems the default period for PCG is 2^63 but I might have misread that). Additionally, because of the way the UUID implementation generates random numbers, it is unlikely that the same 31 hex digit sequence would be generated after a single period. (please consult http://www.pcg-random.org/rng-basics.html on how PCG basically works).

So I am relatively confident that the threshold where we cross above a 25% chance of regenerating UUIDs is significantly higher that the 50 000 you estimate (a quick lookup leads me to believe it should happen at about a billion generated UUIDs). But as I said, I am certainly no expert and would be very interested in your findings if you research this in more depth!. @mgold, the author of the elm Random.PCG library might be a good person to talk to about these things.

from elm-uuid.

norpan avatar norpan commented on August 15, 2024

Yes, so the period may be higher, which is good for the single-session case. The worst-case scenario is that these 50k uuids are generated in by different sessions so then there is still only 32 bit (about 4 billion different uuids) generated, since the PRNG is reseeded on each session.

So, my use case is around a thousand users of users over multiple sessions, and generating duplicate uuids is not good. I just wanted to check if this was something you are aware of. It's possible to handle duplicates, but I'd rather not.

from elm-uuid.

mgold avatar mgold commented on August 15, 2024

At one point Pcg did use the 64-bit variation that would not have this problem. It was changed to a 32-bit version for performance reason (JS does not have 64-bit ints).

I think your best bet at this point might be to generate 10k uuids a few times and see if there's a collision.

from elm-uuid.

danyx23 avatar danyx23 commented on August 15, 2024

I thought a bit more about this and this is a pretty nasty gotcha for UUIDs I think. @mgold, is there a way to get back a 64 bit seed for your pcg library? Or would mean in effect duplicating the library?

from elm-uuid.

mgold avatar mgold commented on August 15, 2024

Look at independentSeed. It's not quite a 2^64 period generator. Rather it lets you pick between 2^32 generators with a 2^32 period each. Does that help? Otherwise you'd have to duplicate the library.

from elm-uuid.

Zinggi avatar Zinggi commented on August 15, 2024

You might be interested in my just published Pcg-Extended library.

The purpose of this library is to make the most of your random seed.
E.g. if you seed pcg-extended with 8 32-bit ints from window.crypto.getRandomValues(), you get to use all 2^(32*8)-bits of randomness and an equally large period.

For this library, this means you would have to seed it with at least 4 32-bit ints to get the necessary 128-bit of randomness for the UUID

from elm-uuid.

Zinggi avatar Zinggi commented on August 15, 2024

@danyx23 Would you be open for a PR that adds support for my Pcg-Extended library?
I'd add this function

uuidGenerator : Random.Pcg.Extended.Generator Uuid

Then a user can get enough randomness if needed.

I can also create a fork if you'd prefer not having both options in one package.

from elm-uuid.

norpan avatar norpan commented on August 15, 2024

For sure, although it's not my package :)

from elm-uuid.

Zinggi avatar Zinggi commented on August 15, 2024

oh right, I meant @danyx23 , sorry to ping you instead 😄

from elm-uuid.

danyx23 avatar danyx23 commented on August 15, 2024

Hey @Zinggi yes definitely. I'm even thinking about replacing Pcg-Extended.

The reason I originally went with Pcg-Extended back in Elm 0.16 or so was that it provided more randomness than the default implementation. I think when you use a UUID library you should always get enough bits of randomness to get somewhere close to the potential 123 bits of randomness in a V4 UUID.

What do you think? @mgold what are your thoughts?

from elm-uuid.

Zinggi avatar Zinggi commented on August 15, 2024

Well just using my Pcg-Extended version, doesn't automatically give you enough bits of randomness.
You still have to properly seed it via a flag with enough values from window.crypto.getRandomValues(..).
But at least it has the potential to offer enough randomness if used properly. And even if used wrong, it is just as good as the currently used normal Pcg version.

But in any case, I think I've got time to create a PR in a few days. (Probably the day after tomorrow).
I think I'll just add it as an additional value (e.g. a MINOR) change, and then you can decide if you want to completely remove the other generator in a MAJOR change later.

from elm-uuid.

mgold avatar mgold commented on August 15, 2024

The official elm-lang PRNG will be using the PCG algorithm in 0.19. I think that third-party PRNGs will be something to be avoided at that point. I think you grab multiple ints in 0..2^32-1, you should get a an amount of randomness that is acceptable for anyone using a PRNG. [EDIT: This is incorrect, see next comment.] This can be wrapped up in a Generator Uuid. Otherwise, if someone wants a cryptographically secure uuid, provide a function to create one from 32-bit integers and let the caller worry about passing them in from a port.

from elm-uuid.

danyx23 avatar danyx23 commented on August 15, 2024

Yeah it's a bit of a tricky situation now. So with the upcoming changes to 0.19, what do you think about this: we make this version use the then default PCG PRNG with no further external dependencies, and we put a big disclaimer in this version saying spelling out the details which AFAICS boil down to: collisions are unlikely even for relatively high numbers of UUIDs from a single client, but because the seed is in effect a 32bit int collisions are a real possibility if you start generating UUIDs from more than a few thousand clients.

@norpan you could then publish your fork as an official alternative that I would link from this library. It would pull in your RPE library and could specifically focus on explaining how to get a good enough seed etc..

What do you think?

from elm-uuid.

Zinggi avatar Zinggi commented on August 15, 2024

@danyx23 you probably meant me ;)
The plan seems good to me.

I'm gonna publish my fork, probably some time next week.

from elm-uuid.

mgold avatar mgold commented on August 15, 2024

we make this version use the then default PCG PRNG with no further external dependencies, and we put a big disclaimer in this version saying spelling out the details

Sounds good.

collisions are unlikely even for relatively high numbers of UUIDs from a single client, but because the seed is in effect a 32bit int collisions are a real possibility if you start generating UUIDs from more than a few thousand clients.

Provided your a managing the seeds correctly this should be true. (And if you don't manage the seeds correctly, any PRNG will give collisions.)

@Zinggi Before you publish your fork, please consider: if someone needs really, really random uuids, would they be better off passing them in from a port and having the option of secure randomness? There is a cost to polluting the package ecosystem with similar packages.

from elm-uuid.

danyx23 avatar danyx23 commented on August 15, 2024

Cool, thank you all for the good discussion. I will make the changes to this repo once 0.19 is out (but can link a fork before that).

from elm-uuid.

norpan avatar norpan commented on August 15, 2024

We have actually used ports and a javascript uuid generator that uses good random numbers already so it's not an issue for us anymore.

However, I still think that a uuid generator should by default have as good a random source as possible. And 32 bits initial seed is a bit low. If that's what you get in Elm, so be it, perhaps it would make sense to work on making the built-in random generator have a larger seed instead.

from elm-uuid.

Zinggi avatar Zinggi commented on August 15, 2024

Before you publish your fork, please consider: if someone needs really, really random uuids, would they be better off passing them in from a port and having the option of secure randomness?

I'm in a position where I need random and unique UUIDs across many users for an application I'm working on. For my application passing a generated UUID via port would be an inferior solution to the one I currently use, which is passing in a random seed from crypto.getRandomValues(...) and then use the PCG-extended generator to generate a UUID.
This is better because I need other good random values that also make use of the PCG.extended generator. If I would only need a single UUID and nothing else from this generator, then the solution with ports would be just as good. (Except for the fact that I think the more written in Elm, the better)

There is a cost to polluting the package ecosystem with similar packages.

I agree, but if there is just a single user with similar needs to me, I think it's worthy to be published.

However, I still think that a uuid generator should by default have as good a random source as possible. And 32 bits initial seed is a bit low. If that's what you get in Elm, so be it, perhaps it would make sense to work on making the built-in random generator have a larger seed instead.

It makes sense that the default random generator uses a small seed and internal state, as that's most efficient. For most applications where you need randomness, the normal PCG offers a great trade off.

After 0.19 is out, an official way to access crypto.getRandomValues(...) will probably be worked out, as that's part of the web platform. When this is done, we can use that to provide a UUID generator that cannot be used wrong.
But until then, my fork should be good enough.

from elm-uuid.

Related Issues (4)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.