Give each resource an id attribute by default. Use UUIDs unless you have

Why use UUID? (http-api-design issue, closed, 12 comments)

interagent commented on September 23, 2024 2

Why use UUID?

from http-api-design.

Comments (12)

gkirill commented on September 23, 2024 12

Incremental ids are usually unique only within a single database instance. If your db is sharded, incremental ids become difficult to use, because you may end up with user/1 in instance 1 and user/1 in instance 2.
Another useful property of uuids is that clients can assign them themselves when creating new records, e.g. I could create a new user in JavaScript and set its id right there, without needing the db to assign one.
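
As a minimal sketch of the point above (not code from the thread): two independent writers can each mint an id without consulting a shared counter. The `new_user` helper and the record layout are hypothetical, and Python's standard `uuid` module stands in for whatever generator a real client would use.

```python
import uuid


def new_user(name: str) -> dict:
    """Build a user record whose id is chosen by the caller, not by the database."""
    # uuid4() draws a random (version 4) UUID; no coordination with a db is needed.
    return {"id": str(uuid.uuid4()), "name": name}


# Two writers that never talk to each other (e.g. two shards or two clients).
shard_1_user = new_user("alice")
shard_2_user = new_user("bob")

print(shard_1_user["id"])
print(shard_2_user["id"])
```

With auto-increment keys both writers would likely have produced id 1; here the ids differ with overwhelming probability, though the server may still want to validate them.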

alecmev commented on September 23, 2024 9

The probability of a collision with UUIDs is not 0, and you can't trust client-side generated IDs (their random generator could be returning the same number on every invocation, for all we know), so you still need to do the appropriate checks (and have a mechanism for rejecting resource creation if the ID collides, just like with regular integers...)
Performance varies from one DB type to another, but, for example, you still need a good old autoincrement integer in your Postgres (while the situation is even worse in MySQL and MSSQL, from what I've read)
You completely ruin the aesthetics of your URLs: /memberships?user=123&team=456 vs. /memberships?user=1b2d9fb0-d232-49d5-9e60-334bc16d79bc&team=6f6f3d93-df18-495f-8de4-fa29cb2e5835

You make it sound like it's a no-brainer, when it's not. The advantages are incidental (for example, I don't care about 3rd parties analyzing our well-being using the resource IDs, because, firstly, I don't mind, and secondly, they'll find a way), and there's nothing you can do about the aesthetics if you have no other unique identifier for a resource.
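
The "appropriate checks" mentioned in the first point could look roughly like the sketch below. It is only an illustration under stated assumptions: the in-memory `store` dict stands in for a table with a unique constraint, and `create_resource`, `InvalidId` and `DuplicateId` are hypothetical names, not part of any real API.

```python
import uuid


class InvalidId(Exception):
    """The client-supplied id is not a well-formed version 4 UUID."""


class DuplicateId(Exception):
    """A resource with this id already exists."""


def create_resource(store: dict, client_id: str, payload: dict) -> dict:
    """Accept a client-generated id only after validating it and checking for a clash."""
    try:
        parsed = uuid.UUID(client_id)
    except ValueError as exc:
        raise InvalidId(client_id) from exc
    if parsed.version != 4:
        raise InvalidId(client_id)

    key = str(parsed)  # canonical lowercase, hyphenated form
    if key in store:
        # Same outcome as a unique-constraint violation on an integer key.
        raise DuplicateId(key)

    store[key] = {"id": key, **payload}
    return store[key]
```

In a real database the duplicate check would normally be left to a unique index rather than a read-then-write, since the latter is racy under concurrency.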

rafaelrabeloit commented on September 23, 2024 2

I was still thinking about it... If you open your API to the public, obviously you can't create the UUIDs in the client, because you can't assume that the UUIDs will be generated in the way you'd expect.

Idk about the database scope, if you consider the distributed case, though.

For all other arguments, simply append a fixed-length random number to the resource id (and persist it alongside the id in the database, maybe as a composite key). This masks your id and the size of your database, and prevents an attacker from iterating over all your entries, e.g.:

id + 6 digits random number: 1 + 005174 = 1005174, like /user/1005174
Even if the attacker knows the length of the random number, they won't know the number itself. So they can't guess the id 2 + rand (to iterate), or the id 545684 + rand (to estimate the database size).
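
The scheme described above can be sketched as follows. The function names are hypothetical, and the sketch assumes the suffix is stored next to the internal id so that a lookup can verify both parts.

```python
import secrets

SUFFIX_DIGITS = 6  # length of the random part, as in the 1 + 005174 example


def make_public_id(internal_id: int) -> tuple[str, str]:
    """Return (public_id, suffix); the suffix must be persisted with the row."""
    suffix = f"{secrets.randbelow(10 ** SUFFIX_DIGITS):0{SUFFIX_DIGITS}d}"
    return f"{internal_id}{suffix}", suffix


def split_public_id(public_id: str) -> tuple[int, str]:
    """Split a public id back into (internal_id, suffix) for the lookup."""
    if not (public_id.isascii() and public_id.isdigit()) or len(public_id) <= SUFFIX_DIGITS:
        raise ValueError(f"malformed public id: {public_id!r}")
    return int(public_id[:-SUFFIX_DIGITS]), public_id[-SUFFIX_DIGITS:]


public_id, stored_suffix = make_public_id(1)
print(public_id)                   # something like 1005174
print(split_public_id("1005174"))  # (1, '005174')
```

Note that six decimal digits give only a million possibilities per row, so this hides the sequence from casual inspection but is not a substitute for authorization checks.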

I don't care about aesthetics, because I believe APIs are for client software and not users, but a 36-char string seems like overkill to me.
And the thought that the collision chance increases as your database grows makes me uneasy. So, if you think at Google scale, the number of database entries must cause collisions, even with something as improbable as UUID...
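
To put a rough number on this worry, the usual birthday-bound approximation can be evaluated directly. This is a back-of-the-envelope sketch that assumes version 4 UUIDs with 122 truly random bits and a sound random source; it says nothing about broken generators.

```python
import math

UUID4_RANDOM_BITS = 122  # 128 bits minus the 6 fixed version/variant bits


def collision_probability(n: int, bits: int = UUID4_RANDOM_BITS) -> float:
    """Approximate chance of at least one collision among n random ids.

    Uses the birthday approximation p ~= 1 - exp(-n(n-1) / (2 * 2**bits)).
    """
    space = 2 ** bits
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-(n * (n - 1)) / (2 * space))


for n in (10 ** 9, 10 ** 12, 10 ** 15):
    print(f"{n:>20,d} ids -> p ~= {collision_probability(n):.3e}")
```

Under these assumptions a trillion ids give a collision probability on the order of 1e-13, and it takes a few times 10^18 ids before the chance approaches one half.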

rafaelrabeloit commented on September 23, 2024

Ok, I think I'm starting to get the idea... Thanks!

pedro commented on September 23, 2024

+1!

Also worth noting: uuids provide another layer of defense if you forget to scope a query, which is a pretty common mistake that even big companies make:

http://mashable.com/2015/04/28/twitter-earnings-selerity/

bjeanes commented on September 23, 2024

An additional non-technical reason is that as a company grows and gets the attention of competitors, numeric IDs can let people discover the relative size of your data from the IDs of newly created records. Analysts often use this method to estimate how much revenue a company earns, too. UUIDs aren't the only solution here, but in the context of an API you'd need to use something other than the numeric ID either way, so UUID is a suitable alternative, especially given the other reasons to use them.

crazytonyi commented on September 23, 2024

+1 for code design that doesn't betray its inner functionality. It's also worth mentioning that UUIDs and GUIDs have a defined standard/algorithm and are not simply a random series of 32 hex digits:

https://en.m.wikipedia.org/wiki/Globally_unique_identifier
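
A small illustration of that structure, using Python's standard `uuid` module and one of the UUIDs quoted earlier in the thread. The name passed to `uuid5` is an arbitrary example value.

```python
import uuid

# UUID taken from the URL example earlier in this thread.
u = uuid.UUID("1b2d9fb0-d232-49d5-9e60-334bc16d79bc")

# The first hex digit of the third group encodes the version,
# and the top bits of the fourth group encode the variant.
print(u.version)                     # 4 (randomly generated)
print(u.variant == uuid.RFC_4122)    # True

# Not every version is random: version 5 is a SHA-1 hash of a namespace
# and a name, so the same inputs always produce the same UUID.
a = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")
b = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")
print(a == b, a.version)             # True 5
```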

alecmev commented on September 23, 2024

Regarding aesthetics: yes, they don't matter at the API level, but you still need routing in your client-side application, right? Take a user resource: the service I'm building allows duplicate usernames, and a user's email is private information, so all I'm left with is some unique identifier, and I'd prefer it to be a short number / hash (think Trello), not 36 chars of gibberish.

IMO, this is bad (ignore the product identifier, you get the point):

And this is good:

geemus commented on September 23, 2024

In our case at least, we expect all the uuids to be generated by us, server side, so the client concerns did not matter. I also agree that not leaking information about how many things you have is pretty incidental and not really important for most use cases (but it matters to some people). Similarly, preventing an attacker from iterating is nice to have, but ideally you have enough other protections in place that you would be ok even if they knew the keys, so again an incidental benefit.

The biggest reason for us, I think, is that it makes sharding later, as you grow, more feasible than it would be with integers. And if you don't do it sooner rather than later, the pain of having to switch is pretty bad. So the hope was to head that issue off at the pass and just start with something that should keep working into the future. Even though each service might be able to have its own ids, any individual service might still grow to the size where sharding becomes necessary, so simply dividing things up might delay this issue, but I don't think it could reliably prevent it.

The aesthetics issue is one that bothers me as well. I don't particularly like the way they look, and they are quite long. In some cases that becomes concretely problematic rather than just ugly, for instance due to a fairly small limit on the total size of the query string (this can be worked around by doing a POST with the info in the body, but that still seems not-great). I still felt that using something that should scale more easily (and has some of these other nice properties) outweighed the aesthetic downsides in a context where ids will mostly be written/created by computers rather than humans. I think if this were being exposed more in web pages it might well be another story.

I suppose if you feel that it is likely that your dataset would never need to grow beyond the bounds of a single database it would lessen some of these pressures, but I was unwilling to make that bet.

bjeanes commented on September 23, 2024

Instagram have an interesting blog post about their ID generation. Instagram IDs are shorter, (subjectively) more aesthetically pleasing, and shard ready.

http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram

That might be an appropriate alternative for those seeking to avoid UUIDs.
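
For readers who want a feel for that style of id, here is a rough sketch of a 64-bit, time-ordered, shard-aware id. The bit layout (41 bits of milliseconds since a custom epoch, 13 bits of shard id, 10 bits of per-shard sequence) follows the split described in the linked post as best I recall it; the epoch value and function names are assumptions made for illustration, and the original implementation lives in the database rather than in application code.

```python
import time
from typing import Optional

EPOCH_MS = 1_300_000_000_000  # hypothetical custom epoch, in ms since 1970
TIME_BITS = 41
SHARD_BITS = 13
SEQ_BITS = 10


def make_id(shard_id: int, sequence: int, now_ms: Optional[int] = None) -> int:
    """Pack time, shard and sequence into one sortable 64-bit integer."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    elapsed = now_ms - EPOCH_MS
    if not 0 <= elapsed < (1 << TIME_BITS):
        raise ValueError("timestamp outside the representable range")
    if not 0 <= shard_id < (1 << SHARD_BITS):
        raise ValueError("shard id out of range")
    seq = sequence % (1 << SEQ_BITS)  # the sequence wraps around at 1024
    return (elapsed << (SHARD_BITS + SEQ_BITS)) | (shard_id << SEQ_BITS) | seq


def split_id(value: int) -> tuple[int, int, int]:
    """Recover (elapsed_ms, shard_id, sequence) from an id."""
    seq = value & ((1 << SEQ_BITS) - 1)
    shard_id = (value >> SEQ_BITS) & ((1 << SHARD_BITS) - 1)
    elapsed = value >> (SHARD_BITS + SEQ_BITS)
    return elapsed, shard_id, seq


example = make_id(shard_id=5, sequence=1025, now_ms=EPOCH_MS + 1000)
print(example, split_id(example))
```

Because the timestamp occupies the high bits, ids sort by creation time, and the shard id embedded in the value tells a router where the row lives.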

frankieroberto commented on September 23, 2024

Just to chip in here, I also find UUIDs pretty ugly, and their primary use case (allowing distributed clients to generate IDs with a very low chance of collisions) isn't one that I've really come across.

UUIDs look like strings (in the JSON at least), but they're actually 128-bit values. Whilst many databases / storage engines support UUIDs natively (e.g. Postgres does, but SQLite doesn't), it's a bit less common than storing integers, and many users of your API might just store them as strings, which is probably ok, but might not scale as well?

On the other hand, 64-bit integers can't always be parsed as integers in JavaScript environments if they're above 53 bits, so Twitter always includes a string version with a _str suffix (see https://dev.twitter.com/overview/api/twitter-ids-json-and-snowflake ).
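
The difference between the textual and the binary form is easy to see with Python's standard `uuid` module. This is only an illustration of the sizes involved, using a UUID quoted earlier in the thread; how a given database stores the value is a separate question.

```python
import uuid

u = uuid.UUID("6f6f3d93-df18-495f-8de4-fa29cb2e5835")

as_text = str(u)    # canonical form: 32 hex digits plus 4 hyphens
as_bytes = u.bytes  # the underlying 128-bit value as raw bytes
as_int = u.int      # the same value as a single integer

print(len(as_text))   # 36
print(len(as_bytes))  # 16

# All three forms describe the same identifier and convert back losslessly.
print(uuid.UUID(bytes=as_bytes) == u, uuid.UUID(int=as_int) == u)
```

Storing the 36-character text therefore takes more than twice the space of the 16-byte value, before any per-column overhead.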
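
The 53-bit limit comes from JavaScript numbers being IEEE 754 doubles. Python floats use the same format, so the effect can be imitated by parsing JSON integers as floats, as in this sketch. The `id_str` field name mirrors the `_str` suffix convention mentioned above; the payload itself is made up.

```python
import json

big_id = 2 ** 53 + 1  # smallest positive integer a double cannot represent

# Serve the id both as a number and as a string, following the _str convention.
body = json.dumps({"id": big_id, "id_str": str(big_id)})

# parse_int=float makes the decoder treat integers the way a double-only
# runtime would, which silently rounds the numeric field.
parsed = json.loads(body, parse_int=float)

print(int(parsed["id"]) == big_id)      # False: precision was lost
print(int(parsed["id_str"]) == big_id)  # True: the string survives intact
```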

geemus commented on September 23, 2024

Yeah, I was about to mention snowflake/twitter as another case.

Distributed id generation is definitely not part of why we wanted unique stuff. It was mostly future-proofing and a means of having consistency; the other stuff is more peripheral. We chose it over snowflake/etc at least in part because we use postgres, so we already had easy native support.

They are ugly though, for sure. I guess I'm just on the fence about whether that is a strong enough reason to do something more complicated, since they will mostly only be "seen" by computers. I suppose it depends on whether the ids then surface in user-facing places, where uuids would be more unfortunate.
