Comments (12)
- Incremental ids are usually only unique within a single database instance. If your db is sharded, incremental ids become difficult to use because you may have user/1 in instance 1 and user/1 in instance 2.
- Another useful application of uuids is that clients can define them themselves when they create new instances, e.g. I could create a new user with JavaScript and set its id right there without needing the db to assign one.
from http-api-design.
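To make the two points above concrete, here is a minimal sketch with two in-memory "shards" that each keep their own counter. The `Shard` class and its method names are hypothetical and invented only for illustration; the only real library call is `uuid.uuid4` from the Python standard library.

```python
import itertools
import uuid


class Shard:
    """Hypothetical database shard with its own auto-increment counter."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)
        self.rows = {}

    def insert_with_serial(self, name: str) -> int:
        # The id is only unique within this shard.
        new_id = next(self._counter)
        self.rows[new_id] = name
        return new_id

    def insert_with_uuid(self, name: str, new_id: uuid.UUID) -> uuid.UUID:
        # The caller (possibly a client) chose the id before the insert.
        self.rows[new_id] = name
        return new_id


shard_a, shard_b = Shard(), Shard()

# Both shards hand out id 1, so "user/1" is ambiguous across shards.
serial_a = shard_a.insert_with_serial("alice")
serial_b = shard_b.insert_with_serial("bob")

# Random (version 4) UUIDs are generated without any coordination.
uuid_a = shard_a.insert_with_uuid("carol", uuid.uuid4())
uuid_b = shard_b.insert_with_uuid("dave", uuid.uuid4())
```

Whether the caller of `insert_with_uuid` can be trusted to generate a good id is a separate question.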
- The probability of a collision with UUIDs is not 0, and you can't trust client-side generated IDs (their random generator could be returning the same number with every invocation, for all we know), so you still need to do the appropriate checks (and have a mechanism for denying resource creation if the ID collides, just like with regular integers...)
- Performance varies from one DB type to another, but, for example, you still need a good old autoincrement integer in your Postgres (while the situation is even worse in MySQL and MSSQL, from what I've read)
- You completely ruin the aesthetics of your URLs:
/memberships?user=123&team=456
vs.
/memberships?user=1b2d9fb0-d232-49d5-9e60-334bc16d79bc&team=6f6f3d93-df18-495f-8de4-fa29cb2e5835
You make it sound like it's a no-brainer, when it's not. The advantages are accidental (for example, I don't care about third parties analyzing our well-being using the resource IDs because, firstly, I don't mind, and secondly, they'll find a way), and there's nothing you can do about the aesthetics if you have no other unique identifier for a resource.
from http-api-design.
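The "appropriate checks" mentioned above could look roughly like the sketch below. It assumes an in-memory dict standing in for a table with a unique key, and it assumes the API answers 400 for malformed ids and 409 for collisions; the function name `create_user`, the store `_users` and the status codes are illustrative choices, not something the guide prescribes.

```python
import uuid

# Hypothetical in-memory store standing in for a table with a unique key.
_users = {}


def create_user(raw_id: str, payload: dict) -> tuple:
    """Return an (HTTP status, message) pair for a create request."""
    try:
        candidate = uuid.UUID(raw_id)
    except ValueError:
        return 400, "id is not a well-formed UUID"
    if candidate.version != 4:
        return 400, "only version 4 UUIDs are accepted"
    if candidate in _users:
        # Same handling a duplicate integer key would get.
        return 409, "id already exists"
    _users[candidate] = payload
    return 201, "created"
```

Note that this only rejects duplicates and malformed values. It cannot tell whether the client's random generator is any good, which is the commenter's underlying objection.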
I was still thinking about it... If you open your API to the public, obviously you can't create the UUIDs in the client, because you can't assume that the UUIDs will be generated in the way you'd expect.
Idk about the database scope, if you consider the distributed case, though.
For all the other arguments, simply append a fixed-length random number to the resource id (and persist it with the id itself in the database, maybe as a composite key). This will mask your id and the size of your database, and will prevent an attacker from iterating over all your entries, e.g.:
id + 6 digits random number: 1 + 005174 = 1005174, like /user/1005174
Even if the attacker knows the size of the random number, he won't know the number itself. So, he wouldn't know the id 2 + rand (to iterate), or the id 545684 + rand (to try to guess the database size).
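A rough sketch of that "id + 6 digits" scheme, assuming the random suffix is generated once per record and stored next to the real id. The function names are hypothetical; `secrets.randbelow` is the standard-library call used for the random part.

```python
import secrets

SUFFIX_DIGITS = 6  # fixed length of the random part, as in the example above


def new_suffix() -> int:
    """Random suffix, generated once and stored next to the real id."""
    return secrets.randbelow(10 ** SUFFIX_DIGITS)


def public_id(real_id: int, suffix: int) -> int:
    # 1 and 5174 become 1005174: the suffix is zero-padded to six digits.
    return real_id * 10 ** SUFFIX_DIGITS + suffix


def split_public_id(value: int) -> tuple:
    """Recover (real_id, suffix); the caller compares the suffix with the stored one."""
    return divmod(value, 10 ** SUFFIX_DIGITS)
```

A lookup would have to reject the request when the suffix does not match the stored one. With six digits a blind guess for a known real id succeeds one time in a million, so this presumably still wants rate limiting alongside it.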
I don't care about aesthetics, because I believe APIs are for client software and not for users, but a 36-char string seems like overkill to me.
It also makes me uneasy that the chance of a collision increases as the database gets more and more entries. So, if you think at Google scale, the number of database entries must cause collisions, even with something as improbable as a UUID collision...
from http-api-design.
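The growth of the collision chance with the number of entries can be estimated with the usual birthday-bound approximation. The sketch below assumes version 4 UUIDs, which have 122 random bits, and assumes a properly working random generator; the function name is made up for this example.

```python
import math

RANDOM_BITS = 122  # a version 4 UUID fixes 6 of its 128 bits


def collision_probability(count: int, bits: int = RANDOM_BITS) -> float:
    """Birthday-bound approximation: 1 - exp(-n(n-1) / 2N)."""
    space = 2 ** bits
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-count * (count - 1) / (2 * space))


trillion_ids = collision_probability(10 ** 12)
```

Under these assumptions the probability for a trillion ids is on the order of 1e-13, and it only becomes substantial somewhere around 2**61 ids. A broken generator invalidates the estimate entirely, which is why the uniqueness check in the database still matters.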
Ok, I think I'm starting to get the idea... Thanks!
from http-api-design.
+1!
Also worth noting: uuids provide another layer of defense if you forget to scope a query, which is a pretty common mistake that even big companies make:
http://mashable.com/2015/04/28/twitter-earnings-selerity/
from http-api-design.
An additional non-technical reason is that as a company grows and gets the attention of competitors, numeric IDs can allow people to discover the relative size of your data based on the IDs of newly created records. Analysts often use this method to estimate how much revenue a company earns too. UUIDs aren't the only solution here, but in the context of an API you'd need to use something other than the numeric ID either way, so a UUID is a suitable alternative, especially in the context of the other reasons to use them.
from http-api-design.
+1 for code design that doesn't betray its inner functionality. It's also worth mentioning that UUIDs and GUIDs have a defined standard/algorithm and are not simply a random series of 32 hex digits:
https://en.m.wikipedia.org/wiki/Globally_unique_identifier
from http-api-design.
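As a small illustration of that structure, Python's standard `uuid` module exposes the version and variant fields that the standard encodes in fixed positions. The value below is the one from the URL example earlier in this thread.

```python
import uuid

# UUID taken from the URL example earlier in this thread.
value = uuid.UUID("1b2d9fb0-d232-49d5-9e60-334bc16d79bc")

version = value.version  # first hex digit of the third group ("49d5")
variant = value.variant  # derived from the top bits of the fourth group ("9e60")
raw = value.bytes        # the underlying 128-bit value as 16 bytes
text = str(value)        # canonical 36-character form with hyphens
```

So the 36 characters are a display format for a 16-byte value, with a few bits reserved to say how the rest was generated.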
Regarding aesthetics - yes, they don't matter on the API level, but then you still need routing in your client-side application, right? Let's take a user resource: the service I'm making allows duplicate usernames, while user's email is a private piece of info, so all I'm left with is some unique identifier, and I'd prefer it to be a short number / hash (think Trello), and not 36 char long gibberish.
IMO, this is bad (ignore the product identifier, you get the point):
from http-api-design.
In our case at least we expect all the uuids to be generated by us, server side, so the client concerns did not matter. I also agree that not leaking information about how many things you might have is pretty incidental, not really important for most use cases (but matters to some people). Similarly, preventing an attacker from iterating is nice-to-have, but ideally you have enough other protections in place that you would be ok even if they knew keys, so again incidental benefit.
The biggest reason for us, I think, is that UUIDs make it more feasible than integers to shard later as one grows. And if you don't do it sooner rather than later, the pain/difficulty of having to switch later is pretty bad. So the hope was to head off that issue at the pass and just start with something that should work into the future. Even though each service might be able to have its own ids, any of the individual services might still grow to the size where sharding would become necessary, so simply dividing things up might delay this issue, but I don't think it could reliably prevent it.
The aesthetics issue is one that bothers me as well. I don't particularly like the way they look, and they are quite long. In some cases that becomes concretely problematic rather than just ugly, for instance due to a somewhat small limit on the total size of the query string (this can be worked around by doing a POST with this info in the body, but it still seems not-great). I still felt that using something that should be able to scale more easily (as well as having some of these other nice properties) outweighed the un-aesthetic aspects in a context where ids will mostly be written/created by computers rather than humans. I think if this were being exposed more in web pages it might well be another story.
I suppose if you feel that it is likely that your dataset would never need to grow beyond the bounds of a single database it would lessen some of these pressures, but I was unwilling to make that bet.
from http-api-design.
Instagram have an interesting blog post about their ID generation. Instagram IDs are shorter, (subjectively) more aesthetically pleasing, and shard ready.
http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram
That might be an appropriate alternative for those seeking to avoid UUIDs.
from http-api-design.
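For readers who don't want to click through: as far as I recall, the linked post describes a 64-bit id made of 41 bits of millisecond timestamp (since a custom epoch), 13 bits of shard id and 10 bits of per-shard sequence. Treat that layout as an assumption and check the post. The sketch below only illustrates the bit packing; the epoch value and function names are hypothetical, and the real thing is generated inside the database rather than in application code.

```python
import time

# Bit widths as recalled from the linked post (treat as an assumption here).
TIME_BITS, SHARD_BITS, SEQ_BITS = 41, 13, 10
CUSTOM_EPOCH_MS = 1_300_000_000_000  # hypothetical epoch, not Instagram's actual value


def make_id(now_ms: int, shard_id: int, sequence: int) -> int:
    """Pack timestamp, shard and per-shard sequence into one 64-bit integer."""
    elapsed = now_ms - CUSTOM_EPOCH_MS
    if not 0 <= elapsed < (1 << TIME_BITS):
        raise ValueError("timestamp outside the representable range")
    if not 0 <= shard_id < (1 << SHARD_BITS):
        raise ValueError("shard id out of range")
    seq = sequence % (1 << SEQ_BITS)  # the sequence wraps around
    return (elapsed << (SHARD_BITS + SEQ_BITS)) | (shard_id << SEQ_BITS) | seq


def split_id(value: int) -> tuple:
    """Inverse of make_id: (timestamp in ms, shard id, sequence)."""
    sequence = value & ((1 << SEQ_BITS) - 1)
    shard_id = (value >> SEQ_BITS) & ((1 << SHARD_BITS) - 1)
    elapsed = value >> (SHARD_BITS + SEQ_BITS)
    return elapsed + CUSTOM_EPOCH_MS, shard_id, sequence


example = make_id(int(time.time() * 1000), shard_id=5, sequence=1)
```

Ids built this way sort roughly by creation time and carry their shard, at the cost of leaking both to anyone who knows the layout.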
Just to chip in here, I also find UUID to be pretty ugly, and their primary use case (allowing distributed clients to generate IDs with a very low chance of collisions) isn't one that I've really come across.
UUIDs imply (in the JSON at least) that they're strings, but they're actually 128-bit values, and whilst many databases / storage engines support UUIDs natively (e.g. Postgres does, but SQLite doesn't), it's a bit less common than storing integers, and many users of your API might just store them as strings, which is probably ok, but might not scale as well?
On the other hand, 64-bit integers can't always be parsed as integers in JavaScript environments if they need more than 53 bits, so Twitter always includes a string version with a _str suffix (see https://dev.twitter.com/overview/api/twitter-ids-json-and-snowflake).
from http-api-design.
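The 53-bit limit comes from parsing numbers as IEEE 754 doubles, which is easy to reproduce outside JavaScript. The sketch below uses Python's `float` as a stand-in for a JavaScript number; the id value is hypothetical, and the `id` / `id_str` field names follow the Twitter convention mentioned above.

```python
import json

MAX_SAFE_INTEGER = 2 ** 53 - 1  # same value as JavaScript's Number.MAX_SAFE_INTEGER

big_id = 2 ** 53 + 1  # hypothetical 64-bit id just past the safe range

# A double cannot represent this value, so a float-based parser rounds it.
as_double = int(float(big_id))

# Sending the id as a string as well sidesteps the problem.
body = json.dumps({"id": big_id, "id_str": str(big_id)})
decoded = json.loads(body)
```

Here `as_double` no longer equals `big_id`, while `int(decoded["id_str"])` round-trips exactly.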
Yeah, I was about to mention snowflake/twitter as another case.
Distributed id generation is definitely not part of why we wanted unique stuff. It was mostly future-proofing and a means of having consistency; the other stuff is more peripheral. We chose it over snowflake/etc at least in part because we use Postgres and so we already had easy native support.
They are ugly though, for sure. I guess I'm just on the fence about whether that is a strong enough reason to do something more complicated, since they will mostly only be "seen" by computers. I suppose it depends on if the API is then revealed in user facing APIs, where uuids would be more unfortunate.
from http-api-design.