Comments (9)
Hey @beikov , could you please share your opinion on the right solution for this using Hibernate ORM? I'd definitely be interested in hearing it.
from quarkus.
I think the proper solution is to always think about the what data you really need and model this with DTOs to avoid any sort of unpredictable lazy loading.
In this particular case, you seem to need some sort of recursive fetching, so I would suggest you to make use of recursive CTEs to model the data loading e.g.
with statements as (
select cast(:id) as id
union all
select s.previousStatement.id as id from Statement s join statements r on s.id = r.id
)
select s from Statement s where s.id in (select r.id from statements r)
This way, you will select all statements that are reachable via previousStatement
associations, starting at the statement with the id as given through the :id
parameter.
Ideally, we'd offer a recursive fetch join to model this, but we don't have that yet.
from quarkus.
@beikov thanks for your feedback. In my use case:
- most of the time we don't need the previousStatement
- ... but some flows do
- ... and sometimes we need to dig further
As getting a statement is a O(1) operation, but fetching the whole chain is O(n), I do not want to fetch them eagerly or with any kind of projection or recursive fetch. Fetching lazily means that I have a best case of O(1) and a worst case of O(f) (f meaning a few here). The proposed alternative is O(n) best case, making it perform much worse (n can be big here).
There are a lot of other valid use cases. Like the 'debate' about open-session-in-view for spring, there will always be PROs and CONs, and I suggest to not go too deep into this debate here but let users make conscious decisions about their use cases.
Therefore, such a functionality is useful for some users.
from quarkus.
Then IMO you should model your different flows explicitly by creating separate data access methods that initializes exactly the state that the caller needs. Any sort of unpredictable lazy initialization will lead to pain, so it's best to avoid it in the first place.
Open-session-in-view or enabling the hibernate.enable_lazy_load_no_trans
setting are both recipes for disaster.
but fetching the whole chain is O(n)
Talking about O(n)
here is a bit confusing, because the expensive part really is the separate select statement execution, not the lookup in the database itself. Using a recursive fetch like I showed you only does one database request.
There are a lot of other valid use cases. Like the 'debate' about open-session-in-view for spring, there will always be PROs and CONs, and I suggest to not go too deep into this debate here but let users make conscious decisions about their use cases.
The only PRO I see is that you can turn off your brain and not think about the data access design, but that will bite you very fast when you run into performance issues, because changing the design later is painful. So I don't really view that as a PRO, but rather as a CON in disguise. Modelling your data fetching requirements is IMO not something that should be an after-thought just like performance testing shouldn't be, because at some point there will be that one user with hundreds of chained Statement
associations who will bring down your whole application.
Therefore, such a functionality is useful for some users.
If you really want this broken behavior, then let the transaction span across the whole request by annotating your HTTP endpoint method with @Transactional
. Nobody is preventing you from doing that. Having a global flag is IMO a very bad idea as that will make every HTTP request claim a database connection and also span a transaction including locks for much longer than needed.
from quarkus.
This is quite opinionated, and I really have a different view and experience.
Even if the recursive fetch is a single request, it is definitely a O(n) one that I want to avoid. As said, I most often do not need this data.
Of course, we could replace the relation with an ad-hoc call. But that also means refactoring the database to replace the many-to-one by a relation table, which I don't consider better design for this use case.
'Turn off your brain' is rather though wording: I'd rather rephrase it as decoupling the service returning a Statement from whatever the next transaction wants to do with it.
There is no use case where a user will bring down the whole application, for multiple reasons.
This is also a very simple use case, but navigating complex object graphs not known in advance in a different transaction isn't bad design per se. Performance isn't an afterthought here, on the contrary.
I'm new to this project, but having bigger transactions is one of the steps that we are taking further. Still, it doesn't cover all use cases and we sometimes do want a separate one.
Of course, https://hibernate.atlassian.net/browse/HHH-17750 makes our problem worse: if fetching a lazy transient attribute of a detached entity would work in a new transaction, the need for this would be much lower (though still there).
from quarkus.
This is quite opinionated, and I really have a different view and experience.
I was asked for an opinion and here you have it. I simply prefer designing the data access rather than burying requirements in e.g. the UI code through implicit lazy initialization. If you ask me, no entity should ever be serialized to JSON, but only DTOs. I understand that this might seem extreme to some folks, but designing the data access this way makes it way simpler to reason about performance and optimize, simply because there are no unknowns.
Even if the recursive fetch is a single request, it is definitely a O(n) one that I want to avoid. As said, I most often do not need this data.
That's totally understandable, hence my suggestion.
Then IMO you should model your different flows explicitly by creating separate data access methods that initializes exactly the state that the caller needs.
Of course, we could replace the relation with an ad-hoc call. But that also means refactoring the database to replace the many-to-one by a relation table, which I don't consider better design for this use case.
I didn't suggest that and I agree that this would not improve the design.
'Turn off your brain' is rather though wording: I'd rather rephrase it as decoupling the service returning a Statement from whatever the next transaction wants to do with it.
I don't know your application design, but it seems to me that your app could know beforehand what the next transaction requires and hence load the appropriate data without the need for lazy loading. It's a matter of creating separate data access methods for the respective flows.
This is also a very simple use case, but navigating complex object graphs not known in advance in a different transaction isn't bad design per se. Performance isn't an afterthought here, on the contrary.
I'm new to this project, but having bigger transactions is one of the steps that we are taking further. Still, it doesn't cover all use cases and we sometimes do want a separate one.
No idea how you are "navigating" in "a different transaction", but maybe you you can solve your issues by simply loading the subsequent object graphs explicitly instead of relying on lazy loading?
Of course, https://hibernate.atlassian.net/browse/HHH-17750 makes our problem worse: if fetching a lazy transient attribute of a detached entity would work in a new transaction, the need for this would be much lower (though still there).
It seems @mbladel is already looking into this issue, so let's maybe keep the discussion about the Jira issue in the Jira comments.
from quarkus.
I simply prefer designing the data access rather than burying requirements in e.g. the UI code through implicit lazy initialization.
This isn't UI code. The first transaction is simply committed and we end up with an object with lazy relations.
If you ask me, no entity should ever be serialized to JSON, but only DTOs.
I do strictly agree with that. Not what is happening in my example.
I understand that this might seem extreme to some folks, but designing the data access this way makes it way simpler to reason about performance and optimize, simply because there are no unknowns.
There are unknowns, the first service doesn't know how much data it has to send back. Even the caller doesn't know at call time, it might need to fetch relations. It's the essence of lazy relationships, the fact they are in a different transaction doesn't make this different.
Even if the recursive fetch is a single request, it is definitely a O(n) one that I want to avoid. As said, I most often do not need this data.
That's totally understandable, hence my suggestion.
Then IMO you should model your different flows explicitly by creating separate data access methods that initializes exactly the state that the caller needs.
No, at the time of the first call, I can't decide how many previous statements I need. It also depends on expensive calculations on the statement itself (calling third party services, ...). I really can't know beforehand.
I don't know your application design, but it seems to me that your app could know beforehand what the next transaction requires and hence load the appropriate data without the need for lazy loading. It's a matter of creating separate data access methods for the respective flows.
So, basically, this is not the case.
I also don't see different kind of structures for slightly different use cases as a better design. Your proposal does have its merits to optimize some use cases, but isn't the holy grail either.
JPA is a leaky abstraction, and that's both a problem and some of its strengths. Lazy loading, even in a different transaction, is quite useful in some scenarios. I do agree it can be misused.
No idea how you are "navigating" in "a different transaction", but maybe you you can solve your issues by simply loading the subsequent object graphs explicitly instead of relying on lazy loading?
That means you need to know all the paths that you'll have to follow beforehand. If there is any complex logic or dependencies, this isn't necessarily the case.
It seems @mbladel is already looking into this issue, so let's maybe keep the discussion about the Jira issue in the Jira comments.
Yes. I just meant this is the first reason why we wanted this feature, even it were only for migration.
from quarkus.
No, at the time of the first call, I can't decide how many previous statements I need. It also depends on expensive calculations on the statement itself (calling third party services, ...). I really can't know beforehand.
That means you need to know all the paths that you'll have to follow beforehand. If there is any complex logic or dependencies, this isn't necessarily the case.
Maybe I'm not totally understanding, but in your second transaction, you can execute your logic to determine what data needs to be fetched and then load the Statement
with the id of the previousStatement
along with how many previousStatement
you may need, but in one go. Put all of that into a DTO so there is no lazy entity anymore and you're done :)
from quarkus.
Maybe I'm not totally understanding, but in your second transaction, you can execute your logic to determine what data needs to be fetched and then load the
Statement
with the id of thepreviousStatement
along with how manypreviousStatement
you may need, but in one go. Put all of that into a DTO so there is no lazy entity anymore and you're done :)
In the second transaction, I can't know if I need a 3rd or 4th object before checking the second one. How would an ad-hoc call to fetch a new statement be different from a lazy fetch regarding performance? I'm still in the service layer btw.
from quarkus.
Related Issues (20)
- Hibernate Reactive can't persist a many-to-many relation when one of the entities is a child (with the "joined" inheritance type) and the batch size is greater than 1 HOT 10
- Postgresql bump causing detection of instance Random/SplittableRandom HOT 11
- jpamodelgen fails with range out of bounds HOT 4
- Properly pass errors from JsonRPC backends to Dev UI HOT 2
- Unable to build a container with Jib when podman is running HOT 7
- Exception thrown when stopping devmode and with dev services running with Podman HOT 1
- Quarkus opentelemetry extension WARNING when exporting to dynatrace HOT 11
- GRPC server doesn't respond when running on the same server as HTTP and using TLS HOT 6
- Add @PermissionsAllowed to Panache Resources HOT 9
- Native: Introduce flag to favor build time improvements at the cost of using more resources and vice versa HOT 4
- Prevent API scraping through obscurity HOT 1
- Decorating MongoClient does not work because decorator is removed during build HOT 3
- Jakarta RESTful Web Services 3.1 Multipart support does not work HOT 6
- Wrong duplicateEndPoint when implements two Rest interfaces HOT 7
- Quarkus SmallRye GraphQL not emitting metrics HOT 2
- Use Arc AnnotationStore to build Hibernate ORM metadata HOT 25
- Mockito fails to mock non-public inner class in continuous testing due to classloading issues HOT 8
- cert chain public key resolver thumbprints HOT 4
- Quarkus 3.7.4 java.lang.ClassNotFoundException when running devsevices with gradle HOT 14
- Better error message for @ParameterizedTest failures in a @QuarkusTest HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from quarkus.