Deion As a follow-up to <a class="issue-link js-issue-link"

Hey <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url=

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

allow fetching of lazy relations outside of its original transaction about quarkus HOT 9 OPEN

ymajoros commented on June 19, 2024

allow fetching of lazy relations outside of its original transaction

from quarkus.

Comments (9)

yrodiere commented on June 19, 2024

Hey @beikov , could you please share your opinion on the right solution for this using Hibernate ORM? I'd definitely be interested in hearing it.

from quarkus.

beikov commented on June 19, 2024

I think the proper solution is to always think about the what data you really need and model this with DTOs to avoid any sort of unpredictable lazy loading.
In this particular case, you seem to need some sort of recursive fetching, so I would suggest you to make use of recursive CTEs to model the data loading e.g.

with statements as (
  select cast(:id) as id
  union all
  select s.previousStatement.id as id from Statement s join statements r on s.id = r.id
)
select s from Statement s where s.id in (select r.id from statements r)

This way, you will select all statements that are reachable via previousStatement associations, starting at the statement with the id as given through the :id parameter.

Ideally, we'd offer a recursive fetch join to model this, but we don't have that yet.

from quarkus.

ymajoros commented on June 19, 2024

@beikov thanks for your feedback. In my use case:

most of the time we don't need the previousStatement
... but some flows do
... and sometimes we need to dig further

As getting a statement is a O(1) operation, but fetching the whole chain is O(n), I do not want to fetch them eagerly or with any kind of projection or recursive fetch. Fetching lazily means that I have a best case of O(1) and a worst case of O(f) (f meaning a few here). The proposed alternative is O(n) best case, making it perform much worse (n can be big here).

There are a lot of other valid use cases. Like the 'debate' about open-session-in-view for spring, there will always be PROs and CONs, and I suggest to not go too deep into this debate here but let users make conscious decisions about their use cases.

Therefore, such a functionality is useful for some users.

from quarkus.

beikov commented on June 19, 2024

Then IMO you should model your different flows explicitly by creating separate data access methods that initializes exactly the state that the caller needs. Any sort of unpredictable lazy initialization will lead to pain, so it's best to avoid it in the first place.
Open-session-in-view or enabling the hibernate.enable_lazy_load_no_trans setting are both recipes for disaster.

but fetching the whole chain is O(n)

Talking about O(n) here is a bit confusing, because the expensive part really is the separate select statement execution, not the lookup in the database itself. Using a recursive fetch like I showed you only does one database request.

There are a lot of other valid use cases. Like the 'debate' about open-session-in-view for spring, there will always be PROs and CONs, and I suggest to not go too deep into this debate here but let users make conscious decisions about their use cases.

The only PRO I see is that you can turn off your brain and not think about the data access design, but that will bite you very fast when you run into performance issues, because changing the design later is painful. So I don't really view that as a PRO, but rather as a CON in disguise. Modelling your data fetching requirements is IMO not something that should be an after-thought just like performance testing shouldn't be, because at some point there will be that one user with hundreds of chained Statement associations who will bring down your whole application.

Therefore, such a functionality is useful for some users.

If you really want this broken behavior, then let the transaction span across the whole request by annotating your HTTP endpoint method with @Transactional. Nobody is preventing you from doing that. Having a global flag is IMO a very bad idea as that will make every HTTP request claim a database connection and also span a transaction including locks for much longer than needed.

from quarkus.

ymajoros commented on June 19, 2024

This is quite opinionated, and I really have a different view and experience.

Even if the recursive fetch is a single request, it is definitely a O(n) one that I want to avoid. As said, I most often do not need this data.

Of course, we could replace the relation with an ad-hoc call. But that also means refactoring the database to replace the many-to-one by a relation table, which I don't consider better design for this use case.

'Turn off your brain' is rather though wording: I'd rather rephrase it as decoupling the service returning a Statement from whatever the next transaction wants to do with it.

There is no use case where a user will bring down the whole application, for multiple reasons.

This is also a very simple use case, but navigating complex object graphs not known in advance in a different transaction isn't bad design per se. Performance isn't an afterthought here, on the contrary.

I'm new to this project, but having bigger transactions is one of the steps that we are taking further. Still, it doesn't cover all use cases and we sometimes do want a separate one.

Of course, https://hibernate.atlassian.net/browse/HHH-17750 makes our problem worse: if fetching a lazy transient attribute of a detached entity would work in a new transaction, the need for this would be much lower (though still there).

from quarkus.

beikov commented on June 19, 2024

This is quite opinionated, and I really have a different view and experience.

I was asked for an opinion and here you have it. I simply prefer designing the data access rather than burying requirements in e.g. the UI code through implicit lazy initialization. If you ask me, no entity should ever be serialized to JSON, but only DTOs. I understand that this might seem extreme to some folks, but designing the data access this way makes it way simpler to reason about performance and optimize, simply because there are no unknowns.

Even if the recursive fetch is a single request, it is definitely a O(n) one that I want to avoid. As said, I most often do not need this data.

That's totally understandable, hence my suggestion.

Then IMO you should model your different flows explicitly by creating separate data access methods that initializes exactly the state that the caller needs.

Of course, we could replace the relation with an ad-hoc call. But that also means refactoring the database to replace the many-to-one by a relation table, which I don't consider better design for this use case.

I didn't suggest that and I agree that this would not improve the design.

'Turn off your brain' is rather though wording: I'd rather rephrase it as decoupling the service returning a Statement from whatever the next transaction wants to do with it.

I don't know your application design, but it seems to me that your app could know beforehand what the next transaction requires and hence load the appropriate data without the need for lazy loading. It's a matter of creating separate data access methods for the respective flows.

This is also a very simple use case, but navigating complex object graphs not known in advance in a different transaction isn't bad design per se. Performance isn't an afterthought here, on the contrary.
I'm new to this project, but having bigger transactions is one of the steps that we are taking further. Still, it doesn't cover all use cases and we sometimes do want a separate one.

No idea how you are "navigating" in "a different transaction", but maybe you you can solve your issues by simply loading the subsequent object graphs explicitly instead of relying on lazy loading?

Of course, https://hibernate.atlassian.net/browse/HHH-17750 makes our problem worse: if fetching a lazy transient attribute of a detached entity would work in a new transaction, the need for this would be much lower (though still there).

It seems @mbladel is already looking into this issue, so let's maybe keep the discussion about the Jira issue in the Jira comments.

from quarkus.

ymajoros commented on June 19, 2024

I simply prefer designing the data access rather than burying requirements in e.g. the UI code through implicit lazy initialization.

This isn't UI code. The first transaction is simply committed and we end up with an object with lazy relations.

If you ask me, no entity should ever be serialized to JSON, but only DTOs.

I do strictly agree with that. Not what is happening in my example.

I understand that this might seem extreme to some folks, but designing the data access this way makes it way simpler to reason about performance and optimize, simply because there are no unknowns.

There are unknowns, the first service doesn't know how much data it has to send back. Even the caller doesn't know at call time, it might need to fetch relations. It's the essence of lazy relationships, the fact they are in a different transaction doesn't make this different.

Even if the recursive fetch is a single request, it is definitely a O(n) one that I want to avoid. As said, I most often do not need this data.

That's totally understandable, hence my suggestion.

Then IMO you should model your different flows explicitly by creating separate data access methods that initializes exactly the state that the caller needs.

No, at the time of the first call, I can't decide how many previous statements I need. It also depends on expensive calculations on the statement itself (calling third party services, ...). I really can't know beforehand.

I don't know your application design, but it seems to me that your app could know beforehand what the next transaction requires and hence load the appropriate data without the need for lazy loading. It's a matter of creating separate data access methods for the respective flows.

So, basically, this is not the case.

I also don't see different kind of structures for slightly different use cases as a better design. Your proposal does have its merits to optimize some use cases, but isn't the holy grail either.

JPA is a leaky abstraction, and that's both a problem and some of its strengths. Lazy loading, even in a different transaction, is quite useful in some scenarios. I do agree it can be misused.

No idea how you are "navigating" in "a different transaction", but maybe you you can solve your issues by simply loading the subsequent object graphs explicitly instead of relying on lazy loading?

That means you need to know all the paths that you'll have to follow beforehand. If there is any complex logic or dependencies, this isn't necessarily the case.

It seems @mbladel is already looking into this issue, so let's maybe keep the discussion about the Jira issue in the Jira comments.

Yes. I just meant this is the first reason why we wanted this feature, even it were only for migration.

from quarkus.

beikov commented on June 19, 2024

No, at the time of the first call, I can't decide how many previous statements I need. It also depends on expensive calculations on the statement itself (calling third party services, ...). I really can't know beforehand.
That means you need to know all the paths that you'll have to follow beforehand. If there is any complex logic or dependencies, this isn't necessarily the case.

Maybe I'm not totally understanding, but in your second transaction, you can execute your logic to determine what data needs to be fetched and then load the Statement with the id of the previousStatement along with how many previousStatement you may need, but in one go. Put all of that into a DTO so there is no lazy entity anymore and you're done :)

from quarkus.

ymajoros commented on June 19, 2024

Maybe I'm not totally understanding, but in your second transaction, you can execute your logic to determine what data needs to be fetched and then load the Statement with the id of the previousStatement along with how many previousStatement you may need, but in one go. Put all of that into a DTO so there is no lazy entity anymore and you're done :)

In the second transaction, I can't know if I need a 3rd or 4th object before checking the second one. How would an ad-hoc call to fetch a new statement be different from a lazy fetch regarding performance? I'm still in the service layer btw.

from quarkus.

allow fetching of lazy relations outside of its original transaction about quarkus HOT 9 OPEN

Comments (9)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent