As <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="

Hi <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="

It seems to me we need to remain on <a class="user-mention notranslate" data-hovercard

new implementation of overlap management about hospitalnetwork HOT 5 OPEN

pascalcrepey commented on July 26, 2024

new implementation of overlap management

from hospitalnetwork.

Comments (5)

tjibbed commented on July 26, 2024

Hi @ClementMassonnaud ,

I'm currently checking your new implementation of the overlapping admission algorithm. (I am using a subset of a real dataset, so no simulated data at this moment). I've run into a couple of issues that might need our attention:

It's quite slow. The original method takes:

user system elapsed
1.413 0.000 0.751

whereas the new implementation times at:

user system elapsed
84.214 0.039 83.084

I think this is mainly die to the "per subject" calculation of overlaps, although I haven't checked this. However, I used to use the by=sID method before switching to the full list method. @PascalCrepey taught me this, and I guess you know it too. It's quite fast... basically you calculate all differences between record N and N+1 and exclude any differences where sID in N is not the sID in N+1. Isn't that possible here as well?

over = base[, list("left" = .SD[, Adate][-1] - .SD[, Ddate][-.N], "right" = .SD[, Ddate][-1] - .SD[, Ddate][-.N], "I" = .I[-1]), by = sID]

The checked database isn't final. It can still contain overlapping admissions, and re-running the function on the corrected database sometimes delivers a new corrected database that is slightly different.
This can be caused by staggered hospital stays or sequential overlaps, e.g.
Facility A: |--------------------------------------------|
Facility B: |----| |----| |----|

Because overlap detection is based on successive records, sorted on patient, admission, and discharge, the algorithm will only detect and correct the first instance, leaving two overlaps:
Facility A: |--------| |------------------------------|
Facility B: |----| |----| |----|

The next run, the following overlap is detected and corrected:
Facility A: |--------| |-------| |-------------------|
Facility B: |----| |----| |----|

Etc.

This is the original method uses the iterations, and I think this could be done in the new method as well.

Some of the newly create records have negative lengths of stay. This not a big issue, as they can easily be filtered out.
If I run the algorithm until it gives the final answer, the answer is not the same as the original method. The new method gives far more records. I am currently going through the database to see what the correct answer should be. As it is real data, this does take some time, but I think the extra "dirt" in the real data may help us find the potential issues here.

We might need to think of a way to also use a real dataset (or subset thereof) to validate the methods.

Minor issue: it currently doesn't carry over auxiliary data. These might be needed further on, but this is relatively easy to implement.

from hospitalnetwork.

tjibbed commented on July 26, 2024

My "beautiful" schematic of overlapping admission didn't align there... But I think you get the idea.

from hospitalnetwork.

ClementMassonnaud commented on July 26, 2024

Hi,
Thank you for your feedback

It can still contain overlapping admissions, and re-running the function on the corrected database sometimes delivers a new corrected database that is slightly different. This can be caused by staggered hospital stays or sequential overlaps, e.g.

I tried to think about the different types of possible overlaps, but I missed this type. I now see what is the issue... I usually try to avoid 'while' loops as much as possible, but here it seems we don't have much choice indeed

We might need to think of a way to also use a real dataset (or subset thereof) to validate the methods.

Yes, I think it is a key point, to make sure the function deals correctly with every possible scenario. I wrote tests in tests/testthat/test-adjust_overlaps.R. I manually created a fake database with all the specific overlaps I could think of, listing them, and anticipating what the correct result should be. I think that it would be a good idea to use the other types of overlaps from a real database, as you suggested, and then 'merge' them with the current test database to complete the tests.

Minor issue: it currently doesn't carry over auxiliary data. These might be needed further on, but this is relatively easy to implement.

I'm not sure I understand what you mean by auxiliary data. If by that you mean additional columns in the database, I don't understand where is the issue

basically you calculate all differences between record N and N+1 and exclude any differences where sID in N is not the sID in N+1. Isn't that possible here as well?

Yes this is interesting, I missed that. I don't see why it would not be possible here as well. I am currently on the road, I will look into it ASAP.

from hospitalnetwork.

PascalCrepey commented on July 26, 2024

It seems to me we need to remain on @tjibbed original implementation, with the loop. Am I right ? does it need further optimization ? Can we merge @ClementMassonnaud data.table optimization into @tjibbed implementation ? or should we simply close this issue ?

from hospitalnetwork.

ClementMassonnaud commented on July 26, 2024

Yes we do need the loop I think
But I think we should also implement the optimizations I proposed.
I'll merge the two versions and close this issue ASAP

from hospitalnetwork.

new implementation of overlap management about hospitalnetwork HOT 5 OPEN

Comments (5)

Related Issues (18)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent