PR <a class="issue-link js-issue-link" data-error-text="Failed to load title" data-id

Possibly related?: <a class="issue-link js-issue-link" data-error-text="F

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Possibly related?: <a class="issue-link js-issue-link" data-error-text="Failed to load

Potential regression induced by "Refactored pandas_timedelta_to_timedeltastruct" about pandas HOT 6 OPEN

DeaMariaLeon commented on June 14, 2024

Potential regression induced by "Refactored pandas_timedelta_to_timedeltastruct"

from pandas.

Comments (6)

dontgoto commented on June 14, 2024 1

There are probably some tricks we can try like using a static lookup table instead of branches:
I tried that, but it did not improve the situation on my machine.

I did some other refactoring that brought down the benchmark's runtime by 20% on my machine for s/us/ns.

There are also code paths that the benchmark is not really testing (those I saw a similar improvement from my changes): s/ms/us/ns values so large that the "day" parsing part of the code gets triggered as well as negative offsets, both don't get checked because the benchmark runs only with random numbers between 0 and 10.

from pandas.

dontgoto commented on June 14, 2024 1

I now have a version of the date time conversion that is correct and tested and runs up to 2.5x faster than main and the previous version before the regression (at least on my machine, M2 ARM). Interestingly, I do not see large differences in runtimes between the pre regression commit and main.

I would suggest changing the benchmark inputs to negative values that are at least on the order of one day to make most of the logic in the function actually show up in the runtime of the benchmark. What do you think? Is there a procedure to follow to update the history of benchmark runtimes?

To back that up, here are some of my benchmark results for all three versions, I will update my PR soon:

input range \| version	main	main pre regression	fix
+30±10ns	4.9	4.8	3.9
-30±10ns	7.9	7.2	5.2
±30ns	8.4	8.5	7.1
+4d±100s	11.6	11.4	5.2
-4d±100s	12.7	12.7	5.3
±4d	9.7	9.4	7.5

(Runtimes are in ms, runs are for 10^6 random values)

The current versions of the function have branch prediction and variable dependency issues that only show up when using larger ranges of inputs and negative values. The previously mentioned checks against the base unit size are of very low cost in reality since the unit size does not change (in the benchmark, but also in real world) and get branch predicted without failure.

Randomly negative and positive inputs still do not do all too well in my version, but realistically timestamps would not have this property.

from pandas.

dontgoto commented on June 14, 2024 1

Possibly related?: #57035

Good find, but the regression here should be unrelated to that plotting issue. Converting 1M timestamps takes some milliseconds, but the referenced regression is in the seconds range.

from pandas.

WillAyd commented on June 14, 2024

I think the issue here is that we combined a few case statements into one:

  case NPY_FR_s:
  case NPY_FR_ms:
  case NPY_FR_us:
  case NPY_FR_ns: {
    npy_int64 per_sec;
    if (base == NPY_FR_s) {
      per_sec = 1;
    } else if (base == NPY_FR_ms) {
      per_sec = 1000;
    } else if (base == NPY_FR_us) {
      per_sec = 1000000;
    } else {
      per_sec = 1000000000;
    }

    ...

Whereas previously the code was copy/pasted and slightly tweaked for every case statement.

There are probably some tricks we can try like using a static lookup table instead of branches:

  case NPY_FR_s:
  case NPY_FR_ms:
  case NPY_FR_us:
  case NPY_FR_ns: {
    const npy_int64 sec_per_day = 86400;
    static npy_int64 per_secs[] = {
      1, // NPY_FR_s
      1000, // NPY_FR_ms
      1000000, // NPY_FR_us
      1000000000 // NPY_FR_ns
    };
    const npy_int64 per_sec = per_secs[base - NPY_FR_s];

but I'm not really sure the tricks are worth it. And I think the code de-duplication is probably the increased timing here

from pandas.

WillAyd commented on June 14, 2024

@dontgoto thanks for the great research. If you have improvements to the benchmark I would just go ahead and submit it - we don't manage the history of changes to a benchmark itself that strictly

from pandas.

ba05 commented on June 14, 2024

Possibly related?: #57035

from pandas.

Potential regression induced by "Refactored pandas_timedelta_to_timedeltastruct" about pandas HOT 6 OPEN

Comments (6)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent