background In some use cases, users want a timed triggering logic

Can this just be expressed with <div class="highlight highlight-sourc

<div class="highlight highlight-source-sql notranslate position-relative overflow-auto"

Ohh so sorry I misunderstand it ... LGTM to use the syntax. <div class="snippet-cl

By the way, share the <a href="https://docs.timescale.com/api/latest/hyperfunctions/ga

Can this just be expressed with <div class="highlight highlight-source-sql notrans

also it is a big generate_series batch query. <p dir="a

feat: an append-only materialized view storing continuous timestamps about risingwave HOT 13 OPEN

st1page commented on May 27, 2024

feat: an append-only materialized view storing continuous timestamps

from risingwave.

Comments (13)

st1page commented on May 27, 2024 1

Can this just be expressed with

create materialized view xxx as generate_series(
    '2020-01-01 00:00:00'::TIMESTAMP, -- start 
    NOW(), -- end
    interval '1' minute  -- step
);

Internally, NowExecutor will be used

The issue is that it will generate too many data in the table and also it is a big generate_series batch query.

from risingwave.

st1page commented on May 27, 2024 1

```sql
create materialized view xxx as generate_series(
    '2020-01-01 00:00:00'::TIMESTAMP, -- start 
    NOW(), -- end
    interval '1' minute  -- step
);

Oh, create mv with a generate_series is even worse

create materialized view xxx as generate_series(
    '2020-01-01 00:00:00'::TIMESTAMP, -- start 
    NOW(), -- end
    interval '1' minute  -- step
);

WIth this query, every time the now()'s value changes, RW will retract all the result of the old TVF generate_series and send the new value.

from risingwave.

st1page commented on May 27, 2024 1

Ohh so sorry I misunderstand it ... LGTM to use the syntax.

generate_series(
    '2020-01-01 00:00:00'::TIMESTAMP, -- start 
    NOW(), -- end
    interval '1' minute  -- step
);

from risingwave.

fuyufjh commented on May 27, 2024 1

By the way, share the time_bucket_gapfill from TimescaleDB.

Compared with Join, it has 2 advantages:

It allows specifying how to interpolate values.
Streaming Join is very expensive

Regarding of the user interface, I slightly prefer this than JOIN, also because it's more straigh-forward

from risingwave.

fuyufjh commented on May 27, 2024 1

Oh, I just realized the discussion around gap filling and interpolation has been a little bit off-topic. Let's focus on the original requirement for this issue. That is, supporting

create materialized view xxx as generate_series(
    '2020-01-01 00:00:00'::TIMESTAMP, -- start 
    NOW(), -- end
    interval '1' minute  -- step
);

with some efficient incremental way.

from risingwave.

fuyufjh commented on May 27, 2024

Can this just be expressed with

create materialized view xxx as generate_series(
    '2020-01-01 00:00:00'::TIMESTAMP, -- start 
    NOW(), -- end
    interval '1' minute  -- step
);

Internally, NowExecutor will be used

from risingwave.

fuyufjh commented on May 27, 2024

also it is a big generate_series batch query.

Didn't get this. Can you explain it? I don't see any batch query here.

By the way, regarding the original query in PR description

CREATE VIEW timer_per_minutes AS SELECT * FROM per_minutes where minute > now();

Do you mean where minute < now()?

from risingwave.

st1page commented on May 27, 2024

CREATE VIEW timer_per_minutes AS SELECT * FROM per_minutes where minute > now();

Do you mean where minute < now()?

Yes, I will change it.

from risingwave.

fuyufjh commented on May 27, 2024

Oh, create mv with a generate_series is even worse
create materialized view xxx as generate_series(
    '2020-01-01 00:00:00'::TIMESTAMP, -- start 
    NOW(), -- end
    interval '1' minute  -- step
);
WIth this query, every time the now()'s value changes, RW will retract all the result of the old TVF generate_series and send the new value.

Yeah, by the current implementation, it is. I am just proposing to use the syntax. We should make it incremental internally.

from risingwave.

st1page commented on May 27, 2024

Streaming Join is very expensive

So, Even a streaming time_bucket_gapfill is better because it exploits the property that the time columns are basically ordered and that the gaps are not very large? 🤔 It is interesting.

from risingwave.

fuyufjh commented on May 27, 2024

Streaming Join is very expensive

So, Even a streaming time_bucket_gapfill is better because it exploits the property that the time columns are basically ordered and that the gaps are not very large? 🤔 It is interesting.

Let me explain why a (naive) outer join is not the optimal approach.

For example, let's consider the hardest case first - the only left row in a time window was deleted, in this case, a gap is created, and we need to fill the gap with a row of NULLs.

No matter which approach we are using, to implement this, the operator must memorize how many rows were in this window before the delete. Actually, this is exactly what the degree table of left outer join does.

-- imagine it as 
upstream_data LEFT OUTER JOIN per_minutes_series

However, the rest state tables, such as the materialized rows of both left & right sides, were useless, because

The left state table (i.e. the materialized rows from upstream_data) is not necessary because every time an upstream_data row comes, it can be emitted immediately and doesn't need to wait for per_minutes_series
The right statble table (i.e. the materialized rows from per_minutes_series) is wasteful because it's very simple. Just keeping the start_time and end_time would be enough.

I am thinking that we might need to introduce a special operator, saying, like a much simplified outer join operator with a degree table. Of course, this can be a second step and naive join can be used before it's completed, but before this we need to carefully consider the syntax: join, aggregate function, or TVF.

from risingwave.

st1page commented on May 27, 2024

dolphinDB allow a interval syntax in the group by clause, which supports the "fill" argument

fill indicates how to fill the missing values of the result. It can take the value of “prev”, “post”, “linear”, “null”, a specific numeric value and “none”.
“prev”: the previous value
“post”: the next value
“linear”: linear interpolation. For non-numerical data, linear interpolation cannot be used and the “prev” method will be used instead.
“null”: null value
a specific numeric value.
“none”: do not interpolate

https://docs.dolphindb.cn/en/help/SQLStatements/interval.html

from risingwave.

fuyufjh commented on May 27, 2024

https://docs.dolphindb.cn/en/help/SQLStatements/interval.html

Interesting. While for RisingWave, the closest syntax is actually the time window function i.e. TUMBLE and HOP. But regarding of implementation, it seems that reusing the state in Agg is the most efficient way.

from risingwave.

feat: an append-only materialized view storing continuous timestamps about risingwave HOT 13 OPEN

Comments (13)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent