Comments (13)
Can this just be expressed with
create materialized view xxx as generate_series( '2020-01-01 00:00:00'::TIMESTAMP, -- start NOW(), -- end interval '1' minute -- step );Internally,
NowExecutor
will be used
The issue is that it will generate too many data in the table and also it is a big generate_series batch query.
from risingwave.
```sql create materialized view xxx as generate_series( '2020-01-01 00:00:00'::TIMESTAMP, -- start NOW(), -- end interval '1' minute -- step );
Oh, create mv with a generate_series
is even worse
create materialized view xxx as generate_series(
'2020-01-01 00:00:00'::TIMESTAMP, -- start
NOW(), -- end
interval '1' minute -- step
);
WIth this query, every time the now()'s value changes, RW will retract all the result of the old TVF generate_series
and send the new value.
from risingwave.
Ohh so sorry I misunderstand it ... LGTM to use the syntax.
generate_series(
'2020-01-01 00:00:00'::TIMESTAMP, -- start
NOW(), -- end
interval '1' minute -- step
);
from risingwave.
By the way, share the time_bucket_gapfill
from TimescaleDB.
Compared with Join
, it has 2 advantages:
- It allows specifying how to interpolate values.
- Streaming
Join
is very expensive
Regarding of the user interface, I slightly prefer this than JOIN
, also because it's more straigh-forward
from risingwave.
Oh, I just realized the discussion around gap filling and interpolation has been a little bit off-topic. Let's focus on the original requirement for this issue. That is, supporting
create materialized view xxx as generate_series( '2020-01-01 00:00:00'::TIMESTAMP, -- start NOW(), -- end interval '1' minute -- step );
with some efficient incremental way.
from risingwave.
Can this just be expressed with
create materialized view xxx as generate_series(
'2020-01-01 00:00:00'::TIMESTAMP, -- start
NOW(), -- end
interval '1' minute -- step
);
Internally, NowExecutor
will be used
from risingwave.
also it is a big generate_series batch query.
Didn't get this. Can you explain it? I don't see any batch query here.
By the way, regarding the original query in PR description
CREATE VIEW timer_per_minutes AS SELECT * FROM per_minutes where minute > now();
Do you mean where minute < now()
?
from risingwave.
CREATE VIEW timer_per_minutes AS SELECT * FROM per_minutes where minute > now();
Do you mean
where minute < now()
?
Yes, I will change it.
from risingwave.
Oh, create mv with a
generate_series
is even worsecreate materialized view xxx as generate_series( '2020-01-01 00:00:00'::TIMESTAMP, -- start NOW(), -- end interval '1' minute -- step );
WIth this query, every time the now()'s value changes, RW will retract all the result of the old TVF
generate_series
and send the new value.
Yeah, by the current implementation, it is. I am just proposing to use the syntax. We should make it incremental internally.
from risingwave.
Streaming Join is very expensive
So, Even a streaming time_bucket_gapfill
is better because it exploits the property that the time columns are basically ordered and that the gaps are not very large? 🤔 It is interesting.
from risingwave.
Streaming Join is very expensive
So, Even a streaming
time_bucket_gapfill
is better because it exploits the property that the time columns are basically ordered and that the gaps are not very large? 🤔 It is interesting.
Let me explain why a (naive) outer join is not the optimal approach.
For example, let's consider the hardest case first - the only left row in a time window was deleted, in this case, a gap is created, and we need to fill the gap with a row of NULL
s.
No matter which approach we are using, to implement this, the operator must memorize how many rows were in this window before the delete
. Actually, this is exactly what the degree table of left outer join does.
-- imagine it as
upstream_data LEFT OUTER JOIN per_minutes_series
However, the rest state tables, such as the materialized rows of both left & right sides, were useless, because
- The left state table (i.e. the materialized rows from
upstream_data
) is not necessary because every time anupstream_data
row comes, it can be emitted immediately and doesn't need to wait forper_minutes_series
- The right statble table (i.e. the materialized rows from
per_minutes_series
) is wasteful because it's very simple. Just keeping the start_time and end_time would be enough.
I am thinking that we might need to introduce a special operator, saying, like a much simplified outer join operator with a degree table. Of course, this can be a second step and naive join
can be used before it's completed, but before this we need to carefully consider the syntax: join, aggregate function, or TVF.
from risingwave.
dolphinDB allow a interval
syntax in the group by clause, which supports the "fill" argument
fill indicates how to fill the missing values of the result. It can take the value of “prev”, “post”, “linear”, “null”, a specific numeric value and “none”.
“prev”: the previous value
“post”: the next value
“linear”: linear interpolation. For non-numerical data, linear interpolation cannot be used and the “prev” method will be used instead.
“null”: null value
a specific numeric value.
“none”: do not interpolate
https://docs.dolphindb.cn/en/help/SQLStatements/interval.html
from risingwave.
https://docs.dolphindb.cn/en/help/SQLStatements/interval.html
Interesting. While for RisingWave, the closest syntax is actually the time window function i.e. TUMBLE
and HOP
. But regarding of implementation, it seems that reusing the state in Agg
is the most efficient way.
from risingwave.
Related Issues (20)
- cherrypick fix: fix compile err in release build of risectl (#15412) to branch release-1.7
- perf(compactor): Record changes related to the compactor component HOT 1
- Support `rw_int256` type in Postgres CDC source HOT 5
- doris-sink: doris type is "JSON" risingwave type is Jsonb
- cherrypick fix(meta): wait for new worker join in migrate workers (#15967) to branch release-1.7 HOT 2
- cherrypick feat(connector): add SSL support for external cdc postgres connector (#15690) to branch release-1.8
- cherrypick feat(metrics): add Barrier pending time metrics (#15938) to branch release-1.8
- cherrypick feat(metrics): add Barrier pending time metrics (#15938) to branch release-1.7
- cherrypick feat(metrics): add Barrier pending time metrics (#15938) to branch release-1.8
- cherrypick feat(metrics): add Barrier pending time metrics (#15938) to branch release-1.8
- test: add some test for split change
- improve script for running e2e source tests HOT 7
- cherrypick fix: handle kinesis unhandled error: InternalError (#15979) to branch release-1.7
- cherrypick fix(optimizer): fix ColIndexMapping target size for TranslateApplyRule (#15818) to branch release-1.8
- Tracking issue for shared source
- Support serial type (rowid) in sink
- BUG: Permission denied: cannot execute in a read-only transaction HOT 2
- cherrypick fix(frontend): allow write in rw txn for CI usage (#16012) to branch release-1.8 HOT 1
- Limit the DML memory to avoid frontend OOM
- Write partition data in separate data dir for iceberg sink
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from risingwave.