Comments (7)
The throughput likely increased because of 0cd9ff1 #14558
https://github.com/risingwavelabs/rw-commits-history?tab=readme-ov-file#nightly-20240115
It now drops because of 9417409 #14855
14558 vs 14855, what a coincidence
from risingwave.
Interesting, but I can't explain this phenomenon. IMO this PR should only have an effect on UDF :)
from risingwave.
Interesting, +1. Worth investigating the cause, I think.
from risingwave.
Any update? The perf never came back to the original level: http://metabase.risingwave-cloud.xyz/question/1304-nexmark-q5-many-windows-blackhole-medium-1cn-avg-source-output-rows-per-second-rows-s-history-thtb-266?start_date=2024-01-22
from risingwave.
@TennyZhuang Any updates?
from risingwave.
CREATE SINK nexmark_q5_many_windows
AS
SELECT
    AuctionBids.auction, AuctionBids.num
FROM (
    SELECT
        bid.auction,
        count(*) AS num,
        window_start AS starttime
    FROM
        HOP(bid, date_time, INTERVAL '5' SECOND, INTERVAL '5' MINUTE)
    GROUP BY
        bid.auction,
        window_start
) AS AuctionBids
JOIN (
    SELECT
        max(CountBids.num) AS maxn,
        CountBids.starttime_c
    FROM (
        SELECT
            count(*) AS num,
            window_start AS starttime_c
        FROM
            HOP(bid, date_time, INTERVAL '5' SECOND, INTERVAL '5' MINUTE)
        GROUP BY
            bid.auction,
            window_start
    ) AS CountBids
    GROUP BY
        CountBids.starttime_c
) AS MaxBids
ON
    AuctionBids.starttime = MaxBids.starttime_c AND
    AuctionBids.num >= MaxBids.maxn
WITH ( connector = 'blackhole', type = 'append-only', force_append_only = 'true');
Plan:
StreamSink { type: append-only, columns: [auction, num, window_start(hidden), window_start#1(hidden)] }
└─StreamProject { exprs: [$expr1, count, window_start, window_start] }
  └─StreamFilter { predicate: (count >= max(count)) }
    └─StreamHashJoin { type: Inner, predicate: window_start = window_start }
      ├─StreamExchange { dist: HashShard(window_start) }
      │ └─StreamShare { id: 7 }
      │   └─StreamHashAgg [append_only] { group_key: [$expr1, window_start], aggs: [count] }
      │     └─StreamExchange { dist: HashShard($expr1, window_start) }
      │       └─StreamHopWindow { time_col: $expr2, slide: 00:00:05, size: 00:05:00, output: [$expr1, window_start, _row_id] }
      │         └─StreamProject { exprs: [Field(bid, 0:Int32) as $expr1, Field(bid, 5:Int32) as $expr2, _row_id] }
      │           └─StreamFilter { predicate: IsNotNull(Field(bid, 5:Int32)) AND (event_type = 2:Int32) }
      │             └─StreamRowIdGen { row_id_index: 4 }
      │               └─StreamSource { source: nexmark, columns: [event_type, person, auction, bid, _row_id] }
      └─StreamProject { exprs: [window_start, max(count)] }
        └─StreamHashAgg { group_key: [window_start], aggs: [max(count), count] }
          └─StreamExchange { dist: HashShard(window_start) }
            └─StreamShare { id: 7 }
              └─StreamHashAgg [append_only] { group_key: [$expr1, window_start], aggs: [count] }
                └─StreamExchange { dist: HashShard($expr1, window_start) }
                  └─StreamHopWindow { time_col: $expr2, slide: 00:00:05, size: 00:05:00, output: [$expr1, window_start, _row_id] }
                    └─StreamProject { exprs: [Field(bid, 0:Int32) as $expr1, Field(bid, 5:Int32) as $expr2, _row_id] }
                      └─StreamFilter { predicate: IsNotNull(Field(bid, 5:Int32)) AND (event_type = 2:Int32) }
                        └─StreamRowIdGen { row_id_index: 4 }
                          └─StreamSource { source: nexmark, columns: [event_type, person, auction, bid, _row_id] }
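Note on the "many windows" part of the query name: with a 5-second slide and a 5-minute size, each bid row falls into size / slide = 60 overlapping hop windows, so StreamHopWindow emits roughly 60 rows per input row before the aggregation. The Python sketch below only illustrates that arithmetic. It is not RisingWave's implementation; the function name `hop_window_starts` is hypothetical, and aligning window starts to multiples of the slide from the Unix epoch is an assumption.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # assumed alignment origin for window starts


def hop_window_starts(ts, slide, size):
    """Return the start of every half-open window [start, start + size)
    that contains ts, with starts aligned to multiples of slide."""
    # Latest aligned start that is <= ts (timedelta // timedelta gives an int).
    last_start = EPOCH + ((ts - EPOCH) // slide) * slide
    starts = []
    start = last_start
    # Walk backwards while the window still covers ts, i.e. start + size > ts.
    while start > ts - size:
        starts.append(start)
        start -= slide
    return starts[::-1]  # oldest window first


starts = hop_window_starts(
    datetime(2024, 1, 22, 0, 0, 7),
    slide=timedelta(seconds=5),
    size=timedelta(minutes=5),
)
print(len(starts))  # 60
```

Because the plan shares one StreamHopWindow and one append-only StreamHashAgg (StreamShare id 7) between both join inputs, this 60x fan-out happens once, but every downstream operator sees the amplified stream.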
from risingwave.
It's indeed caused by #14558, but the reason is unknown. Will continue to investigate.
CPU flamegraph: profile results.zip
from risingwave.