Comments (14)
This is a cloud service on top of duckdb - an embedded database engine for Python, which has some origins and ideas borrowed from ClickHouse.
The registration page works, but login fails in some cases:
from clickbench.
The UI looks cute:
One issue with the UI - when I type my query and press Enter, the entered query disappears, and I cannot easily run it again.
SELECT 1
does not work:
from clickbench.
Every example from the list shows "Access denied":
from clickbench.
I tried to use it with our public bucket, s3://clickhouse-public-datasets/
, but it shows "access denied".
It looks like I need to create an IAM role and switch to it: https://www.boilingdata.com/apidoc.html
from clickbench.
My test query is:
SELECT * FROM parquet_scan('s3://clickhouse-public-datasets/hits_compatible/hits.parquet') LIMIT 10
@santrancisco created an IAM role for me: arn:aws:iam::609927696493:role/boilingdata_test
and I've set it with
PRAGMA s3AccessRoleArn='arn:aws:iam::123456789012:role/bdS3';
as in the documentation.
But I'm still getting an access denied error.
from clickbench.
Sometimes it took over 2 minutes for a “select 1” to return for me
from clickbench.
After waiting for a while, the "Access denied" error disappeared, but now I'm getting a cryptic error message:
{
"status": "Forward count > 1, loop?! (2)"
}
which I don't know what it means.
from clickbench.
Sometimes it took over 2 minutes for a “select 1” to return for me
This contradicts the statement on their website:
Instant Computing Power For Realtime Dashboarding
Realtime dashboarding is a natural and well-explored use case for ClickHouse, and I don't like when other inferior engines, like duckdb, make unfounded claims, pretending to have similar capabilities.
from clickbench.
Trying other queries:
SELECT count(*) FROM parquet_scan('s3://clickhouse-public-datasets/hits_compatible/hits.parquet');
SELECT count(DISTINCT URL) FROM parquet_scan('s3://clickhouse-public-datasets/hits_compatible/hits.parquet');
SELECT count(DISTINCT URL) FROM parquet_scan('s3://clickhouse-public-datasets/hits_compatible/hits.parquet') GROUP BY UserID;
but none works giving the same error Forward count > 1, loop?! (2)
from clickbench.
Sometimes it took over 2 minutes for a “select 1” to return for me
I'm waiting for several minutes, but then it logs me out.
When I log in again, there is no query history.
I didn't have a single time the SELECT 1
query succeeded before it logged out my session.
from clickbench.
Hello, sorry for replying this late! Since Jul we have new GUI and new backend release so things should work better. You could use the bdcli to setup the IAM Role as it crafts the IAM Policy for you.
- I think the IAM Role needs access to any public S3 Buckets too (egress/outbound rule). I'm trying to query that public bucket above and I'm getting access denied error too (even I explicitly added that S3 Bucket
clickhouse-public-datasets
into the IAM policy. Is it possible that the bucket policy does not allow get-bucket-location API call?). SELECT 1
now also works 🎉 😅 and it should respond fast and is probably a good measure of the overhead latency between client and Boiling cloud- For now, we are not focusing on building a great GUI, so the GUI is for demo purposes for now. Query history would be definitely nice and also ability to save the queries (there are tons of very good BI Tool/workbenches)
- NOTE: For now, the only up-to-date AWS region is
eu-west-1
. So, you would need to have data there in this region, i.e. the S3 Bucket.
Does this help?
from clickbench.
This is the set of queries I'm running on Boiling. I've fixed some sort orders and column aliases to make the results comparable with local laptop running DuckDB instance. So, I have copied the hits.parquet
file into Boiling demo S3 bucket on eu-west-1
.
SELECT COUNT(*) AS count FROM parquet_scan('s3://boilingdata-demo/hits.parquet');
SELECT COUNT(*) AS count FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE AdvEngineID <> 0;
SELECT UserID FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE UserID = 435090932899640449;
SELECT UserID FROM parquet_scan('s3://boilingdata-demo/hits.parquet') ORDER BY UserID LIMIT 10;
SELECT COUNT(*) AS count FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE URL LIKE '%google%';
SELECT SearchPhrase, EventTime FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE SearchPhrase <> '' ORDER BY EventTime, SearchPhrase LIMIT 10;
SELECT SearchPhrase FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE SearchPhrase <> '' ORDER BY SearchPhrase LIMIT 10;
SELECT SearchPhrase FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE SearchPhrase <> '' ORDER BY EventTime, SearchPhrase LIMIT 10;
SELECT AdvEngineID, COUNT(*) AS count FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE AdvEngineID > 0 GROUP BY AdvEngineID ORDER BY COUNT(*) DESC;
SELECT SearchPhrase, MIN(URL) AS url, COUNT(*) AS c FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE URL LIKE '%google%' AND SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY c DESC, SearchPhrase LIMIT 10;
SELECT COUNT(*) AS total, SearchPhrase FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE SearchPhrase != '' GROUP BY SearchPhrase ORDER BY SearchPhrase, total DESC LIMIT 10;
SELECT COUNT(DISTINCT SearchPhrase) AS total FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE SearchPhrase != '';
SELECT COUNT(DISTINCT UserID) AS uniqueUserIds FROM parquet_scan('s3://boilingdata-demo/hits.parquet');
SELECT RegionID, COUNT(DISTINCT UserID) AS u FROM parquet_scan('s3://boilingdata-demo/hits.parquet') GROUP BY RegionID ORDER BY u DESC LIMIT 10;
SELECT UserID, COUNT(*) AS count FROM parquet_scan('s3://boilingdata-demo/hits.parquet') GROUP BY UserID ORDER BY COUNT(*) DESC, UserID LIMIT 10;
SELECT UserID, SearchPhrase, COUNT(*) FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE SearchPhrase <> '' GROUP BY UserID, SearchPhrase ORDER BY COUNT(*) DESC, UserID LIMIT 10;
SELECT MobilePhoneModel, COUNT(DISTINCT UserID) AS u FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE MobilePhoneModel <> '' AND UserID <> '' GROUP BY MobilePhoneModel ORDER BY u DESC LIMIT 10;
SELECT MobilePhone, MobilePhoneModel, COUNT(DISTINCT UserID) AS u FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE MobilePhoneModel <> ''AND UserID <> '' GROUP BY MobilePhone, MobilePhoneModel ORDER BY u DESC LIMIT 10;
SELECT SearchPhrase, COUNT(DISTINCT UserID) AS u FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY u DESC, SearchPhrase LIMIT 13;
SELECT SearchEngineID, SearchPhrase, COUNT(*) AS c FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE SearchPhrase <> '' GROUP BY SearchEngineID, SearchPhrase ORDER BY c DESC LIMIT 10;
SELECT UserID, COUNT(*) FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE UserID <> NULL GROUP BY UserID ORDER BY COUNT(*) DESC LIMIT 10;
SELECT SUM(ResolutionWidth) AS resWth, SUM(ResolutionWidth + 1) AS resW1, SUM(ResolutionWidth + 2) AS resW2, SUM(ResolutionWidth + 3) AS resW3, SUM(ResolutionWidth + 4) AS resW4, SUM(ResolutionWidth + 5) AS resW5, SUM(ResolutionWidth + 6) AS resW6, SUM(ResolutionWidth + 7) AS resW7, SUM(ResolutionWidth + 8) AS resW8, SUM(ResolutionWidth + 9) AS resW9, SUM(ResolutionWidth + 10) AS resW10, SUM(ResolutionWidth + 11) AS resW11, SUM(ResolutionWidth + 12) AS resW12, SUM(ResolutionWidth + 13) AS resW13, SUM(ResolutionWidth + 14) AS resW14, SUM(ResolutionWidth + 15) AS resW15, SUM(ResolutionWidth + 16) AS resW16, SUM(ResolutionWidth + 17) AS resW17, SUM(ResolutionWidth + 18) AS resW18, SUM(ResolutionWidth + 19) AS resW19, SUM(ResolutionWidth + 20) AS resW20, SUM(ResolutionWidth + 21) AS resW21, SUM(ResolutionWidth + 22) AS resW22, SUM(ResolutionWidth + 23) AS resW23, SUM(ResolutionWidth + 24) AS resW24, SUM(ResolutionWidth + 25) AS resW25, SUM(ResolutionWidth + 26) AS resW26, SUM(ResolutionWidth + 27) AS resW27, SUM(ResolutionWidth + 28) AS resW28, SUM(ResolutionWidth + 29) AS resW29, SUM(ResolutionWidth + 30) AS resW30, SUM(ResolutionWidth + 31) AS resW31, SUM(ResolutionWidth + 32) AS resW32, SUM(ResolutionWidth + 33) AS resW33, SUM(ResolutionWidth + 34) AS resW34, SUM(ResolutionWidth + 35) AS resW35, SUM(ResolutionWidth + 36) AS resW36, SUM(ResolutionWidth + 37) AS resW37, SUM(ResolutionWidth + 38) AS resW38, SUM(ResolutionWidth + 39) AS resW39, SUM(ResolutionWidth + 40) AS resW40, SUM(ResolutionWidth + 41) AS resW41, SUM(ResolutionWidth + 42) AS resW42, SUM(ResolutionWidth + 43) AS resW43, SUM(ResolutionWidth + 44) AS resW44, SUM(ResolutionWidth + 45) AS resW45, SUM(ResolutionWidth + 46) AS resW46, SUM(ResolutionWidth + 47) AS resW47, SUM(ResolutionWidth + 48) AS resW48, SUM(ResolutionWidth + 49) AS resW49, SUM(ResolutionWidth + 50) AS resW50, SUM(ResolutionWidth + 51) AS resW51, SUM(ResolutionWidth + 52) AS resW52, SUM(ResolutionWidth + 53) AS resW53, SUM(ResolutionWidth + 54) AS resW54, SUM(ResolutionWidth + 55) AS resW55, SUM(ResolutionWidth + 56) AS resW56, SUM(ResolutionWidth + 57) AS resW57, SUM(ResolutionWidth + 58) AS resW58, SUM(ResolutionWidth + 59) AS resW59, SUM(ResolutionWidth + 60) AS resW60, SUM(ResolutionWidth + 61) AS resW61, SUM(ResolutionWidth + 62) AS resW62, SUM(ResolutionWidth + 63) AS resW63, SUM(ResolutionWidth + 64) AS resW64, SUM(ResolutionWidth + 65) AS resW65, SUM(ResolutionWidth + 66) AS resW66, SUM(ResolutionWidth + 67) AS resW67, SUM(ResolutionWidth + 68) AS resW68, SUM(ResolutionWidth + 69) AS resW69, SUM(ResolutionWidth + 70) AS resW70, SUM(ResolutionWidth + 71) AS resW71, SUM(ResolutionWidth + 72) AS resW72, SUM(ResolutionWidth + 73) AS resW73, SUM(ResolutionWidth + 74) AS resW74, SUM(ResolutionWidth + 75) AS resW75, SUM(ResolutionWidth + 76) AS resW76, SUM(ResolutionWidth + 77) AS resW77, SUM(ResolutionWidth + 78) AS resW78, SUM(ResolutionWidth + 79) AS resW79, SUM(ResolutionWidth + 80) AS resW80, SUM(ResolutionWidth + 81) AS resW81, SUM(ResolutionWidth + 82) AS resW82, SUM(ResolutionWidth + 83) AS resW83, SUM(ResolutionWidth + 84) AS resW84, SUM(ResolutionWidth + 85) AS resW85, SUM(ResolutionWidth + 86) AS resW86, SUM(ResolutionWidth + 87) AS resW87, SUM(ResolutionWidth + 88) AS resW88, SUM(ResolutionWidth + 89) AS resW89 FROM parquet_scan('s3://boilingdata-demo/hits.parquet');
SELECT SearchPhrase, MIN(URL) AS url, MIN(Title) AS title, COUNT(*) AS c, COUNT(DISTINCT UserID) AS usrIdCount FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE Title LIKE '%Google%' AND URL NOT LIKE '%.google.%' AND SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY c, searchphrase DESC LIMIT 10;
SELECT WatchID, ClientIP, COUNT(*) AS c, SUM(IsRefresh), AVG(ResolutionWidth) FROM parquet_scan('s3://boilingdata-demo/hits.parquet') GROUP BY WatchID, ClientIP ORDER BY c DESC, WatchID, ClientIP LIMIT 10;
SELECT URL, COUNT(*) AS c FROM parquet_scan('s3://boilingdata-demo/hits.parquet') GROUP BY URL ORDER BY c DESC LIMIT 12;
SELECT URL, COUNT(*) AS c FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE url ='http://kinopoisk.ru/perm.irr.ru' GROUP BY URL;
SELECT SUM(AdvEngineID), COUNT(*), AVG(ResolutionWidth) FROM parquet_scan('s3://boilingdata-demo/hits.parquet');
SELECT AVG(UserID) FROM parquet_scan('s3://boilingdata-demo/hits.parquet');
SELECT MIN(EventDate) AS minEventDate, MAX(EventDate) AS maxEventDate FROM parquet_scan('s3://boilingdata-demo/hits.parquet');
SELECT RegionID, SUM(AdvEngineID) AS sumAdvEngineId, COUNT(*) AS c, AVG(ResolutionWidth) AS avgResolutionWidth, COUNT(DISTINCT UserID) AS distinctUsers FROM parquet_scan('s3://boilingdata-demo/hits.parquet') GROUP BY RegionID ORDER BY c DESC LIMIT 10;
SELECT 1, URL, COUNT(*) AS c FROM parquet_scan('s3://boilingdata-demo/hits.parquet') GROUP BY 1, URL ORDER BY c DESC LIMIT 10;
SELECT URL, COUNT(*) AS PageViews FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND DontCountHits = 0 AND IsRefresh = 0 AND URL <> '' GROUP BY URL ORDER BY PageViews DESC LIMIT 10;
SELECT Title, COUNT(*) AS PageViews FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND DontCountHits = 0 AND IsRefresh = 0 AND Title <> '' GROUP BY Title ORDER BY PageViews DESC LIMIT 10;
SELECT URL, COUNT(*) AS PageViews FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND IsRefresh = 0 AND IsLink <> 0 AND IsDownload = 0 GROUP BY URL ORDER BY PageViews DESC LIMIT 10 OFFSET 1000;
SELECT TraficSourceID, SearchEngineID, AdvEngineID, CASE WHEN (SearchEngineID = 0 AND AdvEngineID = 0) THEN Referer ELSE '' END AS Src, URL AS Dst, COUNT(*) AS PageViews FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND IsRefresh = 0 GROUP BY TraficSourceID, SearchEngineID, AdvEngineID, Src, Dst ORDER BY PageViews DESC LIMIT 10 OFFSET 1000;
SELECT URLHash, EventDate, COUNT(*) AS PageViews FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND IsRefresh = 0 AND TraficSourceID IN (-1, 6) AND RefererHash = 3594120000172545465 GROUP BY URLHash, EventDate ORDER BY PageViews DESC LIMIT 10 OFFSET 100;
SELECT WindowClientWidth, WindowClientHeight, COUNT(*) AS PageViews FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND IsRefresh = 0 AND DontCountHits = 0 AND URLHash = 2868770270353813622 GROUP BY WindowClientWidth, WindowClientHeight ORDER BY PageViews DESC LIMIT 10 OFFSET 10000;
SELECT REGEXP_REPLACE(Referer, '^https?://(?:www.)?([^/]+)/.*$', '\x01') AS k, AVG(STRLEN(Referer)) AS l, COUNT(*) AS c, MIN(Referer) FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE Referer <> '' GROUP BY k HAVING COUNT(*) > 100000 ORDER BY l DESC, c LIMIT 25;
SELECT SearchEngineID, ClientIP, COUNT(*) AS c, SUM(IsRefresh) AS sumIsRefresh, AVG(ResolutionWidth) AS avgResWidth FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE SearchPhrase <> '' GROUP BY SearchEngineID, ClientIP ORDER BY c DESC, SearchEngineID, ClientIP LIMIT 10;
SELECT WatchID, ClientIP, COUNT(*) AS c, SUM(IsRefresh) AS sumIsRefresh, AVG(ResolutionWidth) AS AvgRsesWidth FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE SearchPhrase <> '' GROUP BY WatchID, ClientIP ORDER BY c, WatchID, ClientIP DESC LIMIT 10;
SELECT ClientIP, ClientIP - 1, ClientIP - 2, ClientIP - 3, COUNT(*) AS c FROM parquet_scan('s3://boilingdata-demo/hits.parquet') GROUP BY ClientIP, ClientIP - 1, ClientIP - 2, ClientIP - 3 ORDER BY c DESC, ClientIP LIMIT 11;
SELECT COUNT(*) AS count FROM ( SELECT SearchPhrase FROM parquet_scan('s3://boilingdata-demo/hits.parquet') GROUP BY SearchPhrase ) AS a ORDER BY count DESC;
SELECT DATE_TRUNC('minute', EventTime::timestamp) AS M, COUNT(*) AS PageViews FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE CounterID = 62 AND EventDate >= '2013-07-14' AND EventDate <= '2013-07-15' AND IsRefresh = 0 AND DontCountHits = 0 GROUP BY DATE_TRUNC('minute', EventTime::timestamp) ORDER BY DATE_TRUNC('minute', EventTime::timestamp) LIMIT 10 OFFSET 1000;
SELECT CounterID, AVG(STRLEN(URL)) AS l, COUNT(*) AS c FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE URL <> '' GROUP BY CounterID HAVING COUNT(*) > 100000 ORDER BY l DESC LIMIT 25;
SELECT * FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE URL LIKE '%google%' ORDER BY EventTime, WatchID LIMIT 1;
from clickbench.
Thank you! Yes, let's try these queries.
I can also try if I will not forget, but you can also submit the results.
from clickbench.
I've created "engineering white paper" with the test results. Link is on front page: https://www.boilingdata.com/
from clickbench.
Related Issues (20)
- Databend benchmark is not valid the the latest Databend versions HOT 3
- Does the skipping index have more advantages for test dataset HOT 1
- Why is the perf of cold scan much worse than ever in the last commit of m5d.24xlarge. HOT 1
- Add YTsaurus support
- Can we get a larger dataset? HOT 1
- Doris vs Clickhouse for TPC-H HOT 1
- Update DataFusion & results
- AlloyDB benchmark results for 8 vCPU not clear
- Any plan for adding queries with window function? HOT 1
- InfluxDB benchmarks
- Segmentation fault running hardware.sh while running Test 17 HOT 1
- The Pinot benchmark does not have indices. HOT 7
- Elasticsearch benchmarks flush the cache between queries HOT 6
- ClickHouse appears to run into an overflow on Q3 HOT 2
- Add YDB for comparing HOT 1
- Inaccurate table size calculation of Mysql HOT 1
- Add Quickwit support HOT 5
- feat: add Github Actions to auto generate index.html HOT 1
- Syntax error for postgresql CREATE TABLE HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from clickbench.