Comments (1)
I think this would be semantically very problematic. Some aggregation functions can take arrays as input and would aggregate on the array in the usual way. If I saw approx_distinct(array_column), I would expect it to give me the number of distinct arrays, and not the number of distinct elements across all arrays.
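To make the two readings concrete, here is a minimal sketch on a hypothetical inline table (the table alias t and the sample values are made up for illustration). The sample data contains two distinct arrays but three distinct elements. The first query uses count(DISTINCT ...) to stand in for "distinct arrays", on the assumption that arrays of integers are comparable; the second unnests first and then applies approx_distinct to the elements.

```sql
-- Reading 1: count distinct arrays (whole-array equality).
-- Assumes the array element type is comparable.
SELECT count(DISTINCT array_column) AS distinct_arrays
FROM (VALUES (ARRAY[1, 2]), (ARRAY[1, 2]), (ARRAY[2, 3])) AS t(array_column);

-- Reading 2: count distinct elements across all arrays.
-- Each array is expanded to one row per element before aggregating.
SELECT approx_distinct(x) AS distinct_elements
FROM (VALUES (ARRAY[1, 2]), (ARRAY[1, 2]), (ARRAY[2, 3])) AS t(array_column)
CROSS JOIN UNNEST(array_column) AS u(x);
```

Keeping the unnest explicit makes it clear in the query text which of the two meanings is intended.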
CROSS JOIN UNNEST can only be done with one array column at a time. What if I want to SUM two such columns in a single pass?
You can unnest multiple array columns at a time. See https://prestodb.io/docs/current/sql/select.html#unnest.
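As a rough sketch of what that looks like, the query below unnests two array columns in one UNNEST call and sums both in a single pass. The column names a and b and the sample values are hypothetical. Per the linked documentation, when the arrays differ in length the shorter one is padded with NULLs, which sum() ignores.

```sql
-- Unnest two array columns side by side and aggregate both at once.
-- Rows where one array is shorter get NULL for that column.
SELECT
    sum(x) AS sum_a,
    sum(y) AS sum_b
FROM (VALUES (ARRAY[1, 2, 3], ARRAY[10, 20])) AS t(a, b)
CROSS JOIN UNNEST(a, b) AS u(x, y);
```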
For some of the cases you described above, if you don't want to unnest, you can consider first using an array function that does part of the aggregation for you on each row's array, and then doing an aggregation on top of it. For example:
If you are trying to get the number of distinct elements across all arrays, you could unnest and do an approx_distinct as you described. Alternatively, you could do cardinality(set_union(array_column)).
If you are trying to get the sum across all arrays, you could unnest and sum as you described, or you could do sum(array_sum(array_column)).
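Both alternatives can be sketched in one query without any unnest. The inline table and its values are hypothetical, and this assumes a Presto version in which the set_union aggregate and the array_sum function are available.

```sql
-- Aggregate over arrays directly, with no CROSS JOIN UNNEST.
SELECT
    -- set_union merges all arrays into one array of distinct elements;
    -- cardinality then counts them (an exact count, unlike approx_distinct).
    cardinality(set_union(array_column)) AS distinct_elements,
    -- array_sum reduces each row's array to a scalar; sum adds the per-row totals.
    sum(array_sum(array_column)) AS total
FROM (VALUES (ARRAY[1, 2]), (ARRAY[1, 2]), (ARRAY[2, 3])) AS t(array_column);
```

Note that set_union has to hold the full set of distinct elements in memory, so for very high cardinalities the unnest plus approx_distinct route may be the more practical choice.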
also cc: @kaikalur