Coder Social home page Coder Social logo

Sql: Allow array counting about crate HOT 21 CLOSED

crate avatar crate commented on June 22, 2024 1
Sql: Allow array counting

from crate.

Comments (21)

diyism avatar diyism commented on June 22, 2024 1

I need "select sum(array_count(array_matched(time_visit))) as total from statis_user_success where any(time_visit)>1481009355". Related to https://www.bountysource.com/issues/39611200-feature-request-support-array_matched-to-filter-elements-by-any-condition

from crate.

diyism avatar diyism commented on June 22, 2024 1

After 3 years and 4 months, any progress?

from crate.

joemoe avatar joemoe commented on June 22, 2024 1

@diyism unfortunately not.

from crate.

lukasender avatar lukasender commented on June 22, 2024

Hey @eMerzh !

we will put that on our backlog.
stay tuned :)

from crate.

mfussenegger avatar mfussenegger commented on June 22, 2024

Hi @eMerzh

we are about to start working on this and would like to have a bit more feedback from you regarding your use case.

Either here in the github issue or if you prefer you could also drop by on our irc channel on freenode - if no one is answering just mention someone with op rights directly

from crate.

mfussenegger avatar mfussenegger commented on June 22, 2024

Hi @eMerzh

thanks for your feedback so far via IRC.
I've another question regarding the array count feature. Would it also be sufficient for your use case to have a set-count function?

In our index we don't store the array as is but rather generate terms/tokens and our idea is to provide a function which exposes that information. Doing so would be much faster to query on then providing the array count.

from crate.

eMerzh avatar eMerzh commented on June 22, 2024

Hi,
thanks to you!
Not sure to understand the "set-count" part ... but if i'm able to do smth like

select * from myTable where array_lenght(MyColumn) > 3;

or

select * from myTable where MyColumn && [A,B,C];

(&& is "overlap" or have elements in common in postgresql )
then i'm happy :D ...
postgresql might be a good source of inspiration for array function as it support array in core in an sql fashion.

from crate.

seut avatar seut commented on June 22, 2024

Hey,

difference is: set-count would count only distinct values whereas array-count would count all, including duplicates.

from crate.

eMerzh avatar eMerzh commented on June 22, 2024

then, yeah no problem for my use case :)

from crate.

devoncrouse avatar devoncrouse commented on June 22, 2024

+1

from crate.

joemoe avatar joemoe commented on June 22, 2024

@eMerzh we added some array functions in the last time.
https://crate.io/docs/reference/en/latest/sql/scalar.html#array-functions
do you still need the count or overlap?

from crate.

eMerzh avatar eMerzh commented on June 22, 2024

thanks @joemoe :) ... seems better, but how do you count the length of and array now?

from crate.

joemoe avatar joemoe commented on June 22, 2024

@eMerzh at the moment you can't.
the problem is that for the example you give the query will be either time consuming because the function needs to be applied on every row or we need to build a function based index.
as i interpreted your request it looked like you have growing arrays, each with the same elements. that's why array_intersects would also do its job. is this true?
this would be much faster.

from crate.

eMerzh avatar eMerzh commented on June 22, 2024

yes like you have and array of email... you want to know every users with 2 emails... or smth like that...

from crate.

joemoe avatar joemoe commented on June 22, 2024

ok thanks. are you still using crate?

from crate.

eMerzh avatar eMerzh commented on June 22, 2024

not currently ... ;-) so not top priority for now ;)

from crate.

diyism avatar diyism commented on June 22, 2024

@joemoe, Maybe subquery can help:
select (select count(*) from unnest(tb1.array1)) from tb1
Unfortunately, we can only do:
select (select count(*) from unnest([1,2])) from tb1

In https://crate.io/docs/crate/reference/en/latest/sql/reference/value_expressions.html#scalar-subquery :
...Columns from relations from outside of the subquery cannot be accessed from within the subquery...

Why?

In https://crate.io/docs/crate/reference/en/latest/sql/subquery_expressions.html :
...The used subquery has to be uncorrelated which means that the subquery does not contain references to relations in a parent statement...

Why?

from crate.

diyism avatar diyism commented on June 22, 2024

If we have group_concat():
select char_length(regexp_replace(group_concat(tb1.array1)||',', '[^,]*,', ',', 'g'))
but we can only:
select char_length(regexp_replace('ac,b,23'||',', '[^,]*,', ',', 'g'))

from crate.

diyism avatar diyism commented on June 22, 2024

Currently we can add another column "count_dates":

insert into tb1 (key, dates, count_dates) values ('id_2', [1508999991], 1);
update tb1 set dates=array_cat(dates, [1509000401]),count_dates=count_dates+1 where key='id_2';
select count_dates from tb1 where key='id_2';

But if i insert duplicated "dates", the count_dates will be incorrect:

update tb1 set dates=array_cat(dates, [1509000401]),count_dates=count_dates+1 where key='id_2';
update tb1 set dates=array_unique(array_cat(dates, [1509000401])),count_dates=count_dates+1 where key='id_2';
select count_dates from tb1 where key='id_2';

from crate.

diyism avatar diyism commented on June 22, 2024

The CrateDB 2.3.0 has come, the array count() function is still not available.
I think you can only add an extra count index to all array fields to realize this count() function.

from crate.

mfussenegger avatar mfussenegger commented on June 22, 2024

We've added array_upper and
array_length which can be used for
the use-cases here.

They'll be available in 3.2 and can already be used in the latest nightly.

As already mentioned in this issue in some cases performance of using the
scalars in the where clause can be problematic. We've some optimizations
pending
but not all cases can be
optimized efficiently.

from crate.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.