
Comments (3)

canhld94 commented on June 28, 2024

I will fix this asap

from clickhouse.

canhld94 commented on June 28, 2024

@tavplubix can we get some information about the query that triggered the issue?


canhld94 commented on June 28, 2024

We're running some tests to see whether we can reproduce the issue. Regardless, even without this bug, we will likely remove combineFilterAndIndices from FilterTransform.

Recall that vertical FINAL has two major steps:

  • (1) In the merging final step, we don't copy the selected rows; instead we remember their positions and generate a filter (in the form of a list of indices).
  • (2) If the query has a WHERE clause, we merge this filter with the filter generated by the WHERE expression in FilterTransform; otherwise, we add a filter step right after the merging final step.
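To make the two steps concrete, here is a minimal Python sketch. It is only an illustration of the idea, not the ClickHouse implementation: the function names `final_select_indices` and `combine_filter_and_indices`, the "highest version wins" rule, and the sample data are all assumptions made for this example.

```python
from typing import List, Sequence


def final_select_indices(keys: Sequence[str], versions: Sequence[int]) -> List[int]:
    """Step (1), sketched: remember the position of the winning row per key
    instead of copying rows. Assumes the row with the highest version wins,
    and that later rows win ties."""
    best = {}
    for i, (key, version) in enumerate(zip(keys, versions)):
        if key not in best or version >= versions[best[key]]:
            best[key] = i
    return sorted(best.values())


def combine_filter_and_indices(where_mask: Sequence[bool], indices: Sequence[int]) -> List[bool]:
    """Step (2), sketched: a row survives only if FINAL selected it AND it
    passes WHERE. The indices must refer to the same chunk as the mask."""
    if any(i < 0 or i >= len(where_mask) for i in indices):
        raise ValueError("index does not belong to this chunk")
    combined = [False] * len(where_mask)
    for i in indices:
        combined[i] = where_mask[i]
    return combined


# Hypothetical chunk: (key, version, value) stored column by column.
keys = ["a", "a", "b", "c", "c"]
versions = [1, 2, 1, 1, 3]
values = [10, 20, 30, 40, 50]

indices = final_select_indices(keys, versions)          # [1, 2, 4]
where_mask = [v > 25 for v in values]                   # WHERE value > 25
combined = combine_filter_and_indices(where_mask, indices)
result = [v for v, keep in zip(values, combined) if keep]  # [30, 50]
```

Note that in this sketch the WHERE mask is evaluated on every row of the chunk, including rows that FINAL will discard anyway.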

The feature has been tested thoroughly in our environment. During testing we found hardly any issues with (1), but we found two issues with (2): one bug and one potential performance issue.

  • The bug: if there are additional steps between the merging final step and the filter transform, those steps may change the number of rows in the chunk, and (2) no longer works correctly. At that time, the case we found was a query with ARRAY JOIN before the filter. We fixed the bug by checking whether the query has any ARRAY JOIN before WHERE before applying (2). We thought the fix was sufficient because we believed ARRAY JOIN was the only expression that can change the number of rows in a chunk.
  • The performance issue: when the WHERE expression is expensive to compute, (2) may be slower than the normal path because we need to compute WHERE on more rows.
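The row-count problem can be shown with a small Python sketch. This is an illustration under assumed data, not ClickHouse code: `array_join` is a hypothetical helper that mimics what ARRAY JOIN does to a chunk, and `final_indices` stands in for the index list produced by the merging final step.

```python
def array_join(rows):
    """Expand each (key, array) row into one row per array element."""
    return [(key, item) for key, array in rows for item in array]


# Hypothetical 3-row chunk coming out of the merging final step.
chunk = [("a", [1, 2]), ("a", [3]), ("b", [4, 5])]

# Positions selected by FINAL; they are only valid for the 3-row chunk.
final_indices = [1, 2]

# An intermediate step changes the number of rows from 3 to 5.
expanded = array_join(chunk)

# Intended result: apply the FINAL selection first, then expand.
correct = array_join([chunk[i] for i in final_indices])

# What happens if the stale indices are applied to the expanded chunk.
stale = [expanded[i] for i in final_indices]

print(correct)  # [('a', 3), ('b', 4), ('b', 5)]
print(stale)    # [('a', 2), ('a', 3)]
```

Any step that changes the row count between the two transforms invalidates the indices in the same way, which is why a check for ARRAY JOIN alone may not be enough.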

Nevertheless, we decided to keep (2). But with this new bug, I think it's reasonable to remove (2). It appears to me that the number of rows in the chunk produced by the merging final step is changed before the chunk reaches FilterTransform. And to be honest, at this point I'm not 100% sure that ARRAY JOIN is the only expression that can change the number of rows in a chunk, or how we can guarantee that the FilterTransform right after the merging final step was produced by WHERE.

So instead of doing (2), there is a more solid and less error-prone optimization: leverage the PREWHERE filter in the chunk (if any) to skip rows in the merging final step.
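One possible reading of this idea, sketched in Python. The thread does not say how it would be implemented, so everything here is an assumption: the function name `final_with_prewhere`, the "highest version wins" rule, and the choice to still pick the winner among all rows and only then drop winners that the PREWHERE mask rejects.

```python
from typing import List, Sequence


def final_with_prewhere(
    keys: Sequence[str],
    versions: Sequence[int],
    prewhere_mask: Sequence[bool],
) -> List[int]:
    """Return positions of rows that win FINAL and also pass PREWHERE.

    Assumption: the winner per key is decided over all rows, so an older
    row is never resurrected just because the newest row fails PREWHERE.
    """
    best = {}
    for i, (key, version) in enumerate(zip(keys, versions)):
        if key not in best or version >= versions[best[key]]:
            best[key] = i
    # Both the mask and the indices refer to the same chunk, inside the same
    # step, so no later transform can make them disagree on the row count.
    return sorted(i for i in best.values() if prewhere_mask[i])


keys = ["a", "a", "b"]
versions = [1, 2, 1]
prewhere_mask = [True, False, True]  # hypothetical PREWHERE result per row

selected = final_with_prewhere(keys, versions, prewhere_mask)  # [2]
```

In this sketch the filter and the index list are produced and consumed in one place, which is what makes it less fragile than merging them later in FilterTransform.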

@tavplubix @nickitat @KochetovNicolai I would like to hear your opinions.
cc @jorisgio

