Coder Social home page Coder Social logo

Comments (7)

Fleid avatar Fleid commented on June 9, 2024 3

So what we want is:

on_schema_change In case of new columns... In case of removed columns...
classic columns ignored run failed
ignore columns ignored columns ignored
fail run failed run failed
append_new_columns columns added columns ignored
sync_all_columns columns added columns removed
  • A new classic mode, wired as the current ignore mode
  • Fixing ignore to actually ignore removed columns

I'm good for it!

from dbt-bigquery.

sudo-pradip avatar sudo-pradip commented on June 9, 2024 1

Hello @Fleid,

We're really excited about this feature! 😄

Could you please let us know if development is currently underway for it? Has the DBT community reached a decision on this issue and approved updates to the behavior? And if the work hasn't started yet, would it be possible for me to get involved and contribute to its development? 🚀

from dbt-bigquery.

mbarugelCA avatar mbarugelCA commented on June 9, 2024

I am seeing this same behavior in Redshift.

For further reference, if you set on_schema_change='append_new_columns', the materialization works just fine even if there are columns in the target table that do not exist in the source.

from dbt-bigquery.

Fleid avatar Fleid commented on June 9, 2024

I can reproduce this. After spending some time looking at the code, I feel like the actual descriptions of the options are:

on_schema_change In case of new columns... In case of removed columns...
ignore columns ignored run failed
fail run failed run failed
append_new_columns columns added columns ignored
sync_all_columns columns added columns removed

@jtcohen6 it may just be that ignore is mislabeled and should really be original? Looks like everybody took for granted what the original behavior was, but now that the doc's not there anymore, it's hard to say looking at it from the future ;) Do you remember the time?


Anyway, in incremental.sql, we try process_schema_changes to get the target schema, and if we get nothing we pull the existing one (with the removed rows).
https://github.com/dbt-labs/dbt-core/blob/e8da84fb9e177d9eee3d7722d3d6906bb283183d/core/dbt/include/global_project/macros/materializations/models/incremental/incremental.sql#L45-L49

In on_schema_change.sql, we abandon ship if the option is ignore:

https://github.com/dbt-labs/dbt-core/blob/7efb6ab62dae3ea9da4e6e28e9a30aa4e0e74919/core/dbt/include/global_project/macros/materializations/models/incremental/on_schema_change.sql#L102-L106

So yes, ignore will result in a DML statement being built with dest_columns listing the removed rows.


Speaking of DML statement, looking at incremental merge (through 3 layers of dispatch):
https://github.com/dbt-labs/dbt-core/blob/f841a7ca76b44429eb94174bc5d0c2fecbf2763b/core/dbt/include/global_project/macros/materializations/models/incremental/merge.sql#L40-L44

We're looping on update_columns which comes from:

https://github.com/dbt-labs/dbt-core/blob/f841a7ca76b44429eb94174bc5d0c2fecbf2763b/core/dbt/include/global_project/macros/materializations/models/incremental/merge.sql#L11-L12

So really just dest_columns. Note, I don't think we have documented merge_exclude_columns, only merge_update_columns.

And back to the merge, for inserted columns:

https://github.com/dbt-labs/dbt-core/blob/f841a7ca76b44429eb94174bc5d0c2fecbf2763b/core/dbt/include/global_project/macros/materializations/models/incremental/merge.sql#L47-L50

Which can only fail, since it tries to pull columns from the source that have been removed.


I'd love a sanity check here, but my thinking is that this bug is actually a documentation clarification (ignore really is original or classic) plus an enhancement (a new mode that actually does ignore/ignore on that table at the top)

from dbt-bigquery.

dbeatty10 avatar dbeatty10 commented on June 9, 2024

@Fleid nice sleuth work :shipit:

Although I didn't actually sanity check this for you, I did create the following issue for the documentation clarification:

We can always yank that issue if it turns out to not be needed.

How are you thinking about accomplishing the full switch from ignoreoriginal or classic? One PR or multiple? How would we communicate the change?

from dbt-bigquery.

awong-evoiq avatar awong-evoiq commented on June 9, 2024

Can we make this into a feature request then? It doesn't make a lot of sense, semantically, that append_new_columns will ignore extra columns in the db, but ignore will fail. I think the use case mentioned in the ticket is kind of a reasonable thing to support (i.e. dbt being able to allow another system to manage db schemas).

from dbt-bigquery.

dbeatty10 avatar dbeatty10 commented on June 9, 2024

@awong-evoiq I just switched this issue from a bug to a feature request. Could you take a look at the following subject line and see if you want to adjust it accordingly?

BigQuery Incremental - on_schema_change: ignore cannot ignore missing columns

@Fleid would you be willing to re-assess this through the lens of a feature request?

from dbt-bigquery.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.