Comments (8)
We're going to call this out as a known limitation for the 1.8 release. But this is something we will revisit for 1.9 as an outcome of dbt-labs/dbt-core#8499
from dbt-bigquery.
ZIP file includes two SQL files and a YML file.
from dbt-bigquery.
Hey - thanks so much for opening! Since this is currently broken, we're going to remove that callout from our docs site. But I will sync with our engineers to see how we can fix this.
- We should investigate if we can remove the outer
select * from (
entirely. - We should implement this fix for all adapters that support recursive SQL
from dbt-bigquery.
Just adding the reproducible example Harlan shared.
I have a model I want to unit test:
# models/recursive_cte.sql
WITH RECURSIVE
asdf AS (
SELECT *
FROM {{ ref('recursive_cte_given') }}
)
SELECT *
FROM asdf
I add a unit test:
unit_tests:
- name: test_recursive_cte
model: recursive_cte
given:
- input: ref('recursive_cte_given')
format: csv
rows: |
x
1
expect:
format: csv
rows: |
x
1
from dbt-bigquery.
It looks like WITH RECURSIVE
is supported for most of the dbt Labs supported adapters:
- ✅ BigQuery: * https://cloud.google.com/bigquery/docs/recursive-ctes
✅ Snowflake: https://docs.snowflake.com/en/sql-reference/constructs/with#syntax - ❌ Spark:
- can't find any documented support
- this databricks forum indicates no support?
- ✅ Postgres: https://www.postgresql.org/docs/current/queries-with.html#QUERIES-WITH-RECURSIVE
- ✅ Redshift: https://docs.aws.amazon.com/redshift/latest/dg/r_WITH_clause.html
from dbt-bigquery.
For implementation:
- It looks like the extraneous select * is coming from the
unit
materialization, where we create a temp empty table of the model being tested usingget_create_table_as_sql(True, temp_relation, get_empty_subquery_sql(sql))
- Then, the get_empty_subquery_sql wrapper adds the
select *
andwhere false limit 0
wrapper around the user-provided sql.
I'm not sure if it's possible to remove the select *
but maintain the where false limit 0
filter. If we don't have the where false limit 0
filter, obtaining the column schema of the tested model isn't possible without actually running the query which could be costly...
from dbt-bigquery.
Additionally, we also expect to be able to wrap the user-provided sql in a subquery when constructing the statement that unions actual and expected results for comparison here.
It seems that being able to wrap the user-provided SQL in a subquery is an assumption held by the unit testing framework in multiple places currently.
from dbt-bigquery.
It looks like the dbt-unit-testing package also experiences this limitation, which is unsurprising given the CTE-based approach: EqualExperts/dbt-unit-testing#198
I think the only way to solve this generally is to use a seed-based strategy for ephemeral models, similar to what we'd need for being able to test incremental model upsert/merge logic: dbt-labs/dbt-core#8499
from dbt-bigquery.
Related Issues (20)
- [Bug] State modified does not pick up changes to policy tags
- [Bug] dbt grant doesn't work for clone
- [Feature] Improvement in data processed/cost incurred in insert_overwrite method HOT 4
- [Release Improvements] Refresh workflows for the `pyproject.toml` migration
- [Bug] persist_docs not working for seeds HOT 2
- Import relevant pytest(s) for cross-database `cast` macro
- [Bug] `docs generate` appears to be returning no table metadata when run with the `--no-compile` option HOT 1
- [Tech Debt] `test_dbt_debug` is not dropping its test schema
- [Bug] `docs generate` does not find all schemas when there are more than 10K schemas
- Cross-database `date` macro
- [Bug] incremental run with __dbt_tmp table does not log the real bytes_billed in run_results.json HOT 1
- [Bug] Requested entity not found when writing a Python model to BigQuery
- [Feature] support copy multiple tables in parallel using copy_partitions
- [Feature] Implement batch metadata freshness using `INFORMATION_SCHEMA.TABLE_STORAGE` instead of `client.get_table` HOT 1
- BigQuery authorized dataset HOT 1
- [Bug] --empty flag not working on Pseudo-columns HOT 3
- [Feature] Support overriding `grant_access_to`
- [Bug] many timeouts with `priority: interactive` config HOT 7
- [Bug] Incremental models don't update new elements in struct column
- [Regression] `docs generate` raises error when querying dataset with BQ ML models HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from dbt-bigquery.