meltanolabs / target-snowflake
Singer Target for the Snowflake cloud Data Warehouse
Home Page: https://hub.meltano.com/loaders/target-snowflake--meltanolabs/
License: Other
I'm not sure exactly what this is or how it works, but it seems I'm hitting the upstream issue documented in the Meltano SDK issues: meltano/sdk#344
Also discussed in python-jsonschema/jsonschema#247
jsonschema.exceptions.ValidationError: -4.8 is not a multiple of 0.0001

Failed validating 'multipleOf' in schema['properties']['AMT_EXCL_TAX']:
    {'multipleOf': 0.0001, 'type': ['null', 'number']}

On instance['AMT_EXCL_TAX']:
    -4.8
This particular column in Oracle is defined as
AMT_EXCL_TAX NUMBER(28, 4) DEFAULT 0 NOT NULL
with 20 example values
AMT_EXCL_TAX
-2332.4000
-1212.9600
-1146.0000
-929.0000
-619.0000
-500.0000
-448.0000
-346.0000
-304.0000
-203.5200
-103.2000
-94.9000
-50.4000
-4.8000
21.6000
31.2000
325.2000
1565.6000
3605.0300
3939.7500
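The validation failure itself looks like a binary floating-point artifact: jsonschema's multipleOf check divides the instance by the divisor and tests whether the quotient is integral, and 0.0001 has no exact binary representation. A minimal sketch of that mechanism (this is a guess at the cause, not a full MRE), including the common parse-as-Decimal workaround:

```python
from decimal import Decimal

# With binary floats, -4.8 / 0.0001 does not land exactly on -48000,
# so an integrality test on the quotient fails.
quotient = -4.8 / 0.0001
print(quotient.is_integer())  # not exactly integral

# Parsing the same values as decimals makes the remainder exactly zero,
# which is why loading records with parse_float=decimal.Decimal is a
# commonly suggested workaround for this class of error.
print(Decimal("-4.8") % Decimal("0.0001") == Decimal("0"))
```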
I do not have a valid MRE at this time. Please let me know if I need to make one.
❯ python --version
Python 3.10.11
❯ meltano --version
meltano, version 2.19.1
❯ pip freeze | grep json
check-jsonschema==0.22.0
jsonschema==4.17.3
I'm running a job with
meltano --environment=dev run tap-oracle target-snowflake
meltano.yaml
version: 1
default_environment: dev
project_id: b1d562d0-d114-42a7-bad1-957891a7285b
environments:
- name: dev
config:
plugins:
loaders:
- name: target-snowflake
config:
database: redacted
warehouse: redacted
role: redacted
user: redacted
schema: redacted
default_target_schema: redacted
- name: uat
- name: prod
plugins:
extractors:
- name: tap-oracle
pip_url: git+https://github.com/christianelliott/pipelinewise-tap-oracle.git
config:
filter_schemas: MY_SCHEMA
default_replication_method: FULL_TABLE
port: 1521
host: redacted
user: redacted
service_name: redacted
use_ora_rowscn: true
add_metadata_columns: true
select:
- MY_SCHEMA-FINANCIAL_TRANSACTION.*
metadata:
MY_SCHEMA-FINANCIAL_TRANSACTION:
replication-method: INCREMENTAL
replication-key: ID
loaders:
- name: target-snowflake
variant: meltanolabs
pip_url: meltanolabs-target-snowflake
config:
account: redacted.eu-west-1
add_record_metadata: true
clean_up_batch_files: true
flattening_enabled: false
jobs:
- name: icx--snowflake
tasks:
- tap-oracle target-snowflake
2023-07-03T19:56:58.891691Z [info ] 2023-07-03 21:56:58,891 | INFO | snowflake.connector.cursor | Number of results in first chunk: 1 cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.192610Z [info ] time=2023-07-03 21:59:07 name=singer level=INFO message=METRIC: b'{"type":"counter","metric":"record_count","value":1,"tags":{}}' cmd_type=elb consumer=False name=tap-oracle producer=True stdio=stderr string_id=tap-oracle
2023-07-03T19:59:07.254068Z [info ] Traceback (most recent call last): cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.254354Z [info ] File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/bin/target-snowflake", line 8, in <module> cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.255159Z [info ] sys.exit(TargetSnowflake.cli()) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.255444Z [info ] File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 1130, in __call__ cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.256212Z [info ] return self.main(*args, **kwargs) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.256460Z [info ] File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 1055, in main cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.256815Z [info ] rv = self.invoke(ctx) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.257037Z [info ] File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 1404, in invoke cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.257552Z [info ] return ctx.invoke(self.callback, **ctx.params) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.257721Z [info ] File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 760, in invoke cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.257945Z [info ] return __callback(*args, **kwargs) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.258128Z [info ] File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 546, in invoke cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.258422Z [info ] target.listen(file_input) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.258502Z [info ] File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/io_base.py", line 33, in listen cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.258562Z [info ] self._process_lines(file_input) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.258615Z [info ] File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 291, in _process_lines cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.258667Z [info ] counter = super()._process_lines(file_input) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.258717Z [info ] File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/io_base.py", line 79, in _process_lines cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.258855Z [info ] self._process_record_message(line_dict) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.259054Z [info ] File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 338, in _process_record_message cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.259223Z [info ] sink._validate_and_parse(transformed_record) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.259390Z [info ] File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/sinks/core.py", line 314, in _validate_and_parse cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.259445Z [info ] self._validator.validate(record) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.259495Z [info ] File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/jsonschema/validators.py", line 314, in validate cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.259973Z [info ] raise error cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.260326Z [info ] jsonschema.exceptions.ValidationError: -4.8 is not a multiple of 0.0001 cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.260406Z [info ] cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.260493Z [info ] Failed validating 'multipleOf' in schema['properties']['AMT_EXCL_TAX']: cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.260721Z [info ] {'multipleOf': 0.0001, 'type': ['null', 'number']} cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.260791Z [info ] cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.260846Z [info ] On instance['AMT_EXCL_TAX']: cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.260946Z [info ] -4.8 cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.298085Z [info ] 2023-07-03 21:59:07,297 | INFO | snowflake.connector.connection | closed cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.298331Z [info ] 2023-07-03 21:59:07,298 | INFO | snowflake.connector.connection | No async queries seem to be running, deleting session cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-03T19:59:07.368337Z [error ] [Errno 32] Broken pipe
...
Note: No urgency here. Just logging these so we can come back to them next week, and keep iterating on the process.
I created draft release notes for v0.1.0, which resulted in some changes noted by the GitHub default release drafter. So I then attempted to publish as a release.
A few findings:
- When publishing v0.1.0, the CI action tried to publish as v0.0.2, which was already published and is already taken.
- v0.0.2 already included the changes which had been detected as v0.1.0 changes. It looks like there actually are no pending changes.
Links:
I have a table from tap-meltanohub that includes columns named after reserved words like select and update. On initial creation the merge statement fails with a compilation error because select/update are not quoted:
sqlalchemy.exc.ProgrammingError: (snowflake.connector.errors.ProgrammingError) 001003 (42000): 01acd554-0603-1f7b-0051-5c83035999a2: SQL compilation error:
syntax error line 1 at position 611 unexpected 'SELECT'.
syntax error line 1 at position 705 unexpected 'UPDATE'.
syntax error line 1 at position 2,589 unexpected 'SELECT'.
syntax error line 1 at position 2,615 unexpected 'UPDATE'.
[SQL: merge into USERDEV_RAW.PNADOLNY_MELTANOHUB_NEW.PLUGINS d using (select $1:description::VARCHAR as DESCRIPTION, $1:label::VARCHAR as LABEL, $1:name::VARCHAR as NAME, $1:logo_url::VARCHAR as LOGO_URL, $1:namespace::VARCHAR as NAMESPACE, $1:variant::VARCHAR as VARIANT, $1:pip_url::VARCHAR as PIP_URL, $1:repo::VARCHAR as REPO, $1:settings::VARIANT as SETTINGS, $1:capabilities::VARIANT as CAPABILITIES, $1:docs::VARCHAR as DOCS, $1:default::BOOLEAN as DEFAULT, $1:id::VARCHAR as ID, $1:plugin_type::VARCHAR as PLUGIN_TYPE, $1:settings_group_validation::VARIANT as SETTINGS_GROUP_VALIDATION, $1:select::VARIANT as SELECT, $1:hidden::BOOLEAN as HIDDEN, $1:metadata::VARIANT as METADATA, $1:update::BOOLEAN as UPDATE, $1:dialect::VARCHAR as DIALECT, $1:target_schema::VARCHAR as TARGET_SCHEMA, $1:executable::VARCHAR as EXECUTABLE, $1:commands::VARIANT as COMMANDS, $1:requires::VARIANT as REQUIRES, $1:_sdc_received_at::TIMESTAMP_NTZ as _SDC_RECEIVED_AT, $1:_sdc_deleted_at::TIMESTAMP_NTZ as _SDC_DELETED_AT, $1:_sdc_extracted_at::TIMESTAMP_NTZ as _SDC_EXTRACTED_AT, $1:_sdc_batched_at::TIMESTAMP_NTZ as _SDC_BATCHED_AT, $1:_sdc_table_version::DECIMAL as _SDC_TABLE_VERSION, $1:_sdc_sequence::DECIMAL as _SDC_SEQUENCE from '@~/target-snowflake/plugins-2825d13f-b9ea-430a-92ae-3dcc0ede7e4f'(file_format => USERDEV_RAW.PNADOLNY_MELTANOHUB_NEW."plugins-2825d13f-b9ea-430a-92ae-3dcc0ede7e4f")) s on d."ID" = s."ID" when matched then update set d."DESCRIPTION" = s."DESCRIPTION", d."LABEL" = s."LABEL", d."NAME" = s."NAME", d."LOGO_URL" = s."LOGO_URL", d."NAMESPACE" = s."NAMESPACE", d."VARIANT" = s."VARIANT", d."PIP_URL" = s."PIP_URL", d."REPO" = s."REPO", d."SETTINGS" = s."SETTINGS", d."CAPABILITIES" = s."CAPABILITIES", d."DOCS" = s."DOCS", d."DEFAULT" = s."DEFAULT", d."ID" = s."ID", d."PLUGIN_TYPE" = s."PLUGIN_TYPE", d."SETTINGS_GROUP_VALIDATION" = s."SETTINGS_GROUP_VALIDATION", d."SELECT" = s."SELECT", d."HIDDEN" = s."HIDDEN", d."METADATA" = s."METADATA", d."UPDATE" = s."UPDATE", 
d."DIALECT" = s."DIALECT", d."TARGET_SCHEMA" = s."TARGET_SCHEMA", d."EXECUTABLE" = s."EXECUTABLE", d."COMMANDS" = s."COMMANDS", d."REQUIRES" = s."REQUIRES", d."_SDC_RECEIVED_AT" = s."_SDC_RECEIVED_AT", d."_SDC_DELETED_AT" = s."_SDC_DELETED_AT", d."_SDC_EXTRACTED_AT" = s."_SDC_EXTRACTED_AT", d."_SDC_BATCHED_AT" = s."_SDC_BATCHED_AT", d."_SDC_TABLE_VERSION" = s."_SDC_TABLE_VERSION", d."_SDC_SEQUENCE" = s."_SDC_SEQUENCE" when not matched then insert (DESCRIPTION, LABEL, NAME, LOGO_URL, NAMESPACE, VARIANT, PIP_URL, REPO, SETTINGS, CAPABILITIES, DOCS, DEFAULT, ID, PLUGIN_TYPE, SETTINGS_GROUP_VALIDATION, SELECT, HIDDEN, METADATA, UPDATE, DIALECT, TARGET_SCHEMA, EXECUTABLE, COMMANDS, REQUIRES, _SDC_RECEIVED_AT, _SDC_DELETED_AT, _SDC_EXTRACTED_AT, _SDC_BATCHED_AT, _SDC_TABLE_VERSION, _SDC_SEQUENCE) values (s."DESCRIPTION", s."LABEL", s."NAME", s."LOGO_URL", s."NAMESPACE", s."VARIANT", s."PIP_URL", s."REPO", s."SETTINGS", s."CAPABILITIES", s."DOCS", s."DEFAULT", s."ID", s."PLUGIN_TYPE", s."SETTINGS_GROUP_VALIDATION", s."SELECT", s."HIDDEN", s."METADATA", s."UPDATE", s."DIALECT", s."TARGET_SCHEMA", s."EXECUTABLE", s."COMMANDS", s."REQUIRES", s."_SDC_RECEIVED_AT", s."_SDC_DELETED_AT", s."_SDC_EXTRACTED_AT", s."_SDC_BATCHED_AT", s."_SDC_TABLE_VERSION", s."_SDC_SEQUENCE")]
Additionally, on the second sync, where the table already exists with reserved-word columns, the target does a diff on the column names and incorrectly decides to create that column as if it's missing. An error is thrown because it already exists. It looks like sqlalchemy usually returns all-lowercase column names, but in the reserved-word case they come back uppercase. In the DDL of the Snowflake table the name is wrapped in double quotes, so I think sqlalchemy is taking it literally while lowercasing the rest of the columns, but I'm not sure.
snowflake.connector.errors.ProgrammingError: 001430 (42601): 01acd544-0603-1f7b-0051-5c8303595c36: SQL compilation error:
column 'SELECT' already exists
Fixed in MeltanoLabs/target-postgres#35
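The quoting side of the fix can be sketched as a small helper that wraps reserved words in double quotes before they are interpolated into the MERGE statement. This is illustrative only (the reserved-word set below is a tiny subset, and the actual fix in target-postgres#35 may differ):

```python
# Hypothetical sketch: quote column identifiers that collide with SQL
# reserved words before building a MERGE/INSERT column list.
# RESERVED is an illustrative subset, not Snowflake's full list.
RESERVED = {"SELECT", "UPDATE", "INSERT", "TABLE", "ORDER"}

def quote_identifier(name: str) -> str:
    """Wrap reserved words (and only those) in double quotes."""
    upper = name.upper()
    return f'"{upper}"' if upper in RESERVED else upper

columns = ["name", "select", "update", "hidden"]
insert_list = ", ".join(quote_identifier(c) for c in columns)
print(insert_list)  # NAME, "SELECT", "UPDATE", HIDDEN
```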
When working with Meltano in a Windows environment, target-snowflake leaves behind the json.gz files it generates to load into Snowflake.
This does not occur when working with Meltano in a Linux environment or a Linux container. Likely dealing with path construction again, similar to #87.
To reproduce: use the target-snowflake loader and any tap of your choosing, run meltano run (tap) target-snowflake, and observe that the json.gz files generated are not cleaned up.
Slack Thread: https://meltano.slack.com/archives/C013EKWA2Q1/p1689800363928689
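One way the cleanup step could be made platform-neutral is to build and delete the batch-file paths with pathlib, which handles separators correctly on both Windows and Linux. A minimal sketch (the function name and glob pattern are assumptions, not the target's actual code):

```python
import tempfile
from pathlib import Path

# Hypothetical sketch of a cross-platform cleanup step: build batch-file
# paths with pathlib and remove them once the load finishes.
def clean_up_batch_files(batch_dir: Path) -> int:
    removed = 0
    for f in batch_dir.glob("*.json.gz"):
        f.unlink()
        removed += 1
    return removed

# Usage: create a fake batch file in a temp dir and clean it up.
tmp = Path(tempfile.mkdtemp())
(tmp / "animals-1.json.gz").write_bytes(b"")
count = clean_up_batch_files(tmp)
print(count)  # 1
```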
There's an initial draft of this in #15.
I was trying to show that Windows jobs failed via GitHub Actions while solving #87, but the job succeeded when running on Windows. I know for a fact that running on Windows and pushing data to Snowflake doesn't work e2e right now without #89 merged.
Maybe we could do a test like https://github.com/MeltanoLabs/target-postgres/blob/89b2b957d136492a2b71a176fef74e610f60b934/.github/workflows/ci_workflow.yml#L99C1-L100C54
I needed to modify meltano.yml to this to get a test going locally:
version: 1
send_anonymous_usage_stats: true
project_id: target-snowflake
plugins:
extractors:
- name: tap-smoke-test
variant: meltano
pip_url: git+https://github.com/meltano/tap-smoke-test.git
config:
streams:
- stream_name: animals
input_filename: https://gitlab.com/meltano/tap-smoke-test/-/raw/main/demo-data/animals-data.jsonl
- stream_name: pageviews
input_filename: https://raw.githubusercontent.com/meltano/tap-smoke-test/main/demo-data/pageviews-data.jsonl
loaders:
- name: target-snowflake
namespace: target_snowflake
pip_url: -e .
capabilities:
- about
- schema-flattening
- stream-maps
settings_group_validation:
- - account
- database
- password
- user
settings:
- name: account
kind: string
label: Account
description: Your account identifier. See [Account Identifiers](https://docs.snowflake.com/en/user-guide/admin-account-identifier.html).
- name: add_record_metadata
kind: boolean
value: true
label: Add Record Metadata
description: Whether to add metadata columns.
- name: clean_up_batch_files
kind: boolean
value: true
label: Clean Up Batch Files
description: Whether to remove batch files after processing.
- name: database
kind: string
label: Database
description: The initial database for the Snowflake session.
- name: default_target_schema
kind: string
label: Default Target Schema
description: The default target database schema name to use for all streams.
- name: flattening_enabled
kind: boolean
label: Flattening Enabled
description: "'True' to enable schema flattening and automatically expand nested\
\ properties."
- name: flattening_max_depth
kind: integer
label: Flattening Max Depth
description: The max depth to flatten schemas.
- name: password
kind: password
label: Password
description: The password for your Snowflake user.
- name: role
kind: string
label: Role
description: The initial role for the session.
- name: schema
kind: string
label: Schema
description: The initial schema for the Snowflake session.
- name: stream_map_config
kind: object
label: Stream Map Config
description: User-defined config values to be used within map expressions.
- name: stream_maps
kind: object
label: Stream Maps
description: Config object for stream maps capability. For more information
check out [Stream Maps](https://sdk.meltano.com/en/latest/stream_maps.html).
- name: user
kind: string
label: User
description: The login name for your Snowflake user.
- name: warehouse
kind: string
label: Warehouse
description: The initial warehouse for the session.
config:
start_date: '2010-01-01T00:00:00Z'
default_environment: dev
environments:
- name: dev
Related to #60
In that linked PR there was a bug where the comparison of sqlalchemy types fails because the custom Snowflake types don't have the python_type method. In that PR I fixed the way we assign types in the target, but it turns out that when we ask sqlalchemy for the columns of an existing table it still returns the original snowflake-sqlalchemy types, so later, when we do the same comparison using python_type, the error pops up again.
We need to ask sqlalchemy for types and then convert them to our custom types for now.
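The conversion step could look something like the sketch below: a registry keyed on the reflected type's class name that swaps in our own classes which do implement python_type. All names here (OurTimestampNTZ, TYPE_OVERRIDES, to_comparable_type) are hypothetical, purely to illustrate the shape of the workaround:

```python
import datetime

# Illustrative only: map type *names* reflected from the database to our
# own classes that implement python_type, so later comparisons via
# python_type don't blow up on snowflake-sqlalchemy's custom types.
class OurTimestampNTZ:
    @property
    def python_type(self):
        return datetime.datetime

class OurVarchar:
    @property
    def python_type(self):
        return str

# Hypothetical registry keyed by the reflected type's class name.
TYPE_OVERRIDES = {"TIMESTAMP_NTZ": OurTimestampNTZ, "VARCHAR": OurVarchar}

def to_comparable_type(reflected_type):
    override = TYPE_OVERRIDES.get(type(reflected_type).__name__)
    return override() if override else reflected_type

class TIMESTAMP_NTZ:  # stand-in for snowflake-sqlalchemy's type
    pass

converted = to_comparable_type(TIMESTAMP_NTZ())
print(converted.python_type)
```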
Because the COPY INTO statement doesn't specify the column order, you can hit issues if the column list in the schema doesn't exactly match the column order of the table in Snowflake. I hit this as we were migrating an existing table to use target-snowflake. Related code here: https://github.com/MeltanoLabs/target-snowflake/blob/main/target_snowflake/connector.py#L355-L358. I recommend we swap to explicitly defining the column names in the COPY INTO statement (https://docs.snowflake.com/en/sql-reference/sql/copy-into-table), but maybe there's a better way?
COPY INTO SQL example that gets generated by the target:
[SQL: copy into RMS_DB.RAW.EXAMPLE_BROKEN from (select $1:"ACCOUNTNAME"::VARCHAR(399) as "ACCOUNTNAME", $1:"ACCOUNTID"::DECIMAL as "ACCOUNTID", $1:"VOID"::BOOLEAN as "VOID", $1:"ETLLASTUPDATEDON"::TIMESTAMP_NTZ as "ETLLASTUPDATEDON", $1:_sdc_extracted_at::TIMESTAMP_NTZ as _sdc_extracted_at, $1:_sdc_received_at::TIMESTAMP_NTZ as _sdc_received_at, $1:_sdc_batched_at::TIMESTAMP_NTZ as _sdc_batched_at, $1:_sdc_deleted_at::TIMESTAMP_NTZ as _sdc_deleted_at, $1:_sdc_sequence::DECIMAL as _sdc_sequence, $1:_sdc_table_version::DECIMAL as _sdc_table_version from '@~/target-snowflake/EXAMPLE_BROKEN-83eb0de6-3121-4230-bcbf-79846ee67ef7')file_format = (format_name='RMS_DB.RAW."EXAMPLE_BROKEN-83eb0de6-3121-4230-bcbf-79846ee67ef7"')]
To reproduce
example_1
{"type": "SCHEMA", "stream": "EXAMPLE_BROKEN", "schema": {"properties": {"ACCOUNTID": {"type": ["number", "null"]}, "ACCOUNTNAME": {"maxLength": 400, "type": ["string", "null"]}, "VOID": {"type": ["boolean", "null"]}, "ETLLASTUPDATEDON": {"format": "date-time", "type": ["string", "null"]}}}, "key_properties": [], "bookmark_properties": []}
{"type": "RECORD", "stream": "EXAMPLE_BROKEN", "record": {"ACCOUNTID": 123456.1, "ACCOUNTNAME": "Name1", "VOID": true, "ETLLASTUPDATEDON": "2023-04-06 07:13:04.380000"}, "time_extracted": "2023-09-11T20:41:59.240392+00:00"}
Run cat example_1 | meltano invoke target-snowflake
example_2
{"type": "SCHEMA", "stream": "EXAMPLE_BROKEN", "schema": {"properties": {"ACCOUNTNAME": {"maxLength": 399, "type": ["string", "null"]}, "ACCOUNTID": {"type": ["number", "null"]}, "VOID": {"type": ["boolean", "null"]}, "ETLLASTUPDATEDON": {"format": "date-time", "type": ["string", "null"]}}}, "key_properties": [], "bookmark_properties": []}
{"type": "RECORD", "stream": "EXAMPLE_BROKEN", "record": {"ACCOUNTNAME": "Name1", "ACCOUNTID": 123456.1, "VOID": true, "ETLLASTUPDATEDON": "2023-04-06 07:13:04.380000"}, "time_extracted": "2023-09-11T20:41:59.240392+00:00"}
Run cat example_2 | meltano invoke target-snowflake
You'll get
sqlalchemy.exc.ProgrammingError: (snowflake.connector.errors.ProgrammingError) 100038 (22018): 01aeee51-0001-56f3-0004-428e00039da6: Numeric value 'Name1' is not recognized
[SQL: copy into RMS_DB.RAW.EXAMPLE_BROKEN from (select $1:"ACCOUNTNAME"::VARCHAR(399) as "ACCOUNTNAME", $1:"ACCOUNTID"::DECIMAL as "ACCOUNTID", $1:"VOID"::BOOLEAN as "VOID", $1:"ETLLASTUPDATEDON"::TIMESTAMP_NTZ as "ETLLASTUPDATEDON", $1:_sdc_extracted_at::TIMESTAMP_NTZ as _sdc_extracted_at, $1:_sdc_received_at::TIMESTAMP_NTZ as _sdc_received_at, $1:_sdc_batched_at::TIMESTAMP_NTZ as _sdc_batched_at, $1:_sdc_deleted_at::TIMESTAMP_NTZ as _sdc_deleted_at, $1:_sdc_sequence::DECIMAL as _sdc_sequence, $1:_sdc_table_version::DECIMAL as _sdc_table_version from '@~/target-snowflake/EXAMPLE_BROKEN-83eb0de6-3121-4230-bcbf-79846ee67ef7')file_format = (format_name='RMS_DB.RAW."EXAMPLE_BROKEN-83eb0de6-3121-4230-bcbf-79846ee67ef7"')]
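The proposed fix, naming the target columns explicitly, can be sketched as a small SQL builder. This is only an illustration of the order-independent form (table, stage, and helper names are placeholders, not the target's actual code):

```python
# Hypothetical sketch: list the target columns explicitly in COPY INTO so
# the statement no longer depends on the table's physical column order.
def build_copy_into(table: str, columns: dict[str, str], stage: str) -> str:
    col_list = ", ".join(columns)
    select_list = ", ".join(
        f'$1:"{name}"::{sql_type} as "{name}"'
        for name, sql_type in columns.items()
    )
    return (
        f"copy into {table} ({col_list}) "
        f"from (select {select_list} from '@~/{stage}')"
    )

sql = build_copy_into(
    "RMS_DB.RAW.EXAMPLE_BROKEN",
    {"ACCOUNTNAME": "VARCHAR(399)", "ACCOUNTID": "DECIMAL"},
    "target-snowflake/EXAMPLE_BROKEN",
)
print(sql)
```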
Added by this commit: meltano/sdk#1157
I have a table with a column _SDC_DELETED_AT that is of type VARCHAR(16777216). When I switch to this target it wants to change that to TIMESTAMP_NTZ() (rightfully so), but it fails at this line https://github.com/meltano/sdk/blob/3ceb28a97a77a5e22163fc887fe16ff129c2dc28/singer_sdk/connectors/sql.py#L912 when trying to adapt the column type, because the snowflake-sqlalchemy type TIMESTAMP_NTZ() doesn't implement python_type: https://github.com/snowflakedb/snowflake-sqlalchemy/blob/c200f971a745eb404ef68b9f8a248c7255056dc9/src/snowflake/sqlalchemy/custom_types.py#L56.
From slack thread https://meltano.slack.com/archives/C01TCRBBJD7/p1689179609824739
The target uses the underlying SDK batch mechanisms to load data to Snowflake via internal stages. It looks like somewhere along the way the path is getting mangled: slashes are being removed, which causes a File doesn't exist error to be raised. An example is C:GitHubETLRedesignmeltano-projectMeltanoV2\target-snowflake--Content-ResourceLocationUpdate-c1b6705a-fbbc-4965-ae9c-3cf50ee17296-1.json.gz. It's very likely that this issue belongs in the SDK, but this is where it was identified first.
I tried to track down what's happening, but it goes into the fs library and it's very difficult to debug without a Windows machine to test on.
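The mangled example path looks exactly like what naive string concatenation produces when separators are dropped. The real bug is somewhere in the fs library's path handling, so this is only an illustration of the symptom versus the pathlib-built equivalent:

```python
from pathlib import PureWindowsPath

# Joining path parts by plain concatenation loses the separators, which
# matches the shape of the reported "C:GitHubETL..." path. pathlib's
# PureWindowsPath inserts backslashes correctly regardless of host OS.
parts = ["GitHub", "ETL", "meltano-project"]

naive = "C:" + "".join(parts)                 # separators lost
portable = PureWindowsPath("C:/", *parts)     # backslashes inserted

print(naive)
print(portable)
```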
I'm having a separate issue, but I'm having a tough time debugging, as my debug statements (adding breakpoint()) don't work, seemingly because of the use of joblib.parallel. Maybe we just need a writeup in the docs on how to debug this target / add debug statements?
Maybe I'm just a noob and I should use a different debugger
Currently there's a lot of wasted time during startup where the target runs lots of queries to check the existing schema against the schema message it received, in case it needs to make alterations. One reason this is slow is that it iterates over every column and either:
In doing this it's requesting DDL to be generated and then executing it in a serial loop.
An optimization to speed this up would be to move the execution up to the prepare_table method, after the for loop. The prepare_column method would instead return the DDL rather than executing it, so that at the prepare_table level it can send one large script.
@edgarrmondragon what do you think about this?
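The proposed batching can be sketched at pseudocode level: prepare_column returns a DDL string, and prepare_table concatenates and sends everything in one round trip. Function names mirror the SDK's methods only loosely; this is a shape, not the actual implementation:

```python
# Sketch of the proposed refactor: collect per-column DDL strings and
# execute them in a single script instead of a serial per-column loop.
def prepare_column_ddl(table: str, name: str, sql_type: str) -> str:
    return f"ALTER TABLE {table} ADD COLUMN {name} {sql_type}"

def prepare_table(table: str, missing: dict[str, str]) -> str:
    statements = [
        prepare_column_ddl(table, name, t) for name, t in missing.items()
    ]
    # One round trip instead of one execute() call per column.
    return ";\n".join(statements)

script = prepare_table("RAW.ANIMALS", {"LEGS": "INTEGER", "NAME": "VARCHAR"})
print(script)
```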
The test throws an IntegrityError. I suspect that the camel casing is breaking this merge SQL, in that it's looking for a lowercase id while the RECORD sends Id using camel case.
sqlalchemy.exc.IntegrityError: (snowflake.connector.errors.IntegrityError) 100072 (22000): None: NULL result in a non-nullable column
E [SQL: merge into MELTANOLABS_RAW.TARGET_SNOWFLAKE_1CE306.TESTCAMELCASE d using (select $1:id::VARCHAR as ID, $1:clientname::VARCHAR as CLIENTNAME, $1:_sdc_batched_at::TIMESTAMP_NTZ as _SDC_BATCHED_AT, $1:_sdc_extracted_at::TIMESTAMP_NTZ as _SDC_EXTRACTED_AT, $1:_sdc_received_at::TIMESTAMP_NTZ as _SDC_RECEIVED_AT, $1:_sdc_deleted_at::TIMESTAMP_NTZ as _SDC_DELETED_AT, $1:_sdc_table_version::DECIMAL as _SDC_TABLE_VERSION, $1:_sdc_sequence::DECIMAL as _SDC_SEQUENCE from '@~/target-snowflake/TestCamelcase-f581638e-d5bc-42a5-a8ae-cec1d109b3ea'(file_format => MELTANOLABS_RAW.TARGET_SNOWFLAKE_1CE306."TestCamelcase-f581638e-d5bc-42a5-a8ae-cec1d109b3ea")) s on d."ID" = s."ID" when matched then update set d."ID" = s."ID", d."CLIENTNAME" = s."CLIENTNAME", d."_SDC_BATCHED_AT" = s."_SDC_BATCHED_AT", d."_SDC_EXTRACTED_AT" = s."_SDC_EXTRACTED_AT", d."_SDC_RECEIVED_AT" = s."_SDC_RECEIVED_AT", d."_SDC_DELETED_AT" = s."_SDC_DELETED_AT", d."_SDC_TABLE_VERSION" = s."_SDC_TABLE_VERSION", d."_SDC_SEQUENCE" = s."_SDC_SEQUENCE" when not matched then insert (ID, CLIENTNAME, _SDC_BATCHED_AT, _SDC_EXTRACTED_AT, _SDC_RECEIVED_AT, _SDC_DELETED_AT, _SDC_TABLE_VERSION, _SDC_SEQUENCE) values (s."ID", s."CLIENTNAME", s."_SDC_BATCHED_AT", s."_SDC_EXTRACTED_AT", s."_SDC_RECEIVED_AT", s."_SDC_DELETED_AT", s."_SDC_TABLE_VERSION", s."_SDC_SEQUENCE")]
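Since Snowflake folds unquoted identifiers to upper case, one way to avoid the NULL key would be to match record keys against column names case-insensitively before staging. A hedged sketch (the helper name is hypothetical, not the target's code):

```python
# Illustrative sketch: map record keys like "Id" onto the upper-cased
# column "ID" case-insensitively, so the merge key is never left NULL.
def conform_record(record: dict, columns: list[str]) -> dict:
    by_upper = {c.upper(): c for c in columns}
    return {
        by_upper[k.upper()]: v
        for k, v in record.items()
        if k.upper() in by_upper
    }

row = conform_record({"Id": 1, "ClientName": "acme"}, ["ID", "CLIENTNAME"])
print(row)  # {'ID': 1, 'CLIENTNAME': 'acme'}
```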
I forked the transferwise version of target-snowflake because I needed a way to add the schema name of my MySQL source (tap) to an additional column/property in Snowflake. It would be great to be able to migrate to a new version of this target and still retain this functionality.
More specifically, if I wanted to replicate a source MySQL table named schema1.table1 with only one column, col1, then the destination table might look like destination_database.destination_schema.table1 with the columns col1 and source_schema_name. All the values in source_schema_name would be set to schema1.
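If I understand the SDK's stream-maps capability correctly, this might already be achievable without a fork by adding a constant-valued property in config. A hedged sketch, assuming the SDK's stream-map expression syntax (stream and value names are examples only):

```yaml
# Hypothetical meltano.yml fragment: stream maps add a constant column.
loaders:
  - name: target-snowflake
    config:
      stream_maps:
        table1:
          source_schema_name: "'schema1'"   # string-literal expression
```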
target-postgres might be a helpful reference https://github.com/MeltanoLabs/target-postgres/blob/main/target_postgres/tests/test_standard_target.py
There was a discussion in #47 around the best way to deduplicate within a stream when a key property is set. The PPW variant uses the last arriving record for a PK as the one to use for merging; using that behavior as the base case for this target, we implemented the same behavior using sorting in the merge query itself.
Like discussed in that PR thread we should explore better options for deduplicating and fully think through if there are edge cases that we might be missing with the current implementation.
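The base-case behavior described above, last arriving record wins per primary key, can be sketched in a few lines, which also makes the edge cases easier to reason about (e.g. what "last" means without a reliable ordering column):

```python
# Minimal sketch of "last arriving record wins" dedup for one batch,
# keyed on the stream's key properties.
def dedupe_last_wins(records: list[dict], key_properties: list[str]) -> list[dict]:
    latest: dict[tuple, dict] = {}
    for record in records:  # later records overwrite earlier ones per key
        key = tuple(record[k] for k in key_properties)
        latest[key] = record
    return list(latest.values())

batch = [
    {"id": 1, "name": "old"},
    {"id": 2, "name": "b"},
    {"id": 1, "name": "new"},
]
deduped = dedupe_last_wins(batch, ["id"])
print(deduped)
```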
This PR meltano/sdk#1394 makes some small breaking changes to SQLConnector.
Recommendation: Pin the SDK to <=0.19.0 in your pyproject.toml. It looks like you're currently using poetry's caret specifier in a way that will allow updates to happen automatically past 0.19.0; I'd recommend changing that to <=0.19.0.
Please take a look at the PR description for a deeper discussion. Here is a summary of what's changed.
Overview:
- Use the _connect() method as a context manager.
Specific code changes:
- _connection is gone altogether. You cannot pass it to SQLConnector and you cannot access it off of the connector. We are not caching or passing around an open DB connection anywhere anymore.
- create_sqlalchemy_engine and create_sqlalchemy_connection have been altered to use the single cached engine, and are also deprecated. Ideally, subclasses and other objects would stop using those methods.
Suggestions:
- Use the _connect() method as a context manager. It's directly equivalent to SQLAlchemy's Engine.connect(). Here's an example of its use.

This is logged as something to consider fixing at the SDK level (meltano/sdk#1812), but we should fix it here also.
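The context-manager pattern recommended above can be illustrated without a database. The classes below are stand-ins (FakeConnection and Connector are not SDK classes); in the SDK, _connect() wraps SQLAlchemy's Engine.connect() in essentially this shape:

```python
from contextlib import contextmanager

# Stand-in sketch: open a connection per unit of work via a context
# manager instead of caching one open connection on the connector.
class FakeConnection:
    def __init__(self):
        self.closed = False
    def execute(self, sql):
        return f"ran: {sql}"
    def close(self):
        self.closed = True

class Connector:
    @contextmanager
    def _connect(self):
        conn = FakeConnection()
        try:
            yield conn
        finally:
            conn.close()  # always released, even if the body raises

connector = Connector()
with connector._connect() as conn:
    result = conn.execute("select 1")
print(result, conn.closed)  # connection is closed once the block exits
```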
Invalid character length: 4,294,967,295. Must be between 1 and 16,777,216
Cap the max length parameter for VARCHARs at 16,777,216.
Originally posted by @aaronsteers in #2 (comment)
Note to our future selves to see what can be improved upstream in the SDK to simplify and remove the need to call the private _json_schema_type_check().
Probably SDK-based changes, but we can discuss further here without blocking #2.
target-snowflake implementation of to_sql_type():
target-snowflake/target_snowflake/sinks.py
Lines 151 to 187 in a3caa38
_adapt_column_type(): custom implementation will be removed after this merges.
TypeMap logic:
target-snowflake/target_snowflake/sinks.py
Lines 23 to 42 in a3caa38
This was mentioned in meltano/meltano#2814 (comment) and in Office Hours.
The premise is that taps/targets are all CLIs, and they already have all the credentials and logic needed to connect to the source or destination, so we could extend them to do other operations.
In this case it would be nice to help users initialize the needed Snowflake objects so onboarding is seamless. I'm not 100% sure this is relevant for this target, but in pipelinewise's the user needs to create a file format + role + optional stage + apply permissions appropriately. I see in
target-snowflake/target_snowflake/sinks.py
Line 284 in f791770
It would be cool to allow 2 options:
External ref: Slack thread on Meltano Slack community
meltano.yaml
# ...
extractors:
- name: tap-oracle
pip_url: git+https://github.com/christianelliott/pipelinewise-tap-oracle.git
# ...
loaders:
- name: target-snowflake
variant: meltanolabs
pip_url: meltanolabs-target-snowflake
# ...
The error:
❯ meltano --environment=dev run tap-oracle target-snowflake
...
2023-07-09T15:36:51.661162Z [info ] 2023-07-09 17:36:51,661 | INFO | target-snowflake | Target sink for 'BUSINESSDATA_PROD-FINANCIAL_TRANSACTION' is full. Draining... cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.491003Z [info ] Traceback (most recent call last): cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.491140Z [info ] File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/bin/target-snowflake", line 8, in <module> cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.491266Z [info ] sys.exit(TargetSnowflake.cli()) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.491388Z [info ] File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 1157, in __call__ cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.492633Z [info ] return self.main(*args, **kwargs) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.492732Z [info ] File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 1078, in main cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.492888Z [info ] rv = self.invoke(ctx) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.492978Z [info ] File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 1434, in invoke cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.493094Z [info ] return ctx.invoke(self.callback, **ctx.params) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.493165Z [info ] File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 783, in invoke cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.493245Z [info ] return __callback(*args, **kwargs) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.493310Z [info ] File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 549, in invoke cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.493539Z [info ] target.listen(file_input) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.493606Z [info ] File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/io_base.py", line 34, in listen cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.494018Z [info ] self._process_lines(file_input) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.494089Z [info ] File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 291, in _process_lines cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.494297Z [info ] counter = super()._process_lines(file_input) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.494490Z [info ] File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/io_base.py", line 93, in _process_lines cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.494551Z [info ] self._process_record_message(line_dict) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.494619Z [info ] File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 353, in _process_record_message cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.494712Z [info ] self.drain_one(sink) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.494974Z [info ] File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 494, in drain_one cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.495049Z [info ] sink.process_batch(draining_status) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.495120Z [info ] File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/sinks/sql.py", line 267, in process_batch cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.495330Z [info ] self.bulk_insert_records( cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.495389Z [info ] File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/target_snowflake/sinks.py", line 141, in bulk_insert_records cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.495443Z [info ] for files in batches: cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.495495Z [info ] File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/batch.py", line 106, in get_batches cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.495545Z [info ] gz.writelines( cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.495592Z [info ] File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/batch.py", line 107, in <genexpr> cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.495641Z [info ] (json.dumps(record) + "\n").encode() for record in chunk cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.495825Z [info ] File "/Users/janispuris/.pyenv/versions/3.10.11/lib/python3.10/json/__init__.py", line 231, in dumps cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.495973Z [info ] return _default_encoder.encode(obj) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.496033Z [info ] File "/Users/janispuris/.pyenv/versions/3.10.11/lib/python3.10/json/encoder.py", line 199, in encode cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.496118Z [info ] chunks = self.iterencode(o, _one_shot=True) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.496221Z [info ] File "/Users/janispuris/.pyenv/versions/3.10.11/lib/python3.10/json/encoder.py", line 257, in iterencode cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.496284Z [info ] return _iterencode(o, 0) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.496358Z [info ] File "/Users/janispuris/.pyenv/versions/3.10.11/lib/python3.10/json/encoder.py", line 179, in default cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.496420Z [info ] raise TypeError(f'Object of type {o.__class__.__name__} ' cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.496562Z [info ] TypeError: Object of type Decimal is not JSON serializable cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.532959Z [info ] 2023-07-09 17:36:52,532 | INFO | snowflake.connector.connection | closed cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.533122Z [info ] 2023-07-09 17:36:52,532 | INFO | snowflake.connector.connection | No async queries seem to be running, deleting session cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.606484Z [error ] [Errno 32] Broken pipe
...
Might be related to #66
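One possible workaround for the TypeError above (a sketch, not the SDK's actual fix) is to hand json.dumps an encoder that knows about Decimal:

```python
import json
from decimal import Decimal


class DecimalEncoder(json.JSONEncoder):
    """Sketch of a workaround: make Decimal values JSON-serializable."""

    def default(self, o):
        if isinstance(o, Decimal):
            # Cast to float; note this can lose precision for high-scale values.
            return float(o)
        return super().default(o)
```

Usage would be json.dumps(record, cls=DecimalEncoder); serializing to str instead of float would preserve precision at the cost of type fidelity.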
target-bigquery provides the option to either denormalize data into columns or leave it as a single JSON column. That could be a useful feature for this target as well.
target-bigquery docs:
Denormalized variants indicate data is unpacked during load with a resultant schema in BigQuery based on the tap schema. Non-denormalized means we have a fixed schema which loads all data into an unstructured JSON column. They are both useful patterns. The latter allowing BigQuery to work with schemaless or rapidly changing sources such as MongoDB instantly, while the former is more performant and convenient to start modeling quickly.
The test data https://github.com/meltano/sdk/blob/main/singer_sdk/testing/target_test_streams/schema_updates.singer tries to alter the schema of the table multiple times, each time adding a few records. The test is failing because it tries to add a column that already exists. I suspect it's an upper-/lower-case issue related to comparing what already exists.
column 'A3' already exists", 'query': 'ALTER TABLE MELTANOLABS_RAW.TARGET_SNOWFLAKE_685220.TEST_SCHEMA_UPDATES ADD COLUMN a3 BOOLEAN', ...
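If that's the cause, the existence check would need to compare case-insensitively, since Snowflake folds unquoted identifiers to upper case. A minimal sketch (hypothetical helper, not the target's actual code):

```python
def column_exists(column_name, existing_columns):
    # Snowflake folds unquoted identifiers to upper case,
    # so "a3" and "A3" refer to the same column.
    return column_name.upper() in {c.upper() for c in existing_columns}
```

Under this comparison, the attempted ADD COLUMN a3 would be skipped because A3 is already present.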
In our DynamoDB stream in the Squared project there are some column names with colons, like tenant_resource_key::project_id,
that are causing the merge SQL logic to break because they're unquoted. The query ends up failing with Unsupported data type 'SCHEDULE_NAME'
because the double colon is also the cast syntax.
sqlalchemy.exc.ProgrammingError: (snowflake.connector.errors.ProgrammingError) 002040 (42601): 01ace745-0503-27e0-0051-5c830365b15a: SQL compilation error:
Unsupported data type 'SCHEDULE_NAME'.
[SQL: merge into USERDEV_RAW.PNADOLNY_DYNAMODB_NEW.PROJECT_SCHEDULES_TABLE d using (select $1:enabled::BOOLEAN as enabled, $1:deployment_name::schedule_name::VARCHAR as "deployment_name::schedule_name", $1:interval::VARCHAR as interval, $1:tenant_resource_key::project_id::VARCHAR as "tenant_resource_key::project_id", $1:eventbridge_name::VARCHAR as eventbridge_name, $1:_sdc_deleted_at::TIMESTAMP_NTZ as _sdc_deleted_at, $1:_sdc_received_at::TIMESTAMP_NTZ as _sdc_received_at, $1:_sdc_batched_at::TIMESTAMP_NTZ as _sdc_batched_at, $1:_sdc_extracted_at::TIMESTAMP_NTZ as _sdc_extracted_at, $1:_sdc_table_version::DECIMAL as _sdc_table_version, $1:_sdc_sequence::DECIMAL as _sdc_sequence from '@~/target-snowflake/project_schedules_table-52921d9b-6c76-4e0f-beb1-daa92cfa3b98'(file_format => USERDEV_RAW.PNADOLNY_DYNAMODB_NEW."project_schedules_table-52921d9b-6c76-4e0f-beb1-daa92cfa3b98")) s on d.tenant_resource_keyproject_id = s.tenant_resource_keyproject_id and d.deployment_nameschedule_name = s.deployment_nameschedule_name when matched then update set d.enabled = s.enabled, d."deployment_name::schedule_name" = s."deployment_name::schedule_name", d.interval = s.interval, d."tenant_resource_key::project_id" = s."tenant_resource_key::project_id", d.eventbridge_name = s.eventbridge_name, d._sdc_deleted_at = s._sdc_deleted_at, d._sdc_received_at = s._sdc_received_at, d._sdc_batched_at = s._sdc_batched_at, d._sdc_extracted_at = s._sdc_extracted_at, d._sdc_table_version = s._sdc_table_version, d._sdc_sequence = s._sdc_sequence when not matched then insert (enabled, "deployment_name::schedule_name", interval, "tenant_resource_key::project_id", eventbridge_name, _sdc_deleted_at, _sdc_received_at, _sdc_batched_at, _sdc_extracted_at, _sdc_table_version, _sdc_sequence) values (s.enabled, s."deployment_name::schedule_name", s.interval, s."tenant_resource_key::project_id", s.eventbridge_name, s._sdc_deleted_at, s._sdc_received_at, s._sdc_batched_at, s._sdc_extracted_at, 
s._sdc_table_version, s._sdc_sequence)]
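A sketch of the kind of quoting that would avoid this (hypothetical helper; double quotes stop Snowflake from parsing :: as a cast inside an identifier):

```python
def quote_identifier(name):
    # Wrap in double quotes and escape embedded quotes, so a name like
    # tenant_resource_key::project_id is treated as an identifier, not a cast.
    return '"' + name.replace('"', '""') + '"'
```

Applying this consistently in the ON clause would also fix the mangled d.tenant_resource_keyproject_id references seen in the failing statement.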
https://github.com/MeltanoLabs/target-snowflake/blob/main/target_snowflake/sinks.py#L248
Instead of table.name = othertable.name,
try table.name IS DISTINCT FROM othertable.name.
Three-valued logic is fun!
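The difference matters because = evaluates to NULL (not TRUE) when either side is NULL, so NULL keys never match. In Python terms, the NULL-safe comparison behaves roughly like this sketch:

```python
def is_distinct_from(a, b):
    """Python model of SQL's IS DISTINCT FROM, treating None as SQL NULL."""
    if a is None and b is None:
        return False  # two NULLs are "not distinct"
    if a is None or b is None:
        return True   # NULL vs. non-NULL is always distinct
    return a != b
```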
The target passes this test by throwing an IntegrityError
exception if the record doesn't contain the key property, which is good, but it's relying on Snowflake (or maybe the Snowflake SQLAlchemy client specifically) to assert integrity errors based on the PK that was defined in the table. Should we be catching this and handling it ourselves in the target, or in the SDK instead? It seems risky to rely on the client/destination to raise this, especially because primary keys are not necessarily enforced by systems like Snowflake.
sqlalchemy.exc.IntegrityError: (snowflake.connector.errors.IntegrityError) 100072 (22000): None: NULL result in a non-nullable column
E [SQL: merge into MELTANOLABS_RAW.TARGET_SNOWFLAKE_17B5D0.TEST_RECORD_MISSING_KEY_PROPERTY d using (select $1:id::DECIMAL as ID, $1:metric::DECIMAL as METRIC, $1:_sdc_extracted_at::TIMESTAMP_NTZ as _SDC_EXTRACTED_AT, $1:_sdc_received_at::TIMESTAMP_NTZ as _SDC_RECEIVED_AT, $1:_sdc_batched_at::TIMESTAMP_NTZ as _SDC_BATCHED_AT, $1:_sdc_deleted_at::TIMESTAMP_NTZ as _SDC_DELETED_AT, $1:_sdc_table_version::DECIMAL as _SDC_TABLE_VERSION, $1:_sdc_sequence::DECIMAL as _SDC_SEQUENCE from '@~/target-snowflake/test_record_missing_key_property-68fbb0ac-0330-49d1-a859-ed77c52dd6a7'(file_format => MELTANOLABS_RAW.TARGET_SNOWFLAKE_17B5D0."test_record_missing_key_property-68fbb0ac-0330-49d1-a859-ed77c52dd6a7")) s on d."ID" = s."ID" when matched then update set d."ID" = s."ID", d."METRIC" = s."METRIC", d."_SDC_EXTRACTED_AT" = s."_SDC_EXTRACTED_AT", d."_SDC_RECEIVED_AT" = s."_SDC_RECEIVED_AT", d."_SDC_BATCHED_AT" = s."_SDC_BATCHED_AT", d."_SDC_DELETED_AT" = s."_SDC_DELETED_AT", d."_SDC_TABLE_VERSION" = s."_SDC_TABLE_VERSION", d."_SDC_SEQUENCE" = s."_SDC_SEQUENCE" when not matched then insert (ID, METRIC, _SDC_EXTRACTED_AT, _SDC_RECEIVED_AT, _SDC_BATCHED_AT, _SDC_DELETED_AT, _SDC_TABLE_VERSION, _SDC_SEQUENCE) values (s."ID", s."METRIC", s."_SDC_EXTRACTED_AT", s."_SDC_RECEIVED_AT", s."_SDC_BATCHED_AT", s."_SDC_DELETED_AT", s."_SDC_TABLE_VERSION", s."_SDC_SEQUENCE")]
E (Background on this error at: https://sqlalche.me/e/14/gkpj)
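If we did validate this ourselves, the check could be as simple as this sketch (hypothetical helper, not existing SDK or target behavior):

```python
def check_key_properties(record, key_properties):
    # Fail fast in the target instead of relying on the warehouse
    # to raise an integrity error on NULL key columns.
    missing = [k for k in key_properties if record.get(k) is None]
    if missing:
        raise ValueError(f"Record is missing key properties: {missing}")
```

Running this per record before staging would surface the problem at the target layer, independent of whether the destination enforces primary keys.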
Implement a before-run SQL hook to execute SQL commands before sending the data. This would make DROP TABLE commands easier for projects that need them. Today this is not possible using Meltano.
See https://github.com/visch/target-snowflake/actions/runs/5545236322/jobs/10124133196, where I was trying some blue/green testing with GitHub Actions. I expected the Windows jobs to fail, and they are failing in pytest, but it's not being reported that way.
As described in #52 (comment), the COPY logic has a bug when schema updates are required. The sync fails with an error about a mismatch in column counts between the copy statement, the table definition, and the internal stage files.
A draft PR with a test case is up (#52), but it still needs logic changes to fix the bug.
External ref: Slack thread on Meltano Slack community
meltano.yml
# ...
extractors:
- name: tap-oracle
  pip_url: git+https://github.com/christianelliott/pipelinewise-tap-oracle.git
# ...
loaders:
- name: target-snowflake
  variant: meltanolabs
  pip_url: meltanolabs-target-snowflake
# ...
For example, in the --dump=catalog output, one of the fields in the stream is
"AMT_EXCL_TAX": {
"multipleOf": 0.0001,
"type": [
"null",
"number"
]
},
but in Snowflake the column is created as
AMT_EXCL_TAX NUMBER(38,0)
while the correct type would be
AMT_EXCL_TAX NUMBER(34, 4)
Please let me know if there is anything else I can gather to make this easier to troubleshoot.
I can also try to produce an MRE, if absolutely necessary.
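A sketch of how the schema's multipleOf could be mapped to a NUMBER scale (hypothetical helper; the target would still need to choose a precision):

```python
from decimal import Decimal


def scale_from_multiple_of(multiple_of):
    """Derive a decimal scale from a JSON-schema multipleOf, e.g. 0.0001 -> 4."""
    exponent = Decimal(str(multiple_of)).as_tuple().exponent
    # A negative exponent counts the decimal places; integers give scale 0.
    return max(0, -exponent)
```

For the AMT_EXCL_TAX schema above, multipleOf 0.0001 would yield scale 4 rather than the NUMBER(38,0) currently produced.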
The test data https://github.com/meltano/sdk/blob/main/singer_sdk/testing/target_test_streams/duplicate_records.singer sends 5 records but only 2 distinct key-property IDs, so records should be updated instead of appended. I added an assertion to the test to make sure only 2 records were present in the final table, but there are still 5.
I think #15 included an attempt to dedupe using temp tables; maybe that was meant to fix this issue.
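Whatever the fix, the expected upsert semantics are: keep only the latest record per key. A minimal in-memory sketch of that behavior (hypothetical helper, not the target's actual staging logic):

```python
def dedupe_latest(records, key_properties):
    # Later records win, mirroring merge/upsert semantics on the key.
    latest = {}
    for record in records:
        key = tuple(record[k] for k in key_properties)
        latest[key] = record
    return list(latest.values())
```

Five input records with two distinct IDs would collapse to two rows, matching the assertion the test adds.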
When running a simple ELT from Slack to Snowflake, I get the following error:
2023-06-01T14:52:08.118395Z [info ] 2023-06-01 16:52:08,118 | INFO | target-snowflake | Target 'target-snowflake' completed reading 4299 lines of input (4297 records, (0 batch manifests, 1 state messages). cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.125457Z [info ] Traceback (most recent call last): cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.125600Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/bin/target-snowflake", line 8, in <module> cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.125722Z [info ] sys.exit(TargetSnowflake.cli()) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.125799Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 1130, in __call__ cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.125861Z [info ] return self.main(*args, **kwargs) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.125917Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 1055, in main cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.126391Z [info ] rv = self.invoke(ctx) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.126451Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 1404, in invoke cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.126953Z [info ] return ctx.invoke(self.callback, **ctx.params) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.127037Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 760, in invoke cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.127106Z [info ] return __callback(*args, **kwargs) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.127707Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 572, in cli cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.127818Z [info ] target.listen(file_input) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.128521Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/io_base.py", line 35, in listen cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.128606Z [info ] self._process_endofpipe() cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.128678Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 287, in _process_endofpipe cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.129194Z [info ] self.drain_all(is_endofpipe=True) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.129266Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 449, in drain_all cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.129802Z [info ] self._drain_all(list(self._sinks_active.values()), self.max_parallelism) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.129863Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 482, in _drain_all cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.129918Z [info ] Parallel()(delayed(_drain_sink)(sink=sink) for sink in sink_list) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.130603Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/joblib/parallel.py", line 1098, in __call__ cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.130743Z [info ] self.retrieve() cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.131340Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/joblib/parallel.py", line 975, in retrieve cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.131401Z [info ] self._output.extend(job.get(timeout=self.timeout)) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.131464Z [info ] File "/Users/florian.ernst/.pyenv/versions/3.10.1/lib/python3.10/multiprocessing/pool.py", line 771, in get cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.132069Z [info ] raise self._value cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.132124Z [info ] File "/Users/florian.ernst/.pyenv/versions/3.10.1/lib/python3.10/multiprocessing/pool.py", line 125, in worker cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.132721Z [info ] result = (True, func(*args, **kwds)) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.132775Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/joblib/_parallel_backends.py", line 620, in __call__ cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.132832Z [info ] return self.func(*args, **kwargs) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.133331Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/joblib/parallel.py", line 288, in __call__ cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.133386Z [info ] return [func(*args, **kwargs) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.133998Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/joblib/parallel.py", line 288, in <listcomp> cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.134056Z [info ] return [func(*args, **kwargs) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.134111Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 479, in _drain_sink cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.134608Z [info ] self.drain_one(sink) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.134668Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 469, in drain_one cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.135177Z [info ] sink.process_batch(draining_status) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.135233Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/sinks/sql.py", line 253, in process_batch cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.135282Z [info ] self.bulk_insert_records( cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.135847Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/target_snowflake/sinks.py", line 348, in bulk_insert_records cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.135899Z [info ] encoding, files = self.get_batches( cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.136618Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/target_snowflake/sinks.py", line 374, in get_batches cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.136709Z [info ] prefix = batch_config.storage.prefix or "" cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.136777Z [info ] AttributeError: 'NoneType' object has no attribute 'storage' cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.165845Z [info ] 2023-06-01 16:52:08,165 | INFO | snowflake.connector.cursor | query: [ROLLBACK] cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.170986Z [info ] 2023-06-01 16:52:08,169 | ERROR | sqlalchemy.pool.impl.QueuePool | Exception during reset or similar cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.171085Z [info ] Traceback (most recent call last): cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.171149Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 763, in _finalize_fairy cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.171208Z [info ] fairy._reset(pool, transaction_was_reset) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.171282Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 1038, in _reset cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.171334Z [info ] pool._dialect.do_rollback(self) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.171383Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 683, in do_rollback cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.171878Z [info ] dbapi_connection.rollback() cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.171929Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/connection.py", line 647, in rollback cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.172428Z [info ] self.cursor().execute("ROLLBACK") cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.172476Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/cursor.py", line 801, in execute cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.172969Z [info ] ret = self._execute_helper(query, **kwargs) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.173030Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/cursor.py", line 521, in _execute_helper cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.173079Z [info ] ret = self._connection.cmd_query( cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.173679Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/connection.py", line 1067, in cmd_query cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.173728Z [info ] ret = self.rest.request( cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.174303Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/network.py", line 477, in request cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.174353Z [info ] return self._post_request( cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.174401Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/network.py", line 716, in _post_request cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.174889Z [info ] ret = self.fetch( cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.174938Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/network.py", line 814, in fetch cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.175474Z [info ] ret = self._request_exec_wrapper( cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.175523Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/network.py", line 937, in _request_exec_wrapper cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.175578Z [info ] raise e cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.176179Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/network.py", line 857, in _request_exec_wrapper cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.176237Z [info ] return_object = self._request_exec( cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.176937Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/network.py", line 1131, in _request_exec cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.176991Z [info ] raise err cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.177041Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/network.py", line 1033, in _request_exec cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.177556Z [info ] raw_ret = session.request( cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.177604Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/vendored/requests/sessions.py", line 577, in request cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.178420Z [info ] settings = self.merge_environment_settings( cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.178498Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/vendored/requests/sessions.py", line 759, in merge_environment_settings cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.178994Z [info ] env_proxies = get_environ_proxies(url, no_proxy=no_proxy) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.179050Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/vendored/requests/utils.py", line 825, in get_environ_proxies cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.179794Z [info ] if should_bypass_proxies(url, no_proxy=no_proxy): cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.179853Z [info ] File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/vendored/requests/utils.py", line 809, in should_bypass_proxies cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.179906Z [info ] bypass = proxy_bypass(parsed.hostname) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.180531Z [info ] File "/Users/florian.ernst/.pyenv/versions/3.10.1/lib/python3.10/urllib/request.py", line 2649, in proxy_bypass cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.180596Z [info ] return proxy_bypass_macosx_sysconf(host) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.180647Z [info ] File "/Users/florian.ernst/.pyenv/versions/3.10.1/lib/python3.10/urllib/request.py", line 2626, in proxy_bypass_macosx_sysconf cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.181135Z [info ] return _proxy_bypass_macosx_sysconf(host, proxy_settings) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.181203Z [info ] File "/Users/florian.ernst/.pyenv/versions/3.10.1/lib/python3.10/urllib/request.py", line 2568, in _proxy_bypass_macosx_sysconf cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.181721Z [info ] from fnmatch import fnmatch cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.181774Z [info ] ImportError: sys.meta_path is None, Python is likely shutting down cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.181919Z [info ] 2023-06-01 16:52:08,170 | INFO | snowflake.connector.connection | closed cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.182252Z [info ] 2023-06-01 16:52:08,171 | INFO | snowflake.connector.connection | No async queries seem to be running, deleting session cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.224127Z [error ] Loading failed code=1 message=2023-06-01 16:52:08,171 | INFO | snowflake.connector.connection | No async queries seem to be running, deleting session name=meltano run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake
I added some blank lines to clearly separate the three distinct parts of the logs. Here, I think the two interesting lines are:
2023-06-01T14:52:08.136709Z [info ] prefix = batch_config.storage.prefix or "" cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.136777Z [info ] AttributeError: 'NoneType' object has no attribute 'storage' cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
Not sure what's wrong here.
Following #57 we deactivated altering columns. We should turn that back on and make changes so that columns are altered more efficiently.
Originally posted by @edgarrmondragon in #2 (comment)
We should probably add pre-commit to lint the repo.
Related to meltano/sdk#1783
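A minimal starting point could look like this (a hypothetical .pre-commit-config.yaml sketch; the revs are placeholders to pin as appropriate):

```yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    hooks:
      - id: end-of-file-fixer
      - id: trailing-whitespace
  - repo: https://github.com/charliermarsh/ruff-pre-commit
    rev: v0.0.270
    hooks:
      - id: ruff
```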
We should allow users to configure whether they want every record inserted in append-only mode, or whether we should update records that already exist. If someone wants updates, they need to have key properties; otherwise the target should raise an exception. The current default in the community seems to be to update and require key properties, so sticking with the status quo seems like a fine path forward.
This target implements logic to use copy or merge statements depending on whether key properties are present, so we can do both, although meltano/sdk#1819 requires key properties, so it would never use our copy logic.
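The copy-vs-merge decision described above can be sketched as follows (simplified pseudo-logic for illustration, not the target's actual code):

```python
from typing import List, Optional


def choose_load_strategy(key_properties: Optional[List[str]]) -> str:
    """Return the load strategy a sink would use for a stream.

    With key properties we can MERGE (upsert) on the key columns; without
    them the only safe option is an append-only COPY/INSERT.
    """
    if key_properties:
        return "merge"  # upsert: match on the key columns
    return "copy"  # append-only: blindly insert every record
```

A config option could then force `"copy"` even when key properties are present, for users who want pure append-only loading.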
Originally raised in slack https://meltano.slack.com/archives/C01TCRBBJD7/p1691415382778629
IndexError: list index out of range
The culprit looks to be this logic, which assumes the numeric scale rules are floats when they should also support integers.
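For reference, a scale derivation that tolerates integer `multipleOf` values could look like this (a hypothetical helper, not the actual code; the assumption is that the failing logic split the string representation on `"."`, which raises IndexError for integers):

```python
from decimal import Decimal


def scale_from_multiple_of(multiple_of) -> int:
    """Derive a decimal scale from a JSON Schema multipleOf value.

    Handles floats (0.0001 -> 4) and integers (1 -> 0) alike, unlike a
    naive str(value).split(".")[1] approach that only works for floats.
    """
    exponent = Decimal(str(multiple_of)).normalize().as_tuple().exponent
    return max(0, -exponent)
```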
```python
from singer_sdk.helpers.capabilities import PluginCapabilities


class TargetSnowflake(SQLTarget):
    capabilities = [*SQLTarget.capabilities, PluginCapabilities.BATCH]
```
Address meltano/sdk#1667 in this target. This is currently on 1.4.48.
Context in #33 (comment)
I disabled the Python version matrix in CI because the jobs were interfering with each other, since right now they share a schema. Adding the GitHub Actions run ID as a prefix on the schema is probably a good idea.
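One way to isolate the matrix jobs, sketched as a hypothetical workflow fragment (the env var name follows the Singer SDK config convention but is illustrative, as are the matrix values; dots in the version would need stripping for a valid identifier):

```yaml
jobs:
  tests:
    strategy:
      matrix:
        python-version: ["3.8", "3.9", "3.10", "3.11"]
    env:
      # Give each run its own schema so matrix jobs don't collide.
      TARGET_SNOWFLAKE_DEFAULT_TARGET_SCHEMA: "CI_${{ github.run_id }}"
```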
Traceback (most recent call last):
File "/Users/pnadolny/Documents/Git/GitHub/pnadolny/hub-utils/hub_utils/test_meltano_project/.meltano/loaders/target-snowflake/venv/bin/target-snowflake", line 5, in <module>
from target_snowflake.target import TargetSnowflake
File "/Users/pnadolny/Documents/Git/GitHub/pnadolny/hub-utils/hub_utils/test_meltano_project/.meltano/loaders/target-snowflake/venv/lib/python3.8/site-packages/target_snowflake/target.py", line 8, in <module>
from target_snowflake.sinks import SnowflakeSink
File "/Users/pnadolny/Documents/Git/GitHub/pnadolny/hub-utils/hub_utils/test_meltano_project/.meltano/loaders/target-snowflake/venv/lib/python3.8/site-packages/target_snowflake/sinks.py", line 17, in <module>
from singer_sdk.sinks import SQLConnector, SQLSink
ImportError: cannot import name 'SQLConnector' from 'singer_sdk.sinks' (/Users/pnadolny/Documents/Git/GitHub/pnadolny/hub-utils/hub_utils/test_meltano_project/.meltano/loaders/target-snowflake/venv/lib/python3.8/site-packages/singer_sdk/sinks/__init__.py)
Command '['poetry', 'run', 'meltano', 'invoke', 'target-snowflake', '--help']' returned non-zero exit status 1.
Configuration file exists at /Users/pnadolny/Library/Application Support/pypoetry, reusing this directory.
@kgpayne I was trying to add this to the hub and it threw an error before I could extract the --about metadata, in this case during a --help command.
@Jan Soutusta reported this error in Slack:
2023-03-06T08:03:44.213924Z [info ] sqlalchemy.exc.ProgrammingError: (snowflake.connector.errors.ProgrammingError) 002002 (42710): 01aac2a3-0001-008b-0001-aed20033302a: SQL compilation error: cmd_type=elb consumer=True name=target-snowflake-sfdc producer=False stdio=stderr string_id=target-snowflake-sfdc
2023-03-06T08:03:44.214106Z [info ] Object 'SFDC_INPUT_STAGE' already exists. cmd_type=elb consumer=True name=target-snowflake-sfdc producer=False stdio=stderr string_id=target-snowflake-sfdc
2023-03-06T08:03:44.214274Z [info ] [SQL: CREATE SCHEMA "SFDC_INPUT_STAGE"] cmd_type=elb consumer=True name=target-snowflake-sfdc producer=False stdio=stderr string_id=target-snowflake-sfdc
2023-03-06T08:03:44.214440Z [info ] (Background on this error at: https://sqlalche.me/e/14/f405) cmd_type=elb consumer=True name=target-snowflake-sfdc producer=False stdio=stderr string_id=target-snowflake-sfdc
While fixing #87 I ran into my schema not getting created automatically. Really, I think we should just make https://github.com/MeltanoLabs/target-snowflake/blob/main/target_snowflake/target.py#L46 required, and probably check that the value isn't null somewhere.
When rolling back my transactions, the target was trying to roll back from the DATABASE.None schema, which didn't exist. I think just forcing a schema to exist would help, but there could be other options here!
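A fail-fast check along these lines (a hypothetical helper; the config key names are assumptions) would surface the problem before any "DATABASE.None" name reaches SQL:

```python
def resolve_schema(config: dict) -> str:
    """Return the configured target schema, or fail with a clear message."""
    schema = config.get("default_target_schema") or config.get("schema")
    if not schema:
        raise ValueError(
            "No target schema configured; set 'default_target_schema' so "
            "fully qualified names do not become '<database>.None'."
        )
    return schema
```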
Stack trace
joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
File "E:\code\target-snowflake\target_snowflake\sinks.py", line 178, in insert_batch_files_via_internal_stage
self.conform_name(
File "E:\code\target-snowflake\target_snowflake\sinks.py", line 98, in conform_name
return super().conform_name(name=name, object_type=object_type)
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\singer_sdk\sinks\sql.py", line 159, in conform_name
name = re.sub(r"[^a-zA-Z0-9_\-\.\s]", "", name)
File "c:\users\derek\appdata\local\programs\python\python38\lib\re.py", line 210, in sub
return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1910, in _execute_context
self.dialect.do_execute(
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\sqlalchemy\engine\default.py", line 736, in do_execute
cursor.execute(statement, parameters)
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\snowflake\connector\cursor.py", line 910, in execute
Error.errorhandler_wrapper(self.connection, self, error_class, errvalue)
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\snowflake\connector\errors.py", line 290, in errorhandler_wrapper
handed_over = Error.hand_to_other_handler(
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\snowflake\connector\errors.py", line 345, in hand_to_other_handler
cursor.errorhandler(connection, cursor, error_class, error_value)
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\snowflake\connector\errors.py", line 221, in default_errorhandler
raise error_class(
snowflake.connector.errors.ProgrammingError: 002003 (02000): 01ad9a15-0001-32de-0000-0004428e32e9: SQL compilation error:
Schema 'MELTANO_DATABASE.NONE' does not exist or not authorized.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\joblib\_parallel_backends.py", line 273, in _wrap_func_call
return func()
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\joblib\parallel.py", line 588, in __call__
return [func(*args, **kwargs)
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\joblib\parallel.py", line 588, in <listcomp>
return [func(*args, **kwargs)
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\singer_sdk\target_base.py", line 505, in _drain_sink
self.drain_one(sink)
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\singer_sdk\target_base.py", line 495, in drain_one
sink.process_batch(draining_status)
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\singer_sdk\sinks\sql.py", line 267, in process_batch
self.bulk_insert_records(
File "E:\code\target-snowflake\target_snowflake\sinks.py", line 142, in bulk_insert_records
self.insert_batch_files_via_internal_stage(
File "E:\code\target-snowflake\target_snowflake\sinks.py", line 205, in insert_batch_files_via_internal_stage
self.connector.drop_file_format(file_format=file_format)
File "E:\code\target-snowflake\target_snowflake\connector.py", line 451, in drop_file_format
conn.execute(drop_statement, **kwargs)
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1385, in execute
return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\sqlalchemy\sql\elements.py", line 334, in _execute_on_connection
return connection._execute_clauseelement(
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1577, in _execute_clauseelement
ret = self._execute_context(
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1953, in _execute_context
self._handle_dbapi_exception(
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\sqlalchemy\engine\base.py", line 2134, in _handle_dbapi_exception
util.raise_(
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\sqlalchemy\util\compat.py", line 211, in raise_
raise exception
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1910, in _execute_context
self.dialect.do_execute(
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\sqlalchemy\engine\default.py", line 736, in do_execute
cursor.execute(statement, parameters)
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\snowflake\connector\cursor.py", line 910, in execute
Error.errorhandler_wrapper(self.connection, self, error_class, errvalue)
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\snowflake\connector\errors.py", line 290, in errorhandler_wrapper
handed_over = Error.hand_to_other_handler(
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\snowflake\connector\errors.py", line 345, in hand_to_other_handler
cursor.errorhandler(connection, cursor, error_class, error_value)
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\snowflake\connector\errors.py", line 221, in default_errorhandler
raise error_class(
sqlalchemy.exc.ProgrammingError: (snowflake.connector.errors.ProgrammingError) 002003 (02000): 01ad9a15-0001-32de-0000-0004428e32e9: SQL compilation error:
Schema 'MELTANO_DATABASE.NONE' does not exist or not authorized.
[SQL: drop file format if exists MELTANO_DATABASE.None."animals-5dd839ea-8cd4-44da-b036-5a0be3b679b3"]
(Background on this error at: https://sqlalche.me/e/14/f405)
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "c:\users\derek\appdata\local\programs\python\python38\lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "c:\users\derek\appdata\local\programs\python\python38\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\Scripts\target-snowflake.exe\__main__.py", line 7, in <module>
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\click\core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\click\core.py", line 1078, in main
rv = self.invoke(ctx)
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\click\core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\click\core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\singer_sdk\target_base.py", line 550, in invoke
target.listen(file_input)
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\singer_sdk\io_base.py", line 35, in listen
self._process_endofpipe()
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\singer_sdk\target_base.py", line 310, in _process_endofpipe
self.drain_all(is_endofpipe=True)
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\singer_sdk\target_base.py", line 475, in drain_all
2023-07-13 11:49:09,424 | INFO | snowflake.connector.cursor | query execution done
self._drain_all(list(self._sinks_active.values()), self.max_parallelism)
2023-07-13 11:49:09,425 | INFO | snowflake.connector.cursor | Number of results in first chunk: 1
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\singer_sdk\target_base.py", line 508, in _drain_all
Parallel()(delayed(_drain_sink)(sink=sink) for sink in sink_list)
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\joblib\parallel.py", line 1944, in __call__
return output if self.return_generator else list(output)
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\joblib\parallel.py", line 1587, in _get_outputs
yield from self._retrieve()
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\joblib\parallel.py", line 1691, in _retrieve
self._raise_error_fast()
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\joblib\parallel.py", line 1726, in _raise_error_fast
error_job.get_result(self.timeout)
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\joblib\parallel.py", line 735, in get_result
return self._return_or_raise()
File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\joblib\parallel.py", line 753, in _return_or_raise
raise self._result
sqlalchemy.exc.ProgrammingError: (snowflake.connector.errors.ProgrammingError) 002003 (02000): 01ad9a15-0001-32de-0000-0004428e32e9: SQL compilation error:
If a database is specified that doesn't exist, then the target returns this error when trying to create the empty table:
Cannot perform CREATE TABLE. This session does not have a current database. Call 'USE DATABASE', or use a qualified name.
It's not clear that the reason for this is that the database wasn't found. If the database is found, then it is used as the default database. We should put a check in place to raise a clear exception when the configured database isn't found. It doesn't look like we're conforming database names right now, but it's also possible that someone will implement that, and some users will be confused when their pre-conformed database name exists but the post-conformed one does not. So having a clear error message with the database name is important.
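Such a check could look like this (a hypothetical helper, not the target's actual code; `SHOW DATABASES LIKE` works even without a current database set for the session):

```python
def assert_database_exists(connection, database: str) -> None:
    """Raise a clear error if the configured database is missing.

    Surfaces the configured database name instead of Snowflake's generic
    "This session does not have a current database" message.
    """
    row = connection.execute(f"SHOW DATABASES LIKE '{database}'").fetchone()
    if row is None:
        raise RuntimeError(
            f"Configured database '{database}' was not found or is not "
            "authorized for the current role. Check the 'database' setting."
        )
```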
2023-09-29T18:56:53.270886Z [info ] 2023-09-29 18:56:53,270 | INFO | snowflake.connector.cursor | Number of results in first chunk: 1 cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.271316Z [info ] Traceback (most recent call last): cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.271668Z [info ] File "/project/.meltano/loaders/target-snowflake/venv/bin/target-snowflake", line 8, in <module> cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.272072Z [info ] sys.exit(TargetSnowflake.cli()) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.272406Z [info ] File "/project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/click/core.py", line 1157, in __call__ cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.272720Z [info ] return self.main(*args, **kwargs) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.273074Z [info ] File "/project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/click/core.py", line 1078, in main cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.273485Z [info ] rv = self.invoke(ctx) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.273831Z [info ] File "/project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/click/core.py", line 1434, in invoke cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.274157Z [info ] return ctx.invoke(self.callback, **ctx.params) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.274471Z [info ] File "/project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/click/core.py", line 783, in invoke cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.274805Z [info ] return __callback(*args, **kwargs) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.275122Z [info ] File "/project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/singer_sdk/target_base.py", line 550, in invoke cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.275445Z [info ] target.listen(file_input) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.275742Z [info ] File "/project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/singer_sdk/io_base.py", line 34, in listen cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.276050Z [info ] self._process_lines(file_input) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.276343Z [info ] File "/project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/singer_sdk/target_base.py", line 291, in _process_lines cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.276789Z [info ] counter = super()._process_lines(file_input) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.277120Z [info ] File "/project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/singer_sdk/io_base.py", line 93, in _process_lines cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.277426Z [info ] self._process_record_message(line_dict) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.277724Z [info ] File "/project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/singer_sdk/target_base.py", line 341, in _process_record_message cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.278017Z [info ] sink._validate_and_parse(transformed_record) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.278309Z [info ] File "/project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/singer_sdk/sinks/core.py", line 317, in _validate_and_parse cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.278603Z [info ] self._validator.validate(record) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.278914Z [info ] File "/project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/jsonschema/validators.py", line 435, in validate cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.279209Z [info ] raise error cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.279504Z [info ] jsonschema.exceptions.ValidationError: 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' is too long cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.279798Z [info ] cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.280089Z [info ] Failed validating 'maxLength' in schema['properties']['fieldname']: cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.280392Z [info ] {'maxLength': 20, 'type': ['string', 'null']} cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.280697Z [info ] cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.281026Z [info ] On instance['fieldname']: cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.281332Z [info ] 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
We could wrap the validation call in /project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/singer_sdk/sinks/core.py
in a try/except and output the sink name. Ideally the error would include the stack trace and the jsonschema error, along with the stream_name and the key properties of the failing row (if one exists).
We could do this in the SDK as well 🤷♂️
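A minimal sketch of what that wrapper might look like. The helper name and the way `stream_name` / `key_properties` are passed in are hypothetical; in the real SDK they would come from the `Sink` instance whose `_validate_and_parse` appears in the traceback above:

```python
import jsonschema

def validate_with_context(validator, record, stream_name, key_properties):
    """Validate a record and re-raise with stream context.

    Hypothetical helper: catches the jsonschema ValidationError and
    re-raises it with the stream name and the record's key-property
    values prepended, so the failing stream/row is identifiable.
    """
    try:
        validator.validate(record)
    except jsonschema.exceptions.ValidationError as err:
        keys = {k: record.get(k) for k in key_properties}
        raise jsonschema.exceptions.ValidationError(
            f"Record validation failed for stream '{stream_name}' "
            f"(key properties: {keys}): {err.message}"
        ) from err
```

The original `ValidationError` is kept as the `__cause__`, so the full jsonschema context (schema path, failing instance) is still available in the stack trace.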
Related:
- BATCH docs: https://meltano-sdk--904.org.readthedocs.build/en/904/batch.html
- COPY INTO docs for bulk load: https://docs.snowflake.com/en/user-guide/data-load-local-file-system.html
- tap-snowflake: MeltanoLabs/tap-snowflake#1

Spec details:
- jsonl and gzip to match the initial built-in SDK feature spec.
- One COPY command per BATCH message's manifest. (Each manifest could contain one or many files, depending on the tap implementation.)
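A rough sketch of the "one COPY per manifest" idea. The message shape, stage name, and SQL generation here are illustrative assumptions, not the target-snowflake implementation:

```python
def copy_statement_for_batch(batch_message: dict, table: str, stage: str) -> str:
    """Build a single COPY INTO statement for one BATCH message's manifest.

    Assumed (hypothetical) message shape, loosely following the SDK spec:
    {"encoding": {"format": "jsonl", "compression": "gzip"},
     "manifest": ["file:///tmp/batch-0001.jsonl.gz", ...]}
    """
    encoding = batch_message["encoding"]
    # The initial spec covers only gzipped jsonl.
    if encoding["format"] != "jsonl" or encoding["compression"] != "gzip":
        raise ValueError(f"Unsupported encoding: {encoding}")

    # One COPY per manifest; list every file in the manifest explicitly.
    files = ", ".join(
        f"'{uri.rsplit('/', 1)[-1]}'" for uri in batch_message["manifest"]
    )
    return (
        f"COPY INTO {table} FROM @{stage} FILES = ({files}) "
        "FILE_FORMAT = (TYPE = 'JSON' COMPRESSION = 'GZIP')"
    )
```

This assumes the files have already been uploaded to the named stage (e.g. via PUT); the function only generates the SQL text.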