meltanolabs / target-snowflake

Singer Target for the Snowflake cloud Data Warehouse

Home Page: https://hub.meltano.com/loaders/target-snowflake--meltanolabs/

License: Other

Python 100.00%
elt meltano singer-sdk singer-target snowflake

target-snowflake's People

Contributors

dependabot[bot], dlouseiro, edgarrmondragon, meltybot, miloszszymczak, pnadolny13, pre-commit-ci[bot], visch


target-snowflake's Issues

Failed validating 'multipleOf'

The problem

I'm not sure exactly what this check is or how it works, but it seems I'm hitting the upstream issue documented in the Meltano SDK issues: meltano/sdk#344

Also discussed in python-jsonschema/jsonschema#247

The error

jsonschema.exceptions.ValidationError: -4.8 is not a multiple of 0.0001

Failed validating 'multipleOf' in schema['properties']['AMT_EXCL_TAX']:
    {'multipleOf': 0.0001, 'type': ['null', 'number']}

On instance['AMT_EXCL_TAX']:
    -4.8

This particular column in Oracle is defined as

AMT_EXCL_TAX NUMBER(28, 4) DEFAULT 0 NOT NULL
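The failure is a floating-point artifact rather than bad data: 0.0001 has no exact binary representation, so jsonschema's multipleOf check (division by the modulus) leaves a remainder even for values that are exact decimal multiples. A minimal sketch:

```python
from decimal import Decimal

value, step = -4.8, 0.0001

# jsonschema effectively checks whether value / step is an integer, but the
# binary doubles for -4.8 and 0.0001 divide to something slightly off -48000.
q = value / step
print(q == int(q))  # False -> "-4.8 is not a multiple of 0.0001"

# With exact decimal arithmetic, the value IS a clean multiple of 0.0001:
print(Decimal("-4.8") % Decimal("0.0001") == 0)  # True
```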

with 20 example values

AMT_EXCL_TAX
-2332.4000
-1212.9600
-1146.0000
-929.0000
-619.0000
-500.0000
-448.0000
-346.0000
-304.0000
-203.5200
-103.2000
-94.9000
-50.4000
-4.8000
21.6000
31.2000
325.2000
1565.6000
3605.0300
3939.7500

MRE

I do not have a valid MRE at this time. Please let me know if I need to make one.

Runtime details

❯ python --version                                    
Python 3.10.11
❯ meltano --version                                        
meltano, version 2.19.1
❯ pip freeze | grep json     
check-jsonschema==0.22.0
jsonschema==4.17.3

I'm running a job with

meltano --environment=dev run tap-oracle target-snowflake

meltano.yaml

version: 1
default_environment: dev
project_id: b1d562d0-d114-42a7-bad1-957891a7285b

environments:
  - name: dev
    config:
      plugins:
        loaders:
          - name: target-snowflake
            config:
              database: redacted
              warehouse: redacted
              role: redacted
              user: redacted
              schema: redacted
              default_target_schema: redacted
  - name: uat
  - name: prod

plugins:
  extractors:
    - name: tap-oracle
      pip_url: git+https://github.com/christianelliott/pipelinewise-tap-oracle.git
      config:
        filter_schemas: MY_SCHEMA
        default_replication_method: FULL_TABLE
        port: 1521
        host: redacted
        user: redacted
        service_name: redacted
        use_ora_rowscn: true
        add_metadata_columns: true
      select:
        - MY_SCHEMA-FINANCIAL_TRANSACTION.*
      metadata:
        MY_SCHEMA-FINANCIAL_TRANSACTION:
          replication-method: INCREMENTAL
          replication-key: ID
  loaders:
    - name: target-snowflake
      variant: meltanolabs
      pip_url: meltanolabs-target-snowflake
      config:
        account: redacted.eu-west-1
        add_record_metadata: true
        clean_up_batch_files: true
        flattening_enabled: false
jobs:
  - name: icx--snowflake
    tasks:
      - tap-oracle target-snowflake

2023-07-03T19:56:58.891691Z [info     ] 2023-07-03 21:56:58,891 | INFO     | snowflake.connector.cursor | Number of results in first chunk: 1
2023-07-03T19:59:07.192610Z [info     ] time=2023-07-03 21:59:07 name=singer level=INFO message=METRIC: b'{"type":"counter","metric":"record_count","value":1,"tags":{}}'
2023-07-03T19:59:07.254068Z [info     ] Traceback (most recent call last):
2023-07-03T19:59:07.254354Z [info     ]   File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/bin/target-snowflake", line 8, in <module>
2023-07-03T19:59:07.255159Z [info     ]     sys.exit(TargetSnowflake.cli())
2023-07-03T19:59:07.255444Z [info     ]   File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
2023-07-03T19:59:07.256212Z [info     ]     return self.main(*args, **kwargs)
2023-07-03T19:59:07.256460Z [info     ]   File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 1055, in main
2023-07-03T19:59:07.256815Z [info     ]     rv = self.invoke(ctx)
2023-07-03T19:59:07.257037Z [info     ]   File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
2023-07-03T19:59:07.257552Z [info     ]     return ctx.invoke(self.callback, **ctx.params)
2023-07-03T19:59:07.257721Z [info     ]   File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 760, in invoke
2023-07-03T19:59:07.257945Z [info     ]     return __callback(*args, **kwargs)
2023-07-03T19:59:07.258128Z [info     ]   File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 546, in invoke
2023-07-03T19:59:07.258422Z [info     ]     target.listen(file_input)
2023-07-03T19:59:07.258502Z [info     ]   File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/io_base.py", line 33, in listen
2023-07-03T19:59:07.258562Z [info     ]     self._process_lines(file_input)
2023-07-03T19:59:07.258615Z [info     ]   File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 291, in _process_lines
2023-07-03T19:59:07.258667Z [info     ]     counter = super()._process_lines(file_input)
2023-07-03T19:59:07.258717Z [info     ]   File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/io_base.py", line 79, in _process_lines
2023-07-03T19:59:07.258855Z [info     ]     self._process_record_message(line_dict)
2023-07-03T19:59:07.259054Z [info     ]   File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 338, in _process_record_message
2023-07-03T19:59:07.259223Z [info     ]     sink._validate_and_parse(transformed_record)
2023-07-03T19:59:07.259390Z [info     ]   File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/sinks/core.py", line 314, in _validate_and_parse
2023-07-03T19:59:07.259445Z [info     ]     self._validator.validate(record)
2023-07-03T19:59:07.259495Z [info     ]   File "/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/jsonschema/validators.py", line 314, in validate
2023-07-03T19:59:07.259973Z [info     ]     raise error
2023-07-03T19:59:07.260326Z [info     ] jsonschema.exceptions.ValidationError: -4.8 is not a multiple of 0.0001
2023-07-03T19:59:07.260406Z [info     ]
2023-07-03T19:59:07.260493Z [info     ] Failed validating 'multipleOf' in schema['properties']['AMT_EXCL_TAX']:
2023-07-03T19:59:07.260721Z [info     ]     {'multipleOf': 0.0001, 'type': ['null', 'number']}
2023-07-03T19:59:07.260791Z [info     ]
2023-07-03T19:59:07.260846Z [info     ] On instance['AMT_EXCL_TAX']:
2023-07-03T19:59:07.260946Z [info     ]     -4.8
2023-07-03T19:59:07.298085Z [info     ] 2023-07-03 21:59:07,297 | INFO     | snowflake.connector.connection | closed
2023-07-03T19:59:07.298331Z [info     ] 2023-07-03 21:59:07,298 | INFO     | snowflake.connector.connection | No async queries seem to be running, deleting session
2023-07-03T19:59:07.368337Z [error    ] [Errno 32] Broken pipe
...

Some oddities in Publish workflow

Note: No urgency here. Just logging these so we can come back to them next week, and keep iterating on the process.

I created draft release notes for v0.1.0, based on changes picked up by GitHub's default release drafter, and then attempted to publish them as a release.

A few findings:

  • Instead of v0.1.0, the CI action tried to publish as v0.0.2, which was already taken.
  • On further inspection, v0.0.2 already included the changes that had been detected as v0.1.0 changes, so it looks like there are actually no pending changes.


bug: reserved words cause compilation error

I have a table, loaded from tap-meltanohub, whose columns include reserved words like select and update. On initial creation, the merge statement fails with a compilation error because select/update are not quoted:

sqlalchemy.exc.ProgrammingError: (snowflake.connector.errors.ProgrammingError) 001003 (42000): 01acd554-0603-1f7b-0051-5c83035999a2: SQL compilation error:
syntax error line 1 at position 611 unexpected 'SELECT'.
syntax error line 1 at position 705 unexpected 'UPDATE'.
syntax error line 1 at position 2,589 unexpected 'SELECT'.
syntax error line 1 at position 2,615 unexpected 'UPDATE'.
[SQL: merge into USERDEV_RAW.PNADOLNY_MELTANOHUB_NEW.PLUGINS d using (select $1:description::VARCHAR as DESCRIPTION, $1:label::VARCHAR as LABEL, $1:name::VARCHAR as NAME, $1:logo_url::VARCHAR as LOGO_URL, $1:namespace::VARCHAR as NAMESPACE, $1:variant::VARCHAR as VARIANT, $1:pip_url::VARCHAR as PIP_URL, $1:repo::VARCHAR as REPO, $1:settings::VARIANT as SETTINGS, $1:capabilities::VARIANT as CAPABILITIES, $1:docs::VARCHAR as DOCS, $1:default::BOOLEAN as DEFAULT, $1:id::VARCHAR as ID, $1:plugin_type::VARCHAR as PLUGIN_TYPE, $1:settings_group_validation::VARIANT as SETTINGS_GROUP_VALIDATION, $1:select::VARIANT as SELECT, $1:hidden::BOOLEAN as HIDDEN, $1:metadata::VARIANT as METADATA, $1:update::BOOLEAN as UPDATE, $1:dialect::VARCHAR as DIALECT, $1:target_schema::VARCHAR as TARGET_SCHEMA, $1:executable::VARCHAR as EXECUTABLE, $1:commands::VARIANT as COMMANDS, $1:requires::VARIANT as REQUIRES, $1:_sdc_received_at::TIMESTAMP_NTZ as _SDC_RECEIVED_AT, $1:_sdc_deleted_at::TIMESTAMP_NTZ as _SDC_DELETED_AT, $1:_sdc_extracted_at::TIMESTAMP_NTZ as _SDC_EXTRACTED_AT, $1:_sdc_batched_at::TIMESTAMP_NTZ as _SDC_BATCHED_AT, $1:_sdc_table_version::DECIMAL as _SDC_TABLE_VERSION, $1:_sdc_sequence::DECIMAL as _SDC_SEQUENCE from '@~/target-snowflake/plugins-2825d13f-b9ea-430a-92ae-3dcc0ede7e4f'(file_format => USERDEV_RAW.PNADOLNY_MELTANOHUB_NEW."plugins-2825d13f-b9ea-430a-92ae-3dcc0ede7e4f")) s on d."ID" = s."ID" when matched then update set d."DESCRIPTION" = s."DESCRIPTION", d."LABEL" = s."LABEL", d."NAME" = s."NAME", d."LOGO_URL" = s."LOGO_URL", d."NAMESPACE" = s."NAMESPACE", d."VARIANT" = s."VARIANT", d."PIP_URL" = s."PIP_URL", d."REPO" = s."REPO", d."SETTINGS" = s."SETTINGS", d."CAPABILITIES" = s."CAPABILITIES", d."DOCS" = s."DOCS", d."DEFAULT" = s."DEFAULT", d."ID" = s."ID", d."PLUGIN_TYPE" = s."PLUGIN_TYPE", d."SETTINGS_GROUP_VALIDATION" = s."SETTINGS_GROUP_VALIDATION", d."SELECT" = s."SELECT", d."HIDDEN" = s."HIDDEN", d."METADATA" = s."METADATA", d."UPDATE" = s."UPDATE", 
d."DIALECT" = s."DIALECT", d."TARGET_SCHEMA" = s."TARGET_SCHEMA", d."EXECUTABLE" = s."EXECUTABLE", d."COMMANDS" = s."COMMANDS", d."REQUIRES" = s."REQUIRES", d."_SDC_RECEIVED_AT" = s."_SDC_RECEIVED_AT", d."_SDC_DELETED_AT" = s."_SDC_DELETED_AT", d."_SDC_EXTRACTED_AT" = s."_SDC_EXTRACTED_AT", d."_SDC_BATCHED_AT" = s."_SDC_BATCHED_AT", d."_SDC_TABLE_VERSION" = s."_SDC_TABLE_VERSION", d."_SDC_SEQUENCE" = s."_SDC_SEQUENCE" when not matched then insert (DESCRIPTION, LABEL, NAME, LOGO_URL, NAMESPACE, VARIANT, PIP_URL, REPO, SETTINGS, CAPABILITIES, DOCS, DEFAULT, ID, PLUGIN_TYPE, SETTINGS_GROUP_VALIDATION, SELECT, HIDDEN, METADATA, UPDATE, DIALECT, TARGET_SCHEMA, EXECUTABLE, COMMANDS, REQUIRES, _SDC_RECEIVED_AT, _SDC_DELETED_AT, _SDC_EXTRACTED_AT, _SDC_BATCHED_AT, _SDC_TABLE_VERSION, _SDC_SEQUENCE) values (s."DESCRIPTION", s."LABEL", s."NAME", s."LOGO_URL", s."NAMESPACE", s."VARIANT", s."PIP_URL", s."REPO", s."SETTINGS", s."CAPABILITIES", s."DOCS", s."DEFAULT", s."ID", s."PLUGIN_TYPE", s."SETTINGS_GROUP_VALIDATION", s."SELECT", s."HIDDEN", s."METADATA", s."UPDATE", s."DIALECT", s."TARGET_SCHEMA", s."EXECUTABLE", s."COMMANDS", s."REQUIRES", s."_SDC_RECEIVED_AT", s."_SDC_DELETED_AT", s."_SDC_EXTRACTED_AT", s."_SDC_BATCHED_AT", s."_SDC_TABLE_VERSION", s."_SDC_SEQUENCE")]

Additionally, on the second sync, where the table already exists with reserved-word columns, the target diffs the column names and incorrectly decides to create that column as if it were missing; an error is then thrown because it already exists. SQLAlchemy usually returns all-lowercase column names, but in the reserved-word case the name comes back uppercase. In the DDL of the Snowflake table it is wrapped in double quotes, so I think SQLAlchemy is taking it literally while lowercasing the rest of the columns, but I'm not sure.

snowflake.connector.errors.ProgrammingError: 001430 (42601): 01acd544-0603-1f7b-0051-5c8303595c36: SQL compilation error:
column 'SELECT' already exists
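A minimal sketch of the kind of defensive quoting that avoids the compilation error above; the RESERVED set and quote_identifier helper are illustrative, not the target's actual code:

```python
# Quote any identifier that is a reserved word (or otherwise unsafe) before
# interpolating it into generated SQL. Small illustrative reserved-word set.
RESERVED = {"SELECT", "UPDATE", "DEFAULT", "TABLE"}

def quote_identifier(name: str) -> str:
    # Double-quote reserved or non-identifier names; embedded double quotes
    # are escaped by doubling, per the SQL standard.
    if name.upper() in RESERVED or not name.isidentifier():
        return '"' + name.replace('"', '""') + '"'
    return name

cols = ["id", "select", "update", "name"]
insert_list = ", ".join(quote_identifier(c) for c in cols)
print(insert_list)  # id, "select", "update", name
```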

Fixed in MeltanoLabs/target-postgres#35

Target-Snowflake does not clean up json.gz files after loading to Snowflake (Windows only)

What is the Issue?

When working with Meltano in a Windows environment, target-snowflake leaves behind the json.gz files it generates to load into Snowflake.

This does not occur when working with Meltano in a Linux environment or a Linux container. Likely a path-construction problem again, similar to #87.

Steps to Reproduce

  • In a Windows environment, install and configure meltano with the meltano-variant target-snowflake loader, and any tap of your choosing.
  • Run meltano run (tap) target-snowflake
  • Note that the json.gz files generated are not cleaned up
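For reference, a pathlib-based sketch of creating and then cleaning up a batch file in a cross-platform way; the directory and file names are illustrative, not the target's actual layout:

```python
# pathlib handles separators correctly on both Windows and POSIX, so a cleanup
# step written this way finds the same file it wrote.
from pathlib import Path
import tempfile

batch_dir = Path(tempfile.gettempdir()) / "target-snowflake"
batch_dir.mkdir(parents=True, exist_ok=True)

batch_file = batch_dir / "animals-0001.json.gz"
batch_file.touch()  # stand-in for writing the gzipped batch

# Cleanup after a successful load; missing_ok tolerates a double delete.
batch_file.unlink(missing_ok=True)
print(batch_file.exists())  # False
```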


Slack Thread: https://meltano.slack.com/archives/C013EKWA2Q1/p1689800363928689

End to end test doesn't exist

I was trying to show that Windows jobs fail via GitHub Actions while solving #87, but the job succeeded when running on Windows. I know for a fact that running on Windows and pushing data to Snowflake doesn't work end to end right now without #89 merged.

Maybe we could do a test like https://github.com/MeltanoLabs/target-postgres/blob/89b2b957d136492a2b71a176fef74e610f60b934/.github/workflows/ci_workflow.yml#L99C1-L100C54

Needed to modify meltano.yml to this to get a test going locally:

version: 1
send_anonymous_usage_stats: true
project_id: target-snowflake
plugins:
  extractors:
  - name: tap-smoke-test
    variant: meltano
    pip_url: git+https://github.com/meltano/tap-smoke-test.git
    config:
      streams:
      - stream_name: animals
        input_filename: https://gitlab.com/meltano/tap-smoke-test/-/raw/main/demo-data/animals-data.jsonl
      - stream_name: pageviews
        input_filename: https://raw.githubusercontent.com/meltano/tap-smoke-test/main/demo-data/pageviews-data.jsonl
  loaders:
  - name: target-snowflake
    namespace: target_snowflake
    pip_url: -e .
    capabilities:
    - about
    - schema-flattening
    - stream-maps
    settings_group_validation:
    - - account
      - database
      - password
      - user
    settings:
    - name: account
      kind: string
      label: Account
      description: Your account identifier. See [Account Identifiers](https://docs.snowflake.com/en/user-guide/admin-account-identifier.html).
    - name: add_record_metadata
      kind: boolean
      value: true
      label: Add Record Metadata
      description: Whether to add metadata columns.
    - name: clean_up_batch_files
      kind: boolean
      value: true
      label: Clean Up Batch Files
      description: Whether to remove batch files after processing.
    - name: database
      kind: string
      label: Database
      description: The initial database for the Snowflake session.
    - name: default_target_schema
      kind: string
      label: Default Target Schema
      description: The default target database schema name to use for all streams.
    - name: flattening_enabled
      kind: boolean
      label: Flattening Enabled
      description: "'True' to enable schema flattening and automatically expand nested\
        \ properties."
    - name: flattening_max_depth
      kind: integer
      label: Flattening Max Depth
      description: The max depth to flatten schemas.
    - name: password
      kind: password
      label: Password
      description: The password for your Snowflake user.
    - name: role
      kind: string
      label: Role
      description: The initial role for the session.
    - name: schema
      kind: string
      label: Schema
      description: The initial schema for the Snowflake session.
    - name: stream_map_config
      kind: object
      label: Stream Map Config
      description: User-defined config values to be used within map expressions.
    - name: stream_maps
      kind: object
      label: Stream Maps
      description: Config object for stream maps capability. For more information
        check out [Stream Maps](https://sdk.meltano.com/en/latest/stream_maps.html).
    - name: user
      kind: string
      label: User
      description: The login name for your Snowflake user.
    - name: warehouse
      kind: string
      label: Warehouse
      description: The initial warehouse for the session.
    config:
      start_date: '2010-01-01T00:00:00Z'
default_environment: dev
environments:
  - name: dev

bug: altering existing columns with custom snowflake types fails

Related to #60

In that linked PR there was a bug where comparing SQLAlchemy types fails because the custom Snowflake types don't implement the python_type method. In that PR I fixed the way we assign types in the target, but it turns out that when we ask SQLAlchemy for the columns of an existing table, it still returns the original Snowflake SQLAlchemy types, so later, when we do the same comparison using python_type, the error pops up again.

For now, we need to ask SQLAlchemy for the types and then convert them to our custom types.

bug: Data can be copied into wrong columns

Because the copy into statement doesn't specify the column order, you can run into problems if the column list in the schema doesn't exactly match the column order of the table in Snowflake. I hit this as we were migrating an existing table to use target-snowflake. Related code: https://github.com/MeltanoLabs/target-snowflake/blob/main/target_snowflake/connector.py#L355-L358. I recommend we switch to explicitly listing the column names in the copy into statement (https://docs.snowflake.com/en/sql-reference/sql/copy-into-table); maybe there's a better way?

copy into sql example that gets generated by the target

[SQL: copy into RMS_DB.RAW.EXAMPLE_BROKEN from (select $1:"ACCOUNTNAME"::VARCHAR(399) as "ACCOUNTNAME", $1:"ACCOUNTID"::DECIMAL as "ACCOUNTID", $1:"VOID"::BOOLEAN as "VOID", $1:"ETLLASTUPDATEDON"::TIMESTAMP_NTZ as "ETLLASTUPDATEDON", $1:_sdc_extracted_at::TIMESTAMP_NTZ as _sdc_extracted_at, $1:_sdc_received_at::TIMESTAMP_NTZ as _sdc_received_at, $1:_sdc_batched_at::TIMESTAMP_NTZ as _sdc_batched_at, $1:_sdc_deleted_at::TIMESTAMP_NTZ as _sdc_deleted_at, $1:_sdc_sequence::DECIMAL as _sdc_sequence, $1:_sdc_table_version::DECIMAL as _sdc_table_version from '@~/target-snowflake/EXAMPLE_BROKEN-83eb0de6-3121-4230-bcbf-79846ee67ef7')file_format = (format_name='RMS_DB.RAW."EXAMPLE_BROKEN-83eb0de6-3121-4230-bcbf-79846ee67ef7"')]
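A sketch of the recommended fix: generate the statement with an explicit column list so the select order in the file cannot be mis-mapped onto the table's physical column order. The builder function is illustrative (type casts omitted for brevity), not the target's actual code:

```python
# Build a COPY INTO with an explicit target column list matching the select
# list, so Snowflake maps values by name rather than by position.
def build_copy(table: str, columns: list[str], stage_path: str) -> str:
    col_list = ", ".join(f'"{c}"' for c in columns)
    select_list = ", ".join(f'$1:"{c}" as "{c}"' for c in columns)
    return (
        f"copy into {table} ({col_list}) "
        f"from (select {select_list} from '@~/{stage_path}')"
    )

sql = build_copy(
    "RMS_DB.RAW.EXAMPLE_BROKEN",
    ["ACCOUNTNAME", "ACCOUNTID"],
    "target-snowflake/EXAMPLE_BROKEN-stage",
)
print(sql)
```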

To reproduce

  1. Run the table setup process here

example_1

{"type": "SCHEMA", "stream": "EXAMPLE_BROKEN", "schema": {"properties": {"ACCOUNTID": {"type": ["number", "null"]}, "ACCOUNTNAME": {"maxLength": 400, "type": ["string", "null"]}, "VOID": {"type": ["boolean", "null"]}, "ETLLASTUPDATEDON": {"format": "date-time", "type": ["string", "null"]}}}, "key_properties": [], "bookmark_properties": []}
{"type": "RECORD", "stream": "EXAMPLE_BROKEN", "record": {"ACCOUNTID": 123456.1, "ACCOUNTNAME": "Name1", "VOID": true, "ETLLASTUPDATEDON": "2023-04-06 07:13:04.380000"}, "time_extracted": "2023-09-11T20:41:59.240392+00:00"}

Run cat example_1 | meltano invoke target-snowflake

  2. (Note we swapped the first column order)
    example_2
{"type": "SCHEMA", "stream": "EXAMPLE_BROKEN", "schema": {"properties": {"ACCOUNTNAME": {"maxLength": 399, "type": ["string", "null"]}, "ACCOUNTID": {"type": ["number", "null"]}, "VOID": {"type": ["boolean", "null"]}, "ETLLASTUPDATEDON": {"format": "date-time", "type": ["string", "null"]}}}, "key_properties": [], "bookmark_properties": []}
{"type": "RECORD", "stream": "EXAMPLE_BROKEN", "record": {"ACCOUNTNAME": "Name1", "ACCOUNTID": 123456.1, "VOID": true, "ETLLASTUPDATEDON": "2023-04-06 07:13:04.380000"}, "time_extracted": "2023-09-11T20:41:59.240392+00:00"}

Run cat example_2 | meltano invoke target-snowflake

You'll get

sqlalchemy.exc.ProgrammingError: (snowflake.connector.errors.ProgrammingError) 100038 (22018): 01aeee51-0001-56f3-0004-428e00039da6: Numeric value 'Name1' is not recognized
[SQL: copy into RMS_DB.RAW.EXAMPLE_BROKEN from (select $1:"ACCOUNTNAME"::VARCHAR(399) as "ACCOUNTNAME", $1:"ACCOUNTID"::DECIMAL as "ACCOUNTID", $1:"VOID"::BOOLEAN as "VOID", $1:"ETLLASTUPDATEDON"::TIMESTAMP_NTZ as "ETLLASTUPDATEDON", $1:_sdc_extracted_at::TIMESTAMP_NTZ as _sdc_extracted_at, $1:_sdc_received_at::TIMESTAMP_NTZ as _sdc_received_at, $1:_sdc_batched_at::TIMESTAMP_NTZ as _sdc_batched_at, $1:_sdc_deleted_at::TIMESTAMP_NTZ as _sdc_deleted_at, $1:_sdc_sequence::DECIMAL as _sdc_sequence, $1:_sdc_table_version::DECIMAL as _sdc_table_version from '@~/target-snowflake/EXAMPLE_BROKEN-83eb0de6-3121-4230-bcbf-79846ee67ef7')file_format = (format_name='RMS_DB.RAW."EXAMPLE_BROKEN-83eb0de6-3121-4230-bcbf-79846ee67ef7"')]

bug: TIMESTAMP_NTZ() fails when casting to python type while sorting types

I have a table with a column _SDC_DELETED_AT that is of type VARCHAR(16777216). When I switch to this target, it wants to change that to TIMESTAMP_NTZ() (rightfully so), but it fails at this line https://github.com/meltano/sdk/blob/3ceb28a97a77a5e22163fc887fe16ff129c2dc28/singer_sdk/connectors/sql.py#L912 when trying to adapt the column type, because the Snowflake SQLAlchemy type TIMESTAMP_NTZ() doesn't implement python_type: https://github.com/snowflakedb/snowflake-sqlalchemy/blob/c200f971a745eb404ef68b9f8a248c7255056dc9/src/snowflake/sqlalchemy/custom_types.py#L56.
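The failure mode can be sketched without any dependencies; the classes below are stand-ins (SQLAlchemy's real TypeEngine.python_type property raises NotImplementedError the same way when a subclass does not override it):

```python
# Stand-in for sqlalchemy.types.TypeEngine.
class TypeEngine:
    @property
    def python_type(self):
        raise NotImplementedError()

# Stand-in for snowflake-sqlalchemy's custom type, which does NOT override
# python_type, unlike the built-in types.
class TIMESTAMP_NTZ(TypeEngine):
    pass

class VARCHAR(TypeEngine):
    @property
    def python_type(self):
        return str

# Adapting/sorting candidate types calls python_type on each candidate, so the
# custom Snowflake type blows up while the built-in one works:
print(VARCHAR().python_type)
try:
    TIMESTAMP_NTZ().python_type
except NotImplementedError:
    print("TIMESTAMP_NTZ has no python_type")
```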

`File doesn't exist` when putting batch to internal stage using Windows machine

From slack thread https://meltano.slack.com/archives/C01TCRBBJD7/p1689179609824739

The target uses the underlying SDK batch mechanisms to load data to Snowflake via internal stages. It looks like somewhere along the way the path is getting mangled: slashes are being removed, which causes a File doesn't exist error to be raised. An example is C:GitHubETLRedesignmeltano-projectMeltanoV2\target-snowflake--Content-ResourceLocationUpdate-c1b6705a-fbbc-4965-ae9c-3cf50ee17296-1.json.gz. It's very likely that this issue belongs in the SDK, but this is where it was identified first.

I tried to track down what's happening, but it goes into the fs library, and it's very difficult to debug without a Windows machine to test on.

cc @edgarrmondragon @visch

feat: Support breakpoint()s / document debug steps

This may deserve a separate issue, but I'm having a tough time debugging: my debug statements (adding breakpoint()) don't work, seemingly because of the use of joblib.parallel. Maybe just a write-up in the docs on how to debug this target / add debug statements?

Maybe I'm just a noob and should use a different debugger.

Optimize column altering checks

Currently there's a lot of wasted time during startup: the target runs lots of queries to check the existing schema against the schema message it received, in case it needs to make alterations. One reason this is slow is that it iterates over every column.

In doing this, it requests DDL to be generated and then executes it in a serial loop.

An optimization to speed this up would be to move the execution up to the prepare_table method, after the for loop. The prepare_column method would return the DDL instead of executing it, so that at the prepare_table level it can send one large script.

@edgarrmondragon what do you think about this?
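The proposed batching can be sketched as follows; prepare_column and prepare_table mirror the method names above, but the bodies are illustrative stand-ins for the real connector code:

```python
# Generate DDL per column, but execute once per table instead of per column.
def prepare_column(table: str, column: str, desired_type: str) -> str:
    # Return the ALTER statement instead of executing it immediately.
    return f'ALTER TABLE {table} ALTER COLUMN "{column}" SET DATA TYPE {desired_type}'

def prepare_table(table: str, desired_columns: dict[str, str]) -> str:
    # Collect all column alterations and join them into one script,
    # turning N round trips to the warehouse into a single execution.
    statements = [
        prepare_column(table, col, typ) for col, typ in desired_columns.items()
    ]
    return ";\n".join(statements)

script = prepare_table("RAW.EXAMPLE", {"A": "VARCHAR", "B": "NUMBER(28, 4)"})
print(script)
```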

bug: default test `TargetCamelcaseTest` throws IntegrityError

The test throws an IntegrityError. I suspect the camel casing is breaking the merge SQL below: it looks for a lowercase id, while the RECORD sends Id in camel case.

sqlalchemy.exc.IntegrityError: (snowflake.connector.errors.IntegrityError) 100072 (22000): None: NULL result in a non-nullable column
E       [SQL: merge into MELTANOLABS_RAW.TARGET_SNOWFLAKE_1CE306.TESTCAMELCASE d using (select $1:id::VARCHAR as ID, $1:clientname::VARCHAR as CLIENTNAME, $1:_sdc_batched_at::TIMESTAMP_NTZ as _SDC_BATCHED_AT, $1:_sdc_extracted_at::TIMESTAMP_NTZ as _SDC_EXTRACTED_AT, $1:_sdc_received_at::TIMESTAMP_NTZ as _SDC_RECEIVED_AT, $1:_sdc_deleted_at::TIMESTAMP_NTZ as _SDC_DELETED_AT, $1:_sdc_table_version::DECIMAL as _SDC_TABLE_VERSION, $1:_sdc_sequence::DECIMAL as _SDC_SEQUENCE from '@~/target-snowflake/TestCamelcase-f581638e-d5bc-42a5-a8ae-cec1d109b3ea'(file_format => MELTANOLABS_RAW.TARGET_SNOWFLAKE_1CE306."TestCamelcase-f581638e-d5bc-42a5-a8ae-cec1d109b3ea")) s on d."ID" = s."ID" when matched then update set d."ID" = s."ID", d."CLIENTNAME" = s."CLIENTNAME", d."_SDC_BATCHED_AT" = s."_SDC_BATCHED_AT", d."_SDC_EXTRACTED_AT" = s."_SDC_EXTRACTED_AT", d."_SDC_RECEIVED_AT" = s."_SDC_RECEIVED_AT", d."_SDC_DELETED_AT" = s."_SDC_DELETED_AT", d."_SDC_TABLE_VERSION" = s."_SDC_TABLE_VERSION", d."_SDC_SEQUENCE" = s."_SDC_SEQUENCE" when not matched then insert (ID, CLIENTNAME, _SDC_BATCHED_AT, _SDC_EXTRACTED_AT, _SDC_RECEIVED_AT, _SDC_DELETED_AT, _SDC_TABLE_VERSION, _SDC_SEQUENCE) values (s."ID", s."CLIENTNAME", s."_SDC_BATCHED_AT", s."_SDC_EXTRACTED_AT", s."_SDC_RECEIVED_AT", s."_SDC_DELETED_AT", s."_SDC_TABLE_VERSION", s."_SDC_SEQUENCE")]

Support adding the source's schema name to additional field in the target table

I forked the transferwise version of target-snowflake because I needed a way to add the schema name of my MySQL source (tap) to an additional column/property in Snowflake. It would be great to be able to migrate to a new version of this target and still retain this functionality.

More specifically, if I desired to replicate a source MySQL table named schema1.table1 with only 1 column col1, then the destination table might look like destination_database.destination_schema.table1 with the columns col1 and source_schema_name. All the values in source_schema_name would be set to schema1.
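As a possible stopgap under the current SDK, a constant column can be added per stream via stream maps. The fragment below is a hypothetical sketch using the names from the example above; it assumes the SDK's generic stream_maps capability rather than a dedicated setting of this target:

```yaml
# Hypothetical: add a constant source_schema_name column to table1 via
# stream maps; the inner quotes make the value a string literal expression.
plugins:
  loaders:
    - name: target-snowflake
      config:
        stream_maps:
          table1:
            source_schema_name: "'schema1'"
```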

Add integration tests

target-postgres might be a helpful reference https://github.com/MeltanoLabs/target-postgres/blob/main/target_postgres/tests/test_standard_target.py

  • explore using the SDK target test framework
  • add tests to create tables and assert they're as expected
  • cleanup snowflake afterwards
  • test cases:
    • empty schema, create table from scratch
    • existing table, no schema updates
    • existing table, alter schema (add, remove, data type change)
    • duplicate row sent with and without key properties. if key properties it should be updated, without it should be inserted.
    • type mapping edge cases
    • handling reserved words in schema/table/column names
    • handling upper/lower case schema/table/column comparisons
    • lack of permissions handling
    • handling of a file format of the same name existing already. We can't really trust that the configs are correct, so I think it needs to fail.

spike: explore sorting edge cases

There was a discussion in #47 around the best way to deduplicate within a stream when a key property is set. The PPW variant uses the last-arriving record for a PK as the one to merge; taking that behavior as the base case for this target, we implemented the same behavior using sorting in the merge query itself.

As discussed in that PR thread, we should explore better options for deduplicating and fully think through whether there are edge cases that we might be missing with the current implementation.
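For reference, one option worth exploring (a sketch only, with placeholder stage and file-format names) is deduplicating in the staged select itself before the merge, e.g. with Snowflake's QUALIFY clause:

```sql
-- Sketch: keep only the last record per primary key before merging
select $1:id::VARCHAR as ID, $1:_sdc_sequence::DECIMAL as _SDC_SEQUENCE
from '@~/target-snowflake/<stage-path>' (file_format => <file_format>)
qualify row_number() over (
    partition by $1:id order by $1:_sdc_sequence desc
) = 1;
```

This would make the "last record wins" behavior explicit rather than relying on sort order inside the merge statement.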

chore: Upstream breaking changes to SQLConnector - please pin sdk

This PR meltano/sdk#1394 makes some small breaking changes to SQLConnector.

Recommendation: Pin the SDK <=0.19.0 in your pyproject.toml. It looks like you're currently using poetry's caret specifier in a way that will allow updates to happen automatically past 0.19.0; I'd recommend altering that to <=0.19.0.

Please take a look at the PR description for a deeper discussion. Here is a summary of what's changed.

Overview:

  • The SQLAlchemy connection on the SQLConnector is now protected, and belongs only to the SQLConnector or its subclasses. Streams and other objects/functions should stop accessing the SQLConnector's engine or connection.
  • Within the SQLConnector, DB connections are opened in only one way: by using the _connect() method as a context manager.
  • The SQLConnector uses a single engine throughout its lifetime. Connections are opened as needed and closed after each operation.

Specific code changes:

  • _connection is gone altogether. You cannot pass it to SQLConnector and you cannot access it off of the connector. We are not caching or passing around an open DB connection anywhere anymore.
  • create_sqlalchemy_engine and create_sqlalchemy_connection have been altered to use the single cached engine, and are also deprecated. Ideally, subclasses & other objects would stop using those methods.

Suggestions:

  • If you have any database-interacting logic that happens outside of the SQLConnector (e.g. if your SQLStream subclass has a get_records method that gets the SQLConnector.connection and does something with it), please move that logic onto a method on your subclass of SQLConnector and call that method from the Stream/other object.
  • Wherever you're accessing the connection on SQLConnector/a subclass, please do so by using the _connect() method as a context manager. It's directly equivalent to SQLAlchemy.Engine.connect(). Here's an example of its use.
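As a toy illustration of the pattern (a pure-Python stand-in, not the real singer-sdk SQLConnector API): each operation opens a connection via the context manager and the connection is closed on exit, so nothing outside the connector ever holds an open connection.

```python
from contextlib import contextmanager


class DemoConnector:
    """Toy stand-in for the recommended pattern; names are illustrative."""

    def __init__(self):
        self.closed_connections = 0

    @contextmanager
    def _connect(self):
        conn = []  # stands in for a connection from the single cached engine
        try:
            yield conn
        finally:
            # every operation closes its connection when the block exits
            self.closed_connections += 1

    def row_count(self, rows):
        # DB logic lives on the connector; callers never touch engine/connection
        with self._connect() as conn:
            conn.extend(rows)
            return len(conn)
```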

bug: `Invalid character length`

This is logged as something to consider fixing at the SDK level meltano/sdk#1812 but we should fix it here also.

Invalid character length: 4,294,967,295. Must be between 1 and 16,777,216

Set a limit to the max length parameter for varchars to be 16,777,216.
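A minimal sketch of the proposed clamp (the function name is hypothetical; the limit comes from Snowflake's documented VARCHAR maximum):

```python
SNOWFLAKE_MAX_VARCHAR = 16_777_216  # Snowflake's documented VARCHAR limit


def clamp_varchar_length(max_length=None):
    """Clamp a JSON Schema maxLength into Snowflake's valid 1..16,777,216 range."""
    if max_length is None:
        return SNOWFLAKE_MAX_VARCHAR
    return max(1, min(max_length, SNOWFLAKE_MAX_VARCHAR))
```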

Cleaning up the `to_sql_type()` implementation

Originally posted by @aaronsteers in #2 (comment)

Note to our future selves to see what can be improved upstream in the SDK to simplify and remove need to call the private _json_schema_type_check()

Probably SDK based changes, but we can discuss further here without blocking #2.

cc @BuzzCutNorman

Relevant code snippets from this target-snowflake implementation

to_sql_type()

@staticmethod
def to_sql_type(jsonschema_type: dict) -> sqlalchemy.types.TypeEngine:
    """Return a SQLAlchemy type representation of the provided JSON Schema type.

    Uses custom Snowflake types from [snowflake-sqlalchemy](https://github.com/snowflakedb/snowflake-sqlalchemy/blob/main/src/snowflake/sqlalchemy/custom_types.py)

    Args:
        jsonschema_type: The JSON Schema representation of the source type.

    Returns:
        The SQLAlchemy type representation of the data type.
    """
    # start with default implementation
    target_type = SQLConnector.to_sql_type(jsonschema_type)
    # snowflake max and default varchar length
    # https://docs.snowflake.com/en/sql-reference/intro-summary-data-types.html
    maxlength = jsonschema_type.get("maxLength", 16777216)
    # define type maps
    string_submaps = [
        TypeMap(eq, sct.TIMESTAMP_NTZ(), "date-time"),
        TypeMap(contains, sqlalchemy.types.TIME(), "time"),
        TypeMap(eq, sqlalchemy.types.DATE(), "date"),
        TypeMap(eq, sqlalchemy.types.VARCHAR(maxlength), None),
    ]
    type_maps = [
        TypeMap(th._jsonschema_type_check, sct.NUMBER(), ("integer",)),
        TypeMap(th._jsonschema_type_check, sct.VARIANT(), ("object",)),
        TypeMap(th._jsonschema_type_check, sct.VARIANT(), ("array",)),
    ]
    # apply type maps
    if th._jsonschema_type_check(jsonschema_type, ("string",)):
        datelike_type = th.get_datelike_property_type(jsonschema_type)
        target_type = evaluate_typemaps(string_submaps, datelike_type, target_type)
    else:
        target_type = evaluate_typemaps(type_maps, jsonschema_type, target_type)
    return cast(sqlalchemy.types.TypeEngine, target_type)

_adapt_column_type()

Custom implementation will be removed after this merges:

https://github.com/MeltanoLabs/target-snowflake/blob/a3caa38bc66ed223d3b52518e9cdfd5f38d27a17/target_snowflake/sinks.py#L95-L149?plain=true

Custom TypeMap logic:

class TypeMap:
    def __init__(self, operator, map_value, match_value=None):
        self.operator = operator
        self.map_value = map_value
        self.match_value = match_value

    def match(self, compare_value):
        try:
            if self.match_value:
                return self.operator(compare_value, self.match_value)
            return self.operator(compare_value)
        except TypeError:
            return False


def evaluate_typemaps(type_maps, compare_value, unmatched_value):
    for type_map in type_maps:
        if type_map.match(compare_value):
            return type_map.map_value
    return unmatched_value

feat: implement helper init command for creating snowflake objects

This was mentioned in meltano/meltano#2814 (comment) and Office Hours.

The premise is that taps/targets are all CLIs and they already have all the credentials and logic needed to connect to the source or destination so we could extend it to do other operations.

In this case it would be nice to help users initialize the needed Snowflake objects so onboarding is seamless. I'm not 100% sure this is relevant for this target, but in pipelinewise's variant the user needs to create a file format + role + optional stage + apply permissions appropriately. I see in

# create file format in new schema

that we auto-create some stuff, though.

It would be cool to allow 2 options:

  • print SQL (default) - I'm sure most users would prefer to see the SQL before it's run and would likely run it themselves manually. Also, some might require elevated permissions that the connector doesn't have.
  • execute SQL - if the connector has enough permissions it should just execute the SQL.

TypeError: Object of type Decimal is not JSON serializable

External ref: Slack thread on Meltano Slack community

meltano.yaml

# ...
  extractors:
  - name: tap-oracle
    pip_url: git+https://github.com/christianelliott/pipelinewise-tap-oracle.git
# ...
  loaders:
  - name: target-snowflake
    variant: meltanolabs
    pip_url: meltanolabs-target-snowflake
# ...

The error

❯ meltano --environment=dev run tap-oracle target-snowflake
...
2023-07-09T15:36:51.661162Z [info     ] 2023-07-09 17:36:51,661 | INFO     | target-snowflake     | Target sink for 'BUSINESSDATA_PROD-FINANCIAL_TRANSACTION' is full. Draining... cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.491003Z [info     ] Traceback (most recent call last): cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.491140Z [info     ]   File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/bin/target-snowflake", line 8, in <module> cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.491266Z [info     ]     sys.exit(TargetSnowflake.cli()) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.491388Z [info     ]   File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 1157, in __call__ cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.492633Z [info     ]     return self.main(*args, **kwargs) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.492732Z [info     ]   File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 1078, in main cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.492888Z [info     ]     rv = self.invoke(ctx)      cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.492978Z [info     ]   File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 1434, in invoke cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.493094Z [info     ]     return ctx.invoke(self.callback, **ctx.params) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.493165Z [info     ]   File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 783, in invoke cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.493245Z [info     ]     return __callback(*args, **kwargs) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.493310Z [info     ]   File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 549, in invoke cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.493539Z [info     ]     target.listen(file_input)  cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.493606Z [info     ]   File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/io_base.py", line 34, in listen cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.494018Z [info     ]     self._process_lines(file_input) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.494089Z [info     ]   File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 291, in _process_lines cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.494297Z [info     ]     counter = super()._process_lines(file_input) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.494490Z [info     ]   File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/io_base.py", line 93, in _process_lines cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.494551Z [info     ]     self._process_record_message(line_dict) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.494619Z [info     ]   File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 353, in _process_record_message cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.494712Z [info     ]     self.drain_one(sink)       cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.494974Z [info     ]   File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 494, in drain_one cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.495049Z [info     ]     sink.process_batch(draining_status) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.495120Z [info     ]   File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/sinks/sql.py", line 267, in process_batch cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.495330Z [info     ]     self.bulk_insert_records(  cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.495389Z [info     ]   File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/target_snowflake/sinks.py", line 141, in bulk_insert_records cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.495443Z [info     ]     for files in batches:      cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.495495Z [info     ]   File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/batch.py", line 106, in get_batches cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.495545Z [info     ]     gz.writelines(             cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.495592Z [info     ]   File "/Users/janispuris/projects/meltano_poc/project/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/batch.py", line 107, in <genexpr> cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.495641Z [info     ]     (json.dumps(record) + "\n").encode() for record in chunk cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.495825Z [info     ]   File "/Users/janispuris/.pyenv/versions/3.10.11/lib/python3.10/json/__init__.py", line 231, in dumps cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.495973Z [info     ]     return _default_encoder.encode(obj) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.496033Z [info     ]   File "/Users/janispuris/.pyenv/versions/3.10.11/lib/python3.10/json/encoder.py", line 199, in encode cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.496118Z [info     ]     chunks = self.iterencode(o, _one_shot=True) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.496221Z [info     ]   File "/Users/janispuris/.pyenv/versions/3.10.11/lib/python3.10/json/encoder.py", line 257, in iterencode cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.496284Z [info     ]     return _iterencode(o, 0)   cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.496358Z [info     ]   File "/Users/janispuris/.pyenv/versions/3.10.11/lib/python3.10/json/encoder.py", line 179, in default cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.496420Z [info     ]     raise TypeError(f'Object of type {o.__class__.__name__} ' cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.496562Z [info     ] TypeError: Object of type Decimal is not JSON serializable cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.532959Z [info     ] 2023-07-09 17:36:52,532 | INFO     | snowflake.connector.connection | closed cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.533122Z [info     ] 2023-07-09 17:36:52,532 | INFO     | snowflake.connector.connection | No async queries seem to be running, deleting session cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-07-09T15:36:52.606484Z [error    ] [Errno 32] Broken pipe         
...

Might be related to #66
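As a possible workaround direction, the batch serialization could pass a default hook to json.dumps that converts Decimal values before encoding; a minimal sketch (the hook name is illustrative):

```python
import json
from decimal import Decimal


def default(obj):
    """Fallback encoder: render Decimal as float (use str instead to preserve exactness)."""
    if isinstance(obj, Decimal):
        return float(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# mirrors the failing call in singer_sdk/batch.py, but with the hook supplied
line = json.dumps({"AMT": Decimal("4.80")}, default=default)
```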

feat: Support writing full record to a single variant column

target-bigquery provides the option to either denormalize into columns or leave the record as a single JSON column. That could be a useful feature for this target also.

target-bigquery docs:

Denormalized variants indicate data is unpacked during load with a resultant schema in BigQuery based on the tap schema. Non-denormalized means we have a fixed schema which loads all data into an unstructured JSON column. They are both useful patterns. The latter allowing BigQuery to work with schemaless or rapidly changing sources such as MongoDB instantly, while the former is more performant and convenient to start modeling quickly.
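A sketch of what the non-denormalized layout could look like in Snowflake (table and column names are illustrative):

```sql
-- Sketch: full record lands in one VARIANT column instead of typed columns
create table my_stream (
    _sdc_batched_at TIMESTAMP_NTZ,
    data VARIANT
);

-- downstream queries unpack lazily
select data:id::NUMBER as id, data:clientname::VARCHAR as clientname
from my_stream;
```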

bug: default test `TargetSchemaUpdates` fails `column 'A3' already exists`

The test data https://github.com/meltano/sdk/blob/main/singer_sdk/testing/target_test_streams/schema_updates.singer tries to alter the schema of the table multiple times, each time adding a few records. The test is failing because it tries to add a column that already exists. I suspect it's an upper/lower case issue related to comparing what exists.

column 'A3' already exists", 'query': 'ALTER TABLE MELTANOLABS_RAW.TARGET_SNOWFLAKE_685220.TEST_SCHEMA_UPDATES ADD COLUMN a3 BOOLEAN', ...
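A likely direction for the fix is to compare column names case-insensitively before emitting the ALTER, since Snowflake folds unquoted identifiers to upper case; a minimal sketch (the helper name is hypothetical):

```python
def column_exists(name, existing_columns):
    """Case-insensitive existence check: Snowflake upper-cases unquoted identifiers."""
    return name.upper() in {c.upper() for c in existing_columns}
```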

bug: colons in property name not handled

In our DynamoDB stream in the Squared project there are some column names with colons, like tenant_resource_key::project_id, that are causing the merge SQL logic to break because they're unquoted. The query ends up failing with Unsupported data type 'SCHEDULE_NAME' because the double colon is also the cast syntax.

sqlalchemy.exc.ProgrammingError: (snowflake.connector.errors.ProgrammingError) 002040 (42601): 01ace745-0503-27e0-0051-5c830365b15a: SQL compilation error:
Unsupported data type 'SCHEDULE_NAME'.
[SQL: merge into USERDEV_RAW.PNADOLNY_DYNAMODB_NEW.PROJECT_SCHEDULES_TABLE d using (select $1:enabled::BOOLEAN as enabled, $1:deployment_name::schedule_name::VARCHAR as "deployment_name::schedule_name", $1:interval::VARCHAR as interval, $1:tenant_resource_key::project_id::VARCHAR as "tenant_resource_key::project_id", $1:eventbridge_name::VARCHAR as eventbridge_name, $1:_sdc_deleted_at::TIMESTAMP_NTZ as _sdc_deleted_at, $1:_sdc_received_at::TIMESTAMP_NTZ as _sdc_received_at, $1:_sdc_batched_at::TIMESTAMP_NTZ as _sdc_batched_at, $1:_sdc_extracted_at::TIMESTAMP_NTZ as _sdc_extracted_at, $1:_sdc_table_version::DECIMAL as _sdc_table_version, $1:_sdc_sequence::DECIMAL as _sdc_sequence from '@~/target-snowflake/project_schedules_table-52921d9b-6c76-4e0f-beb1-daa92cfa3b98'(file_format => USERDEV_RAW.PNADOLNY_DYNAMODB_NEW."project_schedules_table-52921d9b-6c76-4e0f-beb1-daa92cfa3b98")) s on d.tenant_resource_keyproject_id = s.tenant_resource_keyproject_id and d.deployment_nameschedule_name = s.deployment_nameschedule_name when matched then update set d.enabled = s.enabled, d."deployment_name::schedule_name" = s."deployment_name::schedule_name", d.interval = s.interval, d."tenant_resource_key::project_id" = s."tenant_resource_key::project_id", d.eventbridge_name = s.eventbridge_name, d._sdc_deleted_at = s._sdc_deleted_at, d._sdc_received_at = s._sdc_received_at, d._sdc_batched_at = s._sdc_batched_at, d._sdc_extracted_at = s._sdc_extracted_at, d._sdc_table_version = s._sdc_table_version, d._sdc_sequence = s._sdc_sequence when not matched then insert (enabled, "deployment_name::schedule_name", interval, "tenant_resource_key::project_id", eventbridge_name, _sdc_deleted_at, _sdc_received_at, _sdc_batched_at, _sdc_extracted_at, _sdc_table_version, _sdc_sequence) values (s.enabled, s."deployment_name::schedule_name", s.interval, s."tenant_resource_key::project_id", s.eventbridge_name, s._sdc_deleted_at, s._sdc_received_at, s._sdc_batched_at, s._sdc_extracted_at, 
s._sdc_table_version, s._sdc_sequence)]
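A likely fix is to always quote identifiers when building the merge SQL, so a name containing :: is never parsed as a cast; a minimal sketch (the helper name is hypothetical):

```python
def quote_identifier(name):
    """Double-quote an identifier, escaping embedded quotes, so `a::b` is not read as a cast."""
    return '"' + name.replace('"', '""') + '"'
```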

Sqlalchemy `IntegrityError` if key property is missing from record

The target passes this test by throwing an IntegrityError exception if the record doesn't contain the key property, which is good, but it's relying on Snowflake, or maybe the snowflake-sqlalchemy client specifically, to assert integrity errors based on the PK that was defined in the table. Should we be catching this and handling it ourselves in the target or in the SDK instead? It seems risky to rely on the client/destination to raise this, especially because it's not necessarily enforced by systems like Snowflake.

sqlalchemy.exc.IntegrityError: (snowflake.connector.errors.IntegrityError) 100072 (22000): None: NULL result in a non-nullable column
E       [SQL: merge into MELTANOLABS_RAW.TARGET_SNOWFLAKE_17B5D0.TEST_RECORD_MISSING_KEY_PROPERTY d using (select $1:id::DECIMAL as ID, $1:metric::DECIMAL as METRIC, $1:_sdc_extracted_at::TIMESTAMP_NTZ as _SDC_EXTRACTED_AT, $1:_sdc_received_at::TIMESTAMP_NTZ as _SDC_RECEIVED_AT, $1:_sdc_batched_at::TIMESTAMP_NTZ as _SDC_BATCHED_AT, $1:_sdc_deleted_at::TIMESTAMP_NTZ as _SDC_DELETED_AT, $1:_sdc_table_version::DECIMAL as _SDC_TABLE_VERSION, $1:_sdc_sequence::DECIMAL as _SDC_SEQUENCE from '@~/target-snowflake/test_record_missing_key_property-68fbb0ac-0330-49d1-a859-ed77c52dd6a7'(file_format => MELTANOLABS_RAW.TARGET_SNOWFLAKE_17B5D0."test_record_missing_key_property-68fbb0ac-0330-49d1-a859-ed77c52dd6a7")) s on d."ID" = s."ID" when matched then update set d."ID" = s."ID", d."METRIC" = s."METRIC", d."_SDC_EXTRACTED_AT" = s."_SDC_EXTRACTED_AT", d."_SDC_RECEIVED_AT" = s."_SDC_RECEIVED_AT", d."_SDC_BATCHED_AT" = s."_SDC_BATCHED_AT", d."_SDC_DELETED_AT" = s."_SDC_DELETED_AT", d."_SDC_TABLE_VERSION" = s."_SDC_TABLE_VERSION", d."_SDC_SEQUENCE" = s."_SDC_SEQUENCE" when not matched then insert (ID, METRIC, _SDC_EXTRACTED_AT, _SDC_RECEIVED_AT, _SDC_BATCHED_AT, _SDC_DELETED_AT, _SDC_TABLE_VERSION, _SDC_SEQUENCE) values (s."ID", s."METRIC", s."_SDC_EXTRACTED_AT", s."_SDC_RECEIVED_AT", s."_SDC_BATCHED_AT", s."_SDC_DELETED_AT", s."_SDC_TABLE_VERSION", s."_SDC_SEQUENCE")]
E       (Background on this error at: https://sqlalche.me/e/14/gkpj)

Implementing a Before_run_SQL

Implement a before-run SQL hook that executes SQL commands before sending the data. This would make DROP TABLE commands easier for projects that need them. Today this is not possible using Meltano.
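A hypothetical config sketch of what this could look like (the before_run_sql setting does not exist today; names are illustrative):

```yaml
# meltano.yml (hypothetical; not implemented)
loaders:
- name: target-snowflake
  variant: meltanolabs
  config:
    before_run_sql:
    - DROP TABLE IF EXISTS raw.my_table
```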

bug: COPY logic fails if schema has updates

As described in #52 (comment). The COPY logic has a bug when schema updates are required. The sync fails with an error related to a mismatch in column counts in the copy statement/table definition/internal stage files.

Draft PR up with test case #52 but still needs logic changes to fix them.

Snowflake destination table created with incorrect NUMBER precision

External ref: Slack thread on Meltano Slack community

meltano.yml

# ...
  extractors:
  - name: tap-oracle
    pip_url: git+https://github.com/christianelliott/pipelinewise-tap-oracle.git
# ...
  loaders:
  - name: target-snowflake
    variant: meltanolabs
    pip_url: meltanolabs-target-snowflake
# ...

For example, in the --dump=catalog output, one of the fields in the stream is

            "AMT_EXCL_TAX": {
                "multipleOf": 0.0001,
                "type": [
                    "null",
                    "number"
                ]
            },

but in snowflake it is created as

AMT_EXCL_TAX NUMBER(38,0)

while the correct field would be

AMT_EXCL_TAX NUMBER(34, 4)

Please let me know if there is anything additional I can gather to make this easier to troubleshoot.
I can also try to produce an MRE if absolutely necessary.
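For reference, the scale could presumably be derived from multipleOf when building the NUMBER type; a minimal sketch (the function name is hypothetical):

```python
from decimal import Decimal


def scale_from_multiple_of(multiple_of):
    """Derive a NUMBER scale from JSON Schema's multipleOf (e.g. 0.0001 -> 4)."""
    return max(0, -Decimal(str(multiple_of)).as_tuple().exponent)
```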

bug: default test `TargetDuplicateRecords` failing, records not deduped using key_properties

The test data https://github.com/meltano/sdk/blob/main/singer_sdk/testing/target_test_streams/duplicate_records.singer sends 5 records but only 2 distinct key property IDs, so they should be updating instead of appending. I added an assertion to the test to make sure only 2 records were present in the final table, but there are still 5.

I think in #15 there was an attempt to dedupe using temp tables, maybe that was to fix this issue.

Unexpected error: `AttributeError: 'NoneType' object has no attribute 'storage'` on `prefix = batch_config.storage.prefix or ""`

When running a simple ELT from Slack to Snowflake, I get the following error:

2023-06-01T14:52:08.118395Z [info     ] 2023-06-01 16:52:08,118 | INFO     | target-snowflake     | Target 'target-snowflake' completed reading 4299 lines of input (4297 records, (0 batch manifests, 1 state messages). cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.125457Z [info     ] Traceback (most recent call last): cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.125600Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/bin/target-snowflake", line 8, in <module> cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.125722Z [info     ]     sys.exit(TargetSnowflake.cli()) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.125799Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 1130, in __call__ cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.125861Z [info     ]     return self.main(*args, **kwargs) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.125917Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 1055, in main cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.126391Z [info     ]     rv = self.invoke(ctx)      cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.126451Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 1404, in invoke cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.126953Z [info     ]     return ctx.invoke(self.callback, **ctx.params) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.127037Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/click/core.py", line 760, in invoke cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.127106Z [info     ]     return __callback(*args, **kwargs) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.127707Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 572, in cli cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.127818Z [info     ]     target.listen(file_input)  cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.128521Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/io_base.py", line 35, in listen cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.128606Z [info     ]     self._process_endofpipe()  cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.128678Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 287, in _process_endofpipe cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.129194Z [info     ]     self.drain_all(is_endofpipe=True) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.129266Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 449, in drain_all cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.129802Z [info     ]     self._drain_all(list(self._sinks_active.values()), self.max_parallelism) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.129863Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 482, in _drain_all cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.129918Z [info     ]     Parallel()(delayed(_drain_sink)(sink=sink) for sink in sink_list) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.130603Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/joblib/parallel.py", line 1098, in __call__ cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.130743Z [info     ]     self.retrieve()            cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.131340Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/joblib/parallel.py", line 975, in retrieve cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.131401Z [info     ]     self._output.extend(job.get(timeout=self.timeout)) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.131464Z [info     ]   File "/Users/florian.ernst/.pyenv/versions/3.10.1/lib/python3.10/multiprocessing/pool.py", line 771, in get cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.132069Z [info     ]     raise self._value          cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.132124Z [info     ]   File "/Users/florian.ernst/.pyenv/versions/3.10.1/lib/python3.10/multiprocessing/pool.py", line 125, in worker cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.132721Z [info     ]     result = (True, func(*args, **kwds)) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.132775Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/joblib/_parallel_backends.py", line 620, in __call__ cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.132832Z [info     ]     return self.func(*args, **kwargs) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.133331Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/joblib/parallel.py", line 288, in __call__ cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.133386Z [info     ]     return [func(*args, **kwargs) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.133998Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/joblib/parallel.py", line 288, in <listcomp> cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.134056Z [info     ]     return [func(*args, **kwargs) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.134111Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 479, in _drain_sink cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.134608Z [info     ]     self.drain_one(sink)       cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.134668Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 469, in drain_one cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.135177Z [info     ]     sink.process_batch(draining_status) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.135233Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/singer_sdk/sinks/sql.py", line 253, in process_batch cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.135282Z [info     ]     self.bulk_insert_records(  cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.135847Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/target_snowflake/sinks.py", line 348, in bulk_insert_records cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.135899Z [info     ]     encoding, files = self.get_batches( cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.136618Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/target_snowflake/sinks.py", line 374, in get_batches cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.136709Z [info     ]     prefix = batch_config.storage.prefix or "" cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.136777Z [info     ] AttributeError: 'NoneType' object has no attribute 'storage' cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr

2023-06-01T14:52:08.165845Z [info     ] 2023-06-01 16:52:08,165 | INFO     | snowflake.connector.cursor | query: [ROLLBACK] cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.170986Z [info     ] 2023-06-01 16:52:08,169 | ERROR    | sqlalchemy.pool.impl.QueuePool | Exception during reset or similar cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr

2023-06-01T14:52:08.171085Z [info     ] Traceback (most recent call last): cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.171149Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 763, in _finalize_fairy cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.171208Z [info     ]     fairy._reset(pool, transaction_was_reset) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.171282Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 1038, in _reset cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.171334Z [info     ]     pool._dialect.do_rollback(self) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.171383Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 683, in do_rollback cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.171878Z [info     ]     dbapi_connection.rollback() cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.171929Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/connection.py", line 647, in rollback cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.172428Z [info     ]     self.cursor().execute("ROLLBACK") cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.172476Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/cursor.py", line 801, in execute cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.172969Z [info     ]     ret = self._execute_helper(query, **kwargs) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.173030Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/cursor.py", line 521, in _execute_helper cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.173079Z [info     ]     ret = self._connection.cmd_query( cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.173679Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/connection.py", line 1067, in cmd_query cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.173728Z [info     ]     ret = self.rest.request(   cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.174303Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/network.py", line 477, in request cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.174353Z [info     ]     return self._post_request( cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.174401Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/network.py", line 716, in _post_request cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.174889Z [info     ]     ret = self.fetch(          cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.174938Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/network.py", line 814, in fetch cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.175474Z [info     ]     ret = self._request_exec_wrapper( cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.175523Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/network.py", line 937, in _request_exec_wrapper cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.175578Z [info     ]     raise e                    cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.176179Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/network.py", line 857, in _request_exec_wrapper cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.176237Z [info     ]     return_object = self._request_exec( cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.176937Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/network.py", line 1131, in _request_exec cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.176991Z [info     ]     raise err                  cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.177041Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/network.py", line 1033, in _request_exec cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.177556Z [info     ]     raw_ret = session.request( cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.177604Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/vendored/requests/sessions.py", line 577, in request cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.178420Z [info     ]     settings = self.merge_environment_settings( cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.178498Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/vendored/requests/sessions.py", line 759, in merge_environment_settings cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.178994Z [info     ]     env_proxies = get_environ_proxies(url, no_proxy=no_proxy) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.179050Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/vendored/requests/utils.py", line 825, in get_environ_proxies cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.179794Z [info     ]     if should_bypass_proxies(url, no_proxy=no_proxy): cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.179853Z [info     ]   File "/Users/florian.ernst/Alan/meltano/alan_slack/.meltano/loaders/target-snowflake/venv/lib/python3.10/site-packages/snowflake/connector/vendored/requests/utils.py", line 809, in should_bypass_proxies cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.179906Z [info     ]     bypass = proxy_bypass(parsed.hostname) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.180531Z [info     ]   File "/Users/florian.ernst/.pyenv/versions/3.10.1/lib/python3.10/urllib/request.py", line 2649, in proxy_bypass cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.180596Z [info     ]     return proxy_bypass_macosx_sysconf(host) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.180647Z [info     ]   File "/Users/florian.ernst/.pyenv/versions/3.10.1/lib/python3.10/urllib/request.py", line 2626, in proxy_bypass_macosx_sysconf cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.181135Z [info     ]     return _proxy_bypass_macosx_sysconf(host, proxy_settings) cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.181203Z [info     ]   File "/Users/florian.ernst/.pyenv/versions/3.10.1/lib/python3.10/urllib/request.py", line 2568, in _proxy_bypass_macosx_sysconf cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.181721Z [info     ]     from fnmatch import fnmatch cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.181774Z [info     ] ImportError: sys.meta_path is None, Python is likely shutting down cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.181919Z [info     ] 2023-06-01 16:52:08,170 | INFO     | snowflake.connector.connection | closed cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.182252Z [info     ] 2023-06-01 16:52:08,171 | INFO     | snowflake.connector.connection | No async queries seem to be running, deleting session cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.224127Z [error    ] Loading failed                 code=1 message=2023-06-01 16:52:08,171 | INFO     | snowflake.connector.connection | No async queries seem to be running, deleting session name=meltano run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake

I added some blank lines to clearly separate the three distinct parts of the logs. I think the two interesting lines are:

2023-06-01T14:52:08.136709Z [info     ]     prefix = batch_config.storage.prefix or "" cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr
2023-06-01T14:52:08.136777Z [info     ] AttributeError: 'NoneType' object has no attribute 'storage' cmd_type=loader name=target-snowflake run_id=f0b21644-eafb-41d7-8924-31c7ba04feb4 state_id=2023-06-01T145159--tap-slack--target-snowflake stdio=stderr

Not sure what's wrong here.
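The traceback shows `get_batches` dereferencing `batch_config.storage` without checking that a BATCH config was supplied at all. A minimal sketch of the kind of guard that would avoid the `AttributeError` (the function name mirrors the traceback, but this is an illustration, not the target's actual code):

```python
def resolve_prefix(batch_config):
    """Return the batch storage prefix, tolerating a missing batch config."""
    if batch_config is None or batch_config.storage is None:
        # No BATCH settings were supplied; fall back to an empty prefix
        # (or raise a clear config error instead of an AttributeError).
        return ""
    return batch_config.storage.prefix or ""
```

Alternatively, the target could refuse to take the batch code path entirely when no `batch_config` is present.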

Implement configurable insert vs update setting

Related to meltano/sdk#1783

We should allow users to configure whether every record is inserted in append-only mode or whether records that already exist should be updated. If someone wants updates, they need to have key properties; otherwise the target should raise an exception. The current default in the community seems to be to update and require key properties, so sticking with the status quo seems like a fine path forward.

This target already implements logic to use copy or merge statements depending on whether key properties are present, so we can do both, although meltano/sdk#1819 requires key properties, so it would never use our copy logic.
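The decision described above could be sketched as a small dispatch function. The setting name `load_method` and its values are hypothetical here, not the target's actual config surface:

```python
def choose_load_strategy(load_method: str, key_properties: list) -> str:
    """Return "copy" (plain insert) or "merge" (update existing rows)."""
    if load_method == "append-only":
        # Append-only never needs key properties; always bulk-insert.
        return "copy"
    if load_method == "upsert":
        if not key_properties:
            raise ValueError(
                "load_method=upsert requires key properties to match existing rows"
            )
        return "merge"
    raise ValueError(f"unknown load_method: {load_method}")
```

With this shape, append-only works on key-less streams while upsert fails fast with an actionable error instead of producing a broken MERGE.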

ImportError: cannot import name 'SQLConnector' from 'singer_sdk.sinks'

Traceback (most recent call last):
  File "/Users/pnadolny/Documents/Git/GitHub/pnadolny/hub-utils/hub_utils/test_meltano_project/.meltano/loaders/target-snowflake/venv/bin/target-snowflake", line 5, in <module>
    from target_snowflake.target import TargetSnowflake
  File "/Users/pnadolny/Documents/Git/GitHub/pnadolny/hub-utils/hub_utils/test_meltano_project/.meltano/loaders/target-snowflake/venv/lib/python3.8/site-packages/target_snowflake/target.py", line 8, in <module>
    from target_snowflake.sinks import SnowflakeSink
  File "/Users/pnadolny/Documents/Git/GitHub/pnadolny/hub-utils/hub_utils/test_meltano_project/.meltano/loaders/target-snowflake/venv/lib/python3.8/site-packages/target_snowflake/sinks.py", line 17, in <module>
    from singer_sdk.sinks import SQLConnector, SQLSink
ImportError: cannot import name 'SQLConnector' from 'singer_sdk.sinks' (/Users/pnadolny/Documents/Git/GitHub/pnadolny/hub-utils/hub_utils/test_meltano_project/.meltano/loaders/target-snowflake/venv/lib/python3.8/site-packages/singer_sdk/sinks/__init__.py)
Command '['poetry', 'run', 'meltano', 'invoke', 'target-snowflake', '--help']' returned non-zero exit status 1.
Configuration file exists at /Users/pnadolny/Library/Application Support/pypoetry, reusing this directory.

@kgpayne I was trying to add this to the hub and it threw an error before I could extract the --about metadata; in this case it failed during a --help command.
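The import fails because newer SDK releases no longer export `SQLConnector` from `singer_sdk.sinks` (it appears to live in `singer_sdk.connectors` in later versions). A version-tolerant lookup could be sketched like this; the helper and the module paths are illustrative, so verify them against the installed SDK version:

```python
import importlib

def resolve(name, candidates):
    """Return the first attribute `name` found among candidate module paths."""
    for mod_path in candidates:
        try:
            mod = importlib.import_module(mod_path)
        except ImportError:
            continue
        if hasattr(mod, name):
            return getattr(mod, name)
    raise ImportError(f"{name} not found in any of {candidates}")

# In the target this might look like (module paths are assumptions about
# where the SDK moved the class):
# SQLConnector = resolve("SQLConnector", ["singer_sdk.connectors", "singer_sdk.sinks"])
```

In practice the simpler fix is pinning a compatible `singer-sdk` version and importing from the new location directly.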

Unexpected error: `Object '{schema_name}' already exists` during Snowflake `CREATE SCHEMA`

@Jan Soutusta reported this error in Slack:

2023-03-06T08:03:44.213924Z [info     ] sqlalchemy.exc.ProgrammingError: (snowflake.connector.errors.ProgrammingError) 002002 (42710): 01aac2a3-0001-008b-0001-aed20033302a: SQL compilation error: cmd_type=elb consumer=True name=target-snowflake-sfdc producer=False stdio=stderr string_id=target-snowflake-sfdc
2023-03-06T08:03:44.214106Z [info     ] Object 'SFDC_INPUT_STAGE' already exists. cmd_type=elb consumer=True name=target-snowflake-sfdc producer=False stdio=stderr string_id=target-snowflake-sfdc
2023-03-06T08:03:44.214274Z [info     ] [SQL: CREATE SCHEMA "SFDC_INPUT_STAGE"] cmd_type=elb consumer=True name=target-snowflake-sfdc producer=False stdio=stderr string_id=target-snowflake-sfdc
2023-03-06T08:03:44.214440Z [info     ] (Background on this error at: https://sqlalche.me/e/14/f405) cmd_type=elb consumer=True name=target-snowflake-sfdc producer=False stdio=stderr string_id=target-snowflake-sfdc

https://meltano.slack.com/archives/C01TCRBBJD7/p1678089929508169?thread_ts=1678088654.698399&cid=C01TCRBBJD7

If schema isn't specified, weird stuff happens

While fixing #87 I ran into my schema not getting created automatically. I think we should just make https://github.com/MeltanoLabs/target-snowflake/blob/main/target_snowflake/target.py#L46 required, and probably check somewhere that the value isn't null.

When rolling back my transactions, the target tried to roll back against the DATABASE.None schema, which didn't exist. I think just requiring a schema to be set would help, but there could be other options here!
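The check proposed above could be done once at startup, before any SQL is built. A minimal sketch, assuming the relevant config keys are named `database` and `schema` (the actual key names in the target may differ):

```python
def qualified_schema(config: dict) -> str:
    """Build DATABASE.SCHEMA, failing fast instead of emitting 'DATABASE.None'."""
    database = config.get("database")
    schema = config.get("schema")
    if not database or not schema:
        raise ValueError(
            "Both 'database' and 'schema' must be set; "
            f"got database={database!r}, schema={schema!r}"
        )
    return f"{database}.{schema}"
```

Failing here turns the confusing mid-run `Schema 'MELTANO_DATABASE.NONE' does not exist` error into an immediate, actionable config error.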

Stack trace

joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
  File "E:\code\target-snowflake\target_snowflake\sinks.py", line 178, in insert_batch_files_via_internal_stage
    self.conform_name(
  File "E:\code\target-snowflake\target_snowflake\sinks.py", line 98, in conform_name
    return super().conform_name(name=name, object_type=object_type)
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\singer_sdk\sinks\sql.py", line 159, in conform_name
    name = re.sub(r"[^a-zA-Z0-9_\-\.\s]", "", name)
  File "c:\users\derek\appdata\local\programs\python\python38\lib\re.py", line 210, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1910, in _execute_context
    self.dialect.do_execute(
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\sqlalchemy\engine\default.py", line 736, in do_execute
    cursor.execute(statement, parameters)
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\snowflake\connector\cursor.py", line 910, in execute
    Error.errorhandler_wrapper(self.connection, self, error_class, errvalue)
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\snowflake\connector\errors.py", line 290, in errorhandler_wrapper
    handed_over = Error.hand_to_other_handler(
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\snowflake\connector\errors.py", line 345, in hand_to_other_handler
    cursor.errorhandler(connection, cursor, error_class, error_value)
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\snowflake\connector\errors.py", line 221, in default_errorhandler
    raise error_class(
snowflake.connector.errors.ProgrammingError: 002003 (02000): 01ad9a15-0001-32de-0000-0004428e32e9: SQL compilation error:
Schema 'MELTANO_DATABASE.NONE' does not exist or not authorized.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\joblib\_parallel_backends.py", line 273, in _wrap_func_call
    return func()
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\joblib\parallel.py", line 588, in __call__
    return [func(*args, **kwargs)
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\joblib\parallel.py", line 588, in <listcomp>
    return [func(*args, **kwargs)
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\singer_sdk\target_base.py", line 505, in _drain_sink
    self.drain_one(sink)
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\singer_sdk\target_base.py", line 495, in drain_one
    sink.process_batch(draining_status)
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\singer_sdk\sinks\sql.py", line 267, in process_batch
    self.bulk_insert_records(
  File "E:\code\target-snowflake\target_snowflake\sinks.py", line 142, in bulk_insert_records
    self.insert_batch_files_via_internal_stage(
  File "E:\code\target-snowflake\target_snowflake\sinks.py", line 205, in insert_batch_files_via_internal_stage
    self.connector.drop_file_format(file_format=file_format)
  File "E:\code\target-snowflake\target_snowflake\connector.py", line 451, in drop_file_format
    conn.execute(drop_statement, **kwargs)
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1385, in execute
    return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\sqlalchemy\sql\elements.py", line 334, in _execute_on_connection
    return connection._execute_clauseelement(
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1577, in _execute_clauseelement
    ret = self._execute_context(
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1953, in _execute_context
    self._handle_dbapi_exception(
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\sqlalchemy\engine\base.py", line 2134, in _handle_dbapi_exception
    util.raise_(
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\sqlalchemy\util\compat.py", line 211, in raise_
    raise exception
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1910, in _execute_context
    self.dialect.do_execute(
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\sqlalchemy\engine\default.py", line 736, in do_execute
    cursor.execute(statement, parameters)
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\snowflake\connector\cursor.py", line 910, in execute
    Error.errorhandler_wrapper(self.connection, self, error_class, errvalue)
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\snowflake\connector\errors.py", line 290, in errorhandler_wrapper
    handed_over = Error.hand_to_other_handler(
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\snowflake\connector\errors.py", line 345, in hand_to_other_handler
    cursor.errorhandler(connection, cursor, error_class, error_value)
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\snowflake\connector\errors.py", line 221, in default_errorhandler
    raise error_class(
sqlalchemy.exc.ProgrammingError: (snowflake.connector.errors.ProgrammingError) 002003 (02000): 01ad9a15-0001-32de-0000-0004428e32e9: SQL compilation error:
Schema 'MELTANO_DATABASE.NONE' does not exist or not authorized.
[SQL: drop file format if exists MELTANO_DATABASE.None."animals-5dd839ea-8cd4-44da-b036-5a0be3b679b3"]
(Background on this error at: https://sqlalche.me/e/14/f405)
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "c:\users\derek\appdata\local\programs\python\python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\derek\appdata\local\programs\python\python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\Scripts\target-snowflake.exe\__main__.py", line 7, in <module>
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\click\core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\singer_sdk\target_base.py", line 550, in invoke
    target.listen(file_input)
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\singer_sdk\io_base.py", line 35, in listen
    self._process_endofpipe()
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\singer_sdk\target_base.py", line 310, in _process_endofpipe
    self.drain_all(is_endofpipe=True)
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\singer_sdk\target_base.py", line 475, in drain_all
2023-07-13 11:49:09,424 | INFO     | snowflake.connector.cursor | query execution done
    self._drain_all(list(self._sinks_active.values()), self.max_parallelism)
2023-07-13 11:49:09,425 | INFO     | snowflake.connector.cursor | Number of results in first chunk: 1
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\singer_sdk\target_base.py", line 508, in _drain_all
    Parallel()(delayed(_drain_sink)(sink=sink) for sink in sink_list)
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\joblib\parallel.py", line 1944, in __call__
    return output if self.return_generator else list(output)
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\joblib\parallel.py", line 1587, in _get_outputs
    yield from self._retrieve()
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\joblib\parallel.py", line 1691, in _retrieve
    self._raise_error_fast()
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\joblib\parallel.py", line 1726, in _raise_error_fast
    error_job.get_result(self.timeout)
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\joblib\parallel.py", line 735, in get_result
    return self._return_or_raise()
  File "E:\code\target-snowflake\.meltano\loaders\target-snowflake\venv\lib\site-packages\joblib\parallel.py", line 753, in _return_or_raise
    raise self._result
sqlalchemy.exc.ProgrammingError: (snowflake.connector.errors.ProgrammingError) 002003 (02000): 01ad9a15-0001-32de-0000-0004428e32e9: SQL compilation error:

Throw better error if database not found

If a database is specified that doesn't exist, the target returns this error when trying to create the empty table:

Cannot perform CREATE TABLE. This session does not have a current database. Call 'USE DATABASE', or use a qualified name.

It's not clear from this message that the root cause is a missing database. If the database is found, it's used as the session's default database. We should add a check that raises a clear exception when the configured database isn't found. It doesn't look like we're conforming database names right now, but it's possible that someone will implement that, and some users would then be confused when their pre-conformed database name exists but the post-conformed one does not. So a clear error message that includes the database name is important.
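A minimal sketch of such a check, assuming we can fetch the list of existing databases (e.g. via `SHOW DATABASES`); `MissingDatabaseError` and `check_database_exists` are illustrative names, not part of target-snowflake:

```python
class MissingDatabaseError(Exception):
    """Raised when the configured database is not found in Snowflake."""


def check_database_exists(configured_db: str, existing_dbs: list) -> None:
    """Raise a clear, database-naming error if the configured database is absent.

    The comparison is case-insensitive because Snowflake upper-cases
    unquoted identifiers.
    """
    if configured_db.upper() not in {db.upper() for db in existing_dbs}:
        raise MissingDatabaseError(
            f"Database '{configured_db}' was not found or is not authorized. "
            "Check the 'database' setting (and any name conforming applied to it)."
        )
```

Running this check once at connection setup would surface the missing database by name instead of the generic "no current database" error.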

jsonschema errors - No way to know which stream name failed

2023-09-29T18:56:53.270886Z [info     ] 2023-09-29 18:56:53,270 | INFO     | snowflake.connector.cursor | Number of results in first chunk: 1 cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.271316Z [info     ] Traceback (most recent call last): cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.271668Z [info     ]   File "/project/.meltano/loaders/target-snowflake/venv/bin/target-snowflake", line 8, in <module> cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.272072Z [info     ]     sys.exit(TargetSnowflake.cli()) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.272406Z [info     ]   File "/project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/click/core.py", line 1157, in __call__ cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.272720Z [info     ]     return self.main(*args, **kwargs) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.273074Z [info     ]   File "/project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/click/core.py", line 1078, in main cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.273485Z [info     ]     rv = self.invoke(ctx)      cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.273831Z [info     ]   File "/project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/click/core.py", line 1434, in invoke cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.274157Z [info     ]     return ctx.invoke(self.callback, **ctx.params) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.274471Z [info     ]   File "/project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/click/core.py", line 783, in invoke cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.274805Z [info     ]     return __callback(*args, **kwargs) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.275122Z [info     ]   File "/project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/singer_sdk/target_base.py", line 550, in invoke cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.275445Z [info     ]     target.listen(file_input)  cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.275742Z [info     ]   File "/project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/singer_sdk/io_base.py", line 34, in listen cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.276050Z [info     ]     self._process_lines(file_input) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.276343Z [info     ]   File "/project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/singer_sdk/target_base.py", line 291, in _process_lines cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.276789Z [info     ]     counter = super()._process_lines(file_input) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.277120Z [info     ]   File "/project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/singer_sdk/io_base.py", line 93, in _process_lines cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.277426Z [info     ]     self._process_record_message(line_dict) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.277724Z [info     ]   File "/project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/singer_sdk/target_base.py", line 341, in _process_record_message cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.278017Z [info     ]     sink._validate_and_parse(transformed_record) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.278309Z [info     ]   File "/project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/singer_sdk/sinks/core.py", line 317, in _validate_and_parse cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.278603Z [info     ]     self._validator.validate(record) cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.278914Z [info     ]   File "/project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/jsonschema/validators.py", line 435, in validate cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.279209Z [info     ]     raise error                cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.279504Z [info     ] jsonschema.exceptions.ValidationError: 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' is too long cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.279798Z [info     ]                                cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.280089Z [info     ] Failed validating 'maxLength' in schema['properties']['fieldname']: cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.280392Z [info     ]     {'maxLength': 20, 'type': ['string', 'null']} cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.280697Z [info     ]                                cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.281026Z [info     ] On instance['fieldname']:        cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake
2023-09-29T18:56:53.281332Z [info     ]     'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' cmd_type=elb consumer=True name=target-snowflake producer=False stdio=stderr string_id=target-snowflake

We could wrap the validation call in /project/.meltano/loaders/target-snowflake/venv/lib/python3.9/site-packages/singer_sdk/sinks/core.py with a try/except and output the sink name. Ideally the error would include the stack trace and the jsonschema error, along with the stream_name and the key properties of the failing row (if one exists).

We could do this in the SDK as well 🤷‍♂️
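A sketch of what that wrapper could look like, assuming `validate` is any callable that raises on a bad record (e.g. a bound jsonschema `validator.validate`); `RecordValidationError` and `validate_with_context` are hypothetical names:

```python
class RecordValidationError(Exception):
    """Validation failure annotated with the stream name and key properties."""


def validate_with_context(stream_name, record, validate, key_properties=()):
    """Run `validate(record)`, re-raising any failure with the stream name
    and the record's key-property values, so the failing stream and row
    are identifiable from the log. Chaining with `from e` preserves the
    original traceback and jsonschema error.
    """
    try:
        validate(record)
    except Exception as e:
        keys = {k: record.get(k) for k in key_properties}
        raise RecordValidationError(
            f"Validation failed for stream '{stream_name}' "
            f"(key properties: {keys}): {e}"
        ) from e
```

With this, the `maxLength` failure above would at least name the stream and the primary key of the offending row.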

feat: native Snowflake bulk load for `BATCH` messages

Related:

Spec details:

  1. Use table stages to avoid requiring S3 creds.
  2. Use default encoding of jsonl and gzip to match initial built-in SDK feature spec.
  3. Skip record validation to avoid needing to parse the raw files.
  4. One COPY command per BATCH message's manifest. (Each manifest could contain 1 or many files, depending on the tap implementation.)
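Points 1, 2, and 4 could be sketched as a statement builder like the following; the function name and the assumption that the files have already been PUT to the table stage (`@%<table>`) are illustrative, not the final design:

```python
def copy_statement(table: str, manifest_files: list) -> str:
    """Build one COPY INTO per BATCH manifest, loading every file in the
    manifest from the table stage in a single command. Table stages avoid
    requiring S3 credentials; TYPE = JSON with COMPRESSION = GZIP matches
    the SDK's default jsonl.gz batch encoding, and MATCH_BY_COLUMN_NAME
    maps JSON keys onto table columns.
    """
    files_clause = ", ".join(f"'{f}'" for f in manifest_files)
    return (
        f"COPY INTO {table} "
        f"FROM @%{table} "  # table stage: no external stage or S3 creds
        f"FILES = ({files_clause}) "
        f"FILE_FORMAT = (TYPE = JSON COMPRESSION = GZIP) "
        f"MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE"
    )
```

Skipping record validation (point 3) would happen at the sink level, since the target never parses the raw batch files.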
