singer-io / tap-shopify
Singer.io tap for extracting Shopify data
License: GNU Affero General Public License v3.0
I probably misunderstand something about Singer, but many numeric fields are processed as text. For example: order_adjustments.tax_amount, order_adjustments.amount, customers.total_spent. A few of these have been annotated with:
"format": "singer.decimal"
But many have not. Specifically, I have noticed that customers.total_spent comes through to my PostgreSQL endpoint as a text column, which makes it awkward to segment customers by spend.
Is there a reason for these fields to be text? If I create a PR that addresses it, is it more correct to update the type to "numeric" or to update the format with "singer.decimal"?
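For context, here is a minimal sketch of the two options as schema fragments, plus one way a consumer could parse annotated strings. It assumes that "singer.decimal" marks a string-typed field holding a decimal number, as in the annotation quoted above; whether a given target honors that format depends on the target. Note that JSON Schema has no "numeric" type; the corresponding type name is "number". The helper name coerce_decimals is hypothetical.

```python
from decimal import Decimal, InvalidOperation

# Option A: keep the value as a string and annotate it (as some fields already are).
DECIMAL_AS_STRING = {"type": ["null", "string"], "format": "singer.decimal"}

# Option B: declare the value as a JSON number.
DECIMAL_AS_NUMBER = {"type": ["null", "number"]}


def coerce_decimals(record, schema_properties):
    """Return a copy of record with singer.decimal strings parsed to Decimal.

    Fields that are missing, null, or not parseable are left unchanged.
    """
    out = dict(record)
    for name, spec in schema_properties.items():
        if spec.get("format") != "singer.decimal":
            continue
        value = out.get(name)
        if isinstance(value, str):
            try:
                out[name] = Decimal(value)
            except InvalidOperation:
                pass  # leave unparseable text as-is
    return out
```

Option A avoids the float rounding that can come with JSON numbers, which may be why some fields were annotated that way, but that is a guess rather than a documented policy.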
Multiple schemas contain {}, which causes validation to fail.
https://github.com/singer-io/getting-started/blob/master/docs/BEST_PRACTICES.md#schemas
For example, orders.json and definitions.json.
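To locate the offending entries, a small helper along these lines could walk a loaded schema and report every empty object. This is only a sketch: the function name and the "$"-rooted path format are made up for illustration, and it flags any empty object, including an empty "properties" map.

```python
def find_empty_schemas(schema, path="$"):
    """Yield the path of every sub-schema that is an empty object ({})."""
    if isinstance(schema, dict):
        if not schema:
            yield path
            return
        for key, value in schema.items():
            yield from find_empty_schemas(value, f"{path}.{key}")
    elif isinstance(schema, list):
        for index, item in enumerate(schema):
            yield from find_empty_schemas(item, f"{path}[{index}]")
```

It could be run on the result of json.load for each file under tap_shopify/schemas to get a list of paths that need a real type.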
https://help.shopify.com/en/api/graphql-admin-api/reference/object/inventoryitem
"Represents the goods available to be shipped to a customer. It holds essential information about the goods, including SKU and whether it is tracked."
This endpoint can be used (likely in conjunction with inventorylocation) to track various items in inventory, costs associated with them, etc.
I'd like to gauge the interest in accepting a PR for new functionality.
The metafield.value field currently has the following schema:
"value": {
"type": [
"null",
"integer",
"object",
"string"
],
"properties": {}
},
The object type occurs when the type field is what's called a json_string in Shopify's API. It doesn't appear the SQL targets (or at least redshift/postgres) support object in this list of possible types, and they silently drop these values. I'd like a way to keep them as strings (what Shopify returns directly) so they'll still make it to my database.
Would you be amenable to a config flag (default off for backwards compatibility) that would disable current json_string parsing logic, leaving those values as strings?
I'm going to add this for myself, and then I plan on adding a generated column to parse these into JSONB manually in the target db. I have to imagine this is a very common use case that others may be interested in. I'd also prefer not maintaining a fork.
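A minimal sketch of what such a switch might look like follows. The function name and the parse_json_strings parameter are hypothetical and do not reflect the tap's actual code; the sketch also assumes the record exposes the json_string marker under the "type" key, as described above. The parameter is the inverse of the proposed flag: True corresponds to the current parsing behavior, False to the proposed opt-out.

```python
import json


def metafield_value(metafield, parse_json_strings=True):
    """Return the metafield's value, optionally parsing json_string payloads.

    parse_json_strings is a hypothetical setting: True parses json_string
    values into objects, False leaves them as the raw string.
    """
    value = metafield.get("value")
    is_json_string = metafield.get("type") == "json_string"
    if parse_json_strings and is_json_string and isinstance(value, str):
        try:
            return json.loads(value)
        except json.JSONDecodeError:
            return value  # malformed JSON: fall back to the raw string
    return value
```

With parsing disabled, the schema for value could drop "object" from its type list, so SQL targets would see only scalar types.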
I am trying to figure out where the configuration is to limit the request rate within the tap, but I couldn't find it. I see a separate Python file in the Shopify git repository (requests_rate_limit.py) which seems to throttle API requests, but I do not see that file among the tap's files on my server after installing the tap. Below is the error I see
This tap currently uses the 2022-07 API as defined in tap_shopify/__init__.py:
tap-shopify/tap_shopify/__init__.py
Line 28 in edbe7a3
This is throwing warnings in the Shopify admin and will need to be updated. I'm not sure what the policy is for Singer, but the versions of the Shopify API active as of writing this are:
Hi contributors, thanks so much for the current tap.
Are there any plans for adding support for draft orders?
The schema is very similar to that of orders, with a few changes. I can start looking into creating a PR, but I am no Python expert, so I thought I would check here first.
Thanks
There is no relationship created between fulfillments and fulfillment line items (orders__fulfillments and orders__fulfillments__line_items). Consequently, it is impossible to know which line item belongs to which fulfillment.
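One way to preserve the link is to stamp each nested line item with the IDs of its parents before it is flattened. The sketch below assumes the order record nests line_items inside each entry of fulfillments; the function name and the order_id and fulfillment_id column names are hypothetical.

```python
def fulfillment_line_item_rows(order):
    """Yield one flat row per fulfillment line item, keyed to its parents."""
    for fulfillment in order.get("fulfillments") or []:
        for line_item in fulfillment.get("line_items") or []:
            row = dict(line_item)
            # Hypothetical column names that carry the parent keys.
            row["order_id"] = order.get("id")
            row["fulfillment_id"] = fulfillment.get("id")
            yield row
```

With a fulfillment_id on every row, the two tables can be joined in the warehouse without relying on row order.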
https://help.shopify.com/en/api/reference/plus/giftcard
"A gift card is an alternative payment method. Each gift card has a unique code that is entered during checkout."
"GET /admin/api/2019-10/gift_cards.json: Retrieves a list of gift cards."
This endpoint can be used to pull information about gift cards, including available balances and status. This information is not available through the Shopify tap endpoints already implemented. Knowing how much value remains on gift cards is vital for calculating total liabilities.
Within Orders, there are several fields whose schemas are not filled out:
https://github.com/singer-io/tap-shopify/blob/master/tap_shopify/schemas/orders.json#L9-L14
"subtotal_price_set": {},
"total_discounts_set": {},
"total_line_items_price_set": {},
"total_price_set": {},
"total_shipping_price_set": {},
"total_tax_set": {},
There may be more, but these are most obvious.
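As a starting point, the empty entries could be replaced by a shared price-set schema. The sketch below assumes each *_set value carries shop_money and presentment_money objects, each with an amount string and a currency_code; that shape should be checked against the Shopify API version the tap targets. The constant and function names are hypothetical.

```python
import copy

MONEY = {
    "type": ["null", "object"],
    "properties": {
        "amount": {"type": ["null", "string"], "format": "singer.decimal"},
        "currency_code": {"type": ["null", "string"]},
    },
}

PRICE_SET = {
    "type": ["null", "object"],
    "properties": {"shop_money": MONEY, "presentment_money": MONEY},
}


def fill_empty_price_sets(properties):
    """Return a copy of properties with empty *_set schemas replaced by PRICE_SET."""
    filled = {}
    for name, spec in properties.items():
        if name.endswith("_set") and spec == {}:
            filled[name] = copy.deepcopy(PRICE_SET)
        else:
            filled[name] = spec
    return filled
```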
According to this Shopify doc, the API version that this tap uses is out of date and is now throwing errors.
Can this be updated?
Thanks!
Hi, I have been using tap-shopify for some months, but I have had problems for the last few days. I have a large shop (more than 1000 orders per day), and I have seen this error many times:
INFO Filtered paths list: ['admin_graphql_api_id', 'app_id', 'billing_address', 'browser_ip', 'buyer_accepts_marketing', 'cancel_reason', 'cancelled_at', 'cart_token', 'checkout_id', 'checkout_token', 'client_details', 'closed_at', 'confirmed', 'contact_email', 'currency', 'customer_locale', 'device_id', 'discount_applications', 'discount_codes', 'financial_status', 'fulfillment_status', 'fulfillments', 'gateway', 'landing_site', 'landing_site_ref', 'location_id', 'note', 'note_attributes', 'number', 'order_status_url', 'payment_gateway_names', 'phone', 'presentment_currency', 'processed_at', 'processing_method', 'reference', 'referring_site', 'refunds', 'shipping_lines', 'source_identifier', 'source_name', 'source_url', 'subtotal_price', 'subtotal_price_set', 'tags', 'tax_lines', 'taxes_included', 'test', 'token', 'total_discounts_set', 'total_line_items_price', 'total_line_items_price_set', 'total_price', 'total_price_set', 'total_shipping_price_set', 'total_tax', 'total_tax_set', 'total_tip_received', 'total_weight', 'user_id']
WARNING Removed 2 paths during transforms:
customer.accepts_marketing_updated_at
customer.marketing_opt_in_level
WARNING Removed paths list: ['customer.accepts_marketing_updated_at', 'customer.marketing_opt_in_level']
CRITICAL IncompleteRead(1896442 bytes read)
Traceback (most recent call last):
File "/usr/local/lib/python3.6/http/client.py", line 546, in _get_chunk_left
chunk_left = self._read_next_chunk_size()
File "/usr/local/lib/python3.6/http/client.py", line 513, in _read_next_chunk_size
return int(line, 16)
ValueError: invalid literal for int() with base 16: b''
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/http/client.py", line 563, in _readall_chunked
chunk_left = self._get_chunk_left()
File "/usr/local/lib/python3.6/http/client.py", line 548, in _get_chunk_left
raise IncompleteRead(b'')
http.client.IncompleteRead: IncompleteRead(0 bytes read)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/src/app/venv_etl/vtap_shopify/bin/tap-shopify", line 11, in <module>
load_entry_point('tap-shopify', 'console_scripts', 'tap-shopify')()
File "/usr/src/app/venv_etl/vtap_shopify/lib/python3.6/site-packages/singer/utils.py", line 192, in wrapped
return fnc(*args, **kwargs)
File "/usr/src/app/etl/tools/tap-shopify/tap_shopify/__init__.py", line 191, in main
sync()
File "/usr/src/app/etl/tools/tap-shopify/tap_shopify/__init__.py", line 153, in sync
for rec in stream.sync():
File "/usr/src/app/etl/tools/tap-shopify/tap_shopify/streams/base.py", line 191, in sync
for obj in self.get_objects():
File "/usr/src/app/etl/tools/tap-shopify/tap_shopify/streams/base.py", line 152, in get_objects
objects = self.call_api(query_params)
File "/usr/src/app/venv_etl/vtap_shopify/lib/python3.6/site-packages/backoff.py", line 286, in retry
ret = target(*args, **kwargs)
File "/usr/src/app/venv_etl/vtap_shopify/lib/python3.6/site-packages/backoff.py", line 286, in retry
ret = target(*args, **kwargs)
File "/usr/src/app/etl/tools/tap-shopify/tap_shopify/streams/base.py", line 67, in wrapper
return fnc(*args, **kwargs)
File "/usr/src/app/etl/tools/tap-shopify/tap_shopify/streams/base.py", line 119, in call_api
return self.replication_object.find(**query_params)
File "/usr/src/app/venv_etl/vtap_shopify/lib/python3.6/site-packages/pyactiveresource/activeresource.py", line 385, in find
return cls._find_every(from_=from_, **kwargs)
File "/usr/src/app/venv_etl/vtap_shopify/lib/python3.6/site-packages/pyactiveresource/activeresource.py", line 523, in _find_every
return cls._build_list(cls.connection.get(path, cls.headers),
File "/usr/src/app/venv_etl/vtap_shopify/lib/python3.6/site-packages/pyactiveresource/connection.py", line 329, in get
return self.format.decode(self._open('GET', path, headers=headers).body)
File "/usr/src/app/venv_etl/vtap_shopify/lib/python3.6/site-packages/shopify/base.py", line 23, in _open
self.response = super(ShopifyConnection, self)._open(*args, **kwargs)
File "/usr/src/app/venv_etl/vtap_shopify/lib/python3.6/site-packages/pyactiveresource/connection.py", line 291, in _open
response = Response.from_httpresponse(http_response)
File "/usr/src/app/venv_etl/vtap_shopify/lib/python3.6/site-packages/pyactiveresource/connection.py", line 184, in from_httpresponse
return cls(response.code, response.read(),
File "/usr/local/lib/python3.6/http/client.py", line 456, in read
return self._readall_chunked()
File "/usr/local/lib/python3.6/http/client.py", line 570, in _readall_chunked
raise IncompleteRead(b''.join(value))
http.client.IncompleteRead: IncompleteRead(1896442 bytes read)
Do you have any idea? Thank you so much for reading.
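The traceback shows call_api already running inside backoff retry wrappers, yet IncompleteRead still propagated, so it was apparently not among the retried exceptions. Below is a minimal standard-library sketch of retrying on that specific error with exponential delay. The function name and parameters are hypothetical; in the tap itself an additional backoff.on_exception decorator for http.client.IncompleteRead would probably be the more natural fit.

```python
import http.client
import time


def call_with_retries(fetch, max_tries=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(), retrying with exponential delay on IncompleteRead."""
    for attempt in range(1, max_tries + 1):
        try:
            return fetch()
        except http.client.IncompleteRead:
            if attempt == max_tries:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** (attempt - 1))
```

Requesting smaller pages may also reduce the chance of a truncated chunked response, though that is a guess based on the response size in the log.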
accepts_marketing_updated_at is a field available in the Shopify customer API, but it is not included in the current customer schema. Can this field be added to the schema?
https://shopify.dev/docs/admin-api/rest/reference/customers/customer?api[version]=2020-04
I got a JSONDecodeError while trying the command "tap-shopify -c config.json --catalog catalog-file.json". The catalog-file.json was not populated automatically.
Tried to run the tests
nosetests
got
ERROR: Failure: ModuleNotFoundError (No module named 'tap_tester')
e.g. at tests/test_automatic_fields.py:5
I can't find tap_tester on GitHub, PyPI, or anywhere else.
We'd like to instrument every HTTP request to the Shopify API to emit metrics about timing and success/failure.
Wrap all API calls with a singer.metrics.http_request_timer.
That should include at least the call at https://github.com/singer-io/tap-shopify/blob/master/tap_shopify/streams/base.py#L153
Verify behavior by ensuring that existing tests pass and METRICS messages are emitted in the expected places.
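A minimal sketch of the wrapping follows, written so the timer is injected and the helper can be exercised without the Singer library installed. The timed_call name is hypothetical; it only assumes that timer_factory(endpoint) returns a context manager, which is how singer.metrics.http_request_timer is typically used.

```python
def timed_call(timer_factory, endpoint, call, *args, **kwargs):
    """Run call(*args, **kwargs) inside the context manager timer_factory(endpoint)."""
    with timer_factory(endpoint):
        return call(*args, **kwargs)


# Hypothetical use inside the tap's call_api, with singer imported there:
#   timed_call(singer.metrics.http_request_timer, self.name,
#              self.replication_object.find, **query_params)
```

Because the timer is a context manager, it also closes when the wrapped call raises, so failed requests can still be measured.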
The device_type field is tracked within the analytics section of the Shopify website, but it does not appear to be extracted via Stitch to my data warehouse. Is there any plan to add this field to the Stitch extract?
The current version (1.4.0) doesn't support the events endpoint.
Therefore, it is not possible to retrieve a list of events.
I'm unable to get tap-shopify to run a sync - it just outputs "Skipping stream" for everything.
tap-shopify -c config.json -d > catalog.json
tap-shopify -c config.json --catalog catalog.json
INFO Skipping stream: orders
INFO Skipping stream: customers
INFO Skipping stream: custom_collections
INFO Skipping stream: abandoned_checkouts
INFO Skipping stream: products
INFO Skipping stream: transactions
INFO Skipping stream: metafields
INFO Skipping stream: order_refunds
INFO Skipping stream: collects
There's no explanation in either tap-shopify or singer what this means or why it's happening. Followed the instructions in the README but it looks like something important is missing.
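A likely cause, assuming the tap follows the usual Singer convention, is that the catalog written by discovery mode has no streams selected, and a stream is synced only when its stream-level metadata entry (the one with an empty breadcrumb) contains "selected": true. The helper below is a sketch that marks streams as selected in a loaded catalog; the function name is hypothetical.

```python
def select_streams(catalog, stream_names):
    """Mark the named streams as selected in a Singer catalog dict (in place)."""
    for stream in catalog.get("streams", []):
        if stream.get("tap_stream_id") not in stream_names:
            continue
        for entry in stream.get("metadata", []):
            if entry.get("breadcrumb") == []:
                entry.setdefault("metadata", {})["selected"] = True
                break
        else:
            # No stream-level metadata entry yet: add one.
            stream.setdefault("metadata", []).append(
                {"breadcrumb": [], "metadata": {"selected": True}}
            )
    return catalog
```

A typical flow would be to load catalog.json with json.load, call select_streams(catalog, {"orders", "customers"}), write the result back with json.dump, and rerun the tap with --catalog.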
Even though the configuration JSON specifies incremental replication, every run starts from the start_date in config.json and fetches all data up to the present.
"replication_key": "updated_at",
"replication-method": "INCREMENTAL"
Is there an option to specify both a start date and an end date? I can see only the start date option, as below.
If I want to run for only a particular date, what should I do?
{
"start_date": "2020-01-01T00:00:00Z",
"api_key": "shppa_92834792349834293786639234",
"shop": "https://mytestshop.myshopify.com/"
}
Something like this would let us get data between two dates directly:
myshopify.com/admin/api/2019-07/orders.json?created_at_min=2020-10-01&created_at_max=2020-11-01&limit=175
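For illustration, the request above can be built programmatically as follows. This sketches the raw REST call only, not a tap feature: the function name is hypothetical, and the parameter names and the limit value are taken from the URL in this post.

```python
from urllib.parse import urlencode


def orders_window_url(shop, api_version, created_at_min, created_at_max, limit=175):
    """Build an orders.json URL restricted to a created_at window."""
    query = urlencode(
        {
            "created_at_min": created_at_min,
            "created_at_max": created_at_max,
            "limit": limit,
        }
    )
    return f"https://{shop}/admin/api/{api_version}/orders.json?{query}"
```

For example, orders_window_url("mytestshop.myshopify.com", "2019-07", "2020-10-01", "2020-11-01") reproduces the query string shown above.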
This tap doesn't currently retrieve any information on duties.
https://shopify.dev/changelog/duties-are-now-available-on-the-storefront-api