Comments (4)
@joeschmid thanks for the kind words! We're always looking to make Target-Redshift better, so we really appreciate questions like this.
There is currently no supported way to do what you're asking. There have been conversations in the past about building up tooling to detect data widths so that we can leverage tighter constraints inside Redshift and avoid penalties for things like TEXT columns everywhere, instead of VARCHAR(20), etc.
There is some work coming down the pipe which will make a number of these improvements simpler in the future, but what the "future" here means is pretty up in the air.
Given this, I don't think the most expedient way for you to resolve your issue is to wait for this feature.
I'd be happy to help walk you through what changes I would expect you'd need to make to get things working if that's useful to you?
from target-redshift.
@AlexanderMann thanks very much for the update and explanation. That all makes sense. If you wouldn't mind walking through the changes to get this scenario working, I'd appreciate it. (And maybe any others who come across similar issues would see the explanation here and it would help them out.)
@joeschmid no problem. So I will start by saying that the way to "get this working" is to fork this repo, and start trying to get what you're after working. I'm also not sure if it'll "work" or end up being a 🐰 🕳
Worth noting, Stitch also doesn't "support" this: https://www.stitchdata.com/docs/destinations/redshift/#data-limits
Integer range
-9223372036854775808 to 9223372036854775807
Integer values outside of this range will be rejected and logged in the _sdc_rejected table.
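To make that limit concrete, here is a minimal sketch of the range check, assuming the bounds are those of a signed 64-bit integer (Redshift's BIGINT). The function name fits_in_bigint is made up for illustration and is not part of target-redshift or Stitch.

```python
# Signed 64-bit bounds, i.e. the range of a Redshift BIGINT column.
BIGINT_MIN = -(2 ** 63)      # -9223372036854775808
BIGINT_MAX = 2 ** 63 - 1     #  9223372036854775807


def fits_in_bigint(value: int) -> bool:
    """Return True if `value` can be stored in a BIGINT column (hypothetical helper)."""
    return BIGINT_MIN <= value <= BIGINT_MAX


if __name__ == "__main__":
    for candidate in (BIGINT_MAX, BIGINT_MAX + 1, BIGINT_MIN):
        print(candidate, fits_in_bigint(candidate))
```

Anything for which this returns False is the kind of value that needs a wider column type than BIGINT.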
Easiest Option
Make all integers NUMERIC(20, 0)
Pros
Prolly be straightforward and simple.
Cons
Column widths will balloon for all integers. Redshift (last I checked) uses the full declared width of a column for every value in that column, whereas PostgreSQL only consumes memory for the width of the data in each row.
Changes
In these lines, you're just going to make a mapping for JSONSchema's integer type to Redshift's NUMERIC(20, 0): https://github.com/datamill-co/target-redshift/blob/master/target_redshift/redshift.py#L97-L118
For more examples of what that'd look like, check in here: https://github.com/datamill-co/target-postgres/blob/master/target_postgres/postgres.py#L806-L870
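As a rough illustration of the kind of mapping meant here, below is a standalone sketch. It is not the code in redshift.py or postgres.py: the names HYPOTHETICAL_TYPE_MAP and sketch_json_schema_to_sql_type, and the non-integer entries in the map, are assumptions made up for this example. Note that NUMERIC takes precision first and scale second, so a 20-digit whole number is NUMERIC(20, 0).

```python
# Hypothetical mapping from a JSON Schema type to a Redshift column type.
# Only the "integer" entry reflects the change discussed above; the other
# entries are placeholders so the sketch is runnable.
HYPOTHETICAL_TYPE_MAP = {
    "integer": "NUMERIC(20, 0)",   # instead of BIGINT
    "number": "DOUBLE PRECISION",
    "boolean": "BOOLEAN",
    "string": "VARCHAR",
}


def sketch_json_schema_to_sql_type(schema: dict) -> str:
    """Return a column type for a simple JSON Schema (illustrative only)."""
    json_type = schema.get("type", "string")
    # JSON Schema allows "type" to be a single string or a list of strings.
    types = [json_type] if isinstance(json_type, str) else list(json_type)
    non_null = [t for t in types if t != "null"]
    base = non_null[0] if non_null else "string"
    sql_type = HYPOTHETICAL_TYPE_MAP.get(base, "VARCHAR")
    if "null" not in types:
        sql_type += " NOT NULL"
    return sql_type


if __name__ == "__main__":
    print(sketch_json_schema_to_sql_type({"type": ["null", "integer"]}))
    print(sketch_json_schema_to_sql_type({"type": "integer"}))
```

In a fork, the equivalent change would go wherever the target translates schema types into column definitions; the reverse mapping (SQL type back to JSON Schema) would presumably need the matching entry too.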
@joeschmid I'm not sure if you resolved this, but a hack (for you and for anyone looking at this issue) would be to create a view where that column is a text/string type, then use a SQL transform to parse it into a custom numeric type after replication.
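One way to read that workaround, sketched under assumptions: the oversized integers are replicated as strings, and a view on top of the loaded table casts the column to a numeric type. The helper below only builds the SQL text; the function name and the table, view, and column names are all hypothetical.

```python
def build_cast_view_sql(view, table, column, other_columns=(), sql_type="NUMERIC(20, 0)"):
    """Build a CREATE VIEW statement that casts one string column to `sql_type`.

    All identifiers are supplied by the caller; nothing here is part of
    target-redshift itself.
    """
    def quote(identifier):
        # Double-quote identifiers and escape embedded double quotes.
        return '"' + identifier.replace('"', '""') + '"'

    select_list = [quote(name) for name in other_columns]
    select_list.append(f"CAST({quote(column)} AS {sql_type}) AS {quote(column)}")
    return (
        f"CREATE OR REPLACE VIEW {quote(view)} AS\n"
        f"SELECT {', '.join(select_list)}\n"
        f"FROM {quote(table)};"
    )


if __name__ == "__main__":
    # Hypothetical names: an "orders" table whose "big_id" column was loaded as text.
    print(build_cast_view_sql("orders_typed", "orders", "big_id", other_columns=["id"]))
```

The generated statement would be run after each replication (or once, if the view is left in place), and downstream queries would read from the view rather than the raw table.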