Comments (4)
The temporary table is created in and deleted from the same database as the target table. Because the temp table has a random name, you must have Lake Formation permissions to create, describe, and delete any table in that database.
from aws-sdk-pandas.
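As a rough sketch of what those grants could look like, the snippet below only builds the request payloads for the boto3 Lake Formation `grant_permissions` call. The role ARN and database name are placeholders, and mapping "create, describe and delete" to `CREATE_TABLE`, `DESCRIBE`, and `DROP` is an assumption; your setup may need additional permissions.

```python
from typing import Any, Dict, List


def build_temp_table_grants(principal_arn: str, database: str) -> List[Dict[str, Any]]:
    """Build grant_permissions payloads covering any table in one database."""
    principal = {"DataLakePrincipalIdentifier": principal_arn}
    return [
        {
            # Creating tables is granted on the database itself.
            "Principal": principal,
            "Resource": {"Database": {"Name": database}},
            "Permissions": ["CREATE_TABLE"],
        },
        {
            # TableWildcard targets every table, since the temp table name is random.
            "Principal": principal,
            "Resource": {"Table": {"DatabaseName": database, "TableWildcard": {}}},
            "Permissions": ["DESCRIBE", "DROP"],
        },
    ]


# Placeholder principal and database, for illustration only.
grants = build_temp_table_grants(
    "arn:aws:iam::123456789012:role/example-role", "example_db"
)

# Applying the grants requires AWS credentials and Lake Formation admin rights:
# import boto3
# client = boto3.client("lakeformation")
# for grant in grants:
#     client.grant_permissions(**grant)
```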
@jaidisido Thanks! I finally got what the problem was thanks to your answer. Now I am facing a different problem, but looking at wrangler's source code I think I know what's happening.
Basically, it seems that the temporary table is missing a column. This column is in fact the index of the pandas dataframe I am trying to merge into the Iceberg table.
Check this.
Unlike the final merge operation, which takes into consideration the index parameter passed to the to_iceberg method, the linked invocation of to_parquet does not pass that parameter along. Since to_parquet defaults index to False, the temporary table is written without the index column and does not match the final table in this particular case. This leads to the error I am seeing.
Now, I guess I can reset the dataframe index in order to retrieve the index as a column, but I think it would be nice to fix this. Let me know if my assumption is correct, or if I missed some other line that covers this case!
from aws-sdk-pandas.
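The workaround mentioned above (resetting the index so it becomes a regular column) might look like the sketch below. The dataframe, column names, and the commented-out call arguments are made up for illustration.

```python
import pandas as pd

# Example dataframe whose index carries data that should end up in the table.
df = pd.DataFrame(
    {"value": [10, 20]},
    index=pd.Index(["a", "b"], name="record_id"),
)

# Move the index into an ordinary column so every write step sees it.
df_flat = df.reset_index()
print(list(df_flat.columns))

# The write itself needs AWS access, so it is left commented out.
# Database, table, and temp_path values are placeholders.
# import awswrangler as wr
# wr.athena.to_iceberg(
#     df=df_flat,
#     database="example_db",
#     table="example_table",
#     temp_path="s3://example-bucket/temp/",
#     index=False,
# )
```

With the index already materialized as a column, the temporary table and the target table should see the same set of columns regardless of how index is handled internally.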
The to_iceberg API is already significantly overloaded with parameters. We would prefer not to add yet another parameter, especially since, as you mentioned, you can reset the index before calling the method.
from aws-sdk-pandas.
I understand, but that's not the point I was trying to make. You do not need to add a new parameter; you just have to pass the existing index parameter from the outer method to the inner method. If you'd like, I can fork the repository and open a pull request so that you can see what I mean and evaluate whether it makes sense.
from aws-sdk-pandas.
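To make the pass-through point concrete, here is a toy illustration. The functions are hypothetical stand-ins and are not the actual aws-sdk-pandas internals; they only model which columns an inner write step would see.

```python
from typing import List

import pandas as pd


def write_temp_columns(df: pd.DataFrame, index: bool = False) -> List[str]:
    """Hypothetical inner step: return the columns that would be written."""
    return list(df.reset_index().columns) if index else list(df.columns)


def merge_without_forwarding(df: pd.DataFrame, index: bool = False) -> List[str]:
    """Outer method that accepts index but never hands it to the inner step."""
    return write_temp_columns(df)


def merge_with_forwarding(df: pd.DataFrame, index: bool = False) -> List[str]:
    """Outer method that forwards its existing index argument unchanged."""
    return write_temp_columns(df, index=index)


df = pd.DataFrame(
    {"value": [10, 20]},
    index=pd.Index(["a", "b"], name="record_id"),
)

dropped = merge_without_forwarding(df, index=True)
forwarded = merge_with_forwarding(df, index=True)
print(dropped)
print(forwarded)
```

In this model, only the forwarding variant includes the index column, and the outer signature is the same in both cases, so no new parameter is introduced.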