
Comments (3)

jaidisido commented on May 28, 2024

The wr.athena.read_sql_query API has a pyarrow_additional_kwargs argument which is forwarded to the to_pandas method. If nothing is supplied, some sane defaults are applied.

If you wish to override these defaults, to remove types_mapper for example, you can do something along the lines of:

import awswrangler as wr

data = wr.athena.read_sql_query(
    sql="SELECT id, options FROM my_table",
    database="my-database",
    categories=["options"],
    pyarrow_additional_kwargs={'types_mapper': None},
)

from aws-sdk-pandas.

antbz commented on May 28, 2024

@jaidisido That is a nice suggestion, but it does not work either, because of how _fetch_parquet_result works internally. When you specify pyarrow_additional_kwargs, the categories are never added to the kwargs actually passed on to pyarrow:

if not pyarrow_additional_kwargs:
    pyarrow_additional_kwargs = {}
    if categories:
        pyarrow_additional_kwargs["categories"] = categories

For it to work correctly, you currently need to pass categories inside pyarrow_additional_kwargs as well:

data = wr.athena.read_sql_query(
    sql="SELECT id, options FROM my_table",
    database="my-database",
    pyarrow_additional_kwargs={'types_mapper': None, 'categories': ['options']},
)

I'm not sure if the behaviour in _fetch_parquet_result is intentional or not, but as it stands, the categories parameter effectively does not do what it is supposed to. We should either document this better or find a way to make it compatible by default.


jaidisido commented on May 28, 2024

I can't think of a reason why it's set up that way, so I believe it's just badly indented. #2701 should fix that.
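To make the indentation problem concrete, here is a minimal standalone sketch of the merge logic discussed in this thread. It does not import awswrangler, and it is not the actual change in #2701; the function names merge_kwargs_as_quoted and merge_kwargs_dedented are hypothetical and exist only for illustration. The first function assumes the categories check is nested under the "no kwargs supplied" branch, which is what the reported behaviour suggests; the second shows what de-indenting that check could look like.

```python
from typing import Any, Dict, List, Optional


def merge_kwargs_as_quoted(
    categories: Optional[List[str]],
    pyarrow_additional_kwargs: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Hypothetical mirror of the quoted logic, assuming the nested indentation."""
    if not pyarrow_additional_kwargs:
        pyarrow_additional_kwargs = {}
        # Only reached when the caller supplied no kwargs at all.
        if categories:
            pyarrow_additional_kwargs["categories"] = categories
    return pyarrow_additional_kwargs


def merge_kwargs_dedented(
    categories: Optional[List[str]],
    pyarrow_additional_kwargs: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Hypothetical variant with the categories check moved out one level."""
    # Copying the dict is a choice of this sketch, so the caller's dict is not mutated.
    merged = dict(pyarrow_additional_kwargs or {})
    if categories:
        merged["categories"] = categories
    return merged


user_kwargs = {"types_mapper": None}
print(merge_kwargs_as_quoted(["options"], user_kwargs))
# {'types_mapper': None}
print(merge_kwargs_dedented(["options"], user_kwargs))
# {'types_mapper': None, 'categories': ['options']}
```

With the nested version, categories survive only when pyarrow_additional_kwargs is empty or omitted, which matches the behaviour reported above. In the de-indented sketch, a non-empty categories argument overrides any "categories" key already present in the kwargs; whether the real fix makes the same choice is not stated in this thread.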

