
paws Athena wrapper request (CLOSED, 24 comments)

DyfanJones commented on June 29, 2024
paws Athena wrapper request

from paws.

Comments (24)

davidkretch commented on June 29, 2024

We have a fix ready for this as well. We'll submit it to CRAN probably within the next day unless more comes up.


davidkretch commented on June 29, 2024

@DyfanJones I'm going to close this issue if it's alright with you -- we've got another issue that covers setting the credentials as you discuss above and we're leaving that one open till we finish with it.


DyfanJones commented on June 29, 2024

@davidkretch no worries :) I am happy to change the name to help prevent confusion. This is why I asked you guys in the first place :D

I will change the package name to noctua (Latin for owl). The owl is the symbol of Athena, which seems fitting :)


davidkretch commented on June 29, 2024

Awesome! Thanks for the shout out! Also, if you want to contact us directly, you can email us at [email protected] and [email protected].


adambanker commented on June 29, 2024

That sounds like a really cool tool and we'd be thrilled if you used Paws for it! Let us know if you have any questions about Paws along the way.


DyfanJones commented on June 29, 2024

Hi all,
From my understanding, the paws package sets credentials through environment variables,
even down to profiles that are set in the .aws directory. For example, you use the environment variable AWS_PROFILE to change profiles.

My question is: if I want to assume a role using profile "x" and then use that new role, do I need to set those temporary
credentials in the environment variables? Or is there a way to feed them directly into the client I am creating, similar to Python's boto3 package, i.e.

Method 1

# set aws profile
Sys.setenv("AWS_PROFILE" = "x")

sts <- paws::sts()
Role <- sts$assume_role(RoleArn = "arn:aws:sts::made_up_arn_role",
                        RoleSessionName = "example_session")

Sys.setenv(AWS_ACCESS_KEY_ID = Role$Credentials$AccessKeyId)
Sys.setenv(AWS_SECRET_ACCESS_KEY = Role$Credentials$SecretAccessKey)
Sys.setenv(AWS_SESSION_TOKEN = Role$Credentials$SessionToken)

# simple example of using the athena application
athena <- paws::athena()
athena$list_named_queries()

OR

Method 2:

# set aws profile
Sys.setenv("AWS_PROFILE" = "x")

sts <- paws::sts()
Role <- sts$assume_role(RoleArn = "arn:aws:sts::made_up_arn_role",
                        RoleSessionName = "example_session")

athena <- paws::athena(AWS_ACCESS_KEY_ID = Role$Credentials$AccessKeyId,
                       AWS_SECRET_ACCESS_KEY = Role$Credentials$SecretAccessKey,
                       AWS_SESSION_TOKEN = Role$Credentials$SessionToken)

# simple example of using the athena application
athena$list_named_queries()


adambanker commented on June 29, 2024

It is interesting that you bring this up, as this is a feature we have been actively working on. In its current state, you would have to use method 1. However, we are getting close to rolling out support for passing in credentials using method 2. We hope to add that ability this week.
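
Since the thread contrasts the two approaches, here is a toy Python sketch of the credential-resolution order Method 2 implies (the `Credentials` type and `resolve_credentials` helper are made up for illustration; this is not paws or boto3 code): credentials passed explicitly to the client win, otherwise the client falls back to the environment variables that Method 1 sets.

```python
import os
from dataclasses import dataclass
from typing import Optional

@dataclass
class Credentials:
    access_key_id: str
    secret_access_key: str
    session_token: Optional[str] = None

def resolve_credentials(explicit: Optional[Credentials] = None) -> Credentials:
    """Toy resolution order: explicit credentials beat environment variables."""
    if explicit is not None:
        # Method 2: temporary credentials handed straight to the client
        return explicit
    # Method 1: fall back to the process environment
    return Credentials(
        access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
        secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
        session_token=os.environ.get("AWS_SESSION_TOKEN"),
    )
```

With this order, calling code never has to mutate the global environment just to use an assumed role, which is the convenience Method 2 asks for.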


DyfanJones commented on June 29, 2024

Thanks for getting back in touch, this is great news. I will keep an eye on your package. Currently I have a template for the initial dbGetQuery wrapper. I will keep the method basic for now and update and adapt it to align with your upcoming changes :)

I will push to GitHub soon; feel free to check out the progress of the wrapper. I am thinking of calling the package paws.athena to reflect the paws package being the driver.


DyfanJones commented on June 29, 2024

Hi all,

Are there any plans to implement upload_file for S3 (in paws.storage), similar to boto3's upload_file method?

import boto3

session = boto3.Session()
s3 = session.resource("s3")

s3.Bucket("somebucket").upload_file(Filename="local/iris.csv", Key="iris.csv")

So for paws something like:

s3 <- paws::s3()

s3$upload_file(Bucket = "some_bucket",
               Filename = "local/iris.csv",
               Key = "iris.csv")

The reason for this request is that I am trying to create a dbWriteTable method and Athena is unable to read raw file types (to my knowledge).

In my other wrapper, I write out csv, tsv, or parquet files, upload them to S3, and then register the DDL in Athena.


adambanker commented on June 29, 2024

You can use the s3$put_object function to upload the csv files to S3. An example of this in action can be found here: S3 Example.


DyfanJones commented on June 29, 2024

Thanks @adambanker, that example did help.


DyfanJones commented on June 29, 2024

Hi all,

I am having difficulty uploading an object to a partitioned file structure using the current method:

S3 <- paws::s3()

# create tempfile
t <- tempfile()
write.table(mtcars, t, sep = ",", row.names = FALSE, quote=FALSE)

# prepare data
t_con <- file(t, "rb")
obj <- readBin(t_con, "raw", n = file.size(x))

# upload data to s3
S3$put_object(Body = obj, Bucket = "mybucket", Key = "mtcars/timestamp=20090420/mtcars.csv")

# close file connection
close(t_con)

Returned error:

Error: SignatureDoesNotMatch: The request signature we calculated does not match the signature you provided. Check your key and signing method.

The reason for creating a partitioned file structure is partitioned tables in Athena; please see the following for more details:
https://docs.aws.amazon.com/athena/latest/ug/partitions.html

Is there another method for creating partitioned file structures in S3 using the paws package?


adambanker commented on June 29, 2024

At first glance, on the line:

obj <- readBin(t_con, "raw", n = file.size(x))

you set n = file.size(x) but x is undefined. Do you get the same error if you change it to:

obj <- readBin(t_con, "raw", n = file.size(t))


DyfanJones commented on June 29, 2024

Hi @adambanker, thanks for your reply. My apologies, there was an error in my example code. However, after changing it I still get the same SignatureDoesNotMatch error when using S3$put_object. After further digging, the AWS docs say the "=" character might require special handling, and might need to be URL encoded or referenced as HEX.

https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html#object-metadata
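
As a quick illustration of that URL-encoding note, Python's standard library shows how the "=" in a partitioned object key gets percent-encoded (this only demonstrates the encoding itself, not how paws.common ultimately resolved the bug):

```python
from urllib.parse import quote

key = "mtcars/timestamp=20090420/mtcars.csv"
# "/" is kept as a path separator (it is in quote's default safe set);
# "=" is not safe, so it becomes %3D
encoded = quote(key, safe="/")
print(encoded)  # mtcars/timestamp%3D20090420/mtcars.csv
```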


davidkretch commented on June 29, 2024

Ah, sorry about that. We'll fix that tonight and have the newest version on CRAN within the next couple days.


DyfanJones commented on June 29, 2024

@davidkretch Perfect. So far I am really impressed with this package and the response time. I should have a working DBI interface into Athena really soon with all the improvements you guys have been making.


DyfanJones commented on June 29, 2024

Hi All,

Sorry to be a pain, but I am coming across another issue. I am trying to delete objects using delete_objects.

Here is some example code:

S3 <- paws::s3()
S3$delete_objects(Bucket = "mybucket",
                  Delete = list(Objects = list(list(Key = "subfolder/file_want_to_delete")),
                                Quiet = F))

The returned error:

$Deleted
list()

$RequestCharged
character(0)

$Errors
$Errors[[1]]
$Errors[[1]]$Key
[1] "subfolder/file_want_to_delete"

$Errors[[1]]$VersionId
character(0)

$Errors[[1]]$Code
[1] "NoSuchVersion"

$Errors[[1]]$Message
[1] "The specified version does not exist."

From my understanding, if a VersionId is not provided then it should delete all versions. However, it looks like it is passing VersionId character(0) and returning the above error.

The reason delete_objects is interesting is that it can be used in combination with list_objects to delete objects by prefix, for example:

S3 <- paws::s3()

bucket <- "mybucket"
content <- S3$list_objects(Bucket = bucket,
                           Prefix = "subfolder/")
content_list <- lapply(content$Contents, function(x) list(Key= x$Key))

S3$delete_objects(Bucket = bucket,
                  Delete = list(Objects = content_list, Quiet = F))
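
One caveat with that list-then-delete pattern: S3's DeleteObjects API accepts at most 1000 keys per request (and list_objects likewise returns at most 1000 keys per page), so longer key lists have to be split into batches. A small Python sketch of just the batching step (the chunk_keys helper name is made up for illustration):

```python
def chunk_keys(keys, batch_size=1000):
    """Split a list of object keys into DeleteObjects-sized batches.

    1000 is the documented per-request key limit for S3 DeleteObjects.
    """
    return [keys[i:i + batch_size] for i in range(0, len(keys), batch_size)]

# Each batch would then become one delete_objects call, e.g.:
#   Delete = {"Objects": [{"Key": k} for k in batch], "Quiet": False}
```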


davidkretch commented on June 29, 2024

@DyfanJones The bug fixes for the S3 key names and deleting multiple objects are now in the version of paws.common on CRAN. paws.common is paws' low-level API interaction package. If you install the latest version of it from CRAN, let us know if you run into further issues.


DyfanJones commented on June 29, 2024

@davidkretch Thanks for the update :) I have run my unit tests and the paws.athena wrapper is working with the update 👍 The good news is that the package can now partition Athena tables with dbWriteTable and create history tables. I have developed an initial way to assume roles (by updating the system variables). When you guys have implemented a method to pass credentials to a paws object, I will update the package accordingly :)


davidkretch commented on June 29, 2024

@DyfanJones Cool, glad to hear it. We'll keep you posted.

Also, about the name -- we think the name paws.athena might be a little confusing, since Paws is already made up of a bunch of different packages, all of which start with "paws." (paws, paws.common, paws.compute, etc.).

I think it would be nice to have a common name pattern for helpful interfaces or abstraction layers to the AWS services, e.g. Athena, but I don't know what that would be yet. I am not one for coming up with good names.


davidkretch commented on June 29, 2024

@DyfanJones Thanks -- that is a cool name.


DyfanJones commented on June 29, 2024

Hi all,
Just wanted to let you both know that the Athena DBI wrapper noctua is now on CRAN. Thanks for your quick responses. I will keep up to date with developments in the paws package and help promote it through my personal blog and R-bloggers.


davidkretch commented on June 29, 2024

Nice! We'll also link to noctua in the paws readme.


DyfanJones commented on June 29, 2024

Just to let you know, here is the blog: https://www.r-bloggers.com/an-amazon-sdk-for-r/ — hope you enjoy the read.

