rstudio / plumbertableau

R package for creating Plumber APIs that function as Tableau Analytics Extensions

Home Page: https://rstudio.github.io/plumbertableau/

License: Other

R 85.84% HTML 9.43% CSS 4.57% Rez 0.16%

plumbertableau's Introduction

plumbertableau

plumbertableau lets you call external R code in real time from Tableau workbooks via Tableau Analytics Extensions. You achieve this by writing a plumbertableau extension, which is a Plumber API with some extra annotations — comments prefixed with #*.

library(plumber)
library(plumbertableau)

#* @apiTitle String utilities
#* @apiDescription Simple functions for mutating strings

#* Capitalize incoming text
#* @tableauArg str_value:[character] Strings to be capitalized
#* @tableauReturn [character] A capitalized string(s)
#* @post /capitalize
function(str_value) {
  toupper(str_value)
}

# The Plumber router modifier tableau_extension is required. This object is a
# function that acts as a plumber router modifier. For more details, see the
# Plumber documentation:
# https://www.rplumber.io/articles/annotations.html#plumber-router-modifier
#* @plumber
tableau_extension

plumbertableau extensions are used in Tableau’s calculated fields. Let’s imagine we’ve published our extension to RStudio Connect and have given it the custom URL stringutils. To use our capitalize extension, we’d type the following into a Tableau calculated field, or just copy and paste it from the automatically generated code samples. (In real usage, you’ll probably replace "Hello World" with references to Tableau data.)

SCRIPT_STR("/stringutils/capitalize", "Hello World")

Before you use the extension in Tableau, Tableau needs to be able to access it. plumbertableau integrates seamlessly with RStudio Connect, a commercial publishing platform that enables R developers to easily publish a variety of R content types. Connect lets you host multiple extensions by ensuring that requests from Tableau are passed to the correct extension. It’s also possible to host plumbertableau extensions on your own servers.

Installation

You can install plumbertableau from CRAN or install the latest development version from GitHub.

# From CRAN
install.packages("plumbertableau")

# From GitHub
remotes::install_github("rstudio/plumbertableau")

library(plumbertableau)

FAQ

I thought Tableau already supports R?

Tableau’s current support for R as an analytics extension is built on Rserve. This approach requires configuring Rserve in a separate environment and then passing R code as plain text from Tableau calculated fields to be executed by Rserve.

Why would I use this instead of Rserve?

The approach suggested here allows specific endpoints to be called, rather than requiring the Tableau user to write and submit R code in a plain text field from Tableau. This allows Tableau users to be separate from the extension developers. R developers can build extensions that are then used by Tableau developers who may have no working knowledge of R.

Is RStudio Connect required?

While this package has been designed specifically with RStudio Connect in mind, it also works independently of RStudio Connect.

What are the advantages of RStudio Connect?

RStudio Connect offers a number of advantages as a deployment platform for Tableau Analytics Extensions:

  • Tableau workbooks can only be configured to use a single extension endpoint, which typically limits a workbook to only using one type of extension. RStudio Connect can host both R and Python based extensions, which means that a single Tableau workbook can use both R and Python based extensions hosted on RStudio Connect.
  • R developers can develop extensions in their preferred development environment and then publish to RStudio Connect.
  • Extensions published to RStudio Connect can be secured to only allow access from specific accounts.
  • RStudio Connect natively handles R and Python packages, which allows extensions to seamlessly use different versions of underlying packages without creating conflicts.
  • RStudio Connect processes are sandboxed, which limits the scope of impact the extension can have on the underlying OS.

Why can’t I just write my own Plumber API to function as an analytics extension?

Tableau Analytics Extensions are configured to reach out to two specific endpoints:

  • /info: Information about the extension
  • /evaluate: Execution endpoint for the extension

plumbertableau automatically generates the /info endpoint and reroutes requests to /evaluate to the endpoint defined in the script value of the request body. This allows multiple endpoints to function as extensions, rather than relying on a single extension operating under /evaluate. These features are intended to allow the R developer to easily create Tableau Analytics Extensions as standard Plumber APIs without needing to worry about the lower level implementation.
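
The rerouting described above can be illustrated with a minimal base R sketch. This is not plumbertableau's implementation: `handlers`, `evaluate`, and the request shape are hypothetical stand-ins, and the request body is assumed to have already been parsed from JSON into a list with `script` and `data` fields.

```r
# Hypothetical registry mapping endpoint paths to handler functions.
handlers <- list(
  "/capitalize" = function(str_value) toupper(str_value),
  "/lowercase"  = function(str_value) tolower(str_value)
)

# Sketch of a single /evaluate entry point that dispatches on `script`.
# `body` is assumed to be the already-parsed JSON request body.
evaluate <- function(body) {
  handler <- handlers[[body$script]]
  if (is.null(handler)) {
    stop("No endpoint registered for script: ", body$script)
  }
  # Tableau names its values _arg1, _arg2, ...; hand them over positionally.
  do.call(handler, unname(body$data))
}

request <- list(
  script = "/capitalize",
  data = list(`_arg1` = c("Hello World", "plumber"))
)
result <- evaluate(request)
print(result)
```

Because the lookup is keyed on the `script` value, adding another extension endpoint in this sketch only means adding another entry to `handlers`; the `/evaluate` entry point itself stays unchanged.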

Further Reading

You can read more about plumbertableau at https://rstudio.github.io/plumbertableau/. There, you’ll find more detail about writing plumbertableau extensions, publishing them to RStudio Connect, configuring Tableau, and using your extensions in Tableau.

plumbertableau's People

Contributors

blairj09, jcheng5, sagerb, toph-allen, yihui

plumbertableau's Issues

Correct specification of tableauArg and tableauReturn

Say I have an R function that I wish to turn into an API for Tableau and I'd like to use this (awesome!) package. My function doesn't require any arguments and returns a data frame that contains a variety of types. A toy example is included below.

library(plumber)
library(plumbertableau)

#* @apiTitle An example
#* @apiDescription Simple example function
#* @tableauArg x:[logical] A dummy variable
#* @tableauReturn [integer] Returning some stuff
#* @post /mtcars
function(x) {
  mtcars[1:3, ]
}

# The Plumber router modifier tableau_extension is required
#* @plumber
tableau_extension

When I run this API (through Swagger), I get the following:

[
  {
    "mpg": 21,
    "cyl": 6,
    "disp": 160,
    "hp": 110,
    "drat": 3.9,
    "wt": 2.62,
    "qsec": 16.46,
    "vs": 0,
    "am": 1,
    "gear": 4,
    "carb": 4,
    "_row": "Mazda RX4"
  },
  {
    "mpg": 21,
    "cyl": 6,
    "disp": 160,
    "hp": 110,
    "drat": 3.9,
    "wt": 2.875,
    "qsec": 17.02,
    "vs": 0,
    "am": 1,
    "gear": 4,
    "carb": 4,
    "_row": "Mazda RX4 Wag"
  },
  {
    "mpg": 22.8,
    "cyl": 4,
    "disp": 108,
    "hp": 93,
    "drat": 3.85,
    "wt": 2.32,
    "qsec": 18.61,
    "vs": 1,
    "am": 1,
    "gear": 4,
    "carb": 1,
    "_row": "Datsun 710"
  }
]

This is exactly what I want. Fantastic!

Question 1

Note that I have included tableauArg despite my function not needing any arguments. If I omit this and the argument to my function, I get the following:

Error: Tableau argument and return data types must be specified.

Is there a correct way to omit arguments if they are not needed or do I always have to include a dummy argument like above?

Question 2

Similarly, I must include a data type for tableauReturn. I specify that the return type is an integer, but actually the data frame includes a mixture of integers, doubles, and strings. In Swagger, at least, the response above looks good. How should I specify mixed response data types, like shown in the above example?

Any help would be greatly appreciated and kudos on a stellar package!


Edit:

To follow up on Question 2: it looks like this specifies the SCRIPT_* function that Tableau uses (e.g., SCRIPT_INT in the example above since I specify @tableauReturn [integer]). Given that this is how Tableau and plumbertableau interact, presumably there's no way around this besides returning all data as strings (SCRIPT_STR) and doing the type conversion in Tableau?

Improve experience for mocking Tableau requests

Currently, R developers need to stop running the extension in their existing R session, then use mock_tableau_request() to generate a mock request, copy the output, restart the extension, then input the copied output from mock_tableau_request(). This process is clunky and unintuitive.

Alternatives:

  • An endpoint on the router that is automatically generated that generates a mock request and returns it inside the OpenAPI docs
  • A mechanism (probably an RStudio extension) for running these extensions as a background job (RStudio only) so the main R session remains free to generate the mock Tableau request.

RStudio Connect permissions error

Can we more gracefully report back authentication errors to Tableau from RSC? Right now, if a user tries to access an extension they can't access under the current API key, they see a message like the following in Tableau:

Unexpected Server Error
TableauException: An error occurred while communicating with the Analytics Extension.
2021-07-09 21:35:46.186
(YOjBMfthyrGrLc2DvwkGiwAAAFQ,1:1)

Parse Tableau arguments by name rather than position

There is concern that Tableau requests may, in rare cases, submit data objects whose keys are out of order:

"data": {
  "_arg2": [],
  "_arg1": [],
  ...

Currently, arguments are parsed based on order rather than name. In order to prevent issues with unordered objects, data should be parsed by key values rather than order.
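
A minimal sketch of key-based parsing, assuming the keys always follow the `_arg1`, `_arg2`, ... pattern shown above. `order_tableau_args` is a hypothetical helper name, not a function exported by plumbertableau.

```r
# Sketch: reorder Tableau's `data` object by the numeric suffix of each key.
# `order_tableau_args` is a hypothetical helper, not a plumbertableau function.
order_tableau_args <- function(data) {
  keys <- names(data)
  if (is.null(keys) || !all(grepl("^_arg[0-9]+$", keys))) {
    stop("Expected every key to look like _arg1, _arg2, ...")
  }
  # Sort numerically so that _arg10 comes after _arg2.
  index <- as.integer(sub("^_arg", "", keys))
  data[order(index)]
}

shuffled <- list(
  `_arg2`  = c(10, 20),
  `_arg10` = c("x", "y"),
  `_arg1`  = c("a", "b")
)
ordered <- order_tableau_args(shuffled)
print(names(ordered))
```

Sorting on the numeric suffix rather than on the key as a string matters once there are ten or more arguments, since a plain alphabetical sort would place `_arg10` before `_arg2`.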

Update plumbertableau UI

fastapitableau has an updated UI (listening at /) that needs to be ported over to plumbertableau

Plumber-Tableau App Failure on Connect Marketplace Testing Instances

The plumber-tableau app, which runs successfully on the Dogfood server, fails to operate on the marketplace testing instance. Can this be related to missing dependencies, firewall configurations, or other environmental factors? The error in the logs is not intuitive.

Test app code
library(plumber)
library(plumbertableau)
library(outForest)
library(dplyr)

#* @apiTitle Outlier Detection for Tableau
#* @apiDescription Detect outliers in real-time on Tableau data using a Random Forest

#* Calculate outliers on input data
#* @tableauArg sales:numeric Numeric values representing sales for a given transaction
#* @tableauArg profit:numeric Numeric values representing profit for a given transaction
#* @tableauReturn logical A vector indicating the outlier status of each original observation
#* @post /detect-outliers
function(sales, profit) {
  dat <- tibble(sales, profit)
  out <- outForest(dat)
  outlier_rows <- outliers(out) %>%
    select(row) %>%
    distinct()

  dat %>%
    mutate(row = 1:n(),
           outlier = row %in% outlier_rows$row) %>%
    pull(outlier)
}

#* @plumber
tableau_extension

The same app failed to run on the Marketplace testing instance

Error in the log

2023/12/22 10:45:01 AM: Running plumber API at http://127.0.0.1:45949
2023/12/22 10:45:01 AM: Running swagger Docs at http://127.0.0.1:45949/__docs__/
2023/12/22 10:45:01 AM: Error in run_now(check_time, all = FALSE) : 
2023/12/22 10:45:01 AM:   Not compatible with requested type: [type=list; target=raw].
2023/12/22 10:45:01 AM: Calls:  -> invokeCppCallback
2023/12/22 10:47:00 AM: [rsc-session] Received signal: interrupt
2023/12/22 10:47:00 AM: [rsc-session] Terminating subprocess with interrupt ...
2023/12/22 10:47:01 AM:
2023/12/22 10:47:01 AM:
2023/12/22 10:47:01 AM: Execution halted
2023/12/22 10:47:01 AM: Plumber API exiting ...
2023/12/22 10:47:01 AM: [rsc-session] Terminated subprocess with signal: interrupt

Improve programmatic usage of `tableau_extension()`

Currently, tableau_extension() has been designed to work with standard Plumber decorators. However, it currently doesn't work well if a user wants to build the entire API programmatically:

library(plumber)
library(plumbertableau)

#* @plumber
function(pr) {
  pr %>%
    ... %>%
    tableau_extension()
}

The above doesn't work, and we need to think about how to support this type of development pattern.

The likely solution is something like the following:

library(plumber)
library(plumbertableau)

ext <- tableau_extension()

#* @plumber
function(pr) {
  pr %>%
    ... %>%
    ext()
}

However, this approach also doesn't currently work because no metadata is provided about the Tableau endpoints: the traditional annotation approach (#* tab.arg) isn't in use here.

Benchmarking

Benchmark performance when using RSC + Tableau and document findings.

Provide further details about using `tableau_extension`

User testing has revealed that users are caught off guard by the way tableau_extension is used:

#* @plumber
tableau_extension

This is certainly interesting syntax since there are no parentheses, which would indicate invoking a function. We should more clearly document how this modifier works and point to additional documentation.
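
One way to explain the missing parentheses is that, in R, a bare function name refers to the function object itself, which other code can then call. The base R sketch below illustrates only that language feature. `fake_router`, `add_marker`, and `apply_modifier` are hypothetical names, and nothing here is taken from Plumber's or plumbertableau's internals.

```r
# Hypothetical stand-in for a router: just a list of settings.
fake_router <- list(endpoints = c("/capitalize"))

# A "modifier" is a function that takes a router and returns a modified one.
add_marker <- function(router) {
  router$tableau_ready <- TRUE
  router
}

# Stand-in for whatever consumes the expression under the annotation:
# it receives a function object and calls it with the router.
apply_modifier <- function(modifier, router) {
  stopifnot(is.function(modifier))
  modifier(router)
}

# Passing `add_marker` (no parentheses) hands over the function itself.
modified <- apply_modifier(add_marker, fake_router)
print(modified$tableau_ready)
```

Writing `add_marker()` instead would call the function immediately with no router, which in this sketch fails because the `router` argument is missing.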

Uniquely identify calling users in RSC access logs

In a customer conversation today, the question was raised about how to identify which Tableau user made a request to an analytics extension hosted on RSC. Since all analytics extension requests use the same shared credentials, we'll have to think about how we might do this. It may end up being something that needs to be solved on the RSC side of things.

Best practices for local development

While developing Tableau-compliant APIs, it's beneficial to be able to inspect API requests in real time to determine how best to handle them via Plumber. This can be accomplished using browser() statements in a locally running API. We should consider whether there are other approaches to the local development process that we could streamline or highlight.

FAQ: Calculated Field complains if you attempt to pass direct fields

If you attempt something like this:

SCRIPT_STR("/penguins/species", [Flipper Length Mm],[Body Mass G])

You may get an error like:

all fields must be aggregate or constant

To fix this, wrap each field in an aggregate function such as ATTR(), e.g.:

SCRIPT_STR("/penguins/species", ATTR([Flipper Length Mm]),ATTR([Body Mass G]))


This is likely going to be a very common FAQ that we may need to document.

RStudio Connect deployment

When deploying a Tableau API to RStudio Connect, it's often desired to set the Vanity URL and other attributes. Currently, there's no straightforward way to do this on content deployment. There are a few approaches here:

  1. Create a custom deployment function in this package
  2. Submit updates to the rsconnect package.

It seems that option 2 may be the preferred path.

RStudio IDE Template

Provide an IDE template to help users get started easily. The Loess example may be sufficient here.

This will be used in the RSC jumpstart as well.

Add logging

Use an environment variable to signal log verbosity and handle that request within the package.
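
A rough sketch of what environment-variable-driven verbosity could look like in base R. The variable name `EXAMPLE_LOG_LEVEL`, the level names, and `log_message()` are all assumptions made for illustration; they do not describe how the package actually logs.

```r
# Sketch: gate log output on an environment variable.
# EXAMPLE_LOG_LEVEL and log_message() are hypothetical names.
log_levels <- c(ERROR = 1L, INFO = 2L, DEBUG = 3L)

current_level <- function() {
  level <- toupper(Sys.getenv("EXAMPLE_LOG_LEVEL", unset = "INFO"))
  # Fall back to INFO when the variable holds an unknown value.
  if (!level %in% names(log_levels)) level <- "INFO"
  log_levels[[level]]
}

# Returns TRUE invisibly when the message was emitted, FALSE otherwise.
log_message <- function(level, ...) {
  if (log_levels[[level]] > current_level()) {
    return(invisible(FALSE))
  }
  message(format(Sys.time(), "%Y-%m-%d %H:%M:%S"), " [", level, "] ", ...)
  invisible(TRUE)
}

Sys.setenv(EXAMPLE_LOG_LEVEL = "INFO")
shown <- log_message("INFO", "Router started")
hidden <- log_message("DEBUG", "Request body received")
```

Reading the variable on every call, rather than once at load time, lets the verbosity be changed in a running session without reloading anything.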

Allow developers to specify expected data from Tableau

Since Tableau doesn't send named values in the data object of the request (values are named _arg1 - _argN), it would be helpful to enable developers to easily validate incoming data from Tableau and provide a helpful error message back to the Tableau user.
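
As a sketch of what such validation might look like, the base R helper below checks the count and type of each positional value and then renames the values to developer-facing names. `validate_tableau_data` and the shape of the `expected` specification are hypothetical, chosen only for this example.

```r
# Sketch: check that each positional value from Tableau has the expected type.
# `validate_tableau_data` and the `expected` spec are hypothetical.
validate_tableau_data <- function(data, expected) {
  if (length(data) != length(expected)) {
    stop(sprintf("Expected %d argument(s) but received %d.",
                 length(expected), length(data)))
  }
  checks <- list(
    character = is.character,
    numeric   = is.numeric,
    integer   = is.numeric,   # JSON numbers may arrive as doubles
    logical   = is.logical
  )
  for (i in seq_along(expected)) {
    type <- expected[[i]]
    if (!checks[[type]](data[[i]])) {
      stop(sprintf("Argument %d (%s) should be %s but is %s.",
                   i, names(expected)[i], type, class(data[[i]])[1]))
    }
  }
  # Rename _arg1.._argN to the developer-facing names.
  stats::setNames(data, names(expected))
}

expected <- c(sales = "numeric", profit = "numeric")
ok <- validate_tableau_data(
  list(`_arg1` = c(1, 2), `_arg2` = c(0.5, 0.1)),
  expected
)
print(names(ok))
```

The error text names the position and the developer's own argument name, which is the kind of message that could be relayed back to the Tableau user.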

Warn of unsupported versions of RStudio Connect

The expectation is that plumbertableau will only work with RStudio Connect from a specific release forward. This should be transparent to users and, if possible, users should be warned/notified when trying to use plumbertableau with an unsupported version of RStudio Connect.

OpenAPI documentation `curl` command seems to mix Tableau-style and regular-style invocation

I'm running the stringutils example, and I see this CURL command under "Try it out":

curl -X POST "http://127.0.0.1:8215/lowercase" -H  "accept: */*" -H  "Content-Type: application/json" -d "{\"script\":\"/lowercase\",\"data\":{\"_arg1\":[\"string\"]}}"

Notice that the URL's path component is /lowercase, but the JSON blob also contains the redundant "script": "/lowercase". This doesn't match calls we'd expect from Tableau, which would, unless I'm having a brain fart, be more like:

curl -X POST "http://127.0.0.1:8215/evaluate" -H  "accept: */*" -H  "Content-Type: application/json" -d "{\"script\":\"/lowercase\",\"data\":{\"_arg1\":[\"string\"]}}"

Probably best to go one way or the other? Or am I overthinking this?

rstudioconnect a requirement?

Hi -

Is it possible to run this without rstudioconnect?

I tried to follow the steps in the introduction, but I get the following error when starting up plumber:
Verbose logging is off. To enable it please set the environment variable DEBUGME to include plumbertableau.
Error in route$getFunc() : attempt to apply non-function

Any help would be appreciated.

Nested paths not currently supported on RStudio Connect

Due to how RStudio Connect routes requests from Tableau, nested paths are not currently supported. All endpoints must be at the root path of the router.

In order to properly accommodate this limitation, plumbertableau will be updated to only honor the final entry in the script value sent in a Tableau request. For example, if the Tableau request looks like the following:

{
  "script": "/foo/bar/capitalize",
  "data": {
    "_arg1": ["Hello World"]
  }
}

Then plumbertableau will route the request to /capitalize. This is due to the fact that the above request may reference an API hosted at RStudio Connect at /foo/bar, while the internal API route is /capitalize.
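
The "final entry" rule can be sketched in a few lines of base R. `final_route` is a hypothetical helper name used for illustration, not the function plumbertableau uses.

```r
# Sketch: keep only the last path segment of the `script` value.
# `final_route` is a hypothetical helper name.
final_route <- function(script) {
  segments <- strsplit(script, "/", fixed = TRUE)[[1]]
  # Drop empty pieces produced by leading or trailing slashes.
  segments <- segments[nzchar(segments)]
  if (length(segments) == 0) {
    stop("The script value does not contain a route.")
  }
  paste0("/", segments[length(segments)])
}

print(final_route("/foo/bar/capitalize"))
print(final_route("/capitalize"))
```

Both calls resolve to `/capitalize`, so in this sketch the same router would respond whether or not the content path on the hosting server is prepended to the script value.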

Initial release checklist

The following items should be complete ahead of the initial CRAN release:

Engineering

  • Unit testing
  • GH Actions
  • Comprehensive code review

Documentation

  • pkgdown
  • vignettes / getting started guides
  • RStudio website

Navigate from Tableau workbook back to extension

Today, it's difficult to know how to navigate from a Tableau workbook back to the extension it depends on. Since the calculated field in Tableau only contains the relative path to the extension, there's no good way to know how exactly to find the extension within RStudio Connect. We should think about how to make this easier so that when a Tableau developer inherits a workbook from someone else or revisits an old workbook, it's clear exactly what extension is being used.

Easier access to tableau request body

The request sent by Tableau contains a data object that contains values passed from Tableau. Currently, these values are parsed by Plumber and accessed as req$body$data. However, this requires that every endpoint make use of the req object.

#* @post /foo
function(req, res) {
  print(req$body$data$`_arg1`)
}

Instead, we should parse out the data items and pass them directly to the endpoint function so endpoints can be defined as functions with traditional arguments:

#* @post /foo
function(foo) {
  print(foo)
}

The items in the data object are named _arg1, _arg2, ... _argn. It's imperfect, but the arguments could be parsed and passed into the function by order, rather than by name (since unique names cannot be specified when submitting a request from Tableau).
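
A minimal sketch of that idea in base R, assuming the values arrive in `_arg1`, `_arg2`, ... order. `wrap_handler` and the shape of `req` are hypothetical stand-ins for illustration rather than the package's implementation.

```r
# Sketch: wrap a plain function so it can be fed a Tableau-style request.
# `wrap_handler` and the `req` list are hypothetical stand-ins.
wrap_handler <- function(fn) {
  function(req) {
    args <- unname(req$body$data)
    n_expected <- length(formals(fn))
    if (length(args) != n_expected) {
      stop(sprintf("Endpoint expects %d value(s) but Tableau sent %d.",
                   n_expected, length(args)))
    }
    # Positional matching: _arg1 -> first formal, _arg2 -> second, ...
    do.call(fn, args)
  }
}

paste_names <- wrap_handler(function(first, last) paste(first, last))
req <- list(body = list(data = list(
  `_arg1` = c("Ada", "Alan"),
  `_arg2` = c("Lovelace", "Turing")
)))
full <- paste_names(req)
print(full)
```

The wrapped function never touches `req`, so the endpoint author writes an ordinary R function with named arguments while the wrapper handles the translation from Tableau's positional keys.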
