
pbipy's Introduction

pbipy


pbipy is a Python library for interacting with the Power BI Rest API. It aims to simplify working with the Power BI Rest API and to support programmatic administration of Power BI in Python.

pbipy supports operations for Apps, Dataflows, Datasets, Reports, and Workspaces (Groups), allowing users to perform actions on their PowerBI instance using Python.

Installation

pip install pbipy

Or to install the latest development code:

pip install git+https://github.com/andrewvillazon/pbipy

Getting Started: Authentication

To use pbipy you'll first need to acquire a bearer_token.

How do I get a bearer_token?

To acquire a bearer_token you'll need to authenticate against your Registered Azure Power BI App. Registering is the first step in turning on the Power BI Rest API, so from here on it's assumed your Power BI Rest API is up and running.

To authenticate against the Registered App, Microsoft provides the MSAL and azure-identity Python libraries. These libraries support different ways of acquiring a bearer_token, and which one to use will depend on how your cloud/tenant is configured.

Because there are multiple ways to acquire the token, pbipy leaves it up to the user to do this in the way that suits them, rather than directly handling authentication (of course, this might change in future).

This README doesn't cover authentication in detail, however, these are some helpful resources that look at acquiring a bearer_token in the context of Power BI:

The example below uses the msal library to get a bearer_token.

import msal


#  msal auth setup
def acquire_bearer_token(username, password, azure_tenant_id, client_id, scopes):
    app = msal.PublicClientApplication(client_id, authority=azure_tenant_id)
    result = app.acquire_token_by_username_password(username, password, scopes)
    return result["access_token"]


bearer_token = acquire_bearer_token(
    username="your-username",
    password="your-password",
    azure_tenant_id="https://login.microsoftonline.com/your-azure-tenant-id",
    client_id="your-pbi-client-id",
    scopes=["https://analysis.windows.net/powerbi/api/.default"],
)
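
When authentication fails, the dictionary returned by msal carries `error` and `error_description` keys instead of `access_token`, so indexing it directly raises a bare KeyError. A minimal sketch of a more defensive extraction is shown below; `extract_access_token` is a hypothetical helper for illustration and is not part of pbipy or msal.

```python
def extract_access_token(result):
    """Return the bearer token from an msal result dict, or raise with the reason."""
    if "access_token" in result:
        return result["access_token"]

    # msal reports failures through these keys rather than raising an exception.
    error = result.get("error", "unknown_error")
    description = result.get("error_description", "no description provided")
    raise RuntimeError(f"Could not acquire token: {error}: {description}")
```

Inside acquire_bearer_token, the last line could then become `return extract_access_token(result)`.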

The code that follows assumes you've authenticated and acquired your bearer_token.

Usage

Start by creating the PowerBI() client. Interactions with the Power BI Rest API go through this object.

from pbipy import PowerBI

pbi = PowerBI(bearer_token)

To interact with the API, simply call the relevant method from the client.

# Grab the datasets from a workspace

pbi.datasets(group="f089354e-8366-4e18-aea3-4cb4a3a50b48")

pbipy converts API responses into regular Python objects, with snake case included! 🐍🐍

sales = pbi.dataset("cfafbeb1-8037-4d0c-896e-a46fb27ff229")

print(type(sales))
print(hasattr(sales, "configured_by"))

# <class 'pbipy.resources.Dataset'>
# True
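
To illustrate what the snake case conversion means for keys such as `configuredBy`, here is a small sketch. The helper names `to_snake_case` and `snake_case_keys` are hypothetical, and this is not necessarily how pbipy implements the conversion internally.

```python
import re

# Split at lower-to-upper boundaries and before the last capital of an acronym run.
_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")


def to_snake_case(name):
    """Convert a camelCase key such as 'configuredBy' to 'configured_by'."""
    return _BOUNDARY.sub("_", name).lower()


def snake_case_keys(raw):
    """Return a copy of a flat dict with every key converted to snake case."""
    return {to_snake_case(key): value for key, value in raw.items()}


converted = snake_case_keys({"configuredBy": "someone", "addRowsAPIEnabled": False})
print(converted)
# {'configured_by': 'someone', 'add_rows_api_enabled': False}
```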

Most methods take in an object id...

dataset = pbi.dataset(
    id="cfafbeb1-8037-4d0c-896e-a46fb27ff229",
    group="a2f89923-421a-464e-bf4c-25eab39bb09f"
)

... or just pass in the object itself.

group = pbi.group("a2f89923-421a-464e-bf4c-25eab39bb09f")

dataset = pbi.dataset(
    "cfafbeb1-8037-4d0c-896e-a46fb27ff229",
    group=group,
)

If you need to access the raw json representation, this is supported too.

sales = pbi.dataset("cfafbeb1-8037-4d0c-896e-a46fb27ff229")

print(sales.raw)

# {
#   "id": "cfafbeb1-8037-4d0c-896e-a46fb27ff229",
#   "name": "SalesMarketing",
#   "addRowsAPIEnabled": False,
#   "configuredBy": "[email protected]",
#   ...
# }

Example: Working with Datasets

Let's see how pbipy works by performing some operations on a Dataset.

First, we initialize our client.

from pbipy import PowerBI

pbi = PowerBI(bearer_token)

Now that we've got a client, we can load a Dataset from the API. To load a Dataset, we call the dataset() method with an id and group argument. In the Power BI Rest API, Group and Workspace are synonymous and are used interchangeably.

sales = pbi.dataset(
    id="cfafbeb1-8037-4d0c-896e-a46fb27ff229",
    group="f089354e-8366-4e18-aea3-4cb4a3a50b48",
)

print(sales)

# <Dataset id='cfafbeb1-8037-4d0c-896e-a46fb27ff229', name='SalesMarketing', ...>

Dataset not updating? Let's look at the Refresh History.

We call the refresh_history() method on our Dataset. Easy.

refresh_history = sales.refresh_history()

for entry in refresh_history:
    print(entry)

# {"refreshType":"ViaApi", "startTime":"2017-06-13T09:25:43.153Z", "status": "Completed" ...}
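
Once the history is loaded, plain Python is enough to answer questions like "did anything fail?" or "what was the most recent run?". The sketch below assumes each entry behaves like a dict with the `startTime` and `status` keys shown above, and that a failed run is reported with the status "Failed". The helper names are hypothetical and the second sample entry is made up for illustration.

```python
def failed_refreshes(entries):
    """Return only the entries whose status is 'Failed'."""
    return [entry for entry in entries if entry.get("status") == "Failed"]


def latest_refresh(entries):
    """Return the most recent entry; ISO 8601 timestamps sort lexicographically."""
    return max(entries, key=lambda entry: entry["startTime"], default=None)


history = [
    {"refreshType": "ViaApi", "startTime": "2017-06-13T09:25:43.153Z", "status": "Completed"},
    # Illustrative entry, not taken from a real tenant.
    {"refreshType": "ViaApi", "startTime": "2017-06-14T09:25:43.153Z", "status": "Failed"},
]

print(latest_refresh(history)["status"])
print(len(failed_refreshes(history)))
```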

Need to kick off a refresh? That's easy too.

sales.refresh()

How about adding some user permissions to our Dataset? Just call the add_user() method with the User's details and permissions.

# Give John 'Read' access on the dataset
sales.add_user("[email protected]", "User", "Read")

Lastly, if we're feeling adventurous, we can execute DAX against a Dataset and use the results in Python.

dxq_result = sales.execute_queries("EVALUATE VALUES(MyTable)")
print(dxq_result)

# {
#   "results": [
#     {
#       "tables": [
#         {
#           "rows": [
#             {
#               "MyTable[Year]": 2010,
#               "MyTable[Quarter]": "Q1"
#             },
# ...
# }
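
The nested result can be flattened into a list of plain row dicts for further use. This is a sketch that assumes the response has the shape printed above; `rows_from_result` and `strip_table_prefix` are hypothetical helpers, not pbipy functions.

```python
import re


def strip_table_prefix(column):
    """Turn 'MyTable[Year]' into 'Year'; leave other names untouched."""
    match = re.fullmatch(r".*\[(.+)\]", column)
    return match.group(1) if match else column


def rows_from_result(result, table_index=0):
    """Pull the rows of one table out of the first query result."""
    rows = result["results"][0]["tables"][table_index]["rows"]
    return [
        {strip_table_prefix(column): value for column, value in row.items()}
        for row in rows
    ]


sample = {
    "results": [
        {"tables": [{"rows": [{"MyTable[Year]": 2010, "MyTable[Quarter]": "Q1"}]}]}
    ]
}

print(rows_from_result(sample))
# [{'Year': 2010, 'Quarter': 'Q1'}]
```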

Example: Working with the Admin object

pbipy also supports Administrator Operations, specialized operations available to users with Power BI Admin rights. Let's see how we can use these.

First, we need to initialize our client. Then we call the admin method and initialize an Admin object.

from pbipy import PowerBI

pbi = PowerBI(bearer_token)
admin = pbi.admin()

Need to review who has access to a report? We can call the report_users method.

users = admin.report_users("5b218778-e7a5-4d73-8187-f10824047715")
print(users[0])

# {"displayName": "John Nick", "emailAddress": "[email protected]", ...}

What about understanding User activity on your Power BI tenant?

from datetime import datetime

start_dtm = datetime(2019, 8, 31, 0, 0, 0)
end_dtm = datetime(2019, 8, 31, 23, 59, 59)

activity_events = admin.activity_events(start_dtm, end_dtm)

print(activity_events)

# [
#   {
#       "Id": "41ce06d1", 
#       "CreationTime": "2019-08-13T07:55:15", 
#       "Operation": "ViewReport", 
#       ...
#   },
#   {
#       "Id": "c632aa64", 
#       "CreationTime": "2019-08-13T07:55:10", 
#       "Operation": "GetSnapshots", 
#       ...
#   }
# ]
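
A list of events like this is easy to summarize with the standard library. The sketch below assumes each event is a dict with an `Operation` key, as in the output above; `operations_summary` is a hypothetical helper.

```python
from collections import Counter


def operations_summary(events):
    """Count how often each operation appears in a list of activity events."""
    return Counter(event.get("Operation", "Unknown") for event in events)


events = [
    {"Id": "41ce06d1", "CreationTime": "2019-08-13T07:55:15", "Operation": "ViewReport"},
    {"Id": "c632aa64", "CreationTime": "2019-08-13T07:55:10", "Operation": "GetSnapshots"},
]

for operation, count in operations_summary(events).most_common():
    print(operation, count)
```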

More examples

Datasets in a Workspace

datasets = pbi.datasets(group="f089354e-8366-4e18-aea3-4cb4a3a50b48")

for dataset in datasets:
    print(dataset)

# <Dataset id='cfafbeb1-8037-4d0c-896e-a46fb27ff229', ...>
# <Dataset id='f7fc6510-e151-42a3-850b-d0805a391db0', ...>

List Workspaces

groups = pbi.groups()

for group in groups:
    print(group)

# <Group id='a2f89923-421a-464e-bf4c-25eab39bb09f', name='contoso'>
# <Group id='3d9b93c6-7b6d-4801-a491-1738910904fd', name='marketing'>

Create a Workspace

group = pbi.create_group("contoso")
print(group)

# <Group id='a2f89923-421a-464e-bf4c-25eab39bb09f', name='contoso'>

Users and their access

group = pbi.group("a2f89923-421a-464e-bf4c-25eab39bb09f")
users = group.users()

for user in users:
    print(user)

# {"identifier": "[email protected]", "groupUserAccessRight": "Admin", ... }
# {"identifier": "[email protected]", "groupUserAccessRight": "Member", ... }

Power BI Rest API Operations

pbipy methods wrap around the Operations described in the Power BI Rest API Reference:

Power BI REST APIs for embedded analytics and automation - Power BI REST API

What's implemented?

Most of the core operations on Datasets, Workspaces (Groups), Reports, Apps, and Dataflows are implemented. Given the many available endpoints, not everything is covered by pbipy, so expect a few features to be missing.

If an operation is missing and you think it'd be useful, feel free to suggest it on the Issues tab.

| PowerBI Component   | Progress | Notes |
|---------------------|----------|-------|
| Datasets            | Done     |       |
| Groups (Workspaces) | Done     |       |
| Reports             | Done     |       |
| Apps                | Done     |       |
| Dataflows           | Done     |       |
| Admin Operations    | Done     | Implements operations related to Datasets, Groups, Reports, Apps, and Dataflows only. |
| Dashboards          | Todo     |       |
| Everything else     | Backlog  |       |

Contributing

pbipy is an open source project. Contributions such as bug reports, fixes, documentation or docstrings, enhancements, and ideas are welcome. pbipy uses GitHub to host code, track issues, record feature requests, and accept pull requests.

View CONTRIBUTING.md to learn more about contributing.

Acknowledgements

The design of this library was heavily inspired by (basically copied) the pycontribs/jira library. It also borrows elements of cmberryau's pypowerbi wrapper.

Thank you to all the contributors to these libraries for the great examples of what an API wrapper can be.

pbipy's People

Contributors

andrewvillazon, c-roensholt, jeromerg, lgrosjean


pbipy's Issues

Add support for passing in a Group object to PowerBI.dataset()

Some methods should be able to support passing in an object as opposed to passing in an Id.

Current:

group = pbi.group("a2f89923-421a-464e-bf4c-25eab39bb09f")
dataset = pbi.dataset("cfafbeb1-8037-4d0c-896e-a46fb27ff229", group.id)

Proposed:

group = pbi.group("a2f89923-421a-464e-bf4c-25eab39bb09f")
dataset = pbi.dataset("cfafbeb1-8037-4d0c-896e-a46fb27ff229", group)
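
One common way to accept either an id or an object is to normalize the argument at the start of the method. The sketch below is only an illustration of that idea; `resolve_id` and `ExampleGroup` are hypothetical names and do not describe pbipy's actual implementation.

```python
class ExampleGroup:
    """Hypothetical stand-in for a resource object that exposes an id."""

    def __init__(self, id):
        self.id = id


def resolve_id(obj_or_id):
    """Return the id whether the caller passed an id string or a resource object."""
    if isinstance(obj_or_id, str):
        return obj_or_id
    return obj_or_id.id


group = ExampleGroup("a2f89923-421a-464e-bf4c-25eab39bb09f")
print(resolve_id(group) == resolve_id("a2f89923-421a-464e-bf4c-25eab39bb09f"))
# True
```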

Feature Request: Get Embed Token for Reports

It would be useful if the Report class could support the https://api.powerbi.com/v1.0/myorg/groups/{GROUP ID}/reports/{REPORT ID}/GenerateToken API endpoint, so that we could generate an embed token for a given Report.

Enhancement Request: Easy way to override default BASE_URL

Thank you for your library. It is proving useful.

The initial difficulty was figuring out that settings.py was setting BASE_URL to the wrong value for our environment. Once I overwrote the BASE_URL value in the file, the library worked for me. It would be nice to have a parameter for overriding BASE_URL when creating the PowerBI client instead.

Thanks.

Get report visuals data results

Is it possible to get the data behind the visual objects of a report? I have already created the report with pbi.report() and that works fine, but I need to write a script that gets the data for a visual object of that specific report.

Move response handling from RequestsMixin

To minimize repetition of api response handling (response parsing, error handling, etc), the handling was provided through a mixin, utils.RequestsMixin. Classes requiring the response handling inherit from the Mixin. The unintended side effect is that subclasses of Resource can call methods belonging to the Mixin, e.g., delete(), patch(), etc. However, as noted in #12, calling them this way raises an error. This creates confusion, as these methods were never intended to be called directly.

There are a couple of options:

  • Put the response handling on powerbi.PowerBI (power bi client) and then make the client a dependency for classes that need to interact with the api. Remove the session as a dependency for Resource types and replace with the client
  • Move the utils.RequestsMixin methods into utils and reference where required

List of measures with expression

Is pbipy capable of returning the measure names along with the corresponding DAX expressions for a Power BI dataset? If yes, please guide me through the steps.

Add convenience method that handles the export process

In the pbi api, the process to export a Report into another format involves:

  • Triggering an export job
  • Polling the export job status until complete
  • Downloading the file.

A method that wraps this process should be added to the Report object, e.g. report.export(format="pdf", save_to="/some_dir", filename="my_report")
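
The polling step in the middle of that process could look roughly like the sketch below. Everything here is an assumption for illustration: `wait_for_export` is a hypothetical helper, `get_status` stands for whatever callable fetches the export job status, and the status strings "Succeeded" and "Failed" are assumed terminal values.

```python
import time


def wait_for_export(get_status, poll_interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll get_status() until the export job finishes, fails, or times out."""
    waited = 0.0
    while True:
        status = get_status()
        if status == "Succeeded":
            return status
        if status == "Failed":
            raise RuntimeError("Export job failed")
        if waited >= timeout:
            raise TimeoutError(f"Export job still '{status}' after {waited} seconds")
        sleep(poll_interval)
        waited += poll_interval


# Simulated job that needs two polls before it completes; no real API is called.
fake_statuses = iter(["NotStarted", "Running", "Succeeded"])
print(wait_for_export(lambda: next(fake_statuses), sleep=lambda seconds: None))
# Succeeded
```

Passing `sleep` in as a parameter keeps the loop testable without real waiting.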

Correctly define __repr__ and __str__

The current __repr__ methods on Resource types are being used like a __str__ method, i.e. to provide a readable string representation rather than an unambiguous one. Fix the __repr__ method and implement a __str__ method.
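
For reference, the conventional split between the two methods looks like this. `ExampleDataset` is a hypothetical class used only to show the distinction; it is not pbipy's Dataset.

```python
class ExampleDataset:
    """Hypothetical resource used to contrast __repr__ and __str__."""

    def __init__(self, id, name):
        self.id = id
        self.name = name

    def __repr__(self):
        # Unambiguous: ideally enough information to recreate the object.
        return f"ExampleDataset(id={self.id!r}, name={self.name!r})"

    def __str__(self):
        # Readable: what a person wants to see when printing.
        return f"{self.name} ({self.id})"


sales = ExampleDataset("cfafbeb1", "SalesMarketing")
print(repr(sales))  # ExampleDataset(id='cfafbeb1', name='SalesMarketing')
print(str(sales))   # SalesMarketing (cfafbeb1)
```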

Admin Dataflows returns no ID

When calling admin.dataflows(), the id field comes back as None, even though the other fields contain data.

This is the code I ran:

x = admin.dataflows()
df  = pd.DataFrame(x)
print(df)

This is the result:

0 <Dataflow id=None, name='Test', description=''...
1   <Dataflow id=None, name='Turbine Tracker', des...

Clearly there are IDs:

  "objectId": "69cedb8b-9550-4ec9-956d-82f20204a830",
      "name": "Test",
      "description": "",

create_group KeyError

Creating a new group (workspace) works, but the Rest API response from Microsoft might have changed: raw is not a list but a dictionary (JSON object).

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/app/etl/mgmt.ipynb Cell 15 line 1
----> 1 pbi.client.create_group("testgroup")

File /usr/local/lib/python3.11/site-packages/pbipy/powerbi.py:604, in PowerBI.create_group(self, name, workspace_v2)
    602 raw = self.post_raw(resource, self.session, payload)
    603 # Endpoint returns as list with one element.
--> 604 group_raw = raw[0]
    606 return Group(group_raw.get("id"), self.session, raw=group_raw)

KeyError: 0
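
A tolerant way to handle both response shapes is to unwrap the payload only when it really is a list. This is a sketch of that idea under the assumption that the endpoint may return either a one-element list or a single object; `unwrap_single` is a hypothetical helper, not the fix applied in pbipy.

```python
def unwrap_single(raw):
    """Return the single object from a response that may be a list or a dict."""
    if isinstance(raw, list):
        if not raw:
            raise ValueError("Expected one element in the response, got an empty list")
        return raw[0]
    return raw


print(unwrap_single([{"id": "a2f89923", "name": "testgroup"}])["name"])
print(unwrap_single({"id": "a2f89923", "name": "testgroup"})["name"])
```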

Dataset Refresh > wait until finished or return RefreshId ?

Dear maintainers,

First of all, thank you very much for your work; your Rest API wrapper helps me and my team a lot!

I'm using your package to send a refresh request for some of my company's Premium datasets. Don't you think it would help to improve the Dataset.refresh method so that it returns the RequestId generated after the HTTP POST instead of None?

According to the REST API documentation, the response header appears to contain the RequestId: https://learn.microsoft.com/en-us/power-bi/connect-data/asynchronous-refresh#response

This request ID could then be used to monitor the refresh of the dataset and wait until it is done. I can help with this feature if you want.

Thank you again for your time and your work!

Error handling fails when the response does not include a json component

Under certain conditions the api response won't have a json component, e.g., 403 Forbidden. Currently the error handling assumes that a json component exists and includes a nicely formatted version of the json in the error message. When there is no json component, this results in a JSONDecodeError because there is nothing to decode.

Improve the error handling to properly handle the different errors that could be encountered and provide better feedback as to what is going on.
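
One possible shape for this is to fall back to the plain response text when the body cannot be decoded. The sketch below assumes a requests-style response with `status_code`, `text`, and `json()`, whose decode errors are subclasses of ValueError. `describe_error` and `FakeResponse` are hypothetical names used for illustration.

```python
import json


def describe_error(response):
    """Build an error message that works with or without a json body."""
    try:
        body = json.dumps(response.json(), indent=2)
    except ValueError:
        # No json component (e.g. an empty 403 body): fall back to the raw text.
        body = response.text or "<empty body>"
    return f"Encountered api error ({response.status_code}). Response:\n{body}"


class FakeResponse:
    """Minimal stand-in for a requests response, used only for this demo."""

    def __init__(self, status_code, text):
        self.status_code = status_code
        self.text = text

    def json(self):
        return json.loads(self.text)


print(describe_error(FakeResponse(403, "")))
print(describe_error(FakeResponse(400, '{"error": {"code": "BadRequest"}}')))
```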

Missing looping in admins.groups and others?

Hi Andrew et al.,

I came across your library while looking for an implementation of the admin getscanresult operation and looked through the other REST call implementations.

I see that you implemented a top=5000 approach; however, in a tenant with more than 5000 workspaces a skip/top approach is needed. The API returns a "@odata.count": 5001, so it is possible to check the number of groups, then keep looping until there is an empty response or the count is reached.

Is that something that you would prefer to be handled within admin.group or outside? I assume that it should be handled within the function, as the wrapper is meant to make life with the Power BI REST APIs easier.

This applies to quite a few admin APIs, by the way. Well... except the Activity Event, which uses a continuation token, which you did implement :)

Regards, Johannes
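
The skip/top loop described in this issue can be sketched independently of any particular endpoint. Here `fetch_page` is a hypothetical callable standing in for a single API request that accepts `top` and `skip` and returns a list; this is not pbipy's implementation.

```python
def fetch_all(fetch_page, page_size=5000):
    """Collect every item by requesting pages until a short or empty page arrives."""
    items = []
    skip = 0
    while True:
        page = fetch_page(top=page_size, skip=skip)
        items.extend(page)
        if len(page) < page_size:
            # A short (or empty) page means there is nothing left to fetch.
            return items
        skip += page_size


# Simulated endpoint holding 12 workspaces, paged 5 at a time.
workspaces = [f"workspace-{n}" for n in range(12)]


def fake_fetch_page(top, skip):
    return workspaces[skip:skip + top]


print(len(fetch_all(fake_fetch_page, page_size=5)))
# 12
```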

Unable to access pbi datasets and groups

I got the respective bearer token to connect to pbi app services using pbipy, but still, I am getting the below error while doing the same. Please guide urgently. Please find the below snippets for reference.

continuationToken API error

I'm following the provided admin.activity_events example exactly, with current dates. My code:

pbi = PowerBI(access_token)
admin = pbi.admin()
from datetime import datetime

start_dtm = datetime(2023, 9, 7, 0, 0, 0)
end_dtm = datetime(2023, 9, 7, 23, 59, 59)

zEvents = admin.activity_events(start_dtm, end_dtm)

print(zEvents)

And I keep getting this error message: "Expected literal type token but found token"

Not sure what's going on. I can get output if I do not use this library and use requests.get()

Could this be a bug in your library?

Here is the full cell output:

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
Cell In [15], line 9
      6 end_dtm = datetime(2023, 9, 7, 23, 59, 59)
      8 #zEvents = admin.activity_events('2023-09-07T00:00:00.000Z', '2023-09-07T23:59:59.000Z')
----> 9 zEvents = admin.activity_events(start_dtm, end_dtm)
     11 print(zEvents)

File ~/cluster-env/clonedenv/lib/python3.10/site-packages/pbipy/admin.py:87, in Admin.activity_events(self, start_date_time, end_date_time, filter)
     84 continuation_token = init_raw["continuationToken"]
     86 while continuation_token is not None:
---> 87     raw = self.get_raw(
     88         url, self.session, params={"continuationToken": continuation_token}
     89     )
     91     activity_events.extend(raw["activityEventEntities"])
     93     continuation_token = raw["continuationToken"]

File ~/cluster-env/clonedenv/lib/python3.10/site-packages/pbipy/utils.py:463, in RequestsMixin.get_raw(self, resource, session, params, **kwargs)
    460     return self.parse_raw(raw)
    462 except Exception as e:
--> 463     raise e

File ~/cluster-env/clonedenv/lib/python3.10/site-packages/pbipy/utils.py:457, in RequestsMixin.get_raw(self, resource, session, params, **kwargs)
    433 """
    434 Request an api resource, parse the response json, and return the
    435 parsed json.
   (...)
    453     Error encountered during the request process.
    454 """
    456 try:
--> 457     response = self.get(resource, session, params, **kwargs)
    458     raw = response.json()
    460     return self.parse_raw(raw)

File ~/cluster-env/clonedenv/lib/python3.10/site-packages/pbipy/utils.py:418, in RequestsMixin.get(self, resource, session, params, success_codes)
    415 response = session.get(resource, params=params)
    417 if response.status_code not in success_codes:
--> 418     raise HTTPError(
    419         f"""Encountered api error. Response: 
    420         
    421         {json.dumps(response.json(), indent=True)})"""
    422     )
    424 return response

HTTPError: Encountered api error. Response: 
                
                {
 "error": {
  "code": "BadRequest",
  "message": "Bad Request",
  "details": [
   {
    "message": "Expected literal type token but found token 'eyJTdGFydERhdGVUaW1lIjoiMjAyMy0wOS0wN1QwMDowMDowMC4wMDAwMDAwXHUwMDJCMDA6MDAiLCJFbmREYXRlVGltZSI6IjIwMjMtMDktMDdUMjM6NTk6NTkuMDAwMDAwMFx1MDAyQjAwOjAwIiwiRmlsZU5hbWUiOiIyMDIzLTA5LTA3VDAxX3YxXzAwMS5jc3YiLCJGaWxlT2Zmc2V0IjowLCJBY3Rpdml0eSI6bnVsbCwiVXNlcklkIjpudWxsfSwyMDIzLTA5LTA3VDAwOjAwOjAwLjAwMDAwMDArMDA6MDAsMjAyMy0wOS0wN1QyMzo1OTo1OS4wMDAwMDAwKzAwOjAwLDAsLA'.",
    "target": "continuationToken"
   }
  ]
 }
})

admin.groups missing 1 required positional argument: 'top'

When we run admin.groups() we get this error:

TypeError                                 Traceback (most recent call last)
Cell In [57], line 1
----> 1 zEvents=admin.groups()

TypeError: Admin.groups() missing 1 required positional argument: 'top'

Is this an issue or is groups handled differently?

group.delete() fails

delete() method seems not to be implemented

TypeError                                 Traceback (most recent call last)
Cell In[33], line 3
      1 groups=pbi.client.groups(filter="name eq 'customer3'")
      2 group=groups[0]
----> 3 group.delete()

TypeError: RequestsMixin.delete() missing 2 required positional arguments: 'resource' and 'session'
