Tommy leads the Chicago Power BI User Group (downtown), which meets every month. He is also one of the hosts on the powerbi.tips Explicit Measures Podcast
learn more here - Puglia BI Consulting
Solution to pull in the Power BI Scanner API Metadata via Power Automate, and Power BI Template file
Home Page: https://powerbi.tips/2021/10/using-the-power-bi-scanner-api-to-manage-tenants-entire-metadata/
Describe the bug
Power Query error on dataset refresh:
To Reproduce
I pulled information out of all workspaces (excluding personal workspaces) of our tenant (~1100 Workspaces)
Expected behavior
Dataset refreshes successfully.
Screenshots
The key dataSourceInstances appears twice:
as well as the key misconfiguredDatasourceInstances:
Desktop (please complete the following information):
Windows 10
Hi @pugliathomas - I was trying this solution and it seems great. However, I was unable to run the "silverGoldGetScanResults" dataflow even though I followed all the instructions. I was able to run the first dataflow properly. Attached you will find the .TXT of my dataflow; GitHub does not allow attaching JSON, so I am using TXT.
Describe the bug
Initial Dataflow (bronzeGetScannerWorkspaceID) doesn't work with greater than 100 workspaces
To Reproduce
If your tenant has more than 100 workspaces (mine has 255), PostWSContent in getScanID throws an error. In the dataflow the error is just "Bad Request"; when I send the same request in Insomnia, the more specific error is:
POST https://api.powerbi.com/v1.0/myorg/admin/workspaces/getInfo?lineage=True&datasourceDetails=True&datasetSchema=true&datasetExpressions=true&getArtifactUsers=true
{
"error": {
"code": "InvalidRequest",
"message": "ValidateInput does not support more then 100 workspace Ids in one call"
}
}
When I manually cut down the wsbody list of workspaces to less than 100 it works great.
Expected behavior
The dataflow should iterate through the workspace IDs and send POST requests in batches of 100.
Additional context
https://learn.microsoft.com/en-us/rest/api/power-bi/admin/workspace-info-post-workspace-info#request-body
The required workspace IDs to be scanned (supports 1 to 100 workspace IDs)
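The batching requested above could be sketched roughly as follows. This is only an illustration in Python, not the dataflow's actual M code: the helper names `batch_workspace_ids` and `build_request_bodies` are hypothetical, and the `"workspaces"` body key is assumed from the linked request-body documentation.

```python
from typing import Dict, Iterator, List

# Limit quoted in the API documentation and in the error message above.
MAX_IDS_PER_CALL = 100


def batch_workspace_ids(
    workspace_ids: List[str], size: int = MAX_IDS_PER_CALL
) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` workspace IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(workspace_ids), size):
        yield workspace_ids[start:start + size]


def build_request_bodies(workspace_ids: List[str]) -> List[Dict[str, List[str]]]:
    """Build one request body per batch (hypothetical helper).

    The "workspaces" key is an assumption based on the linked docs.
    """
    return [{"workspaces": batch} for batch in batch_workspace_ids(workspace_ids)]


if __name__ == "__main__":
    ids = [f"ws-{i}" for i in range(255)]  # placeholder IDs, not real GUIDs
    for body in build_request_bodies(ids):
        print(len(body["workspaces"]))
```

With 255 workspace IDs this produces three request bodies of 100, 100, and 55 IDs, each of which would be sent as its own POST.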
Hello,
Thank you for your work on the scanner API and for helping create a usable solution to navigate this data. When I imported the Power Automate zip for the API call, I saw that your email is included as a related resource. What are the implications of this? It prevented me from moving forward with this solution, because I am uncertain about the future implications of this embedded relationship to your email.
Thank you again for this great repo,
All the best,
Josh
Describe the bug
The Power Automate flow runs fine and returns a number of files. Some of them are quite large (66 MB).
In one of the final steps, the flow hangs on step 'Return Workspace Info'.
The error text:
ActionResponseTimedOut. The execution of template action 'Return_workspace_info' is failed: the client application timed out waiting for a response from service. This means that workflow took longer to respond than the alloted timeout value. The connection maintained between the client application and service will be closed and client application will get an HTTP status code 504 Gateway Timeout.
Screenshots
Desktop (please complete the following information):
Additional context
Although the files are produced, the provided Power BI template fails to load them, returning the error 'We found extra characters at the end of JSON input.'
The two issues might be related.
For me, the main flow "PowerBI Scanner V2" runs for more than 2 minutes, even for a modifiedSince window of 24 hours. As a consequence, the calling flow "PBI_Scanner_Refresh_Daily" times out before getting a response from the main flow. From what I read, the Power Automate HTTP request has a default timeout of 2 minutes in synchronous mode. When the scanning completes, the return action in "PowerBI Scanner V2" finds no listening flow to respond to.
Am I the only one experiencing this issue? Is there any configuration I may have missed?
When connecting to the SharePoint library, Power Query returns this error: Duplicate Name datasourceInstances. I don't know where the problem is. It seems that the JSON file has more than one entry for datasourceInstances.
This is the error message:
Looking at the Power Query Editor, the problem seems to occur only in the meta step, after the content.
After some additional investigation, it seems that for some reason the JSON has duplicated parts, as shown in this picture.
If we delete the duplicated datasourceInstances and misconfiguredDatasourceInstances entries, the report template works fine.
Maybe the problem is at the flow level, where the JSON file is built.
Is there any way to solve this other than manipulating the JSON and deleting the duplicated items?
Regards.
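As a stopgap outside the flow, duplicated keys like these can be detected and merged before the file reaches Power Query. The sketch below uses Python's standard `json` module with an `object_pairs_hook`; the function names are hypothetical, the sample document is made up, and merging by concatenating the duplicated lists is an assumption about what the intended output should be.

```python
import json
from collections import Counter
from typing import Any, Dict, List, Tuple


def find_duplicate_keys(text: str) -> List[str]:
    """Return every key that occurs more than once within a single JSON object."""
    duplicates: List[str] = []

    def hook(pairs: List[Tuple[str, Any]]) -> Dict[str, Any]:
        counts = Counter(key for key, _ in pairs)
        duplicates.extend(key for key, n in counts.items() if n > 1)
        return dict(pairs)

    json.loads(text, object_pairs_hook=hook)
    return duplicates


def load_merging_duplicates(text: str) -> Any:
    """Parse JSON, concatenating list values that share a duplicated key.

    Assumption: duplicated list-valued keys should be merged. For any other
    value type the last occurrence wins, as in the default parser.
    """

    def hook(pairs: List[Tuple[str, Any]]) -> Dict[str, Any]:
        merged: Dict[str, Any] = {}
        for key, value in pairs:
            if key in merged and isinstance(merged[key], list) and isinstance(value, list):
                merged[key] = merged[key] + value
            else:
                merged[key] = value
        return merged

    return json.loads(text, object_pairs_hook=hook)


if __name__ == "__main__":
    # Made-up sample that mimics the duplicated key described above.
    sample = '{"workspaces": [], "datasourceInstances": [1], "datasourceInstances": [2]}'
    print(find_duplicate_keys(sample))
    print(json.dumps(load_merging_duplicates(sample)))
```

Re-serializing the merged result with `json.dumps` gives a file with one entry per key, which avoids hand-editing the JSON. It does not address why the flow writes the key twice.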
Describe the bug
I have done all the required setup (imported the two flows, created the app and service principal, adjusted the tenant settings). The flow starts successfully, but in the action "Step 3_ Retrieve data" I'm getting the error message "The variable 'responseArray' has size of more than '113213046' bytes. This exceeded the maximum size '104857600' allowed." after 9 rounds. The first 8 rounds in that loop ran fine. This is a large tenant, with currently ~22k workspaces.
What can be done to get around this issue? (Seems like a flow restriction.)
To Reproduce
Not sure it can be reproduced.
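One possible direction, sketched here in Python rather than as flow actions, is to stop accumulating every response in a single variable and instead close off a chunk (for example, write it to its own file) whenever the serialized size would pass the limit. The 104857600-byte figure comes from the error message above; `chunk_by_size` is a hypothetical helper, and whether the flow can be restructured this way is an assumption.

```python
import json
from typing import Any, Iterable, Iterator, List

# Maximum variable size reported in the error message above.
MAX_BYTES = 104_857_600


def chunk_by_size(items: Iterable[Any], max_bytes: int = MAX_BYTES) -> Iterator[List[Any]]:
    """Group items so that json.dumps(chunk) stays within max_bytes.

    A single item that is larger than max_bytes on its own is still
    yielded, alone in its chunk, so the caller must handle that case.
    """
    chunk: List[Any] = []
    used = 2  # the enclosing "[" and "]"
    for item in items:
        size = len(json.dumps(item).encode("utf-8"))
        extra = size + (2 if chunk else 0)  # ", " separator used by json.dumps
        if chunk and used + extra > max_bytes:
            yield chunk
            chunk, used = [], 2
            extra = size
        chunk.append(item)
        used += extra
    if chunk:
        yield chunk


if __name__ == "__main__":
    # Made-up responses and a tiny limit, just to show the grouping.
    responses = [{"id": i, "pad": "x" * 50} for i in range(10)]
    for part in chunk_by_size(responses, max_bytes=200):
        print(len(part), len(json.dumps(part)))
```

Each yielded chunk could then be written out as a separate file, so no single variable has to hold the whole tenant's metadata.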
Describe the bug
Hello Pugliathomas, I am trying to configure the V2 solution, and everything works well until the last step, which is configuring the PBIT. As the documentation shows, I configured the Workspace ID and the Dataflow ID, and everything loads well except the Workspace query, which shows the following message:
Expression.Error: The key didn't match any rows in the table.
The image is attached here.
Describe the bug
Power Query error on dataset refresh:
DataFormat.Error: We found extra characters at the end of JSON input.
Details:
Value={
This error is happening due to an issue in the MetaData_CCYY-MM-DD file.
To Reproduce
This error appears when trying to refresh the dataset. Note that DAX and mashup expressions and detailed metadata enhancements are enabled in the tenant settings. More than 100 workspaces were also modified, so batching in groups of 100 is triggered.
Expected behavior
Dataset refreshes successfully.
Screenshots
Found the "}{" causing the error, but not sure why it is happening.
Desktop (please complete the following information):
Additional context
I played with the batch counts a bit to try to resolve it but haven't found a solution yet.
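A "}{" sequence usually means that two or more complete JSON documents were written back to back into one file, which a strict parser rejects as extra characters. As a workaround sketch (not a fix for the flow itself), such a file can be split into its individual documents with Python's `json.JSONDecoder.raw_decode`. The function name and the sample text below are hypothetical, and the assumption that each batch is a standalone object with a `workspaces` list has not been checked against the actual files.

```python
import json
from typing import Any, List


def split_concatenated_json(text: str) -> List[Any]:
    """Parse a string containing several JSON documents written back to back."""
    decoder = json.JSONDecoder()
    documents: List[Any] = []
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        obj, pos = decoder.raw_decode(text, pos)
        documents.append(obj)
    return documents


if __name__ == "__main__":
    # Made-up content imitating two batches glued together.
    glued = '{"workspaces": [{"id": "a"}]}{"workspaces": [{"id": "b"}]}'
    docs = split_concatenated_json(glued)
    # Assumption: each batch is an object holding a "workspaces" list.
    merged = {"workspaces": [w for d in docs for w in d.get("workspaces", [])]}
    print(json.dumps(merged))
```

Writing the merged object back out gives a single valid JSON document that the template's JSON parser should accept, assuming the rest of the file matches what it expects.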
This is a great tool. By the way, your team hosts an excellent podcast.
I am hoping you can help me with this issue:
apiscannerbug.docx
In the Power BI .pbix file, I navigate to the Datasets tab.
On the pivot table visual there are two fields on Rows: datasetname and reportname.
However, when I expand a datasetname, it shows me all reports in the tenant instead of only the reports related to that dataset.
For example, as shown in the screenshot, when I expand my dataset "Addedvalue by state" it shows me 70 reports.
When I expand another dataset, e.g. ERP customer file, it shows me the same 70 reports.
The relationship does not seem to be working.
Please let me know if I can provide any additional information.
Thank you
I have followed all the steps mentioned in the video, and when I ran the flow it errored out at step "Step 3: Retrieve Data" -> "Append to String variable":
The variable 'responseArrayASString' has size of more than '105910233' bytes. This exceeded the maximum size '104857600' allowed.