Is it possible to extract data from a Power BI dashboard using the crrri package? #107

Open
covid19ec opened this issue Jul 30, 2021 · 0 comments

Hi all, I hope you are well. I am trying to extract some data from a Power BI dashboard. The problem is that the dashboard has three pages and the data is on the last page, inside a plot. This is the dashboard I am trying to scrape:

https://app.powerbi.com/view?r=eyJrIjoiMTkwNTZjZmEtNDJkYi00MmI3LThlZmYtZjViMDVmYTk1NTJiIiwidCI6IjJmYzgyYWFkLWYyMjUtNDM0OS04YjliLTg0MTZhNGFmNGQ3ZiJ9&pageName=ReportSection5e050ac003d0b042a320

It looks like this. The main issue is that, to reach the final page, I need to click the Siguiente ("Next") button (circled in red):

[image]

Then, on the second page, there is a similar Siguiente button that I need to click:

[image]

After clicking that button I finally arrive at the final page. The data I need is in the TOTAL DOSIS SEGÚN CANTÓN plot:

[image]

To get the data, I need to right-click the plot and choose the Show as table option:

[image]

After clicking that option in the pop-up menu, I see this:

[image]

The data I need is in the table at the bottom, below the plot (the three columns). I have had trouble obtaining it because it is difficult to identify the Siguiente buttons and then to right-click the plot and show it as a table. I tried to sketch some code using RSelenium, but I am not able to locate the buttons to click. Here is the code I have used:

library(dplyr)
library(purrr)
library(readr)
library(wdman)
library(RSelenium)
library(xml2)
library(selectr)

# use wdman to start a Selenium server
selServ <- selenium(
  port = 4444L,
  version = 'latest',
  chromever = '91.0.4472.101'
)
# using RSelenium to start chrome on the selenium server
remDr <- remoteDriver(
  remoteServerAddr = 'localhost',
  port = 4444L,
  browserName = 'chrome'
)
# open a Chrome browser session
remDr$open()
# navigate to the site you wish to analyze
report_url <- "https://app.powerbi.com/view?r=eyJrIjoiMTkwNTZjZmEtNDJkYi00MmI3LThlZmYtZjViMDVmYTk1NTJiIiwidCI6IjJmYzgyYWFkLWYyMjUtNDM0OS04YjliLTg0MTZhNGFmNGQ3ZiJ9&pageName=ReportSection5e050ac003d0b042a320"
remDr$navigate(report_url)
# find and click the Siguiente button
NexBtn <- remDr$findElement(using = "xpath", value = './/button[descendant::span[text()="Siguiente"]]')
NexBtn$clickElement()


The last two lines of code did not work because I do not know how to locate the Siguiente buttons.
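
One possible cause worth ruling out is timing: a Power BI report draws its contents with JavaScript after the page loads, so the button may not exist yet when findElement runs right after navigate. The block below is a minimal sketch of polling for the element before clicking it. `wait_for` and `click_siguiente` are hypothetical helpers, not part of RSelenium, and the XPath is the same guess as in the code above, so the report's actual markup may still require a different selector.

```r
# Hypothetical helper (not part of RSelenium): call `find()` repeatedly until
# it returns a non-empty list or `timeout` seconds have elapsed.
wait_for <- function(find, timeout = 30, interval = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    found <- find()
    if (length(found) > 0) return(found)
    if (Sys.time() >= deadline) return(list())
    Sys.sleep(interval)
  }
}

# Assumption: `remDr` is an open remoteDriver session that has already
# navigated to the report. findElements() returns an empty list instead of
# raising an error when nothing matches, which makes it convenient to poll.
click_siguiente <- function(remDr, timeout = 30) {
  xpath <- './/button[descendant::span[text()="Siguiente"]]'
  hits <- wait_for(
    function() remDr$findElements(using = "xpath", value = xpath),
    timeout = timeout
  )
  if (length(hits) == 0) {
    stop("No Siguiente button found within ", timeout, " seconds")
  }
  hits[[1]]$clickElement()
  invisible(hits[[1]])
}
```

With the session from the code above, calling `click_siguiente(remDr)` once per page would step through the first two pages. If it still stops with the "No Siguiente button found" error after a generous timeout, the selector rather than the timing is the likely problem.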

Is it possible to extract this data using the crrri package instead? Any help is welcome.
