processmapr's People

Contributors

bijst, fmannhardt, gertjanssenswillen


processmapr's Issues

dotted_chart.grouped_eventlog does not work

When trying dotted_chart on a grouped event log, the grouping columns are missing in the plot data. For example:

sepsis %>% group_by(resource) %>% dotted_chart()
Error: At least one layer must contain all faceting variables: `resource`.
* Plot is missing `resource`
* Layer 1 is missing `resource`

pass render=TRUE

I am using process_map() within an R Markdown document... When there are too many cases, the code waits until the user confirms that they want to plot, regardless of whether the resulting graph will be intelligible. Is it possible to pass the confirmation programmatically, e.g. using a parameter such as lot_of_traces_behaviour = c("ask", "N", "Y")?

Thanks in advance for your attention.

myevnlog %>% process_map()
You are about to draw a process map with a lot of traces.
        This might take a long time. Try to filter your event log. Are you sure you want to proceed?
Y/N: Y
Warning messages:
1: In bind_rows_(x, .id) :
  binding factor and character vector, coercing into character vector
2: In bind_rows_(x, .id) :
  binding character and factor vector, coercing into character vector

Sort Dotted Chart on all activities

dotted_chart(x = "absolute", y = "start") or plotly_dotted_chart() work well, if the start timestamps of the cases are all different. However, sometimes many cases start at the same time, if there's some batch behaviour. For these cases, the dotted_chart() function seems to arrange the cases according to their position in the log or even in reverse order. This can produce strange graphs, such as this one (see the area marked with the red ellipse):

[screenshot: dotted chart where cases with identical start times appear mis-ordered]

If several cases start at the same time, the dotted chart with the "start" argument should arrange them according to the timestamp of the second activity in each case, then the third, and so on. The phenomenon is particularly relevant if the timestamps are not very granular and consist only of dates.

Or do I somehow have to prearrange the cases in the log, before using the Dotted Chart?
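To illustrate the requested tie-breaking, here is a base-R sketch (not processmapR code; `order_cases` is a made-up helper): order cases by their first timestamp and break ties with the second, then the third, and so on.

```r
# Hypothetical helper: ts_matrix has one row per case and one column per
# activity position (timestamp of the 1st, 2nd, ... activity in the case).
order_cases <- function(ts_matrix) {
  # order() accepts multiple keys, so later columns break ties in earlier ones
  do.call(order, as.data.frame(ts_matrix))
}

m <- rbind(c(1, 5, 9),   # case 1
           c(1, 3, 9),   # case 2: same start, earlier second activity
           c(0, 8, 8))   # case 3: earliest start
order_cases(m)  # c(3, 2, 1)
```

The same multi-key ordering could be applied to the case index before plotting, which would give a deterministic order even for batch starts.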

process_map() issue

Hi Gert,

Thank you for all these fundamental improvements of the bupaR package.

I have a problem with the process_map() function: when I try to produce a process map, RStudio's Viewer reports the error "Error: syntax in row 37 near '('". I found that this occurs when the size of my data increases (for example, by moving filter_activity_frequency from 0.1 to 0.2).

Do you have some advice?

Thank you

Process Map ends up backwards

Wanted to officially document what I saw here in case anyone else sees a similar issue. The minute I execute devtools::install_github("gertjanssenswillen/processmapr", dependencies = TRUE), I end up with the map essentially backwards (start is still on top and end on bottom). Below is the code I attempted to execute:

data.map2 <- data %>%
  process_map(type_nodes = frequency(value = "absolute_case"), type_edges = performance(FUN = median, units = "mins"))

Along with the following error:
Error in data.frame(id = 1:n, from = from, to = to, rel = rel, stringsAsFactors = FALSE) :
  arguments imply differing number of rows: 2, 0
In addition: Warning messages:
1: In bind_rows_(x, .id) : binding factor and character vector, coercing into character vector
2: In bind_rows_(x, .id) : binding character and factor vector, coercing into character vector
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf

could not find function "set_global_graph_attrs"

Running the command patients %>% process_map() results in the following error:

Error in set_global_graph_attrs(., attr = "rankdir", value = "LR", attr_type = "graph") : could not find function "set_global_graph_attrs"

I'm using the latest packages of bupaR (0.3.2) and DiagrammeR (1.0.0).

According to DiagrammeR issue #277 it appears that this function has been replaced by add_global_graph_attrs.

EDIT: this issue also occurs with function resource_map()
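Until processmapR itself is updated, one hedged workaround is a shim that forwards the removed name to its replacement. Note this only helps if the call is resolved via the global environment; if processmapR resolves it inside its own namespace, installing a DiagrammeR version that still exports `set_global_graph_attrs` is the safer fix.

```r
# Shim: forward the removed DiagrammeR function to add_global_graph_attrs(),
# which DiagrammeR issue #277 names as its replacement.
set_global_graph_attrs <- function(graph, attr, value, attr_type) {
  DiagrammeR::add_global_graph_attrs(graph, attr = attr,
                                     value = value, attr_type = attr_type)
}
```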

plotly_dotted_chart for relative view

Thanks for your continuing development of bupaR!

Here's a suggestion for the plotly_dotted_chart() function.

Log %>% plotly_dotted_chart() seems to only work without arguments.

If I enter plotly_dotted_chart(x = "relative", sort = "duration"), I get an error saying there's an unused argument.

Suggestion 1: Make plotly_dotted_chart available for relative dotted charts as well. This would be very useful, since the user can then get the Case-ID from long-running cases directly from the graph and investigate specific cases further in the raw data.

Suggestion 2: Create a function plotly_trace_explorer. This would be useful too, since the trace_explorer graph is sometimes hard to read if there are many traces, due to the shortening of activity names. The tooltips from plotly would alleviate this.

I'm using the CRAN-versions (v 0.4.1 from bupaR and v 0.3.2 from processmapR).

process_map()

Hi Gert,

as you may already know, I am using bupaR for a project. I noted some inconsistencies when using the process_map() and precedence_matrix(type = "absolute") functions: the arcs in process_map() do not seem to match the information shown by precedence_matrix().

Could that be the case?

Thanks

precedence_matrix

Hi, Gert. I'm using your useful package and I have an issue with processmapR.

The precedence_matrix function returns the following error:
Error in mutate_impl(.data, dots) :
Not compatible with requested type: [type=character; target=integer]

Thanks.

Please remove it from process_map

There is a prompt when one asks for the generation of a process map with more than 750 traces. The problem is that this prompt stops knitr from doing its work. Please remove it from the function.

The next code creates the problem:

if (n_traces(eventlog) > 750) {
  message("You are about to draw a process map with a lot of traces.
	This might take a long time. Try to filter your event log. Are you sure you want to proceed?")
  answer <- readline("Y/N: ")

  if (answer != "Y")
    break()
}
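A knitr-safe variant of that guard might only prompt in interactive sessions. A sketch, not the package's actual code (`confirm_large_map` is a made-up name):

```r
# Only ask for confirmation when someone can actually answer; in a knitr
# render (non-interactive) proceed silently instead of blocking on readline().
confirm_large_map <- function(n_traces, limit = 750, ask = interactive()) {
  if (n_traces <= limit) return(TRUE)
  if (!ask) return(TRUE)  # non-interactive session: never block
  answer <- readline("Y/N: ")
  identical(answer, "Y")
}
```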

process_map output does not have nodes in rankdir order

process_map output doesn't appear to lay out the graph correctly, e.g.

event_log %>%
process_map(
rankdir = "RL"
)

produces a graph where the nodes seem to be randomly all over the place, and not in order from left-to-right for the most frequent path.

[screenshot: process map where nodes are scattered rather than ordered left-to-right]

I understand that this is probably caused by DiagrammeR, and maybe I could modify the DiagrammeR graph object myself to obtain a more intuitive rankdir. Have you seen this before?

Negative durations for overlapping/parallel activity periods

Hello, I have a couple of questions about how best to handle activities that occur at the same time as, or overlap with, other activities.

I'm trying to convert a series of customer and staff activities that can occur as part of a case, but often they can be created and/or completed at the same time.

  • Case gets created when customer fills out details in an online form
  • Based on details customer provides in the form, 3 activities get created for the customer to provide bank statements, proof of address and recent payslips with the same creation timestamp.
  • Customers can complete these separately or all at once, and completion of a customer activity triggers the creation of staff activities to verify the documents.
  • As part of this verification the staff can then create a new activity requesting further documents, or book an appointment with the customer
  • The staff can complete all the activities and grant the application which results in all the staff activities having same completion timestamp

All in all, it's quite a complex process with a variety of activities and users (over 100 different activity types). When filtering the activities by frequency and then trying to visualise the performance process maps, I get a large number of negative durations on the edges between activities, which I don't know how to handle.

My questions are:

  • Am I doing something wrong when I generate the event log?
  • How do you recommend I approach a scenario like this?
  • Is there anything I can do to ensure I always have a positive duration on the edges?

I've replicated a similar issue in one of my repos using the loan application event log data set, which is available here: jessevent/loan-app-process. Funnily enough, it also causes processanimateR tokens to traverse backwards and float off the edges to different activities, which is how I first noticed something odd.

This is essentially the format/code I'm using to transform my activity instances into the event log:

example_log_4 %>%
    # one activity instance per source row, before reshaping to long format
    mutate(activity_instance = 1:nrow(.)) %>%
    # turn the schedule/start/complete columns into lifecycle rows
    gather(status, timestamp, schedule, start, complete) %>%
    filter(!is.na(timestamp)) %>%
    eventlog(
        case_id = "patient",
        activity_id = "activity",
        activity_instance_id = "activity_instance",
        lifecycle_id = "status",
        timestamp = "timestamp",
        resource_id = "resource"
    )

Thanks so much for any assistance; the whole bupaR framework is an outstanding piece of work and an amazing achievement. Personally, I've spent a long time looking for a framework like this and am quite excited about the progress and what's to come!
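To locate the offending pairs before plotting, a base-R sketch (assumed column names `case`, `start`, `complete`; `flag_negative_gaps` is hypothetical) can compare each activity's start with the previous activity's completion within the same case:

```r
# Flag rows whose start precedes the previous activity's completion in the
# same case, i.e. the overlaps that produce negative edge durations.
flag_negative_gaps <- function(df) {
  df <- df[order(df$case, df$start), ]
  # previous row's completion time within each case (NA for the first row)
  prev_end <- ave(as.numeric(df$complete), df$case,
                  FUN = function(x) c(NA, head(x, -1)))
  df$gap <- as.numeric(df$start) - prev_end
  df[!is.na(df$gap) & df$gap < 0, ]
}

acts <- data.frame(case = c(1, 1, 1),
                   start = c(0, 5, 4),
                   complete = c(2, 6, 7))
flag_negative_gaps(acts)  # the start = 5 activity overlaps its predecessor
```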

process_map fails on data frame containing a column named 'time'

When supplying an event log with a (possibly unrelated) column named 'time', process_map fails with an error:

> process_map(log)
Error in summarise_impl(.data, dots) : 
  Column `time` must have a unique name
> traceback()
13: stop(list(message = "Column `time` must have a unique name", 
        call = summarise_impl(.data, dots), cppstack = list(file = "", 
            line = -1L, stack = "C++ stack not available on this system")))
12: summarise_impl(.data, dots)
11: summarise.tbl_df(., start_time = min(time), end_time = max(time), 
        min_order = min(.order))
10: summarize(., start_time = min(time), end_time = max(time), min_order = min(.order))
9: function_list[[k]](value)
8: withVisible(function_list[[k]](value))
7: freduce(value, `_function_list`)
6: `_fseq`(`_lhs`)
5: eval(quote(`_fseq`(`_lhs`)), env, env)
4: eval(quote(`_fseq`(`_lhs`)), env, env)
3: withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
2: grouped_log %>% summarize(start_time = min(time), end_time = max(time), 
       min_order = min(.order)) at process_map.R#91
1: process_map(log)

Code to reproduce:

x <- data.frame(case = c(1), time = c("foobar"), timestamp = c(as.POSIXct(Sys.time())),
                activity = c("test"), activity_instance_id = c(1), resource = c("bar"),
                lifecyle = "complete")
log <- eventlog(x, case_id = "case", timestamp = "timestamp", activity_id = "activity",
                activity_instance_id = "activity_instance_id", resource_id = "resource",
                lifecycle_id = "lifecyle")
process_map(log)
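The traceback shows process_map() creating its own `time` column internally, so a simple workaround until the clash is fixed is to rename the user column before building the event log (the replacement name below is arbitrary):

```r
# Rename the user column that collides with process_map()'s internal `time`.
x <- data.frame(case = 1, time = "foobar", activity = "test")
names(x)[names(x) == "time"] <- "time_note"  # any non-clashing name works
names(x)  # "case" "time_note" "activity"
```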

Session info:

> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=Norwegian Bokmål_Norway.1252  LC_CTYPE=Norwegian Bokmål_Norway.1252    LC_MONETARY=Norwegian Bokmål_Norway.1252 LC_NUMERIC=C                            
[5] LC_TIME=Norwegian Bokmål_Norway.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] bindrcpp_0.2.2         petrinetR_0.2.0        processmonitR_0.1.0    xesreadR_0.2.2         processmapR_0.3.2.9000 eventdataR_0.2.0       edeaR_0.8.1           
 [8] bupaR_0.4.1            forcats_0.3.0          stringr_1.3.1          dplyr_0.7.6            purrr_0.2.5            readr_1.1.1            tidyr_0.8.1           
[15] tibble_1.4.2           ggplot2_3.0.0          tidyverse_1.2.1       

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.18       lubridate_1.7.4    lattice_0.20-35    visNetwork_2.0.4   utf8_1.1.4         assertthat_0.2.0   digest_0.6.16      mime_0.5          
 [9] R6_2.2.2           cellranger_1.1.0   plyr_1.8.4         backports_1.1.2    httr_1.3.1         pillar_1.3.0       rlang_0.2.2        lazyeval_0.2.1    
[17] readxl_1.1.0       shinyTime_0.2.1    rstudioapi_0.7     data.table_1.11.4  miniUI_0.1.1.1     DiagrammeR_1.0.0   downloader_0.4     htmlwidgets_1.2   
[25] igraph_1.2.2       munsell_0.5.0      shiny_1.1.0        broom_0.5.0        compiler_3.5.1     influenceR_0.1.0   rgexf_0.15.3       httpuv_1.4.5      
[33] modelr_0.1.2       pkgconfig_2.0.2    htmltools_0.3.6    tidyselect_0.2.4   gridExtra_2.3      XML_3.98-1.16      fansi_0.3.0        viridisLite_0.3.0 
[41] crayon_1.3.4       withr_2.1.2        later_0.7.3        grid_3.5.1         nlme_3.1-137       jsonlite_1.5       xtable_1.8-2       gtable_0.2.0      
[49] magrittr_1.5       scales_1.0.0       cli_1.0.0          stringi_1.2.4      viridis_0.5.1      promises_1.0.1     ggthemes_4.0.1     xml2_1.2.0        
[57] brew_1.0-6         RColorBrewer_1.1-2 tools_3.5.1        glue_1.3.0         hms_0.4.2          Rook_1.1-1         yaml_2.2.0         colorspace_1.3-2  
[65] rvest_0.3.2        plotly_4.8.0       bindr_0.1.1        haven_1.1.2  

add QUANTILE performance

Is it possible to also integrate a QUANTILE metric into performance()?

As shown in the example below, I'm hoping to get a quantile performance evaluation from process_map or other graphs: the classic 1st or 3rd quartile, for example, or better, an arbitrary quantile, e.g. 0.32 for the 32nd percentile. Even better if it is available for both nodes and edges.

patients %>%
  process_map(performance(quantile, 0.32, "days"))
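Since performance() already accepts an arbitrary summary function via its FUN argument, a fixed quantile can be supplied today with a small wrapper (a sketch; `p32` is a made-up name, and the commented-out piped call is untested):

```r
# Wrapper fixing probs = 0.32, usable wherever performance() expects a FUN
# that reduces a vector of durations to a single number.
p32 <- function(x, ...) as.numeric(quantile(x, probs = 0.32, na.rm = TRUE))

p32(0:100)  # 32

# patients %>% process_map(performance(FUN = p32, units = "days"))
```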

All the best,
@gertjanssenswillen

Edges are stacked when using fixed_node_pos

Hi,

I tried the recently added fixed_node_pos parameter. It worked fine with the patient example in another issue, but when I tried it with my data, which has self-loops or multiple edges going to multiple nodes, everything gets stacked and the edge values end up hidden.
Is there any way to control the edges? Maybe adding more curve to the edges would fix this.
[screenshot: stacked edges hiding the edge labels]

[screenshot: second example of stacked edges]

Edit: this is how it looks by default:
[screenshot: default layout]
But even when trying another kind of arrangement, it still draws straight edges:
[screenshot: alternative arrangement with straight edges]

precedence_matrix with performance views

precedence_matrix() is sometimes a better alternative to process_map() in the case of a "spaghetti process" with many variants. It would therefore be useful to have a performance view for the precedence matrices too.

This would have the same syntax as the process_map function, e.g.

precedence_matrix(performance(median, "days")) and its other options.

process_map() new problem after installing processmapR from Github

Hi, after installing the new version of processmapR from GitHub, I got the following error message (see below). I used this code to test:

library(bupaR)
patients %>%
    process_map(type = frequency("relative"))
Error in FUN(X[[i]], ...): object '.order' not found
Traceback:

1. patients %>% process_map(type = frequency("relative"))
2. withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
3. eval(quote(`_fseq`(`_lhs`)), env, env)
4. eval(quote(`_fseq`(`_lhs`)), env, env)
5. `_fseq`(`_lhs`)
6. freduce(value, `_function_list`)
7. withVisible(function_list[[k]](value))
8. function_list[[k]](value)
9. process_map(., type = frequency("relative"))
10. eventlog %>% as.data.frame() %>% droplevels %>% select(act = !(!activity_id_(eventlog)), 
  .     aid = !(!activity_instance_id_(eventlog)), case = !(!case_id_(eventlog)), 
  .     time = !(!timestamp_(eventlog)), .order) %>% group_by(act, 
  .     aid, case) %>% summarize(start_time = min(time), end_time = max(time), 
  .     min_order = min(.order))
11. withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
12. eval(quote(`_fseq`(`_lhs`)), env, env)
13. eval(quote(`_fseq`(`_lhs`)), env, env)
14. `_fseq`(`_lhs`)
15. freduce(value, `_function_list`)
16. function_list[[i]](value)
17. select(., act = !(!activity_id_(eventlog)), aid = !(!activity_instance_id_(eventlog)), 
  .     case = !(!case_id_(eventlog)), time = !(!timestamp_(eventlog)), 
  .     .order)
18. select.data.frame(., act = !(!activity_id_(eventlog)), aid = !(!activity_instance_id_(eventlog)), 
  .     case = !(!case_id_(eventlog)), time = !(!timestamp_(eventlog)), 
  .     .order)
19. select_vars(names(.data), !(!(!quos(...))))
20. map_if(ind_list, !is_helper, eval_tidy, data = names_list)
21. map(.x[matches], .f, ...)
22. lapply(.x, .f, ...)
23. FUN(X[[i]], ...)

Enlarging nodes in process map

Hi,

This is an awesome package, thank you! I have a very large process map and cannot see the individual nodes. Is there a way to increase their size? I have tried to use visNetwork and end up with a cool ball of spaghetti. Any suggestions would be greatly appreciated! Thank you!

Sincerely,

tom

The Start Node Wants to be on Top

The placement of the "Start" node isn't ideal when event logs have a fair amount of trace variants.
Is there a way to coerce the Start node to the top?

[screenshot: process map with the Start node placed away from the top]

Secondary metric for the process_map

It would be nice to have a secondary metric in the process_map function to show both frequencies and waiting times on the edges. The secondary metric could have a slightly smaller font size (see Disco).

process_map(performance(sum, "days")) is an alternative to show the severity of waiting times, but a secondary metric would provide even more context.

Extended RAW DATA parameter

Hi,

I'm wondering whether it is possible to add two more interesting features to the process_map and trace_explorer functions.

1. process_map
It could be useful to extract the raw data behind the plot, in a form usable for further manipulation: something like process_map(raw.data = TRUE) that returns a data frame with the whole data structure behind the plot (edge and node values).

2. trace_explorer
A raw.data parameter is already provided there, but instead of returning just the first occurrence of every trace, I'm looking for a full raw data frame.
I mean that currently, if two patients A and B (following the user guide) produce the same trace, for example 1-2-3, only the trace for patient A is returned. Is it possible to extend raw.data to all the patients?

All the best,
Antonio

precedence_matrix() types should be hyphen-separated

The edeaR resource metric levels are hyphen separated, for example "resource-activity".

precedence_matrix() types are underscore separated, for example, "relative_antecedent".

This should be made consistent.

process_map()

Hi Gert,

I'm using the process_map() function. I'm not able to understand why the activity label is sometimes printed white and sometimes black when the node in the process map is colored white/pink.

The problem is that the activity label is impossible to read in some cases.

Thanks.

Prefixing `UQ()` with the rlang namespace is deprecated as of rlang 0.3.0

Hi!
process_map() gives a warning concerning a deprecation in rlang:

m2 %>%
+   processmapR::process_map(type = processmapR::frequency("relative_case"))

Warning message:
Prefixing `UQ()` with the rlang namespace is deprecated as of rlang 0.3.0.
Please use the non-prefixed form or `!!` instead.
  # Bad:
  rlang::expr(mean(rlang::UQ(var) * 100))
  # Ok:
  rlang::expr(mean(UQ(var) * 100))
  # Good:
  rlang::expr(mean(!!var * 100))
This warning is displayed once per session. 

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] rTRNG_4.20-1      anytime_0.3.4     data.table_1.12.2 tictoc_1.0        datapasta_3.0.0  
 [6] tidylog_0.1.0     forcats_0.4.0     stringr_1.4.0     dplyr_0.8.3       purrr_0.3.2      
[11] readr_1.3.1       tidyr_0.8.3       tibble_2.1.3      ggplot2_3.2.0     tidyverse_1.2.1  

loaded via a namespace (and not attached):
 [1] nlme_3.1-140       bitops_1.0-6       matrixStats_0.54.0 lubridate_1.7.4   
 [5] RColorBrewer_1.1-2 httr_1.4.0         ggsci_2.9          profvis_0.3.6     
 [9] tools_3.6.1        backports_1.1.4    R6_2.4.0           lazyeval_0.2.2    
[13] colorspace_1.4-1   withr_2.1.2        tidyselect_0.2.5   gridExtra_2.3     
[17] compiler_3.6.1     cli_1.1.0          rvest_0.3.4        xml2_1.2.0        
[21] influenceR_0.1.0   plotly_4.9.0       scales_1.0.0       checkmate_1.9.4   
[25] RApiDatetime_0.0.4 digest_0.6.20      pkgconfig_2.0.2    htmltools_0.3.6   
[29] htmlwidgets_1.3    rlang_0.4.0        ggthemes_4.2.0     readxl_1.3.1      
[33] rstudioapi_0.10    pryr_0.1.4         shiny_1.3.2        visNetwork_2.0.7  
[37] generics_0.0.2     zoo_1.8-6          jsonlite_1.6       rgexf_0.15.3      
[41] RCurl_1.95-4.12    magrittr_1.5       rapportools_1.0    Rcpp_1.0.1        
[45] munsell_0.5.0      viridis_0.5.1      eventdataR_0.2.0   yaml_2.2.0        
[49] stringi_1.4.3      plyr_1.8.4         shinyTime_1.0.0    grid_3.6.1        
[53] parallel_3.6.1     promises_1.0.1     edeaR_0.8.2        crayon_1.3.4      
[57] bupaR_0.4.2        miniUI_0.1.1.1     lattice_0.20-38    haven_2.1.1       
[61] pander_0.6.3       summarytools_0.9.3 hms_0.5.0          magick_2.0        
[65] zeallot_0.1.0      pillar_1.4.2       tcltk_3.6.1        igraph_1.2.4.1    
[69] processmapR_0.3.3  codetools_0.2-16   XML_3.98-1.20      glue_1.3.1        
[73] downloader_0.4     RcppParallel_4.4.3 modelr_0.1.4       vctrs_0.2.0       
[77] httpuv_1.5.1       cellranger_1.1.0   gtable_0.3.0       assertthat_0.2.1  
[81] mime_0.7           skimr_1.0.7        xtable_1.8-4       broom_0.5.2       
[85] later_0.8.0        viridisLite_0.3.0  Rook_1.1-1         DiagrammeR_1.0.1  
[89] brew_1.0-6

Problems by using visNetwork as output format

Hello everyone,

I have a problem using visNetwork as the output format.
All the structure and colors get lost when switching to this kind of export format.
Even the numbers on the edges get lost.

Do I have to rebuild the design when switching to visNetwork, or is this a bug?
Thanks for your help!

Greetings,
Niklas

could not find function "set_global_graph_attrs"

Hey Guys,

after updating DiagrammeR to version 1.0.0, I have a problem creating a process map with the processmapR package, which tries to use one of the removed functions:

Log2 %>%
process_map()
Error in set_global_graph_attrs(., attr = "rankdir", value = "LR", attr_type = "graph") :
could not find function "set_global_graph_attrs"

process_map

Hi, Gert. I found that the process_map function doesn't work well when type = performance().

Nodes are labelled with numbers instead of activity names, and the starting node appears to be a final node.

[screenshot: process map with numbered nodes]
