
felixleopoldo / benchpress

A Snakemake workflow to run and benchmark structure learning (a.k.a. causal discovery) algorithms for probabilistic graphical models.

Home Page: https://benchpressdocs.readthedocs.io

License: GNU General Public License v2.0

R 39.21% Python 45.20% Shell 3.97% Dockerfile 0.54% TeX 11.08%
graphical-models bayesian-networks markov-networks benchmarking reproducible-research machine-learning snakemake-workflow structure-learning causal-discovery causal-models

benchpress's People

Contributors

aditya003singh, alex-markham, dependabot[bot], felixleopoldo, jackkuipers, jcussens, melmasri, rocafuerte, yasu-sh


benchpress's Issues

Boxplot

Enable the use of box plots in the ROC curves.

Tetrad JSON graph structure may have the arrow-tail direction as "<--"

isdirected <- ((e$endpoint1 == "TAIL") && (e$endpoint2 == "ARROW")) | ((e$endpoint2 == "TAIL") && (e$endpoint1 == "ARROW"))

Currently the code only checks whether an edge is directed.
But Tetrad with bootstrapping may also output the opposite arrow direction as an edge: "<--".

The code below may avoid this potential problem.
In my experience, Tetrad outputs only one direction, "-->", except when bootstrapping.
If this behaviour is intentional, feel free to let me know.

    # No CIRCLE endpoint check; only TAIL/ARROW combinations are handled.
    isdirected1to2 <- (e$endpoint1 == "TAIL") && (e$endpoint2 == "ARROW")
    isdirected2to1 <- (e$endpoint1 == "ARROW") && (e$endpoint2 == "TAIL")

    if (isdirected1to2) {
      m[node1_ind, node2_ind] <- 1
    } else if (isdirected2to1) {
      # Covers the "<--" output that bootstrapping can produce.
      m[node2_ind, node1_ind] <- 1
    } else {
      # Neither direction matched: treat as undirected and mark both entries.
      m[node1_ind, node2_ind] <- 1
      m[node2_ind, node1_ind] <- 1
    }

Support for arm64 architecture

The Docker images are currently built for the amd64 architecture, but arm64 images should also be available so that Benchpress can be used (through Docker) on e.g. Apple M1/M2 machines.

SHD

Plot the SHD metric.

Write and plot edge weights

Some algorithms, like NOTEARS, estimate edge weights/parameters. It should be possible to access these. This can be done by adding another output field, edge_weights, to the rules corresponding to these algorithms. For MCMC algorithms there should be a general converter rule that creates an estimate of the edge probabilities.
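Such a converter rule could be sketched as follows. This is a hypothetical illustration, not Benchpress's actual rule: edge probabilities are estimated by averaging the trajectory's adjacency-matrix samples after an (assumed) burn-in.

```python
import numpy as np

# Hypothetical sketch of a converter: estimate edge probabilities from an MCMC
# trajectory of adjacency-matrix samples by averaging them after a burn-in.
# The function name and burn-in handling are illustrative.
def edge_probabilities(trajectory, burn_in=0):
    samples = np.asarray(trajectory[burn_in:], dtype=float)
    return samples.mean(axis=0)

trajectory = [
    np.array([[0, 1], [0, 0]]),
    np.array([[0, 1], [0, 0]]),
    np.array([[0, 0], [0, 0]]),
]
probs = edge_probabilities(trajectory)  # probs[0, 1] is 2/3
```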

Failed to find loop device: could not attach image file to loop device: no loop devices available

@felixleopoldo Thanks for your work on Benchpress. It works fine on my machines.
I was using Benchpress (and hitting these issues) on WSL2 under Windows 11 with Docker, so I moved to a native Ubuntu machine with the Docker image, following the instructions in the Benchpress manual.

One symptom in some setups is worth reporting.

  • Countermeasure 1: reduce the number of cores - worked.
    • snakemake --cores 4 --use-singularity --configfile config/config.json
  • Countermeasure 2: install on Linux (Ubuntu) on WSL - worked!
    • Used Miniforge (Mambaforge) instead of Miniconda, since Miniconda corrupted the base conda environment.
    • Apptainer version 1.2.3 installed as a non-setuid installation

I guess the root cause has not been solved yet, but in some cases updating resolves it.
I hope my report helps other Benchpress users.
sylabs/singularity#67 <- the same symptoms.

When I run the command below:

(snakemake) root@:/mnt# snakemake --cores all --use-singularity --configfile config/config.json

Then I faced the error below. Both setups showed the same symptom: there were not enough loop devices.

  • Win11 on Docker Desktop with WSL2, volume mounted on the Windows file system
    PS > docker run -it -w /mnt --privileged -v F:/benchpress:/mnt bpimages/snakemake:v7.32.3

  • Win11 on Docker Desktop with WSL2, volume mounted on the WSL2 file system
    docker run -it -w /mnt --privileged --name bntab -v /home/path/benchpress:/mnt bpimages/snakemake:v7.32.3

  • Symptom:

(omit)
[Fri Sep 29 05:17:07 2023]
Finished job 112.
1 of 346 steps (0.3%) done
FATAL:   container creation failed: mount /proc/self/fd/3->/opt/conda/envs/snakemake/var/singularity/mnt/session/rootfs error: while mounting image /proc/self/fd/3: failed to find loop device: could not attach image file to loop device: no loop devices available

(snakemake) root@d6f240d00620:/mnt# lscpu 
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
Address sizes:       48 bits physical, 48 bits virtual
CPU(s):              12
On-line CPU(s) list: 0-11
Thread(s) per core:  2
Core(s) per socket:  6
Model name:          AMD Ryzen 5 5500
CPU MHz:             3593.164

Memory 64GB / GPU nvidia 8GB

On the working setup, the number of loop devices had increased from 8 to 255:

(snakemake) root@:/mnt# ls /dev/loop*
/dev/loop-control  /dev/loop119  /dev/loop140  /dev/loop162  /dev/loop184  /dev/loop205  /dev/loop227  /dev/loop249  /dev/loop40  /dev/loop62  /dev/loop84
/dev/loop0         /dev/loop12   /dev/loop141  /dev/loop163  /dev/loop185  /dev/loop206  /dev/loop228  /dev/loop25   /dev/loop41  /dev/loop63  /dev/loop85
(omit)
/dev/loop118       /dev/loop14   /dev/loop161  /dev/loop183  /dev/loop204  /dev/loop226  /dev/loop248  /dev/loop4    /dev/loop61  /dev/loop83

Variable naming convention.

When running the diffplot method to compare graphs, line L127 of this file

compares the true graph to the estimated one. This throws an error like:

Error in check.nodes(.nodes(custom[[i]]), graph = nodes, min.nodes = length(nodes),  : 
  invalid node(s) 'X0' 'X1' 'X2' 'X3' 'X4' 'X5' 'X6' 'X7' 'X8' 'X9' 'X10' 'X11' 'X12' 'X13' 'X14' 'X15' 'X16' 'X17' 'X18' 'X19' 'X20' 'X21' 'X22' 'X23' 'X24' 'X25' 'X26' 'X27' 'X28' 'X29' 'X30' 'X31' 'X32' 'X33' 'X34' 'X35' 'X36' 'X37' 'X38' 'X39' 'X40' 'X41' 'X42' 'X43' 'X44' 'X45' 'X46' 'X47' 'X48' 'X49' 'X50' 'X51' 'X52' 'X53' 'X54' 'X55' 'X56' 'X57' 'X58' 'X59' 'X60' 'X61' 'X62' 'X63' 'X64' 'X65' 'X66' 'X67' 'X68' 'X69' 'X70' 'X71' 'X72' 'X73' 'X74' 'X75' 'X76' 'X77' 'X78' 'X79' 'X80' 'X81' 'X82' 'X83' 'X84' 'X85' 'X86' 'X87' 'X88' 'X89' 'X90' 'X91' 'X92' 'X93' 'X94' 'X95' 'X96' 'X97' 'X98' 'X99'.
Calls: benchmarks ... graphviz.compare -> check.customlist -> check.nodes
Execution halted

The issue here is that names(pattern_true_bn$nodes) are named ["0", "1", "2", ..., "99"] while names(pattern_estimated_bn$nodes) are named ["X0", "X1", "X2", ...], so the comparison throws an error since the names do not match.

Here is a printout of those names:

[1] "names 1"
  [1] "0"  "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13" "14"
 [16] "15" "16" "17" "18" "19" "20" "21" "22" "23" "24" "25" "26" "27" "28" "29"
 [31] "30" "31" "32" "33" "34" "35" "36" "37" "38" "39" "40" "41" "42" "43" "44"
 [46] "45" "46" "47" "48" "49" "50" "51" "52" "53" "54" "55" "56" "57" "58" "59"
 [61] "60" "61" "62" "63" "64" "65" "66" "67" "68" "69" "70" "71" "72" "73" "74"
 [76] "75" "76" "77" "78" "79" "80" "81" "82" "83" "84" "85" "86" "87" "88" "89"
 [91] "90" "91" "92" "93" "94" "95" "96" "97" "98" "99"
[1] "names 2"
  [1] "X0"  "X1"  "X2"  "X3"  "X4"  "X5"  "X6"  "X7"  "X8"  "X9"  "X10" "X11"
 [13] "X12" "X13" "X14" "X15" "X16" "X17" "X18" "X19" "X20" "X21" "X22" "X23"
 [25] "X24" "X25" "X26" "X27" "X28" "X29" "X30" "X31" "X32" "X33" "X34" "X35"
 [37] "X36" "X37" "X38" "X39" "X40" "X41" "X42" "X43" "X44" "X45" "X46" "X47"
 [49] "X48" "X49" "X50" "X51" "X52" "X53" "X54" "X55" "X56" "X57" "X58" "X59"
 [61] "X60" "X61" "X62" "X63" "X64" "X65" "X66" "X67" "X68" "X69" "X70" "X71"
 [73] "X72" "X73" "X74" "X75" "X76" "X77" "X78" "X79" "X80" "X81" "X82" "X83"
 [85] "X84" "X85" "X86" "X87" "X88" "X89" "X90" "X91" "X92" "X93" "X94" "X95"
 [97] "X96" "X97" "X98" "X99"
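One possible workaround, sketched below, is to map the true graph's node names onto the estimated graph's convention before calling graphviz.compare, so both graphs share one naming scheme. The renaming rule ("prefix with X") is an assumption drawn from the printout above.

```python
# Hypothetical sketch: rename the true graph's nodes ("0", "1", ...) to the
# estimated graph's convention ("X0", "X1", ...) before comparing, so the
# node-name check no longer fails. The "X" prefix rule is an assumption.
true_names = [str(i) for i in range(100)]
estimated_names = ["X%d" % i for i in range(100)]

renamed_true = ["X" + name for name in true_names]
```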

Set output directory in conf

The benchmark_setup section of the config file should have a key called output_dir specifying where the output of the evaluation modules should be saved. Thus everything currently saved in output should instead be saved in output/config["benchmark_setup"]["output_dir"]. When running a config file, the config file itself should also be saved there.
One would basically just have to change the output part of rules.smk in the evaluation modules, for example this line and the ones below it.
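A minimal sketch of what such a config could look like (the key name and directory value here are assumptions based on the description above, not the current schema):

```json
{
  "benchmark_setup": {
    "output_dir": "my_experiment"
  }
}
```

With this, everything now written to output/ would instead go to output/my_experiment/, together with a copy of the config file itself.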

Compress MCMC trajectories

MCMC trajectories should be compressed into tarballs to save disk space.
This can be implemented with an additional rule that compresses a trajectory; another rule should extract it. The output from the MCMC algorithms should be rule-temporary CSV files (Snakemake's temp()).
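The two proposed rules could be sketched as follows, assuming a trajectory stored as a CSV. File names and the tarball layout are illustrative, not Benchpress's actual paths.

```python
import os
import tarfile
import tempfile

# Hypothetical sketch of the two rules: one compresses a trajectory CSV into a
# gzipped tarball, the other extracts it again. File names are illustrative.
with tempfile.TemporaryDirectory() as d:
    csv_path = os.path.join(d, "trajectory.csv")
    with open(csv_path, "w") as f:
        f.write("iteration,score\n0,-123.4\n")

    # Compression rule: pack the CSV and drop the raw file (rule-temporary).
    tar_path = os.path.join(d, "trajectory.tar.gz")
    with tarfile.open(tar_path, "w:gz") as tar:
        tar.add(csv_path, arcname="trajectory.csv")
    os.remove(csv_path)

    # Extraction rule: recover the CSV when an evaluation module needs it.
    with tarfile.open(tar_path, "r:gz") as tar:
        tar.extractall(d)
    restored = open(csv_path).read()
```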

Difference adjacency matrix plots

In the graph_plots module, in case the true graph is provided, we should also plot a difference matrix plot, similar to the adjacency matrix plots, where correct, missing, and false edges are indicated in black, blue, and red, respectively.
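The edge classification behind such a plot can be sketched as below (toy matrices, not the module's actual code): each cell is compared between the true and estimated adjacency matrices.

```python
import numpy as np

# Hypothetical sketch: classify each potential edge as correct (in both
# graphs), missing (true graph only), or false (estimate only); these would
# be colored black, blue, and red, respectively.
true_adj = np.array([[0, 1, 0],
                     [0, 0, 1],
                     [0, 0, 0]])
est_adj = np.array([[0, 1, 0],
                    [0, 0, 0],
                    [1, 0, 0]])

correct = (true_adj == 1) & (est_adj == 1)      # black
missing = (true_adj == 1) & (est_adj == 0)      # blue
false_edges = (true_adj == 0) & (est_adj == 1)  # red
```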

True graph skeleton as optional algorithm input

In other words: create an undirected graph from the true DAG (or any graph) and pass it to an algorithm as input. Sampled data should be passed as input as well (as usual). It is important to note that the passed skeleton is the true undirected graph, not an estimate.

The reason is to be able to test pairwise algorithms, such as these.

Usually, pairwise methods are tested only on (X, Y) datasets, but testing them on bigger graphs (more than 2 nodes) is arguably more interesting and challenging. This, however, requires providing the algorithms with a starting point in the form of the graph's skeleton; the task then boils down to orienting the edges. The final product is a fully oriented graph (it can have cycles), so most, if not all, of the existing metrics can be used without issues.

For an example, see section 5 of [1] (sections 5.2 and 5.4 specifically).

[1] O. Goudet, D. Kalainathan, P. Caillou, D. Lopez-Paz, I. Guyon, and M. Sebag, ‘Learning Functional Causal Models with Generative Neural Networks’, Springer International Publishing, 2018. doi: 10.1007/978-3-319-98131-4.
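Deriving the true skeleton from a DAG's adjacency matrix can be sketched by symmetrizing it, as below (toy matrix; how the skeleton is actually passed to an algorithm module is not specified here):

```python
import numpy as np

# Hypothetical sketch: obtain the true skeleton from a DAG's adjacency matrix
# by symmetrizing it; the pairwise algorithm then only has to orient these
# edges.
dag = np.array([[0, 1, 0],
                [0, 0, 1],
                [0, 0, 0]])
skeleton = ((dag + dag.T) > 0).astype(int)
```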

The ground truth adjacency matrix's column order might need to match the dataset's column order.

I suspect this because I observed large SHD numbers without bootstrapping, even though the Tetrad results with bootstrapping look reasonable to my eye in the plots.
If this is true, it is important for users to know.

Dataset: alarm (generated from bnlearn by me)
left: ground truth / center: without bootstrapping / right: with bootstrapping = 5
[image]

Diffplot
[image]

Graph structure
[image]

Estimated graph as input

It would be nice to have a reserved field name, e.g. input_graph_id, that could be used to pass an estimated graph to an algorithm via the algorithm object's ID. This is already done in some algorithm modules, but it is not as easy as using a reserved field in the JSON config.
Looking at this implementation of the gobnilp module/rule:

startgraph_file would correspond to something like input_graph_file, and wildcards["startalg"] here would be wildcards["input_graph_id"]. The idea is to make this feature available seamlessly in any algorithm module by adding the input_graph_id field in the config file.
To do that, something similar to this part of the code should be evaluated for any algorithm having input_graph_id in the config file:
items["startalg"] = idtopath(items["startalg"], json_string)
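The generic handling could be sketched as follows. idtopath exists in the code base, but the stub below only stands in for it, and the path layout is purely illustrative.

```python
# Hypothetical sketch of the proposed generic handling: for any algorithm
# object in the config that carries the reserved field, resolve the
# referenced object's ID to a file path.
def idtopath(obj_id, json_string):
    # Stand-in for the real helper; the path layout is illustrative only.
    return "results/" + obj_id + "/adjmat.csv"

def resolve_input_graph(items, json_string):
    if "input_graph_id" in items:
        items["input_graph_id"] = idtopath(items["input_graph_id"], json_string)
    return items

items = resolve_input_graph({"input_graph_id": "pc-gauss"}, "{}")
```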

Timing

Proper timing for all algorithms. There is already support for timing, but the times are not set for all algorithms.

Annotations to ROC curves

When the ROC plots contain many IDs, the curves tend to be hard to distinguish by color alone. Annotations in the plots should make this easier.

Data preprocessing

There should be a data preprocessing field in the benchmark_setup section. This could e.g. handle modules that normalize, pollute or discretize data.

Using custom dataset

I'm interested in running Benchpress with my own dataset, which has no solution graph, and I would specifically like to use the notears and golem algorithms for research purposes.

I've already set this data configuration:

            "graph_id": null,

            "parameters_id": null,

            "data_id": "insilico.csv",

            "seed_range": null

but I don't know how I should configure the evaluation section so that the config validator doesn't show the error "ROC evaluation requires graph_id.".

I started from the gcastle.json configuration.

It may be that I'm missing something or don't fully understand the environment, sorry in advance!

Thank you for your help!

Score plots

Save the scores for the score-based algorithms and plot them.
Scores are probably best saved in separate files, as the timings are.
In the plots, results for different seeds should not be mixed up, so the score of one method should be used as the benchmark.

Readme file for the datasets

There should be a README.rst in the resources/data/mydatasets folder that contains one (or several) tables describing the datasets, with info such as

  • Title
  • Filename
  • Dimension
  • Number of observations
  • Datatype
  • Underlying graph (if applicable)
  • Description

When running parallelDG, the autocorrelation doesn't plot properly

You get a flat picture for the size/score autocorrelation when running
snakemake --configfile config/parallelDG.json --cores all --use-singularity

The issue seems to come from this line of code. The ffill routine fills forward, but when the first index in your data is not 0, the re-indexing in this line

        df2 = df2.reindex(newindex).reset_index().reindex(
            columns=df2.columns).fillna(method="ffill")

creates index 0 with NaN.

Here is an example

>>> df2
       size
index      
2         1
4         3
6         6
7        10
8        15
...     ...
99978   197
99980   196
99988   195
99996   194
99998   195

[21120 rows x 1 columns]
>>> df2.reindex(newindex).reset_index()
       index   size
0          0    NaN
1          1    NaN
2          2    1.0
3          3    NaN
4          4    3.0
...      ...    ...
99993  99993    NaN
99994  99994    NaN
99995  99995    NaN
99996  99996  194.0
99997  99997    NaN

[99998 rows x 2 columns]
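The symptom and one possible fix can be sketched with toy data (not the real trajectory): after reindexing, ffill leaves NaN before the first observed index; back-filling the leading gap is one way to handle it.

```python
import pandas as pd

# Minimal sketch of the symptom: after reindexing, forward-fill leaves NaN
# at every index before the first observed one (here 0 and 1).
df2 = pd.DataFrame({"size": [1, 3, 6]}, index=[2, 4, 6])
newindex = range(8)

filled = df2.reindex(newindex).ffill()         # index 0 and 1 stay NaN
# One possible fix (an assumption, not the project's chosen solution):
# back-fill the leading gap after forward-filling.
fixed = df2.reindex(newindex).ffill().bfill()
```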

Custom Dataset format

Hello,

if we want to use our own dataset (tabular CSV format), what should the format of the dataset be?

Thanks

Summarize Iterative search

Bug when summarising iterative search

Error in dag2essgraph(g) : Invalid graph passed to replaceUnprotected().
Calls: <Anonymous> -> dag2essgraph
Execution halted

Full Traceback (most recent call last):
  File "/users/staff/dmi-dmi/rios0000/anaconda3/envs/benchmark/lib/python3.8/site-packages/snakemake/executors.py", line 2141, in run_wrapper
    run(
  File "/users/staff/dmi-dmi/rios0000/git/benchpress/workflow/rules/algorithm_rules.smk", line 588, in __rule_summarise_itsearch
  File "/users/staff/dmi-dmi/rios0000/anaconda3/envs/benchmark/lib/python3.8/site-packages/snakemake/shell.py", line 176, in __new__
    raise sp.CalledProcessError(retcode, cmd)

Allow for different parameters for json objects from same algorithm or module

The same algorithm can be parameterized in different ways depending on e.g. the score function. This is currently handled by setting some values to null. One way to solve this could be to generate pattern strings from each algorithm object and then generate the Snakemake rules in a for loop using the pattern strings.

A bug in creating the adj mat

This line is buggy:

m = nx.to_numpy_matrix(g) - np.identity(g.order())

It should be

m = nx.to_numpy_matrix(g) 

As you can see in the code below, you are removing the diagonal. Is this intended? In trilearn:

g = dlib.gen_AR_graph(10, width=2)

m = nx.to_numpy_matrix(g) - np.identity(g.order())
m
matrix([[1., 1., 1., 0., 0., 0., 0., 0., 0., 0.],
        [1., 1., 1., 1., 0., 0., 0., 0., 0., 0.],
        [1., 1., 1., 1., 1., 0., 0., 0., 0., 0.],
        [0., 1., 1., 1., 1., 1., 0., 0., 0., 0.],
        [0., 0., 1., 1., 1., 1., 1., 0., 0., 0.],
        [0., 0., 0., 1., 1., 1., 1., 1., 0., 0.],
        [0., 0., 0., 0., 1., 1., 1., 1., 1., 0.],
        [0., 0., 0., 0., 0., 1., 1., 1., 1., 1.],
        [0., 0., 0., 0., 0., 0., 1., 1., 1., 1.],
        [0., 0., 0., 0., 0., 0., 0., 1., 1., 1.]])
nx.to_numpy_matrix(g) - np.identity(g.order())
matrix([[0., 1., 1., 0., 0., 0., 0., 0., 0., 0.],
        [1., 0., 1., 1., 0., 0., 0., 0., 0., 0.],
        [1., 1., 0., 1., 1., 0., 0., 0., 0., 0.],
        [0., 1., 1., 0., 1., 1., 0., 0., 0., 0.],
        [0., 0., 1., 1., 0., 1., 1., 0., 0., 0.],
        [0., 0., 0., 1., 1., 0., 1., 1., 0., 0.],
        [0., 0., 0., 0., 1., 1., 0., 1., 1., 0.],
        [0., 0., 0., 0., 0., 1., 1., 0., 1., 1.],
        [0., 0., 0., 0., 0., 0., 1., 1., 0., 1.],
        [0., 0., 0., 0., 0., 0., 0., 1., 1., 0.]])

Looking at the code downstream, this shouldn't make a difference, because you are using the cov_matrix function, which takes in the graph rather than the adjacency matrix. However, I found that in simulations this actually makes a difference.

If you generate the same simulation with the two different adjacency matrices, you get a substantially different true graph. I'll look into it later; I am just noting it here.

Problem with time limit

It's basically working, but I find that when I hit the time limit, rather than getting "None" in my time file, the file is empty. I'm not sure what is going on there. This is causing problems in combine_ROC_data.R, since we get an error:
Error in summarise():
ℹ In argument: time_median = median(time).
ℹ In group 1: id = "gobnilp-neat-bge", adjmat = "pcalg_randdag/max_parents=5/n=20/d=4/par1=None/par2=None/method=er/DAG=True", parameters = "sem_params/min=0.25/max=1", data = "iid/n=5000/standardized=True", alpha_mu = 0.01.
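A workaround on the reading side could be sketched as below: treat an empty time file the same as one containing "None", so the aggregation step can skip it instead of failing. The function and file names are illustrative assumptions, not the project's actual code.

```python
import os
import tempfile

# Hypothetical sketch: read a time file, treating an empty file (time limit
# hit) the same as an explicit "None". Names here are illustrative.
def read_time(path):
    with open(path) as f:
        content = f.read().strip()
    return None if content in ("", "None") else float(content)

with tempfile.TemporaryDirectory() as d:
    empty_file = os.path.join(d, "time_hit_limit.txt")
    open(empty_file, "w").close()  # time limit hit: file left empty
    ok_file = os.path.join(d, "time_ok.txt")
    with open(ok_file, "w") as f:
        f.write("12.5\n")
    t_limit, t_ok = read_time(empty_file), read_time(ok_file)
```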

Separate plots instead of ggplot facet_wrap

It would be better to plot e.g. the ROC curves in separate plots instead of using facet_wrap, since it often happens that the title doesn't fit into the figure and there can be too many plots.
