Trepan Reloaded

An extended version of the Trepan algorithm (Trepan Reloaded) that takes ontologies into account when deciding which feature to use in the conditions of split nodes.

The decision is based on a modified version of the information gain measure (used when the decision tree is inductively extracted from the neural network) that takes into account the degree of generality of features.
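The idea of weighting information gain can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy labels, and the weight values are hypothetical, the multiplicative form IG' = IG * IC is the one quoted in the issues further down, and the actual C implementation may compute the weight and the gain differently.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(labels, branches):
    """Entropy reduction obtained by partitioning `labels` into `branches`."""
    total = len(labels)
    remainder = sum(len(b) / total * entropy(b) for b in branches)
    return entropy(labels) - remainder


def weighted_gain(labels, branches, weight):
    """Hypothetical helper: scale the gain by a per-feature weight (IG' = IG * IC)."""
    return weight * information_gain(labels, branches)


# Toy data (assumed): the same perfect split scored with two different weights.
labels = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]

print(weighted_gain(labels, perfect_split, 1.0))
print(weighted_gain(labels, perfect_split, 0.25))
```

With such a scheme, two features that separate the classes equally well can still be ranked differently, because the feature with the larger ontology-derived weight ends up with the larger modified gain.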

The Trepan Reloaded implementation consists of three parts:

  • Python scripts that pre-process a given dataset, build and train a feed-forward neural network (FFNN) model, export the network's weights and biases, and create all the files needed by Trepan (quite a few of them!).

  • A Java library, libs/ontology-wrapper.jar, implemented using the OWL API 5.0 (plus other ontology utilities), that computes the degree of generality of features associated with concepts defined in an ontology.

  • An extended version of the Trepan algorithm that takes into account the weights associated with the features in a dataset.

Requirements

Python scripts

[Only needed if you want to use new datasets]

  • graphviz >= 0.10.1
  • numpy >= 1.15.4
  • pandas >= 0.24.0
  • scikit-learn >= 0.20.2
  • torch >= 0.4.1

Java library

[Only needed if you want to use new datasets and new ontologies]

The Java library is called from the C implementation through JNI (the jni.h header). For this reason, the following dependency must be made explicit in the makefile of the Trepan C distribution:

  • JDK >= 8

Trepan Reloaded

This version of Trepan has been tested only on macOS, with the following gcc compiler:

gcc --version
Configured with: --prefix=/Library/Developer/CommandLineTools/usr --with-gxx-include-dir=/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/c++/4.2.1
Apple LLVM version 10.0.1 (clang-1001.0.46.3)
Target: x86_64-apple-darwin18.5.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

Running

To run the current version of Trepan Reloaded, first compile it (after setting the Java path correctly):

sh trepan-compile.sh

Then run one of the two demos, heart or loan, as follows.

Running the heart example

./trepan heart_dataset.cmd

Running the loan example

./trepan loan_dataset.cmd

The results can be found in examples/heart_dataset_example and examples/loan_dataset_example. They consist of:

  • name_dataset.dot: a Graphviz representation of the decision tree extracted by Trepan.
  • name_dataset.fidelity.pruned: the accuracy and fidelity of the decision tree extracted by Trepan, together with some understandability measures based on the syntactic complexity of the tree (e.g., number of leaves, number of branches).
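Fidelity is commonly defined as the fraction of instances on which the extracted tree agrees with the network, whereas accuracy compares the tree with the true labels. The Python sketch below illustrates that distinction. The function names and the toy predictions are assumptions made for illustration; the exact computation performed by Trepan Reloaded and the layout of the .fidelity.pruned file may differ.

```python
def agreement(predictions, reference):
    """Fraction of positions where two equally long label sequences match."""
    if len(predictions) != len(reference):
        raise ValueError("sequences must have the same length")
    if not predictions:
        raise ValueError("sequences must not be empty")
    matches = sum(p == r for p, r in zip(predictions, reference))
    return matches / len(predictions)


def accuracy(tree_predictions, true_labels):
    """Agreement between the extracted tree and the ground truth."""
    return agreement(tree_predictions, true_labels)


def fidelity(tree_predictions, network_predictions):
    """Agreement between the extracted tree and the network it explains."""
    return agreement(tree_predictions, network_predictions)


# Toy data (assumed): five instances with binary class labels.
true_labels = [1, 0, 1, 1, 0]
network_predictions = [1, 0, 0, 1, 0]
tree_predictions = [1, 0, 0, 1, 1]

print(accuracy(tree_predictions, true_labels))          # 0.6
print(fidelity(tree_predictions, network_predictions))  # 0.8
```

In this toy case the tree reproduces the network's behaviour more closely than it matches the ground truth, which is why the two numbers are reported separately.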

Reference

  • Confalonieri, R., et al. Trepan Reloaded: A Knowledge-driven Approach to Explaining Black-box models. In Proc. of ECAI 2020.

trepan_reloaded's Issues

Generating attribute values file is different from the example file uploaded

While trying to generate the files needed by Trepan using writeTrepanAttributesValuesFile in FFN from generate-trepan-network.py (or demoutils.py, which has the same generate_trepan_input_descriptors_and_nominals), I get an .attr.value file that differs from the files in the uploaded examples. The uploaded examples contain some kind of mapping to the ontology file, such as age R hasAge, whereas I get age R age. Is there another part of the code that is used to generate these files?

Moreover, invoke_class, which uses the Java library libs/ontology-wrapper.jar, is not used anywhere. Is the information content calculated some other way, or am I missing something?

Missing `TrepanDatasetType` package

Hello everyone,

First of all, thank you for making your paper's implementation open source. I managed to run the examples you provided, and everything ran smoothly.

However, after installing the Python requirements and trying to run your code on my own dataset, I get the error No module named TrepanDatasetType from here: https://github.com/rconfalonieri/trepan_reloaded/blob/main/utils/pytorchmodels/threshold-xor.py#L8

Is there a way to get access to this package? I could not find anything about it online.

Thank you in advance.

Best regards,
Majlinda Llugiqi

Output changes when I run cmd without ontology using different IC

Hello,

The cmd files contain set use_ontology, which should indicate whether the ontology is used or not. However, when I use set use_ontology 0, which means that the ontology is not used, and then change the information content (.onto) file, the output is different. This suggests that the ontology (the information content from the .onto file) is somehow still used, even though I set use_ontology 0.

Is there something I am missing?

Thank you!

Information Content for Real Values inconsistent

Hello,

I am testing Trepan Reloaded to explain different trained neural networks on a few datasets, and I have gotten some inconsistent results when using an ontology with Real Values present (this problem also occurs with the example in the repository).

When launching Trepan Reloaded, the information content used for Real Values is a completely different number from the one in the .onto file. This number isn't present in the .onto file at all. So where do these numbers come from? Does the C module calculate a different information content from the one produced by the ontology-wrapper?

Thanks in advance for the answers

Using personal ontology

Hello,

I was making some tests on your program and wanted to try this program with my own ontologies. The problem is, I don't quite understand how to do that.

I tried the command

java -jar ontology-wrapper.jar test.onto test.owl

but what I got in return was an error message:

Welcome to computeInformationContent
ontologyTrepanFile is: test.onto
ontologyPath is: test.owl
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
29

Do you know how to solve this issue?

Thanks in advance.

Where does the IG change take place? (IG'=IG*IC)

Hello,

I was wondering whether IG'=IG*IC is computed in the tree.c file. I saw the line split->gain = inf_content*split->gain; which does this, but when I change it, the output stays the same. Even if I change printf("\nevaluate splits: inf_content %f",split->frequency); to printf("\nevaluateeeee splits: inf_content %f",split->frequency); it still prints evaluate splits: ..., which indicates that the output does not come from this line. I couldn't find where it does come from, so I was wondering where you update the IG.

Best,
Majlinda
