Coder Social home page Coder Social logo

apted's People

Contributors

joaofelipe avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

apted's Issues

Tree is dict and list ?

My tree is include dict and list, for example: data1 = {"a": [{"b": []}, {"c": [{"h":[]}]}]}, data2 = {"a": [{"b": [{"m":[]}]}, {"c": [{"h":[{"k":[]}]}]}]}, the edit distance always 1. The code:

from apted import APTED
from apted.helpers import Tree
data1 = {"a": [{"b": []}, {"c": [{"h":[]}]}]}
data2 = {"a": [{"b": [{"m":[]}]}, {"c": [{"h":[{"k":[]}]}]}]}
tree1 = Tree(data1)
tree2 = Tree(data2)
apted = APTED(tree1, tree2)
res = apted.compute_edit_distance()

Negative edit distances

I am trying to use this library to compare ASTs of python code (specifically, I'm comparing an incorrect solution to a problem with several correct solutions, to pick a closest one).

I created a custom Config to specify how to get children and node names out of an ast node(below). This seems to work, except that I sometimes get negative edit distances; and other times I get edit distances that are clearly too small.

Below is a complete example, including all my logic for dealing with python ASTs. It returns -8 for the edit distance. Please advise.

import ast
from apted import APTED, Config

# for the purpose of tree differencing, we consider both the ast.Node objects and various literals representing values, ids, etc. to be nodes in the AST.
# therefore, we assume that all nodes that are passed to functions in astConfig are either ast.Node objects, or literals of some kind
class AstConfig(Config):
    def get_node_name(node):
        if hasattr(node, '_fields'):
            # this is an ast.Node object - return its type
            return type(node).__name__
        else:
            # this must be a literal, e.g. variable name, const value, etc. 
            return str(node)


    def children(self, node):
        if hasattr(node, '_fields'):
            # this is a Node object. It has all its children listed in _fields.
            # a child could be:
            # a) a literal, e.g. the name of the variable
            # b) a list of nodes, e.g. list of expressions in the body of a function
            # c) a direct child node.
            # we combine these into one flat list. 
            children = []
            for childname in node._fields:
                child = getattr(node, childname)
                if isinstance(child, list):
                    children.extend(child)
                else:
                    children.append(child)
            return children
        else:
            # this node itself is a literal; it has no children.
            return []


    def rename(self, node1, node2):
        name1 = AstConfig.get_node_name(node1)
        name2 = AstConfig.get_node_name(node2)
        return 1 if name1 != name2 else 0

# test apted+ast:
p1 = ast.parse('''def helloWorld():
    print ('Hello World!')''')
p2 = ast.parse('''def helloWorld():
    return 'Hello World!'
''')
apted = APTED(p1, p2, AstConfig())
print(apted.compute_edit_distance())
print(apted.compute_edit_mapping())

The mappings do seem to be reasonable, and picking the smallest edit distance does seem to choose the closest correct solution; so whatever is going on, it at least seems to be consistent across different pairings.

How do you get the "Maximum" possible TED ?

How do you get the "Maximum" possible TED ?

The TED is integer.
I want to get in some way the "maximum" TED between two trees.

The reason is I need a score with value between 0 and 1 i.e. score = distance/max

Is this feasible ?

Question about how to normalize the tree edit distance

Hi there

Thanks for your excellent repo!

My question is that if I want to normalize the computed tree edit distance into a certain range (e.g., between 0 and 1), what is the best way to do that? Or more mathematically, what is the upper bound of tree edit distance between arbitrary two trees?

Thanks in advance:)

Documentation

Hello!

Thank you for making this! Do you happen to have a programmatic hello world example?
I want to try the algorithm, but I'm not sure how to go about creating a tree in python
to hand off to your edit distance calculation.

Thank you :)

The Tree is a list ?

In the docs you say : class CustomConfig(Config):

  class CustomConfig(Config): ...

but my tree is :

  class Tree(list): ....

how do I solve that ?

Example :


['root',
 [['comam', [['qecer', [['dsvek', [['t', ['t']], ['h', ['h']]]], ['c', ['c']]]], ['e', ['e']]]],
  ['oflgd', [['yvsro', [['dptbp', [['d', ['d']], ['d', ['d']]]], ['tjbpc', [['c', ['c']], ['g', ['g']]]]]], ['f', ['f']]]],
  ['fqtgd', [['yvsro', [['dptbp', [['d', ['d']], ['d', ['d']]]], ['tjbpc', [['c', ['c']], ['g', ['g']]]]]], ['t', ['t']]]],
  ['dablv',
   [['yvsro', [['dptbp', [['d', ['d']], ['d', ['d']]]], ['tjbpc', [['c', ['c']], ['g', ['g']]]]]], ['qecer', [['dsvek', [['t', ['t']], ['h', ['h']]]], ['c', ['c']]]]]]]]


Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.