A Minwise Hashing Method for Addressing Relationship Extraction from Text

License: GNU Lesser General Public License v3.0

Minwise Hashing for Relationship Extraction from Text

The MinHash-based Semantic Relationship Classifier (MuSICo) is an on-line approach for extracting semantic relationships from text, based on the idea of nearest neighbor classification.

Instead of learning a statistical model, it finds the relationship instances in a database that are most similar to the input sentence, and uses these similarities to decide whether the sentence holds a certain relationship type: the sentence is assigned the relationship type of its most similar instances.
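The nearest-neighbor decision described above can be sketched as follows. This is a minimal illustration, not MuSICo's actual code: the class `KnnVoteSketch`, the `Instance` type, the word-level feature sets, the example labels, and the majority-vote tie-breaking are all assumptions made for the example, and exact Jaccard similarity stands in for the hashed estimate that the real system uses.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of k-nearest-neighbor relationship classification.
class KnnVoteSketch {

    /** A stored example: its feature set (e.g. shingles) and its relationship type. */
    static final class Instance {
        final Set<String> features;
        final String label;

        Instance(Set<String> features, String label) {
            this.features = features;
            this.label = label;
        }
    }

    /** Exact Jaccard similarity: |A intersect B| / |A union B|. */
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0;
        }
        Set<String> inter = new HashSet<>(a);
        inter.retainAll(b);
        int union = a.size() + b.size() - inter.size();
        return (double) inter.size() / union;
    }

    /** Returns the majority label among the k stored instances most similar to the query. */
    static String classify(Set<String> query, List<Instance> database, int k) {
        List<Instance> sorted = new ArrayList<>(database);
        // Most similar instances first.
        sorted.sort(Comparator.comparingDouble(
                (Instance i) -> jaccard(query, i.features)).reversed());
        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        for (Instance inst : sorted.subList(0, Math.min(k, sorted.size()))) {
            int count = votes.merge(inst.label, 1, Integer::sum);
            // On a tie, the label that reached the count first is kept.
            if (best == null || count > votes.get(best)) {
                best = inst.label;
            }
        }
        return best; // null if the database is empty
    }

    public static void main(String[] args) {
        List<Instance> db = List.of(
                new Instance(Set.of("was", "born", "in"), "origin"),
                new Instance(Set.of("born", "in", "city"), "origin"),
                new Instance(Set.of("works", "for", "company"), "employer"));
        System.out.println(classify(Set.of("was", "born", "in", "lisbon"), db, 2));
    }
}
```

Comparing the query against every stored instance, as done here, is linear in the size of the database, which is the cost that hashing-based candidate retrieval is meant to avoid.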

The computation leverages min-hash and locality-sensitive hashing to measure the similarity between instances efficiently.
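To make the two techniques concrete, here is a minimal, generic sketch of min-hash signatures and LSH banding. It is not taken from the MuSICo sources: the class `MinHashLshSketch`, the linear hash family, the seed, and the band-key format are all assumptions. In particular, the `bands` argument here is the number of bands the signature is split into, which may not match how MuSICo interprets its own `#bands` parameter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical sketch: min-hash signatures plus LSH banding.
class MinHashLshSketch {
    private static final long PRIME = 2147483647L; // 2^31 - 1, a Mersenne prime

    private final long[] a; // multipliers, one per hash function
    private final long[] b; // offsets, one per hash function

    MinHashLshSketch(int numHashes, long seed) {
        Random rnd = new Random(seed);
        a = new long[numHashes];
        b = new long[numHashes];
        for (int i = 0; i < numHashes; i++) {
            a[i] = 1 + rnd.nextInt(Integer.MAX_VALUE - 1); // in [1, PRIME - 1]
            b[i] = rnd.nextInt(Integer.MAX_VALUE);         // in [0, PRIME - 1]
        }
    }

    /** Keeps, for each hash function h_i(x) = (a_i * x + b_i) mod PRIME, the minimum over the set. */
    long[] signature(Set<String> features) {
        long[] sig = new long[a.length];
        Arrays.fill(sig, Long.MAX_VALUE);
        for (String f : features) {
            long x = Math.floorMod((long) f.hashCode(), PRIME);
            for (int i = 0; i < a.length; i++) {
                long h = (a[i] * x + b[i]) % PRIME; // fits in a long: below 2^62 + 2^31
                if (h < sig[i]) {
                    sig[i] = h;
                }
            }
        }
        return sig;
    }

    /** Fraction of positions where two signatures agree; an estimate of Jaccard similarity. */
    static double estimate(long[] s1, long[] s2) {
        int same = 0;
        for (int i = 0; i < s1.length; i++) {
            if (s1[i] == s2[i]) {
                same++;
            }
        }
        return (double) same / s1.length;
    }

    /**
     * Splits a signature into `bands` chunks of equal length and turns each chunk into
     * a bucket key. Two instances are candidate neighbors if they share at least one key.
     * Assumes bands <= sig.length; entries left over after the last full chunk are ignored.
     */
    static List<String> bandKeys(long[] sig, int bands) {
        int rows = sig.length / bands;
        List<String> keys = new ArrayList<>();
        for (int band = 0; band < bands; band++) {
            long[] chunk = Arrays.copyOfRange(sig, band * rows, (band + 1) * rows);
            keys.add(band + ":" + Arrays.hashCode(chunk));
        }
        return keys;
    }

    public static void main(String[] args) {
        MinHashLshSketch mh = new MinHashLshSketch(400, 42L);
        long[] s1 = mh.signature(Set.of("was", "born", "in"));
        long[] s2 = mh.signature(Set.of("was", "born", "in", "lisbon"));
        // The exact Jaccard similarity of these two sets is 0.75.
        System.out.println("estimated similarity: " + estimate(s1, s2));
        List<String> shared = bandKeys(s1, 50);
        shared.retainAll(bandKeys(s2, 50));
        System.out.println("shared bands: " + shared.size());
    }
}
```

With a fixed signature length, using more bands makes each band shorter, so less similar pairs are more likely to collide in some band; fewer, longer bands make candidate retrieval stricter.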

Usage:

First, run ant to compile the source code, which should generate MuSICo.jar based on build.xml:

ant

All the external libraries needed by MuSICo are in the libs/ directory. You can then call MuSICo.jar with the following parameters, e.g.:

java -cp libs/*:MuSICo.jar bin.Main semeval true 400 50 5

Parameters:

MuSICo.jar bin.Main dataset true|false #min-hash-sigs #bands #kNN [train_file] [test_file]

dataset           one of: semeval, wiki, aimed, wikipt
true|false        whether to generate shingles; if false, train_file and test_file must be passed
#min-hash-sigs    number of hash signatures
#bands            size of the LSH bands
#kNN              number of closest neighbors to consider

References

David S. Batista, Rui Silva, Bruno Martins, Mário J. Silva. A Minwise Hashing Method for Addressing Relationship Extraction from Text. In Web Information Systems Engineering (WISE), 2013.

David Soares Batista, David Forte, Rui Silva, Bruno Martins, Mário Silva. Exploring DBpedia and Wikipedia for Portuguese Semantic Relationship Extraction. In Linguamática, 5(1), 2013.

David S. Batista. Large-Scale Semantic Relationship Extraction for Information Discovery (Chapter 4). Ph.D. Thesis, Instituto Superior Técnico, University of Lisbon, 2016.
