SPECITELLER is a tool to predict sentence specificity.

The models in this package are obtained using co-training as described in Li and Nenkova, Fast and Accurate Prediction of Sentence Specificity, AAAI 2015.

Dependencies

Speciteller is implemented using Python 2.7. It depends on the following packages:

numpy
liblinear (in particular, liblinearutil.py). Make sure there is a liblinear.so.<x> file in liblinear's python/ directory; if there is not, run make in python/.

Data and resources

Word lexicons for the models are available for download here. Please note that these resources come with their own license(s). Decompress the tarball in this (i.e., the speciteller) directory.

Running Speciteller

Call:

$ python speciteller.py --inputfile inputfile --outputfile predfile

<inputfile> should consist of word-tokenized sentences, one sentence per line.
<predfile> is the destination file to which Speciteller writes the specificity scores, one score per line, in the same order as the sentences in <inputfile>.
An optional argument is --write_all_preds. When set, it generates two additional files: <predfile>.s (predictions from the shallow model) and <predfile>.w (predictions from the word representation model).
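If you drive Speciteller from another Python program, the call above can be wrapped in a small helper. The following is only a sketch: build_command and run_speciteller are hypothetical names, not part of this package, and it assumes that speciteller.py is in the current directory and that python on your PATH is the Python 2.7 interpreter. Only the three flags documented above are used.

```python
import subprocess


def build_command(inputfile, outputfile, write_all_preds=False,
                  script="speciteller.py", python="python"):
    """Return the argument list for one Speciteller run (hypothetical helper)."""
    cmd = [python, script, "--inputfile", inputfile, "--outputfile", outputfile]
    if write_all_preds:
        # Also produces <outputfile>.s and <outputfile>.w
        cmd.append("--write_all_preds")
    return cmd


def run_speciteller(inputfile, outputfile, write_all_preds=False):
    """Run Speciteller once; raises CalledProcessError on a non-zero exit."""
    subprocess.check_call(build_command(inputfile, outputfile, write_all_preds))
```

For instance, run_speciteller("sents_test", "test.probs") would issue the same call as the command-line example.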

For example:

$ python speciteller.py --inputfile sents_test --outputfile test.probs

This writes the specificity scores for the two sentences in sents_test to test.probs.

The scores range from 0 to 1, with 0 being most general and 1 being most specific.
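To work with the scores afterwards, you can read the output file back and line it up with the input sentences. The sketch below is not part of Speciteller; the function names are illustrative, and it assumes each line of <predfile> is a plain decimal number. The labels use a cutoff of 0.5, the threshold mentioned under Practical notes; treating a score of exactly 0.5 as specific is an arbitrary choice made here.

```python
import io


def read_scores(predfile):
    """Read one float score per line, ignoring empty lines."""
    with io.open(predfile, encoding="utf-8") as f:
        return [float(line) for line in f if line.strip()]


def label_scores(scores, cutoff=0.5):
    """Map scores to labels; a score equal to the cutoff counts as specific (assumption)."""
    return ["specific" if s >= cutoff else "general" for s in scores]


def pair_with_sentences(inputfile, predfile):
    """Return (sentence, score) pairs, relying on the shared line order."""
    with io.open(inputfile, encoding="utf-8") as f:
        sentences = [line.rstrip("\n") for line in f]
    scores = read_scores(predfile)
    if len(sentences) != len(scores):
        raise ValueError("got %d sentences but %d scores"
                         % (len(sentences), len(scores)))
    return list(zip(sentences, scores))
```

Calling pair_with_sentences("sents_test", "test.probs") after the example run would return one pair per sentence.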

Practical notes

It is best to word-tokenize your sentences. If you don't, you will still get scores, but they will be less accurate (~4% lower accuracy if you convert the scores into labels with a cutoff at 0.5).
Note that the word embedding file is a compressed ~190 MB .gz file, and each run of speciteller.py loads it to generate features. It is therefore best to avoid loading it multiple times: either score all of your sentences in a single run, or modify predict.py to suit the way you load your data.
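One way to avoid repeated loading without touching predict.py is to merge the sentences of several documents into a single input file, run Speciteller once, and split the scores back per document. This is a sketch under that assumption; merge_documents and split_scores are hypothetical helpers, not functions from this package, and they rely only on the documented guarantee that scores come back in input order.

```python
def merge_documents(documents):
    """Flatten documents (lists of tokenized sentences) into one list.

    Also returns the number of sentences in each document so the
    scores can be split back later.
    """
    sentences = []
    lengths = []
    for doc in documents:
        sentences.extend(doc)
        lengths.append(len(doc))
    return sentences, lengths


def split_scores(scores, lengths):
    """Split a flat score list back into one list per document."""
    if sum(lengths) != len(scores):
        raise ValueError("expected %d scores, got %d"
                         % (sum(lengths), len(scores)))
    per_doc = []
    start = 0
    for n in lengths:
        per_doc.append(scores[start:start + n])
        start += n
    return per_doc
```

Write the merged sentences one per line to a single <inputfile>, run speciteller.py once, read the scores from <predfile>, and pass them to split_scores together with the recorded lengths.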

Citation and contact

Please cite the following paper:

Junyi Jessy Li and Ani Nenkova. 2015. Fast and Accurate Prediction of Sentence Specificity. Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI).

Please send comments and feedback to Jessy Li.
