multilingual_factuality's Introduction

Multilingual factuality

This repository is used to build a multi-lingual system for identifying the factuality of events. We are currently working on a basic implementation for Dutch and English.

Running the module

The scripts to run the module are located in feature_extractor/.

  1. On a single file:

cat inputfile | python rule_based_factuality.py > outputfile

  2. For a directory:

./run_rule_based_on_dir.sh inputdir/ outputdir/
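
For readers who prefer to drive the batch run from Python, the following is a minimal sketch of the directory case. It assumes that the directory script simply applies the single-file command to every file and writes the result under the same file name; the contents of run_rule_based_on_dir.sh are not shown here, so that behaviour is an assumption. The function name run_on_dir and its parameters are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def run_on_dir(input_dir, output_dir, command):
    """Pipe every regular file in input_dir through command.

    The standard output of each run is written to a file with the same
    name in output_dir. Returns the list of written paths.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(Path(input_dir).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories
        target = out / path.name
        with path.open("rb") as src, target.open("wb") as dst:
            # Same effect as: cat inputfile | command > outputfile
            subprocess.run(command, stdin=src, stdout=dst, check=True)
        written.append(target)
    return written
```

When called from inside feature_extractor/, a usage along the lines of run_on_dir("inputdir/", "outputdir/", [sys.executable, "rule_based_factuality.py"]) would mirror the single-file command above. With check=True, the first file that makes the module fail stops the run with an exception.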

Version:

Version 0.01 was released on May 9th, 2016. It is a basic implementation with a first set of resources created from intuition. We expect many minor revisions in the near future, so you may find version numbers of the form 0.xx in the document.

As soon as the resources for at least one language and the rule application system have stabilized, we will release version 1.0.

Content

This repository consists of:

  • docs/ for documentation
  • data/ for development and training data
  • resources/ for language specific resources (vocabularies) and models
  • scripts/ for code extracting input features, applying rules/calling machine learning modules and producing output

Evaluation/experimental setup:

For gold data, input data and evaluation scripts for factuality, please check out: https://github.com/cltl/factuality_experimental_environment

Contact

Antske Fokkens: [email protected]
Ruben Izquierdo: [email protected]
Roser Morante: [email protected]
Tommaso Caselli: [email protected]

multilingual_factuality's People

Contributors

antske, rubenizquierdo, vanatteveldt, tommasoc80

multilingual_factuality's Issues

KeyError: t_xxx

I occasionally get this error when running the multilingual factuality module:

Traceback (most recent call last):
  File "/data/wva/newsreader_pipe_nl/modules/multilingual_factuality/feature_extractor/rule_based_factuality.py", line 413, in <module>
    main()
  File "/data/wva/newsreader_pipe_nl/modules/multilingual_factuality/feature_extractor/rule_based_factuality.py", line 406, in main
    run_factuality_module(nafobj)
  File "/data/wva/newsreader_pipe_nl/modules/multilingual_factuality/feature_extractor/rule_based_factuality.py", line 392, in run_factuality_module
    events_features = extract_features(feature_extractor, target_events)
  File "/data/wva/newsreader_pipe_nl/modules/multilingual_factuality/feature_extractor/rule_based_factuality.py", line 367, in extract_features
    add_predicate_chain_features(feature_extractor, event, myFeatures)
  File "/data/wva/newsreader_pipe_nl/modules/multilingual_factuality/feature_extractor/rule_based_factuality.py", line 210, in add_predicate_chain_features
    pred_chain = feature_extractor.get_list_term_ids_to_root(tid)
  File "/data/wva/newsreader_pipe_nl/modules/multilingual_factuality/feature_extractor/my_feature_extractor.py", line 173, in get_list_term_ids_to_root
    root_for_sentence = this_graph.get_root()
  File "/data/wva/newsreader_pipe_nl/modules/multilingual_factuality/feature_extractor/my_feature_extractor.py", line 41, in get_root
    self.calculate_root()
  File "/data/wva/newsreader_pipe_nl/modules/multilingual_factuality/feature_extractor/my_feature_extractor.py", line 35, in calculate_root
    list_with_min_freq = [(term_id, len(self.G[term_id])) for term_id, freq in L if freq == min_freq]
KeyError: 't_840'

An example input file that causes the error can be found here: http://i.amcat.nl/keyerror.naf
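
The failing line in the traceback indexes self.G with a term id taken from L. A minimal sketch of how that can fail, and of one defensive variant, follows. It assumes that G behaves like a dict mapping term ids to their neighbours and that L is a list of (term_id, frequency) pairs; both shapes are guesses from the traceback, and the helper name candidates_with_min_freq is hypothetical. Whether a term id that is missing from the graph should count as having zero neighbours is a decision for the maintainers.

```python
def candidates_with_min_freq(graph, freqs):
    """Return (term_id, number_of_neighbours) for the least frequent term ids.

    graph: dict mapping a term id to a list of neighbouring term ids (assumed shape)
    freqs: list of (term_id, frequency) pairs (assumed shape)
    """
    if not freqs:
        return []
    min_freq = min(freq for _, freq in freqs)
    # graph[term_id] would raise KeyError for an id that never became a
    # node in the graph; graph.get(...) treats it as having no neighbours.
    return [(term_id, len(graph.get(term_id, ())))
            for term_id, freq in freqs if freq == min_freq]


graph = {"t_1": ["t_2"], "t_2": []}
freqs = [("t_1", 1), ("t_2", 2), ("t_840", 1)]  # t_840 is not a key of graph
print(candidates_with_min_freq(graph, freqs))  # [('t_1', 1), ('t_840', 0)]
```

This only hides the symptom. If a term id such as t_840 is expected to be in the graph, the underlying cause is more likely in how the graph is built from the input.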

TypeError: 'NoneType' object is not iterable

I get this error when running the factuality module:

Traceback (most recent call last):
  File "/data/wva/newsreader_pipe_nl/modules/multilingual_factuality/feature_extractor/rule_based_factuality.py", line 413, in <module>
    main()
  File "/data/wva/newsreader_pipe_nl/modules/multilingual_factuality/feature_extractor/rule_based_factuality.py", line 406, in main
    run_factuality_module(nafobj)
  File "/data/wva/newsreader_pipe_nl/modules/multilingual_factuality/feature_extractor/rule_based_factuality.py", line 392, in run_factuality_module
    events_features = extract_features(feature_extractor, target_events)
  File "/data/wva/newsreader_pipe_nl/modules/multilingual_factuality/feature_extractor/rule_based_factuality.py", line 367, in extract_features
    add_predicate_chain_features(feature_extractor, event, myFeatures)
  File "/data/wva/newsreader_pipe_nl/modules/multilingual_factuality/feature_extractor/rule_based_factuality.py", line 212, in add_predicate_chain_features
    eventObj.predicate_chain_lemmas = feature_extractor.get_lemmas_for_list_term_ids(pred_chain)
  File "/data/wva/newsreader_pipe_nl/modules/multilingual_factuality/feature_extractor/my_feature_extractor.py", line 182, in get_lemmas_for_list_term_ids
    for term_id in list_term_ids:
TypeError: 'NoneType' object is not iterable
WARNING:elasticsearch:HEAD /nlpipe/annotate__0_0 [status:404 request:0.001s]

As far as I can see, get_list_term_ids_to_root returns None in some cases, which get_lemmas_for_list_term_ids does not expect. Adding or None to the return value of the former function prevents the error, but I have no clue what I'm doing, so that's probably not a proper solution.

Let me know if you need the raw input that causes the problem.
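
A minimal sketch of one defensive option, not the maintainers' fix: the consuming function falls back to an empty list when it is handed None. The function below is a simplified stand-in that only borrows its name from the traceback. Its lemma_for_term parameter and the example term ids are hypothetical.

```python
def get_lemmas_for_list_term_ids(lemma_for_term, list_term_ids):
    """Look up a lemma for each term id, tolerating a None list.

    lemma_for_term: dict mapping a term id to its lemma (hypothetical)
    list_term_ids: list of term ids, or None when no chain was found
    """
    lemmas = []
    # "list_term_ids or []" turns None into an empty list, so the loop
    # is skipped instead of raising TypeError.
    for term_id in list_term_ids or []:
        lemmas.append(lemma_for_term.get(term_id))
    return lemmas


lemma_for_term = {"t_1": "zeggen", "t_2": "komen"}
print(get_lemmas_for_list_term_ids(lemma_for_term, ["t_1", "t_2"]))
print(get_lemmas_for_list_term_ids(lemma_for_term, None))
```

An empty predicate chain then yields an empty list of lemmas. Whether that is the right feature value for such events depends on the rules that consume it.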
