
krantikariqa's Introduction

KrantikariQA

An Information Gain based Question Answering system over knowledge graphs.

  1. chmod +x parallel_data_creation.sh
  2. Download GloVe 42B and save it in the resources folder
  3. mkdir logs
  4. ./parallel_data_creation.sh
  5. python data_creation_step1.py
  6. python reduce_data_creation_step2.py
  7. CUDA_VISIBLE_DEVICES=3 python corechain.py -model slotptr -device cuda -dataset lcquad -pointwise False

Download glove

wget http://nlp.stanford.edu/data/glove.42B.300d.zip, save it to the resources folder, and unzip it.

Use the Anaconda installation (still needs to be tested)

conda env create -f environment.yml

Set up a Redis server (this step is optional; it is only used for caching)

For installation instructions, see https://redis.io/topics/quickstart
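
If caching is enabled, here is a minimal sketch of how Redis might be used to memoise SPARQL results. It uses redis-py and a hypothetical key scheme; it is an illustration only, not the repository's actual caching code.

import json
import redis

# Connect to the local Redis server (default port 6379).
cache = redis.Redis(host='localhost', port=6379)

def cached_query(sparql_query, run_query):
    """Return a cached result for the query if present, otherwise run it and cache it."""
    cached = cache.get(sparql_query)
    if cached is not None:
        return json.loads(cached)
    result = run_query(sparql_query)
    cache.set(sparql_query, json.dumps(result))
    return result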

Set up a DBpedia endpoint and add its URL in utils/dbpedia_interface.py
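
As a hedged sketch of what pointing the code at your DBpedia endpoint amounts to (the endpoint URL http://localhost:8890/sparql and the use of SPARQLWrapper are assumptions for illustration; the actual interface lives in utils/dbpedia_interface.py):

from SPARQLWrapper import SPARQLWrapper, JSON

# Replace with the URL of your own DBpedia endpoint.
sparql = SPARQLWrapper("http://localhost:8890/sparql")
sparql.setQuery("SELECT ?p ?o WHERE { <http://dbpedia.org/resource/James_Bond> ?p ?o } LIMIT 5")
sparql.setReturnFormat(JSON)
results = sparql.query().convert()
print(results["results"]["bindings"])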

Setup SPARQL parsing server

@TODO: add code here 
Install Node.js (node, nodejs), then run:
> nodejs app.js

Setup embedding server

 python ei_server.py (keep this running at all times)
 This requires bottle to be installed (pip install bottle)
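
For orientation, a minimal sketch of what a bottle-based embedding endpoint might look like. This is not the actual ei_server.py; the route, port, and payload format are hypothetical.

import numpy as np
from bottle import post, request, run

# Assumes the GloVe matrix prepared as described below.
vectors = np.load('resources/vectors_gl.npy')

@post('/embed')
def embed():
    # Expects a JSON body like {"ids": [3, 17, 42]} and returns the corresponding rows.
    ids = request.json.get('ids', [])
    return {'vectors': vectors[ids].tolist()}

run(host='localhost', port=8080)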

Check that DBpedia, Redis (if caching), the SPARQL parsing server, and the embedding interface are all running.

Setup Qelos-utils

Clone https://github.com/lukovnikov/qelos-util.git, change into the qelos-util directory, run python setup.py build (or python setup.py develop), and then cp qelos ../

Install a few more things

A potential bug is that the glove file datatype may be <U32; if so, convert it to float64 as shown in the snippet below.

An rdftype_lookup.json can be created from the keys of relation.pickle (data/data/common); see the sketch after the snippet below.

import numpy as np

# Load the GloVe matrix, cast it from <U32 (string) to float64, and save it back in place.
mat = np.load('resources/vectors_gl.npy')
mat = mat.astype(np.float64)
np.save('resources/vectors_gl.npy', mat)
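
A hedged sketch of building rdftype_lookup.json from the keys of relation.pickle as mentioned above. The exact structure the code expects is an assumption; here it is a simple key-to-index mapping.

import json
import pickle

# Paths taken from the note above (data/data/common).
with open('data/data/common/relation.pickle', 'rb') as f:
    relations = pickle.load(f)

# Hypothetical structure: map each relation key to an integer id.
# str() is used in case the pickle keys are not plain strings.
lookup = {str(key): idx for idx, key in enumerate(relations.keys())}

with open('data/data/common/rdftype_lookup.json', 'w') as f:
    json.dump(lookup, f)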

#### TODO
Change the embedding dimension in the configs to 300d.

Once the dataset is prepared

To check if all the files are in the correct place, run the following command:

python file_location_check.py

Once the data is in the appropriate place, run the following command:

CUDA_VISIBLE_DEVICES=3 python corechain.py -model slotptr -device cuda -dataset lcquad -pointwise False

krantikariqa's People

Contributors

geraltofrivia, nilesh-c, saist1993


krantikariqa's Issues

the maximum i_batch is 87

Sorry to trouble you again.
In corechain.py, in every epoch I find that the maximum i_batch is 87.
Is there an error on my local setup?

(QQ screenshot attached)

thank you.

Sorry, I met an error when running the code.

Hi, sorry to trouble you. I found an error:
AttributeError: module 'datasetPreparation.entity_subgraph' has no attribute 'CreateSubgraph'

Would you like to help me?

yawei

When running, the error: CUDA error out of memory

Hi, sorry to trouble you again.
When I run: CUDA_VISIBLE_DEVICES=1 python corechain.py -model slotptr -device cuda -dataset lcquad -pointwise True

The error occurs at the line: loss.backward()
My GPU memory: 10 GB.

Thank you for your help.

Getting started issue. (Missing attribute 'CreateSubgraph')

Hello,

I am encountering the same problem mentioned in issue #16 but it is not clear what the solution is.
This is happening at step 5 in the Readme.md, python data_creation_step1.py.

Note that the Python environment krantikari is built according to the provided environment.yml, but there is no file data_creation_step1.py. Therefore, running (with sysargs)

(krantikari) user@host:/KrantikariQA$python data_creator_step1.py 0 -1 lcquad

results in the following output

Traceback (most recent call last):
  File "data_creator_step1.py", line 160, in <module>
    _predicate_blacklist=pb, _relation_file={}, return_data=False, _qald=False)
  File "data_creator_step1.py", line 71, in run
    cd_node = cd.CreateDataNode(_predicate_blacklist=_predicate_blacklist, _relation_file=_relation_file, _qald=_qald)
  File "/app/KrantikariQA/datasetPreparation/create_dataset.py", line 27, in __init__
    self.create_subgraph = es.CreateSubgraph(self.dbp, self.predicate_blacklist, self.relation_file, qald=_qald)
AttributeError: module 'datasetPreparation.entity_subgraph' has no attribute 'CreateSubgraph'

Hopefully this description provides all the details requested in this comment.

I would also point out that there is no CreateSubgraph class in entity_subgraph.py, although it is called in both datasetPreparation/create_dataset.py and server.py, which both do

from datasetPreparation import entity_subgraph as es

and each will attempt the same pattern:

self.create_subgraph = es.CreateSubgraph(self.dbp, self.predicate_blacklist, self.relation_file, qald=_qald)

and

subgraph_maker = es.CreateSubgraph(dbp, predicate_blacklist, {}, qald=False)

respectively.

Any advice?

SPARQL parsing server

Hello,
Thanks for sharing your work. I am trying to set up the system for benchmarking purposes; however, I cannot run the SPARQL parsing server, and I cannot find the file/library it uses in your system. Any guidance would be appreciated.
Thanks

Lemmatization

Lemmatize everything while converting it to IDs (nlutils).
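
A hedged sketch of the lemmatization step described above, using NLTK's WordNetLemmatizer rather than the repository's nlutils helpers (which are assumed to wrap something similar):

# Requires: pip install nltk, then nltk.download('wordnet') once.
from nltk.stem import WordNetLemmatizer

lemmatizer = WordNetLemmatizer()

def lemmatize_tokens(tokens):
    """Lemmatize lower-cased tokens before they are converted to vocabulary IDs."""
    return [lemmatizer.lemmatize(token.lower()) for token in tokens]

print(lemmatize_tokens("Who founded the companies located in California".split()))
# ['who', 'founded', 'the', 'company', 'located', 'in', 'california']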

Bugs

  • Only 2904 questions parsed (check why)
  • False paths include weird/metadata predicates (debate whether to remove them or not)
  • Y labels are 0/1 as of now. (Find partially correct paths and rate them as such; see the sketch after this list.)
  • Pick up false labels with partially correct paths too (as opposed to purely random ones)
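
One hedged way to rate partially correct paths, as suggested in the list above. This assumes paths are represented as lists of predicate URIs; the scoring scheme is only an illustration, not what the code currently does.

def partial_path_score(candidate_path, gold_path):
    """Fraction of gold predicates recovered by the candidate path (1.0 = fully correct)."""
    if not gold_path:
        return 0.0
    gold = set(gold_path)
    return len(gold & set(candidate_path)) / len(gold)

# Example: a candidate sharing one of two gold predicates gets a label of 0.5 instead of 0.
print(partial_path_score(['dbo:spouse', 'dbo:birthPlace'], ['dbo:spouse', 'dbo:country']))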

Handle ask query with variables

We don't handle ASK queries whose answer depends on whether the query fetches any results; we only handle the ones that ask whether THIS specific triple exists.

E.g., a query we don't handle yet:

ASK WHERE {
        res:James_Bond dbo:spouse ?uri . 
}
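
For contrast, a hypothetical example of the kind of ASK query that is handled, where the triple is fully grounded and contains no variable (res:Some_Person is a placeholder, not an entity from the dataset):

ASK WHERE {
        res:James_Bond dbo:spouse res:Some_Person . 
}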
