- Tokenizes the input facts.
- Splits the generated tokens into Subject and Object by identifying the Predicate.
- Assigns a true/false value to each fact by searching for the predicate on the wiki pages of the generated tokens.
- Fetches the wiki pages of the generated tokens.
- Stores previously fetched data as text files in the Local Store folder.
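The caching behaviour described in the last two points could look roughly like the following. This is a minimal sketch, not the project's actual code: the function name `load_page`, the one-file-per-token naming scheme, and the injected `fetch` callable (standing in for whatever performs the Wikipedia request) are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Callable


def load_page(token: str, store_dir: Path, fetch: Callable[[str], str]) -> str:
    """Return the page text for a token, preferring the Local Store."""
    store_dir.mkdir(parents=True, exist_ok=True)

    # Hypothetical naming scheme: one text file per token, with
    # non-alphanumeric characters replaced so the name is file-system safe.
    safe_name = "".join(c if c.isalnum() else "_" for c in token.strip())
    cache_file = store_dir / (safe_name + ".txt")

    if cache_file.exists():
        # Cache hit: no network access needed.
        return cache_file.read_text(encoding="utf-8")

    # Cache miss: fetch once, then persist for later runs.
    text = fetch(token)
    cache_file.write_text(text, encoding="utf-8")
    return text
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP or Wikipedia client and makes the caching logic easy to exercise without network access.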
Input:
List of facts in natural language with their corresponding fact ids (train.tsv and test.tsv)
Output:
A triple file that maps each fact to its truth value (trainresult.ttl and testresult.ttl)
- Tokenization - Generate tokens for the fact from words starting with a capital letter, including prepositions that follow a noun.
- Data fetch from Wiki - For each generated token, check whether its data exists in the Local Store; if not, fetch the data from Wikipedia and store it in the Local Store.
- Get predicate for the fact - For each fact, generate a generic predicate based on the predefined list of predicates.
- Search Data - Search the data of each token for a pattern containing the predicate and the other token.
- Assign Truth Value - If the pattern matches, assign true as the truth value; otherwise assign false.
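The tokenization and predicate steps above can be sketched as follows. This is an illustrative approximation under stated assumptions, not the project's implementation: the names `tokenize`, `predicate_for`, and `PREDICATES` are hypothetical, the connector-word list is a guess, and the predicate table holds only the two relations that appear in this document (the real predefined list is not shown here).

```python
import re
from typing import List, Optional

# A capitalized word, e.g. "Alfonso" or "XIII".
_CAP = r"[A-Z]\w*"
# What may join two capitalized words: an optional comma, a space, and
# any number of lowercase connector words (assumed list).
_LINK = r",? (?:(?:of|in|on|at|de|the) )*"
TOKEN_RE = re.compile(r"\b" + _CAP + "(?:" + _LINK + _CAP + ")*")

# Assumed predicate table: phrase in the fact -> generic search pattern.
PREDICATES = {
    "birth place": r"born.{0,150}",
    "death place": r"died.{0,150}",
}


def tokenize(fact: str) -> List[str]:
    """Return runs of capitalized words, joined by commas or connectors."""
    return TOKEN_RE.findall(fact)


def predicate_for(fact: str) -> Optional[str]:
    """Return the generic pattern for the first known phrase in the fact."""
    lowered = fact.lower()
    for phrase, pattern in PREDICATES.items():
        if phrase in lowered:
            return pattern
    return None


fact = "3820514 Alfonso XIII of Spain's birth place is Madrid."
print(tokenize(fact))
print(predicate_for(fact))
```

A simple capital-letter rule like this would also pick up a capitalized sentence-initial word such as "The", so a real tokenizer would likely need extra filtering.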
Positive Example
Fact: 3820514 Alfonso XIII of Spain's birth place is Madrid.
- Tokenization: {"Alfonso XIII of Spain", "Madrid"}
- Data fetched from Wiki:
  - token1: {"Alfonso XIII of Spain"}
  - token2: {"Madrid"}
- Get predicate for the fact: (B|b)orn.{0,150}(?i)
- Search Data: String "Born (1886-05-17)17 May 1886 Royal Palace of Madrid" found in the wiki page of token1 ("Alfonso XIII of Spain")
- Return truthValue = true for this fact.
Negative Example
Fact: 3885766 Lucille Ball's death place is Santa Monica, California.
- Tokenization: {"Lucille Ball", "Santa Monica, California"}
- Data fetched from Wiki:
  - token1: {"Lucille Ball"}
  - token2: {"Santa Monica, California"}
- Get predicate for the fact: (Died).{0,150}(?i)
- Search Data:
  - token2 not found in the wiki page of token1 with predicate
  - token1 not found in the wiki page of token2 with predicate
- Return truthValue = false for this fact.
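The search step used in the two examples above can be sketched in Python as follows. Several things here are assumptions: the function name `is_supported`, the idea that the full pattern is the predicate followed by the other token, and the page texts, which are stand-in strings rather than fetched data (only the "Born ..." snippet is taken from the positive example). Python's `re` module requires inline flags at the start of a pattern, so the sketch passes `re.IGNORECASE` instead of the trailing `(?i)` shown above.

```python
import re
from typing import Mapping


def is_supported(pages: Mapping[str, str], token1: str, token2: str,
                 predicate: str) -> bool:
    """Check both pages for the predicate followed by the other token."""
    for page_of, other in ((token1, token2), (token2, token1)):
        # Assumed pattern shape: predicate window, then the other token.
        pattern = re.compile(predicate + re.escape(other), re.IGNORECASE)
        if pattern.search(pages.get(page_of, "")):
            return True
    return False


# Stand-in page texts for illustration only.
pages = {
    "Alfonso XIII of Spain": "Born (1886-05-17)17 May 1886 Royal Palace of Madrid",
    "Madrid": "stand-in text",
    "Lucille Ball": "Died: stand-in text that never names the other token",
    "Santa Monica, California": "stand-in text",
}

print(is_supported(pages, "Alfonso XIII of Spain", "Madrid", r"born.{0,150}"))
print(is_supported(pages, "Lucille Ball", "Santa Monica, California", r"died.{0,150}"))
```

The `{0,150}` bound keeps the match local: a token that appears more than 150 characters after the predicate word does not count as evidence.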
Shivam Bahedia, Sourabh Poddar, Yamini Punetha