
hotpot's Introduction

HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering

This repository contains the baseline model code, as well as the entire pipeline of running experiments on the HotpotQA dataset, including data download, data preprocessing, training, and evaluation.

Requirements

Python 3, PyTorch 0.3.0, spaCy

To install PyTorch 0.3.0, follow the instructions at https://pytorch.org/get-started/previous-versions/ . For example, with CUDA 8 and conda you can run

conda install pytorch=0.3.0 cuda80 -c pytorch

To install spaCy, run

conda install spacy

Data Download and Preprocessing

Run the following script to download the data, including the HotpotQA data and GloVe embeddings, as well as the spaCy packages.

./download.sh

There are three HotpotQA files:

  • hotpot_train_v1.1.json: the training set
  • hotpot_dev_distractor_v1.json: the dev set in the distractor setting
  • hotpot_dev_fullwiki_v1.json: the dev set in the fullwiki setting

JSON Format

The top level structure of each JSON file is a list, where each entry represents a question-answer data point. Each data point is a dict with the following keys:

  • _id: a unique id for this question-answer data point. This is useful for evaluation.
  • question: a string.
  • answer: a string. The test set does not have this key.
  • supporting_facts: a list. Each entry in the list is a list with two elements [title, sent_id], where title denotes the title of the paragraph, and sent_id denotes the supporting fact's id (0-based) in this paragraph. The test set does not have this key.
  • context: a list. Each entry is a paragraph, represented as a list with two elements [title, sentences], where sentences is a list of strings.

There are other keys that are not used in our code, but might be used for other purposes (note that these keys are not present in the test sets, and your model should not rely on them for making predictions on the test sets):

  • type: either comparison or bridge, indicating the question type. (See our paper for more details).
  • level: one of easy, medium, and hard. (See our paper for more details).
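
For example, a data point can be loaded and its supporting facts resolved to sentence text as follows (a minimal sketch based on the format described above):

import json

# Load the training set (downloaded by download.sh).
with open('hotpot_train_v1.1.json') as f:
    data = json.load(f)

point = data[0]
print(point['_id'], point['question'], point['answer'])

# context is a list of [title, sentences]; build a title -> sentences map.
paragraphs = {title: sentences for title, sentences in point['context']}

# Each supporting fact is [title, sent_id]; sent_id indexes into sentences.
for title, sent_id in point['supporting_facts']:
    print(title, paragraphs[title][sent_id])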

Preprocessing

Preprocess the training and dev sets in the distractor setting:

python main.py --mode prepro --data_file hotpot_train_v1.1.json --para_limit 2250 --data_split train
python main.py --mode prepro --data_file hotpot_dev_distractor_v1.json --para_limit 2250 --data_split dev

Preprocess the dev set in the full wiki setting:

python main.py --mode prepro --data_file hotpot_dev_fullwiki_v1.json --data_split dev --fullwiki --para_limit 2250

Note that the training set has to be preprocessed before the dev sets because some vocabulary and embedding files are produced when the training set is processed.

Training

Train a model

CUDA_VISIBLE_DEVICES=0 python main.py --mode train --para_limit 2250 --batch_size 24 --init_lr 0.1 --keep_prob 1.0 \
--sp_lambda 1.0

Our implementation supports running on multiple GPUs. Remove the CUDA_VISIBLE_DEVICES variable to run on all available GPUs:

python main.py --mode train --para_limit 2250 --batch_size 24 --init_lr 0.1 --keep_prob 1.0 --sp_lambda 1.0

You should see performance reach over 58 F1 on the dev set. Record the run name (something like HOTPOT-20180924-160521), which will be used during evaluation.

Local Evaluation

First, make predictions and save them to a file (replace the value of --save with your own run name and --prediction_file with your desired output file):

CUDA_VISIBLE_DEVICES=0 python main.py --mode test --data_split dev --para_limit 2250 --batch_size 24 --init_lr 0.1 \
--keep_prob 1.0 --sp_lambda 1.0 --save HOTPOT-20180924-160521 --prediction_file dev_distractor_pred.json

Then, call the evaluation script:

python hotpot_evaluate_v1.py dev_distractor_pred.json hotpot_dev_distractor_v1.json

The same procedure can be repeated to evaluate the dev set in the fullwiki setting:

CUDA_VISIBLE_DEVICES=0 python main.py --mode test --data_split dev --para_limit 2250 --batch_size 24 --init_lr 0.1 \
--keep_prob 1.0 --sp_lambda 1.0 --save HOTPOT-20180924-160521 --prediction_file dev_fullwiki_pred.json --fullwiki
python hotpot_evaluate_v1.py dev_fullwiki_pred.json hotpot_dev_fullwiki_v1.json
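
The evaluation script reports exact match (EM) and F1 for the answer and the supporting facts, along with joint metrics combining the two. For intuition, answer F1 is computed SQuAD-style at the token level; below is a minimal sketch (the actual script additionally normalizes answers by lowercasing and stripping punctuation and articles):

from collections import Counter

def f1_score(prediction, ground_truth):
    # Token-level F1 over whitespace tokens, as in the SQuAD evaluator.
    pred_tokens = prediction.lower().split()
    gold_tokens = ground_truth.lower().split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(f1_score('the swingman', 'swingman'))  # ~0.667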

Prediction File Format

The prediction files dev_distractor_pred.json and dev_fullwiki_pred.json should be JSON files with the following keys:

  • answer: a dict. Each key of the dict is a QA pair id, corresponding to the field _id in data JSON files. Each value of the dict is a string representing the predicted answer.
  • sp: a dict. Each key of the dict is a QA pair id, corresponding to the field _id in data JSON files. Each value of the dict is a list representing the predicted supporting facts. Each entry of the list is a list with two elements [title, sent_id], where title denotes the title of the paragraph, and sent_id denotes the supporting fact's id (0-based) in this paragraph.
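
For example, the following sketch writes a minimal, correctly formatted prediction file (the id and predicted values here are hypothetical):

import json

# Hypothetical predictions for a single question; a real file contains one
# entry per question in the evaluated split, keyed by its _id.
predictions = {
    'answer': {
        '5ae61bfd5542992663a4f261': 'swingman',
    },
    'sp': {
        '5ae61bfd5542992663a4f261': [['Shooting guard', 4],
                                     ['Jimmy Butler (basketball)', 0]],
    },
}

with open('dev_distractor_pred.json', 'w') as f:
    json.dump(predictions, f)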

Model Submission and Test Set Evaluation

We use CodaLab for test set evaluation. In the distractor setting, you must submit your code and provide a Docker environment; your code will then be run on the test set. In the fullwiki setting, you only need to submit your prediction file. See https://worksheets.codalab.org/worksheets/0xa8718c1a5e9e470e84a7d5fb3ab1dde2/ for detailed instructions.

License

The HotpotQA dataset is distributed under the CC BY-SA 4.0 license. The code is distributed under the Apache 2.0 license.

References

The preprocessing part and the data loader are adapted from https://github.com/HKUST-KnowComp/R-Net . The evaluation script is adapted from https://rajpurkar.github.io/SQuAD-explorer/ .

hotpot's People

Contributors

kimiyoung, qipeng


hotpot's Issues

Possible error in dataset

Here is a row from the dataset.

The Jimmy Butler (basketball) document has a corresponding sent_id of 902. I believe that is a mistake, as there are only 5 sentences in the Jimmy Butler (basketball) paragraph.

{
        "id": "5ae61bfd5542992663a4f261",
        "question": "Which teams did Jimmy Butler play and what role did he play on these teams?",
        "answer": "swingman",
        "type": "bridge",
        "level": "hard",
        "supporting_facts": {
            "title": [
                "Shooting guard",
                "Shooting guard",
                "Jimmy Butler (basketball)",
                "Jimmy Butler (basketball)"
            ],
            "sent_id": [
                4,
                5,
                0,
                902
            ]
        },
        "context": {
            "title": [
                "Jimmy Butler (actor)",
                "Sports in Philadelphia",
                "National School Scrabble Championship",
                "Shooting guard",
                "When a Man's a Man",
                "American football positions",
                "Lauri Markkanen",
                "Jimmy Butler (basketball)",
                "2017\u201318 Chicago Bulls season",
                "John Butler (running back)"
            ],
            "sentences": [
                [
                    "Jimmy Butler (February 20, 1921 in Akron, Ohio \u2013 February 18, 1945 in France) was an American, juvenile, motion-pictures actor, active in the 1930s and early 1940s."
                ],
                [
                    "Philadelphia, Pennsylvania, has been home to many teams and events in professional, semi-professional, amateur, college, and high-school sports.",
                    " Philadelphia is one of twelve cities that hosts teams in all four major sports leagues in North America, and Philadelphia is one of just three cities in which one team from every league plays within city limits.",
                    " These major sports teams are the Philadelphia Phillies of Major League Baseball, the Philadelphia Eagles of the National Football League, the Philadelphia 76ers of the National Basketball Association and the Philadelphia Flyers of the National Hockey League.",
                    " Each team has played in Philadelphia since at least the 1960s, and each team has won at least one championship.",
                    " Since 2010, Philadelphia has been the home of the Philadelphia Union of Major League Soccer which plays in suburban Chester, Pennsylvania, making the Philadelphia market one of nine cities that hosts a team in the four major sports leagues and the MLS.",
                    " Philadelphia hosts several college sports teams, including the Philadelphia Big 5 schools and Temple's Division I FBS football team.",
                    " Many of these teams have fan bases in both Philadelphia and the surrounding Delaware Valley.",
                    " In addition to the major professional and college sports, numerous semi-pro, amateur, community, and high school teams play in Philadelphia.",
                    " The city hosts numerous sporting events, such as the Penn Relays and the Collegiate Rugby Championship, and Philadelphia has been the most frequent host of the annual Army-Navy football game.",
                    " Philadelphia has also been the home of several renowned athletes and sports figures.",
                    " Philly furthermore has played a historically significant role in the development of cricket and extreme wrestling in the United States."
                ],
                [
                    "The National School Scrabble Championship is a Scrabble tournament for 4th-8th graders held annually in North America since 2003.",
                    " In 2012, 4th graders were allowed to compete for the first time ever.",
                    " The School Scrabble Championship uses the SSWL dictionary which has offensive words such as \"lez\" or \"jew\" omitted.",
                    " The competition is tournament Scrabble play, in which teams of two play for 25 minutes with digital timers similar to those used in the board game of chess.",
                    " The time limit was originally 22 minutes for each side until 2012 when the switch was made to coincide with the traditional times of the Adult Nationals.",
                    " The team with the most wins is determined the winner.",
                    " If there are multiple teams with the same number of wins, spread is used to break the tie.",
                    " Matthew Silver of Connecticut became the first competitor to win two consecutive National School Scrabble Championship titles in 2007 and 2008.",
                    " He accumulated a 14-0 record in those two years.",
                    " In 2009, for the first time ever, the event was won by a team of 5th graders, Andy Hoang & Erik Salgado of Salem Elementary in North Carolina.",
                    " They were the last team to finish the tournament with an undefeated record (7-0).",
                    " Since then, the champion has finished either 6-1 (2010) or 7-1 (2011, 2012, 2013).",
                    " The winners have often been invited to be on Good Morning America and Jimmy Kimmel Live!",
                    ".",
                    " The event has also received recognition from president Barack Obama and NBA superstar Shaquille O'Neal, who are advocates for the game themselves.",
                    " In 2012, Andy Hoang & Erik Salgado of North Carolina became the first team to win two NSSC titles, their first as 5th graders in 2009, and their second as 8th graders in 2012.",
                    " The 2013 NSSC was held in Washington D.C. 2013 marked the first time since 2009 that a previous champion will not be competing.",
                    " In 2010, 2011, and 2012, Andy Hoang, Erik Salgado, Bradley Robbins, and Evan McCarthy were champions that returned.",
                    " Only Andy Hoang and Erik Salgado were the only ones to repeat during the streak.",
                    " With Kevin Bowerman and Raymond Gao's win in 2013, North Carolina became the first state to hold 3 National titles (Winning 3 of the last 5 tournaments: 2009, 2012, & 2013), the most of all the states or districts in North America."
                ],
                [
                    "The shooting guard (SG), also known as the two or off guard, is one of the five positions in a regulation basketball game.",
                    " A shooting guard's main objective is to score points for his team.",
                    " Some teams ask their shooting guards to bring up the ball as well; these players are known colloquially as combo guards.",
                    " Kobe Bryant, for example, as a shooting guard was as good a playmaker as he was a scorer; other examples of combo guards are Dwyane Wade, Allen Iverson, James Harden, Manu Gin\u00f3bili, Jamal Crawford, Randy Foye and Jason Terry.",
                    " A player who can switch between playing shooting guard and small forward is known as a swingman.",
                    " Notable swing men (also known as wing players) include Jimmy Butler, Tracy McGrady, Vince Carter, Joe Johnson, Andre Iguodala, Andrew Wiggins, Evan Turner and Tyreke Evans.",
                    " In the NBA, shooting guards usually range from 6' 4\" (1.93 m) to 6' 7\" (2.01 m) and 5' 9\" (1.75 m) to 6' 0\" (1.83 m) in the WNBA."
                ],
                [
                    "When a Man's a Man is a 1935 American Western film directed by Edward F. Cline and written by Frank Mitchell Dazey and Agnes Christine Johnston.",
                    " The film stars George O'Brien, Dorothy Wilson, Paul Kelly, Harry Woods, Jimmy Butler and Richard Carlyle.",
                    " The film was released on February 15, 1935, by Fox Film Corporation."
                ],
                [
                    "In American football, each team has 11 players on the field at one time.",
                    " The specific role that a player takes on the field is called his position.",
                    " Under the modern rules of American football, teams are allowed unlimited substitutions; that is, teams may change any number of players after any play.",
                    " This has resulted in the development of three \"platoons\" of players: the offense (the team with the ball, which is trying to score), the defense (the team trying to prevent the other team from scoring, and to take the ball from them), and the special teams (who play in kicking situations).",
                    " Within those platoons, various specific positions exist depending on what each player's main job is."
                ],
                [
                    "Lauri Markkanen (born May 22, 1997) is a Finnish basketball player for the Chicago Bulls of the National Basketball Association (NBA).",
                    " In the 2017 NBA draft, he was taken by the Minnesota Timberwolves with the 7th overall pick before being included in a trade to the Chicago Bulls for Jimmy Butler.",
                    " He is the son of Finnish basketball players Pekka and Riikka Markkanen and brothers with the football player Eero Markkanen who plays in the German second-tier side Dynamo Dresden."
                ],
                [
                    "Jimmy Butler III (born September 14, 1989) is an American professional basketball player for the Minnesota Timberwolves of the National Basketball Association (NBA).",
                    " Born in Houston, Butler grew up in Tomball, Texas, and played college basketball for Tyler Junior College and Marquette University.",
                    " He was drafted with the 30th overall pick in the 2011 NBA draft by the Chicago Bulls.",
                    " He is a three-time NBA All-Star and a three-time NBA All-Defensive Team honoree, and was named to his first All-NBA Team in 2017.",
                    " In 2015, he was named the NBA Most Improved Player."
                ],
                [
                    "The 2017\u201318 Chicago Bulls season will be the 52nd season of the franchise in the National Basketball Association (NBA).",
                    " For the first time since 2011, All-Star Jimmy Butler will not be on the roster as he was traded to the Minnesota Timberwolves in the off-season."
                ],
                [
                    "John William Butler (September 14, 1918 \u2013 April 1963) was a professional football player in the National Football League drafted by the Pittsburgh Steelers in 1942.",
                    " He would go on to play for both Steelers merged teams (\"Steagles\" in 1943; \"Card-Pitt\" in 1944).",
                    " In 1943 Butler was drafted into the military due to World War II, however he was physically disqualified for duty.",
                    " He then made his first start with the \"Steagles\" one day after being ruled 4-F by his draft board for poor eyesight and bad knees.",
                    " During the 1944 season, Butler was charged, and fined $200, by co-coaches Walt Kiesling and Phil Handler for \"indifferent play\".",
                    " He was then put on waivers and was soon claimed by the Brooklyn Tigers.",
                    " In 1945, he played his final season with the Philadelphia Eagles."
                ]
            ]
        }
    }

Wiki Data Link Seems Broken

Hi, I'm very interested in your work and hope to follow up on it.

While trying to build our own IR system based on the 2016 Wikipedia dump, we found that the wiki download link appears to be broken. Could you provide a new link or a Google Drive mirror for the wiki data?

A question about evaluating

Hi

I found that your code only provides the evaluation method for answers. Could you also provide the evaluation code for supporting facts, or the joint evaluation of answer and supporting facts, to make experiments more complete?

Kind regards
Tang

How to make predictions with a trained model?

After training a model, how do I make predictions for my own data (say, a query and a list of paragraphs that might contain the answer)?

The readme gives the prediction file format. But prediction should mean that the model predicts the answer for the query from the supporting facts or other data it has, whereas as far as I understood, I need a file that already contains the answer, query, and supporting facts in order to evaluate the model.

It would also be helpful if you could upload the JSON and pickle files (generated during preprocessing) and a trained model, since not everyone has sufficient hardware to generate them or train the model.

not enough memory: you tried to allocate 0GB.

I ran python main.py --mode prepro --data_file hotpot_train_v1.json --para_limit 2250 --data_split train, which worked fine until I got

RuntimeError: $ Torch: not enough memory: you tried to allocate 0GB. Buy new RAM! at ..\aten\src\TH\THGeneral.cpp:204.

I have 32GB of RAM and a 1070 GPU (8GB); is that not enough? And why does it say 0GB?

PS: I'm using PyTorch 0.4; might that be an issue? Though I don't see how it would be connected to RAM.

Can you share the retrieval code for the fullwiki setting?

Hi,
Thanks for releasing the code, it's very helpful for me.

In the paper's appendix, you describe your retrieval strategy for the fullwiki setting. I want to retrieve more than 10 articles, so I used DrQA's retriever (unigram+bigram hash table), but its performance falls far short of your results on the top 10 paragraphs.

Can you share the retrieval code for the fullwiki setting?

Thanks!

Best regards,
Deming Ye

MemoryError during preprocessing

The error message is:

Traceback (most recent call last):
  File "main.py", line 86, in <module>
    prepro(config)
  File "/extradisk/jitianbo/workspace/HotpotQA/prepro.py", line 349, in prepro
    build_features(config, examples, config.data_split, record_file, word2idx_dict, char2idx_dict)
  File "/extradisk/jitianbo/workspace/HotpotQA/prepro.py", line 306, in build_features
    pickle.dump(datapoints, open(out_file, 'wb'), protocol=-1)
MemoryError

My mem size is 65876000kB and my system info is Linux lly-GPU 4.2.0-30-generic #36~14.04.1-Ubuntu SMP Fri Feb 26 18:49:23 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Error incurred during the preprocessing of the training set - {SIGKILL(-9)}

Hi, I am having a problem with preprocessing. I have already decreased n_jobs to 3 (with n_jobs=1 the process was killed after 9 iterations). This is the error I received after 66893 tasks were done. Can you help me?
Thank you in advance

joblib.externals.loky.process_executor.TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.

The exit codes of the workers are {SIGKILL(-9)}

Size mismatch from rnn

Hi, after finally getting it to train I got the following error when calling python main.py --mode test --data_split dev --para_limit 2250 --batch_size 24 --init_lr 0.1 --keep_prob 1.0 --sp_lambda 1.0 --save HOTPOT-20190113-103231 --prediction_file dev_distractor_pred.json:

RuntimeError: Error(s) in loading state_dict for SPModel:
        size mismatch for rnn_start.rnns.0.weight_ih_l0: copying a param with shape torch.Size([240, 81]) from checkpoint, the shape in current model is torch.Size([240, 240]).
        size mismatch for rnn_start.rnns.0.weight_ih_l0_reverse: copying a param with shape torch.Size([240, 81]) from checkpoint, the shape in current model is torch.Size([240, 240]).
        size mismatch for rnn_end.rnns.0.weight_ih_l0: copying a param with shape torch.Size([240, 241]) from checkpoint, the shape in current model is torch.Size([240, 240]).
        size mismatch for rnn_end.rnns.0.weight_ih_l0_reverse: copying a param with shape torch.Size([240, 241]) from checkpoint, the shape in current model is torch.Size([240, 240]).
        size mismatch for rnn_type.rnns.0.weight_ih_l0: copying a param with shape torch.Size([240, 241]) from checkpoint, the shape in current model is torch.Size([240, 240]).
        size mismatch for rnn_type.rnns.0.weight_ih_l0_reverse: copying a param with shape torch.Size([240, 241]) from checkpoint, the shape in current model is torch.Size([240, 240]).

Edit: Alright, training finished, but still says episode 0. F1 is at 46.

Any idea why the shapes are different? All help is appreciated, thank you!

What is sent_limit?

Hi
I would like to know what sent_limit is. This parameter is hard to understand because the parameters are not documented.
Kind regards
Tang

CUDA error during training step

[screenshot of the error message]

Hello, I was running the training step as documented: python main.py --mode train --para_limit 2250 --batch_size 24 --init_lr 0.1 --keep_prob 1.0 --sp_lambda 1.0

I encountered this error. I am running this on a GPU; is it possible that this is a memory allocation issue? Any idea how much memory is required for the training step?

Any pointers towards resolving this would be much appreciated.

DocQA or improved BiDAF?

Hello,

This may be a dumb question, but from the code I have the feeling that the model looks more like BiDAF than DocQA. DocQA is an improved BiDAF with a retrieval system that selects a single paragraph on which to apply BiDAF. However, if I'm not mistaken, the preprocessing script joins all the HotpotQA paragraphs instead of selecting one. The only retrieval happens in the creation of the fullwiki setup, where 10 paragraphs are selected.

Sure, the model contains improvements over BiDAF (self-attention, 3-way classifier, no highway layer), but not the main particularity of DocQA, which is the retriever. Yet the paper states that the authors implemented DocQA.

Can you confirm that the code indeed merges all the paragraphs?

Thanks.

Wrong answer in Dev Set

{"_id":"5a8de02b554299068b959e13",
"answer":"Republic of Ireland national team",
"question":"What team did Robbie Keane play for after Inter Milan?"
..}

There is a mistake in the question above. The answer should be Leeds United, according to Wikipedia.

Preprocessing error: joblib.externals.loky.process_executor.BrokenProcessPool: A process in the executor was terminated abruptly, the pool is not usable anymore

When I run the preprocessing step, the job always fails at this point (after processing 64,000 questions).

Has anybody else faced this issue?

[Parallel(n_jobs=8)]: Batch computation too slow (2.0377s.) Setting batch_size=2.
exception calling callback for <Future at 0x7f3526427e48 state=finished raised BrokenProcessPool>
Traceback (most recent call last):
  File "/root/anaconda3/envs/local_nmt/lib/python3.5/site-packages/joblib/externals/loky/_base.py", line 322, in _invoke_callbacks
    callback(self)
  File "/root/anaconda3/envs/local_nmt/lib/python3.5/site-packages/joblib/parallel.py", line 375, in __call__
    self.parallel.dispatch_next()
  File "/root/anaconda3/envs/local_nmt/lib/python3.5/site-packages/joblib/parallel.py", line 797, in dispatch_next
    if not self.dispatch_one_batch(self._original_iterator):
  File "/root/anaconda3/envs/local_nmt/lib/python3.5/site-packages/joblib/parallel.py", line 825, in dispatch_one_batch
    self._dispatch(tasks)
  File "/root/anaconda3/envs/local_nmt/lib/python3.5/site-packages/joblib/parallel.py", line 782, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/root/anaconda3/envs/local_nmt/lib/python3.5/site-packages/joblib/_parallel_backends.py", line 506, in apply_async
    future = self._workers.submit(SafeFunction(func))
  File "/root/anaconda3/envs/local_nmt/lib/python3.5/site-packages/joblib/externals/loky/reusable_executor.py", line 151, in submit
    fn, *args, **kwargs)
  File "/root/anaconda3/envs/local_nmt/lib/python3.5/site-packages/joblib/externals/loky/process_executor.py", line 990, in submit
    raise BrokenProcessPool(self._flags.broken)
joblib.externals.loky.process_executor.BrokenProcessPool: A process in the executor was terminated abruptly, the pool is not usable anymore.

Very slow training?

Hello,

I am trying to run your code on a Tesla P100 and it takes more than an hour to compute 1000 steps of the first epoch. I noticed that the 16 GB of GPU memory are completely used, but the GPU utilization reported by nvidia-smi is at only 20%, which suggests a serious optimization problem. Is that normal, or am I missing something?

Thanks.

Preprocessing step - memory consumption

Hello,

I am trying to get the initial repo to work by following the steps provided. In the preprocessing step, upon running:

python main.py --mode prepro --data_file hotpot_train_v1.1.json --para_limit 2250 --data_split train

the preprocessing begins. However, the memory consumption on my machine becomes enormous (up to 10 GB), so I decided to terminate it. How many 'tasks' (as displayed while running) does the preprocessing have to go through?

Is this amount of memory consumption normal, or is something wrong with my setup/environment?

Thanks

Processing Error: list index out of range

I get an out-of-range error while processing the training dataset.
It happens at around 24263 questions.
Thanks!

Traceback (most recent call last):
  File "/opt/miniconda3/lib/python3.7/site-packages/joblib/externals/loky/process_executor.py", line 418, in _process_worker
    r = call_item()
  File "/opt/miniconda3/lib/python3.7/site-packages/joblib/externals/loky/process_executor.py", line 272, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "/opt/miniconda3/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 567, in __call__
    return self.func(*args, **kwargs)
  File "/opt/miniconda3/lib/python3.7/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/opt/miniconda3/lib/python3.7/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/home/develop/ailab/zjiehang/HotpotQA/prepro.py", line 171, in _process_article
    best_indices = (answer_span[0], answer_span[-1])
IndexError: list index out of range
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/develop/ailab/zjiehang/HotpotQA/main.py", line 86, in <module>
    prepro(config)
  File "/home/develop/ailab/zjiehang/HotpotQA/prepro.py", line 332, in prepro
    examples, eval_examples = process_file(config.data_file, config, word_counter, char_counter)
  File "/home/develop/ailab/zjiehang/HotpotQA/prepro.py", line 191, in process_file
    outputs = Parallel(n_jobs=12, verbose=10)(delayed(_process_article)(article, config) for article in data)
  File "/opt/miniconda3/lib/python3.7/site-packages/joblib/parallel.py", line 934, in __call__
    self.retrieve()
  File "/opt/miniconda3/lib/python3.7/site-packages/joblib/parallel.py", line 833, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/opt/miniconda3/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 521, in wrap_future_result
    return future.result(timeout=timeout)
  File "/opt/miniconda3/lib/python3.7/concurrent/futures/_base.py", line 425, in result
    return self.__get_result()
  File "/opt/miniconda3/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
IndexError: list index out of range

What is start_mapping?

Hi, I have some questions about supporting fact prediction:

  1. What is the meaning of start_mapping? Its shape is (batch, max_para_limit, max_sent_num); why not (batch, max_sent_num)?

  2. In the supporting fact prediction part of the model, I cannot understand this code:
    start_output = torch.matmul(start_mapping.permute(0, 2, 1).contiguous(), sp_output[:,:,self.hidden:])
    end_output = torch.matmul(end_mapping.permute(0, 2, 1).contiguous(), sp_output[:,:,:self.hidden])
    sp_output = torch.cat([start_output, end_output], dim=-1)
    sp_output_t = self.linear_sp(sp_output)
    sp_output_aux = Variable(sp_output_t.data.new(sp_output_t.size(0), sp_output_t.size(1), 1).zero_())
    predict_support = torch.cat([sp_output_aux, sp_output_t], dim=-1).contiguous()

MemoryError: during preprocessing

python main.py --mode prepro --data_file hotpot_train_v1.1.json --para_limit 2250 --data_split train

I get a memory error after 11048 tasks are done. I have 8GB of RAM and I believe it runs out of memory during preprocessing. Is there any way to perform preprocessing in chunks, or some other workaround?

UnboundLocalError: local variable 'word_counter' referenced before assignment

python main.py --mode prepro --data_file hotpot_train_v1.1.json --para_limit 2250 --data_split train
python main.py --mode prepro --data_file hotpot_dev_distractor_v1.json --para_limit 2250 --data_split dev

[Parallel(n_jobs=12)]: Using backend LokyBackend with 12 concurrent workers.
[Parallel(n_jobs=12)]: Done 1 tasks | elapsed: 5.0s
[Parallel(n_jobs=12)]: Done 8 tasks | elapsed: 6.4s
[Parallel(n_jobs=12)]: Done 17 tasks | elapsed: 8.0s
......
[Parallel(n_jobs=12)]: Done 7057 tasks | elapsed: 6.0min
[Parallel(n_jobs=12)]: Done 7176 tasks | elapsed: 6.2min
[Parallel(n_jobs=12)]: Done 7297 tasks | elapsed: 6.3min
[Parallel(n_jobs=12)]: Done 7405 out of 7405 | elapsed: 6.4min finished
7405 questions in total
Traceback (most recent call last):
  File "main.py", line 86, in <module>
    prepro(config)
  File "/content/hotpot/prepro.py", line 324, in prepro
    word_emb_mat, word2idx_dict, idx2word_dict = get_embedding(word_counter, "word", emb_file=config.glove_word_file,
UnboundLocalError: local variable 'word_counter' referenced before assignment

NameError: during preprocessing

When I run the commands below, as given in your manual:
Preprocess the training and dev sets in the distractor setting:

python main.py --mode prepro --data_file hotpot_train_v1.1.json --para_limit 2250 --data_split train
python main.py --mode prepro --data_file hotpot_dev_distractor_v1.json --para_limit 2250 --data_split dev

Preprocess the dev set in the full wiki setting:

python main.py --mode prepro --data_file hotpot_dev_fullwiki_v1.json --data_split dev --fullwiki --para_limit
.......
I got an error in prepro.py at:
 96
 97 def _process(sent, is_sup_fact, is_title=False):
---> 98     nonlocal text_context, context_tokens, context_chars, offsets, start_end_facts, flat_offsets
 99     N_chars = len(text_context)
NameError: global name 'nonlocal' is not defined

How can I solve it?

RNN dimension order mismatch?

Hi, Kimi

Thanks for open-sourcing the code.

I looked at the documented inputs of torch.nn.GRU (https://pytorch.org/docs/0.3.0/nn.html#torch.nn.GRU):

-- input (seq_len, batch, input_size): tensor containing the features of the input sequence. The input can also be a packed variable length sequence. See torch.nn.utils.rnn.pack_padded_sequence() for details.
-- h_0 (num_layers * num_directions, batch, hidden_size): tensor containing the initial hidden state for each element in the batch.

But in EncoderRNN (https://github.com/hotpotqa/hotpot/blob/master/model.py)

bsz, slen = input.size(0), input.size(1)
output = input
...
output, hidden = self.rnns[i](output, hidden)

The input size is (batch_size, seq_len, input_size); this doesn't seem to match the PyTorch documentation?

Best wishes,
Deming Ye

Problematic Questions

Hi,

I have been looking through your datasets, and found something odd - in the training set, there are questions that seem broken / missing.
For example, sample id 5a775ea9554299373536024d holds the question 'w', and sample id 5a81265c5542995ce29dcbca holds the question 'DRM'. There are several more.

The easiest way to find these examples is by sorting the questions in the training set by length, and then looking at the shortest ones.
A simple workaround could be to discard all questions with no question mark, but this eliminates 2322 samples, some of them perfectly good questions.

Are you aware of this?
Thanks!

Tensor split for start_output and end_output?

I noticed that start_output uses the latter half of sp_output:

        start_output = torch.matmul(start_mapping.permute(0, 2, 1).contiguous(), sp_output[:,:,self.hidden:])
        end_output = torch.matmul(end_mapping.permute(0, 2, 1).contiguous(), sp_output[:,:,:self.hidden])

Is it OK to use the first half as start_output?

A question about evaluation

Hi
I ran my results through eval() in hotpot_evaluate_v1.py; however, the results do not match your scores on the leaderboard. Could you tell me the correct function to use for evaluation?
[screenshot of evaluation output, 2019-07-07]

Not all supporting_facts titles in context titles?

Hi, I'm trying to understand how to access the "gold" paragraphs in the dataset and having difficulties.

It is my understanding that the unique values of supporting_facts['title'] represent the titles of the gold paragraphs; e.g., the first entry of the fullwiki validation split is:
{ "title": [ "Scott Derrickson", "Ed Wood" ], "sent_id": [ 0, 0 ] }

But the titles of the paragraphs in the context column are:
[ "Adam Collis", "Ed Wood (film)", "Tyler Bates", "Doctor Strange (2016 film)", "Hellraiser: Inferno", "Sinister (film)", "Deliver Us from Evil (2014 film)", "Woodson, Arkansas", "Conrad Brooks", "The Exorcism of Emily Rose" ]

"Ed Wood (film)" is a paragraph title, but "Ed Wood" is not, so how are we meant to map between the two?

In other cases, nothing even resembling the supporting fact title is present in the paragraph titles, and only about 60% of the supporting paragraphs can be accessed using the title.
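
For reference, here is a minimal sketch of how this coverage can be measured (assuming the documented data format; exact numbers may vary):

import json

# Count how many supporting-fact titles appear among the context titles.
with open('hotpot_dev_fullwiki_v1.json') as f:
    data = json.load(f)

found, total = 0, 0
for point in data:
    context_titles = {title for title, _ in point['context']}
    for title, _ in point['supporting_facts']:
        total += 1
        found += title in context_titles
print('coverage: %.1f%%' % (100.0 * found / total))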

multiple GPUs error

Traceback (most recent call last):
  File "main.py", line 86, in <module>
    train(config)
  File "/home/caoxing/project/hotpot-master/run.py", line 110, in train
    logit1, logit2, predict_type, predict_support = model(context_idxs, ques_idxs, context_char_idxs, ques_char_idxs, context_lens, start_mapping, end_mapping, all_mapping, return_yp=False)
  File "/home/caoxing/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/caoxing/miniconda3/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 155, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/caoxing/miniconda3/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 165, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/caoxing/miniconda3/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
    output.reraise()
  File "/home/caoxing/miniconda3/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
IndexError: Caught IndexError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/home/caoxing/miniconda3/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/home/caoxing/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/caoxing/project/hotpot-master/sp_model.py", line 85, in forward
    context_output = self.rnn(context_output, context_lens)
  File "/home/caoxing/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/caoxing/project/hotpot-master/sp_model.py", line 187, in forward
    hidden = self.get_init(bsz, i)
  File "/home/caoxing/project/hotpot-master/sp_model.py", line 178, in get_init
    return self.init_hidden[i].expand(-1, bsz, -1).contiguous()
  File "/home/caoxing/miniconda3/lib/python3.7/site-packages/torch/nn/modules/container.py", line 402, in __getitem__
    idx = self._get_abs_string_index(idx)
  File "/home/caoxing/miniconda3/lib/python3.7/site-packages/torch/nn/modules/container.py", line 393, in _get_abs_string_index
    raise IndexError('index {} is out of range'.format(idx))
IndexError: index 0 is out of range

Using own retrieval results

Hello. I am attempting to experiment with my own retrieval results and assess their impact on the performance of the model (for the fullwiki setting).

If I understood correctly, the process would be as follows.

Generate my own version of hotpot_dev_fullwiki_v1.json in which the context entry of the JSON has been replaced with my own paragraphs. All the other keys are left identical. I shall refer to this JSON file as custom_dev_fullwiki.json.

Likewise for the test set, to create custom_test_fullwiki.json

Then, I would run the preprocessing, training, and evaluation commands as specified in the README.

To make a meaningful comparison of metrics, once I change my retrieval method and generate a 'v2.json' set of files, I should start again from scratch with preprocessing, then training, etc.

Is this correct?

Training error

While training the model, I get the following error. I am using torch 1.3.0 with CUDA 10 on a Linux machine. Is there an issue with the PyTorch version?

python main.py --mode train --para_limit 2250 --batch_size 24 --init_lr 0.1 --keep_prob 1.0 --sp_lambda 1.0

Error Log:
Traceback (most recent call last):
  File "main.py", line 84, in <module>
    train(config)
  File "hotpotqa/hotpot/run.py", line 121, in train
    total_loss += loss.data[0]
IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number

Any help would be appreciated.

Does not have sufficient permissions on worksheet

When I evaluated the predictions with the command cl macro hotpotqa-utils//dev-eval-distractor-v1.0 predict -n evaluate, there was an error:
User Dorothy(0x1dde82bc468e4c96b270d944a387037b) does not have sufficient permissions on worksheet 0xcb3dcf53ef7241a586b6e2e1dac8169c (have read, need all).

In addition, the predict bundle output the predictions but failed in the last step: Error while uploading: Unable to update bundle contents in bundle service: Bad Request - Error while parsing chunked transfer body.

Is there anything wrong with my setup? Could you please help me solve it?

Selecting answer span during preprocessing

Hello,

I have trouble understanding how exactly the preprocessing script selects the answer span from the answer text. From what I understand, the function fix_span loops over all matches of the answer text and tries to select the one that best aligns with token boundaries.
In the case where the answer perfectly matches some tokens, the first occurrence is returned.

Is that right?

Thanks!

Got 403 when launching download.sh

Self-explanatory title ...

root@1cf6c8a4d6e1:/app/hotpot# ./download.sh
--2023-09-20 09:37:50-- http://curtis.ml.cmu.edu/datasets/hotpot/hotpot_dev_distractor_v1.json
Resolving curtis.ml.cmu.edu (curtis.ml.cmu.edu)... 128.2.204.193
Connecting to curtis.ml.cmu.edu (curtis.ml.cmu.edu)|128.2.204.193|:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2023-09-20 09:37:51 ERROR 403: Forbidden.

--2023-09-20 09:37:51-- http://curtis.ml.cmu.edu/datasets/hotpot/hotpot_dev_fullwiki_v1.json
Resolving curtis.ml.cmu.edu (curtis.ml.cmu.edu)... 128.2.204.193
Connecting to curtis.ml.cmu.edu (curtis.ml.cmu.edu)|128.2.204.193|:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2023-09-20 09:37:51 ERROR 403: Forbidden.

--2023-09-20 09:37:51-- http://curtis.ml.cmu.edu/datasets/hotpot/hotpot_train_v1.1.json
Resolving curtis.ml.cmu.edu (curtis.ml.cmu.edu)... 128.2.204.193
Connecting to curtis.ml.cmu.edu (curtis.ml.cmu.edu)|128.2.204.193|:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2023-09-20 09:37:51 ERROR 403: Forbidden.

--2023-09-20 09:37:51-- http://nlp.stanford.edu/data/glove.840B.300d.zip
Resolving nlp.stanford.edu (nlp.stanford.edu)... 171.64.67.140
Connecting to nlp.stanford.edu (nlp.stanford.edu)|171.64.67.140|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://nlp.stanford.edu/data/glove.840B.300d.zip [following]
--2023-09-20 09:37:52-- https://nlp.stanford.edu/data/glove.840B.300d.zip
Connecting to nlp.stanford.edu (nlp.stanford.edu)|171.64.67.140|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://downloads.cs.stanford.edu/nlp/data/glove.840B.300d.zip [following]
--2023-09-20 09:37:53-- https://downloads.cs.stanford.edu/nlp/data/glove.840B.300d.zip
Resolving downloads.cs.stanford.edu (downloads.cs.stanford.edu)... failed: No address associated with hostname.

CUDA error during training

[screenshot of the CUDA error, 2019-02-16]

Hi, I received this CUDA error after training the model for 3000 steps. I used the exact same command as suggested in the README. I suppose it's caused by an index out of range issue. Any suggestions as to where the issue might be? Thank you.

UnboundLocalError: local variable 'word_counter' referenced before assignment: during preprocessing

When I run the commands below, as given in your manual, each of the three commands produces a different error.
python main.py --mode prepro --data_file hotpot_train_v1.1.json --para_limit 2250 --data_split train

[Parallel(n_jobs=12)]: Done 35960 tasks | elapsed: 2.4min
[Parallel(n_jobs=12)]: Done 36784 tasks | elapsed: 2.5min
[Parallel(n_jobs=12)]: Done 37624 tasks | elapsed: 2.5min
[Parallel(n_jobs=12)]: Done 38464 tasks | elapsed: 2.5min
[Parallel(n_jobs=12)]: Done 39320 tasks | elapsed: 2.6min
[Parallel(n_jobs=12)]: Done 40176 tasks | elapsed: 2.6min
[Parallel(n_jobs=12)]: Done 41048 tasks | elapsed: 2.7min
[Parallel(n_jobs=12)]: Done 41920 tasks | elapsed: 2.7min
[Parallel(n_jobs=12)]: Done 42808 tasks | elapsed: 2.8min
Killed

python main.py --mode prepro --data_file hotpot_dev_distractor_v1.json --para_limit 2250 --data_split dev

[Parallel(n_jobs=12)]: Done 7405 out of 7405 | elapsed: 30.2s finished
7405 questions in total
Traceback (most recent call last):
  File "main.py", line 86, in <module>
    prepro(config)
  File "/home/caoxing/project/hotpot-master/prepro.py", line 324, in prepro
    word_emb_mat, word2idx_dict, idx2word_dict = get_embedding(word_counter, "word", emb_file=config.glove_word_file,
UnboundLocalError: local variable 'word_counter' referenced before assignment

python main.py --mode prepro --data_file hotpot_dev_fullwiki_v1.json --data_split dev --fullwiki --para_limit 2250

[Parallel(n_jobs=12)]: Done 7405 out of 7405 | elapsed: 30.3s finished
7405 questions in total
Traceback (most recent call last):
  File "main.py", line 86, in <module>
    prepro(config)
  File "/home/caoxing/project/hotpot-master/prepro.py", line 324, in prepro
    word_emb_mat, word2idx_dict, idx2word_dict = get_embedding(word_counter, "word", emb_file=config.glove_word_file,
UnboundLocalError: local variable 'word_counter' referenced before assignment

I would really appreciate it if you could help me :)
