avisingh599 / visual-qa

[Reimplementation Antol et al 2015] Keras-based LSTM/CNN models for Visual Question Answering

Home Page: https://avisingh599.github.io/deeplearning/visual-qa/

License: MIT License

Shell 2.45% Python 97.55%

visual-qa's Issues

Error with Progressbar

When running trainMLP.py, training crashes in Keras' Progbar. The problem appears to be line 108:
progbar.add(args.batch_size, values=[("train loss", loss)])

loss should probably be a float value of some sort, but when I look in the debugger, it is a list:
[array(7.346083164215088, dtype=float32)]

The following change on line 108 appears to fix the issue:
progbar.add(args.batch_size, values=[("train loss", loss[0].item(0))])
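The fix above relies on the loss being a one-element list holding a 0-d array. A slightly more defensive variant, sketched under the assumption that the loss may arrive as a plain float, a 0-d array, or a (possibly nested) list of those, could look like this. The helper name `to_scalar` is hypothetical and not part of the repository or of Keras.

```python
def to_scalar(loss):
    """Reduce a loss that may be a float, a 0-d array, or a list of them to a float."""
    # Unwrap list/tuple containers such as [array(7.34, dtype=float32)].
    while isinstance(loss, (list, tuple)):
        loss = loss[0]
    # NumPy scalars and 0-d arrays expose .item(), which returns a Python number.
    if hasattr(loss, "item"):
        loss = loss.item()
    return float(loss)


print(to_scalar([7.346083164215088]))
print(to_scalar(2.5))
```

With such a helper, line 108 would pass `to_scalar(loss)` instead of `loss` to `progbar.add`.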

We can't download the training and validation sets from visualqa.org

TypeError: __init__() takes exactly 2 arguments (1 given)

rzai@rzai00:/prj/visual-qa/scripts$ CUDA_VISIBLE_DEVICES=1 python trainLSTM_1.py
Using Theano backend.
Using gpu device 0: GeForce GTX 1080 (CNMeM is disabled, cuDNN 5105)
Traceback (most recent call last):
File "trainLSTM_1.py", line 122, in <module>
main()
File "trainLSTM_1.py", line 59, in main
image_model.add(Reshape(input_shape = (img_dim,), dims=(img_dim,)))
TypeError: __init__() takes exactly 2 arguments (1 given)
rzai@rzai00:/prj/visual-qa/scripts$

expected lstm_input_1 to have shape (None, 30, 300) but got array with shape (128, 5, 300)

rzai@rzai00:/prj/visual-qa/scripts$ CUDA_VISIBLE_DEVICES=1 python trainLSTM_language.py
Using Theano backend.
Using gpu device 0: GeForce GTX 1080 (CNMeM is disabled, cuDNN 5105)
Loaded questions, sorting by length...
Compiling model...
Compilation done...
loaded word2vec features...
Training started...
Traceback (most recent call last):
File "trainLSTM_language.py", line 91, in <module>
main()
File "trainLSTM_language.py", line 81, in main
loss = model.train_on_batch(X_q_batch, Y_batch)
File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 712, in train_on_batch
class_weight=class_weight)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1215, in train_on_batch
check_batch_dim=True)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 961, in _standardize_user_data
exception_prefix='model input')
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 108, in standardize_input_data
str(array.shape))
Exception: Error when checking model input: expected lstm_input_1 to have shape (None, 30, 300) but got array with shape (128, 5, 300)
rzai@rzai00:/prj/visual-qa/scripts$
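
The exception says the model was compiled for 30 timesteps per question, while the batch it received has only 5. One way to reconcile the two, assuming the model definition stays as it is, is to pad every question in the batch to the fixed length with zero vectors. The sketch below uses plain Python lists; the helper name `pad_timesteps` and the choice to append the zeros after the real words are assumptions for illustration, not the repository's code.

```python
def pad_timesteps(batch, timesteps, dim):
    """Pad (or truncate) every sequence in `batch` to exactly `timesteps` vectors."""
    padded = []
    for seq in batch:
        # Copy at most `timesteps` word vectors from the question.
        seq = [list(vec) for vec in seq[:timesteps]]
        # Append all-zero vectors until the sequence has the fixed length.
        missing = timesteps - len(seq)
        seq.extend([0.0] * dim for _ in range(missing))
        padded.append(seq)
    return padded


# Two questions with 5 and 3 words, each word a 4-dimensional vector.
batch = [[[1.0] * 4] * 5, [[2.0] * 4] * 3]
fixed = pad_timesteps(batch, timesteps=30, dim=4)
print(len(fixed), len(fixed[0]), len(fixed[0][0]))
```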

FYI it doesn't work with the latest version of Keras (v. 0.3)

Error log:

python trainMLP.py
Using Theano backend.
loaded vgg features
loaded word2vec features...
Compiling model...
Compilation done...
Training started...
   128/215375 [..............................]Traceback (most recent call last):
  File "trainMLP.py", line 116, in <module>
    main()
  File "trainMLP.py", line 108, in main
    progbar.add(args.batch_size, values=[("train loss", loss)])
  File "/root/anaconda2/lib/python2.7/site-packages/keras/utils/generic_utils.py", line 143, in add
    self.update(self.seen_so_far+n, values)
  File "/root/anaconda2/lib/python2.7/site-packages/keras/utils/generic_utils.py", line 112, in update
    avg = self.sum_values[k][0] / max(1, self.sum_values[k][1])
TypeError: unsupported operand type(s) for /: 'list' and 'int'

IOError: [Errno 2] No such file or directory: '../models/labelencoder.pkl'

Hello! I'm trying to run own_image.py, but I receive this error:
Traceback (most recent call last):
  File "own_image.py", line 65, in <module>
    main()
  File "own_image.py", line 31, in main
    labelencoder = joblib.load('../models/labelencoder.pkl')
  File "/home/ivan/PycharmProjects/visual-qa/env/local/lib/python2.7/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 409, in load
    with open(filename, 'rb') as file_handle:
IOError: [Errno 2] No such file or directory: '../models/labelencoder.pkl'
Did you move this file? Can you help me to fix that?

Images Validation filename is mismatched

I think the dumpText.py script creates a file with an _all suffix at the end of its name. The eval scripts should be updated to match. For example:

- images_val = open('../data/preprocessed/images_val2014.txt',
+ images_val = open('../data/preprocessed/images_val2014_all.txt',

How do I test with my own images?

So I have a VGG_feats.mat file obtained by running my own images through a VGGNet. I also have a txt file of question(s) about each image.

Do I need any more data to use your net to get answers? I don't need a separate word2vec net to calculate the features from my questions, right?
How do I use your net to get answers to my questions about the image?

dumpText.py does not dump text

loading VQA annotations and questions into memory...
0:00:16.625574
creating index...
index created!
Dumping questions,answers, imageIDs, and questions lenghts to text files...
Traceback (most recent call last):
File "dumpText.py", line 82, in <module>
main()
File "dumpText.py", line 74, in main
answers_file.write(getModalAnswer(qa[i]['answers']).encode('utf8'))
KeyError: 1

Need Instructions for Implementation

I need to know the order in which the scripts should be run to get the system working.

Getting MemoryError

root@vps:~# python mnist_mlp.py
Using Theano backend.
Traceback (most recent call last):
  File "mnist_mlp.py", line 24, in <module>
    (X_train, y_train), (X_test, y_test) = mnist.load_data()
  File "/usr/local/lib/python2.7/dist-packages/Keras-1.0.5-py2.7.egg/keras/datasets/mnist.py", line 17, in load_data
    data = cPickle.load(f)
  File "/usr/lib/python2.7/gzip.py", line 455, in readline
    c = self.read(readsize)
  File "/usr/lib/python2.7/gzip.py", line 261, in read
    self._read(readsize)
  File "/usr/lib/python2.7/gzip.py", line 313, in _read
    self._add_read_data( uncompress )
  File "/usr/lib/python2.7/gzip.py", line 331, in _add_read_data
    self.extrabuf = self.extrabuf[offset:] + data
MemoryError

Please help

Failed on python trainMLP.py in get_started.sh

Dumping questions, answers, questionIDs, imageIDs, and questions lengths to text files...
100% |######################################################################################################################################################################################################|
completed dumping training data
Dumping questions, answers, questionIDs, imageIDs, and questions lengths to text files...
100% |######################################################################################################################################################################################################|
completed dumping validation data
Using Theano backend.
loaded vgg features
loaded word2vec features...
Compiling model...
Compilation done...
Training started...
128/215375 [..............................]Traceback (most recent call last):
File "trainMLP.py", line 116, in <module>
main()
File "trainMLP.py", line 108, in main
progbar.add(args.batch_size, values=[("train loss", loss)])
File "/usr/local/lib/python2.7/dist-packages/keras/utils/generic_utils.py", line 123, in add
self.update(self.seen_so_far+n, values)
File "/usr/local/lib/python2.7/dist-packages/keras/utils/generic_utils.py", line 92, in update
avg = self.sum_values[k][0] / max(1, self.sum_values[k][1])
TypeError: unsupported operand type(s) for /: 'list' and 'int'
Using Theano backend.
Traceback (most recent call last):
File "evaluateMLP.py", line 102, in <module>
main()
File "evaluateMLP.py", line 22, in main
model.load_weights(args.weights)
File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 781, in load_weights
for k in range(f.attrs['nb_layers']):
File "/usr/lib/python2.7/dist-packages/h5py/_hl/attrs.py", line 48, in __getitem__
attr = h5a.open(self._id, self._e(name))
File "h5a.pyx", line 74, in h5py.h5a.open (h5py/h5a.c:2107)
KeyError: "can't open attribute (Attribute: Can't open object)"
envy@ub1404envy:~/os_prj/github/_QA/visual-qa/scripts$

Add pre-trained model and demo script

Make an installation guide for the dependencies

vgg_feats.mat empty

When I tried to execute trainMLP.py, I got the following error:
raise MatReadError("Mat file appears to be empty")
scipy.io.matlab.miobase.MatReadError: Mat file appears to be empty

When I checked the file "vgg_feats.mat", it was empty.
Could you please provide a link to that file?

Running evaluation failed!

envy@ub1404envy:/os_prj/github/_QA/visual-qa/scripts$ python evaluateMLP.py -model ../models/mlp_num_hidden_units_1024_num_hidden_layers_3.json -weights ../models/mlp_num_hidden_units_1024_num_hidden_layers_3_epoch_00_loss_5.10.hdf5 -results ../results/mlp_1024_3_ep0.txt
Using Theano backend.
Traceback (most recent call last):
File "evaluateMLP.py", line 102, in <module>
main()
File "evaluateMLP.py", line 22, in main
model.load_weights(args.weights)
File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 781, in load_weights
for k in range(f.attrs['nb_layers']):
File "/usr/lib/python2.7/dist-packages/h5py/_hl/attrs.py", line 48, in __getitem__
attr = h5a.open(self._id, self._e(name))
File "h5a.pyx", line 74, in h5py.h5a.open (h5py/h5a.c:2107)
KeyError: "can't open attribute (Attribute: Can't open object)"
envy@ub1404envy:/os_prj/github/_QA/visual-qa/scripts$

envy@ub1404envy:/os_prj/github/_QA/visual-qa/scripts$ ll ../models/
total 365152
drwxrwxr-x 2 envy envy 4096 Feb 12 05:21 ./
drwxrwxr-x 9 envy envy 4096 Feb 11 09:54 ../
-rw-rw-r-- 1 envy envy 184 Feb 11 16:27 labelencoder.pkl
-rw-rw-r-- 1 envy envy 72080 Feb 11 16:27 labelencoder.pkl_01.npy
-rw-rw-r-- 1 envy envy 38055400 Feb 11 09:54 lstm_1_num_hidden_units_lstm_512_num_hidden_units_mlp_1024_num_hidden_layers_mlp_3_epoch_070.hdf5
-rw-rw-r-- 1 envy envy 1803 Feb 11 09:54 lstm_1_num_hidden_units_lstm_512_num_hidden_units_mlp_1024_num_hidden_layers_mlp_3.json
-rw-rw-r-- 1 envy envy 30520784 Feb 11 16:35 mlp_num_hidden_units_1024_num_hidden_layers_3_epoch_00.hdf5
-rw-rw-r-- 1 envy envy 800 Feb 11 15:49 mlp_num_hidden_units_1024_num_hidden_layers_3_epoch_00_loss_5.10.hdf5
-rw-rw-r-- 1 envy envy 30520784 Feb 11 17:52 mlp_num_hidden_units_1024_num_hidden_layers_3_epoch_10.hdf5
-rw-rw-r-- 1 envy envy 30520784 Feb 11 19:11 mlp_num_hidden_units_1024_num_hidden_layers_3_epoch_20.hdf5
-rw-rw-r-- 1 envy envy 30520784 Feb 11 20:27 mlp_num_hidden_units_1024_num_hidden_layers_3_epoch_30.hdf5
-rw-rw-r-- 1 envy envy 30520784 Feb 11 21:44 mlp_num_hidden_units_1024_num_hidden_layers_3_epoch_40.hdf5
-rw-rw-r-- 1 envy envy 30520784 Feb 11 23:01 mlp_num_hidden_units_1024_num_hidden_layers_3_epoch_50.hdf5
-rw-rw-r-- 1 envy envy 30520784 Feb 12 00:19 mlp_num_hidden_units_1024_num_hidden_layers_3_epoch_60.hdf5
-rw-rw-r-- 1 envy envy 30520784 Feb 12 01:39 mlp_num_hidden_units_1024_num_hidden_layers_3_epoch_70.hdf5
-rw-rw-r-- 1 envy envy 30520784 Feb 12 02:56 mlp_num_hidden_units_1024_num_hidden_layers_3_epoch_80.hdf5
-rw-rw-r-- 1 envy envy 30520784 Feb 12 04:12 mlp_num_hidden_units_1024_num_hidden_layers_3_epoch_90.hdf5
-rw-rw-r-- 1 envy envy 30520784 Feb 12 05:21 mlp_num_hidden_units_1024_num_hidden_layers_3_epoch_99.hdf5
-rw-rw-r-- 1 envy envy 1474 Feb 11 16:27 mlp_num_hidden_units_1024_num_hidden_layers_3.json
-rw-rw-r-- 1 envy envy 123 Feb 11 09:54 README.md
envy@ub1404envy:/os_prj/github/_QA/visual-qa/scripts$ ll ../models/mlp_num_hidden_units_1024_num_hidden_layers_3_epoch_00_loss_5.10.hdf5
-rw-rw-r-- 1 envy envy 800 Feb 11 15:49 ../models/mlp_num_hidden_units_1024_num_hidden_layers_3_epoch_00_loss_5.10.hdf5
envy@ub1404envy:~/os_prj/github/_QA/visual-qa/scripts$
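
One detail in the listing above: the file passed to -weights is 800 bytes, whereas every other MLP checkpoint is 30520784 bytes. Before debugging the loader, it may be worth ruling out a truncated or placeholder file. A minimal sketch of such a check follows; the helper names and the 1% threshold are hypothetical choices for illustration, and a small file is only a hint, not proof of a bad checkpoint.

```python
import os


def is_suspiciously_small(size_bytes, reference_bytes, ratio=0.01):
    """Return True if a file is far smaller than a known-good checkpoint."""
    return size_bytes < reference_bytes * ratio


def check_weights(path, reference_bytes):
    """Apply the size heuristic to a weights file on disk."""
    return is_suspiciously_small(os.path.getsize(path), reference_bytes)


# Sizes taken from the directory listing in the report.
print(is_suspiciously_small(800, 30520784))
print(is_suspiciously_small(30520784, 30520784))
```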

Complete the arguments part for all scripts. Also write new .sh scripts for running experiments.

Training Loss increases

Is the loss printed by trainLSTM_1.py the real training loss, or just the loss of some random epoch?
My training loss seems to increase after 2 epochs.
FYI: I used GloVe vectors as the word vectors.

We'll fix the spaCy vectors (make GloVe the default)

Hey,

Impressive system. I really regret not switching to GloVe vectors a long time ago. Thanks for putting up with the awkwardness of having to install extra vectors etc. We'll get this fixed.

The reason spaCy still ships with the Wikipedia vectors is sort of random. My plan since around May last year was to train POS specific vectors, but I never got around to this, until Trask et al published their sense2vec paper. We finally published a demo on this recently ( https://sense2vec.spacy.io ). I might have a go at using these vectors in your system :).

Soon we'll have a command shipped to install the GloVe vectors. We'll then make these the default, and offer the previous ones as a backwards compatibility pack.

KeyError: 'class_name'

rzai@rzai00:/prj/visual-qa/scripts$ python evaluateMLP.py -model ../models/lstm_1_num_hidden_units_lstm_512_num_hidden_units_mlp_1024_num_hidden_layers_mlp_3.json -weights ../models/mlp_num_hidden_units_1024_num_hidden_layers_3_epoch_00_loss_5.10.hdf5 -results ../results/mlp_1024_3_ep0.txt
Using Theano backend.
Using gpu device 0: GeForce GTX 1080 (CNMeM is disabled)
Traceback (most recent call last):
File "evaluateMLP.py", line 102, in <module>
main()
File "evaluateMLP.py", line 21, in main
model = model_from_json(open(args.model).read())
File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 197, in model_from_json
return layer_from_config(config, custom_objects=custom_objects)
File "/usr/local/lib/python2.7/dist-packages/keras/utils/layer_utils.py", line 25, in layer_from_config
class_name = config['class_name']
KeyError: 'class_name'
rzai@rzai00:/prj/visual-qa/scripts$

3rdParty folder is missing

Add option to resume training

evaluateMLP.py seems to fail in the example script (get_started.sh)

Running the evaluateMLP.py script as given in get_started.sh leads to a KeyError. Any idea what might be going wrong?

`$> python evaluateMLP.py -model ../models/mlp_num_hidden_units_1024_num_hidden_layers_3.json -weights ../models/mlp_num_hidden_units_1024_num_hidden_layers_3_epoch_00_loss_5.10.hdf5 -results ../results/mlp_1024_3_ep0.txt

Using Theano backend.
Couldn't import dot_parser, loading of dot files will not be possible.
/usr/local/lib/python2.7/dist-packages/theano/tensor/signal/downsample.py:5: UserWarning: downsample module has been moved to the pool module.
warnings.warn("downsample module has been moved to the pool module.")
Traceback (most recent call last):
File "evaluateMLP.py", line 102, in <module>
main()
File "evaluateMLP.py", line 22, in main
model.load_weights(args.weights)
File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 849, in load_weights
for k in range(f.attrs['nb_layers']):
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/tmp/pip-build-fkfoP6/h5py/h5py/_objects.c:2453)
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/tmp/pip-build-fkfoP6/h5py/h5py/_objects.c:2410)
File "/usr/local/lib/python2.7/dist-packages/h5py/_hl/attrs.py", line 52, in __getitem__
attr = h5a.open(self._id, self._e(name))
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/tmp/pip-build-fkfoP6/h5py/h5py/_objects.c:2453)
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/tmp/pip-build-fkfoP6/h5py/h5py/_objects.c:2410)
File "h5py/h5a.pyx", line 77, in h5py.h5a.open (/tmp/pip-build-fkfoP6/h5py/h5py/h5a.c:2057)
KeyError: "Can't open attribute (Can't locate attribute: 'nb_layers')"`

dumpText.py returns with 'Killed'

I installed visual-qa in a Docker container running on a Mac. While running get_started.sh, I ran into trouble with

python dumpText.py -split train -answers modal

Running that line separately, it was exiting prematurely with the simple message "Killed". I worked line by line through the code and
qa = json.load(open(annFile, 'r'))
appears to be the source of the problems. My best guess is that json.load attempts to read the entire training annotations file into memory and fails because of its size.

trainLSTM_1.py unreshapeable error

The environment was tested by running get_started.sh. However, when I run python trainLSTM_1.py, something goes wrong:

Training started...
Traceback (most recent call last):
File "trainLSTM_1.py", line 126, in <module>
main()
File "trainLSTM_1.py", line 116, in main
loss = model.train_on_batch([X_q_batch, X_i_batch], Y_batch)
File "/home/nate/miniconda2/lib/python2.7/site-packages/keras/models.py", line 804, in train_on_batch
return self._train(ins)
File "/home/nate/miniconda2/lib/python2.7/site-packages/keras/backend/theano_backend.py", line 448, in __call__
return self.function(*inputs)
File "/home/nate/miniconda2/lib/python2.7/site-packages/theano/compile/function_module.py", line 871, in __call__
storage_map=getattr(self.fn, 'storage_map', None))
File "/home/nate/miniconda2/lib/python2.7/site-packages/theano/gof/link.py", line 314, in raise_with_op
reraise(exc_type, exc_value, exc_trace)
File "/home/nate/miniconda2/lib/python2.7/site-packages/theano/compile/function_module.py", line 859, in __call__
outputs = self.fn()
ValueError: GpuReshape: cannot reshape input of shape (640, 512) to shape (21, 30, 512).
Apply node that caused the error: GpuReshape{3}(GpuElemwise{Add}[(0, 0)].0, TensorConstant{[ -1 30 512]})
Toposort index: 119
Inputs types: [CudaNdarrayType(float32, matrix), TensorType(int64, vector)]
Inputs shapes: [(640, 512), (3,)]
Inputs strides: [(512, 1), (8,)]
Inputs values: ['not shown', array([ -1, 30, 512])]
Outputs clients: [[GpuJoin(TensorConstant{2}, GpuReshape{3}.0, GpuReshape{3}.0, GpuReshape{3}.0, GpuReshape{3}.0)]]

Keyword argument not understood: truncate_gradient

I installed everything as described in the tutorial and successfully ran trainMLP.py.
But when I try to run demo_batch.py, I get this error:
  File "demo_batch.py", line 71, in <module>
    main()
  File "demo_batch.py", line 34, in main
    model = model_from_json(open(args.model).read())
  File "/home/deep/temp/vqa_web_demo/vqa/local/lib/python2.7/site-packages/keras/models.py", line 166, in model_from_json
    return model_from_config(config, custom_objects=custom_objects)
  File "/home/deep/temp/vqa_web_demo/vqa/local/lib/python2.7/site-packages/keras/models.py", line 177, in model_from_config
    model = container_from_config(config, custom_objects=custom_objects)
  File "/home/deep/temp/vqa_web_demo/vqa/local/lib/python2.7/site-packages/keras/utils/layer_utils.py", line 44, in container_from_config
    init_layer = container_from_config(layer)
  File "/home/deep/temp/vqa_web_demo/vqa/local/lib/python2.7/site-packages/keras/utils/layer_utils.py", line 35, in container_from_config
    init_layer = container_from_config(layer)
  File "/home/deep/temp/vqa_web_demo/vqa/local/lib/python2.7/site-packages/keras/utils/layer_utils.py", line 44, in container_from_config
    init_layer = container_from_config(layer)
  File "/home/deep/temp/vqa_web_demo/vqa/local/lib/python2.7/site-packages/keras/utils/layer_utils.py", line 102, in container_from_config
    base_layer = get_layer(name, layer_dict)
  File "/home/deep/temp/vqa_web_demo/vqa/local/lib/python2.7/site-packages/keras/utils/layer_utils.py", line 168, in get_layer
    instantiate=True, kwargs=kwargs)
  File "/home/deep/temp/vqa_web_demo/vqa/local/lib/python2.7/site-packages/keras/utils/generic_utils.py", line 18, in get_from_module
    return res(**kwargs)
  File "/home/deep/temp/vqa_web_demo/vqa/local/lib/python2.7/site-packages/keras/layers/recurrent.py", line 559, in __init__
    super(LSTM, self).__init__(**kwargs)
  File "/home/deep/temp/vqa_web_demo/vqa/local/lib/python2.7/site-packages/keras/layers/recurrent.py", line 132, in __init__
    super(Recurrent, self).__init__(**kwargs)
  File "/home/deep/temp/vqa_web_demo/vqa/local/lib/python2.7/site-packages/keras/layers/core.py", line 47, in __init__
    assert kwarg in allowed_kwargs, 'Keyword argument not understood: ' + kwarg
AssertionError: Keyword argument not understood: truncate_gradient

Why does this error happen, and how can I fix it? Thanks

Improper config format for lstm_1_num_hidden_units_lstm_512_num_hidden_units_mlp_1024_num_hidden_layers_mlp_3.json

Running own_image.py, I got this error:

File "/usr/local/lib/python2.7/dist-packages/keras/utils/generic_utils.py", line 122, in deserialize_keras_object
raise ValueError('Improper config format: ' + str(config))
ValueError: Improper config format: {u'layers': [{u'layers': [{u'layers': [{u'truncate_gradient': -1, u'name': u'LSTM', u'inner_activation': u'hard_sigmoid', u'activation': u'tanh', u'input_shape': [30, 300], u'init': u'glorot_uniform', u'inner_init': u'orthogonal', u'input_dim': None, u'return_sequences': False, u'output_dim': 512, u'forget_bias_init': u'one', u'input_length': None}], u'name': u'Sequential'}, {u'layers': [{u'dims': [4096], u'name': u'Reshape', u'input_shape': [4096]}], u'name': u'Sequential'}], u'concat_axis': 1, u'mode': u'concat', u'name': u'Merge'}, {u'b_constraint': None, u'name': u'Dense', u'activity_regularizer': None, u'W_constraint': None, u'init': u'uniform', u'input_dim': None, u'b_regularizer': None, u'W_regularizer': None, u'activation': u'linear', u'output_dim': 1024}, {u'beta': 0.1, u'target': 0, u'activation': u'tanh', u'name': u'Activation'}, {u'p': 0.5, u'name': u'Dropout'}, {u'b_constraint': None, u'name': u'Dense', u'activity_regularizer': None, u'W_constraint': None, u'init': u'uniform', u'input_dim': None, u'b_regularizer': None, u'W_regularizer': None, u'activation': u'linear', u'output_dim': 1024}, {u'beta': 0.1, u'target': 0, u'activation': u'tanh', u'name': u'Activation'}, {u'p': 0.5, u'name': u'Dropout'}, {u'b_constraint': None, u'name': u'Dense', u'activity_regularizer': None, u'W_constraint': None, u'init': u'uniform', u'input_dim': None, u'b_regularizer': None, u'W_regularizer': None, u'activation': u'linear', u'output_dim': 1024}, {u'beta': 0.1, u'target': 0, u'activation': u'tanh', u'name': u'Activation'}, {u'p': 0.5, u'name': u'Dropout'}, {u'b_constraint': None, u'name': u'Dense', u'activity_regularizer': None, u'W_constraint': None, u'init': u'glorot_uniform', u'input_dim': None, u'b_regularizer': None, u'W_regularizer': None, u'activation': u'linear', u'output_dim': 1000}, {u'beta': 0.1, u'target': 0, u'activation': u'softmax', u'name': u'Activation'}], u'name': u'Sequential'}

Possible Bug: not handling question mark

It's possible I'm mistaken, but it seems there's a bug in the way word embeddings are being computed in own_image.py

       question = unicode(raw_input("Ask a question: "))
       X_q = get_questions_tensor_timeseries([question], nlp, timesteps)

If question = "what color is the cat?"

the word "cat?" will be considered out of vocabulary (because there is no space before the question mark), and its word embedding will be an all-zero vector.
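
If that diagnosis is right, one possible workaround is to separate punctuation from the neighbouring word before the question is embedded. The sketch below is a plain regex preprocessing step; the helper name `separate_punctuation` is hypothetical, and whether the tokenizer used by own_image.py already handles this case is not confirmed here.

```python
import re


def separate_punctuation(question):
    """Put spaces around common punctuation so it no longer sticks to a word."""
    spaced = re.sub(r"([?!.,;:])", r" \1 ", question)
    # Collapse the extra whitespace introduced by the substitution.
    return " ".join(spaced.split())


print(separate_punctuation("what color is the cat?"))
```

Note that this simple rule also splits decimals such as "3.5", so it is only a starting point.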

Issue with evaluation code

There's an issue with the evaluation code.

My old definition of accuracy:
If the Neural Net-generated answer matches at least three human answers, then the accuracy of that answer is 1, else 0.

Actual definition of accuracy in the VQA challenge:
Let n be the number of human answers that exactly match the neural net answer. Then acc = min(n/3, 1). This gives a score of 0.33 if there is exactly one match between human and neural net, and 0.66 if there are exactly two matches.

I will be fixing this and updating the results soon. Should give a bump to the validation set performance numbers that I reported earlier.
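
For reference, the two definitions above can be written side by side. This is a minimal sketch that assumes answers are compared as exact strings; the function names are illustrative and not taken from the evaluation scripts.

```python
def old_accuracy(predicted, human_answers):
    """Old definition: 1 if at least three human answers match, else 0."""
    matches = sum(1 for answer in human_answers if answer == predicted)
    return 1.0 if matches >= 3 else 0.0


def vqa_accuracy(predicted, human_answers):
    """VQA challenge definition: acc = min(n / 3, 1) for n exact matches."""
    matches = sum(1 for answer in human_answers if answer == predicted)
    return min(matches / 3.0, 1.0)


humans = ["red", "red", "blue", "dark red", "blue"]
print(old_accuracy("red", humans), vqa_accuracy("red", humans))
```

In this example two humans answered "red", so the old definition scores the prediction 0 while the challenge definition gives partial credit.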

improper json format

lstm_1_num_hidden_units_lstm_512_num_hidden_units_mlp_1024_num_hidden_layers_mlp_3.json seems to be improperly formatted. Any help with this?

Not working on TensorFlow

This uses Keras but doesn't seem to support TensorFlow. Can we get TensorFlow support?

input_dim has multiple values?

$ python trainMLP.py
loaded vgg features
loaded word2vec features...
Traceback (most recent call last):
File "trainMLP.py", line 114, in <module>
main()
File "trainMLP.py", line 62, in main
model.add(Dense(args.num_hidden_units, input_dim=img_dim+word_vec_dim, init='uniform'))
TypeError: __init__() got multiple values for keyword argument 'input_dim'

Not able to understand this. Seems fine to me. This is the way I'd do it. Insights?
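
Python raises this kind of error when the same parameter is supplied both positionally and by keyword. That would happen here if the installed Keras version declares `input_dim` as the first positional parameter of `Dense`, which is an assumption the traceback alone does not confirm. The stand-in function below is hypothetical and only reproduces the mechanism; it is not the Keras API.

```python
def dense(input_dim, output_dim, init="glorot_uniform"):
    """Stand-in for a layer constructor whose first positional parameter is input_dim."""
    return (input_dim, output_dim, init)


message = None
try:
    # 1024 binds positionally to input_dim, then input_dim is passed again by keyword.
    dense(1024, input_dim=4396, init="uniform")
except TypeError as exc:
    message = str(exc)
print(message)

# Naming both arguments avoids the clash regardless of their declared order.
print(dense(input_dim=4396, output_dim=1024, init="uniform"))
```

If this is the cause, checking the `Dense` signature of the installed Keras version against the one the script was written for would be the next step.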