I'm interested in your approach, and I'd like to run your model and reproduce your results so I can use it in my own research. However, I am unable to get the system up and running.
I've tried to run the model in a Python 3.7 Docker container to ensure a clean environment and make the error reproducible.
docker run -it --volume=${PWD}:/app --name webke python:3.7.9 /bin/bash
I've downloaded the weights, models and dataset to the correct locations, and then tried executing the following commands in the Docker container:
# python --version
Python 3.7.13
# pip --version
pip 22.0.4 from /usr/local/lib/python3.7/site-packages/pip (python 3.7)
# pip install bert4keras==0.10.0 tensorflow==2.2.0 beautifulsoup4 tqdm
...
# cd app/webke
# mkdir results
# python html_extract_with_pos.py
2022-04-07 15:07:44.716868: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2022-04-07 15:07:44.716896: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2022-04-07 15:07:44.716918: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (8d3b7a0a3317): /proc/driver/nvidia/version does not exist
2022-04-07 15:07:44.717074: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2022-04-07 15:07:44.738792: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2599990000 Hz
2022-04-07 15:07:44.739629: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f8f78000b20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-04-07 15:07:44.739650: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
weight/tiny_object_nbaplayer_with_pos.weights
load data: 19941it [00:01, 14690.62it/s]
0it [00:00, ?it/s]354
59
WARNING:tensorflow:5 out of the last 7 calls to <function Model.make_predict_function.<locals>.predict_function at 0x7f8edee53830> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings is likely due to passing python objects instead of tensors. Also, tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. Please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for more details.
...
WARNING:tensorflow:6 out of the last 11 calls to <function Model.make_predict_function.<locals>.predict_function at 0x7f8eb9a23b90> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings is likely due to passing python objects instead of tensors. Also, tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. Please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for more details.
Traceback (most recent call last):
File "html_extract_with_pos.py", line 180, in <module>
evaluate(data)
File "html_extract_with_pos.py", line 145, in evaluate
R = set([pred for pred in html_extract(d)])
File "html_extract_with_pos.py", line 124, in html_extract
preds_list = extract_preds(string, pos)
File "/app/webke/predicate_extraction_with_pos.py", line 221, in extract_preds
pred_preds = pred_model.predict([token_ids, segment_ids, x0, y0, x1, y1])
File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 88, in _method_wrapper
return method(self, *args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 1268, in predict
tmp_batch_outputs = predict_function(iterator)
File "/usr/local/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 580, in __call__
result = self._call(*args, **kwds)
File "/usr/local/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 618, in _call
results = self._stateful_fn(*args, **kwds)
File "/usr/local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 2420, in __call__
return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
File "/usr/local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1665, in _filtered_call
self.captured_inputs)
File "/usr/local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1746, in _call_flat
ctx, args, cancellation_manager=cancellation_manager))
File "/usr/local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 598, in call
ctx=ctx)
File "/usr/local/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0,512] = 512 is not in [0, 512)
[[node model_3/Embedding-Position/Gather (defined at /app/webke/layers.py:625) ]] [Op:__inference_predict_function_63294]
Errors may have originated from an input operation.
Input Source operations connected to node model_3/Embedding-Position/Gather:
model_3/Embedding-Position/ReadVariableOp/resource (defined at /app/webke/layers.py:618)
model_3/Embedding-Position/strided_slice_2 (defined at /app/webke/layers.py:597)
Function call stack:
predict_function
f1: 0.82353, precision: 1.00000, recall: 0.70000: : 1it [00:29, 29.27s/it]
When I run it multiple times, it does not always crash at the same stage: sometimes it fails during segment extraction, and other times during predicate extraction. Judging by the error (indices[0,512] = 512 is not in [0, 512) at the Embedding-Position/Gather node), the input passed into the model appears to be longer than the 512 positions the position embedding supports, but I cannot figure out what I am doing wrong.
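To make my reading of the error concrete, here is a minimal pure-Python sketch. It is not code from this repository: the constant MAX_POSITIONS and the helpers position_ids, out_of_range_positions and truncate_inputs are hypothetical names I made up, and the limit of 512 is only inferred from the error message. It shows that a 513-token input requests position index 512, which falls outside a 512-row position table, and that cutting the inputs to 512 tokens avoids the lookup.

```python
# Assumed from the error message: valid position indices are [0, 512).
MAX_POSITIONS = 512


def position_ids(num_tokens):
    """Position indices a BERT-style position embedding would look up."""
    return list(range(num_tokens))


def out_of_range_positions(num_tokens, max_positions=MAX_POSITIONS):
    """Return the position indices that fall outside the embedding table."""
    return [p for p in position_ids(num_tokens) if p >= max_positions]


def truncate_inputs(token_ids, segment_ids, max_positions=MAX_POSITIONS):
    """Hypothetical workaround: cut both sequences to the table size."""
    return token_ids[:max_positions], segment_ids[:max_positions]


# A 513-token input asks for position 512, matching the reported index.
print(out_of_range_positions(513))  # [512]
print(out_of_range_positions(512))  # []

# Dummy 600-token input, truncated to the assumed limit.
token_ids, segment_ids = truncate_inputs(list(range(600)), [0] * 600)
print(len(token_ids), len(segment_ids))  # 512 512
```

If this is indeed the cause, I assume the other inputs passed to pred_model.predict (x0, y0, x1, y1) would need to be shortened in the same way, but I have not verified that against the model code.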
I don't know if it is relevant, but I am currently not using GPU acceleration, which is why the log starts with the libcuda.so.1 and cuInit messages.
Hopefully you have an idea about what is going wrong and you can point me in the right direction to fix it.