relevanceai / vectorhub

Vector Hub - Library for easy discovery, and consumption of State-of-the-art models to turn data into vectors. (text2vec, image2vec, video2vec, graph2vec, bert, inception, etc)

Home Page: https://relevanceai.com

License: Apache License 2.0

Makefile 0.28% Python 99.72%

python vector embeddings encodings vector-similarity transformers tfhub machine-learning deeplearning artificial-intelligence

vectorhub's Issues

NameError: name 'sf' is not defined

Is soundfile imported correctly? I was trying wav2vec and got this error.

I tried importing it, and uninstalling and reinstalling it, but without success.

Any help will be appreciated.
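
A NameError on sf usually means the soundfile import never ran in the module that uses it. One plausible cause, which is an assumption and not something confirmed in this thread, is that the package is missing from the active environment. A minimal sketch for checking that with the standard library (is_installed is a hypothetical helper name, not part of vectorhub):

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Run this in the same interpreter/environment that runs vectorhub.
for name in ("soundfile",):
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If this reports "missing", installing soundfile into that same environment is the first thing to try.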

WARNING: tensorflow: 11 out of the last 11 calls

Hello community,

I get this warning and would like to know what causes it and how to deal with it:

WARNING:tensorflow:11 out of the last 11 calls to <function recreate_function.<locals>.restored_function_body at 0x7fea085935e0> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings is likely due to passing python objects instead of tensors. Also, tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. Please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for more details.

Thanks a lot!
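
The warning text itself names the likely causes: Python objects passed where tensors are expected, or input shapes that change between calls. If the encodings are correct and only the log noise is a concern, one option is to raise the log level of the standard Python logger named "tensorflow". This is a sketch that only hides the message; it does not remove the retracing cost.

```python
import logging

# TensorFlow sends its Python-side warnings through the logger named "tensorflow".
tf_logger = logging.getLogger("tensorflow")

# Hide WARNING and below; errors are still shown.
tf_logger.setLevel(logging.ERROR)
```

Setting the level before the model is loaded keeps the retracing warnings out of the output from the start.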

Thanks for your very useful and handy project!
https://github.com/Synerise/cleora is a very fast and responsible project written in Rust. Could you leverage it in your library, for example by adding it to your approaches for graph embeddings?
I used Cleora on a large amount of data and it is very fast and memory efficient.

Include examples in Vectorhub on utilisation of GPUs and encoding in parallel.

Include examples in vectorhub of how to utilise GPUs and how to run encode in parallel.
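
As a starting point for the parallel part of this request, here is a minimal sketch using only the standard library. It splits the inputs into batches and encodes the batches on a thread pool. fake_encode_batch is a hypothetical stand-in for a real model's batch encoder, and encode_in_parallel and chunked are illustrative names, not vectorhub APIs. Whether threads actually speed up a given model depends on the framework, and GPU placement is not covered here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def chunked(items: Sequence[str], size: int) -> List[Sequence[str]]:
    """Split items into consecutive batches of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def encode_in_parallel(
    encode_batch: Callable[[Sequence[str]], List[List[float]]],
    items: Sequence[str],
    batch_size: int = 2,
    workers: int = 2,
) -> List[List[float]]:
    """Encode batches concurrently and return vectors in input order."""
    batches = chunked(items, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the same order as the submitted batches.
        results = list(pool.map(encode_batch, batches))
    return [vector for batch in results for vector in batch]


def fake_encode_batch(texts: Sequence[str]) -> List[List[float]]:
    """Hypothetical encoder: one-dimensional vector holding the text length."""
    return [[float(len(text))] for text in texts]


vectors = encode_in_parallel(fake_encode_batch, ["a", "bb", "ccc", "dddd", "eeeee"])
print(vectors)  # [[1.0], [2.0], [3.0], [4.0], [5.0]]
```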

Tensorflow 2.4 Support

I'm running into an issue with TensorFlow 2.4:

import tensorflow as tf
tf.__version__
>>> 2.4.0

from vectorhub.encoders.text.tfhub import Bert2Vec
model = Bert2Vec()
>>> 
    ValueError: Could not find matching function to call loaded from the SavedModel. Got:
      Positional arguments (3 total):
        * {'input_word_ids': <tf.Tensor 'inputs_2:0' shape=(None, 512) dtype=int32>, 'input_mask': <tf.Tensor 'inputs:0' shape=(None, 512) dtype=int32>, 'input_type_ids': <tf.Tensor 'inputs_1:0' shape=(None, 512) dtype=int32>}
        * False
        * None
      Keyword arguments: {}

    Expected these arguments to match one of the following 4 option(s):

    Option 1:
      Positional arguments (3 total):
        * [TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/0'), TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/1'), TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/2')]
        * False
        * None
      Keyword arguments: {}

    Option 2:
      Positional arguments (3 total):
        * [TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/0'), TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/1'), TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/2')]
        * True
        * None
      Keyword arguments: {}

    Option 3:
      Positional arguments (3 total):
        * [TensorSpec(shape=(None, None), dtype=tf.int32, name='input_word_ids'), TensorSpec(shape=(None, None), dtype=tf.int32, name='input_mask'), TensorSpec(shape=(None, None), dtype=tf.int32, name='input_type_ids')]
        * True
        * None
      Keyword arguments: {}

    Option 4:
      Positional arguments (3 total):
        * [TensorSpec(shape=(None, None), dtype=tf.int32, name='input_word_ids'), TensorSpec(shape=(None, None), dtype=tf.int32, name='input_mask'), TensorSpec(shape=(None, None), dtype=tf.int32, name='input_type_ids')]
        * False
        * None
      Keyword arguments: {}

Any suggestion on how to fix that?
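
Reading the error, the call passed a dict of named tensors, while all four listed options expect a list of three tensors. One thing that could be tried, as an unverified assumption and not a confirmed fix, is converting the dict to a list in the order shown in Options 3 and 4. A minimal sketch of that conversion (to_positional_inputs is a hypothetical helper; plain lists stand in for tensors so it runs without TensorFlow):

```python
from typing import Any, Dict, List

# Order taken from Options 3 and 4 in the error message above.
INPUT_ORDER = ("input_word_ids", "input_mask", "input_type_ids")


def to_positional_inputs(features: Dict[str, Any]) -> List[Any]:
    """Turn a dict of named inputs into a list ordered as in INPUT_ORDER."""
    missing = [name for name in INPUT_ORDER if name not in features]
    if missing:
        raise KeyError(f"missing inputs: {missing}")
    return [features[name] for name in INPUT_ORDER]


# Plain lists stand in for tensors in this sketch.
example = {
    "input_mask": [1, 1, 0],
    "input_type_ids": [0, 0, 0],
    "input_word_ids": [101, 2023, 102],
}
print(to_positional_inputs(example))  # [[101, 2023, 102], [1, 1, 0], [0, 0, 0]]
```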

CLIP2vec encode_image error

When I try the listed example for getting image vectors from CLIP:

from vectorhub.bi_encoders.text_image.torch import Clip2Vec
model = Clip2Vec()
model.encode_image('https://getvectorai.com/assets/hub-logo-with-text.png')

I get the following trace:

/home/is2961/anaconda3/lib/python3.9/site-packages/vectorhub/base.py:62: UserWarning: Unable to encode. Filling in with dummy vector.
  warnings.warn("Unable to encode. Filling in with dummy vector.")
Traceback (most recent call last):
  File "/home/is2961/anaconda3/lib/python3.9/site-packages/vectorhub/base.py", line 42, in catch_vector
    return func(*args, **kwargs)
  File "/home/is2961/anaconda3/lib/python3.9/site-packages/vectorhub/bi_encoders/text_image/torch/clip.py", line 101, in encode_image
    return self.model.encode_image(image).detach().numpy().tolist()[0]
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/multimodal/model/multimodal_transformer/___torch_mangle_9591.py", line 19, in encode_image
    _0 = self.visual
    input = torch.to(image, torch.device("cuda:0"), 5, False, False, None)
    return (_0).forward(input, )
            ~~~~~~~~~~~ <--- HERE
  def encode_text(self: __torch__.multimodal.model.multimodal_transformer.___torch_mangle_9591.Multimodal,
    input: Tensor) -> Tensor:
  File "code/__torch__/multimodal/model/multimodal_transformer.py", line 20, in forward
    _4 = self.positional_embedding
    _5 = self.class_embedding
    _6 = (self.conv1).forward(input, )
          ~~~~~~~~~~~~~~~~~~~ <--- HERE
    _7 = ops.prim.NumToTensor(torch.size(_6, 0))
    _8 = int(_7)
  File "code/__torch__/torch/nn/modules/conv/___torch_mangle_9366.py", line 8, in forward
  def forward(self: __torch__.torch.nn.modules.conv.___torch_mangle_9366.Conv2d,
    input: Tensor) -> Tensor:
    x = torch._convolution(input, self.weight, None, [32, 32], [0, 0], [1, 1], False, [0, 0], 1, False, False, True, True)
        ~~~~~~~~~~~~~~~~~~ <--- HERE
    return x
  def forward1(self: __torch__.torch.nn.modules.conv.___torch_mangle_9366.Conv2d,

Traceback of TorchScript, original code (most recent call last):
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/conv.py(420): _conv_forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/conv.py(423): forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(709): _slow_forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(725): _call_impl
/root/workspace/multimodal-pytorch/multimodal/model/multimodal_transformer.py(85): forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(709): _slow_forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(725): _call_impl
/root/workspace/multimodal-pytorch/multimodal/model/multimodal_transformer.py(221): visual_forward
/opt/conda/lib/python3.7/site-packages/torch/jit/_trace.py(940): trace_module
<ipython-input-1-40b054242c5d>(36): export_torchscript_models
<ipython-input-2-808c11c4d1cf>(3): <module>
/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py(3418): run_code
/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py(3338): run_ast_nodes
/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py(3147): run_cell_async
/opt/conda/lib/python3.7/site-packages/IPython/core/async_helpers.py(68): _pseudo_sync_runner
/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py(2923): _run_cell
/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py(2878): run_cell
/opt/conda/lib/python3.7/site-packages/IPython/terminal/interactiveshell.py(555): interact
/opt/conda/lib/python3.7/site-packages/IPython/terminal/interactiveshell.py(564): mainloop
/opt/conda/lib/python3.7/site-packages/IPython/terminal/ipapp.py(356): start
/opt/conda/lib/python3.7/site-packages/traitlets/config/application.py(845): launch_instance
/opt/conda/lib/python3.7/site-packages/IPython/__init__.py(126): start_ipython
/opt/conda/bin/ipython(8): <module>
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [768, 3, 32, 32], but got 5-dimensional input of size [1, 1, 3, 224, 224] instead

Is there an easy fix for this?
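
The final RuntimeError says the convolution expects a 4-dimensional input but received one of size [1, 1, 3, 224, 224], so the image appears to carry one extra leading batch dimension. A minimal sketch of computing the shape with the surplus leading singleton removed follows. drop_leading_singletons is a hypothetical helper, and where in vectorhub the extra dimension is added is not established in this thread.

```python
from typing import Tuple


def drop_leading_singletons(shape: Tuple[int, ...], rank: int = 4) -> Tuple[int, ...]:
    """Remove leading size-1 dimensions until the shape has `rank` dimensions."""
    dims = list(shape)
    while len(dims) > rank and dims[0] == 1:
        dims.pop(0)
    if len(dims) != rank:
        raise ValueError(f"cannot reduce shape {shape} to rank {rank}")
    return tuple(dims)


fixed_shape = drop_leading_singletons((1, 1, 3, 224, 224))
print(fixed_shape)  # (1, 3, 224, 224)
```

With a real tensor, the resulting shape could be passed to its reshape method before calling the model.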

Typo in https://hub.getvectorai.com/

The code example for Dense Passage Retrieval doesn't work because an "s" is missing from the module path:
from vectorhub.bi_encoder.text_text.torch_transformers import DPR2Vec
should be changed to
from vectorhub.bi_encoders.text_text.torch_transformers import DPR2Vec

Blox fruuuuuut

How to use code2vec to generate vectors?

Other __2vecs

Congratulations and thank you for this amazing initiative!
How about adding cat2vec, for categorical, tabular data, and node2vec, for network graphs?
Best wishes,
Milcent

Add VL-BERT to provide Image and Text shared vector space

Include a new VLBERT model, combining image and text into the same vector space.

https://github.com/jackroos/VL-BERT

Update summarisation model

Test and assess the summarisation model for vectorhub cards using https://github.com/allenai/scitldr.

'hub' is not defined

Hello community,

I got this error:

>>> from vectorhub.encoders.image.tfhub import BitSmall2Vec
>>> model=BitSmall2Vec()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ivaadmin/anaconda3/envs/x2vec/lib/python3.9/site-packages/vectorhub/encoders/image/tfhub/bit.py", line 27, in __init__
    self.init(model_url)
  File "/home/ivaadmin/anaconda3/envs/x2vec/lib/python3.9/site-packages/vectorhub/encoders/image/tfhub/bit.py", line 34, in init
    self.model = hub.load(self.model_url)
NameError: name 'hub' is not defined

What is missing?
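
A NameError like this, raised at call time instead of an ImportError at import time, is what happens when an optional import fails quietly and the name is used later. The sketch below reproduces that pattern with a placeholder module name. That bit.py imports tensorflow_hub this way is an assumption, not something shown in the traceback.

```python
try:
    # Placeholder for an optional dependency that is not installed.
    import a_module_that_is_not_installed as hub
except ImportError:
    pass  # The failure is swallowed, so `hub` is never bound.


def load_model(url: str):
    # Fails only when called, because `hub` does not exist in this module.
    return hub.load(url)


try:
    load_model("placeholder-url")
except NameError as err:
    message = str(err)
    print(message)  # name 'hub' is not defined
```

If this is the cause, installing the missing package (here, presumably tensorflow_hub) in the same environment should make the import succeed.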

SentenceTransformer2Vec example from docs raising a ModelError

Hello,
I've tried following the docs for SentenceTransformer2Vec.
I followed the steps listed on the page and installed the package along with the model. When I import and instantiate it as the docs suggest, I see the following error:

from vectorhub.encoders.text.sentence_transformers import SentenceTransformer2Vec
model = SentenceTransformer2Vec('bert-base-uncased')
model.encode("I enjoy taking long walks along the beach with my dog.")

---------------------------------------------------------------------------
ModelError                                Traceback (most recent call last)
<ipython-input-360-23fbb894168d> in <module>
      1 from vectorhub.encoders.text.sentence_transformers import SentenceTransformer2Vec
----> 2 model = SentenceTransformer2Vec('bert-base-uncased')
      3 model.encode("I enjoy taking long walks along the beach with my dog.")

~/opt/anaconda3/envs/kaggle/lib/python3.7/site-packages/vectorhub/encoders/text/sentence_transformers/sentence_auto_transformers.py in __init__(self, model_name)
     45     def __init__(self, model_name: str):
     46         self.list_of_urls = LIST_OF_URLS
---> 47         self.validate_model_url(model_name, LIST_OF_URLS)
     48         self.vector_length = LIST_OF_URLS[model_name]["vector_length"]
     49         self.model = SentenceTransformer(model_name)

~/opt/anaconda3/envs/kaggle/lib/python3.7/site-packages/vectorhub/base.py in validate_model_url(cls, model_url, list_of_urls)
     71             return True
     72         raise ModelError(
---> 73             message="We currently not support this url. If issue persist then contact us.")
     74 
     75     @classmethod

ModelError:

It seems like bert-base-uncased is not present in the LIST_OF_URLS.
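
The traceback shows the constructor validating model_name against LIST_OF_URLS and then reading LIST_OF_URLS[model_name]["vector_length"], so only names that are keys of that dict are accepted. A minimal sketch of the same kind of check with a more informative message follows. SUPPORTED_MODELS and its entries are hypothetical placeholders, not the library's real list, and check_model_name is an illustrative helper.

```python
from typing import Dict

# Hypothetical placeholder entries; the real LIST_OF_URLS contents are not shown above.
SUPPORTED_MODELS: Dict[str, Dict[str, int]] = {
    "example-model-a": {"vector_length": 768},
    "example-model-b": {"vector_length": 384},
}


def check_model_name(model_name: str, supported: Dict[str, Dict[str, int]]) -> int:
    """Return the vector length for a supported model, or fail with the valid names."""
    if model_name not in supported:
        options = ", ".join(sorted(supported))
        raise ValueError(f"{model_name!r} is not supported. Available: {options}")
    return supported[model_name]["vector_length"]


print(check_model_name("example-model-a", SUPPORTED_MODELS))  # 768
```

Printing the keys of LIST_OF_URLS from sentence_auto_transformers.py would show which model names the installed version accepts.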
