relevanceai / vectorhub

Vector Hub: a library for easy discovery and consumption of state-of-the-art models that turn data into vectors (text2vec, image2vec, video2vec, graph2vec, bert, inception, etc.).

Home Page: https://relevanceai.com

License: Apache License 2.0

Languages: Python 99.72%, Makefile 0.28%
Topics: python, vector, embeddings, encodings, vector-similarity, transformers, tfhub, machine-learning, deeplearning, artificial-intelligence

vectorhub's Issues

NameError: name 'sf' is not defined

Is soundfile imported correctly? I was trying wav2vec and got this error.

I tried uninstalling and reinstalling, but that did not help.

Any help will be appreciated
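
A NameError (rather than an ImportError) means the name sf was never bound, which would be consistent with the soundfile import not succeeding in the interpreter that runs wav2vec. That is an assumption, not something confirmed in this thread. A minimal sketch for checking whether the module is visible in the current environment; the helper name missing_modules is hypothetical, and "soundfile" is assumed to be the module behind the alias sf:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "soundfile" is assumed to be the module that is aliased as `sf`.
    print(missing_modules(["soundfile"]))
```

An empty list would mean the module can be found, and the cause lies elsewhere; a non-empty list would point to the package being installed into a different environment than the one running the code.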

WARNING: tensorflow: 11 out of the last 11 calls

Hello community,

I get this warning and would like to ask what causes it and how to deal with it:

WARNING:tensorflow:11 out of the last 11 calls to <function recreate_function.<locals>.restored_function_body at 0x7fea085935e0> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings is likely due to passing python objects instead of tensors. Also, tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. Please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for more details.

Thanks a lot!

embeddings of 1e-07 values

Hello community,

All img2vec models return only 1e-07 values for me (I used the same picture as in the given example):

from vectorhub.encoders.image.tfhub import BitSmall2Vec
image_encoder = BitSmall2Vec()
# https://www.google.com/images/branding/googlelogo/2x/googlelogo_color_92x30dp.png is saved locally as pic.png
sample = image_encoder.read('./pic.png')
emb = image_encoder.encode(sample)
[1e-07, ... 1e-07]

What am I doing wrong?
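
Another issue on this page shows vectorhub warning "Unable to encode. Filling in with dummy vector." when encoding fails, so a vector made up only of 1e-07 values may be such a fallback rather than a real embedding. This is an assumption. The fill value below is taken from the output above, and the helper name is hypothetical. A small sketch for detecting the case:

```python
import math


def looks_like_dummy_vector(vector, fill_value=1e-7):
    """Return True if every entry is (approximately) the constant fill value.

    fill_value=1e-7 is taken from the output in this issue; that a failed
    encode produces exactly this value is an assumption.
    """
    return len(vector) > 0 and all(
        math.isclose(x, fill_value, rel_tol=1e-9, abs_tol=0.0) for x in vector
    )
```

If the check returns True for emb, the console output around the encode call is worth inspecting for a warning that names the underlying failure.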

Use cleora

Thanks for your very useful and handy project!
https://github.com/Synerise/cleora is a very fast and responsible project written in Rust. Could you leverage the power of this project in your library, for example by adding it to your approaches for graph embeddings?
I used Cleora on a large amount of data and it is very fast and memory efficient.

Tensorflow 2.4 Support

I'm running into an issue with TensorFlow 2.4:

import tensorflow as tf
tf.__version__
>>> 2.4.0

from vectorhub.encoders.text.tfhub import Bert2Vec
model = Bert2Vec()
>>> 
    ValueError: Could not find matching function to call loaded from the SavedModel. Got:
      Positional arguments (3 total):
        * {'input_word_ids': <tf.Tensor 'inputs_2:0' shape=(None, 512) dtype=int32>, 'input_mask': <tf.Tensor 'inputs:0' shape=(None, 512) dtype=int32>, 'input_type_ids': <tf.Tensor 'inputs_1:0' shape=(None, 512) dtype=int32>}
        * False
        * None
      Keyword arguments: {}

    Expected these arguments to match one of the following 4 option(s):

    Option 1:
      Positional arguments (3 total):
        * [TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/0'), TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/1'), TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/2')]
        * False
        * None
      Keyword arguments: {}

    Option 2:
      Positional arguments (3 total):
        * [TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/0'), TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/1'), TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/2')]
        * True
        * None
      Keyword arguments: {}

    Option 3:
      Positional arguments (3 total):
        * [TensorSpec(shape=(None, None), dtype=tf.int32, name='input_word_ids'), TensorSpec(shape=(None, None), dtype=tf.int32, name='input_mask'), TensorSpec(shape=(None, None), dtype=tf.int32, name='input_type_ids')]
        * True
        * None
      Keyword arguments: {}

    Option 4:
      Positional arguments (3 total):
        * [TensorSpec(shape=(None, None), dtype=tf.int32, name='input_word_ids'), TensorSpec(shape=(None, None), dtype=tf.int32, name='input_mask'), TensorSpec(shape=(None, None), dtype=tf.int32, name='input_type_ids')]
        * False
        * None
      Keyword arguments: {}

Any suggestions on how to fix that?

CLIP2vec encode_image error

When I try the listed example for getting image vectors from CLIP:

from vectorhub.bi_encoders.text_image.torch import Clip2Vec
model = Clip2Vec()
model.encode_image('https://getvectorai.com/assets/hub-logo-with-text.png')

I get the following trace:

/home/is2961/anaconda3/lib/python3.9/site-packages/vectorhub/base.py:62: UserWarning: Unable to encode. Filling in with dummy vector.
  warnings.warn("Unable to encode. Filling in with dummy vector.")
Traceback (most recent call last):
  File "/home/is2961/anaconda3/lib/python3.9/site-packages/vectorhub/base.py", line 42, in catch_vector
    return func(*args, **kwargs)
  File "/home/is2961/anaconda3/lib/python3.9/site-packages/vectorhub/bi_encoders/text_image/torch/clip.py", line 101, in encode_image
    return self.model.encode_image(image).detach().numpy().tolist()[0]
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/multimodal/model/multimodal_transformer/___torch_mangle_9591.py", line 19, in encode_image
    _0 = self.visual
    input = torch.to(image, torch.device("cuda:0"), 5, False, False, None)
    return (_0).forward(input, )
            ~~~~~~~~~~~ <--- HERE
  def encode_text(self: __torch__.multimodal.model.multimodal_transformer.___torch_mangle_9591.Multimodal,
    input: Tensor) -> Tensor:
  File "code/__torch__/multimodal/model/multimodal_transformer.py", line 20, in forward
    _4 = self.positional_embedding
    _5 = self.class_embedding
    _6 = (self.conv1).forward(input, )
          ~~~~~~~~~~~~~~~~~~~ <--- HERE
    _7 = ops.prim.NumToTensor(torch.size(_6, 0))
    _8 = int(_7)
  File "code/__torch__/torch/nn/modules/conv/___torch_mangle_9366.py", line 8, in forward
  def forward(self: __torch__.torch.nn.modules.conv.___torch_mangle_9366.Conv2d,
    input: Tensor) -> Tensor:
    x = torch._convolution(input, self.weight, None, [32, 32], [0, 0], [1, 1], False, [0, 0], 1, False, False, True, True)
        ~~~~~~~~~~~~~~~~~~ <--- HERE
    return x
  def forward1(self: __torch__.torch.nn.modules.conv.___torch_mangle_9366.Conv2d,

Traceback of TorchScript, original code (most recent call last):
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/conv.py(420): _conv_forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/conv.py(423): forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(709): _slow_forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(725): _call_impl
/root/workspace/multimodal-pytorch/multimodal/model/multimodal_transformer.py(85): forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(709): _slow_forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(725): _call_impl
/root/workspace/multimodal-pytorch/multimodal/model/multimodal_transformer.py(221): visual_forward
/opt/conda/lib/python3.7/site-packages/torch/jit/_trace.py(940): trace_module
<ipython-input-1-40b054242c5d>(36): export_torchscript_models
<ipython-input-2-808c11c4d1cf>(3): <module>
/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py(3418): run_code
/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py(3338): run_ast_nodes
/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py(3147): run_cell_async
/opt/conda/lib/python3.7/site-packages/IPython/core/async_helpers.py(68): _pseudo_sync_runner
/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py(2923): _run_cell
/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py(2878): run_cell
/opt/conda/lib/python3.7/site-packages/IPython/terminal/interactiveshell.py(555): interact
/opt/conda/lib/python3.7/site-packages/IPython/terminal/interactiveshell.py(564): mainloop
/opt/conda/lib/python3.7/site-packages/IPython/terminal/ipapp.py(356): start
/opt/conda/lib/python3.7/site-packages/traitlets/config/application.py(845): launch_instance
/opt/conda/lib/python3.7/site-packages/IPython/__init__.py(126): start_ipython
/opt/conda/bin/ipython(8): <module>
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [768, 3, 32, 32], but got 5-dimensional input of size [1, 1, 3, 224, 224] instead

Is there an easy fix for this?
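
The last line of the trace says the convolution expects a 4-dimensional input (batch, channels, height, width) but received size [1, 1, 3, 224, 224], that is, one extra leading dimension of length 1. The trace does not show where that extra dimension is added. The sketch below only illustrates the shape arithmetic on plain tuples; squeeze_leading_to_rank is a hypothetical helper, not part of vectorhub or PyTorch:

```python
def squeeze_leading_to_rank(shape, expected_rank):
    """Drop leading length-1 dimensions until the shape has expected_rank dims.

    Raises ValueError if the shape cannot be reduced that way.
    """
    shape = tuple(shape)
    while len(shape) > expected_rank and shape[0] == 1:
        shape = shape[1:]
    if len(shape) != expected_rank:
        raise ValueError(f"cannot reduce {shape} to rank {expected_rank}")
    return shape


# Shape reported in the error message above.
print(squeeze_leading_to_rank((1, 1, 3, 224, 224), 4))  # (1, 3, 224, 224)
```

On an actual PyTorch tensor, removing one singleton dimension (for example with tensor.squeeze(0)) before the model call would have the same effect on the shape.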

Typo in https://hub.getvectorai.com/

The code example for Dense Passage Retrieval doesn't work because an s is missing:
from vectorhub.bi_encoder.text_text.torch_transformers import DPR2Vec
should be changed to
from vectorhub.bi_encoders.text_text.torch_transformers import DPR2Vec

Other __2vecs

Congratulations and thank you for this amazing initiative!
How about adding cat2vec, for categorical, tabular data, and node2vec, for network graphs?
Best wishes,
Milcent

'hub' is not defined

Hello community,

I got this error:

>>> from vectorhub.encoders.image.tfhub import BitSmall2Vec
>>> model=BitSmall2Vec()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ivaadmin/anaconda3/envs/x2vec/lib/python3.9/site-packages/vectorhub/encoders/image/tfhub/bit.py", line 27, in __init__
    self.init(model_url)
  File "/home/ivaadmin/anaconda3/envs/x2vec/lib/python3.9/site-packages/vectorhub/encoders/image/tfhub/bit.py", line 34, in init
    self.model = hub.load(self.model_url)
NameError: name 'hub' is not defined

What is missing?
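
The traceback fails at hub.load(...) with a NameError, which means the name hub was never bound in bit.py. One common way this happens is a guarded optional import that is silently skipped when the dependency (here presumably tensorflow_hub, conventionally imported as hub) is not installed. Whether vectorhub does exactly this is an assumption. The sketch reproduces the effect with a deliberately nonexistent, hypothetical module name:

```python
# Guarded optional import: if the dependency is absent, the alias is never bound.
try:
    import hypothetical_missing_dependency as hub  # stands in for an optional package
except ImportError:
    pass  # the failure is swallowed, so the error only shows up later


def load_model(model_url):
    # Raises NameError (not ImportError) when the import above was skipped.
    return hub.load(model_url)


try:
    load_model("https://example.invalid/model")
except NameError as err:
    print(type(err).__name__, err)
```

If this is the cause, installing the optional dependency into the same environment that appears in the traceback (x2vec) should make the name available.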

SentenceTransformer2Vec example from docs raising a ModelError

Hello,
I've tried following the docs for SentenceTransformer2Vec.
I followed the steps listed on the page, installed the package along with the model, then when trying to import and instantiate it as the docs suggest, I'm seeing the following error:

from vectorhub.encoders.text.sentence_transformers import SentenceTransformer2Vec
model = SentenceTransformer2Vec('bert-base-uncased')
model.encode("I enjoy taking long walks along the beach with my dog.")

---------------------------------------------------------------------------
ModelError                                Traceback (most recent call last)
<ipython-input-360-23fbb894168d> in <module>
      1 from vectorhub.encoders.text.sentence_transformers import SentenceTransformer2Vec
----> 2 model = SentenceTransformer2Vec('bert-base-uncased')
      3 model.encode("I enjoy taking long walks along the beach with my dog.")

~/opt/anaconda3/envs/kaggle/lib/python3.7/site-packages/vectorhub/encoders/text/sentence_transformers/sentence_auto_transformers.py in __init__(self, model_name)
     45     def __init__(self, model_name: str):
     46         self.list_of_urls = LIST_OF_URLS
---> 47         self.validate_model_url(model_name, LIST_OF_URLS)
     48         self.vector_length = LIST_OF_URLS[model_name]["vector_length"]
     49         self.model = SentenceTransformer(model_name)

~/opt/anaconda3/envs/kaggle/lib/python3.7/site-packages/vectorhub/base.py in validate_model_url(cls, model_url, list_of_urls)
     71             return True
     72         raise ModelError(
---> 73             message="We currently not support this url. If issue persist then contact us.")
     74 
     75     @classmethod

ModelError: 

It seems like bert-base-uncased is not present in the LIST_OF_URLS.
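
The traceback shows validate_model_url(model_name, LIST_OF_URLS) followed by LIST_OF_URLS[model_name]["vector_length"], so the constructor appears to accept only names that are keys of that dictionary. A minimal sketch of that kind of check with a more informative error message; the dictionary contents and the helper name below are hypothetical placeholders, not vectorhub's actual list:

```python
import difflib

# Hypothetical stand-in for LIST_OF_URLS; the real keys and lengths may differ.
SUPPORTED_MODELS = {
    "example-model-a": {"vector_length": 768},
    "example-model-b": {"vector_length": 384},
}


def validate_model_name(model_name, supported=SUPPORTED_MODELS):
    """Return the model's metadata, or raise ValueError naming the alternatives."""
    if model_name in supported:
        return supported[model_name]
    close = difflib.get_close_matches(model_name, list(supported), n=3)
    raise ValueError(
        f"{model_name!r} is not supported. "
        f"Close matches: {close}. All supported names: {sorted(supported)}"
    )
```

Printing the keys of the real LIST_OF_URLS dictionary (defined in sentence_auto_transformers.py, per the traceback) would show which model names are currently accepted.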
