Comments (10)

lawliet19189 avatar lawliet19189 commented on July 22, 2024

Could you provide a bit more information on what the train_dataset you are using contains?

Also, can you share which variant of faiss you have installed, and its version number?

from dspy.

jyx-su avatar jyx-su commented on July 22, 2024

Sure, the train_dataset is a list of around 80,000 dsp.Example instances, and here's one example:

{'id': '57097051ed30961900e84136',
'title': 'Sky_(United_Kingdom)',
'context': "BSkyB's digital service was officially launched on 1 October 1998 under the name Sky Digital, although small-scale tests were carried out before then. At this time the use of the Sky Digital brand made an important distinction between the new service and Sky's analogue services. Key selling points were the improvement in picture and sound quality, increased number of channels and an interactive service branded Open.... now called Sky Active, BSkyB competed with the ONdigital (later ITV Digital) terrestrial offering and cable services. Within 30 days, over 100,000 digiboxes had been sold, which help bolstered BSkyB's decision to give away free digiboxes and minidishes from May 1999.",
'question': 'Within the 30 days how many digiboxes had been sold?',
'answer': ['100,000', 'over 100,000', '100,000']}

It's faiss-cpu 1.7.3 from PyPI.
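
For anyone else asked the same question, here is a minimal sketch of one way to report which faiss build is installed and its version, using only the standard library. The helper name `installed_faiss_builds` is hypothetical, and the candidate distribution names are assumptions about how faiss is commonly packaged; adjust them for your environment.

```python
from importlib import metadata


def installed_faiss_builds(candidates=("faiss-cpu", "faiss-gpu", "faiss")):
    """Return {distribution name: version} for each candidate that is installed.

    The candidate names are assumptions, not an exhaustive list of faiss builds.
    """
    found = {}
    for name in candidates:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # This distribution is not installed in the current environment.
            continue
    return found


print(installed_faiss_builds())
```

An empty dictionary means none of the candidate distributions were found in the active environment, which can also happen if faiss was installed under a different name.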

lawliet19189 avatar lawliet19189 commented on July 22, 2024

Are you using dsp installed via pip install dsp-ml, or directly from the GitHub source? If the former, could you try the latter?
pip install git+https://github.com/stanfordnlp/dsp

tsunrise avatar tsunrise commented on July 22, 2024

I encountered the same issue! I'm using dsp directly from the github source, and here is my stack trace:

----> 1 knn_engine = dsp.knn(squad_train[:20000])

File c:\Users\tom10\anaconda3\envs\nlu\lib\site-packages\dsp\primitives\demonstrate.py:173, in knn(train, cast, **knn_args)
    170 vectorizer: "BaseSentenceVectorizer" = dsp.settings.vectorizer
    171 all_vectors = vectorizer(train_casted_to_vectorize).astype(np.float32)
--> 173 index = create_faiss_index(
    174     emb_dim=all_vectors.shape[1], n_objects=len(train), **knn_args
    175 )
    176 index.train(all_vectors)
    177 index.add(all_vectors)

File c:\Users\tom10\anaconda3\envs\nlu\lib\site-packages\dsp\utils\ann_utils.py:116, in create_faiss_index(emb_dim, n_objects, n_probe, max_gpu_devices, encode_residuals, in_list_dist_type, centroid_dist_type)
    114     index = _get_brute_index(emb_dim=emb_dim, dist_type=in_list_dist_type)
    115 else:
--> 116     index = _get_ivf_index(
    117         emb_dim=emb_dim,
    118         n_objects=n_objects,
    119         in_list_dist_type=in_list_dist_type,
    120         centroid_dist_type=centroid_dist_type,
    121         encode_residuals=encode_residuals
    122     )
    124 index.nprobe = n_probe
    126 num_devices, is_gpu = determine_devices(max_gpu_devices)

File c:\Users\tom10\anaconda3\envs\nlu\lib\site-packages\dsp\utils\ann_utils.py:70, in _get_ivf_index(emb_dim, n_objects, in_list_dist_type, centroid_dist_type, encode_residuals)
     67 else:
     68     raise ValueError(f'Wrong distance type for FAISS index: {centroid_dist_type}')
---> 70 index = faiss.IndexIVFScalarQuantizer(
     71     quannizer,
     72     emb_dim,
     73     n_list,
     74     faiss.ScalarQuantizer.QT_fp16,  # TODO: should be optional?
     75     centroid_metric,
     76     encode_residuals=encode_residuals
     77 )
     78 return index

TypeError: replacement_init() got an unexpected keyword argument 'encode_residuals'
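
The message points at a function named `replacement_init` that rejected the keyword argument `encode_residuals`. That is the behaviour you would expect from a constructor wrapper that forwards positional arguments only. The sketch below reproduces this class of error with a hypothetical stand-in class (`FakeIndex` and `wrap_constructor` are illustrative names, not part of faiss or dsp) and shows that passing the same value positionally avoids it. It is an illustration of the failure mode under that assumption, not a description of the actual faiss internals or of the fix applied in dsp.

```python
class FakeIndex:
    """Hypothetical stand-in for a wrapped index class; not part of faiss."""

    def __init__(self, quantizer, emb_dim, n_list, encode_residuals=True):
        self.quantizer = quantizer
        self.emb_dim = emb_dim
        self.n_list = n_list
        self.encode_residuals = encode_residuals


def wrap_constructor(cls):
    """Replace cls.__init__ with a wrapper that forwards positional args only."""
    original_init = cls.__init__

    def replacement_init(self, *args):
        # No **kwargs here, so any keyword argument raises TypeError.
        original_init(self, *args)

    cls.__init__ = replacement_init
    return cls


wrap_constructor(FakeIndex)

# Keyword form: rejected by the wrapper before the real __init__ runs.
try:
    FakeIndex("quantizer", 768, 100, encode_residuals=False)
    keyword_error = None
except TypeError as exc:
    keyword_error = str(exc)

# Positional form: the same value passed by position is accepted.
index = FakeIndex("quantizer", 768, 100, False)

print(keyword_error)
print(index.encode_residuals)
```

If the real constructor behaves the same way, passing `encode_residuals` by position in the `faiss.IndexIVFScalarQuantizer(...)` call is one workaround to try. Check the argument order against the faiss version you have installed first.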

I'm also using the latest faiss-cpu from the pypi channel. Here is the format of the training set:

{'id': '5733be284776f41900661182',
  'title': 'University_of_Notre_Dame',
  'context': 'Architecturally, the school has a Catholic character. Atop the Main Building\'s gold dome is a golden statue of the Virgin Mary. Immediately in front of the Main Building and facing it, is a copper statue of Christ with arms upraised with the legend "Venite Ad Me Omnes". Next to the Main Building is the Basilica of the Sacred Heart. Immediately behind the basilica is the Grotto, a Marian place of prayer and reflection. It is a replica of the grotto at Lourdes, France where the Virgin Mary reputedly appeared to Saint Bernadette Soubirous in 1858. At the end of the main drive (and in a direct line that connects through 3 statues and the Gold Dome), is a simple, modern stone statue of Mary.',
  'question': 'To whom did the Virgin Mary allegedly appear in 1858 in Lourdes France?',
  'answer': ['Saint Bernadette Soubirous']}

okhat avatar okhat commented on July 22, 2024

@stalkermustang are you familiar with this error?

stalkermustang avatar stalkermustang commented on July 22, 2024

> @stalkermustang are you familiar with this error?

Nope, but I tested this with Faiss 1.7.1 / 1.7.2. I don't know what changed in 1.7.3. From the release notes at https://github.com/facebookresearch/faiss/releases, it looks like something changed recently regarding residuals.

lawliet19189 avatar lawliet19189 commented on July 22, 2024

@jyx-su @tsunrise could you try the branch from #44?
You can install it with:
pip install git+https://github.com/stanfordnlp/dsp.git@bug/IVF_index_creation_error

tsunrise avatar tsunrise commented on July 22, 2024

> @jyx-su @tsunrise could you try with #44 branch? you can install with pip install git+https://github.com/stanfordnlp/dsp.git@bug/IVF_index_creation_error

That fix works on my end. Thanks!

lawliet19189 avatar lawliet19189 commented on July 22, 2024

Good to hear. Thanks for reporting the bug @tsunrise @jyx-su

okhat avatar okhat commented on July 22, 2024

Thanks a lot @lawliet19189 @stalkermustang @tsunrise @jyx-su !
