Comments (6)
I think it's better if users do that on their own so that they are aware of what's happening.
Also, if you do not do that upfront, you will end up slowing down the optimisation process, as you will also search for queries for which you do not have qrels.
from retriv.
However, how can this filter be integrated during autotune?
Even during autotune, since the retrieval results depend on the queries and the training collection, it is unlikely that the qrels contain all possible keys from the run (although ideally they will intersect with the run).
You don't need the true relevance value for each query-doc tuple.
As the error says, the query ids do not match, meaning that the provided qrels have more, fewer, or different query ids than the queries for which the run was computed.
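To see which query ids are causing the mismatch, something like the following could help. It is only a sketch: `diff_query_ids` is a hypothetical helper, not part of retriv, and it assumes both qrels and run are plain dicts keyed by query id, each mapping doc ids to a score. Note that only the top-level query ids are compared; the doc ids under each query are allowed to differ.

```python
def diff_query_ids(qrels, run):
    """Report which query ids appear in only one of qrels / run.

    Assumes both are dicts shaped like {query_id: {doc_id: score}}.
    Only the top-level query ids are compared, not the doc ids.
    """
    qrels_ids = set(qrels)
    run_ids = set(run)
    return {
        "only_in_qrels": sorted(qrels_ids - run_ids),
        "only_in_run": sorted(run_ids - qrels_ids),
        "shared": sorted(qrels_ids & run_ids),
    }


# Toy data, made up for illustration.
qrels = {"q_1": {"doc_1": 1}, "q_2": {"doc_7": 1}}
run = {"q_2": {"doc_7": 0.9, "doc_3": 0.4}, "q_3": {"doc_5": 0.8}}

report = diff_query_ids(qrels, run)
print(report)
```

If either `only_in_qrels` or `only_in_run` is non-empty, the two sets of query ids differ, which is the situation the error describes.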
I see. Since it is costly to obtain relevance feedback, I only have it for a general case. So when I try to autotune the retriever for different slices of data, there is no way to guarantee that the keys in qrels and run are identical, only that they intersect.
Anyway, thank you.
Well, can't you extract the intersection before tuning?
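Extracting the intersection could look roughly like this. It is a minimal sketch, assuming the queries are a list of dicts with `"id"` and `"text"` keys and the qrels are a dict keyed by query id; `restrict_to_common_queries` is a hypothetical helper name, not something retriv provides. The autotune call itself is left out, since its exact arguments are not shown in this thread.

```python
def restrict_to_common_queries(queries, qrels):
    """Keep only the queries that have qrels, and only the qrels
    whose query id is among the given queries.

    Assumed shapes (not confirmed by this thread):
      queries: [{"id": "q_1", "text": "..."}, ...]
      qrels:   {"q_1": {"doc_1": 1, ...}, ...}
    """
    query_ids = {q["id"] for q in queries}
    common_ids = query_ids.intersection(qrels)  # iterating a dict yields its keys
    filtered_queries = [q for q in queries if q["id"] in common_ids]
    filtered_qrels = {q_id: qrels[q_id] for q_id in common_ids}
    return filtered_queries, filtered_qrels


# Toy data for one slice: q_2 has no judgments, q_9 is not in this slice.
queries = [
    {"id": "q_1", "text": "first query"},
    {"id": "q_2", "text": "second query"},
    {"id": "q_3", "text": "third query"},
]
qrels = {"q_1": {"doc_1": 1}, "q_3": {"doc_4": 1}, "q_9": {"doc_8": 1}}

slice_queries, slice_qrels = restrict_to_common_queries(queries, qrels)
print([q["id"] for q in slice_queries])
print(sorted(slice_qrels))
```

Passing the filtered pair to the tuning step means every query that gets searched has judgments, which also avoids spending optimisation time on queries without qrels.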
Closing for inactivity.