Terrier-PRF

Terrier-PRF provides additional pseudo-relevance feedback (query expansion) models for the Terrier platform. In particular, it contains three models:

  • RM1 relevance model [1]
  • RM3 relevance model [2]
  • Axiomatic Query Expansion [3,4]
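To make the difference between the first two models concrete, here is a minimal Python sketch of the idea behind RM1 and RM3 as described in [1] and [2]. It is an illustration under simplifying assumptions, not the code in this package: the function names, the toy inputs, and the interpolation weight `lam=0.5` are hypothetical, document language models are unsmoothed maximum-likelihood estimates, and each document score stands in for its query likelihood.

```python
from collections import Counter


def rm1(feedback_docs, num_terms=10):
    """Estimate an RM1-style term distribution from feedback documents.

    feedback_docs: list of (tokens, score) pairs; scores are assumed to be
    non-negative and to stand in for the document's query likelihood.
    """
    total_score = sum(score for _, score in feedback_docs)
    if total_score <= 0:
        return {}
    weights = Counter()
    for tokens, score in feedback_docs:
        if not tokens:
            continue
        counts = Counter(tokens)
        for term, tf in counts.items():
            # P(term | doc) weighted by the normalised document score
            weights[term] += (tf / len(tokens)) * (score / total_score)
    top = weights.most_common(num_terms)
    norm = sum(w for _, w in top)
    return {term: w / norm for term, w in top}


def rm3(query_tokens, feedback_docs, num_terms=10, lam=0.5):
    """Interpolate the original query model with the RM1 feedback model."""
    q_counts = Counter(query_tokens)
    q_model = {t: c / len(query_tokens) for t, c in q_counts.items()}
    fb_model = rm1(feedback_docs, num_terms)
    terms = set(q_model) | set(fb_model)
    return {
        t: lam * q_model.get(t, 0.0) + (1 - lam) * fb_model.get(t, 0.0)
        for t in terms
    }
```

In this sketch RM1 replaces the query with the feedback distribution, while RM3 keeps the original query terms in play by mixing the two models; `lam` controls how much weight the original query retains.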

Installation

After cloning this GitHub repository, run

mvn install

to install the package into your local Maven repository.

Usage

From your Terrier directory, you should:

  1. Edit the terrier.properties file so that the querying.processes list includes the query expansion models:
querying.processes=terrierql:TerrierQLParser,parsecontrols:TerrierQLToControls,parseql:TerrierQLToMatchingQueryTerms,matchopql:MatchingOpQLParser,applypipeline:ApplyTermPipeline,localmatching:LocalManager$ApplyLocalMatching,rm1:RM1,rm3:RM3,ax:AxiomaticQE,qe:QueryExpansion,labels:org.terrier.learning.LabelDecorator,filters:LocalManager$PostFilterProcess

  2. Invoke the batchretrieve (br) or interactive command, specifying the relevant control and the terrier-prf package:
bin/terrier br -w BM25 -c rm1:on -o ./bm25.rm1.res -P org.terrier:terrier-prf:0.1-SNAPSHOT

bin/terrier br -w BM25 -c rm3:on -o ./bm25.rm3.res -P org.terrier:terrier-prf:0.1-SNAPSHOT

bin/terrier br -w BM25 -c ax:on -o ./bm25.axqe.res -P org.terrier:terrier-prf:0.1-SNAPSHOT

The version suffix (0.1-SNAPSHOT) is optional for Terrier versions after 5.2.
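The two steps are linked by the entries of querying.processes, each of which has the form control:Class. The Python sketch below illustrates one reading of that link: it parses a shortened copy of the property value and prints a dry-run command line for each feedback control. The helper names are hypothetical and nothing here is part of Terrier or terrier-prf; the assumption that the key before the colon is the control name passed to -c should be checked against the Terrier documentation.

```python
def parse_processes(value):
    """Split a querying.processes value into an ordered {control: class} map."""
    mapping = {}
    for entry in value.split(","):
        # Split on the first colon only, so class names stay intact
        control, _, cls = entry.strip().partition(":")
        mapping[control] = cls
    return mapping


def build_command(control, processes, weighting="BM25"):
    """Return a bin/terrier br command line (as a list) enabling one control."""
    if control not in processes:
        raise KeyError(f"control {control!r} is not listed in querying.processes")
    return [
        "bin/terrier", "br",
        "-w", weighting,
        "-c", f"{control}:on",
        "-o", f"./{weighting.lower()}.{control}.res",
        "-P", "org.terrier:terrier-prf",
    ]


# Shortened copy of the property value from step 1 (feedback entries only).
PROCESSES = parse_processes("rm1:RM1,rm3:RM3,ax:AxiomaticQE,qe:QueryExpansion")

if __name__ == "__main__":
    # Dry run: print the commands instead of executing them
    for control in ("rm1", "rm3", "ax"):
        print(" ".join(build_command(control, PROCESSES)))
```

Rejecting controls that are missing from the map is a deliberate choice in this sketch: under the stated assumption, a control with no matching entry would switch on nothing.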

Credits

  • Craig Macdonald, University of Glasgow
  • Nicola Tonellotto, University of Pisa

Thanks to Jeff Dalton and Jimmy Lin for useful discussions.

References

[1] Victor Lavrenko and W. Bruce Croft. Relevance based language models. In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’01). https://dl.acm.org/doi/10.1145/383952.383972

[2] Nasreen Abdul-Jaleel, James Allan, W. Bruce Croft, Fernando Diaz, Leah Larkey, Xiaoyan Li, Mark D. Smucker, Courtney Wade. UMass at TREC 2004: Novelty and HARD. In Proceedings of TREC 2004. https://trec.nist.gov/pubs/trec13/papers/umass.novelty.hard.pdf

[3] Hui Fang and ChengXiang Zhai. Semantic term matching in axiomatic approaches to information retrieval. In Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’06), pp. 115–122. ACM, New York, 2006.

[4] Peilin Yang and Jimmy Lin. Reproducing and Generalizing Semantic Term Matching in Axiomatic Information Retrieval. In Proceedings of ECIR 2019.

terrier-prf's Issues

Set the number of documents and expansion terms

Hi,

Thanks for supporting additional pseudo-relevance feedback in Terrier!

I have a question about setting the number of feedback documents and expansion terms. I tried setting "qe_fb_docs" and "qe_fb_terms" (see below), but they have no effect: Terrier still analyzes 3 documents (the default).

bin/terrier batchretrieve -w BM25 -c rm3:on qe_fb_terms=5 qe_fb_docs=10 -P org.terrier:terrier-prf -t ${query_file}

What's the right way of setting these two parameters? Thank you!
