CBAR is a Python package for content-based audio retrieval with text queries.
It contains two retrieval methods. The Passive-Aggressive Model for Image Retrieval (PAMIR) was originally developed for image retrieval [1] but has been shown to work equally well for audio retrieval [2].
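To illustrate the passive-aggressive idea behind PAMIR, here is a minimal NumPy sketch of a single update on a (query, relevant item, irrelevant item) triplet, assuming a plain linear scoring function `q^T W p`. It is not CBAR's API: the function name `pa_rank_update`, the parameter `C`, and the toy vectors are hypothetical and chosen only for this example.

```python
import numpy as np


def pa_rank_update(W, q, p_pos, p_neg, C=1.0):
    """One passive-aggressive step on a (query, relevant, irrelevant) triplet.

    Hypothetical helper for illustration; assumes the linear score q^T W p.
    """
    diff = p_pos - p_neg
    # Hinge loss: the relevant item should outscore the irrelevant one by 1.
    loss = max(0.0, 1.0 - q.dot(W).dot(diff))
    if loss == 0.0:
        return W, loss  # passive: the ranking constraint already holds
    # The update direction is the outer product q diff^T, whose squared
    # Frobenius norm factorizes into ||q||^2 * ||diff||^2.
    norm_sq = q.dot(q) * diff.dot(diff)
    if norm_sq == 0.0:
        return W, loss  # nothing to update along
    # Aggressive: step just far enough to fix the violation, capped by C.
    tau = min(C, loss / norm_sq)
    return W + tau * np.outer(q, diff), loss


# Toy data (made up): 3 query terms, 3 audio features.
W = np.zeros((3, 3))
q = np.array([0.0, 1.0, 0.0])        # query containing only the second term
p_pos = np.array([0.9, 0.1, 0.4])    # features of a relevant item
p_neg = np.array([0.2, 0.7, 0.4])    # features of an irrelevant item

W_new, loss_before = pa_rank_update(W, q, p_pos, p_neg, C=10.0)
margin_after = q.dot(W_new).dot(p_pos - p_neg)
```

With a large enough `C`, a single step moves the margin between the two items to the target value of 1, which is the "aggressive" half of the name. When the constraint already holds, the model is left untouched.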
The second approach combines the Low-Rank Retraction Algorithm (LORETA) [3] with the Weighted Approximate-Rank Pairwise loss (WARP loss) [4] to efficiently infer the model parameters. A similar algorithm, restricted to finding similar items of the same kind (similarity search), has been shown to work well on image and audio datasets [5].
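The sampling step at the heart of the WARP loss can be sketched as follows. This is a simplified illustration for one relevant item and a list of irrelevant ones, not CBAR's implementation: the function name `warp_sample`, its arguments, and the toy scores are hypothetical, and the weighting uses the harmonic sum, one of the choices discussed in [4].

```python
import numpy as np


def warp_sample(scores, pos_idx, neg_idx, rng, margin=1.0):
    """Draw negatives until one violates the margin.

    Returns (violating negative, loss weight), or None if no violator was
    found within len(neg_idx) draws. Hypothetical helper for illustration.
    """
    n_neg = len(neg_idx)
    for trials in range(1, n_neg + 1):
        neg = neg_idx[rng.randint(n_neg)]  # sample with replacement
        if scores[neg] + margin > scores[pos_idx]:
            # Few draws needed means many violators, so the relevant item
            # probably ranks low: approximate its rank by n_neg // trials.
            rank_estimate = n_neg // trials
            # Harmonic weighting puts more emphasis on the top of the ranking.
            weight = sum(1.0 / k for k in range(1, rank_estimate + 1))
            return neg, weight
    return None


rng = np.random.RandomState(0)

# Toy scores (made up): item 0 is relevant, items 1 to 3 are not.
scores = np.array([0.2, 0.9, 0.8, 0.1])
violator, weight = warp_sample(scores, 0, [1, 2, 3], rng)

# A relevant item that already beats every negative by the margin.
well_ranked = np.array([5.0, 0.9, 0.8, 0.1])
no_violation = warp_sample(well_ranked, 0, [1, 2, 3], rng)
```

The returned weight would scale the gradient step on the sampled triplet. In the second call, no negative comes within the margin of the relevant item, so the function returns `None` and no update would be made.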
Jump straight to the CAL500 quickstart guide (notebooks/quickstart) if you are impatient.
The latest release of CBAR can be installed from PyPI using pip:
pip install cbar
CBAR is tested on Python 2.7 and depends on NumPy, SciPy, Pandas, NLTK, and scikit-learn. See setup.py for version information.
https://dschwertfeger.github.io/cbar
https://github.com/dschwertfeger/cbar
[1] Grangier, D. and Bengio, S., 2008. A discriminative kernel-based approach to rank images from text queries. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(8), pp. 1371-1384.
[2] Chechik, G., Ie, E., Rehn, M., Bengio, S. and Lyon, D., 2008, October. Large-scale content-based audio retrieval from text queries. In Proceedings of the 1st ACM International Conference on Multimedia Information Retrieval (pp. 105-112). ACM.
[3] Shalit, U., Weinshall, D. and Chechik, G., 2012. Online learning in the embedded manifold of low-rank matrices. Journal of Machine Learning Research, 13(Feb), pp. 429-458.
[4] Weston, J., Bengio, S. and Usunier, N., 2010. Large scale image annotation: learning to rank with joint word-image embeddings. Machine Learning, 81(1), pp. 21-35.
[5] Lim, D. and Lanckriet, G., 2014. Efficient Learning of Mahalanobis Metrics for Ranking. In Proceedings of the 31st International Conference on Machine Learning (pp. 1980-1988).