Comments (4)
I am on the same page as you, Miguel: readability over speed, and not (re)implementing our own distance metrics. That's an interesting and somewhat surprising finding.
from computervision-recipes.
Very nice results at the end: https://github.com/Microsoft/ComputerVision/blob/e011b08cca5eb3c35483cc1b3df8863eb51a5efe/image_similarity/notebooks/image_similarity_introduction.ipynb
There is an interesting mix of several things. First, resnet50 vs resnet18: the small one didn't converge. I think the key to the results was using a small feature size (512) instead of the 2048 I had initially. I'm not sure whether batch normalization also helped (it could have, I haven't tested). The small feature size probably also helps with the L2 distance. I can imagine that KL could help if we use a larger feature size. Fine-tuning instead of freezing also improved the last computation.
Alexandra (what is her GitHub username?) and I are planning to improve this, then she will take over.
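For reference, the L2 and KL distances mentioned in this comment could be compared on two feature vectors roughly as follows. This is only a sketch and not the notebook's code: the function names are hypothetical, the random vectors stand in for real DNN features, and turning features into probability distributions with a softmax before computing KL is an assumption made purely for illustration.

```python
import numpy as np


def l2_distance(f1, f2):
    """Euclidean (L2) distance between two 1-D feature vectors."""
    return float(np.linalg.norm(f1 - f2))


def log_softmax(x):
    """Numerically stable log-softmax of a 1-D vector."""
    shifted = x - np.max(x)
    return shifted - np.log(np.sum(np.exp(shifted)))


def kl_divergence(f1, f2):
    """KL(p || q), where p and q are the softmax of each feature vector.

    Assumption: features are mapped to distributions via softmax;
    the thread does not say how KL would actually be applied.
    """
    log_p = log_softmax(f1)
    log_q = log_softmax(f2)
    return float(np.sum(np.exp(log_p) * (log_p - log_q)))


rng = np.random.default_rng(0)
# Placeholder features of size 512, the smaller size discussed above
feat_a = rng.random(512)
feat_b = rng.random(512)

print("L2:", l2_distance(feat_a, feat_b))
print("KL:", kl_divergence(feat_a, feat_b))
```

Note that KL is not symmetric, so kl_divergence(a, b) and kl_divergence(b, a) generally differ, which is one thing to keep in mind if it is ever used as a ranking distance.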
from computervision-recipes.
Very cool, indeed! Nice work, Miguel! My username is ateste.
from computervision-recipes.
Question here @PatrickBue @loomlike @jainr @maxkazmsft
Patrick and I discussed refactoring the distance computation to use sklearn's pairwise_distances.
Recently I've been doing a lot of profiling for the reco project, so I did it here as well. It turns out that sklearn is much slower (I haven't tried all the functions, though):
from sklearn.metrics import pairwise_distances

def compute_vector_distance2(vec1, vec2, method="l2"):
    # sklearn expects 2-D arrays of shape (n_samples, n_features)
    dist = pairwise_distances(vec1.reshape(1, -1), vec2.reshape(1, -1), method)
    return dist[0][0]

print(feat1.shape)  # (2048,)
%timeit compute_vector_distance(feat1, feat2, "l2")
# 7.33 µs ± 43.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit compute_vector_distance2(feat1, feat2, "l2")
# 109 µs ± 692 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
This happens because compute_vector_distance uses np.linalg.norm instead of the sklearn equivalent.
It's up to you guys. I'm a fan of prioritizing readability over speed in Python: if you think the original code is not readable, I can refactor to sklearn; if you think it is readable, the original code is faster.
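To make the comparison concrete, here is a minimal sketch of the two approaches side by side. The original compute_vector_distance is not shown in this thread, so norm_distance below is a hypothetical stand-in that only assumes it calls np.linalg.norm on the difference of the two vectors; the random vectors are placeholders for real 2048-dimensional features.

```python
import numpy as np
from sklearn.metrics import pairwise_distances


def norm_distance(vec1, vec2):
    """L2 distance via np.linalg.norm (hypothetical stand-in for the original)."""
    return float(np.linalg.norm(vec1 - vec2))


def sklearn_distance(vec1, vec2, method="l2"):
    """Same distance via sklearn, which needs 2-D (n_samples, n_features) input."""
    dist = pairwise_distances(
        vec1.reshape(1, -1), vec2.reshape(1, -1), metric=method
    )
    return float(dist[0, 0])


rng = np.random.default_rng(0)
# Placeholder features with the shape reported above, (2048,)
feat1 = rng.random(2048)
feat2 = rng.random(2048)

d_norm = norm_distance(feat1, feat2)
d_sk = sklearn_distance(feat1, feat2)

# Both routes should agree up to floating-point error
print(d_norm, d_sk, np.isclose(d_norm, d_sk))
```

Both functions should return the same value up to rounding, so the choice is about speed and readability only. The gap for a single pair most likely comes from sklearn's fixed per-call overhead; when many vectors are compared at once, passing whole matrices to pairwise_distances amortizes that cost.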
from computervision-recipes.