mnseong / google-landmark-recognition-2021

This project forked from nickkaparinos/google-landmark-recognition-2021

Google Landmark Recognition 2021 Kaggle image classification competition. Solution using transfer learning and ArcFace loss function.

Python 100.00%

Google-Landmark-Recognition-2021

Have you ever gone through your vacation photos and asked yourself: What is the name of this temple I visited in China? Who created this monument I saw in France? Landmark recognition can help! This technology can predict landmark labels directly from image pixels, to help people better understand and organize their photo collections. This competition challenges Kagglers to build models that recognize the correct landmark (if any) in a dataset of challenging test images.

Dataset

The dataset consists of 1,580,470 images of 81,313 unique landmarks.

Neural Network Architecture

[Figure: neural network architecture diagram]

Additive Angular Margin Loss (ArcFace)

Overview

Additive Angular Margin Loss (ArcFace) is a state-of-the-art loss function used for image classification and face recognition. It has a clear geometric interpretation because its margin corresponds exactly to geodesic distance on the hypersphere. Paper: https://arxiv.org/abs/1801.07698

ArcFace inference process

During inference, the feature vectors of two images are L2-normalised and their similarity is computed to decide whether both images belong to the same class. The similarity measure is cosine similarity, which is also widely used by search engines and which, for normalised vectors, is simply their inner product.
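The comparison described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the repository's code: the function names and the `threshold` value are assumptions made for the example, and a real threshold would have to be tuned on validation data.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale a vector (or each row of a matrix) to unit Euclidean length."""
    x = np.asarray(x, dtype=np.float64)
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)

def cosine_similarity(a, b):
    """Cosine similarity of two 1-D feature vectors.

    After L2 normalisation this is just the inner product.
    """
    return float(np.dot(l2_normalize(a), l2_normalize(b)))

def same_class(a, b, threshold=0.5):
    # `threshold` is a hypothetical value chosen for illustration only.
    return cosine_similarity(a, b) >= threshold
```

Vectors pointing in the same direction score 1 regardless of their length, and orthogonal vectors score 0, which is why only the angle between the features matters at inference time.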

ArcFace versus Cross Entropy Loss

In a standard classification network, a SoftMax layer followed by Categorical Cross-Entropy loss is usually used at the end of the network. SoftMax transforms the network's raw outputs (logits) into probabilities: for each input, it assigns every class a probability, and these probabilities sum to 1. Once training is complete, the class with the highest probability is chosen as the prediction. The Categorical Cross-Entropy loss measures the difference between the predicted probability distribution and the true label distribution, and it is minimised through back-propagation during training.
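As a rough illustration of these two steps, the NumPy sketch below computes SoftMax probabilities and the cross-entropy loss for a tiny batch. The logits and labels are made-up values, and the helper names are assumptions for this example rather than anything taken from the repository.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 along the last axis."""
    logits = np.asarray(logits, dtype=np.float64)
    # Subtracting the row maximum avoids overflow and leaves the result unchanged.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def cross_entropy(probs, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class.

    probs has shape (batch, classes); labels holds integer class indices.
    """
    probs = np.asarray(probs, dtype=np.float64)
    labels = np.asarray(labels)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + eps)))

# Made-up logits for 2 images and 3 classes.
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])
probs = softmax(logits)
predicted = probs.argmax(axis=1)          # class with the highest probability
loss = cross_entropy(probs, [0, 1])       # true classes are 0 and 1
```

The loss shrinks as the probability assigned to the true class approaches 1, which is the quantity back-propagation drives down during training.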

The drawback of SoftMax is that it does not enforce a safety margin between classes, so the decision boundaries are blurry. We want the feature vectors of two images of the same class (the same person in face recognition, the same landmark here) to be as similar as possible, and the vectors of images from different classes to be as different as possible. That means we want to produce a margin between classes, as an SVM does.
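One way to picture how ArcFace introduces such a margin is the NumPy sketch below, which follows the formulation in the ArcFace paper: embeddings and class weights are normalised, the angle to the true class is increased by a margin, and the result is rescaled before the usual SoftMax and cross-entropy. The function name and the default `scale` and `margin` values are assumptions for illustration and are not taken from this repository.

```python
import numpy as np

def arcface_logits(embeddings, weights, labels, scale=64.0, margin=0.5):
    """Logits with an additive angular margin on the true class.

    embeddings: (batch, dim) feature vectors
    weights:    (classes, dim) one weight vector per class
    labels:     (batch,) integer class indices
    scale and margin are illustrative defaults, not the repository's settings.
    """
    emb = np.asarray(embeddings, dtype=np.float64)
    w = np.asarray(weights, dtype=np.float64)
    labels = np.asarray(labels)

    # Normalise both sides so the inner product is the cosine of the angle.
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    w = w / np.linalg.norm(w, axis=1, keepdims=True)
    cos = np.clip(emb @ w.T, -1.0, 1.0)        # shape (batch, classes)

    # Add the margin to the angle of the true class only.
    # Simplification: this sketch does not handle theta + margin > pi.
    theta = np.arccos(cos)
    theta[np.arange(len(labels)), labels] += margin

    return scale * np.cos(theta)
```

Because the true-class angle is artificially enlarged, the true-class logit is lower than it would be with plain SoftMax, so the network has to pull same-class features closer together to reach the same loss. These logits would then be fed to the ordinary SoftMax and cross-entropy.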

Results

Using Stochastic Gradient Descent with only 3 epochs of training, a validation accuracy of 25.5% and a micro F1 score of 0.25 were achieved. Due to computing limitations, no further optimisation was possible. Future work could include further optimisation of the network architecture, larger image dimensions and more training epochs.
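The two reported metrics are nearly identical, which is what one would expect: assuming every validation image has exactly one true label and exactly one predicted label, micro-averaged F1 coincides with accuracy. The sketch below checks this on made-up labels; the data and function names are hypothetical and do not reproduce the repository's evaluation.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP and FN over all classes, then compute F1."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = fp = fn = 0
    for c in np.union1d(y_true, y_pred):
        tp += int(np.sum((y_pred == c) & (y_true == c)))
        fp += int(np.sum((y_pred == c) & (y_true != c)))
        fn += int(np.sum((y_pred != c) & (y_true == c)))
    return 2 * tp / (2 * tp + fp + fn)

# Made-up labels for 8 images and 4 classes.
y_true = np.array([0, 1, 2, 2, 1, 0, 3, 3])
y_pred = np.array([0, 2, 2, 1, 1, 3, 3, 0])
```

Every wrong prediction counts once as a false positive (for the predicted class) and once as a false negative (for the true class), so the pooled precision and recall both equal the accuracy.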

google-landmark-recognition-2021's People

Contributors

nickkaparinos
