Few-Shot Learning
The process of learning good features for machine learning applications can be very computationally expensive and may prove difficult in cases where little data is available. A prototypical example of this is the one-shot learning setting, in which we must correctly make predictions given only a single example of each new class.
Here, I explore the power of one-shot learning with a popular model: the Siamese Neural Network.
Table of Contents
- Setup
- Architecture
- Credits
- Contribution
- License (MIT)
Setup
Clone or Download
Fire up your favorite command line utility (e.g. Terminal, iTerm or Command Prompt), and type the following commands to clone the project.
$ git clone https://github.com/victor-iyiola/few-shot-learning.git
$ cd few-shot-learning && ls
LICENSE README.md datasets images omniglot one-shot.ipynb utils.py
Or simply download this repository, and change your working directory to the downloaded project.
$ cd path/to/few-shot-learning
$ ls
LICENSE README.md datasets images omniglot one-shot.ipynb utils.py
Note: Windows users should use the `dir` command instead of `ls`.
Install Third-Party Dependency Requirements
This project was developed with Python v3.6.5; any later version of Python should also work.
- Jupyter >= v4.4.0
- NumPy >= v1.14.3
- scikit-learn >= v0.19.1
- Keras >= v2.2.0
- TensorFlow >= v1.9.0
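For reference, a `requirements.txt` pinning these minimum versions might look like the following (the exact file shipped with the repository may differ):

```text
jupyter>=4.4.0
numpy>=1.14.3
scikit-learn>=0.19.1
keras>=2.2.0
tensorflow>=1.9.0
```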
$ pip3 install --upgrade -r requirements.txt
$ jupyter notebook
[I 08:36:52.271 LabApp] The Jupyter Notebook is running at:
[I 08:36:52.271 LabApp] http://localhost:8888/?token=cb246f438ca40a1a319d12c877d2e825c923fc0525c9d136
[I 08:36:52.271 LabApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 08:36:52.273 LabApp]
Copy/paste this URL into your browser when you connect for the first time,
to login with a token:
http://localhost:8888/?token=cb246f438ca40a1a319d12c877d2e825c923fc0525c9d136
Architecture
Model
A standard Siamese convolutional neural network with $L$ layers, each with $N_l$ units, where $h_{1,l}$ denotes the hidden vector in layer $l$ for the first twin and $h_{2,l}$ the same for the second twin.

The model consists of a sequence of convolutional layers, each of which uses a single channel with filters of varying size and a fixed stride of 1. The number of convolutional filters is specified as a multiple of 16 to optimize performance. The network applies a ReLU activation function to the output feature maps, optionally followed by max-pooling with a filter size and stride of 2. Thus the $k$-th filter map in each layer takes the following form:

$$a^{(k)}_{1,m} = \text{max-pool}\big(\max(0,\; W^{(k)}_{l-1,l} \star h_{1,(l-1)} + b_l),\; 2\big)$$

$$a^{(k)}_{2,m} = \text{max-pool}\big(\max(0,\; W^{(k)}_{l-1,l} \star h_{2,(l-1)} + b_l),\; 2\big)$$

where $\star$ is the valid convolution operation, $W^{(k)}_{l-1,l}$ is the $k$-th filter between layers $l-1$ and $l$, and $b_l$ is the shared bias vector for layer $l$. The units of the final convolutional layer are flattened and followed by a fully-connected layer; the prediction is then computed from the induced distance between the twins' feature vectors, passed through a sigmoid:

$$p = \sigma\Big(\sum_j \alpha_j \,\big|h^{(j)}_{1,L-1} - h^{(j)}_{2,L-1}\big|\Big)$$

where the $\alpha_j$ are learned parameters weighting the importance of each component-wise distance.
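As an illustration, one convolutional twin of the network can be sketched with `tf.keras` as follows. The filter sizes and counts below follow the classic Omniglot Siamese architecture and may differ from those used in the notebook:

```python
from tensorflow.keras import Input, Model, layers

def build_twin(input_shape=(105, 105, 1)):
    """One twin of the Siamese network: stacked single-channel convolutions
    (filter counts in multiples of 16), ReLU activations, 2x2 max-pooling,
    and a final dense embedding layer."""
    inp = Input(shape=input_shape)
    x = layers.Conv2D(64, (10, 10), activation="relu")(inp)
    x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Conv2D(128, (7, 7), activation="relu")(x)
    x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Conv2D(128, (4, 4), activation="relu")(x)
    x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Conv2D(256, (4, 4), activation="relu")(x)
    x = layers.Flatten()(x)
    x = layers.Dense(4096, activation="sigmoid")(x)  # feature embedding
    return Model(inp, x)
```

Both input images are passed through the same twin (shared weights), and the component-wise L1 distance between the two embeddings feeds the final sigmoid unit.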
Learning
Loss function. Let $M$ be the minibatch size, where $i$ indexes the $i$-th minibatch, and let $\mathbf{y}(x_1^{(i)}, x_2^{(i)})$ be a length-$M$ vector of labels for the minibatch, with $y(x_1^{(i)}, x_2^{(i)}) = 1$ whenever $x_1$ and $x_2$ are from the same class and $0$ otherwise. The binary classifier is trained with a regularized cross-entropy objective of the form:

$$\mathcal{L}(x_1^{(i)}, x_2^{(i)}) = \mathbf{y}(x_1^{(i)}, x_2^{(i)}) \log \mathbf{p}(x_1^{(i)}, x_2^{(i)}) + \big(1 - \mathbf{y}(x_1^{(i)}, x_2^{(i)})\big) \log\big(1 - \mathbf{p}(x_1^{(i)}, x_2^{(i)})\big) + \boldsymbol{\lambda}^T |\mathbf{w}|^2$$
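A minimal NumPy sketch of this objective is shown below. The sign convention here is the standard negative cross-entropy to be minimized, and `lam`, a scalar stand-in for the per-weight regularization vector, is an illustrative choice, as is the helper's name:

```python
import numpy as np

def regularized_cross_entropy(p, y, weights, lam=1e-4):
    """Cross-entropy between predictions p and labels y (both in [0, 1]),
    plus an L2 penalty on the model weights."""
    ce = -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    l2 = lam * sum(np.sum(w ** 2) for w in weights)
    return ce.mean() + l2
```

During training, this loss is minimized with respect to both the convolutional weights and the $\alpha_j$ weighting parameters of the distance layer.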
Credits
Contribution
You are very welcome to modify this project's code and use it in your own work.
Please keep a link to the original repository. If you have made a fork with substantial modifications that you feel may be useful, then please open a new issue on GitHub with a link and short description.
License (MIT)
This project is licensed under the MIT License, which allows very broad use for both academic and commercial purposes.
A few of the images used for demonstration purposes may be under copyright; they are included under fair use.