
arcface

A TensorFlow implementation of ArcFace for face recognition.

Demo

Video Demo: TensorFlow Face Recognition Demo (Bilibili)

Features

  • Built with the TensorFlow 2.4 Keras API
  • Advanced model architecture: HRNet v2
  • Built-in softmax pre-training
  • Fully customizable training loop from scratch
  • Automatically restores from the previous checkpoint, even if the last epoch was not completed.
  • Handy utilities to convert and shuffle the training datasets.

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

For training

TensorFlow

Additional packages for evaluation

NumPy, Matplotlib

Additional packages for inference with video

OpenCV

Installing

Get the source code for training

# From your favorite development directory
git clone --recursive https://github.com/yinguobing/arcface

Download the training data

You can use any dataset as long as it can be converted to TensorFlow Record files. If you do not have a dataset, please download one from the ArcFace official dataset list.

Generate training dataset

This project provides a demo file showing how to convert the downloaded MXNet dataset into TFRecord files. You can run it like this:

# From the project root directory
python3 -m utils.mx_2_tf
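The conversion script's internals are project-specific, but the core step, packing an encoded face image and its identity label into a TFRecord file, can be sketched as below. The feature keys used here are illustrative assumptions; check utils/mx_2_tf.py for the exact schema this project expects.

```python
import tensorflow as tf


def make_example(image_bytes: bytes, label: int) -> tf.train.Example:
    """Pack one encoded face image and its identity label into an Example.

    NOTE: the feature keys "image/encoded" and "image/label" are
    illustrative, not necessarily the ones utils.mx_2_tf uses.
    """
    feature = {
        "image/encoded": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[image_bytes])),
        "image/label": tf.train.Feature(
            int64_list=tf.train.Int64List(value=[label])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))


def write_records(samples, path):
    """Write (encoded_image_bytes, int_label) pairs to a TFRecord file."""
    with tf.io.TFRecordWriter(path) as writer:
        for image_bytes, label in samples:
            writer.write(make_example(image_bytes, label).SerializeToString())
```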

Fully shuffle the dataset

Most face recognition datasets contain millions of training samples. It is better to fully shuffle the data in the record file.

cd utils
python3 shard_n_shuffle.py

Do not forget to set the correct dataset file path.
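shard_n_shuffle.py is the project's tool for this; the general shard-and-shuffle idea can be sketched in plain Python over a generic sequence of serialized records (the shard count and seed here are arbitrary assumptions):

```python
import random


def shard_and_shuffle(records, num_shards=8, seed=42):
    """Fully shuffle a sequence of serialized records.

    Distribute records round-robin into shards, shuffle each shard,
    then shuffle the shard order and concatenate. For datasets too
    large for memory, each shard would be a separate file on disk.
    """
    rng = random.Random(seed)
    shards = [[] for _ in range(num_shards)]
    for i, rec in enumerate(records):
        shards[i % num_shards].append(rec)
    for shard in shards:
        rng.shuffle(shard)
    rng.shuffle(shards)
    return [rec for shard in shards for rec in shard]
```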

Training

Deep neural network training can be complicated, as you have to make sure everything is ready: datasets, checkpoints, logs, etc. But do not worry; follow these steps and you should be fine.

Setup the model

In the module train.py, set your model's name.

# What is the model's name?
name = "hrnetv2"

Setup the training dataset

These files do not change frequently so set them in the source code.

# Where is the training file?
train_files = "/path/to/train.record"

# How many identities do you have in the training dataset?
num_ids = 85742

# How many examples do you have in the training dataset?
num_examples = 5822653

Setup the model parameters

These values vary with your dataset, so double-check them before training.

# What is the shape of the input image?
input_shape = (112, 112, 3)

# What is the size of the embeddings that represent the faces?
embedding_size = 512

Start training

Set the hyperparameters on the command line. For softmax pre-training, set --softmax=True. Otherwise ArcLoss will be used.

# Softmax pre-training
python3 train.py --epochs=2 --batch_size=192 --softmax=True

# Train with ArcLoss
python3 train.py --epochs=4 --batch_size=192

Training checkpoints can be found in the checkpoints directory. There is also another directory, model_scout, containing the best (maximum-accuracy) model checkpoint. You get this feature for free.
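The difference between the two runs above is the loss: softmax pre-training uses plain cross-entropy on the logits, while ArcLoss adds an angular margin to the target class before rescaling. The margin step can be sketched with NumPy (s=64 and m=0.5 are the values from the ArcFace paper, not necessarily this project's defaults):

```python
import numpy as np


def arcface_logits(cosine, labels, s=64.0, m=0.5):
    """Apply the ArcFace additive angular margin to cosine logits.

    cosine: (batch, num_ids) cosine similarities between L2-normalized
            embeddings and class weight vectors.
    labels: (batch,) integer identity labels.
    """
    theta = np.arccos(np.clip(cosine, -1.0, 1.0))  # angles in radians
    target = np.zeros_like(cosine, dtype=bool)
    target[np.arange(len(labels)), labels] = True
    # Add the margin m only to the target-class angle, then rescale by s.
    # cos(theta + m) < cos(theta), so the target logit is penalized,
    # forcing a tighter angular cluster per identity.
    return s * np.cos(np.where(target, theta + m, theta))
```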

Resume training

If training is interrupted, you can resume it with the exact same command used for starting it. The built-in TrainingSupervisor will handle this situation automatically and load the previous training status from the latest checkpoint.

# The same command used for starting training.
python3 train.py --epochs=4 --batch_size=192

However, if you want more customization, you can manually override the training schedule with --override=True to set the global step, the epoch, and the monitor value.

python3 train.py --epochs=4 --batch_size=192 --override=True

Monitor the training process

Use TensorBoard. The log and profiling files are in the logs directory.

tensorboard --logdir /path/to/arcface/logs

Export

Even though the model weights are saved in the checkpoints, it is better to save the entire model so you won't need the source code to restore it. This is useful for inference and for model optimization later.

For cloud/PC applications

The exported model will be saved in the saved_model format in the exported directory. You can restore the model with Keras directly.

python3 train.py --export_only=True

Evaluation

Once the model is exported, you can run an evaluation of the model with test datasets like LFW, etc.

python3 evaluate.py

After training for 84k steps (not counting the softmax pre-training steps), the prediction accuracy on the LFW dataset is 0.9845 ± 0.0022. (ROC curve figure in the original README.)

Check the module code before running. It should not be difficult.

Inference

Once the model is exported, you can use predict.py to recognize faces. Prepare some sample face images and set their paths in the Python file, then run:

python3 predict.py --video /path/to/video.mp4

The most similar faces will be marked in the video frames. You can also use a threshold to filter the results.
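"Most similar" here means cosine similarity between L2-normalized embeddings. A sketch of matching a probe face against a gallery with a similarity threshold (the 0.5 cutoff is an arbitrary illustration; tune it on your own data):

```python
import numpy as np


def match_face(probe, gallery, names, threshold=0.5):
    """Return the best-matching gallery name, or None below threshold.

    probe:   (d,) embedding of the face to recognize.
    gallery: (n, d) embeddings of known faces.
    names:   list of n identity names.
    """
    # L2-normalize so the dot product equals cosine similarity.
    probe = probe / np.linalg.norm(probe)
    gallery = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    similarities = gallery @ probe
    best = int(np.argmax(similarities))
    return names[best] if similarities[best] >= threshold else None
```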

Authors

Yin Guobing (尹国冰) - yinguobing


License

GitHub

Acknowledgments


arcface's Issues

Asian-celeb dataset download link


[Asian-celeb dataset]

  • Training data (Asian-celeb)

The dataset consists of crawled images of celebrities on the web. The images are covered under a Creative Commons Attribution-NonCommercial 4.0 International license (please read the license terms at http://creativecommons.org/licenses/by-nc/4.0/).


[train_msra.tar.gz]

MD5:c5b668f2204c400099b14f367069aef5

Content: Train dataset called MS-Celeb-1M-v1c with 86,876 ids/3,923,399 aligned images cleaned from MS-Celeb-1M dataset.

This dataset has been excluded from both LFW and Asian-Celeb.

Format: *.jpg

Google: https://drive.google.com/file/d/1aaPdI0PkmQzRbWErazOgYtbLA1mwJIfK/view?usp=sharing

[msra_lmk.tar.gz]

MD5:7c053dd0462b4af243bb95b7b31da6e6

Content: A list of five-point landmarks for the 3,923,399 images in MS-Celeb-1M-v1c.

Format: .....

where the first field is the path of the image in the tar file train_msceleb.tar.gz.

Label is an integer ranging from 0 to 86,875.

(x,y) is the coordinate of a key point on the aligned images.

left eye
right eye
nose tip
mouth left
mouth right

Google: https://drive.google.com/file/d/1FQ7P4ItyKCneNEvYfJhW2Kff7cOAFpgk/view?usp=sharing

[train_celebrity.tar.gz]

MD5:9f2e9858afb6c1032c4f9d7332a92064

Content: Train dataset called Asian-Celeb with 93,979 ids/2,830,146 aligned images.

This dataset has been excluded from both LFW and MS-Celeb-1M-v1c.

Format: *.jpg

Google: https://drive.google.com/file/d/1-p2UKlcX06MhRDJxJukSZKTz986Brk8N/view?usp=sharing

[celebrity_lmk.tar.gz]

MD5:9c0260c77c13fbb32692fc06a5dbfaf0

Content: A list of five-point landmarks for the 2,830,146 images in Asian-Celeb.

Format: .....

where the first field is the path of the image in the tar file train_celebrity.tar.gz.

Label is an integer ranging from 86,876 to 196,319.

(x,y) is the coordinate of a key point on the aligned images.

left eye
right eye
nose tip
mouth left
mouth right

Google: https://drive.google.com/file/d/1sQVV9epoF_8jS3ge6DqbilpWk3UNE8U7/view?usp=sharing

[testdata.tar.gz]

MD5:f17c4712f7562ea6d45f0a158e59b792

Content: Test dataset with 1,862,120 aligned images.

Format: *.jpg

Google: https://drive.google.com/file/d/1ghzuEQqmUFN3nVujfrZfBx_CeGUpWzuw/view?usp=sharing

[testdata_lmk.tar]

MD5:7e4995eb9976a2cfd2b23db05d76572c

Content: A list of five-point landmarks for the 1,862,120 images in testdata.tar.gz.

Features should be extracted in the same sequence and in the same quantity as this list.

Format: .....

where the first field is the path of the image in the tar file testdata.tar.gz.

(x,y) is the coordinate of a key point on the aligned images.

left eye
right eye
nose tip
mouth left
mouth right

Google: https://drive.google.com/file/d/1lYzqnPyHXRVgXJYbEVh6zTXn3Wq4JO-I/view?usp=sharing

[feature_tools.tar.gz]

MD5:227b069d7a83aa43b0cb738c2252dbc4

Content: Feature format transform tool and a sample feature file.

Format: We use the same format as Megaface(http://megaface.cs.washington.edu/) except that we merge all files into a single binary file.

Google: https://drive.google.com/file/d/1bjZwOonyZ9KnxecuuTPVdY95mTIXMeuP/view?usp=sharing

Pretrained weights?

Hi there, great job on the implementation of ArcFace, it's really well documented! I haven't been able to find a link for the pretrained models, however. Are they available, or do I have to train them myself? Much appreciated.
