luckycallor / insightface-tensorflow
TensorFlow implementation of InsightFace (ArcFace: Additive Angular Margin Loss for Deep Face Recognition).
Hi! I was wondering if I could validate the model on my own validation set, because I'm testing my face detector. I have cropped the faces on LFW data set and saved them as images. Can I feed those into the ArcFace model to do inference? Thanks!
Hi there,
I am using the TensorFlow version of InsightFace, but there is a big problem: the similarity between embeddings is too high for images that are completely different. The cosine similarity is 1 in some cases.
Why is that?
Why does the accuracy stay at 0 while training the model?
Hi, I apply InsightFace to face recognition on my own dataset by simply adding 2 FC layers that output the class index.
First I trained it on MXNet with the feature network fixed (the author-provided pretrained model, r100), fine-tuning only the 2 additional FC layers. Then I re-implemented it in TensorFlow with the same parameters, optimizer, and initialization. However, I cannot reproduce the MXNet result.
Any suggestions, please? Is there any detail I could have overlooked?
Thank you.
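I downloaded the dataset from the official LFW website.
I then ran generateTFRecord.py to produce a tfrecord and ran finetune_softmax.py with the 1006kSteps pretrained model. The accuracy starts very low, basically 0, but rises later.
However, the pretrained model already reaches high accuracy on lfw.bin. Why is the accuracy so low on the tfrecord I generated?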
Hello,
Thanks for making this repo!
Just one question: Is cropping the image with MTCNN sufficient when using the ms1m dataset? Were the images aligned with a margin of 5?
Thanks a lot in advance!
I tested embeddings from get_embd.py on 2 different faces, but the distance between the embeddings is small. Why?
Image link: https://drive.google.com/drive/folders/1YGB7-gxwUUt-UB9pSfdvoNeOwE82zCF8?usp=sharing
Thanks
I noticed that you only restore trainable variables in get_embd.py. I guess that is a typo and not intended?
with tf.Session(config=tf_config) as sess:
    tf.global_variables_initializer().run()
    print('loading...')
    saver = tf.train.Saver(var_list=tf.trainable_variables())
    saver.restore(sess, args.model_path)
    print('done!')
Restoring only trainable variables will result in very bad performance, because some BN variables (for example the moving mean and variance, which are not trainable) and maybe some pretrained (frozen) weights are not restored.
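To make the point about BN concrete, here is a minimal NumPy sketch rather than the repository's code. It assumes a plain inference-mode batch norm, and the activation values and statistics are hypothetical. It only illustrates that leaving the moving mean and variance at their usual initial values (0 and 1) changes the layer output even when gamma and beta are restored correctly.

```python
import numpy as np

def batch_norm_inference(x, moving_mean, moving_var, gamma, beta, eps=1e-5):
    """Inference-mode batch norm: normalize with stored moving statistics."""
    return gamma * (x - moving_mean) / np.sqrt(moving_var + eps) + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(4, 8))  # hypothetical activations

# Trainable parameters: restored from the checkpoint in both cases.
gamma, beta = np.ones(8), np.zeros(8)

# Statistics as they might have been accumulated during training (hypothetical).
trained_mean, trained_var = np.full(8, 5.0), np.full(8, 9.0)
# Usual initial values, which is what remains if these variables are not restored.
init_mean, init_var = np.zeros(8), np.ones(8)

restored = batch_norm_inference(x, trained_mean, trained_var, gamma, beta)
unrestored = batch_norm_inference(x, init_mean, init_var, gamma, beta)

print("mean with restored statistics:  ", restored.mean())
print("mean with unrestored statistics:", unrestored.mean())
```

With the restored statistics the output is roughly centered, while with the initial statistics the layer passes the activations through almost unchanged, so every later layer sees inputs it was not trained on.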
Hello @luckycallor, and thanks for your implementation. I am trying to apply the ArcFace loss to my own dataset, but the loss drops from around 57 to around 20 and then stops converging. Furthermore, the training accuracy stays at 0.0. I am using a batch size of 448. Any tips?
My other question: for validation, if I measure accuracy the traditional way with
tf.reduce_mean(tf.cast(tf.equal(pred, self.train_labels), tf.float32)), without comparing 2 embeddings at a time, would the result be wrong? If so, can you explain why?
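One possible contributor to a training accuracy of 0.0, sketched here under assumptions: if accuracy is computed from the margin-penalized logits, a sample can be counted as wrong even though its embedding is already closest to the correct class center. Whether this repository computes training accuracy that way is not shown in this thread. The cosine values are hypothetical; s and m are the logits_scale and logits_margin values that appear in the configs posted on this page.

```python
import numpy as np

s, m = 64.0, 0.5  # logits_scale and logits_margin from the posted configs

# Hypothetical cosines between one embedding and three class centers.
# Class 0 is the true label and is already the nearest center.
cos_theta = np.array([0.8, 0.5, 0.1])
label = 0

plain_logits = s * cos_theta

# ArcFace adds the angular margin m to the true-class angle only.
theta = np.arccos(cos_theta[label])
margin_logits = plain_logits.copy()
margin_logits[label] = s * np.cos(theta + m)

pred_plain = int(np.argmax(plain_logits))    # prediction without the margin
pred_margin = int(np.argmax(margin_logits))  # prediction with the margin applied

print("plain logits: ", plain_logits, "-> class", pred_plain)
print("margin logits:", margin_logits, "-> class", pred_margin)
```

In this toy case the margin lowers the true-class logit below that of a competing class, so an accuracy computed on the penalized logits would count the sample as misclassified.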
What is the output node of the second pretrained model (from Baidu Drive; I have the same question for the first pretrained model)? I can't find it in the file, and I need it in order to save the model.
The following requirements are missing from your documentation:
sklearn
pyyaml
MS1M-ArcFace in bin format is 16 GB, but it becomes very large when converted to tfrecord. Have you run into this?
Looking forward to your reply.
In config_ms1m_100.yaml, I set the pretrained model to the one you provided, pretrained_model: './premodel/config_ms1m_200_200k/best-m-200000 ', and then get the error ValueError: The passed save_path is not a valid checkpoint: ./premodel/config_ms1m_200_200k/best-m-200000 .
Hi Lucky,
Which IDE are you using?
Also, does this program use Python 2.7 or 3.6?
Best regards,
PeterPham
Hi, luckycallor,
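The ArcFace paper describes image normalization as follows:
The faces are cropped and resized to 112 × 112, and each pixel (ranged between [0, 255]) in RGB images is normalised by subtracting 127.5 then divided by 128.
Your implementation does it like this:
InsightFace-tensorflow/get_embd.py
Line 41 in 3eca399
The two approaches differ in the range of extreme values after normalization. Did you choose this because it performs better?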
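The difference in range can be checked numerically. This small sketch assumes the repository's variant is x/127.5 - 1.0, which is the form that appears in the load_image snippet quoted elsewhere on this page; the referenced line 41 itself is not shown here.

```python
import numpy as np

pixels = np.array([0.0, 127.5, 255.0])  # minimum, midpoint, and maximum pixel value

# Normalization as described in the quoted ArcFace paper sentence.
paper_norm = (pixels - 127.5) / 128.0
# Normalization as in the load_image snippet on this page (assumed to match line 41).
repo_norm = pixels / 127.5 - 1.0

print("paper:", paper_norm)
print("repo: ", repo_norm)
print("scale ratio:", 128.0 / 127.5)
```

The two mappings differ only by a constant scale factor of 128/127.5, so the extremes are about ±0.996 in one case and exactly ±1 in the other. A model should still be fed the same normalization it was trained with.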
I saw that your config files use batch sizes of 100, 200, 128, and 256. Does this affect how I use the model for inference? Do I have to pad my image data, e.g., 1 image to 100 images, in order to run inference?
Besides, why did you choose batch sizes of 100 and 200 rather than powers of 2? Just for convenience?
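Whether padding is needed depends on whether the model's input has a fixed or a dynamic batch dimension, which this thread does not settle. As a sketch under that uncertainty, the hypothetical helpers below cover both cases: `iter_batches` simply yields a smaller last batch, and `pad_batch` zero-pads a short batch and reports how many rows are real so the padded outputs can be discarded.

```python
import numpy as np

def iter_batches(images, batch_size):
    """Yield consecutive slices; the last one may be smaller than batch_size."""
    for start in range(0, len(images), batch_size):
        yield images[start:start + batch_size]

def pad_batch(batch, batch_size):
    """Zero-pad a short batch to batch_size; return it with the count of real rows."""
    n_valid = batch.shape[0]
    padding = np.zeros((batch_size - n_valid,) + batch.shape[1:], dtype=batch.dtype)
    return np.concatenate([batch, padding], axis=0), n_valid

images = np.zeros((5, 112, 112, 3), dtype=np.float32)  # placeholder face crops
batch_shapes = [b.shape[0] for b in iter_batches(images, batch_size=2)]
padded, n_valid = pad_batch(images[:1], batch_size=4)

print(batch_shapes, padded.shape, n_valid)
```

With a fixed batch dimension, only the first `n_valid` embeddings of the padded batch would be kept.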
It is a little strange that you don't include the code that adds summaries to the summary writer, especially since TensorBoard is being used.
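I use get_emb.py to get the face embeddings from two photos, but in my tests the distance for two photos of the same person differs little from the distance for two photos of different people. How should I set the threshold? I am using Euclidean distance.
Distances for the same person: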
dist1 : [[0.23265116]]
dist2 : 0.23265092
dist3 : 0.07422299692117233
simlarity : 0.9729368
Distances for different people:
dist1 : [[0.33493966]]
dist2 : 0.33493972
dist3 : 0.1071193271412563
simlarity : 0.94390774
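Here dist1 comes from my own Euclidean distance function, dist2 from the author's Euclidean distance function, and dist3 from the author's cosine distance function. simlarity is the similarity, computed as:
def distance(embeddings1, embeddings2):
    dot = np.dot(embeddings1, embeddings2)
    norm = np.linalg.norm(embeddings1, ord=2) * np.linalg.norm(embeddings2, ord=2)
    similarity = dot / norm
    # Clamp values that exceed 1 because of floating-point rounding.
    if similarity > 1:
        similarity = 1.0
    return similarity
Does everyone have this problem?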
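The three distance values posted above appear to be different views of the same quantity. As a sketch, assuming the embeddings are L2-normalized before the Euclidean distance is taken, the Euclidean distance equals sqrt(2 - 2·cos) and an angular distance can be written as arccos(cos)/π. The function names here are illustrative, not the repository's.

```python
import numpy as np

def euclidean_from_cosine(similarity):
    """Euclidean distance between two unit vectors with the given cosine."""
    return np.sqrt(2.0 - 2.0 * similarity)

def angular_from_cosine(similarity):
    """Angle between the vectors, scaled to the range [0, 1]."""
    return np.arccos(np.clip(similarity, -1.0, 1.0)) / np.pi

# Similarity values taken from the post above.
for sim in (0.9729368, 0.94390774):
    print(sim, euclidean_from_cosine(sim), angular_from_cosine(sim))
```

Applied to the posted similarities, these formulas roughly reproduce the posted dist1/dist2 and dist3 values, which suggests the metrics agree with each other and a threshold on one can be converted to the others. It does not by itself explain why the two pairs are so close.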
Hi,
Thank you for your interesting work. I have tried to extract embeddings with the pretrained model, and you mention that the face images should be well cropped. Could you please explain what that means? How tightly should I crop the faces? Do I need to align the face?
Thanks.
Can the author tell us which training dataset was used?
Can you please explain this normalization method? First, you normalize the embeddings of the original image and of the flipped image. Then you sum both embeddings into one.
embds_arr = embds_arr/np.linalg.norm(embds_arr, axis=1, keepdims=True)+embds_f_arr/np.linalg.norm(embds_f_arr, axis=1, keepdims=True)
Why would this work at inference time? Can you explain why this is better than just running the embedding and then normalizing it?
Thanks
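To see what the quoted line does numerically, here is a small NumPy sketch with random stand-in embeddings (hypothetical data, not model output). Normalizing each view before summing gives the original and the flipped image equal weight regardless of their raw norms. The sum itself is no longer unit length, so a final normalization is shown as well; whether the repository applies one later is not visible in this thread.

```python
import numpy as np

def l2_normalize(x):
    """Scale each row to unit Euclidean length."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

rng = np.random.default_rng(0)
embds_arr = rng.normal(size=(3, 512))                       # stand-in embeddings
embds_f_arr = embds_arr + 0.1 * rng.normal(size=(3, 512))   # stand-in flipped-view embeddings

# Same operation as the quoted line: sum of the two unit-length embeddings.
combined = l2_normalize(embds_arr) + l2_normalize(embds_f_arr)
combined_norms = np.linalg.norm(combined, axis=1)  # between 0 and 2, generally not 1

# Optional final step so Euclidean thresholds stay comparable.
final = l2_normalize(combined)

print(combined_norms)
print(np.linalg.norm(final, axis=1))
```

Cosine similarity is unaffected by the length of the summed vector, but Euclidean distances are, which is why the final normalization matters for distance thresholds.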
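Hello, thank you very much for your code, it works well.
I have one small question I would like you to answer. After training on the faces_ms1m_112x112.tfrecord dataset, I wanted to check the classification accuracy on that same dataset, but the accuracy is always 0, whereas validation accuracy on .bin files such as lfw and agedb_30 is normal. Why is that? Thanks.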
Is there any possibility to access the pre-trained models without installing Baidu software?
@luckycallor, thanks for your code!
However, I tried to evaluate your model config_ms1m_100_334k using your evaluate.py, and the result is not as reported. Can you tell if there is something wrong with my setup?
python evaluate.py --config_path configs/config_test.yaml --model_path config_ms1m_100_334k/best-m-334000
749/750
done!
749/750
done!
done!
eval on agedb_30: acc--0.60483+-0.02297, tar--0.07433+-0.01248@far=0.00067
reading /home/thanhnn/dataset/faces_ms1m_112x112/lfw.bin
done!
forward running...
749/750
done!
749/750
done!
done!
eval on lfw: acc--0.93733+-0.02137, tar--0.76000+-0.04442@far=0.00133
reading /home/thanhnn/dataset/faces_ms1m_112x112/cfp_ff.bin
done!
forward running...
874/875
done!
874/875
done!
done!
eval on cfp_ff: acc--0.81229+-0.01246, tar--0.32200+-0.03002@far=0.00114
reading /home/thanhnn/dataset/faces_ms1m_112x112/cfp_fp.bin
done!
forward running...
874/875
done!
874/875
done!
done!
eval on cfp_fp: acc--0.68686+-0.01423, tar--0.14657+-0.02339@far=0.00086
done!
# model params
backbone_type: resnet_v2_m_50
loss_type: arcface
out_type: E
image_size: 112
embd_size: 512
class_num: 85742
# hyper params
bn_decay: 0.9
keep_prob: 0.4
weight_decay: !!float 5e-4
logits_scale: 64.0
logits_margin: 0.5
momentum: 0.9
# run params
val_bn_train: False
augment_flag: True
augment_margin: 16
gpu_num: 1
batch_size: 16
epoch_num: 20
step_per_epoch: 100000
val_freq: 2000
lr_steps: [40000, 60000, 80000]
lr_values: [0.004, 0.002, 0.0012, 0.0004]
# paths
# pretrained_model: '/data/hhd/InsightFace-tensorflow/output/20190120-133421/checkpoints/ckpt-m-140000'
train_data: ['/data/hhd/dataset/FaceData/InsightFace/faces_ms1m_arcface.tfrecord']
val_data: {'agedb_30': '/home/thanhnn/dataset/faces_ms1m_112x112/agedb_30.bin', 'lfw': '/home/thanhnn/dataset/faces_ms1m_112x112/lfw.bin', 'cfp_ff': '/home/thanhnn/dataset/faces_ms1m_112x112/cfp_ff.bin', 'cfp_fp': '/home/thanhnn/dataset/faces_ms1m_112x112/cfp_fp.bin'}
output_dir: './output'
Such as:
bn_decay
keep_prob
weight_decay
logits_scale
logits_margin
momentum
val_bn_train
augment_flag
augment_margin
gpu_num
batch_size
epoch_num
step_per_epoch
val_freq
lr_steps
lr_values
Thanks for your contribution. When I use get_embd.py to extract embeddings with the pretrained model, I put face images in read_path. I find that the embedding of the same image changes depending on whether read_path contains three images or four. For example, with three images in read_path, the embedding of '002.png' is [0.222, 0.1999,0.4445...], but with four images it becomes [0.111, 0.4888,0.7779...].
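One possible explanation, offered as an assumption rather than a diagnosis of this repository: if any batch norm layer normalizes with the statistics of the current batch instead of stored moving statistics, each image's output depends on which other images share the batch. The NumPy sketch below contrasts the two modes on hypothetical activations.

```python
import numpy as np

def batch_norm_training(x, eps=1e-5):
    """Training-mode batch norm: statistics come from the current batch."""
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def batch_norm_inference(x, moving_mean, moving_var, eps=1e-5):
    """Inference-mode batch norm: statistics are fixed, independent of the batch."""
    return (x - moving_mean) / np.sqrt(moving_var + eps)

rng = np.random.default_rng(0)
four = rng.normal(size=(4, 8))   # hypothetical activations for four images
three = four[:3]                 # the same first three images, one image fewer

# Compare the output for the second image in both batch compositions.
train_same = np.allclose(batch_norm_training(three)[1], batch_norm_training(four)[1])
mean, var = np.zeros(8), np.ones(8)
infer_same = np.allclose(batch_norm_inference(three, mean, var)[1],
                         batch_norm_inference(four, mean, var)[1])

print("batch statistics give identical output: ", train_same)
print("moving statistics give identical output:", infer_same)
```

If this is the cause, the embedding would become stable once all batch norm layers run in inference mode with their moving statistics restored.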
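Thank you very much for sharing the code. After downloading the 1006k and 334k pretrained models from Baidu Pan, both seem to be the 334k model. Is that the case?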
Please, where can I get the link to the evaluation data?
I am training on the CASIA dataset without a pretrained model.
With the config_ms1m_res50.yaml settings, will acc stay at 0 the whole time?
backbone_type: resnet_v2_50
loss_type: arcface
out_type: E
image_size: 112
embd_size: 512
class_num: 10572
bn_decay: 0.9
keep_prob: 0.4
weight_decay: !!float 5e-4
logits_scale: 64.0
logits_margin: 0.5
momentum: 0.9
val_bn_train: False
augment_flag: True
augment_margin: 16
gpu_num: 1
batch_size: 256
epoch_num: 20
step_per_epoch: 10000
val_freq: 2000
lr_steps: [40000, 60000, 80000]
lr_values: [0.004, 0.002, 0.0012, 0.0004]
pretrained_model: ''
train_data: ['/opt/gpu/z/InsightFace-tensorflow-master/data/casia.tfrecord']
I am working on face verification and want to know the threshold for deciding whether two images show the same person. @luckycallor, I hope you can answer. Thanks a lot.
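A threshold is usually not a fixed constant but is chosen on labeled pairs from data similar to the target use. The sketch below shows one simple way to do that, sweeping candidate thresholds and keeping the one with the highest pair accuracy. The function name and the toy distances are hypothetical, and this is not necessarily how the repository's evaluate.py selects its threshold.

```python
import numpy as np

def best_threshold(distances, is_same, thresholds):
    """Return the threshold (and its accuracy) that best separates labeled pairs."""
    distances = np.asarray(distances)
    is_same = np.asarray(is_same, dtype=bool)
    # A pair is predicted "same person" when its distance is below the threshold.
    accuracies = [np.mean((distances < t) == is_same) for t in thresholds]
    best = int(np.argmax(accuracies))
    return float(thresholds[best]), float(accuracies[best])

# Toy pair distances with ground-truth labels (hypothetical values).
distances = np.array([0.2, 0.3, 0.4, 0.9, 1.1, 1.3])
is_same = np.array([True, True, True, False, False, False])

threshold, accuracy = best_threshold(distances, is_same, np.arange(0, 4, 0.01))
print(threshold, accuracy)
```

In practice the threshold would be selected on one set of pairs and checked on a held-out set, since a threshold tuned and tested on the same pairs looks better than it is.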
Line 54: typo, thresholdes (I assume this should be thresholds).
Function load_image():
line 28:
line 37: the folder name is not prepended to the image name, so the image is not read.
How I corrected it:
import os
import numpy as np
from scipy import misc  # imread/imresize were removed in newer SciPy releases

def load_image(main_path, image_size):
    print('reading %s' % main_path)
    if os.path.isdir(main_path):
        # Prefix each file name with its folder so imread can find it.
        paths = [os.path.join(main_path, name) for name in os.listdir(main_path)]
    else:
        paths = [main_path]
    images = []
    images_f = []
    for path in paths:
        img = misc.imread(path)
        img = misc.imresize(img, [image_size, image_size])
        # img = img[s:s+image_size, s:s+image_size, :]
        img_f = np.fliplr(img)
        img = img/127.5-1.0
        img_f = img_f/127.5-1.0
        images.append(img)
        images_f.append(img_f)
    fns = [os.path.basename(p) for p in paths]
    print('done!')
    return (np.array(images), np.array(images_f), fns)
I fine-tuned the model on my dataset, but the loss stays large:
Epoch: [ 0/ 1] [ 15030/500000] time: 0.25, loss: 29.442 (inference: 27.447, wd: 1.994), acc: 0.000
Epoch: [ 0/ 1] [ 15031/500000] time: 0.25, loss: 12.441 (inference: 10.447, wd: 1.994), acc: 0.167
Epoch: [ 0/ 1] [ 15032/500000] time: 0.26, loss: 20.296 (inference: 18.302, wd: 1.994), acc: 0.167
And this is my config_finetune.yaml:
backbone_type: resnet_v2_m_50
loss_type: arcface
out_type: E
image_size: 112
embd_size: 512
class_num: 93979
bn_decay: 0.9
keep_prob: 0.4
weight_decay: !!float 5e-4
logits_scale: 64.0
logits_margin: 0.5
momentum: 0.9
fixed_epoch_num: 1
val_bn_train: False
augment_flag: True
augment_margin: 16
gpu_num: 1
batch_size: 6
epoch_num: 1
step_per_epoch: 500000
val_freq: 100000
lr_steps: [40000, 60000, 80000]
lr_values: [0.004, 0.002, 0.0012, 0.0004]
pretrained_model: 'E:/weight/insight_face/best-m-334000'
train_data: ['F:/data_sets/tfrecord/faces_glintasia_cls93979.tfrecord']
val_data: {'lfw': 'E:/data_sets/val_data/lfw.bin'}
output_dir: 'output/'
What can I do in this situation, please?
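At the start of training the loss is very large, over 30, and later it gradually drops to about 1.6. What value does everyone's loss eventually converge to?
Has anyone run into this problem?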
Hi, luckycallor
I find the checkpoint file very large, about 643 MB. Is there any way to reduce its size? Otherwise, it consumes a lot of memory when deployed on mobile devices.
FileNotFoundError: [Errno 2] No such file or directory: '/data/hhd/dataset/FaceData/InsightFace/faces_ms1m_arcface/agedb_30.bin'
Changing line 102 in get_embd.py from saver = tf.train.Saver(var_list=tf.trainable_variables()) to saver = tf.train.Saver() solved this.
I train with ResNet_50, batch_size 256, on 2 × 1080 Ti GPUs. However, a single training step takes nearly 12 seconds. Why is it so slow? Does anyone know?
Hi luckycallor, thanks for your code! Sorry, I haven't read the paper.
In evaluate() in evaluate.py:
thresholds = np.arange(0, 4, 0.01)
if distance_metric == 1:
    thresholdes = np.arange(0, 1, 0.0025)
Why is there a special case for distance_metric == 1?
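A plausible reading, stated as an assumption because the distance function itself is not quoted here: some face verification evaluation scripts use metric 0 for the squared Euclidean distance and metric 1 for an angular distance, arccos(cosine)/π. For L2-normalized embeddings these have different maximum values, which would explain the two threshold ranges. The sketch checks the maxima with the most distant possible pair of unit vectors.

```python
import numpy as np

a = np.array([1.0, 0.0])
b = -a  # opposite unit vector: the most distant possible pair

# Assumed metric 0: squared Euclidean distance between unit vectors.
squared_euclidean_max = float(np.sum((a - b) ** 2))

# Assumed metric 1: angle between the vectors, scaled by pi.
cosine = float(np.dot(a, b))
angular_max = float(np.arccos(cosine) / np.pi)

print(squared_euclidean_max, angular_max)
```

Under these assumptions the squared Euclidean distance tops out at 4 and the angular distance at 1, matching the upper bounds of np.arange(0, 4, 0.01) and np.arange(0, 1, 0.0025).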