This is an unofficial implementation of FewShot3DKP.
Since the official code has not been released, I tried to build an unofficial version.
I copied most of the parts from Autolink, another project by the same authors, as the architectures look very similar.
conda create -n fewshot3dkp python=3.8
conda activate fewshot3dkp
pip install -r requirements.txt
The WFLW dataset can be found on its website.
I provide pre-processing code that converts WFLW into h5
files. The code is based on 3FabRec.
I also provide the processed h5 files in the GitHub Release.
The pre-trained models can be downloaded from the GitHub Release.
- In the 3D loss, the similarity transformation needs to be detached; otherwise the model will break. (confirmed by the authors)
- The edge map needs to be multiplied by a small number before being concatenated with the masked image; otherwise the model may generate weird edges. (confirmed by the authors)
- In the 2D loss, keypoints that fall outside the image after the affine transformation should be ignored. (confirmed by the authors)
- The depth needs to be divided by a large number so that it does not blow up at the beginning of training. (confirmed by the authors)
- The few-shot examples significantly affect the model; it is better to choose examples with diverse poses and shapes.
- I implemented the 3D loss on the whole object instead of on parts.
- I minimize the difference between random pairs rather than minimal pairs.
- I am not sure what the detector and decoder look like exactly, so I just copied them from Autolink; they might be too small.
- I use the VGG perceptual loss from Autolink instead of the ViT perceptual loss.
- I do not linearly increase the augmentation ranges; I keep them fixed.
- Instead of using KMeans to choose the few-shot examples, I picked them manually.
- I use a fixed edge thickness instead of a learnable one.
NME on WFLW (10 shots): 11.7 (this implementation) vs. 9.19 (original paper)