xinntao / facexlib

FaceXlib aims at providing ready-to-use face-related functions based on current SOTA open-source methods.

License: MIT License

Python 100.00%

pytorch deep-learning face detection alignment recognition parsing matting headpose tracking

facexlib's Introduction

FaceXLib

English | 简体中文

facexlib aims at providing ready-to-use face-related functions based on current SOTA open-source methods.
Only PyTorch inference code is provided. For training or fine-tuning, please refer to the original repositories listed below.
Note that facexlib is only a collection of these algorithms. Refer to the original licenses to check whether your intended use is permitted.

If facexlib is helpful in your projects, please help to ⭐ this repo. Thanks😊
Other recommended projects: ▶️ Real-ESRGAN ▶️ GFPGAN ▶️ BasicSR

✨ Functions

Function	Sources	Original LICENSE
Detection	Pytorch_Retinaface	MIT
Alignment	AdaptiveWingLoss	Apache 2.0
Recognition	InsightFace_Pytorch	MIT
Parsing	face-parsing.PyTorch	MIT
Matting	MODNet	CC 4.0
Headpose	deep-head-pose	Apache 2.0
Tracking	SORT	GPL 3.0
Assessment	hyperIQA	-
Utils	Face Restoration Helper	-

👀 Demo and Tutorials

🔧 Dependencies and Installation

Python >= 3.7 (Anaconda or Miniconda is recommended)
PyTorch >= 1.7
Optional: NVIDIA GPU + CUDA

Installation

pip install facexlib

Pre-trained models

Pre-trained models are downloaded automatically on first inference.
If your network connection is unstable, you can download them in advance (with any download tool) and put them in the folder PACKAGE_ROOT_PATH/facexlib/weights.
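
As a rough sketch of the manual route, the snippet below works out where a downloaded file should be placed. It assumes that PACKAGE_ROOT_PATH is the directory containing the installed facexlib package (for example a site-packages folder). The helper name `weights_target` is made up for illustration, and the URL is the BiSeNet parsing checkpoint that appears in the project's parsing code.

```python
import os
from urllib.parse import urlparse


def weights_target(package_root, url):
    """Return the path where a pre-downloaded weight file should be placed.

    `package_root` stands in for PACKAGE_ROOT_PATH (assumed to be the folder
    that contains the installed `facexlib` package).
    """
    file_name = os.path.basename(urlparse(url).path)
    return os.path.join(package_root, "facexlib", "weights", file_name)


url = "https://github.com/xinntao/facexlib/releases/download/v0.2.0/parsing_bisenet.pth"
target = weights_target("/path/to/site-packages", url)
print(target)
# Download `url` with any tool you like and save the file at `target`.
```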

📜 License and Acknowledgement

This project is released under the MIT license.

📧 Contact

If you have any questions, please open an issue or email [email protected].

facexlib's Issues

Parsing model: half-precision inference does not work

The half argument has no effect in facexlib.parsing.init_parsing_model. The function accepts it but never uses it:

def init_parsing_model(model_name='bisenet', half=False, device='cuda', model_rootpath=None):
    if model_name == 'bisenet':
        model = BiSeNet(num_class=19)
        model_url = 'https://github.com/xinntao/facexlib/releases/download/v0.2.0/parsing_bisenet.pth'
    elif model_name == 'parsenet':
        model = ParseNet(in_size=512, out_size=512, parsing_ch=19)
        model_url = 'https://github.com/xinntao/facexlib/releases/download/v0.2.2/parsing_parsenet.pth'
    else:
        raise NotImplementedError(f'{model_name} is not implemented.')

    model_path = load_file_from_url(
        url=model_url, model_dir='facexlib/weights', progress=True, file_name=None, save_dir=model_rootpath)
    load_net = torch.load(model_path, map_location=lambda storage, loc: storage)
    model.load_state_dict(load_net, strict=True)
    model.eval()
    model = model.to(device)
    return model
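
A minimal sketch of how the flag could be honored, not the maintainers' fix: cast the model with `half()` after the weights are loaded. The helper name `finalize_model` is hypothetical. It relies only on the `eval()`, `half()` and `to()` methods that every `torch.nn.Module` provides, so the snippet itself does not import torch.

```python
def finalize_model(model, half=False, device="cuda"):
    """Put a loaded model into eval mode, optionally cast it to FP16, and move it.

    `model` is expected to be a torch.nn.Module (or anything exposing the same
    eval/half/to methods). Hypothetical helper for illustration only.
    """
    model.eval()
    if half:
        # nn.Module.half() casts floating-point parameters and buffers to float16.
        model = model.half()
    return model.to(device)
```

In init_parsing_model this would take the place of the final model.eval() and model.to(device) lines. Inputs fed to a half-precision model need to be float16 as well.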

onnx

Hello, detection_Resnet50_Final.pth and parsing_parsenet.pth run slowly on my NVIDIA PC. Are ONNX versions of these models available?

FaceRestoreHelper seems not thread-safe

FaceRestoreHelper only works from multiple threads when its use is guarded with threading.Semaphore(). I first ran into this in GFPGAN and later in our custom implementations.

This limits the performance of GFPGAN a lot!
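
The workaround described above amounts to serializing access to the shared helper. A minimal sketch, assuming a single helper instance is shared between threads (`LockedHelper` and its `run` method are hypothetical names, and the wrapped object merely stands in for a FaceRestoreHelper):

```python
import threading


class LockedHelper:
    """Serialize a multi-step job on a shared, non-thread-safe helper object."""

    def __init__(self, helper):
        self._helper = helper
        self._lock = threading.Lock()

    def run(self, job, *args, **kwargs):
        # The helper stores per-image state between calls (for example its
        # list of restored faces), so the whole sequence of calls for one
        # image has to run under the lock, not just each call separately.
        with self._lock:
            return job(self._helper, *args, **kwargs)
```

Here `job` is any function that takes the helper as its first argument and performs all steps for one image. An alternative that avoids the bottleneck is to create one helper per thread, at the cost of extra memory.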

Can facexlib support the Mac M1 architecture?

Hi xinntao,
Thank you for such an excellent project. I have a request that I hope you can address.

https://github.com/xinntao/facexlib/blob/master/facexlib/detection/retinaface.py
At line 14:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
Could it support the Mac M1 architecture with code like this:
device = torch.device('cuda' if torch.cuda.is_available() else 'mps' if torch.has_mps else 'cpu')
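
A hedged sketch of the fallback order this request describes. To keep it runnable without PyTorch, the availability flags are passed in as plain booleans. In real code they would come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()` (present in PyTorch 1.12 and later; recent releases warn that `torch.has_mps` is deprecated). The function name `pick_device` is hypothetical.

```python
def pick_device(cuda_available, mps_available):
    """Return a device string, preferring CUDA, then Apple's MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


# On an Apple Silicon machine without CUDA:
print(pick_device(cuda_available=False, mps_available=True))  # mps
```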

TypeError: '>' not supported between instances of 'NoneType' and 'int'

xxx\facexlib\utils\face_restoration_helper.py", line 114, in read_image
if np.max(img) > 256: # 16-bit image
TypeError: '>' not supported between instances of 'NoneType' and 'int'

It seems that if you point it at an empty directory, the image loads as None, and comparing None with an int is not supported in Python 3. There should be a check (or a try/except) that handles the None case before the comparison is made.
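
A minimal sketch of such a guard. `cv2.imread` returns None instead of raising when a file cannot be read, so the check can sit right after loading. `require_image` is a hypothetical helper name, not part of facexlib.

```python
def require_image(img, source):
    """Raise a clear error if image loading produced None."""
    if img is None:
        raise ValueError(f"Could not read an image from {source!r}")
    return img
```

It could be used as `img = require_image(cv2.imread(path), path)`, so the failure points at the unreadable path rather than at a later comparison.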

face_restoration_helper.py: read_image() reads 16-bit images as 8-bit

Reading 16-bit PNGs correctly with cv2 requires the IMREAD_UNCHANGED flag (in opencv-python 4.7):
img = cv2.imread(img, cv2.IMREAD_UNCHANGED)

changed in https://github.com/xvdp/facexlib commit 3e90175

Loading bunch of models

Hello, in face_restoration_helper.py I see that you are using several models: face detection, landmark detection and a segmentation model. This causes CUDA out-of-memory errors even on GPUs with a lot of memory. Why use separate models rather than one or two that could produce the same outputs?

face recognition model ResNetArcFace import error

Could not import ResNetArcFace, cosin_metric, load_image in inference/inference_recognition.py

Increase the area to paste

Hi, I am using this library in my project. Specifically, I use the paste_faces_to_input_image() function to paste a restored face back into the original image. I would like to paste a larger area than just the region around the facial landmarks. Is there a way to do this? Thanks.

Feature Request: offload memory usage on storage

I used Real-ESRGAN on Google Colab and wanted to enlarge the image 16x.
I tried the tile option, but it inevitably crashed in the process.
I examined the code and found that the paste_faces_to_input_image in facexlib's face_restoration_helper.py was allocating a huge amount of memory.
So I thought of offloading memory usage on storage and implemented it using numpy's memmap and numexpr.
This greatly reduced the memory consumption and made it possible to zoom in 16x on the image with Google Colab.

Would you be interested in incorporating this feature?

--- face_restoration_helper_a.py
+++ face_restoration_helper_b.py
@@ -279,6 +279,7 @@
         self.restored_faces.append(face)
 
     def paste_faces_to_input_image(self, save_path=None, upsample_img=None):
+        import numexpr as ne
         h, w, _ = self.input_img.shape
         h_up, w_up = int(h * self.upscale_factor), int(w * self.upscale_factor)
 
@@ -288,6 +289,12 @@
         else:
             upsample_img = cv2.resize(upsample_img, (w_up, h_up), interpolation=cv2.INTER_LANCZOS4)
 
+        upsample_img_orig = upsample_img
+        upsample_img_memmap_fn = "/content/upsample_img.npy"
+        upsample_img = np.memmap(upsample_img_memmap_fn, shape=upsample_img.shape, dtype="float64", mode="w+")
+        upsample_img[:] = upsample_img_orig
+        del upsample_img_orig
+
         assert len(self.restored_faces) == len(
             self.inverse_affine_matrices), ('length of restored_faces and affine_matrices are different.')
         for restored_face, inverse_affine in zip(self.restored_faces, self.inverse_affine_matrices):
@@ -352,12 +359,14 @@
                 upsample_img = inv_soft_mask * pasted_face + (1 - inv_soft_mask) * upsample_img[:, :, 0:3]
                 upsample_img = np.concatenate((upsample_img, alpha), axis=2)
             else:
-                upsample_img = inv_soft_mask * pasted_face + (1 - inv_soft_mask) * upsample_img
+                ne.evaluate("inv_soft_mask * pasted_face + (1 - inv_soft_mask) * upsample_img", out=upsample_img)
 
         if np.max(upsample_img) > 256:  # 16-bit image
             upsample_img = upsample_img.astype(np.uint16)
         else:
             upsample_img = upsample_img.astype(np.uint8)
+        import os
+        os.unlink(upsample_img_memmap_fn)
         if save_path is not None:
             path = os.path.splitext(save_path)[0]
             save_path = f'{path}.{self.save_ext}'

Error: can't convert cuda:0 device type

This package runs correctly in other environments, but in the environment described below I get the following error:

self.face_helper.get_face_landmarks_5(only_center_face=only_center_face, eye_dist_threshold=5)
  File "/usr/local/lib/python3.8/dist-packages/facexlib/utils/face_restoration_helper.py", line 139, in get_face_landmarks_5
    bboxes = self.face_det.detect_faces(input_img, 0.97) * scale
  File "/usr/local/lib/python3.8/dist-packages/facexlib/detection/retinaface.py", line 228, in detect_faces
    bounding_boxes, landmarks = bounding_boxes[keep, :], landmarks[keep]
  File "/usr/local/lib/python3.8/dist-packages/torch/_tensor.py", line 970, in __array__
    return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

Environment info:

Docker
Based on nvcr.io/nvidia/tensorrt:21.12-py3 image.
Python 3.8.10
Pip env: env.txt

Was the RetinaFace model trained on RGB or BGR images?

Hi,
I'd like to know whether the RetinaFace model was trained on RGB or BGR images. If I apply it directly to RGB input, will the accuracy change much?
Thanks for your reply!

Low-quality (blurry) faces after detecting and warping faces with cv2.warpAffine using facexlib

I am using the facexlib library to detect, crop (warp and align) and resize (to 512x512) faces from high-resolution photographs (4K or above). In some cases the output images are of low quality even though the face is larger than 1K x 1K in the source. Here is the code that calls facexlib to detect and warp the faces:

self.face_helper.read_image(img)
# get face landmarks for each face
self.face_helper.get_face_landmarks_5(only_center_face=only_center_face, eye_dist_threshold=5)
# eye_dist_threshold=5: skip faces whose eye distance is smaller than 5 pixels
# align and warp each face
self.face_helper.align_warp_face()

here is link for these function: Face Detection using Facexlib

Below is Detected Image. (https://i.stack.imgur.com/VNNky.png)

Original Image is here (can not upload here as size is bigger)

How can I detect and crop (warp and align) faces from high-resolution images without losing quality? I tried passing different interpolation methods as flags to cv2.warpAffine (cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_AREA, cv2.INTER_CUBIC and cv2.INTER_LANCZOS4), but there was no difference in image quality.

bounding_boxes < 0

bboxes = [[ -2.7954 213.42 660.27 979.73 0.99998 172.87 443.75 473.29 436.65 324.23 632.18 196.31 775.76 462.73 767.62]]

I get an incorrect result: the first coordinate of the bounding box is negative.
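
Slightly negative coordinates can occur when a face touches the image border, since a detector that regresses box offsets is not constrained to the frame (this is an assumption about the cause, not a confirmed diagnosis). If downstream code needs in-bounds boxes, they can be clamped. A sketch with a hypothetical helper `clip_box` and an assumed 1000x1000 image:

```python
def clip_box(box, width, height):
    """Clamp (x1, y1, x2, y2) to the image rectangle [0, width] x [0, height]."""
    x1, y1, x2, y2 = box
    return (
        min(max(x1, 0.0), width),
        min(max(y1, 0.0), height),
        min(max(x2, 0.0), width),
        min(max(y2, 0.0), height),
    )


# First four values of the reported detection; the image size is assumed.
print(clip_box((-2.7954, 213.42, 660.27, 979.73), width=1000, height=1000))
# (0.0, 213.42, 660.27, 979.73)
```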

Can this handle animal or cartoon animal faces?

Would like to detect animal faces and other cartoon beings.

face_template values

How do I calculate these numbers?

class FaceRestoreHelper(object): ...
            self.face_template = np.array([[192.98138, 239.94708], [318.90277, 240.1936], [256.63416, 314.01935],
                                           [201.26117, 371.41043], [313.08905, 371.15118]])

I want to calculate them for LFW and other datasets. I'm very new so I would appreciate if you can show how to calculate. 🍵

Thank you so much for your amazing library!

recognition/__init__.py: replace the hard-coded 'cuda' with the device parameter

Could you change just this line so that either CPU or GPU can be used?

FROM:
model = Backbone(num_layers=50, drop_ratio=0.6, mode='ir_se').to('cuda').eval()

TO:
model = Backbone(num_layers=50, drop_ratio=0.6, mode='ir_se').to(device).eval()

facexlib without CUDA

Hey,

I'm using this because I'm interested in testing out the GFPGAN repo. However I'd like to run this on a CPU version of torch only. Since we're performing inference and not training, would you be able to support this?

Thanks for the fantastic work though btw :)

NMS implementation

I see that you use a custom Python implementation for NMS.

facexlib/facexlib/detection/retinaface_utils.py

Lines 38 to 66 in 206e0cc

def py_cpu_nms(dets, thresh):
    """Pure Python NMS baseline."""
    x1 = dets[:, 0]
    y1 = dets[:, 1]
    x2 = dets[:, 2]
    y2 = dets[:, 3]
    scores = dets[:, 4]

    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]

    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])

        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        ovr = inter / (areas[i] + areas[order[1:]] - inter)

        inds = np.where(ovr <= thresh)[0]
        order = order[inds + 1]

    return keep

Moreover, the requirements of this repository include torchvision:

facexlib/requirements.txt

Line 9 in 206e0cc

torchvision

I wonder whether it would be beneficial to use torchvision's implementation of NMS instead. cf. torchvision.ops.nms
I had this idea after a quick discussion in another repository: ternaus/retinaface#23 (comment)
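
One detail to check before swapping (stated as my understanding, worth verifying against the torchvision version in use): py_cpu_nms measures widths and heights with a +1 pixel convention, while torchvision.ops.nms uses plain x2 - x1 and y2 - y1. The IoU values therefore differ slightly, so boxes close to the threshold may be kept or dropped differently. The sketch below compares both conventions for one pair of boxes; `iou` is an illustrative helper, not code from either library.

```python
def iou(a, b, plus_one):
    """IoU of two (x1, y1, x2, y2) boxes, with or without the +1 pixel convention."""
    off = 1.0 if plus_one else 0.0
    area_a = (a[2] - a[0] + off) * (a[3] - a[1] + off)
    area_b = (b[2] - b[0] + off) * (b[3] - b[1] + off)
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]) + off)
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]) + off)
    inter = w * h
    return inter / (area_a + area_b - inter)


box_a = (0.0, 0.0, 10.0, 10.0)
box_b = (5.0, 0.0, 15.0, 10.0)
print(iou(box_a, box_b, plus_one=True))   # convention used by py_cpu_nms
print(iou(box_a, box_b, plus_one=False))  # convention without the +1
```

The gap is exaggerated here by the tiny boxes. For face boxes that span hundreds of pixels the two conventions give nearly the same IoU.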

A general question

Do you think I can use these functions for building live beauty filters like those seen in Snapchat or tencent (eye largening, face retouch, makeup...)? Or for those I have to use specialized GANs?
I would appreciate some general guidance.

Hi! profile faces!!

Hi!
I find your algorithm very useful.

However, I don't want to detect profile faces.
How should I handle this?

Thanks!!

Can you take 'device' as a parameter?

Thank you for the great work.
The device is currently chosen like this: "device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')".
Sometimes I need to use 'cuda:1'.
Could you make it configurable?

4k HD realistic

Wrong kernel in Gaussian blur?

Hi,

The Gaussian blur is applied twice with the same (101, 101) kernel.
Should it be split into two kernels, 101x1 and 1x101,

or is the second call just a duplicate?

facexlib/facexlib/utils/face_restoration_helper.py

Line 317 in 9557a45

mask = cv2.GaussianBlur(mask, (101, 101), 11)
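
Some background that may help here: a 2D Gaussian kernel is separable, so a single (101, 101) blur is already equivalent to a 101x1 pass followed by a 1x101 pass. Applying the full blur twice has a different effect from applying it once, because convolving two Gaussians adds their variances. With sigma = 11, two passes act like one blur with sigma of about 11 * sqrt(2) ≈ 15.6 (ignoring truncation at the kernel edge). The pure-Python sketch below checks this numerically in 1D. It does not call OpenCV, and the helper names are illustrative.

```python
import math


def gaussian_kernel(size, sigma):
    """Normalized 1D Gaussian kernel with `size` taps."""
    center = (size - 1) / 2
    weights = [math.exp(-((i - center) ** 2) / (2 * sigma ** 2)) for i in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve(a, b):
    """Full discrete convolution of two 1D sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def std_dev(kernel):
    """Standard deviation of a kernel treated as a probability distribution."""
    mean = sum(i * w for i, w in enumerate(kernel))
    return math.sqrt(sum(((i - mean) ** 2) * w for i, w in enumerate(kernel)))


single = gaussian_kernel(101, 11)
double = convolve(single, single)  # effect of blurring twice along one axis
print(std_dev(single), std_dev(double), 11 * math.sqrt(2))
```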

Can the second argument of detect_faces in get_face_landmarks_5 be made configurable without modifying the source code?

Hello!

In get_face_landmarks_5, the second argument of detect_faces is hard-coded to 0.97, so changing it requires editing the source code. Could it be exposed as a parameter so that it can be changed directly, without modifying utils/face_restoration_helper.py?

Please!

Possible bug in filtering small / angled faces

The code in question is located in face_restoration_helper.py, line 142.

...
for bbox in bboxes:
            # remove faces with too small eye distance: side faces or too small faces
            eye_dist = np.linalg.norm([bbox[6] - bbox[8], bbox[7] - bbox[9]])
            if eye_dist_threshold is not None and (eye_dist < eye_dist_threshold):
                continue
...

If I'm reading the rest of the code right, bbox array will have the following contents:

[bbox_x1, bbox_y1, bbox_x2, bbox_y2, confidence_score, eye_1_x, eye_1_y, eye_2_x, eye_2_y, nose_x, nose_y, lip_1_x, lip_1_y, lip_2_x, lip_2_y]

To calculate the distance between eyes, we should instead use np.linalg.norm([bbox[5] - bbox[7], bbox[6] - bbox[8]]).

Am I missing something?
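
To make the difference concrete, here is a small check on one detection row (the values are those quoted in the "bounding_boxes < 0" report), assuming the layout described above is right:

```python
import math

# One detection row: [x1, y1, x2, y2, score, then five (x, y) landmark pairs].
bbox = [-2.7954, 213.42, 660.27, 979.73, 0.99998,
        172.87, 443.75, 473.29, 436.65, 324.23,
        632.18, 196.31, 775.76, 462.73, 767.62]

# Indices used by the current code: mixes eye_1_y, eye_2_y, eye_2_x and nose_x.
current = math.hypot(bbox[6] - bbox[8], bbox[7] - bbox[9])

# Indices proposed above: (eye_1_x - eye_2_x, eye_1_y - eye_2_y).
proposed = math.hypot(bbox[5] - bbox[7], bbox[6] - bbox[8])

print(round(current, 1), round(proposed, 1))  # 149.2 300.5
```

Both values are far above a threshold of 5 pixels for this face, which may be why the discrepancy rarely changes the outcome in practice.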

Temporal instability with faces on continuous frames

--2411

--2412

--2413

I am having an issue when using GFPGAN on videos. Even though the three frames are consecutive and nearly identical, the GFPGAN output for 2412 looks quite odd around the lips. I suspect the issue is related to facexlib, since the angle of the face in the GFPGAN output under cmp differs for 2412 compared with 2411 and 2413. This leads to a flickering effect when the frames are assembled into a video.
I have opened the same issue in GFPGAN.
Does anyone know how to fix this?

Thanks in advance for help :)
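
One common mitigation for this kind of frame-to-frame jitter, offered as a generic sketch rather than a facexlib feature, is to smooth the detected landmarks over time before computing the alignment, for example with an exponential moving average. `smooth_landmarks` and the `alpha` value are illustrative assumptions.

```python
def smooth_landmarks(frames, alpha=0.5):
    """Exponentially smooth per-frame landmarks.

    `frames` is a list of frames, each a list of (x, y) points in the same order.
    A smaller `alpha` gives stronger smoothing but more lag.
    """
    smoothed = []
    previous = None
    for points in frames:
        if previous is None:
            current = [tuple(p) for p in points]
        else:
            current = [
                (alpha * x + (1 - alpha) * px, alpha * y + (1 - alpha) * py)
                for (x, y), (px, py) in zip(points, previous)
            ]
        smoothed.append(current)
        previous = current
    return smoothed
```

This assumes a single tracked face whose landmarks keep the same order across frames. With several faces, each one would need to be matched between frames first.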

Problem:

In the main README, there is a table with a few internal links:

Link Status	Link
⚠️ Non-existing link	Detection
✅ Working	Alignment
⚠️ Non-existing link	Recognition
⚠️ Non-existing link	Parsing
⚠️ Non-existing link	Matting
⚠️ Non-existing link	Headpose
❓ Just a sketch	Tracking
⚠️ Non-existing link	Assessment
⚠️ Non-existing link	Utils

Most of those links will point to some non-existing README file.

Fix

For each link, we should either:

Create the file
Fix the link to point to an existing file
Remove the link
