
Comments (7)

edwinRNDR commented on June 20, 2024

Thanks for the links @PINTO0309

The test script seems to just load the model with TensorFlow Lite, then loads an OpenVINO version and uses OpenVINO for the inference. I have adjusted the script to use tflite for the inference as well, which gives me output similar to my previous attempts with TensorFlow/JVM.

(screenshot: tflite inference output)

The modified script:

import os
import cv2
import numpy as np
import time
try:
    from tflite_runtime.interpreter import Interpreter
except ImportError:
    from tensorflow.lite.python.interpreter import Interpreter

fps = ""
detectfps = ""
framecount = 0
detectframecount = 0
time1 = 0
time2 = 0

def resize_and_pad(img, size, pad_color=0):
    h, w = img.shape[:2]
    sh, sw = size
    # interpolation method
    if h > sh or w > sw: # shrinking image
        interp = cv2.INTER_AREA
    else: # stretching image
        interp = cv2.INTER_CUBIC
    # aspect ratio of image
    aspect = w/h  # if on Python 2, you might need to cast as a float: float(w)/h
    # compute scaling and pad sizing
    if aspect > 1: # horizontal image
        new_w = sw
        new_h = np.round(new_w/aspect).astype(int)
        pad_vert = (sh-new_h)/2
        pad_top, pad_bot = np.floor(pad_vert).astype(int), np.ceil(pad_vert).astype(int)
        pad_left, pad_right = 0, 0
    elif aspect < 1: # vertical image
        new_h = sh
        new_w = np.round(new_h*aspect).astype(int)
        pad_horz = (sw-new_w)/2
        pad_left, pad_right = np.floor(pad_horz).astype(int), np.ceil(pad_horz).astype(int)
        pad_top, pad_bot = 0, 0
    else: # square image
        new_h, new_w = sh, sw
        pad_left, pad_right, pad_top, pad_bot = 0, 0, 0, 0
    # set pad color
    if len(img.shape) == 3 and not isinstance(pad_color, (list, tuple, np.ndarray)): # color image but only one color provided
        pad_color = [pad_color]*3
    # scale and pad
    scaled_img = cv2.resize(img, (new_w, new_h), interpolation=interp)
    scaled_img = cv2.copyMakeBorder(scaled_img, pad_top, pad_bot, pad_left, pad_right, borderType=cv2.BORDER_CONSTANT, value=pad_color)
    return scaled_img


if __name__ == '__main__':
    camera_index  = 0
    camera_width  = 640
    camera_height = 480

    model_scale = 320

    # TensorFlow Lite
    interpreter = Interpreter(model_path='u2netp_{}x{}_float32.tflite'.format(model_scale, model_scale), num_threads=4)
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()[0]['index']
    output_details = interpreter.get_output_details()[0]['index']

    # Init Camera
    cam = cv2.VideoCapture(camera_index)
    cam.set(cv2.CAP_PROP_FPS, 30)
    cam.set(cv2.CAP_PROP_FRAME_WIDTH, camera_width)
    cam.set(cv2.CAP_PROP_FRAME_HEIGHT, camera_height)
    window_name = "USB Camera"
    cv2.namedWindow(window_name, cv2.WINDOW_AUTOSIZE)

    while True:
        start_time = time.perf_counter()

        ret, raw_image = cam.read()
        if not ret:
            continue

        image = resize_and_pad(raw_image, (model_scale, model_scale))
        image = image.astype(np.float32)
        image = np.expand_dims(image, axis=0)
        image = image / 127.5 - 1.0

        interpreter.set_tensor(input_details, image)

        interpreter.invoke()
        output = interpreter.get_tensor(output_details)

        output = (output + 1) * 127.5
        output = output[0]
        output = output.astype(np.uint8)
        output = resize_and_pad(output, (camera_width, camera_width))
        cv2.putText(output, detectfps, (camera_width - 175, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (38, 0, 255), 1, cv2.LINE_AA)
        cv2.imshow('USB Camera', output)

        if cv2.waitKey(1)&0xFF == ord('q'):
            break

        # FPS calculation
        detectframecount += 1
        framecount += 1
        if framecount >= 10:
            fps = "(Playback) {:.1f} FPS".format(time1 / 10)
            detectfps = "(Detection) {:.1f} FPS".format(detectframecount / time2)
            framecount = 0
            detectframecount = 0
            time1 = 0
            time2 = 0
        end_time = time.perf_counter()
        elapsedTime = end_time - start_time
        time1 += 1 / elapsedTime
        time2 += elapsedTime

Some md5sums:

a3506ec64c3856cfaa1124e64d33d373  u2netp_320x320_float32.h5
5a0be1148fc7f1b8dfffafc3d16f4339  u2netp_320x320_float32.pb
b3f9726a04dbd91b232c745671fd6444  u2netp_320x320_float32.tflite
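If it helps, the downloaded files can be verified against these checksums with a short hashlib sketch (only the tflite entry is wired up here; the other two files from the listing can be added the same way):

```python
import hashlib

def md5sum(path, chunk_size=8192):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

expected = {
    "u2netp_320x320_float32.tflite": "b3f9726a04dbd91b232c745671fd6444",
}

for name, md5 in expected.items():
    try:
        print(name, "OK" if md5sum(name) == md5 else "MISMATCH")
    except FileNotFoundError:
        print(name, "missing")
```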

from pinto_model_zoo.

PINTO0309 commented on June 20, 2024

@edwinRNDR
Thank you for the validation.
There seems to be a mistake in the model conversion, so I will review it. Please give me some time.


PINTO0309 commented on June 20, 2024

@edwinRNDR
It depends on the properties of the model and of the frameworks before and after conversion. My goal in these model conversions is integer quantization in TensorFlow Lite, but TFLite does not currently allow variable input resolutions. Without that constraint I would prefer to generate models with variable-size inputs; I have been forced to keep the input resolution fixed in order to centralize the model conversion workflow.
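For what it's worth, the TFLite Python `Interpreter` does expose `resize_tensor_input`, which can resize a float model's input before `allocate_tensors()`; whether allocation succeeds depends on every op in the graph accepting the new shape, and it does not help the integer-quantized path, where parameters are calibrated at a fixed resolution. A hedged sketch (the model filename matches the script above; the helper name is made up for illustration):

```python
import os
import numpy as np
try:
    from tflite_runtime.interpreter import Interpreter
except ImportError:
    try:
        from tensorflow.lite.python.interpreter import Interpreter
    except ImportError:
        Interpreter = None  # TFLite not installed; sketch only

def run_at_resolution(model_path, size):
    """Hypothetical helper: resize a float TFLite model's input to size x size.

    allocate_tensors() raises if any op in the graph cannot handle the
    new shape, so this is best-effort rather than guaranteed.
    """
    interpreter = Interpreter(model_path=model_path)
    input_index = interpreter.get_input_details()[0]['index']
    interpreter.resize_tensor_input(input_index, [1, size, size, 3])
    interpreter.allocate_tensors()  # may raise for shape-incompatible ops
    dummy = np.zeros((1, size, size, 3), dtype=np.float32)
    interpreter.set_tensor(input_index, dummy)
    interpreter.invoke()
    return interpreter.get_tensor(interpreter.get_output_details()[0]['index'])

if Interpreter is not None and os.path.exists('u2netp_320x320_float32.tflite'):
    out = run_at_resolution('u2netp_320x320_float32.tflite', 480)
    print(out.shape)
```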


PINTO0309 commented on June 20, 2024

I've committed some sample code here that I've tested.
https://github.com/PINTO0309/PINTO_model_zoo/blob/master/061_U-2-Net/01_float32/80_test_tflite_openvino.py

Below is a video of the results of the test.
https://twitter.com/PINTO03091/status/1312553042693111808?s=20

import os
import cv2
import numpy as np
import time
try:
    from tflite_runtime.interpreter import Interpreter
except ImportError:
    from tensorflow.lite.python.interpreter import Interpreter

try:
    from armv7l.openvino.inference_engine import IECore  # the script only uses IECore
except ImportError:
    from openvino.inference_engine import IECore

fps = ""
detectfps = ""
framecount = 0
detectframecount = 0
time1 = 0
time2 = 0

def resize_and_pad(img, size, pad_color=0):
    h, w = img.shape[:2]
    sh, sw = size
    # interpolation method
    if h > sh or w > sw: # shrinking image
        interp = cv2.INTER_AREA
    else: # stretching image
        interp = cv2.INTER_CUBIC
    # aspect ratio of image
    aspect = w/h  # if on Python 2, you might need to cast as a float: float(w)/h
    # compute scaling and pad sizing
    if aspect > 1: # horizontal image
        new_w = sw
        new_h = np.round(new_w/aspect).astype(int)
        pad_vert = (sh-new_h)/2
        pad_top, pad_bot = np.floor(pad_vert).astype(int), np.ceil(pad_vert).astype(int)
        pad_left, pad_right = 0, 0
    elif aspect < 1: # vertical image
        new_h = sh
        new_w = np.round(new_h*aspect).astype(int)
        pad_horz = (sw-new_w)/2
        pad_left, pad_right = np.floor(pad_horz).astype(int), np.ceil(pad_horz).astype(int)
        pad_top, pad_bot = 0, 0
    else: # square image
        new_h, new_w = sh, sw
        pad_left, pad_right, pad_top, pad_bot = 0, 0, 0, 0
    # set pad color
    if len(img.shape) == 3 and not isinstance(pad_color, (list, tuple, np.ndarray)): # color image but only one color provided
        pad_color = [pad_color]*3
    # scale and pad
    scaled_img = cv2.resize(img, (new_w, new_h), interpolation=interp)
    scaled_img = cv2.copyMakeBorder(scaled_img, pad_top, pad_bot, pad_left, pad_right, borderType=cv2.BORDER_CONSTANT, value=pad_color)
    return scaled_img


if __name__ == '__main__':

    camera_index  = 0
    camera_width  = 640
    camera_height = 480

    model_scale = 320

    # TensorFlow Lite
    interpreter = Interpreter(model_path='u2netp_{}x{}_float32.tflite'.format(model_scale, model_scale), num_threads=4)
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()[0]['index']
    output_details = interpreter.get_output_details()[0]['index']

    # OpenVINO
    model_xml = 'openvino/{}x{}/FP32/u2netp_{}x{}.xml'.format(model_scale, model_scale, model_scale, model_scale)
    model_bin = os.path.splitext(model_xml)[0] + ".bin"
    ie = IECore()
    net = ie.read_network(model=model_xml, weights=model_bin)
    input_blob = next(iter(net.inputs))
    exec_net = ie.load_network(network=net, device_name='CPU')

    # Init Camera
    cam = cv2.VideoCapture(camera_index)
    cam.set(cv2.CAP_PROP_FPS, 30)
    cam.set(cv2.CAP_PROP_FRAME_WIDTH, camera_width)
    cam.set(cv2.CAP_PROP_FRAME_HEIGHT, camera_height)
    window_name = "USB Camera"
    cv2.namedWindow(window_name, cv2.WINDOW_AUTOSIZE)

    while True:
        start_time = time.perf_counter()

        ret, raw_image = cam.read()
        if not ret:
            continue

        image = resize_and_pad(raw_image, (model_scale, model_scale))
        image = image.astype(np.float32)
        image = np.expand_dims(image, axis=0)
        image = image.transpose((0, 3, 1, 2))
        image = image / 127.5 - 1.0
        output = exec_net.infer(inputs={input_blob: image})
        output = (output['sigmd0'] + 1) * 127.5
        output = output.transpose((0, 2, 3, 1))[0]
        output = output.astype(np.uint8)
        output = resize_and_pad(output, (camera_width, camera_width))
        cv2.putText(output, detectfps, (camera_width - 175, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (38, 0, 255), 1, cv2.LINE_AA)
        cv2.imshow('USB Camera', output)

        if cv2.waitKey(1)&0xFF == ord('q'):
            break

        # FPS calculation
        detectframecount += 1
        framecount += 1
        if framecount >= 10:
            fps = "(Playback) {:.1f} FPS".format(time1 / 10)
            detectfps = "(Detection) {:.1f} FPS".format(detectframecount / time2)
            framecount = 0
            detectframecount = 0
            time1 = 0
            time2 = 0
        end_time = time.perf_counter()
        elapsedTime = end_time - start_time
        time1 += 1 / elapsedTime
        time2 += elapsedTime


PINTO0309 commented on June 20, 2024

@edwinRNDR
I used my automatic conversion script to re-convert it. I will formally update this repository at a later date.

(animated GIF: output of the re-converted model)


edwinRNDR commented on June 20, 2024

Thanks a lot @PINTO0309, this works like a charm!

Just out of curiosity: I see you generate the models for a range of input dimensions, but is it possible to give the model placeholder inputs of undetermined/varying size?


PINTO0309 commented on June 20, 2024

@edwinRNDR
All models except for the Full Integer Quantization and EdgeTPU models have been replaced and recommitted. e6189e5
https://github.com/PINTO0309/PINTO_model_zoo/tree/master/061_U-2-Net

I am closing this issue. If you have any other problems, please post the issue again.

