
Comments (6)

snosov1 commented on June 17, 2024

@Ilya-Krylov can you comment?

from open_model_zoo.

Ilya-Krylov commented on June 17, 2024

@banderlog Thank you for reporting this. It is indeed a mistake. The outputs should be:

[1x2x192x320] - logits related to text/no-text classification for each pixel.
[1x16x192x320] - logits related to linkage between pixels and their neighbors.

With these shapes, the outputs match the description in the paper. The output blob format is [BATCH_SIZE, CHANNELS_NUMBER, H, W].
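Since both outputs are logits rather than probabilities, a softmax across the channel values of a pixel is one common way to turn them into scores. The sketch below shows this for a single pixel of the text/no-text output in plain Python. The logit values are made up, and the channel order (index 0 for "no text", index 1 for "text") is an assumption that this thread does not confirm.

```python
import math


def softmax(logits):
    """Convert a list of raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one pixel, listed in channel order.
# Assumption: channel 0 is "no text" and channel 1 is "text".
pixel_logits = [-1.2, 2.3]
no_text_prob, text_prob = softmax(pixel_logits)
print(f"text probability: {text_prob:.3f}")
```

For the full [1x2x192x320] blob, the same operation would be applied along the channel axis for every pixel.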


banderlog commented on June 17, 2024

Thank you for the fast answer. I'm glad I was able to help, and that the second output's shape is now the same in all 3 sources. But I still have a problem: text-detection-0001 returns only one output tensor.

Please look at my example below:

import cv2

img = cv2.imread('test.jpg')

td = cv2.dnn.readNet('./text-detection-0001.xml','./text-detection-0001.bin')
blob = cv2.dnn.blobFromImage(img, 1, (768, 1280))
td.setInput(blob)
a, b = td.forward()
>>> ValueError: not enough values to unpack (expected 2, got 1)

And if I check the output's shape:

a = td.forward()
a.shape
>>> (1, 16, 192, 320)

As far as I understand, I still need the logits related to text/no-text classification for each pixel, with shape [1x2x192x320]. Could you comment on this, please? Maybe you need some additional info, such as the cv2.getBuildInformation() output?

Currently I am using a custom build of OpenCV 4.0.1 with the Inference Engine built from the dldt git repository.


dkurt commented on June 17, 2024

@banderlog,

This method returns a single output:

 |  forward(...)
 |      forward([, outputName]) -> retval
 |      .   @brief Runs forward pass to compute output of layer with name @p outputName.
 |      .   *  @param outputName name for layer which output is needed to get
 |      .   *  @return blob for first output of specified layer.
 |      .   *  @details By default runs forward pass for the whole network.

source: https://docs.opencv.org/master/db/d30/classcv_1_1dnn_1_1Net.html#a98ed94cb6ef7063d3697259566da310b

You need to use td.forward(td.getUnconnectedOutLayersNames())
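As a minimal sketch of that suggestion, the helper below requests every unconnected output in one forward pass and keys each blob by its layer name, so the two outputs cannot be mixed up. The name forward_all_outputs is hypothetical. It only relies on the getUnconnectedOutLayersNames() and forward(names) calls mentioned above, and assumes the blobs come back in the same order as the names.

```python
def forward_all_outputs(net):
    """Run one forward pass and return {output layer name: output blob}."""
    names = net.getUnconnectedOutLayersNames()
    blobs = net.forward(names)  # one blob per requested name, in order
    return dict(zip(names, blobs))


# Usage with the network from this thread (not executed here, because it
# needs OpenCV and the model files):
#   td = cv2.dnn.readNet('./text-detection-0001.xml', './text-detection-0001.bin')
#   td.setInput(blob)
#   for name, out in forward_all_outputs(td).items():
#       print(name, out.shape)
```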


banderlog commented on June 17, 2024

Thank you, @dkurt, it worked 😄

a, b = td.forward(td.getUnconnectedOutLayersNames())

a.shape
>>>  (1, 2, 192, 320)
b.shape
>>> (1, 16, 192, 320)
td.getUnconnectedOutLayersNames()
>>> ['pixel_cls/add_2', 'pixel_link/add_2']

May I suggest adding td.forward(td.getUnconnectedOutLayersNames()) to text-detection-0001.md?


banderlog commented on June 17, 2024

Blob creation should be done this way: blob = cv2.dnn.blobFromImage(img, 1, (1280,768)). If you do it as in my example above, you will get no errors, but no comprehensible output either.
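The order matters because the size argument of cv2.dnn.blobFromImage is (width, height), while blob shapes are written as [BATCH_SIZE, CHANNELS_NUMBER, H, W]. The hypothetical helper below sketches the conversion. The input shape [1x3x768x1280] is an assumption inferred from the working call above and is not stated explicitly in this thread.

```python
def blob_size_from_input_shape(input_shape):
    """Map an NCHW input shape to the (width, height) tuple used as `size`."""
    _, _, height, width = input_shape
    return (width, height)


# Assumption: the network expects an input blob of shape [1x3x768x1280].
size = blob_size_from_input_shape((1, 3, 768, 1280))
print(size)  # (1280, 768)
```

The resulting tuple would then be passed as the third argument, for example cv2.dnn.blobFromImage(img, 1, size).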
