Comments (11)
Hi Michael!
Have a look at the model metadata in the release notes, particularly the `outputTensorRepresentation`. You can also look at the code for `DetectionEngine`, which parses this kind of tensor.
from thermal-face.
Thank you for the quick response!
I'm not too sure how to use the code in `DetectionEngine`. The `outputTensorRepresentation` in the release notes looks like what I need; could you explain a little further what it means? I see "maxDetections": 500, but I'm not quite sure what that means. The output tensor simply contains float numbers, so I'm not sure how that relates to "bounding_boxes", "class_labels", "class_confidences", and "num_of_boxes", which are the labels under `outputTensorRepresentation`.
I hope I am not missing something simple--I am a student coder. I really appreciate the help!
Sure! I was suggesting looking at `DetectionEngine` not necessarily to use that code but to see how they get the bounding boxes from the tensor. The model is one that supports detection as well as classification, so you can ignore the class label since it will always be the same. A quick reading of the code suggests that the coordinates of the bounding boxes are encoded as successive 4-tuples of floats where 1 means the full image width or height.
Thanks for the reply! Could you explain what you mean by "where 1 means the full image width or height"?
It looked to me like the bounding box coordinates are relative to the image size, with [0, 0] being the top left and [1, 1] being the bottom right, so you'd have to translate them back into pixels by multiplying the x and y values by width and height, respectively.
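As a minimal sketch of that conversion, assuming the values really are fractions of the image size (the helper name `to_pixel` is hypothetical and is not part of thermal-face or `DetectionEngine`):

```python
def to_pixel(x_rel, y_rel, width, height):
    """Map relative coordinates in [0, 1] to integer pixel coordinates."""
    # x scales with the image width, y with the image height.
    return int(round(x_rel * width)), int(round(y_rel * height))


# Example: the centre of a 640x480 image.
print(to_pixel(0.5, 0.5, 640, 480))  # (320, 240)
```

Applying this to both corners of a box gives the two points needed to draw a rectangle.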
I see, thank you. Do you know why there are 500 sets of 4-tuples? I tried creating a bounding box using the first 4-tuple in the way you described, and it doesn't really seem correct. The bounding box doesn't bound my face most of the time--it floats around in space.
Could you post the raw tensor output you're seeing?
Yes, thanks for the reply. I made a real-time version of the code that attempts to draw a bounding box on a live video feed. As I played with the code a bit more, it occurred to me that the bounding box being drawn seemed to have some relationship with the position of my face, just not the correct one.
I tried a few more combinations of (X, Y) coordinates from the raw tensor output data, and realized that the bounding box is correct if I use this combination:
```python
h, w, ch = image.shape
y1 = int(results[0][0] * w)
x1 = int(results[0][1] * h)
y2 = int(results[0][2] * w)
x2 = int(results[0][3] * h)
cv2.rectangle(image, (x1, y1), (x2, y2), (255, 0, 0), 0)
```
I believe this means each row of the output array represents a detection (the 1st row is the highest confidence detection), and the 4 columns represent upper left corner Y, upper left corner X, lower right corner Y, lower right corner X, in that order.
That ended up being quite simple; I thought I had tried this already. Thank you for this help!
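The row layout described above (upper-left Y, upper-left X, lower-right Y, lower-right X) can be applied to every row at once. The sketch below is an illustration under that assumption, not code from thermal-face; `boxes_to_pixels` is a hypothetical name. Note that it pairs x with the width and y with the height, whereas the snippet above multiplies y by `w` and x by `h`; the two only agree when the image is square.

```python
import numpy as np


def boxes_to_pixels(boxes, width, height):
    """Convert relative [ymin, xmin, ymax, xmax] rows to pixel [x1, y1, x2, y2] rows."""
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
    ymin, xmin, ymax, xmax = boxes.T
    pixels = np.stack(
        [xmin * width, ymin * height, xmax * width, ymax * height], axis=1
    )
    # Clip to the image bounds in case a prediction falls outside [0, 1].
    limits = np.array([width, height, width, height], dtype=np.float32)
    return np.clip(pixels, 0, limits).astype(int)


# One box covering the right half of a 640x480 image, vertically centred.
print(boxes_to_pixels([[0.25, 0.5, 0.75, 1.0]], 640, 480))
```

Each resulting row can then be passed to `cv2.rectangle` as `(x1, y1)` and `(x2, y2)`.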
I do have one more question. The multi-person detection works really well. However, all 500 detections in the raw output tensor contain actual coordinates, but they don't contain any confidence levels. If, for example, there are 2 people in frame, the first 2 detections in the array very accurately bound the 2 people's faces, but the other 498 detections capture random background details. Is there any way to distinguish between detections of faces and "filler" detections of background details?
This suggests that part of the output tensor contains the number of detections. I assume that only that many bounding boxes are valid and the rest are noise.
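If that assumption holds, keeping only the valid boxes is a matter of slicing the first rows. A minimal sketch, where `valid_boxes` is a hypothetical helper and the count is assumed to arrive as a number or a one-element array:

```python
import numpy as np


def valid_boxes(boxes, num_detections):
    """Keep only the first `num_detections` rows of an (N, 4) box array."""
    boxes = np.asarray(boxes).reshape(-1, 4)
    # The count may be a float or a one-element array, so normalise it first.
    count = int(np.asarray(num_detections).reshape(-1)[0])
    # Guard against a count that is negative or larger than the array.
    count = max(0, min(count, len(boxes)))
    return boxes[:count]


padded = np.zeros((500, 4), dtype=np.float32)
print(valid_boxes(padded, np.array([2.0])).shape)  # (2, 4)
```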
Hi, thank you for the response! I see how it works with `DetectionEngine`. However, I'm using the TensorFlow Lite API; do you know how I can achieve the same thing with the TensorFlow Lite API?
For reference, this is my current code to run an image through the model and retrieve the output:
```python
interpreter = tflite.Interpreter(model_path=args.model_file)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
input_shape = input_details[0]['shape']
output_shape = output_details[0]['shape']
height = input_shape[1]
width = input_shape[2]
img = Image.open(args.image).convert('RGB').resize((width, height))
input_data = np.expand_dims(img, axis=0)
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]['index'])
```
`output_data` is an array with shape (1, 500, 4), so I'm not sure where to find the number of "candidates". Do you know how I could achieve the equivalent of `num_candidates = raw_result[self._tensor_start_index[3]]` using the TFLite API?
Thanks.
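One possible starting point, offered as a sketch rather than a confirmed answer: the code above reads only `output_details[0]`, while the four names under `outputTensorRepresentation` suggest the model may expose four output tensors. Assuming it does, and assuming they come in the listed order (boxes, class labels, confidences, number of boxes), they could be unpacked as below. The function name `split_detection_outputs` is hypothetical, and the order should be verified by printing `output_details`.

```python
import numpy as np


def split_detection_outputs(tensors):
    """Unpack four output tensors and keep only the valid detections.

    Assumed order: boxes (1, N, 4), labels (1, N), scores (1, N), count (1,).
    """
    if len(tensors) != 4:
        raise ValueError(f"expected 4 output tensors, got {len(tensors)}")
    boxes, labels, scores, count = (np.asarray(t) for t in tensors)
    n = int(count.reshape(-1)[0])
    return (
        boxes.reshape(-1, 4)[:n],
        labels.reshape(-1)[:n],
        scores.reshape(-1)[:n],
    )


# With a real interpreter, the list would be built after invoke() as:
#   tensors = [interpreter.get_tensor(d['index']) for d in output_details]
# Stand-in arrays are used here so the sketch runs on its own.
tensors = [
    np.zeros((1, 500, 4), dtype=np.float32),
    np.zeros((1, 500), dtype=np.float32),
    np.ones((1, 500), dtype=np.float32),
    np.array([2.0], dtype=np.float32),
]
boxes, labels, scores = split_detection_outputs(tensors)
print(boxes.shape, labels.shape, scores.shape)
```

If `len(output_details)` turns out to be 1, the model file contains only the box tensor and this approach does not apply.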