ashwhall / dsnt

An unofficial TensorFlow implementation of the DSNT layer, as described in the paper "Numerical Coordinate Regression with Convolutional Neural Networks".

Home Page: https://arxiv.org/abs/1801.07372

License: Apache License 2.0

Python 100.00%

deep-learning neural-network python tensorflow

dsnt's Introduction

Ash Hall's GitHub

Machine Learning Research Engineer / Software Developer

dsnt's Issues

Does this implementation support multiple activation maps?

Suppose I want to localize 5 objects with 5 (x, y) pairs. Can I do that?
I looked at the source, and it seems that this use case is not supported.
If it is not supported, how can I extend it to handle multiple activation maps?
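
For illustration, here is a minimal NumPy sketch of how a DSNT-style transform can handle one heatmap per keypoint. It is not this repository's code: the function names `softmax2d` and `dsnt_multi` and the [batch, height, width, channels] layout are assumptions made for the example. The coordinate grid follows the (-1, 1) pixel-centre convention from the paper.

```python
import numpy as np

def softmax2d(logits):
    """Softmax over the spatial axes of a [batch, height, width, channels] array."""
    # Subtract the per-map maximum for numerical stability.
    shifted = logits - logits.max(axis=(1, 2), keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=(1, 2), keepdims=True)

def dsnt_multi(logits):
    """Return normalised heatmaps and [batch, channels, 2] (x, y) coordinates."""
    _, height, width, _ = logits.shape
    heatmaps = softmax2d(logits)
    # Pixel-centre coordinates in (-1, 1): (2j - (n + 1)) / n for j = 1..n.
    xs = (2.0 * np.arange(1, width + 1) - (width + 1)) / width
    ys = (2.0 * np.arange(1, height + 1) - (height + 1)) / height
    # Expected x and y under each channel's heatmap, computed independently.
    x = np.einsum('bhwc,w->bc', heatmaps, xs)
    y = np.einsum('bhwc,h->bc', heatmaps, ys)
    return heatmaps, np.stack([x, y], axis=-1)
```

Because the softmax and the expectation are taken per channel, each of the 5 maps yields its own (x, y) pair and the maps do not interact.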

dsnt in testing phase

Thanks for the code!

I am doing pose estimation and have run into a problem: the heatmap for predicting the left wrist also fires a small response on the right wrist. In other words, the heatmap has two peaks, a strong one on the left wrist and a weak one on the right wrist.

The two-peak problem makes the coordinate predicted by DSNT incorrect.
I understand the argument in the paper that a two-peak heatmap should be penalised by the loss, but in the testing phase ambiguity still happens sometimes.

Do you have any suggestions?
Thanks!
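
To make the effect concrete, the following NumPy sketch shows how a weak second peak pulls the DSNT output away from the strong peak. The output is the expected coordinate under the normalised heatmap, so every pixel with non-zero mass contributes. The helper `expected_x` and the toy 1x4 heatmap are hypothetical and only illustrate the arithmetic.

```python
import numpy as np

def expected_x(heatmap):
    """Expected x coordinate in (-1, 1) of a normalised [height, width] heatmap."""
    _, width = heatmap.shape
    xs = (2.0 * np.arange(1, width + 1) - (width + 1)) / width
    # Sum out the rows, then take the probability-weighted mean of x.
    return float((heatmap.sum(axis=0) * xs).sum())

# Toy heatmap: a strong peak at x = -0.75 and a weak peak at x = +0.75.
heatmap = np.zeros((1, 4))
heatmap[0, 0] = 0.8
heatmap[0, 3] = 0.2

# 0.8 * (-0.75) + 0.2 * 0.75 = -0.45, so the estimate is shifted
# towards the weak peak instead of sitting on the strong one.
shifted_estimate = expected_x(heatmap)
```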

Question about softmax

I think _softmax2d should be implemented as:
tf.exp(logits) / tf.reduce_sum(tf.exp(logits), [1, 2], keepdims=True)

Why does _softmax2d in dsnt.py subtract the maximum value from the input first, as below?
max_axis = tf.reduce_max(target, axes, keepdims=True)
target_exp = tf.exp(target - max_axis)
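
Subtracting the maximum is the standard numerical-stability trick for softmax: the shift cancels between numerator and denominator, so the result is mathematically unchanged, but the largest exponent becomes 0 and the exponential cannot overflow. The NumPy sketch below compares the two forms. The function names are made up for the comparison and are not the repository's.

```python
import numpy as np

def softmax2d_naive(logits):
    """Direct softmax over the spatial axes; overflows for large logits."""
    exp = np.exp(logits)
    return exp / exp.sum(axis=(1, 2), keepdims=True)

def softmax2d_stable(logits):
    """Same result, but shifted so the largest exponent is exp(0) = 1."""
    shifted = logits - logits.max(axis=(1, 2), keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=(1, 2), keepdims=True)
```

For moderate logits the two functions agree to floating-point precision. For logits around 1000, `np.exp` returns `inf` in the naive version and the division produces `nan`, while the stable version still returns a valid distribution.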

How to get confidence of the predicted point

I am using DSNT to predict coordinates. Is there any way to get the confidence of a predicted point?

Expected ground truth labels range unclear

Hi,
Thanks for this TF implementation. From the code and the functions' docstrings, it is unclear to me what range is expected for the ground-truth coordinates: in the Gaussian generation function, the coordinates are clearly expected to be between 0 and 1, whereas in the computation of dsnt_x and dsnt_y, the coordinate values are between -1 and 1. This will produce output coordinates between -1 and 1. So, what range should I pick for my labels?
Best regards,
Pierre
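
Whichever range the library expects, converting between the two conventions mentioned in the question is a simple affine map. The helpers below are a hypothetical sketch (the names `unit_to_signed` and `signed_to_unit` are not part of this repository) and assume labels are stored as floating-point arrays.

```python
import numpy as np

def unit_to_signed(coords):
    """Map coordinates from [0, 1] to [-1, 1]."""
    return 2.0 * np.asarray(coords, dtype=float) - 1.0

def signed_to_unit(coords):
    """Map coordinates from [-1, 1] back to [0, 1]."""
    return (np.asarray(coords, dtype=float) + 1.0) / 2.0
```

The important point is to use one convention consistently for the labels and for the coordinates they are compared against in the loss.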

Defining loss function

How do I define my loss function when it's my understanding that custom loss functions are only supposed to take 2 inputs, y_true and y_pred? For example here those are targets and coords:

def loss_function(targets, coords):
    loss_1 = tf.losses.mean_squared_error(targets, coords)
    loss_2 = dsnt.js_reg_loss(norm_heatmaps, targets)
    loss = loss_1 + loss_2
    return loss

But how do I get norm_heatmaps passed to the loss function?
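
One general pattern for this situation is a closure: a factory function receives the extra value and returns a two-argument loss that captures it. The sketch below shows the pattern in plain NumPy with a stand-in regulariser. `make_loss` and `regulariser` are hypothetical names, and whether a captured heatmap tensor works in a given Keras or TensorFlow version is an assumption to check (Keras models also provide `add_loss` for losses that depend on intermediate tensors).

```python
import numpy as np

def make_loss(norm_heatmaps, regulariser):
    """Build a two-argument loss that also sees `norm_heatmaps`.

    `regulariser` is any callable taking (heatmaps, targets); it stands in
    for a heatmap regularisation term and is an assumption of this sketch.
    """
    def loss_function(targets, coords):
        # Coordinate regression term.
        loss_1 = np.mean((np.asarray(targets) - np.asarray(coords)) ** 2)
        # Heatmap term, using the value captured from the enclosing scope.
        loss_2 = regulariser(norm_heatmaps, targets)
        return loss_1 + loss_2
    return loss_function
```

The returned `loss_function` keeps the `(targets, coords)` signature that a training framework expects, while the heatmaps travel along inside the closure.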

more than one points

If there is more than one point, the heatmaps have shape [batch, width, height, P_num].
How do I build the DSNT layer in that case?

I tried it, but it failed!

Add license

Hi!
Your repository currently doesn't have any license attached. Is this intentional? Without a license, nobody has permission to use, modify or share this code - check Choose a Licence.

Where is the Pytorch based implementation?

Hi @ashwhall !
Thanks for your great work!
I have read your paper "Numerical Coordinate Regression with Convolutional Neural Networks", and I remember it says that your code is "written in PyTorch, and is available online".
I am more familiar with PyTorch, so could you help me find a PyTorch version of your code? That would also help me a lot with my own work.

Details in implementing the Gaussian kernel

Hi ashwhall,

Thanks for sharing the code!
May I ask why, when implementing the Gaussian kernel, you multiply '((x-x0)**2 + (y-y0)**2) / fwhm**2' by 4*tf.log(2.)? I understand the term '((x-x0)**2 + (y-y0)**2) / fwhm**2', but why choose 4 and log 2? (Line 125, _make_gaussian function in the dsnt.py file.)
It probably would not affect the results greatly; I am just curious.

Many thanks,
C
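
The factor 4 ln 2 is what makes the parameter literally a full width at half maximum: with exp(-4 ln 2 * r^2 / fwhm^2), the value at distance r = fwhm / 2 from the centre is exp(-ln 2) = 1/2, so the curve is at half its peak height across a total width of fwhm. Equivalently, fwhm = 2 * sqrt(2 ln 2) * sigma for a Gaussian with standard deviation sigma. The one-dimensional NumPy check below is a sketch with a made-up function name, not the repository's _make_gaussian.

```python
import numpy as np

def gaussian_1d(x, x0, fwhm):
    """Unnormalised Gaussian with peak value 1 at x0, parameterised by FWHM."""
    return np.exp(-4.0 * np.log(2.0) * (x - x0) ** 2 / fwhm ** 2)

fwhm = 3.0
x0 = 10.0
# Half a FWHM on either side of the centre gives exactly half the peak value.
left = gaussian_1d(x0 - fwhm / 2.0, x0, fwhm)
right = gaussian_1d(x0 + fwhm / 2.0, x0, fwhm)

# The same curve written with a standard deviation instead of a FWHM.
sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
x = np.linspace(0.0, 20.0, 41)
same_curve = np.exp(-((x - x0) ** 2) / (2.0 * sigma ** 2))
```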