Comments (12)
bbox_real means the real-world size of the target object. For example, I assume a human occupies about 2000 millimeters x 2000 millimeters. If you want to train RootNet on FreiHAND, you might want to set bbox_real to (300, 300), as a hand occupies about 300 millimeters x 300 millimeters. Please be careful with the unit: bbox_real must use the same unit as the GT root depth.
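One rough way to check the unit, as a sketch only. It assumes the hand is closer than 10 meters to the camera, so a median GT root depth above 10 can only be in millimeters. The names `guess_depth_unit` and `bbox_real_for_hand`, the threshold, and the sample depths are all hypothetical and not part of RootNet.

```python
from statistics import median


def guess_depth_unit(root_depths, threshold=10.0):
    """Guess whether GT root depths are in meters or millimeters.

    Assumption: the subject is closer than `threshold` meters to the camera,
    so a median depth above `threshold` is taken to be in millimeters.
    """
    if not root_depths:
        raise ValueError("root_depths is empty")
    return "millimeter" if median(root_depths) > threshold else "meter"


def bbox_real_for_hand(unit):
    """Return bbox_real for a roughly 300 mm x 300 mm hand box in `unit`."""
    side = {"millimeter": 300.0, "meter": 0.3}[unit]
    return (side, side)


# Made-up depth values, not taken from any dataset.
unit = guess_depth_unit([0.62, 0.71, 0.58])
print(unit, bbox_real_for_hand(unit))
```

Looking at a handful of GT root depths this way before training is cheaper than discovering a factor-of-1000 error in the predictions afterwards.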
from 3dmppe_rootnet_release.
Thanks for your rapid reply. I would like to use your pre-trained model on the FreiHAND dataset, and I don't know whether I should set bbox_real=0.3 or bbox_real=300.
Is the model you want to use pre-trained on human datasets? If so, you can't use it for hands; you would need to train it again on hand data. Please look at the FreiHAND dataset and decide whether to set 0.3 or 300, based on the unit of the GT root depth in FreiHAND.
I want to use the model downloaded from here, and I am not sure whether it is the model pre-trained on the FreiHAND dataset.
I see. You can use that one, as it is pre-trained on FreiHAND.
OK, got it. Thanks for your patient reply again.
If you set bbox_real to 0.3, the output root depth is in meters. If you set it to 300, the output root depth is in millimeters.
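A small sketch of why the unit carries through to the output. It assumes the relation from the RootNet paper, k = sqrt(f_x * f_y * A_real / A_img), and that the predicted depth is gamma * k, as in the `forward` code posted further down this thread. The focal lengths, image box area, and gamma below are made-up values.

```python
import math


def k_value(focal, bbox_real, bbox_img_area):
    """k = sqrt(fx * fy * A_real / A_img); k has the same unit as bbox_real."""
    fx, fy = focal
    area_real = bbox_real[0] * bbox_real[1]
    return math.sqrt(fx * fy * area_real / bbox_img_area)


focal = (1500.0, 1500.0)        # focal lengths in pixels (made-up)
bbox_img_area = 200.0 * 200.0   # bounding-box area in pixels^2 (made-up)
gamma = 1.1                     # unitless correction factor (made-up)

depth_mm = gamma * k_value(focal, (300, 300), bbox_img_area)   # bbox_real in mm
depth_m = gamma * k_value(focal, (0.3, 0.3), bbox_img_area)    # bbox_real in m
print(depth_mm, depth_m)
```

Because k scales linearly with the side length of bbox_real, the two results differ by exactly a factor of 1000.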
OK, got it. Thanks.
Hi,
When I try to load the above-mentioned pre-trained weights, the released checkpoint seems inconsistent with the code in this repo: some keys and weights are missing, and some other things are stored in the model dict. I tried modifying the RootNet code so that the weights load normally, but I got other errors. Could you please provide the code that corresponds to the pre-trained model?
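When a checkpoint will not load, it can help to compare key names before editing the model. The sketch below works on plain lists of key names so it runs without PyTorch; in practice the inputs would come from `model.state_dict().keys()` and from the keys of the loaded checkpoint's weight dict. The helper names and the example keys are hypothetical, and stripping a leading `module.` assumes the weights were saved from an `nn.DataParallel` wrapper.

```python
def strip_prefix(keys, prefix="module."):
    """Drop a leading wrapper prefix (e.g. from nn.DataParallel) from each key."""
    return [k[len(prefix):] if k.startswith(prefix) else k for k in keys]


def diff_keys(model_keys, ckpt_keys):
    """Return (missing, unexpected) key names, both sorted.

    missing:    keys the model expects but the checkpoint lacks
    unexpected: keys the checkpoint has but the model does not define
    """
    model_set = set(strip_prefix(model_keys))
    ckpt_set = set(strip_prefix(ckpt_keys))
    return sorted(model_set - ckpt_set), sorted(ckpt_set - model_set)


# Hypothetical key names for illustration only.
model_keys = ["root_net.deconv_layers.0.weight", "root_net.xy_layer.weight"]
ckpt_keys = ["module.root_net.xy_deconv.0.weight", "module.root_net.xy_conv.0.weight"]

missing, unexpected = diff_keys(model_keys, ckpt_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

The unexpected names hint at how the layers were named when the checkpoint was saved, which is the information needed to rename attributes in the model.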
Sorry, I don't have the code now :( Why don't you just use the predicted outputs of RootNet on FreiHAND? I made them publicly available: https://drive.google.com/file/d/1l1imjCHugUOoTHdL7so9ySXyNw26a0AK/view?usp=sharing
OK. I asked because I want to evaluate the pre-trained model on images captured in the wild. Anyway, I will try to handle this problem myself. Thanks again for your patient reply.
I changed the code of RootNet to the following:
import torch
import torch.nn as nn
from torch.nn import functional as F

from config import cfg  # the repo's config object; provides cfg.output_shape


class RootNet(nn.Module):
    def __init__(self):
        super(RootNet, self).__init__()
        self.inplanes = 2048   # channels of the backbone feature map
        self.outplanes = 256
        self.xy_deconv = self._make_deconv_layer(3)
        self.xy_conv = nn.Sequential(nn.Conv2d(in_channels=self.outplanes, out_channels=1, kernel_size=1, stride=1, padding=0))
        self.gamma_layer = nn.Sequential(nn.Linear(self.inplanes, 512), nn.ReLU(inplace=True), nn.Linear(512, 1))

    def _make_deconv_layer(self, num_layers):
        layers = []
        inplanes = self.inplanes
        outplanes = self.outplanes
        for i in range(num_layers):
            layers.append(
                nn.ConvTranspose2d(in_channels=inplanes,
                                   out_channels=outplanes,
                                   kernel_size=4,
                                   stride=2,
                                   padding=1,
                                   output_padding=0,
                                   bias=False))
            layers.append(nn.BatchNorm2d(outplanes))
            layers.append(nn.ReLU(inplace=True))
            inplanes = outplanes
        return nn.Sequential(*layers)

    def forward(self, x, k_value):
        # x, y: soft-argmax over the predicted 2D heatmap
        xy = self.xy_deconv(x)
        xy = self.xy_conv(xy)
        xy = xy.view(-1, 1, cfg.output_shape[0] * cfg.output_shape[1])
        xy = F.softmax(xy, 2)
        xy = xy.view(-1, 1, cfg.output_shape[0], cfg.output_shape[1])

        hm_x = xy.sum(dim=2)  # sum over rows    -> (N, 1, W)
        hm_y = xy.sum(dim=3)  # sum over columns -> (N, 1, H)

        # build the index tensors on the same device as the heatmap instead of hard-coding .cuda()
        coord_x = hm_x * torch.arange(cfg.output_shape[1], device=xy.device).float()
        coord_y = hm_y * torch.arange(cfg.output_shape[0], device=xy.device).float()

        coord_x = coord_x.sum(dim=2)
        coord_y = coord_y.sum(dim=2)

        # z
        img_feat = torch.mean(x.view(x.size(0), x.size(1), x.size(2) * x.size(3)), dim=2)  # global average pooling
        # no unsqueeze needed here: gamma_layer uses nn.Linear on the pooled (N, C) features
        gamma = self.gamma_layer(img_feat)
        gamma = gamma.view(-1, 1)
        depth = gamma * k_value.view(-1, 1)

        coord = torch.cat((coord_x, coord_y, depth), dim=1)
        return coord

    def init_weights(self):
        # attribute names fixed to match __init__ (xy_deconv, xy_conv, gamma_layer);
        # only used when training from scratch, not when loading pre-trained weights
        for name, m in self.xy_deconv.named_modules():
            if isinstance(m, nn.ConvTranspose2d):
                nn.init.normal_(m.weight, std=0.001)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)
        for m in self.xy_conv.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.normal_(m.weight, std=0.001)
                nn.init.constant_(m.bias, 0)
        for m in self.gamma_layer.modules():
            # gamma_layer is built from nn.Linear here, so check for that type
            if isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, std=0.001)
                nn.init.constant_(m.bias, 0)


class ResPoseNet(nn.Module):
    def __init__(self, backbone, root):
        super(ResPoseNet, self).__init__()
        self.backbone = backbone
        self.root_net = root

    def forward(self, input_img, k_value, target=None):
        _, fm = self.backbone(input_img)
        coord = self.root_net(fm, k_value)

        if target is None:
            return coord
        else:
            target_coord = target["coord"]
            target_vis = target["vis"]
            target_have_depth = target["have_depth"]

            # coordinate loss (L1); the depth term counts only for samples that have GT depth
            loss_coord = torch.abs(coord - target_coord) * target_vis
            loss_coord = (loss_coord[:, 0] + loss_coord[:, 1] + loss_coord[:, 2] * target_have_depth.view(-1)) / 3.
            return loss_coord
Then, the pre-trained model weights for the FreiHAND dataset can be loaded successfully.
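For readers following the x, y branch of `forward` above: it is a soft-argmax, that is, a softmax over all heatmap cells followed by the expected column and row index. Below is a minimal pure-Python sketch of the same idea on a tiny heatmap, so it runs without PyTorch; the function name `soft_argmax_2d` and the example heatmaps are hypothetical.

```python
import math


def soft_argmax_2d(heatmap):
    """Return the expected (x, y) index under a softmax over all heatmap cells."""
    peak = max(max(row) for row in heatmap)
    # subtract the maximum before exponentiating for numerical stability
    weights = [[math.exp(v - peak) for v in row] for row in heatmap]
    total = sum(sum(row) for row in weights)
    x = sum(w * col for row in weights for col, w in enumerate(row)) / total
    y = sum(w * r for r, row in enumerate(weights) for w in row) / total
    return x, y


# A flat heatmap gives the center; a sharp peak gives (almost) the peak location.
flat = [[0.0] * 3 for _ in range(3)]
peaked = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [50.0, 0.0, 0.0]]
print(soft_argmax_2d(flat))
print(soft_argmax_2d(peaked))
```

Unlike a hard argmax, this output is differentiable with respect to the heatmap values, which is why it can be trained with a coordinate loss.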