
multiview-human-pose-estimation-pytorch's Introduction

This repo implements our ICCV 2019 paper "Cross View Fusion for 3D Human Pose Estimation" (https://chunyuwang.netlify.com/img/ICCV_Cross_view_camera_ready.pdf).

Quick start

Installation

  1. Clone this repo; we'll refer to the directory that you cloned multiview-pose into as ${POSE_ROOT}.

  2. Install dependencies.

  3. Download the PyTorch ImageNet-pretrained models. Please put them under ${POSE_ROOT}/models so that they look like this:

    ${POSE_ROOT}/models
    └── pytorch
        └── imagenet
            ├── resnet152-b121ed2d.pth
            ├── resnet50-19c8e357.pth
            └── mobilenet_v2.pth.tar
    

    They can be downloaded from the following link: https://onedrive.live.com/?authkey=%21AF9rKCBVlJ3Qzo8&id=93774C670BD4F835%21930&cid=93774C670BD4F835
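
    As a quick sanity check that the downloads are intact, the ResNet weights can be loaded directly with torch.load (the torchvision checkpoints are plain state dicts). This is only an illustrative sketch, assuming the layout above:

    import torch

    # Sanity check (sketch): the ImageNet ResNet-50 checkpoint is an OrderedDict of tensors.
    weights = torch.load('models/pytorch/imagenet/resnet50-19c8e357.pth', map_location='cpu')
    print(len(weights), 'tensors, e.g.', list(weights.keys())[:3])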

  4. Initialize the output (training model output) and log (TensorBoard log) directories:

    mkdir output
    mkdir log
    

    Your directory tree should then look like this:

    ${POSE_ROOT}
    ├── data
    ├── experiments-local
    ├── experiments-philly
    ├── lib
    ├── log
    ├── models
    ├── output
    ├── pose_estimation
    ├── README.md
    ├── requirements.txt
    

Data preparation

For MPII data, please download the images from the MPII Human Pose Dataset. The original annotation files are in MATLAB format; we have converted them to JSON, which you also need to download from OneDrive. Extract them under ${POSE_ROOT}/data so that they look like this:

${POSE_ROOT}
|-- data
|-- |-- MPII
    |-- |-- annot
        |   |-- gt_valid.mat
        |   |-- test.json
        |   |-- train.json
        |   |-- trainval.json
        |   |-- valid.json
        |-- images
            |-- 000001163.jpg
            |-- 000003072.jpg

If you zip the image files into a single zip file, you should organize the data like this:

${POSE_ROOT}
|-- data
`-- |-- MPII
    `-- |-- annot
        |   |-- gt_valid.mat
        |   |-- test.json
        |   |-- train.json
        |   |-- trainval.json
        |   `-- valid.json
        `-- images.zip
            `-- images
                |-- 000001163.jpg
                |-- 000003072.jpg

For Human36M data, please follow https://github.com/CHUNYUWANG/H36M-Toolbox to prepare images and annotations, and make them look like this:

${POSE_ROOT}
|-- data
|-- |-- h36m
    |-- |-- annot
        |   |-- h36m_train.pkl
        |   |-- h36m_validation.pkl
        |-- images
            |-- s_01_act_02_subact_01_ca_01 
            |-- s_01_act_02_subact_01_ca_02

If you zip the image files into a single zip file, you should organize the data like this:

${POSE_ROOT}
|-- data
`-- |-- h36m
    `-- |-- annot
        |   |-- h36m_train.pkl
        |   |-- h36m_validation.pkl
        `-- images.zip
            `-- images
                |-- s_01_act_02_subact_01_ca_01
                |-- s_01_act_02_subact_01_ca_02
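
When DATA_FORMAT is set to zip in the experiment config, frames are read directly from images.zip (the repo ships a zipreader utility for this). The snippet below is only a minimal sketch of the same idea using Python's zipfile and OpenCV; the internal file name is a made-up example following the layout above, not a guaranteed path.

import zipfile

import cv2
import numpy as np

# Minimal sketch: decode one frame straight from images.zip without extracting the archive.
# The archive path and the internal image name below are assumptions for illustration only.
with zipfile.ZipFile('data/h36m/images.zip', 'r') as archive:
    raw = archive.read('images/s_01_act_02_subact_01_ca_01/s_01_act_02_subact_01_ca_01_000001.jpg')
image = cv2.imdecode(np.frombuffer(raw, dtype=np.uint8), cv2.IMREAD_COLOR)
print(image.shape)  # (height, width, 3), BGR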

Limb length prior for 3D pose estimation: please download the limb length prior data from https://1drv.ms/u/s!AjX41AtnTHeTiQs7hDJ2sYoGJDEB?e=YyJcI4 and put it at ${POSE_ROOT}/data/pict/pairwise.pkl.
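
To verify the download, the prior can be opened with pickle; this is only an exploratory sketch, since the exact structure of pairwise.pkl is not documented here.

import pickle

# Minimal sketch: load the limb length prior and peek at its top-level structure.
with open('data/pict/pairwise.pkl', 'rb') as f:
    pairwise = pickle.load(f)
print(type(pairwise))
if isinstance(pairwise, dict):
    print(list(pairwise.keys())[:5])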

2D Training and Testing

Multiview Training on Mixed Dataset (MPII+H36M) and testing on H36M

python run/pose2d/train.py --cfg experiments-local/mixed/resnet50/256_fusion.yaml
python run/pose2d/valid.py --cfg experiments-local/mixed/resnet50/256_fusion.yaml

3D Testing

Multiview testing on H36M (CPU or GPU)

python run/pose3d/estimate.py --cfg experiments-local/mixed/resnet50/256_fusion.yaml (CPU Version)
python run/pose3d/estimate_cuda.py --cfg experiments-local/mixed/resnet50/256_fusion.yaml (GPU Version)

Citation

If you use our code or models in your research, please cite:

@inproceedings{multiviewpose,
    author={Qiu, Haibo and Wang, Chunyu and Wang, Jingdong and Wang, Naiyan and Zeng, Wenjun},
    title={Cross View Fusion for 3D Human Pose Estimation},
    booktitle = {International Conference on Computer Vision (ICCV)},
    year = {2019}
}

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

The video demo is available here


multiview-human-pose-estimation-pytorch's People

Contributors

chunyuwang, microsoftopensource, msftgits


multiview-human-pose-estimation-pytorch's Issues

Size of saved model too big (~2 GB)

The final model saved with a MobileNet backbone is about 2 GB. Can someone please help me figure out where I might be going wrong? I have not made any significant changes to the code.
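
In many training setups a checkpoint this large contains more than the network weights (for example the optimizer state), so it is worth checking what keys the saved file actually holds. A hypothetical slimming sketch, assuming the checkpoint is a dict with a 'state_dict' entry (which may not match this repo's exact save format):

import torch

# Hypothetical sketch: inspect a composite checkpoint and keep only the model weights.
# 'checkpoint.pth.tar' is a placeholder path.
ckpt = torch.load('checkpoint.pth.tar', map_location='cpu')
print(type(ckpt), list(ckpt.keys()) if isinstance(ckpt, dict) else 'raw state_dict')
if isinstance(ckpt, dict) and 'state_dict' in ckpt:
    torch.save(ckpt['state_dict'], 'weights_only.pth')  # usually far smaller than the full checkpoint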

run error

python run/pose3d/estimate_cuda.py --cfg experiments-local/mixed/resnet50/256_fusion.yaml
Traceback (most recent call last):
File "run/pose3d/estimate_cuda.py", line 20, in
from core.config import config
ImportError: No module named core.config
I hope to get your help, thank you very much.
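
For what it's worth, the tracebacks elsewhere in this repo show the lib directory being resolved relative to the run scripts (run/pose3d/../../lib/...), so this ImportError usually means ${POSE_ROOT}/lib never made it onto sys.path. A minimal sketch of the kind of path setup involved (an illustration, not necessarily the script's exact code):

import os
import sys

# Minimal sketch: make ${POSE_ROOT}/lib importable before 'from core.config import config'.
this_dir = os.path.dirname(os.path.abspath(__file__))   # e.g. ${POSE_ROOT}/run/pose3d
lib_path = os.path.join(this_dir, '..', '..', 'lib')     # -> ${POSE_ROOT}/lib
if lib_path not in sys.path:
    sys.path.insert(0, lib_path)

from core.config import config  # noqa: E402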

Error when evaluating

Hello. I am running the 3D pose model with
python run/pose3d/estimate_cuda.py --cfg experiments-local/mixed/resnet50/256_fusion.yaml (GPU Version)
but I got the following error:

Traceback (most recent call last):
File "/media/hkuit155/Windows/research/multiview-human-pose-estimation-pytorch/run/pose3d/estimate_cuda.py", line 120, in
main()
File "/media/hkuit155/Windows/research/multiview-human-pose-estimation-pytorch/run/pose3d/estimate_cuda.py", line 64, in main
all_heatmaps = h5py.File(prediction_path)['heatmaps']
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "/home/hkuit155/.local/lib/python3.6/site-packages/h5py/_hl/group.py", line 264, in getitem
oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: "Unable to open object (object 'heatmaps' doesn't exist)"

How can I fix it?
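
The 3D scripts read heatmaps from the file named by TEST.HEATMAP_LOCATION_FILE (predicted_heatmaps.h5 in the configs shown in other issues), which is expected to be produced beforehand (typically by running run/pose2d/valid.py), so this KeyError usually means that file is missing, empty, or from an interrupted run. A minimal sketch for checking what it actually contains (the path below is an assumption):

import h5py

# Minimal sketch: list the datasets stored in the predicted heatmap file.
# The path is an assumed location inside the experiment's output directory.
path = 'output/mixed/multiview_pose_resnet_50/256_fusion/predicted_heatmaps.h5'
with h5py.File(path, 'r') as f:
    print(list(f.keys()))  # a 'heatmaps' dataset is expected here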

Project dependencies may have API risk issues

Hi, in multiview-human-pose-estimation-pytorch, inappropriate dependency versioning constraints can cause risks.

Below are the dependencies and version constraints that the project is using:

EasyDict==1.7
opencv-python==3.4.2.17
shapely==1.6.4
Cython
scipy
pandas
pyyaml
json_tricks
scikit-image
tensorboardX
pymvg

The == version constraint introduces a risk of dependency conflicts because the dependency scope is too strict.
Version constraints with no upper bound (or *) introduce a risk of missing-API errors, because the latest version of a dependency may remove some APIs.

After further analysis, in this project the version constraint of the scipy dependency can be changed to >=0.14.0,<=1.7.3.

The above suggestions reduce dependency conflicts as much as possible while still allowing the latest versions that do not trigger API errors in the project.

The current project invokes all of the following methods.

Calling methods from scipy:
scipy.interpolate.RegularGridInterpolator

Calling methods from all modules:
grid.mul.clamp
PoseMobileNetV2
min
j.output.detach
matplotlib.pyplot.subplot
scipy.interpolate.RegularGridInterpolator
conv_3x3_bn.append
torch.nn.ConvTranspose2d
self._make_branches
B.mean
max_i.detach.cpu
logging.getLogger.addHandler
optim.zero_grad
batch_image.clone.clone
p.p.np.array.reshape
math.ceil
sys.path.insert
self.mobilenetv2
points_2d_set.append
dataset.multiview_h36m.MultiViewH36M
m.weight.data.fill_
numpy.abs
self.layer3
matplotlib.pyplot.figure
self.bn2
numpy.float32
coords.copy
i.self.transition1
numpy.matmul
get_dir
torch.mul
torch.einsum
torchvision.utils.make_grid
numpy.array.append
utils.transforms.transform_preds
numpy.all
max_i.detach.cpu.numpy
torch.save
pathlib.Path.exists
modules.append
path.index
gridz.contiguous.view.contiguous
torch.max
transition_layers.append
numpy.argsort
self._get_db
core.inference.get_max_preds
self.maxpool
random.random
batch_image.clone.size
gridx.contiguous.view
cv2.imread
weight.append
matplotlib.pyplot.show
_xml_zfile.open
torch.optim.lr_scheduler.MultiStepLR
torch.nn.Upsample
core.config.update_dir
torch.nn.Sequential
yaml.load
i.batch_heatmaps.mul.clamp.byte.cpu.numpy
name_value.values
torch.load
pprint.pformat
self.features
numpy.amax
parse_args
batch_index._meta.clone.cpu
numpy.random.randn
PoseMobileNetV2.init_weights
self.deconv_layers
_xml_zfile.append
PoseUtils.estimate_camera
batch_index._output.clone.cpu
numpy.linspace
torch.cuda.memory_allocated
xrange
i._xml_zfile.open
torchvision.transforms.ToTensor
j.output.detach.cpu.numpy
self.sort_views
os.path.join
model.parameters
torch.nn.functional.grid_sample
cv2.waitKey
gen_config
infer.sort
self.union_joints.keys
self.stage4
add_path
m.x.torch.cat.view
R.dot
numpy.greater
save_batch_image_with_joints
numpy.dot
numpy.frombuffer
utils.transforms.affine_transform_pts_cuda
o.clone
grid.mul.clamp.byte.permute
h5py.File
node.unary.squeeze
PoseResNet
self.mpii_grouping
self.conv
imgs.append
numpy.meshgrid
pair.output_flipped.copy
self.layer1
torchvision.transforms.Compose
time.time
max_i.detach
idx.heatmaps_pred.squeeze.mul
self.transform
conv3x3s.append
model
m.weight.size
torch.nn.ModuleList
numpy.dot.mean
affine_transform
torch.cat
self.fuse_with_weights.reshape
os.path.dirname
Aggregation
i.self.transition3
batch_index._img.clone.min
root_idx.states_of_all_joints.detach
self.generate_heatmap
ChannelWiseFC
math.sqrt
v.todense.astype
torch.device
gridy.contiguous.view.contiguous
range
rgi
reset_config
cv2.getAffineTransform
self._initialize_weights
numpy.mean
batch_heatmaps.detach.cpu.numpy
torch.meshgrid
routing
cv2.resize
_update_dict
self.layer2
torch.Tensor
torch.from_numpy
main
compute_pairwise
numpy.sqrt
i.self.branches
min.img.add_.div_
unfold_camera_param
argparse.ArgumentParser.parse_args
idx.heatmaps_gt.squeeze
torch.sum.view
core.evaluate.accuracy
eval
i.batch_image.mul.clamp
root_idx.states_of_all_joints.detach.cpu
grid.mul.clamp.byte
outputs.append
poses3d.append
o.clone.cpu
cv2.imdecode
fc
self.branches
self.reset
project_point_radial
hmap.transpose
get_affine_transform
meta.append
i._im_zfile.read
copy.deepcopy
maxvals.reshape.reshape
tolerance.expect_length.distance.torch.abs.float
i.batch_image.mul.clamp.byte
node.unary.clone
self.fuse_with_weights
core.config.get_model_name
easydict.EasyDict
batch_index.finals.clone
multiviews.pictorial_cuda.rpsm
numpy.sign
self.generate_target
torch.zeros_like.clone
torch.t
self._make_deconv_layer
print
argparse.ArgumentParser.add_argument
m.bias.data.zero_
batch_index._img.clone.add_
self.downsample
img.mul.clamp.byte.permute.cpu
list.index
self.bn1
w.cuda.cuda
self.ChannelWiseFC.super.__init__
batch_index._hmap.clone.mul.clamp.byte.cpu.numpy
root_idx.states_of_all_joints.detach.cpu.numpy
get_loc_from_cube_idx
dataset.mpii.MPIIDataset
pymvg_cameras.append
batch_index._hmap.clone.mul
new_map.reshape.reshape
core.config.config.DATASET.TEST_DATASET.eval
aggre_layer.aggre.weight.data.cpu.numpy
o.clone.cpu.numpy
self.criterion
core.config.config.BACKBONE_MODEL.eval
multiviews.cameras.unfold_camera_param
pose_3d.reshape.reshape
self._get_deconv_cfg
conv_3x3_bn
PoseUtils.align_3d_to_2d
torch.as_tensor.append
grouping.append
os.path.exists
db_rec.np.array.copy
gpus.model.torch.nn.DataParallel.cuda.load_state_dict
batch_image.clone.min
self.conv2
numpy.mean.sum
ce.expand_as.expand_as
torch.tensor
fuse_layer.append
type
super.do_mapping
idx.heatmaps_gt.squeeze.mul
pdist2
batch_index._output.clone
prediction.cpu.numpy
torch.nn.MSELoss
core.config.config.LOSS.USE_TARGET_WEIGHT.JointsMSELoss.cuda
torch.optim.Adam
pair.joints.copy
infer.copy
mpii_grouping.append
root_idx.states_of_all_joints.detach.cpu.numpy.numel
len
poses.append
self.actual_joints.items
torch.nn.DataParallel
torch.mm
batch_index.finals.clone.mul.clamp.byte
round
m.numpy
torch.nn.Conv2d
multiviews.cameras_cuda.project_pose
img.mul.clamp.byte.permute.cpu.numpy
_xml_path_zip.append
batch_heatmaps.detach
batch_image.clone.max
multiviews.cameras.project_pose
core.config.config.GPUS.split
self.load_state_dict
numpy.cos
img.mul.clamp.byte
layers.append
self.get_group
u2a.items
numpy.not_equal
output.reshape
self.sort_skeleton_by_level
collections.OrderedDict
skeleton.append
torch.pairwise_distance
batch_index.finals.clone.mul
batch_heatmaps.size
_im_zfile.append
utils.utils.get_optimizer
torch.load.items
numpy.add
max
self.stage3
self.layer4
img.mul.clamp
name_value.keys
gridz.contiguous.view
logging.getLogger.setLevel
isinstance
numpy.clip
self.final_layer
j.target.detach
gpus.model.torch.nn.DataParallel.cuda
numpy.linalg.det
core.inference.get_final_preds
model.module.state_dict
sklearn.preprocessing.normalize
numpy.argmax
MobileNetV2
branches.append
num_joints.batch_size.target.reshape.split
idx.reshape.reshape
numpy.array.dot
numpy.outer
numpy.concatenate.sum
input.append
idx.heatmaps_pred.squeeze
modules.get_num_inchannels
numpy.linalg.svd
numpy.reshape
build_multi_camera_system
pred_mask.astype.astype
HumanBody
hmaps.append
triangulate_one_point
numpy.random.choice
i.batch_heatmaps.mul.clamp.byte
float
time.strftime
batch_index._img.clone.mul
batch_index._meta.clone.cpu.numpy
dict
batch_index._hmap.clone.mul.clamp
pymvg.multi_camera_system.MultiCameraSystem
logging.basicConfig
get_3rd_point
pose.append
self.aggre_layer
input.reshape
j.i.self.fuse_layers
self.conv3
core.config.config.DATASET.TRAIN_DATASET.eval
self.stage2
batch_index.finals.clone.mul.clamp.byte.cpu
os.makedirs
multiviews.cameras.camera_to_world_frame
matplotlib.pyplot.imshow
self.PoseHighResolutionNet.super.__init__
numpy.sin
i.batch_image.mul.clamp.byte.permute.cpu
numpy.linalg.inv
i.batch_heatmaps.mul.clamp.byte.cpu
i.batch_image.mul
dataset.evaluate
children_state.np.array.T.tolist
torch.no_grad
torch.utils.data.DataLoader
tensorboardX.SummaryWriter.close
self._make_fuse_layers
batch_heatmaps.reshape
self.deconv_layers.named_modules
i.batch_heatmaps.mul.clamp
j.all_unary.view
cameras.append
queue.append
i.batch_heatmaps.mul.clamp.byte.cpu.numpy.append
i.self.fuse_layers
recursive_infer
pair.joints_vis.copy
warped.append
self._make_transition_layer
utils.transforms.get_affine_transform
format
final_output_dir.mkdir
x_fuse.append
pathlib.Path.mkdir
cfg_name.os.path.basename.split
self.load_db
logging.getLogger.error
torchvision.transforms.Normalize
tensorboardX.SummaryWriter
self.final_layer.modules
tuple
numpy.less
torch.as_tensor.repeat
super.__getitem__
enumerate
images.append
save_batch_heatmaps
dist_acc
i.batch_image.mul.clamp.byte.permute.cpu.numpy
logging.StreamHandler
list.size
cv2.circle
pred.copy.copy
pose3d_as_cube_idx.copy.append
utils.zipreader.imread
self.get_key_str
numpy.sum
torch.nn.init.constant_
argparse.ArgumentParser
loss.item
multiviews.body.HumanBody
dict.items
batch_index._hmap.clone.mul.clamp.byte
self.Bottleneck.super.__init__
gt_db.append
core.loss.JointsMSELoss
utils.utils.load_checkpoint
compute_pairwise_constrain
torch.nn.Parameter
os.path.isfile
logging.getLogger.info
grid.mul.clamp.byte.permute.cpu
numpy.max
multiviews.pictorial.rpsm
_im_zfile.read
pathlib.Path
numpy.linalg.pinv
v.items
zipfile.ZipFile
batch_index._hmap.clone.mul.clamp.byte.cpu
i.batch_image.mul.clamp.byte.permute
batch_index.finals.clone.mul.clamp
cv2.warpAffine
numpy.exp
batch_index._img.clone.max
children_state.np.array.T.tolist.append
gridy.contiguous.view
block
target_cuda.append
torch.optim.Adam.load_state_dict
batch_index._output.clone.cpu.numpy
yaml.dump
self.relu
all_unary_list.append
os.path.basename
optim.step
routing.append
torch.nn.ReLU
target.append
mini_group.append
criterion
pose_2d.reshape.reshape
int
model.train
self.HighResolutionModule.super.__init__
self._check_branches
core.function.validate
PoseUtils
shutil.copy2
self.get_skeleton
model.eval
compute_grid
min.batch_image.add_.div_
argparse.ArgumentParser.parse_known_args
compute_limb_length
m.weight.data.normal_
torch.nn.init.normal_
join
model.module.load_state_dict
torch.pairwise_distance.view
self.aggre.append
target.astype.astype
torch.matmul
sample_grid.view.view
numpy.linalg.norm
logger.info
list
p.p.torch.cat.view
ndarr.copy.copy
batch_index._input.clone
self.weight.data.uniform_
thr.dist_cal.dists.np.less.sum
x_list.append
single_views.append
grid.mul.clamp.byte.permute.cpu.numpy
u2a.items.values
torch.ones
item.clone
sorted_skeleton.append
get_max_preds
self.MobileNetV2.super.__init__
utils.utils.get_optimizer.state_dict
numpy.tile
tan.radial.repeat
xml.etree.ElementTree.fromstring
cv2.applyColorMap
writer.add_scalar
cv2.imwrite
calc_dists
self.union_joints.values
zip
torch.ger
grouping.items
torch.sum
list.sort
numpy.diag
torch.optim.lr_scheduler.MultiStepLR.step
infer
ValueError
child.states_of_all_joints.squeeze
super.__init__
torch.zeros
pickle.load.items
self.modules
boxes.append
batch_index._hmap.clone
pymvg.camera_model.CameraModel.load_camera_from_M
batch_index._meta.clone
numpy.einsum
self.InvertedResidual.super.__init__
self.PoseResNet.super.__init__
numpy.concatenate
output.size
gridx.contiguous.view.contiguous
target.reshape
i.self.transition2
PoseResNet.init_weights
math.floor
super
json_tricks.load
utils.utils.create_logger
core.config.update_config
prediction.cpu.numpy.cpu
torch.nn.BatchNorm2d
m.named_parameters
torch.nn.ReLU6
logging.getLogger
open
utils.transforms.affine_transform_pts
utils.utils.save_checkpoint
loss.backward
numpy.not_equal.sum
self._make_stage
unary_of_all_joints.append
filtered_grouping.append
utils.vis.save_debug_images
aggre_layer.aggre.weight.data.cpu
j.output.detach.cpu
build_multi_camera_system.find3d
j.target.detach.cpu.numpy
fuse_layers.append
R.T.dot
PoseHighResolutionNet
numpy.vstack
batch_heatmaps.detach.cpu
numpy.zeros
batch_index._img.clone
_xml_zfile.open.read
map
A0.sum
AverageMeter
target.clone.append
core.function.train
self.MultiViewPose.super.__init__
torch.zeros_like
self._make_one_branch
numpy.cross
super.get_mapping
pairwise_mat.toarray.toarray
numpy.arange
numpy.float
copy.deepcopy.copy
numpy.floor
self.PoseMobileNetV2.super.__init__
self.Aggregation.super.__init__
preds.astype.astype
torchvision.utils.make_grid.mul
torch.linspace
conv3x3
self.conv1
AverageMeter.update
input.size
numpy.array
v.todense
self.BasicBlock.super.__init__
torch.as_tensor
compute_unary_term
idx.np.tile.astype
t.cuda.cuda
img.mul.clamp.byte.permute
pickle.load
numpy.ones
utils.transforms.affine_transform
cv2.imshow
numpy.min
batch_image.clone.add_
mpjpes.append
batch_index.finals.clone.mul.clamp.byte.cpu.numpy
pose3d_as_cube_idx.copy.pop
MultiViewPose
self.JointsMSELoss.super.__init__
easydict.EasyDict.items
HighResolutionModule
core.config.config.MODEL.eval
self.bn3
h5py.File.close
torch.nn.MaxPool2d
name.split
str
num_joints.batch_size.output.reshape.split
self._make_layer
numpy.multiply
torch.abs
PoseHighResolutionNet.init_weights
i.batch_heatmaps.mul
infer.append
torch.optim.SGD
u2a.items.keys
j.target.detach.cpu
grids.append
self.resnet

@developer
Could you please help me check this issue?
May I open a pull request to fix it?
Thank you very much.

The training time does not seem to be shown in your paper

Hi, the training time of this method is not described clearly in the paper. Can you tell me how long it takes to train the 2D and 3D pose estimation on Human3.6M and MPII without fusion (i.e., just 3D estimation from a single image)? Thank you very much.

Invalid download links

Hello, I have noticed some issues with the download links and downloads in general:

  1. The mobilenet_v2 model is missing from the link provided for the ImageNet pretrained models (found in issue #24).
  2. The limb length prior download link is broken, even when I check the links under #6.
  3. The trained models are nowhere to be found for direct inference.
    I found the trained model for 320_fusion under #14, but would it be possible to also share the others?

Thank you

Missing pkl files

Did anybody find the cdf or pkl files? I downloaded a great number of annotation files, but all of them are in h5 or json format. What a tragedy!

The OneDrive links are broken

Hello, I found that the OneDrive links are broken when I try to download the limb length prior data (as well as the MPII data). Could you please update them?
Thank you!

Questions for processing the total capture dataset

Hi Chunyu,

Thanks for releasing the H36M toolbox!
Actually, I note that you train your models on the Total Capture dataset and few works have done experiments on it. For a fair comparison in future works, can you share how to process the Total Capture dataset?

Thanks.

operands could not be broadcast together with shapes (0,) (0,17,2)

Hi, first of all thanks for your work. When I run the 2D training code, I get the following error and I don't know how to solve it. Could you help me find the problem?

Epoch: [0][0/696] Time 5.216s (5.216s) Speed 6.1 samples/s Data 2.286s (2.286s) Loss 0.36521 (0.36521) Accuracy 0.026 (0.026) Memory 1429799424.0
Epoch: [0][100/696] Time 0.424s (0.543s) Speed 75.4 samples/s Data 0.000s (0.041s) Loss 0.34272 (0.39108) Accuracy 0.000 (0.002) Memory 1429799424.0
Epoch: [0][200/696] Time 0.453s (0.518s) Speed 70.7 samples/s Data 0.000s (0.028s) Loss 0.37073 (0.37585) Accuracy 0.015 (0.002) Memory 1428750848.0
Epoch: [0][300/696] Time 0.495s (0.506s) Speed 64.7 samples/s Data 0.000s (0.023s) Loss 0.34146 (0.36804) Accuracy 0.133 (0.016) Memory 1429799424.0
Epoch: [0][400/696] Time 0.503s (0.500s) Speed 63.6 samples/s Data 0.000s (0.020s) Loss 0.26815 (0.35295) Accuracy 0.424 (0.076) Memory 1428750848.0
Epoch: [0][500/696] Time 0.488s (0.495s) Speed 65.5 samples/s Data 0.000s (0.018s) Loss 0.25259 (0.33488) Accuracy 0.460 (0.149) Memory 1429799424.0
Epoch: [0][600/696] Time 0.423s (0.492s) Speed 75.7 samples/s Data 0.000s (0.017s) Loss 0.23425 (0.31868) Accuracy 0.551 (0.212) Memory 1428750848.0
Traceback (most recent call last):
File "run/pose2d/train.py", line 189, in
main()
File "run/pose2d/train.py", line 164, in main
criterion, final_output_dir, writer_dict)
File "/lustre/alice3/scratch/3dpoint/fz64/project/multiview/run/pose2d/../../lib/core/function.py", line 233, in validate
name_value, perf_indicator = dataset.evaluate(all_preds)
File "/lustre/alice3/scratch/3dpoint/fz64/project/multiview/run/pose2d/../../lib/dataset/multiview_h36m.py", line 145, in evaluate
distance = np.sqrt(np.sum((gt - pred)**2, axis=2))
ValueError: operands could not be broadcast together with shapes (0,) (0,17,2)
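
The shapes in the message suggest the evaluation received zero samples (an empty ground-truth array against an empty (0, 17, 2) prediction array), which usually points to the h36m validation annotations not being loaded. A tiny numpy sketch that reproduces the broadcast failure and the kind of guard that would expose the real cause:

import numpy as np

# Minimal sketch: a shapeless empty gt array cannot broadcast against (0, 17, 2) predictions.
gt = np.array([])              # shape (0,)
pred = np.zeros((0, 17, 2))    # shape (0, 17, 2)
try:
    np.sqrt(np.sum((gt - pred) ** 2, axis=2))
except ValueError as err:
    print(err)                 # operands could not be broadcast together ...

# Guard (an assumed fix direction): make sure the validation split actually loaded samples.
if pred.shape[0] == 0:
    print('no validation samples were found - check data/h36m/annot/h36m_validation.pkl')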

can't run valid.py

After running python run/pose2d/valid.py --cfg experiments-local/mixed/resnet50/256_fusion.yaml I get:
size mismatch for aggre_layer.aggre.0.weight: copying a param with shape torch.Size([6400, 6400]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for aggre_layer.aggre.1.weight: copying a param with shape torch.Size([6400, 6400]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for aggre_layer.aggre.2.weight: copying a param with shape torch.Size([6400, 6400]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for aggre_layer.aggre.3.weight: copying a param with shape torch.Size([6400, 6400]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for aggre_layer.aggre.4.weight: copying a param with shape torch.Size([6400, 6400]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for aggre_layer.aggre.5.weight: copying a param with shape torch.Size([6400, 6400]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for aggre_layer.aggre.6.weight: copying a param with shape torch.Size([6400, 6400]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for aggre_layer.aggre.7.weight: copying a param with shape torch.Size([6400, 6400]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for aggre_layer.aggre.8.weight: copying a param with shape torch.Size([6400, 6400]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for aggre_layer.aggre.9.weight: copying a param with shape torch.Size([6400, 6400]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for aggre_layer.aggre.10.weight: copying a param with shape torch.Size([6400, 6400]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for aggre_layer.aggre.11.weight: copying a param with shape torch.Size([6400, 6400]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).

How can I solve it? Thank you.
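
For anyone hitting the same error: the flattened sizes match two different heatmap resolutions, so the checkpoint being loaded appears to come from the 320 configuration while the 256 configuration builds smaller aggregation layers. A small sketch of the arithmetic (80x80 is the HEATMAP_SIZE shown in the 320_fusion config dumps in other issues; 64x64 for the 256 config is inferred from the error itself):

# The aggre_layer.aggre.N.weight tensors are square (H*W) x (H*W), matching flattened heatmaps.
print(80 * 80)   # 6400 -> 320_fusion.yaml (HEATMAP_SIZE [80, 80])
print(64 * 64)   # 4096 -> 256_fusion.yaml (presumably HEATMAP_SIZE [64, 64])

If that is the case, the likely fix is to validate with the same yaml the checkpoint was trained with (320_fusion.yaml instead of 256_fusion.yaml).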

How does RPSM model work?

Thank you very much for sharing!
Even though I tried to read your paper, I am still confused about the Recursive Pictorial Structure Model.
Do you have any recommended materials that can help me understand it?
Your help would be very much appreciated.

Top Down or Bottom Up

Does this use top-down pose estimation or a bottom-up solution?
Also, is the HRNet provided here the same as HigherHRNet?

Process killed during 2D validation

Hello,

I am using the pretrained 320_fused model from #14 to validate S9 and S11 data, generated using the H36M-Toolbox.

During validation, the program runs well until the last iteration, where it gets abruptly killed.
Here is the output:

python run/pose2d/valid.py --cfg experiments-local/mixed/resnet50/320_fusion.yaml
/home/rjajodia/crossview/run/pose2d/../../lib/core/config.py:197: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
exp_config = edict(yaml.load(f))
=> creating output/mixed/multiview_pose_resnet_50/320_fusion
=> creating log/mixed/multiview_pose_resnet_50/320_fusion2020-10-04-22-57
Namespace(cfg='experiments-local/mixed/resnet50/320_fusion.yaml', dataDir='', data_format='', flip_test=False, frequent=100, gpus=None, logDir='', modelDir='', model_file=None, post_process=False, shift_heatmap=False, state='best', workers=None)
{'BACKBONE_MODEL': 'pose_resnet',
'CUDNN': {'BENCHMARK': True, 'DETERMINISTIC': False, 'ENABLED': True},
'DATASET': {'BBOX': 2000,
'CROP': True,
'DATA_FORMAT': 'zip',
'ROOT': 'data/',
'ROOTIDX': 0,
'ROT_FACTOR': 0,
'SCALE_FACTOR': 0,
'TEST_DATASET': 'multiview_h36m',
'TEST_SUBSET': 'validation',
'TRAIN_DATASET': 'mixed',
'TRAIN_SUBSET': 'train'},
'DATA_DIR': '',
'DEBUG': {'DEBUG': True,
'SAVE_BATCH_IMAGES_GT': True,
'SAVE_BATCH_IMAGES_PRED': True,
'SAVE_HEATMAPS_GT': True,
'SAVE_HEATMAPS_PRED': True},
'GPUS': '0',
'LOG_DIR': 'log',
'LOSS': {'USE_TARGET_WEIGHT': True},
'MODEL': 'multiview_pose_resnet',
'MODEL_EXTRA': {'FINAL_CONV_KERNEL': 1,
'PRETRAINED_LAYERS': ['conv1',
'bn1',
'conv2',
'bn2',
'layer1',
'transition1',
'stage2',
'transition2',
'stage3',
'transition3',
'stage4'],
'STAGE2': {'BLOCK': 'BASIC',
'FUSE_METHOD': 'SUM',
'NUM_BLOCKS': [4, 4],
'NUM_BRANCHES': 2,
'NUM_CHANNELS': [48, 96],
'NUM_MODULES': 1},
'STAGE3': {'BLOCK': 'BASIC',
'FUSE_METHOD': 'SUM',
'NUM_BLOCKS': [4, 4, 4],
'NUM_BRANCHES': 3,
'NUM_CHANNELS': [48, 96, 192],
'NUM_MODULES': 4},
'STAGE4': {'BLOCK': 'BASIC',
'FUSE_METHOD': 'SUM',
'NUM_BLOCKS': [4, 4, 4, 4],
'NUM_BRANCHES': 4,
'NUM_CHANNELS': [48, 96, 192, 384],
'NUM_MODULES': 3}},
'NETWORK': {'AGGRE': True,
'HEATMAP_SIZE': array([80, 80]),
'IMAGE_SIZE': array([320, 320]),
'NUM_JOINTS': 20,
'PRETRAINED': 'models/pytorch/imagenet/resnet50-19c8e357.pth',
'SIGMA': 3,
'TARGET_TYPE': 'gaussian'},
'OUTPUT_DIR': 'output',
'PICT_STRUCT': {'DEBUG': False,
'FIRST_NBINS': 16,
'GRID_SIZE': 2000,
'LIMB_LENGTH_TOLERANCE': 150,
'PAIRWISE_FILE': 'data/pict/pairwise.pkl',
'RECUR_DEPTH': 10,
'RECUR_NBINS': 2,
'SHOW_CROPIMG': False,
'SHOW_HEATIMG': False,
'SHOW_ORIIMG': False,
'TEST_PAIRWISE': False},
'POSE_RESNET': {'DECONV_WITH_BIAS': False,
'FINAL_CONV_KERNEL': 1,
'NUM_DECONV_FILTERS': [256, 256, 256],
'NUM_DECONV_KERNELS': [4, 4, 4],
'NUM_DECONV_LAYERS': 3,
'NUM_LAYERS': 50},
'PRINT_FREQ': 100,
'TEST': {'BATCH_SIZE': 2,
'BBOX_FILE': '',
'BBOX_THRE': 1.0,
'DETECTOR': 'fpn_dcn',
'DETECTOR_DIR': '',
'HEATMAP_LOCATION_FILE': 'predicted_heatmaps.h5',
'IMAGE_THRE': 0.1,
'IN_VIS_THRE': 0.0,
'MATCH_IOU_THRE': 0.3,
'MODEL_FILE': '',
'NMS_THRE': 0.6,
'OKS_THRE': 0.5,
'POST_PROCESS': False,
'SHIFT_HEATMAP': False,
'STATE': 'best',
'USE_GT_BBOX': True},
'TRAIN': {'BATCH_SIZE': 2,
'BEGIN_EPOCH': 0,
'END_EPOCH': 30,
'GAMMA1': 0.99,
'GAMMA2': 0.0,
'LR': 0.001,
'LR_FACTOR': 0.1,
'LR_STEP': [20, 25],
'MOMENTUM': 0.9,
'NESTEROV': False,
'OPTIMIZER': 'adam',
'RESUME': True,
'SHUFFLE': True,
'WD': 0.0001},
'WORKERS': 4}
=> loading model from output/mixed/multiview_pose_resnet_50/320_fusion/model_best.pth.tar
/home/rjajodia/.conda/envs/crossview/lib/python3.8/site-packages/torch/nn/_reduction.py:44: UserWarning: size_average and reduce args will be deprecated, please use reduction='mean' instead.
warnings.warn(warning.format(ret))
Test: [0/1076] Time 7.371 (7.371) Loss 0.0431 (0.0431) Accuracy 1.000 (1.000)
Test: [100/1076] Time 0.090 (0.164) Loss 0.0753 (0.0546) Accuracy 1.000 (0.994)
Test: [200/1076] Time 0.091 (0.128) Loss 0.7771 (0.0897) Accuracy 0.007 (0.941)
Test: [300/1076] Time 0.092 (0.117) Loss 0.0597 (0.0773) Accuracy 0.993 (0.958)
Test: [400/1076] Time 0.093 (0.111) Loss 0.0414 (0.0977) Accuracy 1.000 (0.935)
Test: [500/1076] Time 0.092 (0.108) Loss 0.7098 (0.1056) Accuracy 0.000 (0.924)
Test: [600/1076] Time 0.092 (0.105) Loss 0.0442 (0.0975) Accuracy 1.000 (0.934)
Test: [700/1076] Time 0.094 (0.104) Loss 0.0347 (0.0883) Accuracy 1.000 (0.943)
Test: [800/1076] Time 0.093 (0.103) Loss 0.0726 (0.0822) Accuracy 0.993 (0.950)
Test: [900/1076] Time 0.093 (0.102) Loss 0.0845 (0.0785) Accuracy 0.978 (0.955)
Test: [1000/1076] Time 0.094 (0.102) Loss 0.0362 (0.0752) Accuracy 1.000 (0.959)
Killed

I tried to look at the logs, but because the process gets killed no logs are written.
Any ideas?

Thank you.

About the Joint Detection Rate (JDR) metric of 2D pose estimation accuracy

In your paper, you used Joint Detection Rate (JDR) to measure 2D pose estimation accuracy. But as far as I can see, the JDR metric is very similar to the PCK metric using head size (a.k.a. PCKh), so I am wondering what the difference between JDR and PCK is. BTW, I didn't see JDR used earlier than your paper, so it was proposed in your paper for the first time, right? :)
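
For reference, JDR can be written as the fraction of joints whose estimated 2D location falls within a distance threshold of the ground truth; the choice of threshold (e.g. a fraction of the head size, as in PCKh) is what distinguishes the variants. A generic sketch of the definition:

\mathrm{JDR}(j) = \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\left[ \lVert \hat{p}_{i,j} - p_{i,j} \rVert_2 < \tau \right]

where \hat{p}_{i,j} and p_{i,j} are the estimated and ground-truth 2D locations of joint j in image i, and \tau is the distance threshold.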

The annotation files of human36m

Hey, thanks for sharing!
Is it possible to share a download link for the h36m_train.pkl and h36m_validation.pkl files?
I cloned and read the code from https://github.com/CHUNYUWANG/H36M-Toolbox, but I am unable to use this tool. I have the Human3.6M images along with annotation files in JSON format (Human36M_subject1_camera.json, Human36M_subject1_data.json, Human36M_subject1_joint_3d.json).
I just need the *.pkl annotation files.
Thanks for your help!

Cuda out of memory

Hello, I get this error while trying to train:

python run/pose2d/train.py --cfg experiments-local/mixed/resnet50/320_fusion.yaml
/home/rjajodia/crossview/run/pose2d/../../lib/core/config.py:197: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
exp_config = edict(yaml.load(f))
=> creating output/mixed/multiview_pose_resnet_50/320_fusion
=> creating log/mixed/multiview_pose_resnet_50/320_fusion2020-10-04-20-04
Namespace(cfg='experiments-local/mixed/resnet50/320_fusion.yaml', dataDir='', data_format='', frequent=100, gpus=None, logDir='', modelDir='', workers=None)
{'BACKBONE_MODEL': 'pose_resnet',
'CUDNN': {'BENCHMARK': True, 'DETERMINISTIC': False, 'ENABLED': True},
'DATASET': {'BBOX': 2000,
'CROP': True,
'DATA_FORMAT': 'zip',
'ROOT': 'data/',
'ROOTIDX': 0,
'ROT_FACTOR': 0,
'SCALE_FACTOR': 0,
'TEST_DATASET': 'multiview_h36m',
'TEST_SUBSET': 'validation',
'TRAIN_DATASET': 'mixed',
'TRAIN_SUBSET': 'train'},
'DATA_DIR': '',
'DEBUG': {'DEBUG': True,
'SAVE_BATCH_IMAGES_GT': True,
'SAVE_BATCH_IMAGES_PRED': True,
'SAVE_HEATMAPS_GT': True,
'SAVE_HEATMAPS_PRED': True},
'GPUS': '0',
'LOG_DIR': 'log',
'LOSS': {'USE_TARGET_WEIGHT': True},
'MODEL': 'multiview_pose_resnet',
'MODEL_EXTRA': {'FINAL_CONV_KERNEL': 1,
'PRETRAINED_LAYERS': ['conv1',
'bn1',
'conv2',
'bn2',
'layer1',
'transition1',
'stage2',
'transition2',
'stage3',
'transition3',
'stage4'],
'STAGE2': {'BLOCK': 'BASIC',
'FUSE_METHOD': 'SUM',
'NUM_BLOCKS': [4, 4],
'NUM_BRANCHES': 2,
'NUM_CHANNELS': [48, 96],
'NUM_MODULES': 1},
'STAGE3': {'BLOCK': 'BASIC',
'FUSE_METHOD': 'SUM',
'NUM_BLOCKS': [4, 4, 4],
'NUM_BRANCHES': 3,
'NUM_CHANNELS': [48, 96, 192],
'NUM_MODULES': 4},
'STAGE4': {'BLOCK': 'BASIC',
'FUSE_METHOD': 'SUM',
'NUM_BLOCKS': [4, 4, 4, 4],
'NUM_BRANCHES': 4,
'NUM_CHANNELS': [48, 96, 192, 384],
'NUM_MODULES': 3}},
'NETWORK': {'AGGRE': True,
'HEATMAP_SIZE': array([80, 80]),
'IMAGE_SIZE': array([320, 320]),
'NUM_JOINTS': 20,
'PRETRAINED': 'models/pytorch/imagenet/resnet50-19c8e357.pth',
'SIGMA': 3,
'TARGET_TYPE': 'gaussian'},
'OUTPUT_DIR': 'output',
'PICT_STRUCT': {'DEBUG': False,
'FIRST_NBINS': 16,
'GRID_SIZE': 2000,
'LIMB_LENGTH_TOLERANCE': 150,
'PAIRWISE_FILE': 'data/pict/pairwise.pkl',
'RECUR_DEPTH': 10,
'RECUR_NBINS': 2,
'SHOW_CROPIMG': False,
'SHOW_HEATIMG': False,
'SHOW_ORIIMG': False,
'TEST_PAIRWISE': False},
'POSE_RESNET': {'DECONV_WITH_BIAS': False,
'FINAL_CONV_KERNEL': 1,
'NUM_DECONV_FILTERS': [256, 256, 256],
'NUM_DECONV_KERNELS': [4, 4, 4],
'NUM_DECONV_LAYERS': 3,
'NUM_LAYERS': 50},
'PRINT_FREQ': 100,
'TEST': {'BATCH_SIZE': 2,
'BBOX_FILE': '',
'BBOX_THRE': 1.0,
'DETECTOR': 'fpn_dcn',
'DETECTOR_DIR': '',
'HEATMAP_LOCATION_FILE': 'predicted_heatmaps.h5',
'IMAGE_THRE': 0.1,
'IN_VIS_THRE': 0.0,
'MATCH_IOU_THRE': 0.3,
'MODEL_FILE': '',
'NMS_THRE': 0.6,
'OKS_THRE': 0.5,
'POST_PROCESS': False,
'SHIFT_HEATMAP': False,
'STATE': '',
'USE_GT_BBOX': True},
'TRAIN': {'BATCH_SIZE': 2,
'BEGIN_EPOCH': 0,
'END_EPOCH': 30,
'GAMMA1': 0.99,
'GAMMA2': 0.0,
'LR': 0.001,
'LR_FACTOR': 0.1,
'LR_STEP': [20, 25],
'MOMENTUM': 0.9,
'NESTEROV': False,
'OPTIMIZER': 'adam',
'RESUME': True,
'SHUFFLE': True,
'WD': 0.0001},
'WORKERS': 4}
=> loading pretrained model models/pytorch/imagenet/resnet50-19c8e357.pth
=> init deconv weights from normal distribution
=> init 0.weight as normal(0, 0.001)
=> init 0.bias as 0
=> init 1.weight as 1
=> init 1.bias as 0
=> init 3.weight as normal(0, 0.001)
=> init 3.bias as 0
=> init 4.weight as 1
=> init 4.bias as 0
=> init 6.weight as normal(0, 0.001)
=> init 6.bias as 0
=> init 7.weight as 1
=> init 7.bias as 0
=> init final conv weights from normal distribution
=> init 8.weight as normal(0, 0.001)
=> init 8.bias as 0
MultiViewPose(
(resnet): PoseResNet(
(conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
(layer1): Sequential(
(0): Bottleneck(
(conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(downsample): Sequential(
(0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): Bottleneck(
(conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(2): Bottleneck(
(conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
)
(layer2): Sequential(
(0): Bottleneck(
(conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(downsample): Sequential(
(0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): Bottleneck(
(conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(2): Bottleneck(
(conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(3): Bottleneck(
(conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
)
(layer3): Sequential(
(0): Bottleneck(
(conv1): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(downsample): Sequential(
(0): Conv2d(512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(2): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(3): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(4): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(5): Bottleneck(
(conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
)
(layer4): Sequential(
(0): Bottleneck(
(conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(downsample): Sequential(
(0): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): Bottleneck(
(conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(2): Bottleneck(
(conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
)
(deconv_layers): Sequential(
(0): ConvTranspose2d(2048, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): ConvTranspose2d(256, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(4): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(5): ReLU(inplace=True)
(6): ConvTranspose2d(256, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(7): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(8): ReLU(inplace=True)
)
(final_layer): Conv2d(256, 20, kernel_size=(1, 1), stride=(1, 1))
)
(aggre_layer): Aggregation(
(aggre): ModuleList(
(0): ChannelWiseFC()
(1): ChannelWiseFC()
(2): ChannelWiseFC()
(3): ChannelWiseFC()
(4): ChannelWiseFC()
(5): ChannelWiseFC()
(6): ChannelWiseFC()
(7): ChannelWiseFC()
(8): ChannelWiseFC()
(9): ChannelWiseFC()
(10): ChannelWiseFC()
(11): ChannelWiseFC()
)
)
)
/home/rjajodia/.conda/envs/crossview/lib/python3.8/site-packages/torch/nn/reduction.py:44: UserWarning: size_average and reduce args will be deprecated, please use reduction='mean' instead.
warnings.warn(warning.format(ret))
=> no checkpoint found at output/mixed/multiview_pose_resnet_50/320_fusion/checkpoint.pth.tar
/home/rjajodia/.conda/envs/crossview/lib/python3.8/site-packages/json_tricks/nonp.py:219: JsonTricksDeprecation: json_tricks.load(s) stripped some comments, but ignore_comments was not passed; in the next major release, the behaviour when ignore_comments is not passed will change; it is recommended to explicitly pass ignore_comments=True if you want to strip comments; see mverleg/pyjson_tricks#74
warnings.warn('json_tricks.load(s) stripped some comments, but ignore_comments was '
/home/rjajodia/.conda/envs/crossview/lib/python3.8/site-packages/torch/optim/lr_scheduler.py:118: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
warnings.warn("Detected call of lr_scheduler.step() before optimizer.step(). "
Traceback (most recent call last):
File "run/pose2d/train.py", line 189, in
main()
File "run/pose2d/train.py", line 160, in main
train(config, train_loader, model, criterion, optimizer, epoch,
File "/home/rjajodia/crossview/run/pose2d/../../lib/core/function.py", line 78, in train
optim.step()
File "/home/rjajodia/.conda/envs/crossview/lib/python3.8/site-packages/torch/optim/lr_scheduler.py", line 67, in wrapper
return wrapped(*args, **kwargs)
File "/home/rjajodia/.conda/envs/crossview/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
return func(*args, **kwargs)
File "/home/rjajodia/.conda/envs/crossview/lib/python3.8/site-packages/torch/optim/adam.py", line 107, in step
denom = (exp_avg_sq.sqrt() / math.sqrt(bias_correction2)).add_(group['eps'])
RuntimeError: CUDA out of memory. Tried to allocate 158.00 MiB (GPU 0; 7.80 GiB total capacity; 6.48 GiB already allocated; 100.19 MiB free; 6.66 GiB reserved in total by PyTorch)

Hello. I got this error while trying to train. Any fixes for this?
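
A common workaround for this kind of out-of-memory error is to lower the per-GPU batch size in the experiment yaml (the exact config key depends on the repo's schema), or to keep the effective batch size by accumulating gradients over several smaller batches. The snippet below is a minimal, generic PyTorch sketch of gradient accumulation, not the repo's own training loop; model, criterion, optimizer, and train_loader are placeholders.

import torch

def train_with_grad_accumulation(model, criterion, optimizer, train_loader, accum_steps=4):
    """Accumulate gradients over `accum_steps` mini-batches to emulate a larger batch."""
    model.train()
    optimizer.zero_grad()
    for step, (inputs, targets) in enumerate(train_loader):
        inputs, targets = inputs.cuda(), targets.cuda()
        # scale the loss so the accumulated gradient matches a full-size batch
        loss = criterion(model(inputs), targets) / accum_steps
        loss.backward()
        if (step + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()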

pairwise.pkl file generation

Dear authors,

I am currently trying to reproduce the results from the paper, and I would like to know how the sparse initial matrices were generated. From the pickle file we can see that a 4096x4096 matrix is given for each joint pair forming a limb. My current guess is that dim = (4096, 4096) comes from the initial 16x16x16 voxel space. How exactly were the 1s and 0s in the matrix generated? By some Gaussian around the identity (i.e. the same voxel) for joints J_0 and J_1?

Thanks a lot for the help!

Best,
Martin
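
For reference, a quick way to check what the pickle actually contains is shown below. This is a minimal inspection sketch: the key name 'pairwise_constrain' is taken from another issue in this thread, and the surrounding structure (a dict of limb pairs mapping to 4096x4096 sparse matrices, one bin per voxel of the 16x16x16 grid) is an assumption.

import pickle

with open('data/pict/pairwise.pkl', 'rb') as f:
    data = pickle.load(f)

print(type(data))
if isinstance(data, dict):
    print('top-level keys:', list(data.keys()))
    pairwise = data.get('pairwise_constrain', data)
    for limb, mat in list(pairwise.items())[:3]:
        # each entry is expected to be a (4096, 4096) sparse/binary matrix,
        # i.e. one bin per voxel of the assumed 16x16x16 grid
        print(limb, type(mat), getattr(mat, 'shape', None))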

Question about Human3.6m dataset

Hi, I got the H36M dataset from someone, and it consists of images along with .h5 and .mat annotation files.
Directories = images/s_{:02d}_act_{:02d}_subact_{:02d}_ca_{:02d}
image_format = s_{:02d}_act_{:02d}_subact_{:02d}_ca_{:02d}_{:06d}.jpg

I'm unable to generate the annotation files in pickle format using your H36M-Toolbox.
Kindly help me understand how to generate the annotations if the data is already in the above-mentioned format. Thanks

Question about live Cameras

Hi dear Chunyu Wang, thanks for such nice work. I want to know whether this work can be used with live cameras instead of images. I want to use 2 live cameras. Is that possible with your approach? I have read your paper, but I'm still confused about this.

A problem in "Generalization to New Camera Setups"

You present three comparable experiments in "7.5 Generalization to New Camera Setups". In the first, RPSM is applied directly to obtain the 3D pose and the error is 109 mm. In the second, the 2D estimator's output is used as pseudo labels to train the network without the fusion layer, and the error decreases to 61 mm. In the third, the fusion layer is added and the error is 43 mm.
I'm wondering why the latter two experiments perform better than the first. The latter two merely use pseudo labels as ground truth to train the network, while the first uses that "ground truth" directly. I would expect a network to produce outputs close to its training labels but never to exceed them. So how can the first experiment, which uses those labels directly, perform worse than experiments that only approximate them?

In the wild dataset

Thank you for open-sourcing the code!
I want to try 3D pose estimation on in-the-wild video.

Is it possible to run it on an in-the-wild dataset?

Questions about the organization of H36M image data

Hi Chunyu,

A great work! Thank you for releasing the code!

I ran into a small problem when organizing the H36M image data. After downloading and processing the data from https://github.com/anibali/h36m-fetch, the images are organized as:

${POSE_ROOT}
|-- data
`-- |-- h36m
    `-- |-- annot
        |   |-- h36m_train.pkl
        |   |-- h36m_validation.pkl
        `-- images.zip
            `-- images
                |-- S1
                    |-- Directions-1
                        |-- imageSequence
                            |--54138969
                            |--55011271
                            |--58860488
                            |--60457274
                        |-- annot.h5
                    |-- Directions-2
                |-- S11

As you can see, this is inconsistent with the description in INSTALL.md. Do you have another processing script? If so, could you share it?
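
One possible workaround is to rename the h36m-fetch output into the s_XX_act_XX_subact_XX_ca_XX layout this repo expects. The sketch below is only illustrative: the camera-id ordering and the action/subaction numbering are assumptions (the official H36M-Toolbox defines the exact mapping), so the placeholder dictionaries must be filled in to match it, and frame numbering has to line up with the indices used in the .pkl annotations.

import os
import shutil

# Hypothetical mappings -- fill these in to match the H36M-Toolbox conventions.
CAMERA_INDEX = {'54138969': 1, '55011271': 2, '58860488': 3, '60457274': 4}   # assumed ordering
ACTION_INDEX = {'Directions-1': (2, 1), 'Directions-2': (2, 2)}               # (act, subact), incomplete example

def reorganize(src_root, dst_root):
    """Copy h36m-fetch style frames into s_XX_act_XX_subact_XX_ca_XX directories."""
    for subject in sorted(os.listdir(src_root)):                      # e.g. 'S1'
        subj_id = int(subject.lstrip('S'))
        subj_dir = os.path.join(src_root, subject)
        for action in sorted(os.listdir(subj_dir)):                   # e.g. 'Directions-1'
            if action not in ACTION_INDEX:
                continue
            act, subact = ACTION_INDEX[action]
            seq_dir = os.path.join(subj_dir, action, 'imageSequence')
            for cam in sorted(os.listdir(seq_dir)):                   # e.g. '54138969'
                ca = CAMERA_INDEX[cam]
                dst = os.path.join(dst_root, 's_%02d_act_%02d_subact_%02d_ca_%02d'
                                   % (subj_id, act, subact, ca))
                os.makedirs(dst, exist_ok=True)
                for i, frame in enumerate(sorted(os.listdir(os.path.join(seq_dir, cam))), 1):
                    # NOTE: this renumbers frames from 1; make sure it matches the annotations.
                    dst_name = ('s_%02d_act_%02d_subact_%02d_ca_%02d_%06d.jpg'
                                % (subj_id, act, subact, ca, i))
                    shutil.copy(os.path.join(seq_dir, cam, frame), os.path.join(dst, dst_name))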

Question about camera calibration

Hi,

It is unclear to me whether you provide camera calibration information to your detection network. If so, how could I supply my own calibration information to use your code with images acquired from my own multi-camera setup?

Thank you very much!

Best regards
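
For context, the H36M annotations prepared with the H36M-Toolbox typically include per-view camera parameters, and the cross-view fusion needs them to relate pixels across views. The snippet below is a minimal pinhole-projection sketch under assumed parameter names (R, T, f, c); the exact keys and the distortion handling in this repo's camera format may differ.

import numpy as np

def project_point(X_world, camera):
    """Project a 3D point (world coordinates) into pixel coordinates.

    `camera` is assumed to hold R (3x3 world-to-camera rotation), T (3x1 camera
    center in world coordinates), f (2,) focal lengths and c (2,) principal point;
    lens distortion is ignored in this sketch.
    """
    R, T = np.asarray(camera['R']), np.asarray(camera['T']).reshape(3, 1)
    f, c = np.asarray(camera['f']).reshape(2), np.asarray(camera['c']).reshape(2)
    X_cam = R @ (np.asarray(X_world).reshape(3, 1) - T)   # world -> camera frame
    x = X_cam[:2, 0] / X_cam[2, 0]                        # perspective division
    return f * x + c                                      # apply intrinsics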

A question in 3.1 Implementation

Hi, thanks for your great work, but I have one question about cross-view fusion. In 3.1 Implementation, you note that "Different channels of the feature maps share the same weights". These "weights" are what make non-corresponding locations on the epipolar line contribute little or nothing to the fusion. However, the corresponding location should be determined by the depth of the corresponding 3D joint, right? Therefore, in my opinion, the depth of the 3D joint should influence these "weights". I'm wondering why you say that different channels share the same weights, since different joints should have different depths. Could you give me a more specific explanation? Thank you very much.
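
The fusion module listed in the model dump above is a ChannelWiseFC: a single learned weight matrix over spatial locations applied identically to every channel. A minimal sketch of that idea (not the repo's exact implementation; the class name and initialization here are illustrative) is:

import torch
import torch.nn as nn

class ChannelWiseFCSketch(nn.Module):
    """Fuse a feature map from another view with one (H*W x H*W) weight matrix shared by all channels."""
    def __init__(self, size):          # size = H * W of the feature map
        super().__init__()
        self.weight = nn.Parameter(torch.eye(size) + 1e-3 * torch.randn(size, size))

    def forward(self, x):              # x: (N, C, H, W) features from the other view
        n, c, h, w = x.shape
        flat = x.view(n * c, h * w)    # every channel is treated identically
        fused = flat @ self.weight     # same spatial mixing weights for all channels/joints
        return fused.view(n, c, h, w)

As I read it, sharing one matrix across channels means the learned weights can only encode the fixed view geometry (which locations in the other view may correspond, i.e. the epipolar line), while the unknown depth is handled by letting all plausible locations along that line contribute; this is an interpretation, not an authoritative answer.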

Missing model_best.pth.tar???

Hello, when I run this command: python run/pose2d/valid.py --cfg experiments-local/mixed/resnet50/256_fusion.yaml
I get this error: FileNotFoundError: No such file or directory: 'output/mixed/multiview_pose_resnet_50/256_fusion/model_best.pth.tar'.
Could you tell me why? Thank you very much.

visualize video

How do you generate the demo video? Could you share the code for visualizing the video?

aggregation

Congrats on your great work!
Could you please explain a little bit about your code?

In your multiview_pose_resnet.py,

why does the aggregation function have its weights predefined as weights=[0.4, 0.2, 0.2, 0.2]?

I think this part is the "triangulation" part, right?

You don't call your triangulate.py here; instead, you predefine the weights. Why?

Thank you so much!
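
For what it's worth, weights=[0.4, 0.2, 0.2, 0.2] looks like a set of per-view mixing coefficients for heatmap fusion (the current view presumably keeps the largest weight and each of the other three views contributes equally), rather than a triangulation step, which happens later in the 3D stage. A hypothetical sketch of that kind of weighted fusion:

import torch

def fuse_view_heatmaps(heatmaps, warped, weights=(0.4, 0.2, 0.2, 0.2)):
    """Hypothetical per-view fusion: combine a view's own heatmaps with heatmaps
    transferred (e.g. via ChannelWiseFC) from the other three views.

    heatmaps: (N, C, H, W) own-view heatmaps
    warped:   list of three (N, C, H, W) tensors from the other views
    """
    fused = weights[0] * heatmaps
    for w, h in zip(weights[1:], warped):
        fused = fused + w * h
    return fused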

2D detector model

Hi! Thanks for the great work.
I would like to test the network on my own images. Is it possible to release your 2D detector model?

Thanks.

Questions about the human3.6m annotation file

Hi,
First of all, thanks for open-sourcing this great work. I have some questions:
(1) Does this work retrain the 2D pose estimation network using the 2D ground-truth heatmaps of the Human3.6M dataset? I ask because other works only use a state-of-the-art 2D human pose estimator to generate 2D labels for Human3.6M, without using the 2D ground truth.
(2) Another question: how is h36m_train.pkl generated? The file contains "joints_2d" and "joints_3d"; how is each image file combined with its corresponding 2D and 3D joint information?
Thanks in advance!
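
As a starting point, inspecting one record of the pickle shows how images and joints are linked. This is a minimal sketch assuming h36m_train.pkl is a list of per-frame dictionaries whose fields include 'joints_2d' and 'joints_3d' (the two keys mentioned above); any other field names it prints are whatever the H36M-Toolbox actually stored, not assumptions of mine.

import pickle

with open('data/h36m/annot/h36m_train.pkl', 'rb') as f:
    db = pickle.load(f)

print('number of samples:', len(db))
sample = db[0]
print('available keys:', sorted(sample.keys()))
# 'joints_2d' / 'joints_3d' are mentioned in the question; shapes reveal the joint layout
print('2D joints shape:', getattr(sample.get('joints_2d'), 'shape', None))
print('3D joints shape:', getattr(sample.get('joints_3d'), 'shape', None))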

pairwise.pkl

How do you get the “pairwise_constrain” value in the pairwise.pkl file? If I want to use a different graphical model of the human body, what should I do to obtain this value?

How to do "direct triangulation" on baseline "Single"

I have a question about Chapter 6.3, Table 2.

I want to know how you lift the 3D pose from the 2D pose for the baseline "Single". The paper says "direct triangulation".

According to my understanding, the input of "direct triangulation" is multiple views.

But the output of the baseline "Single" is only a single view, so how can "direct triangulation" be applied to it?
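
For reference, "direct triangulation" here is the standard multi-view (DLT) triangulation of each joint from its 2D detections in several calibrated views; "Single" presumably refers to how the 2D heatmaps are estimated (independently per view, without cross-view fusion), while the triangulation itself still uses all views. A minimal DLT sketch (the textbook algorithm, not the repo's code):

import numpy as np

def triangulate_dlt(points_2d, proj_mats):
    """Triangulate one 3D joint from its 2D detections in multiple calibrated views.

    points_2d: (V, 2) pixel coordinates of the same joint in V views
    proj_mats: (V, 3, 4) projection matrices P = K [R | t] for each view
    Returns the 3D point in world coordinates (plain DLT, no robustness).
    """
    A = []
    for (u, v), P in zip(points_2d, proj_mats):
        A.append(u * P[2] - P[0])
        A.append(v * P[2] - P[1])
    A = np.stack(A)                       # (2V, 4) homogeneous system
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                            # null vector = homogeneous 3D point
    return X[:3] / X[3]                   # dehomogenize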

Limb Length Prior download link broken

Hello.
Could you please update the limb-length priors download link :)
Both the link in the repository and the one under #6 take me to a login page. Please help.
Thank you.
