Comments (14)

eigenvivek commented on June 25, 2024

Hi @CYXYZ, in the version of DiffPose on the refactor-se3 branch, there shouldn't be any calls to import pytorch3d.

When I search for it, I don't see it in the code: https://github.com/search?q=repo%3Aeigenvivek%2FDiffPose%20pytorch3d&type=code

My advice would be:

1. Install a clean version of the diffpose environment with conda from the updated environment.yml and activate it
2. Install the latest version of diffdrr from source (git clone https://github.com/eigenvivek/DiffDRR.git; cd DiffDRR; pip install -e .)
3. Install the local version of diffpose on the refactor-se3 branch (cd DiffPose; pip install -e .)
4. Try rerunning training
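
The installation steps above can be sanity-checked before retraining. The following is a minimal sketch that uses only the Python standard library; the helper name `report_environment` is hypothetical, and the package names (`diffdrr`, `diffpose`, `pytorch3d`) are simply the ones mentioned in this thread. It reports installed versions and flags whether `pytorch3d` is still importable, which a clean refactor-se3 environment should not require.

```python
from importlib import metadata, util


def report_environment(packages=("diffdrr", "diffpose", "pytorch3d")):
    """Return {package: version or None} for the given distribution names."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


if __name__ == "__main__":
    for name, version in report_environment().items():
        print(f"{name}: {version or 'not installed'}")
    # find_spec returns None when a top-level module cannot be located
    if util.find_spec("pytorch3d") is not None:
        print("pytorch3d is importable; the environment may not be clean")
```

Running this inside the activated conda environment gives a quick record of what is actually installed, which is useful to paste into a bug report.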

When using the latest version, I've been able to train all models without crashing. However, it's entirely possible that there's some other bug in the geodesic distance code that produces a NaN during training. Just because I haven't seen it yet doesn't mean it's not real!

Please try retraining with a clean environment and let me know if the crashing persists.

from diffpose.

CYXYZ commented on June 25, 2024

Dear Vivek, something goes wrong when I run train.py on the refactor-se3 branch:
Traceback (most recent call last):
  File "/home/data/cyx/autodl-tmp/DiffPose-refactor-se3/experiments/deepfluoro/train.py", line 236, in <module>
    main(id_number)
  File "/home/data/cyx/autodl-tmp/DiffPose-refactor-se3/experiments/deepfluoro/train.py", line 207, in main
    train(
  File "/home/data/cyx/autodl-tmp/DiffPose-refactor-se3/experiments/deepfluoro/train.py", line 70, in train
    img = drr(None, None, None, pose=pose, bone_attenuation_multiplier=contrast)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/data/cyx/miniconda3/envs/diffpose/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/data/cyx/miniconda3/envs/diffpose/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/data/cyx/miniconda3/envs/diffpose/lib/python3.12/site-packages/diffdrr/drr.py", line 126, in forward
    source, target = self.detector(pose)
                     ^^^^^^^^^^^^^^^^^^^
  File "/home/data/cyx/miniconda3/envs/diffpose/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/data/cyx/miniconda3/envs/diffpose/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/data/cyx/miniconda3/envs/diffpose/lib/python3.12/site-packages/diffdrr/detector.py", line 104, in forward
    source = pose(self.source)
             ^^^^^^^^^^^^^^^^^
TypeError: 'NoneType' object is not callable
The above is what I encountered when I executed it in a new environment and ran python setup.py install. Have you encountered this problem?

CYXYZ commented on June 25, 2024

I tested the erroneous data with the script posted in full in my next comment and found that NaN values still occur:
tensor([0.0120, 0.0122, nan, 0.0235], device='cuda:0')
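
The thread does not identify where the NaN originates. One common cause in rotation-distance code, offered here only as an assumption and not as a diagnosis of `se3_log_map`, is an `acos` evaluated on a value that rounding has pushed slightly above 1. This happens most easily when two poses are nearly identical, as in the third pair, where the result is nan. The sketch below uses a hypothetical helper, `rotation_angle`, to show the effect and a clamp that avoids it in the forward pass.

```python
import torch


def rotation_angle(R: torch.Tensor, clamp: bool = True) -> torch.Tensor:
    """Rotation angle (radians) of a batch of 3x3 rotation matrices."""
    # For a rotation matrix, trace(R) = 1 + 2 * cos(theta)
    cos_theta = (R.diagonal(dim1=-2, dim2=-1).sum(-1) - 1.0) / 2.0
    if clamp:
        # Rounding can push cos_theta slightly outside [-1, 1], where acos is NaN
        cos_theta = cos_theta.clamp(-1.0, 1.0)
    return torch.acos(cos_theta)


# A matrix that is almost the identity but, as if through rounding, not exactly orthogonal
R = torch.eye(3).unsqueeze(0) * (1.0 + 1e-4)
print(rotation_angle(R, clamp=False))  # NaN because cos_theta > 1
print(rotation_angle(R, clamp=True))   # zero angle after clamping
```

Clamping fixes the forward value only; the gradient of `acos` is still unbounded at ±1, so a training loss may need an additional small margin inside the clamp range.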

CYXYZ commented on June 25, 2024

The complete test code is displayed here.

from typing import Optional

import torch
from beartype import beartype
from diffdrr.utils import convert as convert_so3
from jaxtyping import Float, jaxtyped
from pytorch3d.transforms import Transform3d
from pytorchse3.se3 import se3_log_map


@beartype
class RigidTransform(Transform3d):
    """Wrapper of pytorch3d.transforms.Transform3d with extra functionalities."""

    @jaxtyped
    def __init__(
        self,
        R: Float[torch.Tensor, "..."],
        t: Float[torch.Tensor, "... 3"],
        parameterization: str = "matrix",
        convention: Optional[str] = None,
        device=None,
        dtype=torch.float32,
    ):
        if device is None and (R.device == t.device):
            device = R.device

        R = convert_so3(R, parameterization, "matrix", convention)
        if R.dim() == 2 and t.dim() == 1:
            R = R.unsqueeze(0)
            t = t.unsqueeze(0)
        assert (batch_size := len(R)) == len(t), "R and t need same batch size"

        matrix = torch.zeros(batch_size, 4, 4, device=device, dtype=dtype)
        matrix[..., :3, :3] = R.transpose(-1, -2)
        matrix[..., 3, :3] = t
        matrix[..., 3, 3] = 1

        super().__init__(matrix=matrix, device=device, dtype=dtype)

    def get_rotation(self, parameterization=None, convention=None):
        R = self.get_matrix()[..., :3, :3].transpose(-1, -2)
        if parameterization is not None:
            R = convert_so3(R, "matrix", parameterization, None, convention)
        return R

    def get_translation(self):
        return self.get_matrix()[..., 3, :3]

    def inverse(self):
        """Closed-form inverse for rigid transforms."""
        R = self.get_rotation().transpose(-1, -2)
        t = self.get_translation()
        t = -torch.einsum("bij,bj->bi", R, t)
        return RigidTransform(R, t, device=self.device, dtype=self.dtype)

    def compose(self, other):
        T = super().compose(other)
        R = T.get_matrix()[..., :3, :3].transpose(-1, -2)
        t = T.get_matrix()[..., 3, :3]
        return RigidTransform(R, t, device=self.device, dtype=self.dtype)

    def clone(self):
        R = self.get_matrix()[..., :3, :3].transpose(-1, -2).clone()
        t = self.get_matrix()[..., 3, :3].clone()
        return RigidTransform(R, t, device=self.device, dtype=self.dtype)

    def get_se3_log(self):
        return se3_log_map(self.get_matrix().transpose(-1, -2))
    

class GeodesicSE3(torch.nn.Module):
    """Calculate the distance between transforms in the log-space of SE(3)."""

    def __init__(self):
        super().__init__()

    @beartype
    @jaxtyped
    def forward(
        self,
        pose_1: RigidTransform,
        pose_2: RigidTransform,
    ) -> Float[torch.Tensor, "b"]:
        return pose_2.compose(pose_1.inverse()).get_se3_log().norm(dim=1)

# Example pose matrices
pose = torch.tensor([[[ 1.8144e-01,  9.8316e-01, -2.1685e-02,  0.0000e+00],
         [ 2.5295e-01, -6.7969e-02, -9.6509e-01,  0.0000e+00],
         [-9.5031e-01,  1.6962e-01, -2.6102e-01,  0.0000e+00],
         [ 2.2374e+02,  3.8585e+02,  2.2170e+02,  1.0000e+00]],

        [[ 1.7733e-01,  9.8249e-01,  5.7100e-02,  0.0000e+00],
         [-6.8154e-02,  7.0139e-02, -9.9521e-01,  0.0000e+00],
         [-9.8179e-01,  1.7258e-01,  7.9398e-02,  0.0000e+00],
         [ 2.0946e+02,  3.2825e+02,  2.1824e+02,  1.0000e+00]],

        [[-6.4248e-02,  9.9234e-01,  1.0555e-01,  0.0000e+00],
         [ 1.7417e-02,  1.0687e-01, -9.9412e-01,  0.0000e+00],
         [-9.9778e-01, -6.2032e-02, -2.4150e-02,  0.0000e+00],
         [ 1.6308e+02,  3.7611e+02,  2.2729e+02,  1.0000e+00]],

        [[ 2.2041e-01,  8.7520e-01, -4.3063e-01,  0.0000e+00],
         [-6.2944e-02, -4.2780e-01, -9.0168e-01,  0.0000e+00],
         [-9.7337e-01,  2.2585e-01, -3.9205e-02,  0.0000e+00],
         [ 3.0673e+02,  3.1338e+02,  5.1707e+01,  1.0000e+00]]], device='cuda:0')
    
pred_pose = torch.tensor(  
    [[[ 1.8104e-01,  9.8323e-01, -2.1787e-02,  0.0000e+00],
         [ 2.4225e-01, -6.6052e-02, -9.6796e-01,  0.0000e+00],
         [-9.5317e-01,  1.6996e-01, -2.5014e-01,  0.0000e+00],
         [ 2.2539e+02,  3.8712e+02,  2.2106e+02,  1.0000e+00]],

        [[ 1.7060e-01,  9.8356e-01,  5.9136e-02,  0.0000e+00],
         [-7.7165e-02,  7.3167e-02, -9.9433e-01,  0.0000e+00],
         [-9.8231e-01,  1.6507e-01,  8.8379e-02,  0.0000e+00],
         [ 2.0866e+02,  3.3049e+02,  2.1791e+02,  1.0000e+00]],

        [[-6.3945e-02,  9.9236e-01,  1.0550e-01,  0.0000e+00],
         [ 1.7204e-02,  1.0680e-01, -9.9413e-01,  0.0000e+00],
         [-9.9781e-01, -6.1755e-02, -2.3902e-02,  0.0000e+00],
         [ 1.6372e+02,  3.7662e+02,  2.2650e+02,  1.0000e+00]],

        [[ 1.9835e-01,  8.7822e-01, -4.3518e-01,  0.0000e+00],
         [-5.8063e-02, -4.3270e-01, -8.9967e-01,  0.0000e+00],
         [-9.7841e-01,  2.0372e-01, -3.4833e-02,  0.0000e+00],
         [ 3.0021e+02,  3.1163e+02,  5.1693e+01,  1.0000e+00]]], device='cuda:0')

# Detached copies of the pose tensors (they already live on cuda:0)
pose = pose.clone().detach()
pred_pose = pred_pose.clone().detach()

# Note: these matrices store the translation in the last row, so the slice
# [..., :3, 3] selects the all-zero last column; it is kept as in the original test.
pose = RigidTransform(R=pose[..., :3, :3], t=pose[..., :3, 3])
pred_pose = RigidTransform(R=pred_pose[..., :3, :3], t=pred_pose[..., :3, 3])


# Creating an instance of GeodesicSE3 class
geodesic_calculator = GeodesicSE3()

# Calculate geodesic distance
geodesic = geodesic_calculator(pose, pred_pose)

print(geodesic)

eigenvivek commented on June 25, 2024

Hi @CYXYZ , thanks for pointing out the erroneous code. It's using an older version of the DiffDRR API. I just updated the code and pushed it to the refactor-se3 branch. Please let me know if it's still causing issues.

CYXYZ commented on June 25, 2024

Dear Vivek,

I hope this letter finds you well. I am writing to seek your assistance regarding an issue I encountered while running the refactor-se3 code.

Upon executing the train.py script, I encountered the following error message:


Traceback (most recent call last):                                                                                                                                     
  File "/home/data/cyx/autodl-tmp/DiffPose-refactor-se3/experiments/deepfluoro/train.py", line 235, in <module>
    main(id_number)
  File "/home/data/cyx/autodl-tmp/DiffPose-refactor-se3/experiments/deepfluoro/train.py", line 206, in main
    train(
  File "/home/data/cyx/autodl-tmp/DiffPose-refactor-se3/experiments/deepfluoro/train.py", line 69, in train
    img = drr(None, None, None, pose=pose, bone_attenuation_multiplier=contrast)
  File "/home/data/cyx/miniconda3/envs/diffpose/lib/python3.12/site-packages/diffdrr/drr.py", line 126, in forward
    source, target = self.detector(pose)
  File "/home/data/cyx/miniconda3/envs/diffpose/lib/python3.12/site-packages/diffdrr/detector.py", line 104, in forward
    source = pose(self.source)
TypeError: 'NoneType' object is not callable

It seems that there's an issue with the pose variable being NoneType, resulting in a TypeError when it is being called as a function.

I have reviewed the code, but I couldn't pinpoint the exact source of the problem. Could you please provide some guidance on how to resolve this issue?

Your assistance in resolving this matter would be greatly appreciated.

Thank you for your time and support.

Warm regards,
cyxyz

eigenvivek commented on June 25, 2024

Thanks for letting me know. I'll sit down and debug the code for a few hours and figure out what I messed up.

eigenvivek commented on June 25, 2024

@CYXYZ , did you pull the latest version of the code on refactor-se3?

The error message shows you're calling drr(None, None, None, pose=pose, bone_attenuation_multiplier=contrast), which is not in the latest version of the program.

CYXYZ commented on June 25, 2024

Dear Vivek,

I hope this email finds you well. I wanted to take a moment to express my sincere gratitude for your invaluable guidance with the new code. Thanks to your expertise and support, it runs smoothly without any issues. Your insights and direction have been instrumental in ensuring its successful execution.

Looking forward to our continued collaboration and learning from you in the future.

Warm regards,
cyxyz

eigenvivek commented on June 25, 2024

No problem! I hope using the refactor-se3 branch or the main branch + diffdrr=0.3.9 has worked properly.

JamesQian11 commented on June 25, 2024

Hello CYXYZ,
Because of the NaN problems, I tried the refactor-se3 branch with diffdrr=0.3.9, but ran into this error:
from diffdrr.pose import RigidTransform, convert, make_matrix
ModuleNotFoundError: No module named 'diffdrr.pose'
It looks like diffdrr=0.3.9 doesn't match the branch?

Your help in resolving this issue would be highly appreciated.

Thank you very much!
Kind regards,
James
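
A quick way to confirm whether the installed diffdrr provides the `diffdrr.pose` module that the branch imports is to look the module up without running the training script. This is a minimal sketch using only the standard library; `has_module` is a hypothetical helper name, not part of diffdrr or diffpose.

```python
from importlib import util


def has_module(dotted_name: str) -> bool:
    """Return True if the module can be located (parent packages are imported to do so)."""
    try:
        return util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. "diffdrr") is itself missing
        return False


if __name__ == "__main__":
    if not has_module("diffdrr.pose"):
        print("diffdrr.pose not found; the installed diffdrr may not match the branch")
```

If the check fails, the installed diffdrr version and the checked-out DiffPose branch are probably out of step, and installing a different diffdrr release is the first thing to try.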

CYXYZ commented on June 25, 2024

I use the refactor-se3 branch + diffdrr=0.3.11. It works properly.

JamesQian11 commented on June 25, 2024

Thank you for your kind advice. However, I got this error when I used diffdrr=0.3.11:
Traceback (most recent call last):
  File "train.py", line 11, in <module>
    from diffpose.deepfluoro import DeepFluoroDataset, Transforms, get_random_offset
  File "/root/DS/DiffPose-refactor-se3/diffpose/deepfluoro.py", line 17, in <module>
    from .calibration import perspective_projection
  File "/root/DS/DiffPose-refactor-se3/diffpose/calibration.py", line 17, in <module>
    @jaxtyped(typechecker=beartype)
TypeError: jaxtyped() got an unexpected keyword argument 'typechecker'

The Python version is 3.8, and the other relevant packages are:
diffdrr 0.3.11
diffpose 0.0.1 /root/DS/DiffPose-refactor-se3
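
The error indicates that the installed jaxtyping predates the `typechecker` keyword. A likely explanation, stated here as an assumption that the thread does not confirm, is that pip resolved an older jaxtyping release because newer releases do not support Python 3.8. The sketch below checks whether a callable accepts a given keyword; `accepts_keyword` is a hypothetical helper, and the jaxtyping lookup is skipped when the package is not installed.

```python
import inspect


def accepts_keyword(func, name: str) -> bool:
    """Return True if `func` can be called with keyword argument `name`."""
    params = inspect.signature(func).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if name in params and params[name].kind in keyword_kinds:
        return True
    # A **kwargs catch-all also accepts arbitrary keywords
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


if __name__ == "__main__":
    try:
        import jaxtyping
    except ImportError:
        print("jaxtyping is not installed")
    else:
        ok = accepts_keyword(jaxtyping.jaxtyped, "typechecker")
        print(f"jaxtyped accepts typechecker=: {ok}")
```

If the check prints False, upgrading jaxtyping (which may in turn require a newer Python) is the natural next step to try.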

NaN problems during training about diffpose HOT 14 CLOSED
