Comments (34)
Hi @AminSeffo,
Sounds good! :)
- The training config will be automatically retrieved by the exp_group/my_autoencoder you define in the m3_template.cfg. So in particular here, where you define the mapping from class to the name and group of your AAE model:
  It's likely you can keep the default parameters; just make sure your input images are in BGR format. The maskrcnn parameters are irrelevant because you are using your own detector.
- You can simply convert them to a binary mask and apply the mask to the image before cropping and prediction, as done here:
  I once wrote a converter from RLE to binary masks:
https://github.com/thodan/bop_toolkit/blob/af380d7a028b5c44903913e39d652c83a4bc2bdd/bop_toolkit_lib/pycoco_utils.py#L202
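As a rough illustration of the masking step, here is a minimal sketch that decodes an uncompressed COCO-style RLE (run lengths in column-major order, starting with a run of zeros) into a binary mask and zeroes every pixel outside it. The names `rle_to_binary_mask` and `apply_mask` are hypothetical; this is not the code from bop_toolkit or the AAE repository, and compressed RLE strings would need a library such as pycocotools instead.

```python
import numpy as np


def rle_to_binary_mask(counts, height, width):
    """Decode uncompressed COCO-style RLE counts into a (height, width) mask."""
    flat = np.zeros(height * width, dtype=np.uint8)
    position = 0
    value = 0  # runs alternate 0, 1, 0, 1, ... starting with zeros
    for run_length in counts:
        if value == 1:
            flat[position:position + run_length] = 1
        position += run_length
        value = 1 - value
    if position != height * width:
        raise ValueError("RLE counts do not match the mask size")
    # Assumption: the runs are stored in column-major (Fortran) order
    return flat.reshape((height, width), order="F")


def apply_mask(image, mask):
    """Set all pixels outside the binary mask to zero (image is H x W x C)."""
    return image * mask[:, :, None].astype(image.dtype)


if __name__ == "__main__":
    mask = rle_to_binary_mask([1, 2, 3], height=2, width=3)
    image = np.full((2, 3, 3), 255, dtype=np.uint8)
    print(apply_mask(image, mask)[:, :, 0])
```

The masked image can then be cropped with the detector's bounding box and passed on for pose prediction.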
from augmentedautoencoder.
Hi @MartinSmeyer,
thanks again for your reply.
I changed the parameter of class_2_encoder (see below) and I defined it as class 1 in the python script, but I still get this error: 1 not contained in config class_names dict_keys([1]
By the way, I trained the model as in the AAE pipeline instructions, where I only changed the path of the .ply and the VOC background images.
Additional information
[methods]
object_detector = mask_rcnn
object_pose_estimator = auto_pose
[auto_pose]
gpu_memory_fraction = 0.5
color_format = bgr
color_data_type = np.float32
depth_data_type = np.float32
class_2_encoder = {1:"exp_group/my_autoencoder"}
camPose = False
upright = False
topk = 1
pose_visualization = False
[mask_rcnn]
path_to_masks =
inference_time = 0.15
# from test_m3.py
# gt boxes and classes (replace with your favorite detector)
classes = [1]
bboxes = [[860, 511, 929, 667]]
from augmentedautoencoder.
@AminSeffo So you did not rename your experiment group / name when training the AAE? Normally you would put some descriptive names there, but it shouldn't matter.
The problem might be that here the class key is transformed into a string:
Can you try to remove the str()?
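To illustrate why the key type matters, here is a minimal sketch. It assumes the `class_2_encoder` entry from the config is parsed into a Python dict with integer keys; `lookup_encoder` is a hypothetical helper, not the library's actual lookup code.

```python
# Assumption: the config value {1:"exp_group/my_autoencoder"} becomes a dict with an int key
class_2_encoder = {1: "exp_group/my_autoencoder"}


def lookup_encoder(mapping, class_id):
    """Return the experiment name for class_id, with a readable error if missing."""
    if class_id not in mapping:
        raise KeyError(
            f"{class_id!r} not contained in config class_names {list(mapping)!r}"
        )
    return mapping[class_id]


if __name__ == "__main__":
    # An integer key matches the config entry
    print(lookup_encoder(class_2_encoder, 1))
    # A string key does not, even though both print as 1 without quotes
    try:
        lookup_encoder(class_2_encoder, str(1))
    except KeyError as err:
        print("lookup failed:", err)
```

Printing the key with `repr` makes the difference between `1` and `'1'` visible, which a plain `print` hides.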
from augmentedautoencoder.
Hello, could you tell me how to train the 2D detector? Thank you so much!
I used Detectron2 for that. Here is a colab notebook, where you can define a data set and start the training.
from augmentedautoencoder.
Hello, I want to know how to train a 2D detector with the T-LESS training dataset. I would appreciate a reply!
Hey, maybe I can help you with that, but can you please open a new issue for it?
from augmentedautoencoder.
Hey @MartinSmeyer,
thank you again.
I removed str() and it works, but I am not able to visualize it using the -vis flag because of this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 0: invalid start byte
I saw this issue #88, however the training worked already.
from augmentedautoencoder.
Here is the full traceback when running with -vis:
This message will be only logged once.
INFO - 2022-08-01 16:25:09,531 - acceleratesupport - OpenGL_accelerate module loaded
INFO - 2022-08-01 16:25:09,536 - arraydatatype - Using accelerated ArrayDatatype
using egl
('renderer', 'Model paths: ', ['/home/amin/autoencoder_ws/cad_model/nuss_model.ply'])
[0]
Traceback (most recent call last):
File "/home/amin/6d_pose_estimation/test_m3.py", line 60, in <module>
pose_visualizer.render_poses(img, camK, pose_ests, bbs)
File "/home/amin/6d_pose_estimation/visualization/render_pose.py", line 31, in render_poses
bgr, depth,_ = self.renderer.render_many(obj_ids = [self.classes.index(pose_est.name) for pose_est in pose_ests],
File "/home/amin/anaconda3/envs/sixd_pose_detection/lib/python3.7/site-packages/auto_pose/ae/utils.py", line 15, in decorator
setattr(self, attribute, function(self))
File "/home/amin/6d_pose_estimation/visualization/render_pose.py", line 25, in renderer
vertex_scale=float(self.vertex_scale[0])) #1000 for models in meters
File "/home/amin/anaconda3/envs/sixd_pose_detection/lib/python3.7/site-packages/auto_pose/meshrenderer/meshrenderer.py", line 37, in __init__
vert_norms = gu.geo.load_meshes(models_cad_files, vertex_tmp_store_folder, recalculate_normals=True)
File "/home/amin/anaconda3/envs/sixd_pose_detection/lib/python3.7/site-packages/auto_pose/meshrenderer/gl_utils/geometry.py", line 54, in load_meshes
scene = pyassimp.load(model_path, pyassimp.postprocess.aiProcess_Triangulate)
File "/home/amin/anaconda3/envs/sixd_pose_detection/lib/python3.7/site-packages/pyassimp/core.py", line 315, in load
scene = _init(model.contents)
File "/home/amin/anaconda3/envs/sixd_pose_detection/lib/python3.7/site-packages/pyassimp/core.py", line 211, in _init
call_init(obj, target)
File "/home/amin/anaconda3/envs/sixd_pose_detection/lib/python3.7/site-packages/pyassimp/core.py", line 76, in call_init
_init(obj.contents, obj, caller)
File "/home/amin/anaconda3/envs/sixd_pose_detection/lib/python3.7/site-packages/pyassimp/core.py", line 122, in _init
target.name = str(obj.data.decode("utf-8"))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 0: invalid start byte
from augmentedautoencoder.
There seem to be some special characters in your 3D model file. Please try to debug this yourself.
from augmentedautoencoder.
Hey @MartinSmeyer,
I checked that out. I replaced my 3D model with obj_30.ply from the BOP challenge and I am still getting the same error.
I will keep trying to fix this error.
thank you again
from augmentedautoencoder.
That's strange, I just tried it and it works for me, including tless object 30:
Are you using the CAD models or the reconstructed ones? Could you try to change the model type to CAD in your training config before running the visualization?
[Dataset]
MODEL: cad
from augmentedautoencoder.
Which pyassimp version are you using?
from augmentedautoencoder.
I am using pyassimp 3.3.
from augmentedautoencoder.
Try to update to the version pinned in AugmentedAutoencoder/aae_py37_tf26.yml, line 132 in fec781c.
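A quick way to see which version is actually installed in the active environment is sketched below. `installed_version` is a hypothetical helper; on Python 3.7 the sketch assumes the `importlib_metadata` backport is available.

```python
try:
    from importlib import metadata  # Python 3.8+
except ImportError:  # Python 3.7: assumes the importlib_metadata backport is installed
    import importlib_metadata as metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("pyassimp:", installed_version("pyassimp"))
```

Running this inside the conda environment used for the visualization shows the version that is actually imported there.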
from augmentedautoencoder.
I updated but I got this error now :(
('renderer', 'Model paths: ', ['/home/amin/autoencoder_ws/cad_model/nuss_model.ply'])
[0]
100% |########|
Traceback (most recent call last):
File "/home/amin/6d_pose_estimation/test_m3.py", line 64, in <module>
pose_visualizer.render_poses(img, camK, pose_ests, bbs)
File "/home/amin/6d_pose_estimation/visualization/render_pose.py", line 38, in render_poses
far = 10000)
File "/home/amin/anaconda3/envs/sixd_pose_detection/lib/python3.7/site-packages/auto_pose/meshrenderer/meshrenderer.py", line 141, in render_many
assert W <= Renderer.MAX_FBO_WIDTH and H <= Renderer.MAX_FBO_HEIGHT
AssertionError
from augmentedautoencoder.
I think I have problems with my image dimensions. I will check it out and let you know.
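One thing that can be checked when dimension problems are suspected is whether every detected box lies inside the image. The sketch below assumes boxes in [x, y, w, h] pixel format and a known image width and height; `clip_bbox_xywh` and `normalize_bbox_xywh` are hypothetical helpers, not part of auto_pose.

```python
def clip_bbox_xywh(bbox, image_width, image_height):
    """Clip an [x, y, w, h] pixel box to the image; return None if nothing is left."""
    x, y, w, h = bbox
    x_min = max(0, min(x, image_width))
    y_min = max(0, min(y, image_height))
    x_max = max(0, min(x + w, image_width))
    y_max = max(0, min(y + h, image_height))
    if x_max <= x_min or y_max <= y_min:
        return None  # box is empty or entirely outside the image
    return [x_min, y_min, x_max - x_min, y_max - y_min]


def normalize_bbox_xywh(bbox, image_width, image_height):
    """Convert a pixel [x, y, w, h] box to relative (xmin, xmax, ymin, ymax)."""
    x, y, w, h = bbox
    return (x / image_width, (x + w) / image_width,
            y / image_height, (y + h) / image_height)


if __name__ == "__main__":
    # Assumption: a 1280 x 720 image, used here only for illustration
    box = clip_bbox_xywh([1250, 700, 89, 80], image_width=1280, image_height=720)
    print(box)
    print(normalize_bbox_xywh(box, 1280, 720))
```

After clipping, all relative coordinates stay within 0 and 1, so the crop handed to the pose estimator cannot extend beyond the image.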
from augmentedautoencoder.
Hey @MartinSmeyer
I had problems with some dimensions of the bbox from the 2D object detection, and now the rendering works. Here is the output:
I think I am facing an issue with scaling. Here is my .ply model my_model_aae.zip, which I scaled in Blender. I believe I did everything correctly before training the autoencoder (please take a look at the .ply model if you have time).
Here is also a snippet of my_autoencoder.cfg:
[Paths]
MODEL_PATH: /home/amin/autoencoder_ws/cad_model/my_model_aae.ply
BACKGROUND_IMAGES_GLOB: /home/amin/autoencoder_ws/VOCdevkit/VOC2012/JPEGImages/*.jpg
[Dataset]
MODEL: reconst
H: 128
W: 128
C: 3
RADIUS: 700
RENDER_DIMS: (720, 540)
K: [1075.65, 0, 720/2, 0, 1073.90, 540/2, 0, 0, 1]
#Azure Kinect parameters
#RENDER_DIMS: (720, 1280)
#K: [608.1231079101562, 0, 638.6071166992188, 0, 608.0382690429688, 368.2049560546875, 0, 0, 1]
# Scale vertices to mm
VERTEX_SCALE: 1
ANTIALIASING: 1
PAD_FACTOR: 1.
CLIP_NEAR: 10
CLIP_FAR: 10000
NOOF_TRAINING_IMGS: 20000
NOOF_BG_IMGS: 15000
[Augmentation]
REALISTIC_OCCLUSION: False
SQUARE_OCCLUSION: False
MAX_REL_OFFSET: 0.20
CODE: Sequential([
#Sometimes(0.5, PerspectiveTransform(0.05)),
#Sometimes(0.5, CropAndPad(percent=(-0.05, 0.1))),
Sometimes(0.5, Affine(scale=(1.0, 1.2))),
Sometimes(0.5, CoarseDropout( p=0.2, size_percent=0.05) ),
Sometimes(0.5, GaussianBlur(1.2*np.random.rand())),
Sometimes(0.5, Add((-25, 25), per_channel=0.3)),
Sometimes(0.3, Invert(0.2, per_channel=True)),
Sometimes(0.5, Multiply((0.6, 1.4), per_channel=0.5)),
Sometimes(0.5, Multiply((0.6, 1.4))),
Sometimes(0.5, ContrastNormalization((0.5, 2.2), per_channel=0.3))
], random_order=False)
[Embedding]
EMBED_BB: True
MIN_N_VIEWS: 2562
NUM_CYCLO: 36
[Network]
BATCH_NORMALIZATION: False
AUXILIARY_MASK: False
VARIATIONAL: 0
LOSS: L2
BOOTSTRAP_RATIO: 4
NORM_REGULARIZE: 0
LATENT_SPACE_SIZE: 128
NUM_FILTER: [128, 256, 512, 512]
STRIDES: [2, 2, 2, 2]
KERNEL_SIZE_ENCODER: 5
KERNEL_SIZE_DECODER: 5
[Training]
OPTIMIZER: Adam
NUM_ITER: 30000
BATCH_SIZE: 64
LEARNING_RATE: 2e-4
SAVE_INTERVAL: 10000
[Queue]
# OPENGL_RENDER_QUEUE_SIZE: 500
NUM_THREADS: 10
QUEUE_SIZE: 50
And from test_m3.py :
import cv2
import numpy as np
import os
import argparse
import object_detector
from auto_pose.m3_interface.m3_interfaces import BoundingBox
from auto_pose.m3_interface.ae_pose_estimator import AePoseEstimator
from webcam_video_stream import WebcamVideoStream
dir_name = os.path.dirname(os.path.abspath(__file__))
default_cfg = os.path.join(dir_name, '../cfg_m3vision/m3_template_pose.cfg')
parser = argparse.ArgumentParser()
parser.add_argument("--m3_config_path", type=str, default=default_cfg)
parser.add_argument("-vis", action='store_true', default=False)
args = parser.parse_args()
if os.environ.get('AE_WORKSPACE_PATH') == None:
    print('Please define a workspace path:\n')
    print('export AE_WORKSPACE_PATH=/path/to/workspace\n')
    exit(-1)
img = cv2.imread("/home/amin/6d_pose_estimation/image_test.png")
H,W,_ = img.shape
# Azure Kinect camera parameters
f_x=608.1231079101562
f_y=608.0382690429688
c_x=638.6071166992188
c_y=368.2049560546875
camK = np.array([f_x, 0., c_x, 0., f_y, c_y, 0., 0., 1.]).reshape(3, 3)
# gt boxes and classes (replace with your favorite detector)
classes = [1]
bboxes=[[723, 366, 89, 80]]
my_detector=object_detector.Detector()
nuss_detection=my_detector.prediction(img)
bbs = []
h,w = float(H), float(W)
for b,c in zip(bboxes, classes):
    bbs.append(BoundingBox(xmin=b[0]/w, xmax=(b[0]+b[2])/w , ymin=b[1]/h, ymax=(b[1]+b[3])/h, classes={(c):1.0}))
# MultiPath Encoder Initialization
aae_pose_estimator = AePoseEstimator("/home/amin/6d_pose_estimation/cfg_m3vision/m3_template_pose.cfg")
# Predict 6-DoF poses
pose_ests = aae_pose_estimator.process(bbs,img,camK)
print(np.array([{p.name:p.trafo} for p in pose_ests]))
# Visualize
if args.vis:
    from visualization.render_pose import PoseVisualizer
    pose_visualizer = PoseVisualizer(aae_pose_estimator)
    pose_visualizer.render_poses(img, camK, pose_ests, bbs)
And finally the used m3_template:
[methods]
object_detector = mask_rcnn
object_pose_estimator = auto_pose
[auto_pose]
gpu_memory_fraction = 0.5
color_format = bgr
color_data_type = np.float32
depth_data_type = np.float32
class_2_encoder = {1:"exp_group/my_autoencoder"}
camPose = False
upright = False
topk = 1
pose_visualization = False
[mask_rcnn]
path_to_masks =
inference_time = 0.15
from augmentedautoencoder.
Hey @MartinSmeyer,
do you have any suggested solutions?
from augmentedautoencoder.
Hey,
#Azure Kinect parameters
#RENDER_DIMS: (720, 1280)
#K: [608.1231079101562, 0, 638.6071166992188, 0, 608.0382690429688, 368.2049560546875, 0, 0, 1]
It's best to use these for training, but render dims is the wrong way around, should be
RENDER_DIMS: (1280, 720)
The pad factor of 1.2 should not be changed.
The 3D model geometry seems okay at first glance. Try to train again with the above parameters and use the ae_train ... -d option to visualize the reconstruction targets before training.
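A swapped width and height can be spotted with a rough plausibility check: for most cameras the principal point (c_x, c_y) lies near the image centre. The sketch below is only a heuristic. It assumes RENDER_DIMS is given as (width, height) as stated above, the helper names are hypothetical, and the 25 % tolerance is an arbitrary choice.

```python
def principal_point_offset(K, render_dims):
    """Relative offset of (c_x, c_y) from the image centre for (width, height) dims."""
    width, height = render_dims
    # K is the flattened row-major 3x3 matrix: [f_x, 0, c_x, 0, f_y, c_y, 0, 0, 1]
    c_x, c_y = K[2], K[5]
    return abs(c_x - width / 2) / width, abs(c_y - height / 2) / height


def dims_look_consistent(K, render_dims, tolerance=0.25):
    """Heuristic: True if the principal point is within `tolerance` of the centre."""
    dx, dy = principal_point_offset(K, render_dims)
    return dx <= tolerance and dy <= tolerance


if __name__ == "__main__":
    # Azure Kinect intrinsics quoted in the thread
    K = [608.1231079101562, 0, 638.6071166992188,
         0, 608.0382690429688, 368.2049560546875,
         0, 0, 1]
    print(dims_look_consistent(K, (1280, 720)))
    print(dims_look_consistent(K, (720, 1280)))
```

With these intrinsics the principal point sits close to the centre of a 1280 x 720 image but far from the centre of a 720 x 1280 one, which is what the check flags.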
from augmentedautoencoder.
Thank you again @MartinSmeyer
I corrected the render dims and the K matrix as you suggested, but I am still getting the same visualization. Moreover, I centered the model using MeshLab and it now looks like this:
from augmentedautoencoder.
Can you please post the image you get with ae_train ... -d?
from augmentedautoencoder.
@MartinSmeyer of course.
Here are the images generated using ae_train ... -d and the model centered using MeshLab:
from augmentedautoencoder.
Okay, although the 3D model is hollow and without texture, the size looks alright.
What is the pose that you print out?
from augmentedautoencoder.
And did you retrain with the Azure Kinect camK and recreate the embedding?
from augmentedautoencoder.
Shouldn't this classes={(c):1.0}) be this classes={str(c):1.0})?
from augmentedautoencoder.
Sorry, I closed the issue by mistake.
from augmentedautoencoder.
Yes, I retrained with the Azure Kinect camK and recreated the embedding.
from augmentedautoencoder.
Regarding classes={str(c):1.0}): with str I got some errors, we discussed that before: #113 (comment)
from augmentedautoencoder.
Ah yes. I would just also remove the ().
from augmentedautoencoder.
Oh okay, I removed it: bbs.append(BoundingBox(xmin=b[0]/w, xmax=(b[0]+b[2])/w , ymin=b[1]/h, ymax=(b[1]+b[3])/h, classes={c:1.0}))
from augmentedautoencoder.
Oh, it's in meters, although your 3D model is in mm. Try to add mm=True as an argument here:
from augmentedautoencoder.
But where is the translation vector?
from augmentedautoencoder.
It's a 4x4 homogeneous matrix. ;) t = [149.27, 45.84, 687.40] in mm
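For reference, here is a minimal sketch of how rotation and translation can be read out of a 4x4 homogeneous matrix, with an optional unit conversion. `split_pose` is a hypothetical helper, and the example matrix uses an identity rotation as a placeholder together with the translation quoted above expressed in meters.

```python
import numpy as np


def split_pose(trafo, scale=1.0):
    """Split a 4x4 homogeneous transform into rotation (3x3) and translation (3,)."""
    trafo = np.asarray(trafo, dtype=float)
    if trafo.shape != (4, 4):
        raise ValueError("expected a 4x4 homogeneous matrix")
    rotation = trafo[:3, :3]            # upper-left 3x3 block
    translation = trafo[:3, 3] * scale  # last column, optionally rescaled
    return rotation, translation


if __name__ == "__main__":
    pose_m = np.eye(4)  # placeholder rotation, for illustration only
    pose_m[:3, 3] = [0.14927, 0.04584, 0.68740]  # translation in meters
    R, t_mm = split_pose(pose_m, scale=1000.0)   # meters -> millimeters
    print(t_mm)
```

The last row of such a matrix is [0, 0, 0, 1] and carries no pose information.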
from augmentedautoencoder.
Hi @AminSeffo, I'm glad that someone else is interested in using this as a real-time pose estimator. I'm currently trying to implement my own.
How were your results? I'm also curious about why you chose to go for Detectron2 instead of, for instance, Keras RetinaNet.
Is there, by any chance, the possibility that you could share your work/pipeline, or some indications of how you managed to make it work?
Thanks in advance.
from augmentedautoencoder.