Comments (10)
Thanks so much for your patience! I will try again accordingly and hope to get better results!
from scenerf.
Hi @BeileiCui,
Thanks for your interest. I answer your questions below:
- You don't need the registration if your poses are accurate. I only use registration to correct the transformation in KITTI since its poses are erroneous.
- SceneRF doesn't require Velodyne; it reconstructs in camera coordinates. T_velo_2_cam2 and T_cam0_2_cam2 are only used to compute the point cloud registration that fixes the transformation.
- We use these numbers following the SemanticKITTI SSC setting. You can choose other numbers.
from scenerf.
Hi, thanks for your reply! I just have another question. I have built up the training pipeline with SCARED now, but some training logs look a little weird compared to the logs you mentioned here. The settings are mostly the same as your original ones except for the image size.
Do you have any suggestions on what the problem might be, and on how I can resolve it?
Thanks for your time; I look forward to your reply!
from scenerf.
Hi @BeileiCui,
It seems that your relative transformation is incorrect. I would suggest:
- Train without the reprojection loss (comment out this line), and render the RGB images to make sure everything is correct except the reprojection.
- Check your transformation: pass the ground-truth depth to this function and draw the projected image to see if you can recover the target image. If the depth is sparse, as with lidar, draw it on top of the target image to see if they match.
I hope this helps.
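For reference, the sanity check above can be sketched in plain NumPy. This is not SceneRF's own function; `warp_source_to_target` and its argument names are made up for illustration. It back-projects target pixels with the ground-truth depth, applies the relative pose, and samples the source image; if the pose and intrinsics are right, the output should closely resemble the target image.

```python
import numpy as np

def warp_source_to_target(img_src, depth_tgt, K, T_tgt2src):
    # Back-project every target pixel to 3D using the ground-truth depth,
    # move the points into the source camera frame with the relative pose,
    # then sample the source image (nearest-neighbour for brevity).
    H, W = depth_tgt.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=0).reshape(3, -1)
    pts = np.linalg.inv(K) @ (pix * depth_tgt.reshape(1, -1))
    pts = T_tgt2src[:3, :3] @ pts + T_tgt2src[:3, 3:4]
    proj = K @ pts
    us = np.round(proj[0] / proj[2]).astype(int)
    vs = np.round(proj[1] / proj[2]).astype(int)
    valid = (proj[2] > 0) & (us >= 0) & (us < W) & (vs >= 0) & (vs < H)
    out = np.zeros_like(img_src)
    out.reshape(-1, img_src.shape[-1])[valid] = img_src[vs[valid], us[valid]]
    return out
```

With the identity pose and the source equal to the target, the warp should reproduce the target image exactly; a visibly shifted or distorted result points at a wrong `T_tgt2src` or `K`.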
from scenerf.
Hi @anhquancao, thanks a lot for your suggestion! I followed your advice, checked the code over the past few days, and found that it was indeed a transformation problem; I have solved it now.
The performance is still not at its best for some reason. I notice that some metrics related to ray sampling are much higher than in your original KITTI training. For example: my min_som_vars_epoch is about 160 (yours is about 26), my closest_std_epoch is about 12.5 (yours is about 4.5), and my dist_2_closest_gaussian_epoch is about 4 (yours is about 1.5).
So do you have any suggestions on how I should finetune the network? Or could something else be causing this?
Looking forward to your reply!
from scenerf.
By the way, here are some of my current parameters. The min and max distances of SCARED are about 10 mm and 250 mm respectively. I downsample the original images by a factor of 2 (1024 x 1280 to 512 x 640) and scale cam_K accordingly.
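For concreteness, scaling cam_K after a 2x downsample just means halving the first two rows of the intrinsic matrix. This is a generic sketch, not SCARED's calibration code, and the focal length below is a made-up placeholder:

```python
import numpy as np

def scale_intrinsics(K, factor):
    # Downsampling the image by `factor` scales fx, skew, cx (row 0)
    # and fy, cy (row 1); the homogeneous bottom row is unchanged.
    K = K.astype(float).copy()
    K[:2, :] /= factor
    return K

# Placeholder intrinsics for a 1024 x 1280 frame (not SCARED's real values)
K_full = np.array([[1035.0,    0.0, 640.0],
                   [   0.0, 1035.0, 512.0],
                   [   0.0,    0.0,   1.0]])
K_half = scale_intrinsics(K_full, 2)  # intrinsics for the 512 x 640 images
```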
@click.command()
@click.option('--dataset', default="kitti", help='experiment prefix')
@click.option('--logdir', default="", help='log directory')
@click.option('--root', default="/mnt/data-hdd2/Beilei/Dataset/SCARED", help='path to dataset folder')
@click.option('--preprocess_root', default="/mnt/data-hdd2/Beilei/Dataset/SCARED/preprocess", help='path to preprocess folder')
@click.option('--bs', default=1, help='Batch size')
@click.option('--lr', default=1e-5, help='learning rate')
@click.option('--wd', default=0, help='weight decay')
@click.option('--n_gpus', default=1, help='number of GPUs')
@click.option('--n_workers_per_gpu', default=1, help='number of workers per GPU')
@click.option('--enable_log', default=False, help='enable log')
@click.option('--exp_prefix', default="Train", help='experiment prefix')
@click.option('--n_rays', default=1200, help='Total number of rays')
@click.option('--frames_interval', default=0.5, help='Interval between supervision frames')
@click.option('--sample_train', default=5, help='Sample the train set at certain scale')
@click.option('--max_sample_depth', default=220, help='maximum sample depth')
@click.option('--eval_depth', default=200, help='cap depth for evaluation')
@click.option('--n_pts_per_gaussian', default=8, help='#points sampled for each gaussian')
@click.option('--n_gaussians', default=4, help='#gaussians')
@click.option('--n_pts_uni', default=32, help='#points sampled uniformly')
@click.option('--std', default=2.0, help='initial std of each gaussian')
@click.option('--add_fov_hor', default=16, help='Amount of angle in degree added to left and right of the horizontal FOV')
@click.option('--add_fov_ver', default=14, help='Amount of angle in degree added to top and bottom of the vertical FOV')
# ideally sphere_h and sphere_w should be img_H * 1.5, img_W * 1.5 (Because we increase the FOV by 1.5).
# However, we empirically found that any sphere_h >= img_H and any sphere_w >= img_W have almost similar performance.
@click.option('--sphere_h', default=600, help='The height of the discretized spherical grid')
@click.option('--sphere_w', default=700, help='The width of the discretized spherical grid')
@click.option('--sequence_distance', default=10, help='Distance between the input and the last frames in the sequence')
@click.option('--som_sigma', default=2.0, help='')
@click.option('--max_epochs', default=50, help='')
@click.option('--use_color', default=True, help='Use color loss')
@click.option('--use_reprojection', default=True, help='Use reprojection loss')
from scenerf.
It means that the std of the Gaussians is quite high and the Gaussian peaks are far from the depth.
Maybe you can increase the weight of the loss that minimizes the distance to the closest Gaussian.
The loss still seems to be decreasing sharply; I think you should train for longer.
However, the most important metrics are the depth metrics, like abs_rel, rmse, sq_rel, a1, a2, a3, and they look good.
You can also make the network focus more on depth by increasing the weight of the reprojection loss.
I also found that the size of the images used in the loss functions is quite important. You can decrease the size of the image input to the network, but for the images in the loss functions I would advise keeping the full size.
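One way to read the advice above (a hypothetical sketch, not SceneRF's actual pipeline): the encoder gets a downscaled copy of the frame, while the color/reprojection losses read ground-truth pixels from the full-resolution frame at the ray locations. The variable names here are invented for illustration:

```python
import numpy as np

H, W = 1024, 1280
img_full = np.random.rand(H, W, 3)   # full-res frame, kept for the losses
img_net = img_full[::2, ::2]         # 512 x 640 copy fed to the network

# Rays are cast in full-res pixel coordinates, so the supervision colors
# are read from img_full, not from the downscaled copy.
n_rays = 1200
us = np.random.randint(0, W, n_rays)
vs = np.random.randint(0, H, n_rays)
gt_colors = img_full[vs, us]         # (n_rays, 3) targets for the color loss
```

The point is that downscaling only the network input saves memory, while the per-ray supervision still uses the sharpest pixels available.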
from scenerf.
Hi, thanks for your advice! Sorry, I'm not sure I understand how to decrease the size of the image input to the network while keeping the full size in the loss functions. Aren't the images handled together and sampled with rays? How can I feed different image sizes to the network and to the loss functions?
Do you mean I can, for example, prepare the data at the original size but make sphere_h and sphere_w smaller than the original size?
from scenerf.
The metrics abs_rel, rmse, sq_rel, a1, a2, a3 are decreasing, but the rate of decrease is already slow. I mentioned that the performance was still not at its best because it looks odd compared to other SOTA self-supervised depth estimation methods. For example, my current best RMSE is about 16, while the SOTA on this dataset is about 6. So I wonder whether it's my problem, and whether I failed to set something up correctly for training.
I also notice that my weights_at_depth_epoch is too small, around 0.05, and not increasing (yours is above 0.3 at the beginning and around 0.55 at the end). Could this be the problem? I'm just confused because I don't think the performance should drop this much, so it must be something on my side.
from scenerf.
This is the input image and the images used in loss functions.
The depth metrics are much lower because they are computed on a randomly selected frame within 10m of the input frame, not on the input frame itself.
0.05 is indeed small; it's the weight of the point closest to the depth. Probably the points are too far away.
from scenerf.