sunset1995 / horizonnet

PyTorch implementation of HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation.

Home Page: https://sunset1995.github.io/HorizonNet/

License: MIT License

Language: Python 100.00%
Topics: 360-photo, computer-vision, cvpr2019, horizonnet, pano-stretch-augmentation, room-layout


horizonnet's Issues

error while using customized data

While running training, I am getting the error below:

print('Skip ground truth invalid (%s)' % gt_path)
NameError: name 'gt_path' is not defined

Please let me know how to correct this error.
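The NameError itself just means the variable holding the label path is not in scope at that print; the underlying trigger is usually a missing or malformed ground-truth file. A minimal pre-flight check you could run before training, assuming the img/ + label_cor/ layout from the dataset preparation tutorial (the ROOT path is hypothetical):

import os
import glob
import numpy as np

ROOT = 'data/my_dataset/train'   # hypothetical path; point this at your training split

for img_path in sorted(glob.glob(os.path.join(ROOT, 'img', '*'))):
    stem = os.path.splitext(os.path.basename(img_path))[0]
    gt_path = os.path.join(ROOT, 'label_cor', stem + '.txt')
    if not os.path.isfile(gt_path):
        print('Missing ground truth: %s' % gt_path)
        continue
    cor = np.loadtxt(gt_path)
    # Corner labels are expected as "x y" rows in ceiling/floor pairs.
    if cor.ndim != 2 or cor.shape[1] != 2 or len(cor) % 2 != 0:
        print('Skip ground truth invalid (%s)' % gt_path)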

Non-2x1 images

How can I tweak this to support non-2x1 panorama sizes? That is, 360 degrees of yaw but less than 180 degrees (~150 degrees) of pitch?
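The pretrained models expect a full 2:1 equirectangular panorama (360 x 180 degrees). A common workaround, not an official feature of this repo, is to pad the missing pitch range with black rows so the input becomes 2:1 before inference; a minimal sketch assuming the crop spans 360 degrees horizontally and is centered on the horizon:

import numpy as np

def pad_to_equirect(img):
    # Pad a horizon-centered vertical crop of an equirectangular panorama
    # back to the full 2:1 aspect ratio with black bands at top and bottom.
    h, w = img.shape[:2]
    full_h = w // 2
    if h >= full_h:
        return img
    pad_top = (full_h - h) // 2
    pad_bot = full_h - h - pad_top
    pad_width = ((pad_top, pad_bot), (0, 0)) + ((0, 0),) * (img.ndim - 2)
    return np.pad(img, pad_width, mode='constant')

The padded bands carry no information, so predictions near the poles should be treated with care.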

Magic number in decoder fc layer

Hi, thanks for sharing the code. In the decoder's bias initialization, what is the meaning of these magic numbers? Can we initialize the bias randomly in the normal way instead?

HorizonNet/model.py

Lines 214 to 216 in e6d7e03

self.linear.bias.data[0*self.step_cols:1*self.step_cols].fill_(-1)
self.linear.bias.data[1*self.step_cols:2*self.step_cols].fill_(-0.478)
self.linear.bias.data[2*self.step_cols:3*self.step_cols].fill_(0.425)
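One hedged reading, not confirmed by the author here: the three slices give each output channel a sensible prior, so the untrained network already predicts something room-like instead of all zeros, e.g. a low initial corner probability and typical ceiling/floor boundary positions. A quick numeric check of that corner prior, plus the plain default initialization the question asks about (the layer sizes below are made up for illustration):

import torch
import torch.nn as nn

# A bias of -1 on a sigmoid-activated channel corresponds to an initial
# probability of roughly 0.27.
print(torch.sigmoid(torch.tensor(-1.0)))   # tensor(0.2689)

# "Normal" initialization instead, on a hypothetical stand-in for the final layer:
linear = nn.Linear(256, 3 * 4)             # 3 channels x step_cols (values assumed)
linear.reset_parameters()                  # PyTorch's default init for weight and bias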

Are there any tutorials / guidelines for spherical projection?

Hi sunset1995, thank you so much for your contribution!

I am currently going through this repository trying to understand what the code is doing, but I find some of the functions in inference, post-processing and evil_utils quite difficult to understand, as I do not know the naming conventions and lack the background knowledge of the formulas used for the 3D/2D transformations.

Would you be so kind as to point me in the right direction so I can understand the code? Thank you so much! > v <
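For reference, the core of the repo's helpers is the equirectangular mapping between pixel coordinates, spherical angles and 3D directions. A minimal sketch of that mapping (the axis convention here is an assumption for illustration and may differ from the one used in post_proc.py):

import numpy as np

def pixel_to_direction(coorx, coory, coorW=1024, coorH=512):
    # u is the longitude in (-pi, pi], v the latitude in (-pi/2, pi/2);
    # the +0.5 addresses the pixel center (see np_coorx2u / np_coory2v below).
    u = ((coorx + 0.5) / coorW - 0.5) * 2 * np.pi
    v = -((coory + 0.5) / coorH - 0.5) * np.pi
    # Assumed convention: x right, y forward, z up.
    return np.stack([np.cos(v) * np.sin(u),
                     np.cos(v) * np.cos(u),
                     np.sin(v)], axis=-1)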

Assertion Error while training with custom datasets

I have tried to train this model with my own custom dataset, but I encountered the following error.

The format of the dataset follows the one from the tutorial.
The images I used are 360-degree pictures of my room/office taken by myself, and I labeled them without pre-processing (aligning the camera rotation pose). Is this step necessary?

Looking forward to any suggestions.

Traceback (most recent call last): | 0/2 [00:00<?, ?it/s]
File "train.py", line 171, in
x, y_bon, y_cor = next(iterator_train)
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 582, in next
return self._process_next_batch(batch)
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 608, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
AssertionError: Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 99, in
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/ubuntu/Projects/HorizonNet_Original/dataset.py", line 68, in getitem
assert (cor[0::2, 1] > cor[1::2, 1]).sum() == 0
AssertionError
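The failing assertion at dataset.py line 68 checks that label rows come in ceiling/floor pairs, with each ceiling corner (even rows) above its floor counterpart (odd rows) in image coordinates. A small sketch to find, and where the pair is merely stored in the wrong order, fix offending label files (the path and the integer-coordinate format are assumptions; back up your labels before overwriting them):

import glob
import numpy as np

for gt_path in sorted(glob.glob('data/my_dataset/train/label_cor/*.txt')):
    cor = np.loadtxt(gt_path)
    bad = cor[0::2, 1] > cor[1::2, 1]      # ceiling corner drawn below its floor corner
    if bad.any():
        print('%s: %d swapped pair(s)' % (gt_path, int(bad.sum())))
        for i in np.where(bad)[0]:
            # Swap the two rows of the offending pair; if the x coordinates
            # also disagree, the label needs relabelling instead.
            cor[[2 * i, 2 * i + 1]] = cor[[2 * i + 1, 2 * i]]
        np.savetxt(gt_path, cor, fmt='%d')

On the alignment question: the preprocessing described in the README aligns the panorama (camera rotation) before labelling and inference, and the rest of the pipeline assumes that alignment, so it is probably worth keeping for custom data as well.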

Missing model weights

Hi, thanks for your great work.

In the README, it says to run

python inference.py --pth ckpt/resnet50_rnn__mp3d.pth --img_glob assets/preprocessed/demo_aligned_rgb.png --output_dir assets/inferenced --visualize

However, this ckpt/resnet50_rnn__mp3d.pth model checkpoint does not appear to be released publicly, from what I could see. Is that correct?

Question about ordering corner coordinates

Hi, I am preparing my own custom dataset of 360 panorama images. I read in issue #20 that I have to order the corner coordinates based on the 3D skeleton of the room layout. Can you elaborate on this? So I first generate the 3D layout of the room from the panorama using the layout viewer. Then, what is the correct viewing angle for reading off the order of the layout corners?

Getting error while training the model

I tried to train this model on the LayoutNet datasets with all the default parameters mentioned here (https://github.com/sunset1995/HorizonNet).
I executed the following command
(HorizonNet) D:\HorizonNet-master>python train.py --id resnet50_rnn

I am getting the following error:

Epoch: 0%| | 0/300 [00:00<?, ?ep/s]
Traceback (most recent call last):
File "train.py", line 181, in
iterator_train = iter(loader_train)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\utils\data\dataloader.py", line 279, in iter
return _MultiProcessingDataLoaderIter(self)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\utils\data\dataloader.py", line 719, in init
w.start()
File "C:\Anaconda3\envs\HorizonNet\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Anaconda3\envs\HorizonNet\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Anaconda3\envs\HorizonNet\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Anaconda3\envs\HorizonNet\lib\multiprocessing\popen_spawn_win32.py", line 65, in init
reduction.dump(process_obj, to_child)
File "C:\Anaconda3\envs\HorizonNet\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function <lambda> at 0x0000023358098158>: attribute lookup <lambda> on __main__ failed

(HorizonNet) D:\HorizonNet-master>Traceback (most recent call last):
File "", line 1, in
File "C:\Anaconda3\envs\HorizonNet\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\Anaconda3\envs\HorizonNet\lib\multiprocessing\spawn.py", line 115, in _main
self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

Then I modified train.py at line number 114 to set num_workers=0.
I am using Anaconda, in which a new environment named HorizonNet was created with Python 3.6.
Now I am getting the following error:

(HorizonNet) D:\HorizonNet-master>python train.py --id resnet50_rnn --epochs 50
Train ep1: 0%| | 0/204 [00:01<?, ?it/s]
Epoch: 0%| | 0/50 [00:01<?, ?ep/s]
Traceback (most recent call last):
File "train.py", line 191, in
losses = feed_forward(net, x, y_bon, y_cor)
File "train.py", line 26, in feed_forward
y_bon_, y_cor_ = net(x)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "D:\HorizonNet-master\model.py", line 242, in forward
feature = self.reduce_height_module(conv_list, x.shape[3]//self.step_cols)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "D:\HorizonNet-master\model.py", line 166, in forward
for f, x, out_c in zip(self.ghc_lst, conv_list, self.cs)
File "D:\HorizonNet-master\model.py", line 166, in
for f, x, out_c in zip(self.ghc_lst, conv_list, self.cs)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "D:\HorizonNet-master\model.py", line 138, in forward
x = self.layer(x)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\container.py", line 100, in forward
input = module(input)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "D:\HorizonNet-master\model.py", line 124, in forward
return self.layers(x)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\container.py", line 100, in forward
input = module(input)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\container.py", line 100, in forward
input = module(input)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "D:\HorizonNet-master\model.py", line 31, in forward
return lr_pad(x, self.padding)
File "D:\HorizonNet-master\model.py", line 21, in lr_pad
return torch.cat([x[..., -padding:], x, x[..., :padding]], dim=3)
RuntimeError: CUDA out of memory. Tried to allocate 66.00 MiB (GPU 0; 6.00 GiB total capacity; 4.28 GiB already allocated; 4.91 MiB free; 4.34 GiB reserved in total by PyTorch)

Please help
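Two unrelated problems appear here. The first traceback is the usual Windows multiprocessing limitation: worker processes are started with spawn, so the lambda passed as worker_init_fn in train.py cannot be pickled; num_workers=0 avoids it, and so would replacing the lambda with a module-level function, for example:

import numpy as np

def worker_init_fn(worker_id):
    # Module-level, and therefore picklable on Windows, replacement for
    # the `lambda x: np.random.seed()` used in train.py.
    np.random.seed()

# DataLoader(..., num_workers=args.num_workers, worker_init_fn=worker_init_fn)

The second error is plain GPU memory exhaustion on a 6 GB card; lowering the training batch size (train.py reads args.batch_size_train, which suggests a --batch_size_train flag) or the input resolution is the usual remedy.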

Saving the generated model as .obj + .mtl for Blender: which UV coordinates should I pass?

I want to save the generated model as an .obj + .mtl file and import it into Blender. Which UV coordinate values should I pass to triangle_uvs? Here is my current code:
if args.vis:
    mesh = o3d.geometry.TriangleMesh()
    mesh.vertices = o3d.utility.Vector3dVector(points[:, :3])
    mesh.vertex_colors = o3d.utility.Vector3dVector(points[:, 3:] / 255.)
    mesh.triangles = o3d.utility.Vector3iVector(faces)

    text = cv2.imread('assets/demo.png')

    mesh.triangle_uvs = o3d.open3d.utility.Vector2dVector(xyzrgb)  # ??? which values should triangle_uvs take?
    mesh.triangle_material_ids = o3d.utility.IntVector([0] * len(faces))
    mesh.textures = [o3d.geometry.Image(text)]

    # save .obj & .mtl
    o3d.io.write_triangle_mesh('/home/demo01.obj', mesh)
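For reference, triangle_uvs expects one (u, v) pair per triangle vertex (an array of shape 3 * n_triangles x 2 with values in [0, 1]), not the xyzrgb array. Since the texture is the equirectangular panorama itself, one option is to project each vertex back onto the unit sphere; a hedged sketch, where the axis convention is an assumption and may need flipping to match how the viewer builds the point cloud:

import numpy as np
import open3d as o3d

def equirect_uvs(vertices, triangles):
    # Per-triangle-vertex UVs for an equirectangular texture, assuming a
    # camera-centered mesh with z up; adjust signs/axes if the texture
    # appears mirrored or rotated.
    v = np.asarray(vertices)[np.asarray(triangles).reshape(-1)]   # (3*F, 3)
    lon = np.arctan2(v[:, 1], v[:, 0])                            # (-pi, pi]
    lat = np.arcsin(v[:, 2] / np.linalg.norm(v, axis=1))          # (-pi/2, pi/2)
    uv = np.stack([0.5 + lon / (2 * np.pi), 0.5 - lat / np.pi], axis=1)
    return o3d.utility.Vector2dVector(uv)

# mesh.triangle_uvs = equirect_uvs(mesh.vertices, mesh.triangles)

With triangle_uvs, triangle_material_ids and textures all set, recent Open3D versions should also write the companion .mtl and texture image next to the .obj.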


about `np_coorx2u` and `np_coory2v` functions in `post_proc.py`

Hi sunset,
I am very interested in your work. I have a question about the np_coorx2u and np_coory2v functions in post_proc.py:

def np_coorx2u(coorx, coorW=1024):
    return ((coorx + 0.5) / coorW - 0.5) * 2 * PI


def np_coory2v(coory, coorH=512):
    return -((coory + 0.5) / coorH - 0.5) * PI

I think coorx / coorW (or coory / coorH) already gives the normalized ratio, so why is the +0.5 needed?
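The +0.5 is the usual pixel-center convention: pixel index i covers the interval [i, i + 1), so its center sits at i + 0.5. Without it, column 0 would map to the left edge of the first pixel rather than its center, and every angle would be biased by half a pixel. A small worked check using the function above:

import numpy as np

PI = np.pi

def np_coorx2u(coorx, coorW=1024):
    return ((coorx + 0.5) / coorW - 0.5) * 2 * PI

print(np_coorx2u(0))        # ~ -pi + pi/1024, the center of the first column
print(np_coorx2u(1023))     # ~  pi - pi/1024, the center of the last column
print(np_coorx2u(511.5))    # 0.0, the horizontal center of the image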

boundary training loss encounters impulse

Hi, thanks for sharing your code!
I ran your code with the hyper-parameters described in your paper on the dataset (PanoContext + Stanford 2D-3D) extracted from the .t7 files by your LayoutNet-Pytorch code. However, I find that the boundary training loss shows impulses in some iterations, although it converges in the overall trend. Can you tell me why? Are some annotations wrong? Thanks a lot!
[image: training loss curves]

Error in inference.py

After training on my customized data, I wanted to evaluate the model on the test set. But when I run inference.py for general shape estimation, the following error comes up and the process is interrupted:

Traceback (most recent call last):
File "inference.py", line 198, in
args.min_v, args.r)
File "inference.py", line 112, in inference
if not Polygon(xy2d).is_valid:
File "/home/anaconda3/lib/python3.7/site-packages/shapely/geometry/polygon.py", line 240, in init
ret = geos_polygon_from_py(shell, holes)
File "/home/anaconda3/lib/python3.7/site-packages/shapely/geometry/polygon.py", line 494, in geos_polygon_from_py
ret = geos_linearring_from_py(shell)
File "shapely/speedups/_speedups.pyx", line 239, in shapely.speedups._speedups.geos_linearring_from_py
ValueError: A LinearRing must have at least 3 coordinate tuples

Is the reason that the corner estimates are so bad that they cannot form a polygon? Could you tell me why this happens? Thanks a lot.
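The exception means the corner set handed to Polygon has fewer than three points, i.e. the general-shape post-processing collapsed to a degenerate layout for that image, which indeed points to weak corner/boundary predictions. A hedged guard you could place around the construction at inference.py line 112 to skip such images (or fall back to the cuboid post-processing) instead of crashing:

from shapely.geometry import Polygon

def is_usable_layout(xy2d):
    # True only if the 2D corner loop can form a valid polygon.
    return len(xy2d) >= 3 and Polygon(xy2d).is_valid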

Customized dataset

Sorry, I wonder how to define my own data format for occluded areas:
In README_PREPARE_DATASET.md:
"Please note that 713 100 and it floor correspondent 713 415 are occluded."
How should I annotate coordinates for occluded corners?

max of 4 walls to model

Hi,
when I'm trying to create a 3D model with more than 4 walls, it looks like it is still built with only 4 walls.

Can I please get exactly the same ResNet checkpoint that you used in your example (resnet50_rnn__mp3d)?

Point Cloud Registration

Hi all,

I am pretty new to 3D modelling and I was wondering if anyone could point me in the right direction regarding point cloud registration.

Was anyone successful in registering the results from room layout to "compose" a 3D floorplan?

I am grateful for any suggestion.

Thank you very much!
R.

Why is z0 set to 50?

Hi, thanks for your excellent work! I have two questions.

  1. According to my understanding, z0 represents the height of the ceiling above the camera plane; shouldn't it be around 1.6 meters? Why is z0 set to 50?
     (line 95, inference.py)
     # Init floor/ceil plane
     z0 = 50
     _, z1 = post_proc.np_refine_by_fix_z(*y_bon_, z0)
  2. Does the variable tol mean the threshold on the signal when recovering the wall planes? tol is 0.05 in the paper but abs(0.16 * z1 / 1.6) in the code.
     (line 106, inference.py)
     cor, xy_cor = post_proc.gen_ww(xs_, y_bon_[0], z0, tol=abs(0.16 * z1 / 1.6), force_cuboid=force_cuboid)
     (line 81, post_proc.py)
     invalid = (n < len(vec) * 0.4) | (l > tol)

How to estimate room dimensions from reconstructed layout?

@sunset1995, thank you for the great work.

Is it possible to estimate the dimensions of the room, either from panorama or reconstruction, using known camera parameters, such as camera height or focal length? I would like to know how HorizonNet can recover the actual dimensions of the room in terms of height x width of the walls, for example. Thank you.
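A single panorama only constrains the layout up to scale, which is why inference.py can use an arbitrary unit such as z0 = 50. If the physical camera height above the floor is known, every coordinate can be multiplied by camera_height / |z_floor| to obtain metres. A minimal sketch under that assumption (1.6 m is just a typical default):

import numpy as np

def to_metric(xy_corners, z_floor, z_ceil, camera_height_m=1.6):
    # xy_corners: (N, 2) floor-plan corners in the layout's arbitrary units;
    # z_floor, z_ceil: signed distances of the floor and ceiling planes.
    scale = camera_height_m / abs(z_floor)
    xy_m = np.asarray(xy_corners) * scale
    room_height_m = (abs(z_floor) + abs(z_ceil)) * scale
    # Wall widths: distances between consecutive floor-plan corners.
    widths_m = np.linalg.norm(np.roll(xy_m, -1, axis=0) - xy_m, axis=1)
    return xy_m, room_height_m, widths_m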

3D IoU evaluation bug in `eval_general.py`

Let:

  • A be the 2d area of prediction
  • B be the 2d area of ground-truth
  • I be the 2d area of intersection of prediction and ground-truth
  • Ha be the layout height of prediction
  • Hb be the layout height of ground-truth

The 3D IoU should be:

area3d_I = I * min(Ha, Hb)
area3d_A = A * Ha
area3d_B = B * Hb
iou3d = area3d_I / (area3d_A + area3d_B - area3d_I)

However, the original implementation is wrong:

iou2d = I / (A + B - I)
iouH = min(Ha, Hb) / max(Ha, Hb)
iou3d = iou2d * iouH

For an easier comparison, let us rewrite it into the same form:

iou3d = iou2d * iouH
iou3d = I / (A + B - I) * iouH
iou3d = I / (A + B - I) * min(Ha, Hb) / max(Ha, Hb)
iou3d = I * min(Ha, Hb) / (A + B - I) / max(Ha, Hb)
iou3d = area3d_I / ((A + B - I) * max(Ha, Hb))

Without loss of generality, let us say Ha >= Hb. Then the difference is:

  • area3d_I / (A * Ha + B * Ha - I * Ha) (my fault)
  • area3d_I / (A * Ha + B * Hb - I * Hb) (the correct one)

As B >= I and Ha >= Hb, my 3D IoU is less than or equal to the correct 3D IoU.
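For completeness, the corrected computation as a function (a direct transcription of the formulas above):

def iou_3d(area_pred, area_gt, area_inter, h_pred, h_gt):
    # area_pred / area_gt: 2D floor-plan areas of prediction and ground truth;
    # area_inter: 2D area of their intersection; h_pred / h_gt: layout heights.
    inter3d = area_inter * min(h_pred, h_gt)
    vol_pred = area_pred * h_pred
    vol_gt = area_gt * h_gt
    return inter3d / (vol_pred + vol_gt - inter3d)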

is there some model adjustment in it?

RuntimeError: Error(s) in loading state_dict for HorizonNet:
Missing key(s) in state_dict: "feature_extractor.encoder.conv1.1.weight", "feature_extractor.encoder.bn1.weight", "feature_extractor.encoder.bn1.bias", "feature_extractor.encoder.bn1.running_mean", "feature_extractor.encoder.bn1.running_var", "feature_extractor.encoder.layer1.0.conv1.weight", "feature_extractor.encoder.layer1.0.bn1.weight", "feature_extractor.encoder.layer1.0.bn1.bias", "feature_extractor.encoder.layer1.0.bn1.running_mean", "feature_extractor.encoder.layer1.0.bn1.running_var", "feature_extractor.encoder.layer1.0.conv2.1.weight", "feature_extractor.encoder.layer1.0.bn2.weight", "feature_extractor.encoder.layer1.0.bn2.bias", "feature_extractor.encoder.layer1.0.bn2.running_mean", "feature_extractor.encoder.layer1.0.bn2.running_var",
...
Unexpected key(s) in state_dict: "stage1.0.layer.1.weight", "stage1.0.layer.1.bias", "stage1.0.layer.2.weight", "stage1.0.layer.2.bias", "stage1.0.layer.2.running_mean", "stage1.0.layer.2.running_var", "stage1.0.layer.2.num_batches_tracked", "stage1.0.layer.5.weight", "stage1.0.layer.5.bias", "stage1.0.layer.6.weight", "stage1.0.layer.6.bias", "stage1.0.layer.6.running_mean", "stage1.0.layer.6.running_var", "stage1.0.layer.6.num_batches_tracked", "stage1.0.layer.9.weight", "stage1.0.layer.9.bias",
...

Adding Multi GPU support and solving dependency issues!

  1. Adding environment.yml for better dependency handling.
  2. Updating the code so that it can run on multiple GPUs at once or on a specified CUDA-capable device.
  3. Implemented automatic mixed precision training (AMP) to reduce the GPU memory overhead, which helps accommodate larger batch sizes (a minimal sketch of both changes follows below).
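A hedged sketch of points 2 and 3 using only standard PyTorch APIs; the model, data and loss below are tiny placeholders so the snippet runs on its own, not the actual HorizonNet objects from train.py:

import torch
import torch.nn as nn

# Placeholder model so the sketch is self-contained; swap in HorizonNet, the
# real DataLoader and feed_forward() from train.py.
net = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1)).cuda()
net = nn.DataParallel(net)                          # split each batch across all visible GPUs
optimizer = torch.optim.Adam(net.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()                # AMP loss scaler

for _ in range(2):                                  # stands in for the epoch/batch loops
    x = torch.randn(2, 3, 512, 1024, device='cuda')
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():                 # forward pass in mixed precision
        loss = net(x).mean()                        # stands in for the real losses
    scaler.scale(loss).backward()                   # scale to avoid fp16 underflow
    scaler.step(optimizer)
    scaler.update()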

the result seems to be mirrored

This is the 360 photo. As you can see, when facing the wall that has the main entrance, the main entrance should be on the right-hand side.

[image: demo2_aligned_rgb]

The result is mirrored along one direction: the door is now on the left-hand side.

[image: result screenshot]

Finetuned model

Hello, when training, in order to get the finetuned model, what is the meaning of "Finetuned on finetune_general/ 66 images"? I just want to understand the training and validation process. Can you help me? Thank you!

finetune model

Hello, I have fine-tuned the training model according to the article, but the results are different from yours. I don't know why the result is so bad. Can you help me solve it? Thank you!

Run command: python train.py --id finetune --freeze_earlier_blocks 4

This is the code I modified.

# Create dataloader
####################### modify #########################
dataset_train_finetune = PanoCorBonDataset(
    root_dir=args.train_finetune_dir,
    flip=not args.no_flip, rotate=not args.no_rotate, gamma=not args.no_gamma,
    stretch=not args.no_pano_stretch)
loader_train_finetune = DataLoader(dataset_train_finetune, args.batch_size_train,
                                   shuffle=True, drop_last=True,
                                   num_workers=args.num_workers,
                                   pin_memory=not args.no_cuda,
                                   worker_init_fn=lambda x: np.random.seed())
####################### modify #########################
dataset_train = PanoCorBonDataset(
    root_dir=args.train_root_dir,
    flip=not args.no_flip, rotate=not args.no_rotate, gamma=not args.no_gamma,
    stretch=not args.no_pano_stretch)
loader_train = DataLoader(dataset_train, args.batch_size_train,
                          shuffle=True, drop_last=True,
                          num_workers=args.num_workers,
                          pin_memory=not args.no_cuda,
                          worker_init_fn=lambda x: np.random.seed())
if args.valid_root_dir:
    dataset_valid = PanoCorBonDataset(
        root_dir=args.valid_root_dir,
        flip=False, rotate=False, gamma=False,
        stretch=False)
    loader_valid = DataLoader(dataset_valid, args.batch_size_valid,
                              shuffle=False, drop_last=False,
                              num_workers=args.num_workers,
                              pin_memory=not args.no_cuda)

# Start training
for ith_epoch in trange(1, args.epochs + 1, desc='Epoch', unit='ep'):

    # Train phase
    net.train()
    if args.freeze_earlier_blocks != -1:
        b0, b1, b2, b3, b4 = net.feature_extractor.list_blocks()
        blocks = [b0, b1, b2, b3, b4]
        for i in range(args.freeze_earlier_blocks + 1):
            for m in blocks[i]:
                m.eval()
    iterator_train = iter(loader_train)
    iterator_train_finetune = iter(loader_train_finetune)

    ith_batch = 0

    ############################### modify #################################
    for _ in trange(len(loader_train_finetune),
                    desc='Train ep%s' % ith_epoch, position=1):
        # Set learning rate
        adjust_learning_rate(optimizer, args)

        args.cur_iter += 1
        x1, y_bon1, y_cor1 = next(iterator_train_finetune)
        x2, y_bon2, y_cor2 = next(iterator_train)

        x = torch.cat([x1, x2], 0)
        y_bon = torch.cat([y_bon1, y_bon2], 0)
        y_cor = torch.cat([y_cor1, y_cor2], 0)

        losses = feed_forward(net, x, y_bon, y_cor)
        for k, v in losses.items():
            k = 'train/%s' % k
            tb_writer.add_scalar(k, v.item(), args.cur_iter)
        tb_writer.add_scalar('train/lr', args.running_lr, args.cur_iter)
        loss = losses['total']
        ############################### modify #################################

        # backprop
        optimizer.zero_grad()
        loss.backward()
        nn.utils.clip_grad_norm_(net.parameters(), 3.0, norm_type='inf')
        optimizer.step()
        ith_batch += 1

How to train the network on three GPUs

I noticed you mention that training takes about four hours on three NVIDIA GTX 1080 Ti GPUs. However, README.md does not describe how to train the network on three GPUs.
When I run
python train.py --id resnet50_rnn --use_rnn
it only uses a single GPU, and the batch size is eight, which differs from the one mentioned in your paper.
Could you please describe the training procedure in detail?
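For reference, the usual minimal change for multi-GPU data parallelism is wrapping the model after it is built; a sketch with a placeholder module (in train.py this would be the HorizonNet instance). Note that each GPU then sees batch_size / num_GPUs samples, so matching the paper's setting also means raising the batch size accordingly:

import torch
import torch.nn as nn

net = nn.Linear(8, 8)                  # placeholder; swap in the HorizonNet model from train.py
if torch.cuda.device_count() > 1:
    net = nn.DataParallel(net)         # one batch is split across all visible GPUs
net = net.to('cuda' if torch.cuda.is_available() else 'cpu')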
