zouchuhang / layoutnet

Torch implementation of our CVPR 18 paper: "LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image"

Home Page: http://openaccess.thecvf.com/content_cvpr_2018/papers/Zou_LayoutNet_Reconstructing_the_CVPR_2018_paper.pdf

License: MIT License

Lua 2.41% MATLAB 56.30% M 0.01% Objective-C 0.01% C++ 7.84% C 12.40% Makefile 0.11% HTML 17.73% CSS 0.12% Perl 0.01% Python 0.34% PostScript 2.72%
3d-layout 3d-reconstruction deep-learning layoutnet

layoutnet's People

Contributors

alexcolburn, zouchuhang

layoutnet's Issues

3d layout construction

Hello @zouchuhang @alexcolburn,
I already checked issue #3, but I still cannot fully understand the output.
Since I do not have any background in 3D, could you please explain in more detail which tool to use to generate the mesh and wrap the pano onto it, which variables to use, etc.?
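
For illustration only (not the repository's own pipeline), here is a minimal sketch that writes an axis-aligned room box as a Wavefront OBJ mesh, which most 3D tools can open; the (w, l, h) dimensions would come from the predicted layout, and texturing the walls with the pano would still need a tool such as Blender:

def write_box_obj(w, l, h, path='room.obj'):
    # 8 corners; OBJ indices are 1-based: index = 1 + 4*(x>0) + 2*(y>0) + (z>0)
    verts = [(x, y, z) for x in (0.0, w) for y in (0.0, l) for z in (0.0, h)]
    faces = [(1, 2, 4, 3), (5, 6, 8, 7),   # x = 0 and x = w walls
             (1, 2, 6, 5), (3, 4, 8, 7),   # y = 0 and y = l walls
             (1, 3, 7, 5), (2, 4, 8, 6)]   # floor (z = 0) and ceiling (z = h)
    with open(path, 'w') as out:
        for v in verts:
            out.write('v %f %f %f\n' % v)
        for f in faces:
            out.write('f %d %d %d %d\n' % f)  # face winding not normalized

write_box_obj(4.0, 5.0, 2.8)  # example dimensions (made up)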

Some questions about dataset

Hello,

Thank you for open-sourcing this great work.
Recently I reimplemented LayoutNet in PyTorch (https://github.com/sunset1995/pytorch-layoutnet) and have some questions about the ground truth data:

  1. The number of lines in gt/pano*txt is 1063, while the number of gt/label_cor/**/*.mat files is 1028. Could you please supply the missing ground truth?
  2. If I'm not wrong, the labeled corners under label_cor/**/* should be scaled before visualization or evaluation, with scales of 4.0 for stanford2d3d and 8.890625 for panoContext. Is that correct? (See the sketch below.)
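
For reference, a minimal sketch of applying that scale, assuming label_cor stores corner coordinates at the original pano resolution (so dividing maps them onto the 1024x512 network input); the .mat key 'cor' and the paths are guesses:

from scipy.io import loadmat

SCALE = {'stanford2d3d': 4.0,       # e.g. 4096x2048 -> 1024x512
         'panoContext': 8.890625}   # e.g. 9104x4552 -> 1024x512

cor = loadmat('label_cor/some_pano.mat')['cor']  # hypothetical path and key
cor_1024 = cor / SCALE['panoContext']            # divide if stored at full res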

Thank you, and have a nice day :)

Max pooling is directly followed by upsampling

The network's structure contains a max pooling layer immediately followed by upsampling.
Maybe I'm missing something, but it doesn't seem to make sense, and simply removing the pair should improve results.

local pool7 = nn.SpatialMaxPooling(2,2,2,2)(conv7_relu)  -- halves H and W
local unpool00 = nn.SpatialUpSamplingNearest(2)(pool7)   -- doubles them back

Is there something specific that this structure addresses?
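
For what it's worth, a quick NumPy check (a sketch, not the repository's code) shows the pair is not an identity; it block-quantizes the feature map:

import numpy as np

# 2x2 max-pool followed by 2x nearest upsampling keeps the spatial size
# but discards fine detail, so the pair is not a no-op.
x = np.arange(16, dtype=float).reshape(4, 4)
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))  # 2x2 block maxima
up = pooled.repeat(2, axis=0).repeat(2, axis=1)  # back to 4x4
print(np.array_equal(x, up))  # False: each 2x2 block is now constant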

What is the ground-truth mask used during training?

Hi

I saw in the file train_pano_joint.lua that you use some kind of mask to increase the loss at certain positions.
I couldn't find any reference to it either in the paper or anywhere else in the repository.

Could you please explain what the mask is, how it is generated, and why it is needed?

Thanks

    -- scale the mask by 4 and move it to the GPU
    gtMsk = torch.mul(gtMsk, 4)
    gtMsk = gtMsk:cuda()
    -- weight the per-pixel loss by the mask and add it back, i.e.
    -- loss_d_1 := loss_d_1 * (1 + 4 * gtMsk)
    gtMsk_w = torch.cmul(loss_d_1, gtMsk)
    loss_d_1 = torch.add(gtMsk_w, loss_d_1)
    -- same weighting for the second loss term
    gt2Msk = torch.mul(gt2Msk, 4)
    gt2Msk = gt2Msk:cuda()
    gt2Msk_w = torch.cmul(loss_d_2, gt2Msk)
    loss_d_2 = torch.add(gt2Msk_w, loss_d_2)

Experiment result is not consistent with the result reported on the paper

Hi, thanks a lot for sharing your work!
I downloaded your full-approach model pretrained on the PanoContext dataset and used it to predict on the PanoContext test set. However, the results from your pretrained model are 73.85, 1.07 and 3.40 respectively, while the paper reports 74.48, 1.06 and 3.34 (3D IoU, corner error and pixel error). I wonder whether the pretrained model can actually reach the numbers reported in the paper, or whether I did something wrong.
Can you help me? Thanks a lot!

question about perspective training

I want to train the perspective model from scratch, but I can't find info_edg_stack_tr_lsun_640_d6_sig20_trname.t7. Can you tell me where I can get this dataset?

Seems like gradient computation is not right, correct me if I am wrong

d_aob_x0 = (2*x0-1)*n_v_ao*n_v_bo + dot(v_ao, v_bo) * (x0*n_v_bo/n_v_ao + (x0-1)*n_v_ao/n_v_bo);

Lines 44 and 45 (and the analogous lines after) should be:
d_aob_x0 = (2*x0-1)*n_v_ao*n_v_bo/(sqrt(1-cos(b_aob)*cos(b_aob))+eps) + dot(v_ao, v_bo) * (x0*n_v_bo/n_v_ao + (x0-1)*n_v_ao/n_v_bo);
d_aob_x0 = -d_aob_x0/n_v_ao/n_v_ao/n_v_bo/n_v_bo;

It looks like only a scale difference, but the optimization steps decrease.
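
One way to settle this is a numerical check; the 1/sqrt(1 - cos^2) factor in the proposed fix appears to be exactly the arccos chain-rule term, which the check below would catch. A minimal sketch (f and grad_f stand in for the angle function and its claimed derivative):

import numpy as np

def check_grad(f, grad_f, x0, eps=1e-6):
    # compare a claimed analytic derivative against central differences
    num = (f(x0 + eps) - f(x0 - eps)) / (2.0 * eps)
    ana = grad_f(x0)
    rel_err = abs(num - ana) / max(abs(num), abs(ana), 1e-12)
    return num, ana, rel_err

# Toy usage: d/dx arccos(x) = -1/sqrt(1 - x^2)
print(check_grad(np.arccos, lambda x: -1.0 / np.sqrt(1.0 - x * x), 0.3))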

json to mat

Hello, I have used PanoAnnotator to annotate some panorama images, which gives me .json files. How can I produce .mat files like the ones in label_cor? (label_cor only contains the corner coordinates.)
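
A hedged sketch of the conversion; the PanoAnnotator JSON schema (key 'points') and the 'cor' variable name in the .mat file are both assumptions here, not the author's confirmed format:

import json
import numpy as np
from scipy.io import savemat

with open('pano_example.json') as f:       # hypothetical filename
    ann = json.load(f)

# Suppose the annotation stores corner positions as a list of [x, y] pairs.
cor = np.asarray(ann['points'], dtype=np.float64)  # key name is a guess

savemat('pano_example.mat', {'cor': cor})  # corner-only, like label_cor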

Could you share the expected validation loss during training?

So far I have trained only the first two steps (I don't really care about the box prediction, just the edge and corner detection).

First I trained using driver_pano_edg.lua and reached a validation loss of 0.12333107 after 3260 iterations, which stopped improving for the remaining ~4700 iterations.

Then, starting from this model, I trained with driver_pano_joint.lua and reached a validation loss of 0.20790252 after 1480 iterations, which stopped improving for the remaining ~6500 iterations.

This seems to produce results that are not as good as the supplied pretrained model.

What is the expected validation loss in each step?

How to convert rotEdge (a variable in getManhattanAndAlign.m) to .t7?

I have successfully run your getManhattanAndAlign.m code and tested testNet_pano_full.lua on your demo, but I want to try it on my own picture.
I see in your code:
lne_ts = torch.load('./data/panoContext_line_test.t7')
and
img_ts = torch.load('./data/panoContext_img_test.t7')

It needs .t7 files, but how can I convert a MATLAB variable to a .t7 file? Should it be saved as an image first and then converted to .t7? Thank you very much for your help.
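
There is no standard Python writer for .t7, but one common bridge (an assumption about tooling, not necessarily the author's workflow) is to go through .npy and finish in Lua:

import numpy as np
from scipy.io import loadmat

# Saved from MATLAB with: save('rotEdge.mat', 'rotEdge')
m = loadmat('rotEdge.mat')                  # hypothetical file name
arr = np.ascontiguousarray(m['rotEdge'], dtype=np.float32)
np.save('rotEdge.npy', arr)
# In Lua/Torch, a loader such as npy4th can then read the .npy and
# torch.save() it to a .t7 file.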

What is the scale of the reconstructed 3D room to the actual room?

Hi,

Thanks for sharing your work. This is very interesting and has some great use cases.

Regarding the result, is it possible to recover the measurements of the actual room from the 3D reconstruction? In other words, how accurate is the 3D reconstruction in terms of wall width or length compared to the actual room?

preprocessPano requires pano_edge_tr_1024

I'm trying the Matlab scripts, but preprocessPano.m requires pano_edge_tr_1024, which I can't find.
Could you explain how to run it properly?

>> preprocessPano
pano_aadmuaxyxouqic
Error using load
Unable to read file '..\data\pano_edge_tr_1024\vp\pano_aadmuaxyxouqic.mat'. No
such file or directory.

Error in preprocessPano (line 59)
    load(['..\data\pano_edge_tr_1024\vp\' im_name '.mat']);

Creating Ground Truth

Hi, thank you for your wonderful work. I would like to create a new dataset for this architecture and was wondering how you created the ground truth.

I am having trouble interpreting the ground-truth .mat files and figuring out how to reproduce them for a series of new images.

Thank you so much for your time in advance, and I sincerely appreciate whatever insights you are able to give.

Issues reproducing network - model ends up a different size

Hello! :)
I'm trying to implement your network with Keras, and it seems that the network I built has many more parameters than the number you state in your paper.
You've mentioned you were able to train the entire network with a batch size of 20 using 12 GB.
(I've even seen in #5 that you mentioned using 10.969 GB.)
My GPU has 10.57 GiB available, but when I try a batch size of 15, which by my calculation should fit, the model does not fit into memory.
I've even removed the 3D regression part and it still fails.

So I wanted to ask if you could help me check whether I've made an implementation error :)
Could you, for example, provide the total number of parameters of your model?
And perhaps even better, the number of parameters per layer? :)

Here is a description of my implementation :)
I've defined the network as follows:

def layoutnet():
    # Encoder
    input = layers.Input(shape=(6, 512, 1024))  # chw format
    e1 = conv2d_relu_pool(input, 32, name='e1')  # [?, 32, 256, 512]
    e2 = conv2d_relu_pool(e1, 64, name='e2')  # [?, 64, 128, 256]
    e3 = conv2d_relu_pool(e2, 128, name='e3')  # [?, 128, 64, 128]
    e4 = conv2d_relu_pool(e3, 256, name='e4')  # [?, 256, 32, 64]
    e5 = conv2d_relu_pool(e4, 512, name='e5')  # [?, 512, 16, 32]
    e6 = conv2d_relu_pool(e5, 1024, name='e6')  # [?, 1024, 8, 16]
    e7 = conv2d_relu_pool(e6, 2048, name='e7')  # [?, 2048, 4, 8]
    encoder = Model(input, e7)

    # Top decoder branch
    td1 = up_conv2d_relu(e7, 1024, 'td1')  # [?, 1024, 8, 16]
    td1 = layers.Concatenate(axis=1, name='td1_concat')([td1, e6])  # [?, 1024 * 2, 8, 16]

    td2 = up_conv2d_relu(td1, 512, name='td2')  # [?, 512, 16, 32]
    td2 = layers.Concatenate(axis=1, name='td2_concat')([td2, e5])  # [?, 512 * 2, 16, 32]

    td3 = up_conv2d_relu(td2, 256, name='td3')  # [?, 256, 32, 64]
    td3 = layers.Concatenate(axis=1, name='td3_concat')([td3, e4])  # [?, 256 * 2, 32, 64]

    td4 = up_conv2d_relu(td3, 128, name='td4')  # [?, 128, 64, 128]
    td4 = layers.Concatenate(axis=1, name='td4_concat')([td4, e3])  # [?, 128 * 2, 64, 128]

    td5 = up_conv2d_relu(td4, 64, name='td5')  # [?, 64, 128, 256]
    td5 = layers.Concatenate(axis=1, name='td5_concat')([td5, e2])  # [?, 64 * 2, 128, 256]

    td6 = up_conv2d_relu(td5, 32, name='td6')  # [?, 32, 256, 512]
    td6 = layers.Concatenate(axis=1, name='td6_concat')([td6, e1])  # [?, 32 * 2, 256, 512]

    td7 = up_conv2d_relu(td6, 3, name='td7')  # [?, 3, 512, 1024]
    td = layers.Activation('sigmoid')(td7)
    top_decoder = Model(input, td)

    # Bottom decoder branch
    bd1 = layers.Convolution2D(1024, (3, 3), (1, 1), padding='same', activation='relu', name='bd1_conv'+'_conv')(top_decoder.get_layer('td1_upsample').output)  # [?, 1024, 8, 16]
    bd1 = layers.Concatenate(axis=1, name='bd1_concat')([bd1, td1])  # [?, 1024 * 3, 8, 16]

    bd2 = up_conv2d_relu(bd1, 512, name='bd2')  # [?, 512, 16, 32]
    bd2 = layers.Concatenate(axis=1, name='bd2_concat')([bd2, td2])  # [?, 512 * 3, 16, 32]

    bd3 = up_conv2d_relu(bd2, 256, name='bd3')  # [?, 256, 32, 64]
    bd3 = layers.Concatenate(axis=1, name='bd3_concat')([bd3, td3])  # [?, 256 * 3, 32, 64]

    bd4 = up_conv2d_relu(bd3, 128, name='bd4')  # [?, 128, 64, 128]
    bd4 = layers.Concatenate(axis=1, name='bd4_concat')([bd4, td4])  # [?, 128 * 3, 64, 128]

    bd5 = up_conv2d_relu(bd4, 64, name='bd5')  # [?, 64, 128, 256]
    bd5 = layers.Concatenate(axis=1, name='bd5_concat')([bd5, td5])  # [?, 64 * 3, 128, 256]

    bd6 = up_conv2d_relu(bd5, 32, name='bd6')  # [?, 32, 256, 512]
    bd6 = layers.Concatenate(axis=1, name='bd6_concat')([bd6, td6])  # [?, 32 * 3, 256, 512]

    bd7 = up_conv2d_relu(bd6, 1, name='bd7')  # [?, 1, 512, 1024]
    bd = layers.Activation('sigmoid')(bd7)
    bot_decoder = Model(input, bd)

    # 3D box
    # reg = layers.Concatenate(axis=1, name='reg_input')([td, bd])  # [?, 4, 512, 1024]
    # reg = conv2d_relu_pool(reg, 8, name='reg_downsample1')  # [?, 8, 256, 512]
    # reg = conv2d_relu_pool(reg, 16, name='reg_downsample2')  # [?, 16, 128, 256]
    # reg = conv2d_relu_pool(reg, 32, name='reg_downsample3')  # [?, 32, 64, 128]
    # reg = conv2d_relu_pool(reg, 64, name='reg_downsample4')  # [?, 64, 32, 64]
    # reg = conv2d_relu_pool(reg, 128, name='reg_downsample5')  # [?, 128, 16, 32]
    # reg = conv2d_relu_pool(reg, 256, name='reg_downsample6')  # [?, 256, 8, 16]
    # reg = conv2d_relu_pool(reg, 512, name='reg_downsample7')  # [?, 512, 4, 8]
    # reg = layers.Flatten(name='reg_flatten')(reg)
    # reg = layers.Dense(1024, activation='relu', name='reg_dense1')(reg)
    # reg = layers.Dense(256, activation='relu', name='reg_dense2')(reg)
    # reg = layers.Dense(64, activation='relu', name='reg_dense3')(reg)
    # reg = layers.Dense(6, name='reg_dense4')(reg)

    # model = Model(input, [top_decoder, bot_decoder, reg])
    model = Model(input, [td, bd])
    return model
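
The helpers conv2d_relu_pool and up_conv2d_relu aren't shown; judging from the layer summary below (3x3 kernels, 'same' padding, channels-first data format, 2x2 pooling and nearest upsampling), they are presumably something like:

from keras import layers

def conv2d_relu_pool(x, filters, name):
    # 3x3 conv + ReLU, then 2x2 max pool (channels-first throughout)
    x = layers.Conv2D(filters, (3, 3), padding='same', activation='relu',
                      data_format='channels_first', name=name + '_conv')(x)
    return layers.MaxPooling2D((2, 2), data_format='channels_first',
                               name=name + '_pool')(x)

def up_conv2d_relu(x, filters, name):
    # 2x nearest upsampling, then 3x3 conv + ReLU
    x = layers.UpSampling2D(2, data_format='channels_first',
                            name=name + '_upsample')(x)
    return layers.Conv2D(filters, (3, 3), padding='same', activation='relu',
                         data_format='channels_first', name=name + '_conv')(x)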

And the number of parameters per layer is shown here:

Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            (None, 6, 512, 1024) 0                                            
__________________________________________________________________________________________________
e1_conv (Conv2D)                (None, 32, 512, 1024 1760        input_1[0][0]                    
__________________________________________________________________________________________________
e1_pool (MaxPooling2D)          (None, 32, 256, 512) 0           e1_conv[0][0]                    
__________________________________________________________________________________________________
e2_conv (Conv2D)                (None, 64, 256, 512) 18496       e1_pool[0][0]                    
__________________________________________________________________________________________________
e2_pool (MaxPooling2D)          (None, 64, 128, 256) 0           e2_conv[0][0]                    
__________________________________________________________________________________________________
e3_conv (Conv2D)                (None, 128, 128, 256 73856       e2_pool[0][0]                    
__________________________________________________________________________________________________
e3_pool (MaxPooling2D)          (None, 128, 64, 128) 0           e3_conv[0][0]                    
__________________________________________________________________________________________________
e4_conv (Conv2D)                (None, 256, 64, 128) 295168      e3_pool[0][0]                    
__________________________________________________________________________________________________
e4_pool (MaxPooling2D)          (None, 256, 32, 64)  0           e4_conv[0][0]                    
__________________________________________________________________________________________________
e5_conv (Conv2D)                (None, 512, 32, 64)  1180160     e4_pool[0][0]                    
__________________________________________________________________________________________________
e5_pool (MaxPooling2D)          (None, 512, 16, 32)  0           e5_conv[0][0]                    
__________________________________________________________________________________________________
e6_conv (Conv2D)                (None, 1024, 16, 32) 4719616     e5_pool[0][0]                    
__________________________________________________________________________________________________
e6_pool (MaxPooling2D)          (None, 1024, 8, 16)  0           e6_conv[0][0]                    
__________________________________________________________________________________________________
e7_conv (Conv2D)                (None, 2048, 8, 16)  18876416    e6_pool[0][0]                    
__________________________________________________________________________________________________
e7_pool (MaxPooling2D)          (None, 2048, 4, 8)   0           e7_conv[0][0]                    
__________________________________________________________________________________________________
td1_upsample (UpSampling2D)     (None, 2048, 8, 16)  0           e7_pool[0][0]                    
__________________________________________________________________________________________________
td1_conv (Conv2D)               (None, 1024, 8, 16)  18875392    td1_upsample[0][0]               
__________________________________________________________________________________________________
td1_concat (Concatenate)        (None, 2048, 8, 16)  0           td1_conv[0][0]                   
                                                                 e6_pool[0][0]                    
__________________________________________________________________________________________________
bd1_conv_conv (Conv2D)          (None, 1024, 8, 16)  18875392    td1_upsample[0][0]               
__________________________________________________________________________________________________
td2_upsample (UpSampling2D)     (None, 2048, 16, 32) 0           td1_concat[0][0]                 
__________________________________________________________________________________________________
bd1_concat (Concatenate)        (None, 3072, 8, 16)  0           bd1_conv_conv[0][0]              
                                                                 td1_concat[0][0]                 
__________________________________________________________________________________________________
td2_conv (Conv2D)               (None, 512, 16, 32)  9437696     td2_upsample[0][0]               
__________________________________________________________________________________________________
bd2_upsample (UpSampling2D)     (None, 3072, 16, 32) 0           bd1_concat[0][0]                 
__________________________________________________________________________________________________
td2_concat (Concatenate)        (None, 1024, 16, 32) 0           td2_conv[0][0]                   
                                                                 e5_pool[0][0]                    
__________________________________________________________________________________________________
bd2_conv (Conv2D)               (None, 512, 16, 32)  14156288    bd2_upsample[0][0]               
__________________________________________________________________________________________________
td3_upsample (UpSampling2D)     (None, 1024, 32, 64) 0           td2_concat[0][0]                 
__________________________________________________________________________________________________
bd2_concat (Concatenate)        (None, 1536, 16, 32) 0           bd2_conv[0][0]                   
                                                                 td2_concat[0][0]                 
__________________________________________________________________________________________________
td3_conv (Conv2D)               (None, 256, 32, 64)  2359552     td3_upsample[0][0]               
__________________________________________________________________________________________________
bd3_upsample (UpSampling2D)     (None, 1536, 32, 64) 0           bd2_concat[0][0]                 
__________________________________________________________________________________________________
td3_concat (Concatenate)        (None, 512, 32, 64)  0           td3_conv[0][0]                   
                                                                 e4_pool[0][0]                    
__________________________________________________________________________________________________
bd3_conv (Conv2D)               (None, 256, 32, 64)  3539200     bd3_upsample[0][0]               
__________________________________________________________________________________________________
td4_upsample (UpSampling2D)     (None, 512, 64, 128) 0           td3_concat[0][0]                 
__________________________________________________________________________________________________
bd3_concat (Concatenate)        (None, 768, 32, 64)  0           bd3_conv[0][0]                   
                                                                 td3_concat[0][0]                 
__________________________________________________________________________________________________
td4_conv (Conv2D)               (None, 128, 64, 128) 589952      td4_upsample[0][0]               
__________________________________________________________________________________________________
bd4_upsample (UpSampling2D)     (None, 768, 64, 128) 0           bd3_concat[0][0]                 
__________________________________________________________________________________________________
td4_concat (Concatenate)        (None, 256, 64, 128) 0           td4_conv[0][0]                   
                                                                 e3_pool[0][0]                    
__________________________________________________________________________________________________
bd4_conv (Conv2D)               (None, 128, 64, 128) 884864      bd4_upsample[0][0]               
__________________________________________________________________________________________________
td5_upsample (UpSampling2D)     (None, 256, 128, 256 0           td4_concat[0][0]                 
__________________________________________________________________________________________________
bd4_concat (Concatenate)        (None, 384, 64, 128) 0           bd4_conv[0][0]                   
                                                                 td4_concat[0][0]                 
__________________________________________________________________________________________________
td5_conv (Conv2D)               (None, 64, 128, 256) 147520      td5_upsample[0][0]               
__________________________________________________________________________________________________
bd5_upsample (UpSampling2D)     (None, 384, 128, 256 0           bd4_concat[0][0]                 
__________________________________________________________________________________________________
td5_concat (Concatenate)        (None, 128, 128, 256 0           td5_conv[0][0]                   
                                                                 e2_pool[0][0]                    
__________________________________________________________________________________________________
bd5_conv (Conv2D)               (None, 64, 128, 256) 221248      bd5_upsample[0][0]               
__________________________________________________________________________________________________
td6_upsample (UpSampling2D)     (None, 128, 256, 512 0           td5_concat[0][0]                 
__________________________________________________________________________________________________
bd5_concat (Concatenate)        (None, 192, 128, 256 0           bd5_conv[0][0]                   
                                                                 td5_concat[0][0]                 
__________________________________________________________________________________________________
td6_conv (Conv2D)               (None, 32, 256, 512) 36896       td6_upsample[0][0]               
__________________________________________________________________________________________________
bd6_upsample (UpSampling2D)     (None, 192, 256, 512 0           bd5_concat[0][0]                 
__________________________________________________________________________________________________
td6_concat (Concatenate)        (None, 64, 256, 512) 0           td6_conv[0][0]                   
                                                                 e1_pool[0][0]                    
__________________________________________________________________________________________________
bd6_conv (Conv2D)               (None, 32, 256, 512) 55328       bd6_upsample[0][0]               
__________________________________________________________________________________________________
bd6_concat (Concatenate)        (None, 96, 256, 512) 0           bd6_conv[0][0]                   
                                                                 td6_concat[0][0]                 
__________________________________________________________________________________________________
td7_upsample (UpSampling2D)     (None, 64, 512, 1024 0           td6_concat[0][0]                 
__________________________________________________________________________________________________
bd7_upsample (UpSampling2D)     (None, 96, 512, 1024 0           bd6_concat[0][0]                 
__________________________________________________________________________________________________
td7_conv (Conv2D)               (None, 3, 512, 1024) 1731        td7_upsample[0][0]               
__________________________________________________________________________________________________
bd7_conv (Conv2D)               (None, 1, 512, 1024) 865         bd7_upsample[0][0]               
__________________________________________________________________________________________________
activation (Activation)         (None, 3, 512, 1024) 0           td7_conv[0][0]                   
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 1, 512, 1024) 0           bd7_conv[0][0]                   
==================================================================================================
Total params: 94,347,396
Trainable params: 94,347,396
Non-trainable params: 0

Some images fail at the optimization step

Hi, really good work!
When I evaluate some images at the optimization step, it no longer works. Most of the images I use succeed, but some of them, like the one below, fail. Can you help me? :D

Undefined function or variable 'cor_fn_t'.
Error in samplingPanoBox (line 38)
            line_can(line_n+1:end,score_id(line_id)) =cor_fn_t(2*score_id(line_id),1);

3D ground truth interpretation

Hello,

First, thank you for sharing this great work. A quick question about the panoContext_box_train.t7 tensor:

The paper mentions 6 ground-truth 3D parameters: sw, sl, sh, tx, tz, r_theta. The first 6 elements of the box tensor above (box[{{1},{1},{1,6}}]), which I believe contain those parameters for the first example image, read:

sw = -0.5154072972870558
sl = -0.6748731674025037
sh = -1.316387492900166
tx = -0.24216556285261603
tz = -0.2114205765327388
r_theta = 0.08283438070600802

A naive interpretation would suggest that the room is almost 3x higher than it is wide. Is there a reason for the negative scale factors? Any guidance on interpretation would be much appreciated.
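
For inspecting the tensor from Python, the third-party torchfile package can usually read such files (a sketch; the exact tensor shape is an assumption):

import torchfile  # pip install torchfile

box = torchfile.load('panoContext_box_train.t7')
print(box.shape)           # shape is an assumption, e.g. (N, 1, 6)
print(box[0].ravel()[:6])  # sw, sl, sh, tx, tz, r_theta for the first image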

Training does not converge

Hello,
thank you for sharing this great work.
I tried the edge training ('th driver_pano_edg.lua') following your guidance, but the loss does not converge. I checked the code and the data but cannot find the reason. Could you give me some advice?

The output is as follows:

done
414
Uploaded training
46
Uploaded validation
start training
update param, loss = 0.64732569, gradnorm = 6.7421e+00
update param, loss = 2.26082158, gradnorm = 9.6880e-01
update param, loss = 2.10377669, gradnorm = 1.5707e-03
update param, loss = 2.14185762, gradnorm = 0.0000e+00
update param, loss = 2.16074014, gradnorm = 0.0000e+00
update param, loss = 2.17253232, gradnorm = 0.0000e+00
...
update param, loss = 2.05997038, gradnorm = 0.0000e+00
update param, loss = 2.17931867, gradnorm = 0.0000e+00
update param, loss = 2.07385349, gradnorm = 0.0000e+00
iteration 8000, loss = 2.07385349, gradnorm = 0.0000e+00
validation loss = 2.14375149

Original mapping between dataset ids and panoContext names

Hi

What is the mapping between the images in the .t7 files (from data.zip) and their original name in the panoContext dataset? (and Stanford 2d-3d)

I tried looking in the file ./gt/panoContext_train.txt, but it doesn't match.
For example:
./gt/panoContext_train.txt[1] says pano_aurfmkmrmsfgau.png,
but that is actually
./data/panoContext_img_train.t7[{{336},{},{},{}}]

and
./data/panoContext_img_train.t7[{{1},{},{},{}}]
is actually the file pano_93a57c28c5e11bb9c96f944c2a649f2b.jpg, which appears as
./gt/panoContext_train.txt[364]

Is there some way to find the original mapping between the datasets?

Also, do you have the rotation matrix that was used to align each image? Or should I just run getManhattanAndAlign.m on the images to obtain those exact images?
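
Failing an official mapping, one fallback is a brute-force nearest-image search. A sketch under several assumptions: torchfile can read the archive, and aligned versions of the originals (here a hypothetical ./aligned/ directory produced by getManhattanAndAlign.m) are available, since the .t7 images are Manhattan-aligned and raw panos may not match pixel-wise:

import numpy as np
import torchfile
from PIL import Image

imgs = torchfile.load('./data/panoContext_img_train.t7')  # assumed (N, 3, H, W), floats in [0, 1]
names = [l.strip() for l in open('./gt/panoContext_train.txt')]

def thumb_from_chw(a, size=(32, 16)):
    # cheap descriptor: grayscale thumbnail of a CHW float image
    g = Image.fromarray((a.mean(axis=0) * 255).astype(np.uint8))
    return np.asarray(g.resize(size), dtype=np.float32)

def thumb_from_file(path, size=(32, 16)):
    g = Image.open(path).convert('L').resize(size)
    return np.asarray(g, dtype=np.float32)

refs = {n: thumb_from_file('./aligned/' + n) for n in names}  # hypothetical directory

for i in range(len(imgs)):
    t = thumb_from_chw(imgs[i])
    best = min(refs, key=lambda n: np.mean((refs[n] - t) ** 2))
    print(i, '->', best)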
