polyworldpretrainednetwork's People

Contributors

zorzi-s

polyworldpretrainednetwork's Issues

Problem with choosing backbone

Hi, thank you for your excellent work and code!
I noticed that you used R2UNet as the backbone of your network, whereas Frame Field Learning uses UResNet101, and in that comparison your algorithm is more effective. I'd like to know whether you have tried other backbones (e.g. UResNet101, as in FFL) in your network, and whether they show similar results?
I would very much appreciate your reply.

Request for training component

Hey Stefano, glad that you put the code on GitHub, I appreciate it. However, I was wondering whether it is possible to train the model from scratch on a custom dataset. Did you also share the training module in the repo, or did I miss it? Thank you @zorzi-s

Segmentation mask

Hello There,

Thank you for sharing your work.

In your paper, it is indicated that corner detection is modeled as a segmentation task using a weighted binary cross-entropy loss, and that training the corner detection part requires polygonal annotations.
My question is: what do these polygonal annotations look like? Do they represent each corner as a single pixel with a value of 1, leaving all other pixels at 0? Or are they a binary polygon mask, meaning pixels on and inside the polygon are 1?
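For concreteness, the first interpretation would look something like this minimal sketch (a hypothetical helper I wrote for the question, not code from the repo):

    import numpy as np

    # Rasterise polygon vertices into a sparse binary corner mask:
    # every corner becomes a single pixel with value 1, everything else stays 0.
    def corner_mask(polygons, height, width):
        """polygons: list of (N, 2) arrays of (x, y) vertex coordinates."""
        mask = np.zeros((height, width), dtype=np.float32)
        for poly in polygons:
            for x, y in np.round(poly).astype(int):
                if 0 <= y < height and 0 <= x < width:
                    mask[y, x] = 1.0
        return mask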

Thanks

URLs `files.icg.tugraz.at/f/...` aren't always working

The following links don't always work. The polyworld_backbone link is particularly important, because without it it is not possible to proceed to inference and evaluation.

- Poster: [Seafile link](https://files.icg.tugraz.at/f/6a044f133c0d4dd992c5/)

After cloning the repo, download the _polyworld_backbone_ pre-trained weights from [here](https://files.icg.tugraz.at/f/a0375b84e10a44aea669/?dl=1), and place the file in the _trained_weights_ folder.

The CrowdAI Mapping Challenge dataset can be downloaded [here](https://files.icg.tugraz.at/d/a9d6a9412c0f49a88ab9/).

- [json results](https://files.icg.tugraz.at/d/1c7a26dd914d4e1fae98/): here you can find the output annotations in json format with and without using refinement vertex offsets.

- [shp results](https://files.icg.tugraz.at/d/06c7119eb35f431ca4c2/): here you can find archives containing the shapefile annotations ready to be visualized in QGIS.

The exhaustiveness of this list was verified with:

grep -r 'files.icg.tugraz.at'

Prediction in train mode?

Hi,

Thank you for releasing the code. I'm wondering why you set the network to train() mode during inference? Thanks.

model = R2U_Net()
model = model.cuda()
model = model.train()

head_ver = DetectionBranch()
head_ver = head_ver.cuda()
head_ver = head_ver.train()

suppression = NonMaxSuppression()
suppression = suppression.cuda()

matching = OptimalMatching()
matching = matching.cuda()
matching = matching.train()
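For context, this toy example shows the difference I am asking about: in train() mode, BatchNorm layers keep updating their running statistics at every forward pass, while eval() freezes them (this example is unrelated to the repo's actual modules):

    import torch
    import torch.nn as nn

    bn = nn.BatchNorm2d(3)
    x = torch.randn(1, 3, 8, 8)

    bn.train()
    bn(x)
    print(bn.running_mean)   # updated by the forward pass

    bn.eval()
    bn(x)
    print(bn.running_mean)   # unchanged in eval mode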

Question about "common corners" in Section 7

Hi,

Thanks for the well written and interesting paper.

In Section 7 you mention that "common corners could be efficiently solved ... by detecting the number of vertices located in the same position and sampling the visual descriptor multiple times ..."

My question is: how would ground-truth permutation matrix construction happen in this case? It seems that for each common corner with n sampled points there'd be n! valid ground-truth permutation matrices.
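To make the ambiguity concrete, here is a toy sketch (treating the ground-truth matrix as a successor matrix over the detected vertices is my reading of the paper, and the indices are made up):

    import numpy as np

    # The GT permutation matrix maps each detected vertex to its successor
    # along the polygon. If the same corner is sampled twice (indices 2 and 3
    # below share a location), swapping the duplicates gives a second,
    # equally valid ground-truth matrix.
    def successor_matrix(order):
        n = len(order)
        P = np.zeros((n, n))
        for i in range(n):
            P[order[i], order[(i + 1) % n]] = 1.0
        return P

    P_a = successor_matrix([0, 1, 2, 3, 4])   # duplicate visited as 2 then 3
    P_b = successor_matrix([0, 1, 3, 2, 4])   # duplicates swapped, same polygon
    print(np.array_equal(P_a, P_b))           # False: two distinct valid targets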

Thanks!

Questions about method implementation

Hi @zorzi-s, thanks for sharing the code of your interesting work. I have several questions regarding method implementation.

  1. When calculating the angle loss and the mask segmentation loss, the method must determine the instance correspondences between the predicted and GT polygons. How did you determine these correspondences? (One possible approach is sketched after this list.)
  2. By equations (8) and (9), you use both the clockwise and counterclockwise score matrices, but in your code (lines 156-157 of matching.py) I did not see any difference between these two score matrices.
  3. Would you please provide detailed ablation results for the different loss terms?

Thanks for your attention.
Regards,
Xiang
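As a concrete illustration for question 1, this is the kind of instance matching I had in mind (Hungarian matching on an IoU cost; this is my assumption, not something stated in the paper):

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    # Hypothetical instance correspondence: match predicted and GT building
    # masks by maximising total IoU, then keep confident pairs only.
    def match_instances(iou, thr=0.5):            # iou: (num_pred, num_gt)
        rows, cols = linear_sum_assignment(-iou)  # negate to maximise IoU
        return [(r, c) for r, c in zip(rows, cols) if iou[r, c] > thr]

    iou = np.array([[0.9, 0.1],
                    [0.2, 0.7]])
    print(match_instances(iou))                   # [(0, 0), (1, 1)]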

Question about Differentiable Polygon Rendering

@zorzi-s Hello! I am very pleased with your work, but I don't quite understand the algorithm of one of the modules: the formula you give for Differentiable Polygon Rendering. How is det(u_m, v_m) calculated specifically? Are u, v, and m all two-dimensional vectors?
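In case it helps to pin down what I mean, this is my current reading of det(u_m, v_m): the 2x2 determinant of the vectors from the pixel m to two consecutive vertices u and v. This interpretation is my assumption, not something confirmed by the paper:

    import numpy as np

    # Signed 2x2 determinant of (u - m, v - m); equals twice the signed area
    # of the triangle (m, u, v).
    def det_um_vm(u, v, m):
        um = np.asarray(u, dtype=float) - np.asarray(m, dtype=float)
        vm = np.asarray(v, dtype=float) - np.asarray(m, dtype=float)
        return um[0] * vm[1] - um[1] * vm[0]

    print(det_um_vm(u=(2.0, 0.0), v=(0.0, 2.0), m=(0.0, 0.0)))   # 4.0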

Source code of training process

Hello, I would like to implement the algorithm from your paper, but I cannot find the training script. Can you help me with that? I would greatly appreciate it if you could send me the source code of the training process.

Using a proper set of offsets for calculating the L_seg and L_angle losses

Thanks for the very interesting work. At the end of Section 4, you mention:
"Since the NMS block is not differentiable, the only way for the network to minimize L_seg and L_angle is to generate a proper set of offsets t for Equation 6."

Can you please elaborate more on this? Does this mean you are not using the t_i values generated by the attentional gnn? Do you propagate this loss through the score matrix generated by the optimal connection network?
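To illustrate what I mean, here is a toy sketch of how gradients could still reach the offsets even though NMS itself is not differentiable (the names and the stand-in loss are mine):

    import torch

    # NMS-selected vertex coordinates are treated as constants; only the
    # offsets predicted by the attentional GNN remain differentiable.
    v_nms = torch.tensor([[10.0, 20.0], [30.0, 25.0]])        # detached NMS output
    t = torch.zeros_like(v_nms, requires_grad=True)           # predicted offsets
    v = v_nms + t                                             # refined vertices (Eq. 6)

    target = torch.tensor([[10.5, 19.5], [29.0, 26.0]])
    loss = ((v - target) ** 2).mean()                         # stand-in for L_seg + L_angle
    loss.backward()
    print(t.grad)                                             # non-zero: gradients flow into t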

Possibility of doing predictions without ground-truth annotation files

Hi @zorzi-s, much appreciated for sharing your interesting and knowledgeable work! I am trying to evaluate the model on my own dataset. I fed the images together with annotation files in MS-COCO format to the model and obtained meaningful results, but when trying to make predictions based on the images alone, I am a bit perplexed about what type of input I am supposed to feed into the model. I tried PIL images and NumPy arrays but failed to execute the prediction step. Could you please give a suggestion on that?
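For reference, this is roughly how I am preparing the input at the moment; the normalisation and layout here are my assumptions, not taken from prediction.py:

    import numpy as np
    import torch
    from PIL import Image

    # Load a tile and convert it to a float NCHW batch of size 1.
    img = Image.open("tile.png").convert("RGB")
    x = torch.from_numpy(np.array(img)).float() / 255.0   # HWC in [0, 1]
    x = x.permute(2, 0, 1).unsqueeze(0).cuda()            # NCHW, batch of 1
    # x is then passed through the backbone and detection head as in prediction.py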

Simplifying the shape of curved walls

Hi Stefano,

I have a follow-up question regarding the CrowdAI dataset. I did some rough statistics on the dataset and found that the maximum number of vertices in a building can be 262 (usually buildings with curved walls). I was wondering whether you did some preprocessing, e.g. simplifying the shape to reduce the number of vertices (one example of what I mean is sketched at the end of this post)? If the number of vertices is too large, it will be problematic for the positional refinement part, right? I assume the positional refinement part requires the predicted polygons and the ground-truth polygons to have the same number of vertices. If my assumption is wrong, how do you calculate the angle loss between two polygons with different numbers of vertices?

Thanks in advance!

Best,
Yuanwen
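The kind of preprocessing I am asking about would be something like this sketch (Douglas-Peucker simplification via shapely; whether you did anything similar is exactly my question):

    import numpy as np
    from shapely.geometry import Polygon

    # A 262-vertex "round" building footprint, then simplified with a 1 px tolerance.
    theta = np.linspace(0, 2 * np.pi, 262, endpoint=False)
    curved = Polygon(np.c_[np.cos(theta), np.sin(theta)] * 50.0)
    simplified = curved.simplify(1.0, preserve_topology=True)
    print(len(curved.exterior.coords), "->", len(simplified.exterior.coords))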

Question about the loss

What does Y in L_det refer to? Is it the final predicted vertices or the vertex detection result of the first module?

About training

Hello There,
Thank you for sharing your work.
I would like to know whether you can now publish the source code for the training.
Thank you for your work!

Questions about Loss

Hi @zorzi-s, much appreciated for sharing your interesting and knowledgeable work!
I have a few questions I'd like to ask you.

  1. When calculating the matching loss, does the ground truth need to be in one-to-one correspondence with the predictions? In other words, is the dustbin from SuperGlue used? Could the training loss be obtained along the lines of the following code?
    import numpy as np
    import torch
    from scipy.spatial.distance import cdist

    # Build the ground-truth matches (adapted from SuperGlue training code):
    # kp1_projected are the keypoints of image 1 projected into image 2,
    # kp2_np are the keypoints detected in image 2.
    dists = cdist(kp1_projected, kp2_np)

    min1 = np.argmin(dists, axis=0)
    min2 = np.argmin(dists, axis=1)

    min1v = np.min(dists, axis=1)
    min1f = min2[min1v < 3]                      # candidates closer than 3 px

    xx = np.where(min2[min1] == np.arange(min1.shape[0]))[0]
    matches = np.intersect1d(min1f, xx)          # mutual nearest neighbours

    missing1 = np.setdiff1d(np.arange(kp1_np.shape[0]), min1[matches])
    missing2 = np.setdiff1d(np.arange(kp2_np.shape[0]), matches)

    # Matched pairs plus unmatched keypoints assigned to the dustbin index.
    MN = np.concatenate([min1[matches][np.newaxis, :], matches[np.newaxis, :]])
    MN2 = np.concatenate([missing1[np.newaxis, :], len(kp2) * np.ones((1, len(missing1)), dtype=np.int64)])
    MN3 = np.concatenate([len(kp1) * np.ones((1, len(missing2)), dtype=np.int64), missing2[np.newaxis, :]])
    all_matches = np.concatenate([MN, MN2, MN3], axis=1)

    # Negative log-likelihood over the ground-truth matches
    # (all_matches gains a leading batch dimension in the dataloader; batch size == 1 assumed).
    loss = []
    for i in range(len(all_matches[0])):
        x = all_matches[0][i][0]
        y = all_matches[0][i][1]
        loss.append(-torch.log(scores[0][x][y].exp()))
  2. Which are you using for the matching process: Chamfer Distance (CD) or Earth Mover's Distance (EMD)? Is a GT vertex mapped to each prediction?

Could you please give a suggestion on that? Thanks.

Train on custom datasets

Thanks for your excellent work and for sharing it here. I have tested it and it worked as expected. I am wondering whether you can also share how to train on a custom dataset. Many thanks.

Amazing result! Thanks for your research!

Amazing! Looking forward to your train.py. I just tried prediction.py with my own dataset, and it's amazing. Though I haven't read the paper yet, I think your work is very meaningful! I am now a little confused about the offset in the result; I hope I can resolve it after reading the paper. Thanks for your research!
(screenshot attached: untitled image)

About matching loss

@zorzi-s, hello. Does minimizing the negative log-likelihood of the positive matches of P in the matching loss mean that we only compute the entries of the ground-truth permutation matrix P̄ where the values are 1?
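In code, my reading would be the following small sketch (P_bar is the ground-truth permutation matrix, P the predicted soft assignment; the names and values are mine):

    import torch

    P_bar = torch.tensor([[0., 1.],
                          [1., 0.]])             # ground-truth permutation matrix
    P = torch.tensor([[0.2, 0.8],
                      [0.7, 0.3]])               # predicted assignment probabilities

    # Only entries where P_bar == 1 contribute to the negative log-likelihood.
    loss = -(P_bar * torch.log(P + 1e-8)).sum() / P_bar.sum()
    print(loss)                                  # mean of -log P over positive matches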

Challenging Test Set Image IDs for comparison

Hello! Thank you for releasing the code of the model. I find your work really interesting and have been meaning to draw comparisons with some of the results you report in your paper. In Figure 10 of your paper, you illustrate predictions on some challenging test-set images to show how well your model handles them.

Could you kindly share the image IDs of these images so that I can run inference on them and compare them against some of my own experiments? Not exactly a technical issue, but would appreciate it if you could help with this.

Thank you!
