
hkchengrex / CascadePSP

812 stars · 16 watchers · 92 forks · 3.19 MB

[CVPR 2020] CascadePSP: Toward Class-Agnostic and Very High-Resolution Segmentation via Global and Local Refinement

Home Page: https://hkchengrex.com/CascadePSP/

License: MIT License

Python 100.00%
segmentation deep-learning pytorch cvpr2020 computer-vision segmentation-refinement refinement-network high-resolution

cascadepsp's People

Contributors

hkchengrex


cascadepsp's Issues

Questions about the function process_high_res_im

Hi,

may I ask why this kind of threshold can be used to define the uninteresting area? And what happens if the object is relatively small? Thanks a lot.

# Skip when it is not an interesting crop anyway
seg_part_norm = (seg_224_part>0).float()
high_thres = 0.9
low_thres = 0.1
if (seg_part_norm.mean() > high_thres) or (seg_part_norm.mean() < low_thres):
    continue
grid_images = safe_forward(model, im_part, seg_224_part, seg_56_part)
grid_pred_224 = grid_images['pred_224'].to(aggre_device)

What is the ground truth?

I have two classes, so I designed the ground truth to have 0 for background and 1 for the other class. Is this right, or should the ground truth be formatted another way?
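For a two-class setup, a single-channel binary mask is the usual format: 0 for background and the foreground as 1 (or 255 when saved as an 8-bit image). A minimal sketch, using a hypothetical label map:

import numpy as np

# Hypothetical two-class label map: 0 = background, 1 = object.
label = np.zeros((256, 256), dtype=np.uint8)
label[64:192, 64:192] = 1

# Binary ground truth as an 8-bit image: background 0, foreground 255.
gt = (label == 1).astype(np.uint8) * 255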

How to make training model for paper?

Hi, great work. After reading the paper, I have a question about the model. Your code has just a Global Module, but I think the training process is as shown in the picture below. Am I right?
[figure: the questioner's diagram of the presumed training pipeline]

Cannot execute the test refinement

Hey guys,

first of all, thanks for your great work.

I just tried to reproduce the test example (aeroplane), but received the following error:

     66             image = self.im_transform(image).unsqueeze(0).to(self.device)
---> 67             mask = self.seg_transform((mask>127).astype(np.uint8)*255).unsqueeze(0).to(self.device)
     68             if len(mask.shape) < 4:
     69                 mask = mask.unsqueeze(0)

~/Documents/Programming/VirtualEnvironments/python3_venv/lib/python3.7/site-packages/torchvision/transforms/transforms.py in __call__(self, img)
     47     def __call__(self, img):
     48         for t in self.transforms:
---> 49             img = t(img)
     50         return img
     51 

~/Documents/Programming/VirtualEnvironments/python3_venv/lib/python3.7/site-packages/torchvision/transforms/transforms.py in __call__(self, pic)
     74             Tensor: Converted image.
     75         """
---> 76         return F.to_tensor(pic)
     77 
     78     def __repr__(self):

~/Documents/Programming/VirtualEnvironments/python3_venv/lib/python3.7/site-packages/torchvision/transforms/functional.py in to_tensor(pic)
     46     if isinstance(pic, np.ndarray):
     47         # handle numpy array
---> 48         img = torch.from_numpy(pic.transpose((2, 0, 1)))
     49         # backward compatibility
     50         if isinstance(img, torch.ByteTensor):

ValueError: axes don't match array

I ran the following:

refiner = refine.Refiner(device='cpu')

image = cv2.imread('cascade/aeroplane.jpg')
mask = cv2.imread('cascade/aeroplane.png', cv2.IMREAD_GRAYSCALE)

output = refiner.refine(image, mask, fast=True, L=900) 

I tried it with both torch 1.0.0 / torchvision 0.2.1 and the newest versions, but always get the same error. Is this a known issue?
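One workaround that has resolved this error in similar setups, assuming the failure is to_tensor calling transpose((2, 0, 1)) on a 2-D array in older torchvision, is to give the grayscale mask an explicit channel axis before passing it to refine (newer torchvision versions add the axis themselves):

import cv2

mask = cv2.imread('cascade/aeroplane.png', cv2.IMREAD_GRAYSCALE)

# An (H, W) array breaks transpose((2, 0, 1)); an (H, W, 1) array does not.
if mask is not None and mask.ndim == 2:
    mask = mask[:, :, None]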

Effect of hyperparameter L

[screenshot: refinement outputs for different values of L]

Hello, I tried testing with the same L = 900 and got a result that looks like the original image. Then I decreased L; I got a result, but it looks like it is mixed with the original image.
Why does this happen?

Confirm the training is correct

I am training on my own dataset; the results after one epoch are below:
It 1450 [TRAIN] [grad_loss ]: 0.0108861
It 1450 [TRAIN] [iou/orig_i ]: 15341.660
It 1450 [TRAIN] [iou/orig_u ]: 30589.000
It 1450 [TRAIN] [iou/new_i_224 ]: 0.0000000
It 1450 [TRAIN] [iou/new_u_224 ]: 17783.060
It 1450 [TRAIN] [iou/new_i_56 ]: 0.0000000
It 1450 [TRAIN] [iou/new_u_56 ]: 17783.060
It 1450 [TRAIN] [iou/new_i_28 ]: 0.0000000
It 1450 [TRAIN] [iou/new_u_28 ]: 17783.060
It 1450 [TRAIN] [iou/new_i_28_2 ]: 0.0000000
It 1450 [TRAIN] [iou/new_u_28_2 ]: 17783.060
It 1450 [TRAIN] [iou/new_i_28_3 ]: 0.0000000
It 1450 [TRAIN] [iou/new_u_28_3 ]: 17783.060
It 1450 [TRAIN] [iou/new_i_56_2 ]: 0.0000000
It 1450 [TRAIN] [iou/new_u_56_2 ]: 17783.060
It 1450 [TRAIN] [total_loss ]: 0.4961829
It 1450 [TRAIN] [ce_loss/s_0 ]: 0.3328005
It 1450 [TRAIN] [l1_loss/s_0 ]: 0.0295411
It 1450 [TRAIN] [l2_loss/s_0 ]: 0.0295335
It 1450 [TRAIN] [loss/s_0 ]: 0.1135052
It 1450 [TRAIN] [ce_loss/s_1 ]: 0.0864096
It 1450 [TRAIN] [l1_loss/s_1 ]: 0.0507903
It 1450 [TRAIN] [l2_loss/s_1 ]: 0.0253867
It 1450 [TRAIN] [loss/s_1 ]: 0.0864096
It 1450 [TRAIN] [ce_loss/s_2 ]: 0.0875118
It 1450 [TRAIN] [l1_loss/s_2 ]: 0.0458868
It 1450 [TRAIN] [l2_loss/s_2 ]: 0.0255962
It 1450 [TRAIN] [loss/s_2 ]: 0.0616267
It 1450 [TRAIN] [ce_loss/s_3 ]: 0.0865094
It 1450 [TRAIN] [l1_loss/s_3 ]: 0.0508352
It 1450 [TRAIN] [l2_loss/s_3 ]: 0.0253933
It 1450 [TRAIN] [loss/s_3 ]: 0.0865094
It 1450 [TRAIN] [ce_loss/s_4 ]: 0.0865065
It 1450 [TRAIN] [l1_loss/s_4 ]: 0.0508407
It 1450 [TRAIN] [l2_loss/s_4 ]: 0.0253932
It 1450 [TRAIN] [loss/s_4 ]: 0.0865065
It 1450 [TRAIN] [ce_loss/s_5 ]: 0.0875015
It 1450 [TRAIN] [l1_loss/s_5 ]: 0.0459030
It 1450 [TRAIN] [l2_loss/s_5 ]: 0.0255950
It 1450 [TRAIN] [loss/s_5 ]: 0.0616252
It 1450 [TRAIN] [iou/orig_iou ]: 0.5015417
It 1450 [TRAIN] [iou/new_iou_224 ]: 0.0000000
It 1450 [TRAIN] [iou/iou_gain_224 ]: -0.501541
It 1450 [TRAIN] [iou/new_iou_56 ]: 0.0000000
It 1450 [TRAIN] [iou/iou_gain_56 ]: -0.501541
It 1450 [TRAIN] [iou/new_iou_28 ]: 0.0000000
It 1450 [TRAIN] [iou/iou_gain_28 ]: -0.501541
It 1450 [TRAIN] [iou/new_iou_28_2 ]: 0.0000000
It 1450 [TRAIN] [iou/iou_gain_28_2 ]: -0.501541
It 1450 [TRAIN] [iou/new_iou_28_3 ]: 0.0000000
It 1450 [TRAIN] [iou/iou_gain_28_3 ]: -0.501541
It 1450 [TRAIN] [iou/new_iou_56_2 ]: 0.0000000
It 1450 [TRAIN] [iou/iou_gain_56_2 ]: -0.501541
I am not sure whether this is normal; waiting for your reply.

OnlineDataset Issues

  1. The perturb=False flag is broken (self.bilinear_dual_transform_im is missing).
  2. Much more worrying: the code seems to apply the horizontal flip independently to the ground truth and the image. With 50% probability, that means the ground truth is corrupted and no longer aligns with the RGB image (a paired-flip sketch follows below).
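A minimal sketch of a paired flip that keeps the two aligned (the function name is hypothetical; the repo's transform pipeline is structured differently):

import random
from torchvision.transforms import functional as TF

def paired_hflip(im, gt, p=0.5):
    # Sample the flip decision once, then apply it to both inputs.
    if random.random() < p:
        im = TF.hflip(im)
        gt = TF.hflip(gt)
    return im, gt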

How to get the name_seg.png when testing on semantic segmentation?

Sorry to bother you again.
1. When testing semantic segmentation with the BIG dataset you provided, I found that the dataset does not contain the name_seg.png images described in the README. I would like to know how to generate that style of seg.png image.
2. Your paper mentions a crop process in the Local Step, but I cannot find that process in your code. Does this process output the seg.png image?
Thanks for your time and kindness.

Dataset problem

Hello, author! My dataset was annotated with labelme, which generates a JSON file, and each picture is segmented into classes. Can this network be trained on such data? Can the segmentation result for a single image contain multiple categories? If not, how should I modify my annotated images?

AttributeError: 'NoneType' object has no attribute 'group'

Hello, when I run

python eval_post.py --dir /home/zj/PycharmProjects/CascadePSP-master/CascadePSP-master/output_directory --output /home/zj/PycharmProjects/CascadePSP-master/CascadePSP-master/output_temp_result

the following error occurs:

File "eval_post.py", line 64, in <module>
    this_class = int(re.search(r'\d+', gt_name[::-1]).group()[::-1]) - 1
AttributeError: 'NoneType' object has no attribute 'group'
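For context, the failing line parses the trailing digits of the ground-truth file name (by searching the reversed string) as a class index, so any file whose name contains no digits makes re.search return None. A guarded version of the same logic, with a hypothetical file name:

import re

gt_name = 'aeroplane_12.png'  # hypothetical ground-truth file name

match = re.search(r'\d+', gt_name[::-1])
if match is None:
    raise ValueError(f'No class index in file name: {gt_name}')
this_class = int(match.group()[::-1]) - 1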

pytorch version?

Hi! Awesome work!

It would be useful to know what exact versions were used with this repo. Do you think you can add that to the readme / requirements file?

How to get the refined mask file

In the example code, I learned how to get the masked images, but I don't see how to get the refined mask file. Could you please tell me which function to use to get the refined mask file?
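For what it's worth, in the README demo the array returned by refiner.refine() is itself the refined mask, so writing it to disk yields the refined mask file; a sketch assuming a single-channel uint8 output:

import cv2
import segmentation_refinement as refine

image = cv2.imread('test/aeroplane.jpg')
mask = cv2.imread('test/aeroplane.png', cv2.IMREAD_GRAYSCALE)

refiner = refine.Refiner(device='cpu')
output = refiner.refine(image, mask, fast=True, L=900)

# The returned array is the refined mask; save it as a PNG.
cv2.imwrite('refined_mask.png', output)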

How to use "squeezenet" as backend to train model??

I can run with resnet,but too slow ,so how to use squeezenet???
I meet error:
RuntimeError: Given groups=1, weight of size [64, 3, 3, 3], expected input[8, 6, 224, 224] to have 3 channels, but got 6 channels instead
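The 6-channel input suggests that the network concatenates the RGB image with segmentation channels before the first convolution, so a swapped-in backbone needs its first conv widened accordingly. A generic sketch of widening a conv with zero-initialized extra channels (an illustration, not the repo's code):

import torch
import torch.nn as nn

def widen_first_conv(conv, extra_in):
    # Copy of `conv` that accepts `extra_in` more input channels; the new
    # channels are zero-initialized so pretrained behavior is preserved.
    new = nn.Conv2d(conv.in_channels + extra_in, conv.out_channels,
                    conv.kernel_size, conv.stride, conv.padding,
                    bias=conv.bias is not None)
    with torch.no_grad():
        new.weight.zero_()
        new.weight[:, :conv.in_channels] = conv.weight
        if conv.bias is not None:
            new.bias.copy_(conv.bias)
    return new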

Got a quite strange result with a chessboard-like pattern on it!

image = cv2.imread('/mnt/work/yuyanpeng/code/personal/PaddleSeg/deploy/python/dataset/360/doll/image/00_26.jpg')
mask = cv2.imread(
    '/mnt/work/yuyanpeng/code/personal/PaddleSeg/deploy/python/dataset/360/doll/mask/00_26.jpg', cv2.IMREAD_GRAYSCALE)
    
# model_path can also be specified here
# This step takes some time to load the model
refiner = refine.Refiner(device='cuda:0', model_folder="../downloaded_model") # device can also be 'cpu'

# Fast - Global step only.
# Smaller L -> Less memory usage; faster in fast mode.
output = refiner.refine(image, mask, fast=False, L=800) 

plt.imshow(output)
plt.show()

The result is below.
[image: refined output showing a checkerboard-like pattern]

OneDrive link doesn't work

Hi, there is something wrong with OneDrive when we use it in China. Could you upload the model somewhere like Google Drive? Thanks!
(Hello author, the OneDrive page opens and then disappears after about a second; I am not sure whether it is because I am in mainland China. If convenient, could you upload it to Google Drive or Baidu Netdisk? Thank you!)

How to generate Binary Masks from labelled dataset?

Sir,

Since you reannotated the PASCAL VOC 2012 dataset, could you please tell me how to generate binary masks from my dataset, which was labelled with the LabelMe Image Annotator in Python (it generates JSON files), so that I can train this model on my custom data? I am doing instance segmentation and labelled the data accordingly.

Please provide that code.

Thanks
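A minimal sketch for rasterizing LabelMe polygons into a binary mask, assuming the standard LabelMe JSON layout with 'shapes', 'imageHeight', and 'imageWidth' keys (file names hypothetical):

import json
from PIL import Image, ImageDraw

with open('sample.json') as f:
    ann = json.load(f)

mask = Image.new('L', (ann['imageWidth'], ann['imageHeight']), 0)
draw = ImageDraw.Draw(mask)
for shape in ann['shapes']:
    # Each shape stores its polygon vertices under 'points'.
    draw.polygon([tuple(pt) for pt in shape['points']], fill=255)

mask.save('sample_binary_mask.png')

For instance segmentation, write one mask per shape instead of merging all shapes into one image.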

testing segmentation

Sorry, I have trained this model using the VOC2012 dataset and want to test it on the val set,
but I don't know why the test directory should include three types of images: _im.jpg, _seg.png, and _gt.png.
I understand the ground-truth images and RGB images, but what are the input segmentation images?
How do I produce them?

Something about ResNet50

Hi,

may I ask what the purpose of using dilation in your code is? And how did you add the zero-initialized channels to the first conv? I cannot find it.

Thank you.
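On the dilation question: replacing stride with dilation in the later stages keeps the feature map at 1/8 of the input resolution instead of 1/32, which preserves spatial detail for dense prediction. Stock torchvision exposes the same idea, so as a reference point (this is not the repo's exact code):

import torchvision

# Output stride 8: the strides of the last two stages become dilations.
backbone = torchvision.models.resnet50(
    replace_stride_with_dilation=[False, True, True])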

Some train problems

  1. What does "some_unique_id" mean in the training command?
  2. I saw the training images you provided. Is the model trained only with low-resolution images?

Question about the transformation details

Hi @hkchengrex,
Actually, I still have some questions about the transformation details. May I ask what your purpose is in normalizing the 'seg' with mean 0.5 and std 0.5? Shouldn't 'seg' be a mask that only has the values 0 and 1? I am also not clear on why you use torch.tanh in your network. Are there advantages? Thank you so much!
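One plausible reading, not a confirmed answer: Normalize(0.5, 0.5) maps a {0, 1} mask to {-1, +1}, which matches the (-1, 1) output range of torch.tanh, so input and predicted segmentations live on the same scale:

import torch

seg = torch.tensor([0.0, 1.0])   # mask values after ToTensor()
print((seg - 0.5) / 0.5)         # tensor([-1., 1.]), same range as tanh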

Long response time with the demo code

Hi, I tried the following demo code on both my CPU-only local machine and on SageMaker, but it seems I never get a response:

import cv2
import time
import matplotlib.pyplot as plt
import segmentation_refinement as refine
image = cv2.imread('test/aeroplane.jpg')
mask = cv2.imread('test/aeroplane.png', cv2.IMREAD_GRAYSCALE)

# model_path can also be specified here
# This step takes some time to load the model
refiner = refine.Refiner(device='cpu') # device can also be 'cpu'

# Fast - Global step only.
# Smaller L -> Less memory usage; faster in fast mode.
output = refiner.refine(image, mask, fast=False, L=900) 

plt.imshow(output)
plt.show()

I had a successful run using my 2070 GPU, which responded in 3 s. Do you know what takes so long on a CPU-only machine? If I am using only the pre-trained model, is there a way to improve the speed without losing accuracy? Thank you very much; it's a fantastic project!

Boundary accuracy

Hello,
I have a question about the boundary accuracy metric implementation.
If I understand correctly, you take the subset of the segmentation maps around the GT boundaries for various radii.
But why use the accuracy metric, since it won't go lower than 0.5 even if the model predicts nothing? That seems misleading for assessing boundary quality.

I suggest computing the dilated boundaries of the GTs and predictions and computing IoU or Dice on those for each radius (a sketch follows below).
This ranges from 0 to 1, from no overlap to perfect overlap.

I think this is how it's computed in the edge-detection community, except that they use the F1 score.
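A sketch of the suggested metric, assuming boolean masks and a square structuring element (this is not the repo's implementation):

import numpy as np
from scipy.ndimage import binary_dilation, binary_erosion

def boundary_iou(gt, pred, radius):
    # 1-pixel boundaries of ground truth and prediction (bool arrays).
    gt_edge = gt ^ binary_erosion(gt)
    pr_edge = pred ^ binary_erosion(pred)
    # Dilate both boundaries by the given radius, then compare the bands.
    struct = np.ones((2 * radius + 1, 2 * radius + 1), dtype=bool)
    gt_band = binary_dilation(gt_edge, structure=struct)
    pr_band = binary_dilation(pr_edge, structure=struct)
    inter = np.logical_and(gt_band, pr_band).sum()
    union = np.logical_or(gt_band, pr_band).sum()
    return inter / union if union else 1.0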

How to Convert code to C++

Hi, I really like your code; it works particularly well on our model. We now need to deploy the model to TensorRT in C++. Although a PyTorch model like PSPNet can be converted very quickly, how can a method like process_high_res_im be written with the TensorRT (C++) API? Thanks a lot!

How to train on my own data?

Thanks for your amazing work. It is very useful to me! However, how can I train a model on my own data? I browsed training.md but still did not grasp the key points. Do I have to download the training data you provided in training.md? Thanks for your help.

About sample rate in boundary_modification.py

Thanks for your work. I am trying to understand why you chose such a small value for the sample rate: you modify only contours whose length is more than ten points. I thought a big sample rate would retain more information from a given contour, so can you explain why you chose 0.1 as the sample rate?

Perturbed labels

Dear author:
Is there any data-perturbation code in this package? I didn't find it. Could you send me a copy? Thank you very much!
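The sample-rate question above refers to boundary_modification.py, which is where the repo's perturbation logic lives. As a crude stand-in for experimentation, coarse input segmentations can be faked by randomly dilating or eroding the ground truth; this is far simpler than the repo's method and only a sketch:

import numpy as np
import cv2

def crude_perturb(gt, max_iter=5):
    # Randomly grow or shrink a binary uint8 mask to imitate a coarse
    # input segmentation.
    kernel = np.ones((3, 3), np.uint8)
    it = np.random.randint(1, max_iter + 1)
    if np.random.rand() < 0.5:
        return cv2.dilate(gt, kernel, iterations=it)
    return cv2.erode(gt, kernel, iterations=it)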

ValueError: axes don't match array

If you run test.py from the segmentation-refinement package source, you get the error ValueError: axes don't match array, thrown at this line:

mask = self.seg_transform((mask>127).astype(np.uint8)*255).unsqueeze(0).to(self.device)

A probably important note: before running python test.py, I changed cuda:0 to cpu on the line where the Refiner is created, since I tested on a machine without CUDA.

ResNet modifications

Could you please clarify what modifications are made in ResNet50 backbone and why?

Usually the ResNet50 output stride is 32, but in your version it is 8 (due to dilations in the last layers, I think).
I also noticed that layer3 in your code generates 23 blocks instead of the default 6.

issue of reproducing performance

Hi,

This work is very impressive!
I ran this model using the released checkpoint, and the performance is very good. However, when I tried to train a new model, I was not able to reach the same performance. Could you please tell me which hyperparameters were used to train the released model?

I have tried two settings:

  1. The default setting in hyper_para.py.
  2. batch_size=9, lr=3.0e-4, iterations=60000, steps=30000, 2 GPUs, as indicated in the paper.

Thank you very much!

Test effect of own data

My own data involves semantic segmentation of two categories. Is this algorithm only useful for object segmentation, and not effective for region-level semantic segmentation?

DUT-OMRON Link is not working

Hi, @hkchengrex
DUT-OMRON Link (http://saliencydetection.net/duts/download/) is not working, I wonder if you could provide a google drive link for downloading the dataset? Many thanks.

os.system("wget -P ../tmp_download_files http://saliencydetection.net/duts/download/DUTS-TR.zip")
os.system("wget -P ../tmp_download_files http://saliencydetection.net/duts/download/DUTS-TE.zip")

About Loss Calculation

Hi,

may I ask what the advantage of using L1+L2 loss for supervision at high resolution is?
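One common reading, not a claim about the paper's exact motivation: the L2 term dominates on large errors while the L1 term keeps gradients informative on the small residuals that remain near boundaries. The combination itself is one line:

import torch.nn.functional as F

def l1_l2_loss(pred, target):
    # Combined L1 + L2 supervision on the predicted mask.
    return F.l1_loss(pred, target) + F.mse_loss(pred, target)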

download model error

requests.exceptions.ConnectionError: HTTPSConnectionPool(host='docs.google.com', port=443): Max retries exceeded with url: /uc?export=download&id=103nLN1JQCs2yASkna0HqfioYZO7MA_J9 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x000001C46728F860>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or the established connection failed because the connected host has failed to respond.',))
