cvlab-stonybrook / dm-count
Code for NeurIPS 2020 paper: Distribution Matching for Crowd Counting.
License: MIT License
Thanks to the authors. I have read the paper and think it is excellent work. However, the training and OT code are not provided, which makes the work difficult to follow.
Hello, the URL for this project's visual results is no longer valid. Do you have an alternative link?
The OT loss is even less than 0:
Epoch 46 Train, Loss: 3.89, OT Loss: -6.02e-07, Wass Distance: 803.41, OT obj value: 113.64, Count Loss: 3.55, TV Loss: 0.35, MSE: 8.88 MAE: 3.55, Cost 24.5 sec
Hi, I notice that during training the images are cropped into square shapes. Is it possible to compute the OT loss on rectangular images during training? Thanks in advance for your time!
Using the code, I observe very large run-to-run randomness. For example, on the QNRF dataset I obtain the following test-set results (MAE and MSE):
run 1: 87.621, 149.75
run 2: 92.988, 168.47
run 3: 96.175, 167.79
In the paper, 85.6 and 148.3 are reported. May I ask if the authors have any ideas for reducing this large experimental randomness? With variance this large, how can we draw conclusions about which model performs well and which doesn't?
Thanks a lot.
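One common way to make such variance visible is to report the mean and standard deviation over seeds rather than single runs. A minimal sketch using the three runs quoted above (plain Python, not from the repo):

```python
import statistics

# MAE/MSE from the three QNRF runs reported above
mae_runs = [87.621, 92.988, 96.175]
mse_runs = [149.75, 168.47, 167.79]

mae_mean = statistics.mean(mae_runs)
mae_std = statistics.stdev(mae_runs)   # sample std over the 3 seeds
mse_mean = statistics.mean(mse_runs)
mse_std = statistics.stdev(mse_runs)

print(f"MAE: {mae_mean:.2f} +/- {mae_std:.2f}")
print(f"MSE: {mse_mean:.2f} +/- {mse_std:.2f}")
```

With a spread of roughly ±4 MAE, comparing single runs of two models is indeed inconclusive; averaging over several seeds is the usual remedy.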
Hello, thanks for the great work!
Do you have any advice on using your work with smaller images that are not technically crowds but rather occlusions of two, three, or four people? I retrained the model and get heatmaps that localize heads quite well, but it doesn't count the people in the image properly.
For example, I can see that two areas have been located on the heatmap, but only one person is counted in the end. Do you recommend changing something in the code for images that are less dense than a crowd? For your information, I trained on a dataset of small images (approximately 100x60 pixels), so I added padding to reach a size of 512x512.
Any advice would be highly appreciated. Thanks
Hi, thanks for your amazing work!
I have a question about density map estimation. In the paper, you mention that you don't use precomputed density maps for the ground-truth annotations because they hurt generalization performance, so you use only point annotations, right? But then I got lost, because in the toy problem you say "the source density map ẑ is initialized from a uniform distribution between 0 and 0.01", and the dataset preprocessing scripts also compute a Gaussian density map.
So my doubt is whether a precomputed density map is needed, and if so, how I can compute it the way you do.
Thanks!
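For reference, the standard way baselines rasterize point annotations into a Gaussian density map looks like the sketch below. This is a generic recipe assuming scipy, not the repo's exact preprocessing code, and DM-Count itself supervises on point annotations rather than on such maps:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def points_to_density(points, h, w, sigma=4.0):
    """Rasterize head-point annotations (x, y) into a Gaussian density map.

    Each point contributes a unit of mass, so the map integrates to the
    head count (gaussian_filter's default 'reflect' boundary preserves mass).
    """
    density = np.zeros((h, w), dtype=np.float32)
    for x, y in points:
        xi, yi = int(x), int(y)
        if 0 <= xi < w and 0 <= yi < h:
            density[yi, xi] += 1.0
    return gaussian_filter(density, sigma)

# Example: two heads in a 100x100 image
dm = points_to_density([(30, 40), (70, 60)], 100, 100)
print(dm.sum())  # ~2.0, the number of annotated heads
```

Such maps are typically used for visualization or for density-regression baselines; whether the repo needs them at train time is exactly the question above.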
Is multi-GPU training supported? When I use nn.DataParallel, the loss becomes extremely large, and so does the MAE.
Hi Boyu,
I really appreciate your and your team's contribution to crowd counting. It seems that the proposed DM-Count loss can improve the performance a lot.
However, I found the OT loss defined in ot_loss.py
very confusing. There are many notations without proper comments or documentation. Could you please standardise the code so that it only takes the predicted density map and the ground-truth density map as inputs?
Many thanks,
Yiming
train.py fails to run (with PyTorch 1.2) and raises the following error.
Traceback (most recent call last):
  File "C:/Program Files (x86)/DM-Count-master/train.py", line 64, in <module>
    trainer.train()
  File "C:\Program Files (x86)\DM-Count-master\train_helper.py", line 110, in train
    self.train_eopch()
  File "C:\Program Files (x86)\DM-Count-master\train_helper.py", line 126, in train_eopch
    for step, (inputs, points, gt_discrete) in enumerate(self.dataloaders['train']):
  File "C:\Anaconda\envs\py37\lib\site-packages\torch\utils\data\dataloader.py", line 819, in __next__
    return self._process_data(data)
  File "C:\Anaconda\envs\py37\lib\site-packages\torch\utils\data\dataloader.py", line 846, in _process_data
    data.reraise()
  File "C:\Anaconda\envs\py37\lib\site-packages\torch\_utils.py", line 369, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "C:\Anaconda\envs\py37\lib\site-packages\torch\utils\data\_utils\worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\Anaconda\envs\py37\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Anaconda\envs\py37\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Program Files (x86)\DM-Count-master\datasets\crowd.py", line 124, in __getitem__
    return self.train_transform(img, keypoints)
  File "C:\Program Files (x86)\DM-Count-master\datasets\crowd.py", line 83, in train_transform
    gt_discrete = gen_discrete_map(h, w, keypoints)
  File "C:\Program Files (x86)\DM-Count-master\datasets\crowd.py", line 37, in gen_discrete_map
    discrete_map = torch.zeros(im_width * im_height).scatter_add(0, index=p_index, src=torch.ones(im_width * im_height)).view(im_height, im_width).numpy()
RuntimeError: Expected object of scalar type Long but got scalar type Int for argument #3 'index'
Your work has helped me a lot; thank you very much. I really need your help with this problem.
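The error in the traceback above is a dtype mismatch: scatter_add requires a Long (int64) index tensor, but on this platform the index arrives as Int (int32). A minimal sketch of a likely fix, casting the index before the call; the variable names follow the traceback, and the surrounding index computation in datasets/crowd.py is assumed rather than copied:

```python
import torch

def gen_discrete_map(im_height, im_width, p_index):
    """Scatter unit mass at flat pixel indices p_index into an (h, w) map.

    p_index may arrive as int32 on some platforms (e.g. Windows), so it is
    cast to int64 here, since scatter_add only accepts a Long index tensor.
    """
    discrete_map = (
        torch.zeros(im_width * im_height)
        .scatter_add(0, index=p_index.long(),              # cast Int -> Long
                     src=torch.ones(im_width * im_height))
        .view(im_height, im_width)
        .numpy()
    )
    return discrete_map

# Example: two annotations falling on the same pixel (flat index 5) of a 3x4 map
m = gen_discrete_map(3, 4, torch.tensor([5, 5], dtype=torch.int32))
print(m.sum())  # 2.0: one unit of mass per annotation
```

Casting with `.long()` at the point where p_index is built should make the same script run unchanged on platforms where the default integer dtype is int32.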
I'm not able to reproduce the same results as in the paper, in particular on NWPU. Did you use the same parameters as reported in the paper? Did you train for 1000 epochs?
Hi, thanks for this meaningful work. I am wondering how min_size and max_size affect the final results and the training process?
Hi, thanks for your excellent work and the open-source codes!
I'm very interested in the novel optimization criterion proposed in your paper and tried to reproduce it locally. But I ran into a confusing performance degradation when I constrained the height and width of the test images to be divisible by a particular number (e.g. 8, 16, or 32). This preprocessing strategy is inspired by the C^3 Framework and ensures that the output sizes of the down-sampling layers meet the requirements of subsequent processing. More details of the test results on ShanghaiTech Part_A are shown in the table below.
| | pretrained | resize | MAE | MSE |
|---|---|---|---|---|
| Reported | ✔️ | | 59.68 | 95.72 |
| Reproduced | ✔️ | ✔️ | 62.47 | 101.98 |
I am sincerely asking how such a simple operation can have such an obvious impact on the experimental results.
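For context, the divisibility-constrained resize described above can be sketched as follows. The function name and rounding rule here are illustrative, not the C^3 Framework's exact code; note that rounding to the nearest multiple rescales the image slightly, which also rescales apparent head sizes and can plausibly shift the counting error:

```python
from PIL import Image

def resize_divisible(img, divisor=32):
    """Resize a PIL image so height and width are multiples of `divisor`.

    Rounds each side to the nearest multiple (never below one multiple),
    so the aspect ratio changes slightly.
    """
    w, h = img.size
    new_w = max(divisor, round(w / divisor) * divisor)
    new_h = max(divisor, round(h / divisor) * divisor)
    return img.resize((new_w, new_h), Image.BILINEAR)

# A 1000x750 test image becomes 992x736 with divisor=32
out = resize_divisible(Image.new("RGB", (1000, 750)))
print(out.size)  # (992, 736)
```

An alternative that avoids resampling entirely is to pad to the next multiple and crop the prediction back, which keeps the pixel grid (and head scales) unchanged.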
This is very great work. Thanks a lot.
Could you also share the dataloader and experimental settings (crop size and so on) for the UCF-CC-50 dataset? I am currently working on this dataset but have trouble obtaining the results you reported. I think it may be due to the dataloader or other experimental settings.
There was an error when I was training or testing the models in the pretrained_models folder.
  File "/home/user/Documents/LJY/DM-Count/test.py", line 49, in <module>
    model.load_state_dict(torch.load(model_path, device), strict=False)
  File "/home/user/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 386, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "/home/user/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 563, in _load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.
There were no errors when I experimented with a model I had trained myself.
Thank you very much for your work. It helped me a lot
Please help me. Thank you!
Excellent work!
But why not use a more advanced backbone, such as ResNet or CSPNet?
Is that in order to demonstrate the effectiveness of the designed loss?
Hi, you mentioned in the paper that you don't compute a density map. Why, then, did I find a Gaussian filter being applied when processing NWPU?
Hi author, I would like to ask how well your model works for estimating the density of waterfowl (e.g. geese)?
I'm trying to understand the evaluation procedure you used for the ShanghaiTech B dataset. Are you monitoring performance on the test set during training and then selecting the best performance? Or, otherwise, how do you decide when to stop training to achieve the best test performance? ShanghaiTech A/B don't come with validation sets.
Why set the optimization target to the derivative of the dual term ⟨β*, ẑ⟩ times the prediction, instead of the original OT distance? It makes sense to optimize the entire OT loss term W(z, ẑ) or its dual ⟨β*, ẑ⟩ to make the dot regression sparser and more accurate, but why the derivative? Is this mentioned in the paper or the supplement?
Hello, I want to know whether this project includes preprocessing for the ShanghaiTech Part A and Part B datasets; we want to test this excellent model on our own dataset. Thank you for your help.
Hi, Firstly I would like to say that your paper and code are very impressive. Thank you for releasing the code.
The default hyperparameters worked well for the NWPU and ShanghaiTech datasets but give bad results (MAE around 600) on UCF-QNRF. Could you please clarify this?
Thanks in advance :)
I think you should add model.eval() in test.py. It won't alter the default result, but if batch normalization is enabled it will definitely affect the result. I had to change it because I am using ResNet50 instead of VGG19.
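A minimal illustration of why this matters, using a toy model rather than the repo's network: in train mode BatchNorm normalizes with the current batch's statistics, while eval mode uses the accumulated running statistics, so the same input produces different outputs. (Per the comment above, the default VGG19 result happens to be unaffected; a BN backbone like ResNet50 is not.)

```python
import torch
import torch.nn as nn

# Toy model with a BatchNorm layer, standing in for a BN-equipped backbone
bn_model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8))

x = torch.randn(1, 3, 32, 32)

bn_model.train()
y_train_mode = bn_model(x)   # normalizes with this batch's statistics

bn_model.eval()
y_eval_mode = bn_model(x)    # normalizes with running statistics

print(torch.allclose(y_train_mode, y_eval_mode))  # False: outputs differ
```

Calling model.eval() before inference (and wrapping the loop in torch.no_grad()) is the standard PyTorch testing idiom, so adding it to test.py is safe even for BN-free models.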
I think the idea is very novel and the results are very good. I want to follow your work and use Optimal Transport in another task. But a problem occurs during training: ot_obj_values is always negative (e.g. a loss of -412.955), and the Wasserstein distance is relatively small (e.g. 5.207), which is quite different from DM-Count.
Could the regularization parameter reg be the reason? Could you give me some suggestions?
Best wishes! Thank you very much.
Hello, I used three different graphics cards for a controlled experiment and reproduced your results on an RTX 2080 Ti. Thank you very much for your help. I wish you a bright future.
Hello, I have written a demo to visualize my own outputs, but the resulting images are very small and not very good. Could you make the demo files public?
You just put the preprocessing code in the train_transform function of Crowd_sh.
Hello! I tried to replicate the ShanghaiTech Part A results, and my best result is MAE = 61.7, which is two points off your reported result. Reading your article, I found that you set λ1 to 0.1 and λ2 to 0.01. Does this mean the losses are weighted differently during training, i.e. loss = 0.1 * ot_loss + count_loss + 0.01 * tv_loss? Could you explain it to me? Thank you!
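The weighted combination asked about above can be sketched as follows, with λ1 = 0.1 and λ2 = 0.01 taken from the paper. The individual loss terms here are placeholder numbers; see ot_loss.py in the repo for how the OT term is actually computed:

```python
def total_loss(ot_loss, count_loss, tv_loss, lambda1=0.1, lambda2=0.01):
    """Combine DM-Count's three loss terms with the paper's weights.

    lambda1 scales the OT loss, lambda2 scales the TV loss; the count
    loss enters with weight 1.
    """
    return count_loss + lambda1 * ot_loss + lambda2 * tv_loss

# Example with illustrative values: 3.5 + 0.1*2.0 + 0.01*0.4
print(total_loss(ot_loss=2.0, count_loss=3.5, tv_loss=0.4))  # 3.704
```

So yes, under this reading the three terms are not weighted equally: the count loss dominates, while the OT and TV terms act as scaled regularizers.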