cvlab-stonybrook / dm-count
Code for NeurIPS 2020 paper: Distribution Matching for Crowd Counting.
License: MIT License
Thanks to the authors. I have read the paper and think it is excellent work. However, the training and OT code are not provided, which makes the work difficult to follow.
Hello, the URL for this project's visual results is no longer valid. Do you have an alternative link?
The OT loss is even less than 0:
Epoch 46 Train, Loss: 3.89, OT Loss: -6.02e-07, Wass Distance: 803.41, OT obj value: 113.64, Count Loss: 3.55, TV Loss: 0.35, MSE: 8.88 MAE: 3.55, Cost 24.5 sec
Hi, I notice that during training the images are cropped into square shapes. Is it possible to compute the OT loss on rectangular images during training? Thanks in advance for your time!
Using the code, I observe very large run-to-run randomness. For example, on the QNRF dataset I obtain the following test-set results (MAE and MSE):
run 1: 87.621, 149.75
run 2: 92.988, 168.47
run 3: 96.175, 167.79
In the paper, 85.6 and 148.3 are reported. May I ask if the authors have any ideas for reducing this large experimental randomness? With variance this large, how can we draw conclusions about which model performs well and which doesn't?
Thanks a lot.
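One common way to make such variance visible is to report the mean and standard deviation over seeds rather than single runs. A minimal sketch using the three runs quoted above (plain Python, not from the repo):

```python
import statistics

# MAE/MSE from the three QNRF runs reported above
mae_runs = [87.621, 92.988, 96.175]
mse_runs = [149.75, 168.47, 167.79]

mae_mean = statistics.mean(mae_runs)
mae_std = statistics.stdev(mae_runs)   # sample std over the 3 seeds
mse_mean = statistics.mean(mse_runs)
mse_std = statistics.stdev(mse_runs)

print(f"MAE: {mae_mean:.2f} +/- {mae_std:.2f}")
print(f"MSE: {mse_mean:.2f} +/- {mse_std:.2f}")
```

With a spread of roughly ±4 MAE, comparing single runs of two models is indeed inconclusive; averaging over several seeds is the usual remedy.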
Hello, thanks for the great work!
Do you have any advice on using your work with smaller images that are not technically crowds but rather occlusions of two, three, or four people? I retrained the model and get heatmaps that localize heads quite well, but it doesn't count the people in the image properly.
For example, I can see that two areas have been located on the heatmap, but only one person is counted in the end. Do you recommend changing something in the code for images that are less dense than a crowd? For your information, I trained on a dataset of small images (approximately 100x60 pixels), so I added padding to reach a size of 512x512.
Any advice would be highly appreciated. Thanks
Hi, thanks for your amazing work!
I have a question about density map estimation. In the paper, you mention that you don't use precomputed density maps for the ground-truth annotations because they hurt generalization performance, so you use only point annotations, right? But then I got lost, because in the toy problem you say "the source density map ẑ is initialized from a uniform distribution between 0 and 0.01", and the dataset preprocessing scripts also compute a Gaussian density map.
So my doubt is whether a precomputed density map is needed, and if so, how I can compute it the way you do.
Thanks!
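For reference, the standard way baselines rasterize point annotations into a Gaussian density map looks like the sketch below. This is a generic recipe assuming scipy, not the repo's exact preprocessing code, and DM-Count itself supervises on point annotations rather than on such maps:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def points_to_density(points, h, w, sigma=4.0):
    """Rasterize head-point annotations (x, y) into a Gaussian density map.

    Each point contributes a unit of mass, so the map integrates to the
    head count (gaussian_filter's default 'reflect' boundary preserves mass).
    """
    density = np.zeros((h, w), dtype=np.float32)
    for x, y in points:
        xi, yi = int(x), int(y)
        if 0 <= xi < w and 0 <= yi < h:
            density[yi, xi] += 1.0
    return gaussian_filter(density, sigma)

# Example: two heads in a 100x100 image
dm = points_to_density([(30, 40), (70, 60)], 100, 100)
print(dm.sum())  # ~2.0, the number of annotated heads
```

Such maps are typically used for visualization or for density-regression baselines; whether the repo needs them at train time is exactly the question above.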
Is multi-GPU training supported? When I use nn.DataParallel, the loss becomes extremely large, and so does the MAE.
Hi Boyu,
I really appreciate your and your team's contribution to crowd counting. It seems that the proposed DM-Count loss can improve the performance a lot.
However, I found the OT loss defined in ot_loss.py
very confusing. There are many notations without proper comments or documentation. Could you please standardise the code so that it only takes the predicted density map and the ground-truth density map as inputs?
Many thanks,
Yiming
train.py fails to run (with PyTorch 1.2) and raises the following error.
Traceback (most recent call last):
  File "C:/Program Files (x86)/DM-Count-master/train.py", line 64, in <module>
    trainer.train()
  File "C:\Program Files (x86)\DM-Count-master\train_helper.py", line 110, in train
    self.train_eopch()
  File "C:\Program Files (x86)\DM-Count-master\train_helper.py", line 126, in train_eopch
    for step, (inputs, points, gt_discrete) in enumerate(self.dataloaders['train']):
  File "C:\Anaconda\envs\py37\lib\site-packages\torch\utils\data\dataloader.py", line 819, in __next__
    return self._process_data(data)
  File "C:\Anaconda\envs\py37\lib\site-packages\torch\utils\data\dataloader.py", line 846, in _process_data
    data.reraise()
  File "C:\Anaconda\envs\py37\lib\site-packages\torch\_utils.py", line 369, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "C:\Anaconda\envs\py37\lib\site-packages\torch\utils\data\_utils\worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\Anaconda\envs\py37\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Anaconda\envs\py37\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Program Files (x86)\DM-Count-master\datasets\crowd.py", line 124, in __getitem__
    return self.train_transform(img, keypoints)
  File "C:\Program Files (x86)\DM-Count-master\datasets\crowd.py", line 83, in train_transform
    gt_discrete = gen_discrete_map(h, w, keypoints)
  File "C:\Program Files (x86)\DM-Count-master\datasets\crowd.py", line 37, in gen_discrete_map
    discrete_map = torch.zeros(im_width * im_height).scatter_add(0, index=p_index, src=torch.ones(im_width * im_height)).view(im_height, im_width).numpy()
RuntimeError: Expected object of scalar type Long but got scalar type Int for argument #3 'index'
Your work has helped me a lot; thank you very much. I really need your help with this problem.
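The error in the traceback above is a dtype mismatch: scatter_add requires a Long (int64) index tensor, but on this platform the index arrives as Int (int32). A minimal sketch of a likely fix, casting the index before the call; the variable names follow the traceback, and the surrounding index computation in datasets/crowd.py is assumed rather than copied:

```python
import torch

def gen_discrete_map(im_height, im_width, p_index):
    """Scatter unit mass at flat pixel indices p_index into an (h, w) map.

    p_index may arrive as int32 on some platforms (e.g. Windows), so it is
    cast to int64 here, since scatter_add only accepts a Long index tensor.
    """
    discrete_map = (
        torch.zeros(im_width * im_height)
        .scatter_add(0, index=p_index.long(),              # cast Int -> Long
                     src=torch.ones(im_width * im_height))
        .view(im_height, im_width)
        .numpy()
    )
    return discrete_map

# Example: two annotations falling on the same pixel (flat index 5) of a 3x4 map
m = gen_discrete_map(3, 4, torch.tensor([5, 5], dtype=torch.int32))
print(m.sum())  # 2.0: one unit of mass per annotation
```

Casting with `.long()` at the point where p_index is built should make the same script run unchanged on platforms where the default integer dtype is int32.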
I'm not able to reproduce the same results as in the paper, in particular on NWPU. Did you use the same parameters as reported in the paper? Did you train for 1000 epochs?
Hi, thanks for this meaningful work. I am wondering how min_size and max_size affect the final results and the training process?
Hi, thanks for your excellent work and the open-source codes!
I'm very interested in the novel optimization criterion proposed in your paper and tried to reproduce it locally. But I ran into a confusing performance degradation when I constrained the height and width of the test images to be divisible by a particular number (e.g. 8, 16, or 32). This preprocessing strategy is inspired by the C^3 Framework and ensures that the output sizes of the down-sampling layers meet the requirements of subsequent processing. More details of the test results on ShanghaiTech Part_A are shown in the table below.
| | pretrained | resize | MAE | MSE |
|---|---|---|---|---|
| Reported | ✔️ | | 59.68 | 95.72 |
| Reproduced | ✔️ | ✔️ | 62.47 | 101.98 |
I am sincerely asking how such a simple operation can have such an obvious impact on the experimental results.
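For context, the divisibility-constrained resize described above can be sketched as follows. The function name and rounding rule here are illustrative, not the C^3 Framework's exact code; note that rounding to the nearest multiple rescales the image slightly, which also rescales apparent head sizes and can plausibly shift the counting error:

```python
from PIL import Image

def resize_divisible(img, divisor=32):
    """Resize a PIL image so height and width are multiples of `divisor`.

    Rounds each side to the nearest multiple (never below one multiple),
    so the aspect ratio changes slightly.
    """
    w, h = img.size
    new_w = max(divisor, round(w / divisor) * divisor)
    new_h = max(divisor, round(h / divisor) * divisor)
    return img.resize((new_w, new_h), Image.BILINEAR)

# A 1000x750 test image becomes 992x736 with divisor=32
out = resize_divisible(Image.new("RGB", (1000, 750)))
print(out.size)  # (992, 736)
```

An alternative that avoids resampling entirely is to pad to the next multiple and crop the prediction back, which keeps the pixel grid (and head scales) unchanged.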
This is very great work. Thanks a lot.
Could you also share the dataloader and experimental settings (crop size and so on) for the UCF-CC-50 dataset? I am currently working on this dataset but have trouble obtaining the results you reported. I think it may be due to the dataloader or other experimental settings.
There was an error when I was training or testing the models in the pretrained_models folder.
  File "/home/user/Documents/LJY/DM-Count/test.py", line 49, in <module>
    model.load_state_dict(torch.load(model_path, device), strict=False)
  File "/home/user/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 386, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "/home/user/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 563, in _load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.
There were no errors when I experimented with a model I had trained myself.
Thank you very much for your work. It helped me a lot
Please help me. Thank you!
Excellent work!
But why not use a more advanced backbone, such as ResNet or CSPNet?
Is that in order to demonstrate the effectiveness of the designed loss?
Hi, you mentioned in the paper that you don't compute a density map. Why, then, did I find a Gaussian filter being applied when processing NWPU?
Hi author, I would like to ask how well your model works for estimating the density of waterfowl (e.g. geese)?
I'm trying to understand the evaluation procedure you used for the ShanghaiTech B dataset. Are you monitoring performance on the test set during training and then selecting the best performance? Or, otherwise, how do you decide when to stop training to achieve the best test performance? ShanghaiTech A/B don't come with validation sets.
Why set the optimization target to the derivative of the dual term ⟨β*, ẑ⟩ times the prediction, instead of the original OT distance? It makes sense to optimize the entire OT loss term W(z, ẑ) or its dual ⟨β*, ẑ⟩ to make the dot regression sparser and more accurate, but why the derivative? Is this mentioned in the paper or the supplement?
Hello, I want to know whether this project includes preprocessing for the ShanghaiTech Part A and Part B datasets; we want to test this excellent model on our own dataset. Thank you for your help.
Hi, Firstly I would like to say that your paper and code are very impressive. Thank you for releasing the code.
The default hyperparameters worked well for the NWPU and ShanghaiTech datasets but give bad results (MAE around 600) on UCF-QNRF. Could you please clarify this?
Thanks in advance :)
I think you should add model.eval() in test.py. It won't alter the default result, but if batch normalization is enabled it will definitely affect the result. I had to change it because I am using ResNet50 instead of VGG19.
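A minimal illustration of why this matters, using a toy model rather than the repo's network: in train mode BatchNorm normalizes with the current batch's statistics, while eval mode uses the accumulated running statistics, so the same input produces different outputs. (Per the comment above, the default VGG19 result happens to be unaffected; a BN backbone like ResNet50 is not.)

```python
import torch
import torch.nn as nn

# Toy model with a BatchNorm layer, standing in for a BN-equipped backbone
bn_model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8))

x = torch.randn(1, 3, 32, 32)

bn_model.train()
y_train_mode = bn_model(x)   # normalizes with this batch's statistics

bn_model.eval()
y_eval_mode = bn_model(x)    # normalizes with running statistics

print(torch.allclose(y_train_mode, y_eval_mode))  # False: outputs differ
```

Calling model.eval() before inference (and wrapping the loop in torch.no_grad()) is the standard PyTorch testing idiom, so adding it to test.py is safe even for BN-free models.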
I think the idea is very novel and the results are very good. I want to follow your work and use Optimal Transport in another task. But a problem occurs during training: ot_obj_values is always negative (e.g. a loss of -412.955), and the Wasserstein distance is relatively small (e.g. 5.207), which is quite different from DM-Count.
Could the regularization parameter reg be the reason? Could you give me some suggestions?
Best wishes! Thank you very much.
Hello, I used three different graphics cards for a controlled experiment and reproduced your results on an RTX 2080 Ti. Thank you very much for your help. I wish you a bright future.
Hello, I have written a demo to visualize my own outputs, but the resulting images are very small and not very good. Could you make the demo files public?
You just put the preprocessing code in the train_transform function of Crowd_sh.
Hello! I tried to replicate the ShanghaiTech Part A results, and my best result is MAE = 61.7, which is two points off your reported result. Reading your article, I found that you set λ1 to 0.1 and λ2 to 0.01. Does this mean the losses are weighted differently during training, i.e. loss = 0.1 * ot_loss + count_loss + 0.01 * tv_loss? Could you explain it to me? Thank you!
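The weighted combination asked about above can be sketched as follows, with λ1 = 0.1 and λ2 = 0.01 taken from the paper. The individual loss terms here are placeholder numbers; see ot_loss.py in the repo for how the OT term is actually computed:

```python
def total_loss(ot_loss, count_loss, tv_loss, lambda1=0.1, lambda2=0.01):
    """Combine DM-Count's three loss terms with the paper's weights.

    lambda1 scales the OT loss, lambda2 scales the TV loss; the count
    loss enters with weight 1.
    """
    return count_loss + lambda1 * ot_loss + lambda2 * tv_loss

# Example with illustrative values: 3.5 + 0.1*2.0 + 0.01*0.4
print(total_loss(ot_loss=2.0, count_loss=3.5, tv_loss=0.4))  # 3.704
```

So yes, under this reading the three terms are not weighted equally: the count loss dominates, while the OT and TV terms act as scaled regularizers.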