hitachi-rd-cv / qpic

Repo for CVPR2021 paper "QPIC: Query-Based Pairwise Human-Object Interaction Detection with Image-Wide Contextual Information"

License: Apache License 2.0

Python 100.00%
coco hico-det-training human-object-interaction interaction-detection


qpic's Issues

How to replace DETR with Deformable DETR in QPIC

Hi,
according to the Deformable DETR paper, training converges much faster. Has anyone succeeded in training a QPIC model with Deformable DETR? Did it improve the results much? Thank you.

For V-COCO result reproduction, which evaluation code should I use?

I tried to reproduce your results on V-COCO and ran into some problems.
With your pre-trained model, the official evaluation code only achieves 56.5 for Scenario 1 and 58.6 for Scenario 2, while vcoco_eval.py in your project (which is in PPDM style) gives 58.35. However, the evaluation procedure in vcoco_eval.py looks like Scenario 2 of the official evaluation.
Which code should I use to reproduce your V-COCO experiments? And if I should use vcoco_eval.py, what do I need to do to get the evaluation results for both scenarios?

HICO-DET finetuned detector

Looking at the paper, it seems that QPIC, unlike the other models, did not fine-tune DETR on HICO-DET. Is that right?
Did you train the HOI head directly on top of the pre-trained DETR?

vcoco.pickle

Hello, could you share the vcoco.pickle file used by generate_vcoco_official.py? Thanks.

Different AP Results for pre-trained VCOCO models

Hi folks,

First of all, thank you for sharing this repository. I have a specific question about the evaluation results of the provided pre-trained V-COCO models. I followed the instructions you provided (for constructing the V-COCO annotation files and obtaining the pickle files), but my average role AP results differ from the V-COCO results in your table. For instance, the attached screenshot shows the Scenario 1 role AP output of the R-50 QPIC model. What could be possible reasons for this issue? Could you please provide assistance?

[screenshot: qpic_resnet50_sc1_roleap]

The influence of aux_loss

Hi authors,

thanks for your implementation, which helps my research a lot. In the supplementary material of your paper you state that the auxiliary loss is used following DETR. What happens if it is not used for HOI training? Is there a corresponding ablation experiment for QPIC?

How to get the file logs/checkpoint.pth?

Thanks for your interesting work!
I have a question about how to obtain the file logs/checkpoint.pth; I could not find it on your Google Drive.
Looking forward to your reply.

How to cope with images that have no HOI annotations?

Dear authors,
I recently modified the HICO dataset by removing some of the original 117 HOI categories, keeping, say, only 20. As a result, some images have no HOI annotations at all.
When I try to train on the modified dataset, I get this error:

File "/media/sdf/long/hoi/qpic/models/hoi.py", line 226, in forward
    indices = self.matcher(outputs_without_aux, targets)
  File "/media/sdf/long/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/media/sdf/long/conda/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 26, in decorate_context
    return func(*args, **kwargs)
  File "/media/sdf/long/hoi/qpic/models/matcher.py", line 119, in forward
    cost_verb_class = -(out_verb_prob.matmul(tgt_verb_labels_permute) / \
RuntimeError: mat1 dim 1 must match mat2 dim 0

I debugged into the code and realized that, for an image without HOI annotations, the tensors in the target are empty, as in the screenshot below:

[screenshot: empty target tensors]

I guess this is the reason for the error above.

I wonder, does the code support images without HOI annotations? If not, how can they be supported?
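
For reference, one workaround I am considering (a minimal sketch, not from the authors, assuming a HOIA-style annotation file where each image entry has an 'hoi_annotation' list; both paths are hypothetical) is to drop the images without any HOI annotations before training:

import json

ANNO_IN = 'annotations/trainval_modified.json'    # hypothetical input path
ANNO_OUT = 'annotations/trainval_nonempty.json'   # hypothetical output path

with open(ANNO_IN) as f:
    images = json.load(f)

# Keep only the images whose 'hoi_annotation' list is non-empty, so that
# the matcher never receives an empty target tensor.
nonempty = [img for img in images if img.get('hoi_annotation')]
print('kept %d of %d images' % (len(nonempty), len(images)))

with open(ANNO_OUT, 'w') as f:
    json.dump(nonempty, f)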

Why does the pre-trained V-COCO model have 81 object classes?

When I tried to evaluate the V-COCO model you provide, I had to set the parameter --num_obj_classes to 81. What is the reason for this setting? And should I also set --num_obj_classes 81 during training?

Thanks for your reply!

Trained Model

Could you provide the trained HICO-DET and V-COCO models? Thanks

Pre-trained parameters

Hi!
I want to train on the V-COCO dataset with:
python convert_parameters.py --load_path logs/checkpoint.pth --save_path params/detr-r50-pre-vcoco.pth --dataset vcoco
What is the file logs/checkpoint.pth, and where can I download it?
I would greatly appreciate a quick reply. Thank you very much for your work!

How to draw a heatmap for the decoder?

Sorry to interrupt, but I am very interested in how to produce a heatmap for the decoder like Fig. 5 in the paper. Could you provide the relevant scripts? Thank you!

Trained DETR parameters on V-COCO

Hi, thanks for your work! I wish to train on V-COCO, and I see that DETR needs to be pre-trained first. Would it be possible for you to provide the trained DETR parameters? Thanks.

Impact of the number of decoder layers

Hi, thanks for the interesting work.
Does the number of decoder layers have a significant impact on performance, as it does in DETR?
Also, I cannot find how many V100s you used. Eight, perhaps?

How to convert qpic.pth to an .onnx model

I am facing an issue with the ONNX conversion of this model. I tried:

torch.onnx.export(pth_model, dummy_input, "onnx_model.onnx", opset_version=11)
# dummy_input shape is [1, 3, 720, 1280], the same as this model expects.

However, the result is:

[array([[[nan, nan, nan],
         [nan, nan, nan]]], dtype=float32),
 array([[[nan, nan],
         [nan, nan]]], dtype=float32),
 array([[[nan, nan, nan, nan],
         [nan, nan, nan, nan]]], dtype=float32),
 array([[[nan, nan, nan, nan],
         [nan, nan, nan, nan]]], dtype=float32)]

It's a NaN party! Please comment on how to fix it.
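
For reference, a minimal export sketch under my own assumptions: pth_model is the loaded QPIC model in eval mode, and the input must be ImageNet-normalized as with DETR-style backbones (an unnormalized dummy tensor is one possible source of NaNs; I am not certain this is the cause here):

import torch

pth_model.eval()

# Random image in [0, 1], then ImageNet normalization (assumed preprocessing).
mean = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)
dummy = (torch.rand(1, 3, 720, 1280) - mean) / std

with torch.no_grad():
    torch.onnx.export(pth_model, dummy, 'onnx_model.onnx', opset_version=11)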

Reproducing the results on the V-COCO dataset

Hi, I can reproduce the results on HICO-DET, but I only get 51 on V-COCO. Could you please provide the log.txt? I am not sure where I went wrong. Thanks.

Custom dataset implementation

Hi,
Thank you so much for this awesome repo.
I am currently trying to use QPIC on a custom dataset whose annotation file is in COCO format. The convert_vcoco_annotations.py script converts V-COCO to HOIA format, but I am trying to convert a COCO-format annotation to HOIA by manually adding the interactions. In this process, there are a few things I do not understand about the HOIA format (see the illustration below):
what do {subject_id, category_id, object_id} mean in the "hoi_annotations" dict key, and
what does {category_id} mean in the "annotations" dict key?
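
For what it is worth, here is my current understanding of the HOIA layout (an illustration only; all values are hypothetical, and the field names follow what convert_vcoco_annotations.py writes):

example_image = {
    'file_name': '000001.jpg',
    'annotations': [
        # 'category_id' here is the object class of each box; the list
        # index is what 'hoi_annotation' entries refer to.
        {'bbox': [10, 20, 110, 220], 'category_id': 1},   # e.g. person
        {'bbox': [50, 60, 150, 260], 'category_id': 32},  # e.g. some object
    ],
    'hoi_annotation': [
        # 'subject_id'/'object_id' are indices into 'annotations' above;
        # 'category_id' here is the verb (interaction) class instead.
        {'subject_id': 0, 'object_id': 1, 'category_id': 4},
    ],
}

Is this interpretation correct?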

How to convert the pretrained DETR model for V-COCO style?

Hi! Sorry for bothering you again. When I tried to train a V-COCO model, I found that the object classifier of the pre-trained DETR model is only 81-way (including background), while in #6 you said that the V-COCO model has an 82-way object classifier (including background and a missing category id). What should I do to convert the classifier parameters from 81 classes to 82?
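
In case it helps, a minimal sketch of one way I imagine widening the head (my own guess, not an official script; the key names follow convert_parameters.py, the checkpoint path is assumed, and the new class row is zero-initialized and inserted just before the background row):

import torch

ckpt = torch.load('params/detr-r50-pre.pth', map_location='cpu')  # assumed path
w = ckpt['model']['obj_class_embed.weight']   # shape [81, hidden_dim]
b = ckpt['model']['obj_class_embed.bias']     # shape [81]

# Insert one zero-initialized class row before the background (last) row.
ckpt['model']['obj_class_embed.weight'] = torch.cat(
    [w[:-1], torch.zeros(1, w.shape[1]), w[-1:]], dim=0)
ckpt['model']['obj_class_embed.bias'] = torch.cat(
    [b[:-1], torch.zeros(1), b[-1:]], dim=0)

torch.save(ckpt, 'params/detr-r50-pre-vcoco-82.pth')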

Questions about the Loss for verbs.

Thanks for your contribution to HOI detection, which has inspired and enlightened me. While reading your work, however, a question came up that I cannot resolve. When you calculate losses for all the queries, you set the object labels of unmatched queries to self.num_obj_classes, which is reasonable. But you set the verb labels of unmatched queries to all zeros, which seems a bit unreasonable. What is the reason for that? Please enlighten me.

V-COCO Evaluation Error

Hello,

Many thanks for your great work.

I am trying to evaluate your pre-trained models on V-COCO.

  1. So, I first generate the official detection pickle via:

python generate_vcoco_official.py \
        --param_path ./params/qpic_resnet50_vcoco.pth \
        --save_path ./logs_vcoco/vcoco.pickle \
        --hoi_path ./data/v-coco
  2. Later, I use the official evaluation code via the following:
from vsrl_eval import VCOCOeval

vsrl_annot_file_s='./data/v-coco/data/vcoco/vcoco_val.json'
split_file_s='./data/v-coco/data/splits/vcoco_val.ids'

coco_file_s='./data/v-coco/data/instances_vcoco_all_2014.json'
vcocoeval = VCOCOeval(vsrl_annot_file_s, coco_file_s, split_file_s)

file_name= './logs_vcoco/vcoco.pickle'
vcocoeval._do_eval(file_name, ovr_thresh=0.5)

Please note that I adapted the latter script from the VSG-Net repo. I face the following error during evaluation:

zero-size array to reduction operation minimum which has no identity

on assert(np.amax(rec) <= 1) inside the _do_agent_eval() function.

Is this a common error, and how can it be mitigated?

Many thanks.

Distance AP

Hi, I have a question about the distance-wise AP.

[figure: distance-wise AP plot from the paper]

Does an HOI instance that participates in the AP calculation count towards distance = 0.1 when the distance between the human and object box centers is between 0.1 and 0.2?

Alternatively, I would fully appreciate it if you could provide the code for calculating the distance-wise AP (a sketch of my understanding follows below).
Thanks
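
For concreteness, this is my current guess at the binning (a sketch only; the exact definition in the paper may differ, e.g. in the normalization used):

import math

def center(box):
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def distance_bin(human_box, object_box, img_w, img_h, bin_size=0.1):
    # Distance between box centers, normalized by the image diagonal,
    # rounded down to the bin start: bin 0.1 covers [0.1, 0.2).
    hx, hy = center(human_box)
    ox, oy = center(object_box)
    d = math.hypot(hx - ox, hy - oy) / math.hypot(img_w, img_h)
    return math.floor(d / bin_size) * bin_size

# e.g. distance_bin([0, 0, 50, 100], [200, 50, 300, 150], 640, 480) -> 0.2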

How to pretrain detector for VCOCO?

Hi, thanks for your excellent work. I am confused about the pre-training for the V-COCO training.
In your README.md you state that pre-training has to be carried out for the V-COCO training, since some images of the V-COCO evaluation set are contained in DETR's training set, yet the given training command simply uses the pre-trained DETR model without any further pre-training.
Should I pre-train on a dataset that excludes the V-COCO evaluation images, or just follow your training command?

Results for person activities without an object present

@tamtamz Hi, thanks for sharing the code base, great work. I have one query: when I tested the model on scenes such as a person running on a beach with no other object present, there were no detections/activities in the output. Is there any way to get results like people walking, fighting, or waving without depending on an object being present in the scene?

Thanks in advance

The pre-trained parameters for V-COCO

Hello! I am a graduate student, and your work has inspired me. Thank you very much for your outstanding contributions. Could you send me the pre-trained parameters for V-COCO? I need them to complete my graduation thesis, and I do not have enough time or equipment to train DETR on the COCO dataset myself. Thank you again! E-mail: [email protected].

Why 82 classes for V-COCO dataset?

Sorry to interrupt. I noticed that in convert_parameters.py, one more output neuron is added to the object-class output when the dataset is V-COCO, i.e. ps['model']['obj_class_embed.weight'] outputs 82 categories instead of the 81 used for HICO-DET.

I wonder what the extra neuron is used for. Thank you!

Some useless steps in focal loss

[screenshot: focal loss implementation]
Thanks for your interesting work! The neg_weights term is only meaningful in heatmap-based problems, e.g. Gaussian-based detection. Removing those steps may avoid some misunderstanding.
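
To make the point concrete: with hard 0/1 targets there is no Gaussian heatmap, the neg_weights factor is constant, and the computation reduces to the plain binary focal loss below (a sketch with an illustrative gamma, not the repo's exact code):

import torch

def binary_focal_loss(pred, gt, gamma=2.0, eps=1e-6):
    # pred: sigmoid probabilities; gt: 0/1 targets of the same shape.
    pred = pred.clamp(eps, 1 - eps)
    pos_loss = -((1 - pred) ** gamma) * pred.log() * gt
    neg_loss = -(pred ** gamma) * (1 - pred).log() * (1 - gt)
    return (pos_loss + neg_loss).sum()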

'correct_mat'

'HICODetection' object has no attribute 'correct_mat'

Is your architecture end-to-end HOI?

Hello,
thanks for your implementation.
Is your architecture end-to-end HOI detection? Does that mean it does not require any pre-extracted features or a separate feature-extraction step?
For example, could I feed my own features into your network?

When generating the .pickle file for V-COCO, the officially provided test method was used and failed. Please give us some comments.

loading annotations into memory...
Done (t=1.75s)
creating index...
index created!
loading vcoco annotations...
Traceback (most recent call last):
  File "runvcoco.py", line 15, in <module>
    vcocoeval._do_eval('/home1/quan107552101247/qpic/vcocof.pickle', ovr_thresh=0.5)
  File "/home1/quan107552101247/qpic/vsrl_eval.py", line 192, in _do_eval
    self._do_agent_eval(vcocodb, detections_file, ovr_thresh=ovr_thresh)
  File "/home1/quan107552101247/qpic/vsrl_eval.py", line 417, in _do_agent_eval
    assert(np.amax(rec) <= 1)
  File "<__array_function__ internals>", line 6, in amax
  File "/home1/quan107552101247/.local/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 2621, in amax
    keepdims=keepdims, initial=initial, where=where)
  File "/home1/quan107552101247/.local/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 90, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
ValueError: zero-size array to reduction operation maximum which has no identity
