Hi! This is probably not an issue but more a cry for help. I have tr

Help with using asynet for object detection about rpg_asynet HOT 5 CLOSED

KristofferFogh04 commented on July 24, 2024

Help with using asynet for object detection

from rpg_asynet.

Comments (5)

MessikommerNico commented on July 24, 2024

Hi Kristoffer

The layer list looks correct to me. Maybe you can check whether one of the layers is missed by the setWeightsEqual function.
Unfortunately, we only have scripts for the FLOP evaluation of the object detection network and a unit test script similar to sparse_VGG_test.py. You should therefore be able to find the error by adjusting the layer list in sparse_VGG_test.py and going through the layers one by one until you find a difference in output.

Best regards,
Nico
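
The layer-by-layer comparison suggested above could be sketched roughly as follows. This is only an illustration in plain Python: the helper name first_mismatch and the toy per-layer outputs are assumptions made for the example, not part of rpg_asynet. It assumes the intermediate output of each layer has already been collected for both models and flattened to a list of floats.

```python
import math


def first_mismatch(sync_outputs, async_outputs, tol=1e-5):
    """Return the index of the first layer whose outputs differ, or None.

    Both arguments are lists with one flat list of floats per layer
    (hypothetical format chosen for this sketch).
    """
    for idx, (ref, out) in enumerate(zip(sync_outputs, async_outputs)):
        same_length = len(ref) == len(out)
        if not same_length or any(
            not math.isclose(a, b, abs_tol=tol) for a, b in zip(ref, out)
        ):
            return idx
    return None


# Toy data: the two models agree on layer 0 and diverge from layer 1 onwards.
sync_outputs = [[0.5, 1.0], [0.0, 2.0], [2.0]]
async_outputs = [[0.5, 1.0], [-2.0, 2.0], [0.0]]
print(first_mismatch(sync_outputs, async_outputs))
```

With PyTorch tensors, the element-wise check would typically be replaced by torch.allclose with a suitable tolerance.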

KristofferFogh04 commented on July 24, 2024

Hi Nico!

Thanks for your reply. It is very valuable to me. After some debugging, I managed to find the issue. In facebook_sparse_object_det.py, a ReLU activation function is applied between the two fully connected layers:

x = self.inputLayer(x)
x = self.sparseModel(x)
x = x.view(-1, self.linear_input_features)
x = self.linear_1(x)
x = torch.relu(x) #  <- this one
x = self.linear_2(x)
x = x.view([-1] + self.cnn_spatial_output_size + [(self.nr_classes + 5*self.nr_box)])

This does not seem to be supported in the asynchronous model, so my rather primitive solution was to check whether the layer is the second to last and, if so, apply the activation function:

elif self.layer_list[j][0] == 'ClassicFC':
    if x_asyn[1].ndim == 3:
        fc_output = layer(x_asyn[1].permute(2, 0, 1).flatten().unsqueeze(0))
    else:
        fc_output = layer(x_asyn[1].unsqueeze(0))
    if j == (len(self.layer_list) - 2):
        fc_output = torch.relu(fc_output) # <- Added this
    x_asyn = [None] * 5
    x_asyn[1] = fc_output.squeeze(0)

If there is a more elegant solution, please feel free to suggest it.
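
One possible alternative to the position check, offered only as a sketch: carry the activation in the layer list entry itself, so the forward loop does not depend on the index j. The second field of each entry, the ACTIVATIONS table, and run_fc_layers are hypothetical names introduced for this example and are not part of rpg_asynet. Plain Python lists and lambdas stand in for tensors and fully connected layers.

```python
def relu(values):
    """Element-wise ReLU on a flat list of floats."""
    return [max(0.0, v) for v in values]


# Hypothetical lookup table: None means "no activation after this layer".
ACTIVATIONS = {None: lambda values: values, "relu": relu}


def run_fc_layers(layer_list, layers, x):
    """Apply each fully connected layer, then the activation named in its entry."""
    for (name, activation), layer in zip(layer_list, layers):
        if name == "ClassicFC":
            x = ACTIVATIONS[activation](layer(x))
    return x


# Hypothetical layer list: the entry name is at index 0, as in the snippet above;
# the activation field at index 1 is an assumption of this sketch.
layer_list = [("ClassicFC", "relu"), ("ClassicFC", None)]

# Toy stand-ins for the two linear layers.
layers = [
    lambda v: [v[0] - v[1], v[1] - v[0]],
    lambda v: [v[0] + v[1]],
]

out = run_fc_layers(layer_list, layers, [1.0, 3.0])
print(out)
```

The trade-off is that the layer list format changes, so any code that builds or reads the list would need to handle the extra field.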

KristofferFogh04 commented on July 24, 2024

By the way, now that I have it working, I have a question regarding speed. Should the asynchronous model be faster than the synchronous one? Below is an example of the asynchronous model receiving 100 new events in a sliding-window application and computing the output. As you can see, it takes 1.8 seconds. In comparison, the sparse synchronous model takes 0.029 seconds for the full sliding window containing 25,000 events. Am I doing something completely wrong, or is this the way it's supposed to be?

Layer Name: C                   Time: 7.401
Layer Name: BNRelu              Time: 1.808
Layer Name: C                   Time: 3.646
Layer Name: BNRelu              Time: 4.601
Layer Name: MP                  Time: 13.092
Layer Name: C                   Time: 95.582
Layer Name: BNRelu              Time: 1.816
Layer Name: C                   Time: 254.471
Layer Name: BNRelu              Time: 2.410
Layer Name: MP                  Time: 173.119
Layer Name: C                   Time: 210.194
Layer Name: BNRelu              Time: 1.240
Layer Name: C                   Time: 142.097
Layer Name: BNRelu              Time: 1.270
Layer Name: MP                  Time: 148.900
Layer Name: C                   Time: 172.112
Layer Name: BNRelu              Time: 1.061
Layer Name: C                   Time: 145.682
Layer Name: BNRelu              Time: 1.055
Layer Name: MP                  Time: 83.841
Layer Name: C                   Time: 164.930
Layer Name: BNRelu              Time: 1.625
Layer Name: C                   Time: 155.170
Layer Name: BNRelu              Time: 1.094
Layer Name: ClassicC            Time: 3.201
Layer Name: ClassicBNRelu       Time: 0.212
Layer Name: ClassicFC           Time: 8.685
Layer Name: ClassicFC           Time: 1.469
Elapsed time: 1.810353 seconds.
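
The per-layer times above sum to about 1802, which matches the reported elapsed time of 1.81 seconds if they are in milliseconds (an inference; the post does not state the unit). The sparse convolution layers (C) account for roughly 1351 of that and the max-pooling layers (MP) for roughly 419. A small sketch for aggregating such a log by layer type is shown below; the function name time_per_layer_type is an assumption for illustration, and only an excerpt of the log is embedded.

```python
import re
from collections import defaultdict

# Excerpt of the log above; paste the full output to reproduce the totals.
LOG_EXCERPT = """\
Layer Name: C                   Time: 7.401
Layer Name: BNRelu              Time: 1.808
Layer Name: C                   Time: 3.646
Layer Name: MP                  Time: 13.092
"""

# Matches lines of the form "Layer Name: <name>   Time: <number>".
LINE_PATTERN = re.compile(r"Layer Name:\s*(\S+)\s+Time:\s*([\d.]+)")


def time_per_layer_type(log_text):
    """Sum the reported time for each layer type found in the log text."""
    totals = defaultdict(float)
    for match in LINE_PATTERN.finditer(log_text):
        totals[match.group(1)] += float(match.group(2))
    return dict(totals)


totals = time_per_layer_type(LOG_EXCERPT)
# Print layer types from most to least expensive.
for name, total in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{name:<10}{total:10.3f}")
```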

MessikommerNico commented on July 24, 2024

Depending on the distribution of the active sites, our method may well need that much time. With our current implementation, we report 80.4 ms for processing 1 event with a sliding window of 25,000 events. However, given the lower number of required FLOPs, I think there is considerable room for improvement through specialized hardware and optimization.

MessikommerNico commented on July 24, 2024

By the way, thanks for reporting your solution for your first issue!
