Comments (21)

jeffreyyihuang avatar jeffreyyihuang commented on July 20, 2024

Sure, but I am currently busy with some projects. I will probably update this repo around Dec 20.

Jeffrey

from two-stream-action-recognition.

cwzat avatar cwzat commented on July 20, 2024

cwzat avatar cwzat commented on July 20, 2024

jeffreyyihuang avatar jeffreyyihuang commented on July 20, 2024

Hi,
This method is called cross-modality pretraining, which was proposed in "Temporal Segment Networks: Towards Good Practices for Deep Action Recognition".
The procedure is to average the weight values across the RGB channels and replicate this average once per input channel of the motion stream (20 in this case).

Jeffrey
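
A minimal sketch of the averaging-and-replicating step described above, using NumPy. The function name `cross_modality_init`, the `channel_axis` parameter, and the example weight shape are assumptions for illustration and are not taken from the repo. The default assumes a PyTorch-style kernel layout `(out_channels, in_channels, kh, kw)`.

```python
import numpy as np

def cross_modality_init(rgb_weight, flow_channels=20, channel_axis=1):
    """Average RGB kernel weights over the input-channel axis and
    replicate the average `flow_channels` times along that axis.

    rgb_weight: 4-D kernel whose `channel_axis` has size 3 (R, G, B).
    """
    if rgb_weight.ndim != 4 or rgb_weight.shape[channel_axis] != 3:
        raise ValueError("expected a 4-D kernel with 3 input channels")
    # Mean over R, G, B; keepdims leaves a size-1 axis to repeat along.
    mean = rgb_weight.mean(axis=channel_axis, keepdims=True)
    return np.repeat(mean, flow_channels, axis=channel_axis)

# Hypothetical first-layer kernel: 64 filters, 3 input channels, 7x7.
rng = np.random.default_rng(0)
rgb_w = rng.standard_normal((64, 3, 7, 7)).astype(np.float32)
flow_w = cross_modality_init(rgb_w)
print(flow_w.shape)  # (64, 20, 7, 7)
```

Keras stores convolution kernels as `(kh, kw, in_channels, out_channels)`, so with that layout the same function would be called with `channel_axis=2`.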

cwzat avatar cwzat commented on July 20, 2024

Thank you, I got it. Now I am trying this method with Keras and have run into some trouble. Are you familiar with it? If so, I would appreciate your help.

jeffreyyihuang avatar jeffreyyihuang commented on July 20, 2024

Hi, I have tried the two-stream network in Keras before, but I am not very familiar with it. Could you post your issue? There might be something I can help with.

cwzat avatar cwzat commented on July 20, 2024

I use model.get_config() to implement cross-modality pretraining, with an Inception-ResNet-v2 model. The optimizer is Adam (default parameters) or SGD(lr=1e-2, 0.9). The optical flow frames are stacked as 10 x-frames and 10 y-frames, but the accuracy is very low (65%). I would like to know more details about your model, or could you give me some advice?

cwzat avatar cwzat commented on July 20, 2024

If you have a Keras pretrained optical flow model, could you publish it? Thank you!

jeffreyyihuang avatar jeffreyyihuang commented on July 20, 2024

Sorry, I don't have a Keras pretrained optical flow model.

I think the low accuracy might be caused by the sampling method in your training stage; I have some related experience with this in PyTorch.
Could you provide some details about how you sample your training data for each batch?

Jeffrey

roystonrodrigues avatar roystonrodrigues commented on July 20, 2024

Could you please share your pretrained models? This would be helpful for running your code in the testing phase.

cwzat avatar cwzat commented on July 20, 2024

@jeffreyhuang1 There are 8631 video samples in the training set. For each batch, I randomly choose 32 video samples from it. For each video, I randomly choose 10 x-frames and 10 y-frames and stack them, so the result has shape (32, 229, 229, 20). On the third axis (the channel axis of size 20), the first ten channels are the 10 x-frames and the last ten are the 10 y-frames. All the frames are consecutive.

jeffreyyihuang avatar jeffreyyihuang commented on July 20, 2024

@roystonrodrigues
Hi, I just shared a new version of my pretrained model and code today. You can test it, and feel free to correct my mistakes.

Jeffrey

jeffreyyihuang avatar jeffreyyihuang commented on July 20, 2024

@cwzat
As far as I remember, according to the two-stream paper, the input of the motion stream is a stack of 10 consecutive optical flow frames. In my opinion, your problem may be in the sampling stage: you randomly choose 10 x-frames and 10 y-frames rather than choosing consecutive x and y optical flow frames.

Jeffrey
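
To illustrate the difference between picking frames at random and picking a consecutive window, here is a small sketch that draws one random start index and returns the following consecutive flow-frame indices. The function name `sample_flow_indices` and the frame count in the example are hypothetical; the repo's actual sampling code may differ.

```python
import random

def sample_flow_indices(num_flow_frames, stack_len=10, rng=random):
    """Return `stack_len` consecutive frame indices with a random start."""
    if num_flow_frames < stack_len:
        raise ValueError("video has fewer flow frames than the stack length")
    # The last valid start is num_flow_frames - stack_len (inclusive).
    start = rng.randrange(num_flow_frames - stack_len + 1)
    return list(range(start, start + stack_len))

# Hypothetical video with 50 optical flow frames.
idx = sample_flow_indices(50, rng=random.Random(0))
print(idx)
```

The same indices would then be used to load both the x and the y flow frames, so that each x-frame is paired with its matching y-frame.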

cwzat avatar cwzat commented on July 20, 2024

@jeffreyhuang1 I already choose them consecutively, and the accuracy is still low. Could you give me some other advice?

jeffreyyihuang avatar jeffreyyihuang commented on July 20, 2024

@cwzat

Oops, sorry, my bad. I missed some information in your message. I checked the implementation details in the two-stream paper and found this:

[image: screenshot 98]

Therefore, on your third axis, the order of your data should be [x0, y0, x1, y1, x2, y2, ...].
Maybe you can try this!

Sorry again for misreading your message.
Jeffrey
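
The interleaved channel order suggested above can be sketched with NumPy as follows. The function name `stack_flow_interleaved` and the small 4x4 frame size are assumptions for illustration; the output is channels-last, matching the (batch, height, width, 20) layout mentioned earlier in the thread.

```python
import numpy as np

def stack_flow_interleaved(flow_x, flow_y):
    """Stack L x-frames and L y-frames of shape (L, H, W) into an
    (H, W, 2L) array ordered [x0, y0, x1, y1, ...] on the last axis."""
    if flow_x.shape != flow_y.shape:
        raise ValueError("flow_x and flow_y must have the same shape")
    pairs = np.stack([flow_x, flow_y], axis=1)          # (L, 2, H, W)
    interleaved = pairs.reshape(-1, *flow_x.shape[1:])  # (2L, H, W)
    return np.moveaxis(interleaved, 0, -1)              # (H, W, 2L)

# Dummy data: x-frame i is filled with i, y-frame i with 100 + i.
flow_x = np.arange(10, dtype=np.float32)[:, None, None] * np.ones((4, 4), dtype=np.float32)
flow_y = flow_x + 100
stacked = stack_flow_interleaved(flow_x, flow_y)
print(stacked.shape)          # (4, 4, 20)
print(stacked[0, 0, :4])      # [  0. 100.   1. 101.]
```

By contrast, `np.concatenate([flow_x, flow_y], axis=0)` would give the blockwise order [x0, ..., x9, y0, ..., y9] described in the earlier comment.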

cwzat avatar cwzat commented on July 20, 2024

@jeffreyhuang1 It is okay! Thank you very much for your help! I am very glad to solve the problem with your help! I will try it right now!

jeffreyyihuang avatar jeffreyyihuang commented on July 20, 2024

@cwzat, I look forward to hearing your good news soon XD

Jeffrey

cwzat avatar cwzat commented on July 20, 2024

@jeffreyhuang1 I have another question: how do you choose the optimizer and its parameters?

jeffreyyihuang avatar jeffreyyihuang commented on July 20, 2024

@cwzat, basically, I just follow the settings in the paper, which uses SGD as the optimizer.
For the batch size and learning rate, I increase the learning rate according to the difference between my batch size and the batch size given in the paper.
Beyond that, you can tune the parameters to improve the model's performance.

Jeffrey
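
The comment above does not say exactly how the learning rate is adjusted. One common choice, shown here only as an assumption, is to scale it linearly with the ratio of batch sizes. The function name and all numbers below are illustrative placeholders, not the settings used in the repo.

```python
def scale_learning_rate(base_lr, base_batch_size, batch_size):
    """Linear scaling (an assumed rule): keep lr / batch_size constant."""
    if base_batch_size <= 0 or batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * batch_size / base_batch_size

# Placeholder values: a reference setup of lr=0.01 at batch size 256,
# adapted to a batch size of 512.
lr = scale_learning_rate(base_lr=0.01, base_batch_size=256, batch_size=512)
print(lr)  # 0.02
```

The resulting value is only a starting point; as the comment notes, further tuning may still be needed.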

cwzat avatar cwzat commented on July 20, 2024

@jeffreyhuang1 Do you sample the test set the same way as the training set? And are you training only the top layers or all the layers?

jeffreyyihuang avatar jeffreyyihuang commented on July 20, 2024

@cwzat Yeah, the stacked optical flow method is the same, and I am training all of the layers in ResNet-101.

Jeffrey
