slowfast_feature_extractor's Introduction

Hello Visitor

Tridiv here through the realm of 0s and 1s.

Who am I?

  • I am a Computer Vision / Research Engineer based in Berlin.
  • Till recently, I worked at a couple of startups in the automotive and workplace safety domain.

What am I doing now?

  • I am now looking for a new role either as a Software Engineer in the Computer Vision / Machine Learning domain or a funded PhD position in CV.
  • I am (re-)learning C++ and Rust, not to mention how to create multiple sources of income.

What do I like?

Asking the question "Why?" I like knowing how things work as well as building them.

What do I know?

I am proficient in Python and PyTorch, and I know bits and pieces of other languages such as C++, SQL and Rust.

How to reach me?

tridivrajbhattacharyya

slowfast_feature_extractor's People

Contributors

tridivb

slowfast_feature_extractor's Issues

'VideoSet' object has no attribute 'sample_width'

Hi, I ran into a problem while following the steps:
Traceback (most recent call last):
  File "run_net.py", line 131, in <module>
    main()
  File "run_net.py", line 127, in main
    test(cfg=cfg)
  File "/home/dutir/zengjingjie/slowfast_feature_extractor/test_net.py", line 169, in test
    cfg, path_to_vid, vid_id, read_vid_file=cfg.DATA.READ_VID_FILE
  File "/home/dutir/zengjingjie/slowfast_feature_extractor/datasets/videoset.py", line 52, in __init__
    self.frames = self._get_frames()
  File "/home/dutir/zengjingjie/slowfast_feature_extractor/datasets/videoset.py", line 90, in _get_frames
    (self.sample_width, self.sample_height),
AttributeError: 'VideoSet' object has no attribute 'sample_width'

Inference halts after first video

Describe the bug
Inference halts after the first video. When I check GPU utilization it shows 0%, so the program seems to stop after the first test.

Desktop (please complete the following information):

  • OS: centos7
  • Python Version 3.7
  • PyTorch version 1.4

Can the repo support ROI feature extraction?

Thanks for your work! The repo currently supports extracting features from a video or a set of frames.
It would be even better if it also supported ROI feature extraction.

AttributeError: CROP_SIZE

Hello, I have a question. When I run 'python run_net.py --cfg ./configs/SLOWFAST_8x8_R50.yaml' to extract features from my videos, I get an AttributeError: CROP_SIZE that confuses me. Thanks!

Traceback (most recent call last):
  File "run_net.py", line 25, in <module>
    main()
  File "run_net.py", line 21, in main
    launch_job(cfg=cfg, init_method=args.init_method, func=test)
  File "/home/XX/slowfast/slowfast/utils/misc.py", line 311, in launch_job
    func(cfg=cfg)
  File "/home/XX/slowfast_feature_extractor/test_net.py", line 94, in test
    model = build_model(cfg)
  File "/home/XX/slowfast_feature_extractor/models/build.py", line 61, in build_model
    cfg.DATA.CROP_SIZE // 32 // pool_size[0][1],
  File "/home/XX/anaconda3/envs/slowfast/lib/python3.8/site-packages/yacs/config.py", line 141, in __getattr__
    raise AttributeError(name)
AttributeError: CROP_SIZE
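
The traceback shows the config lookup itself raising AttributeError, which means the loaded config has no DATA.CROP_SIZE key. A configuration posted in another issue on this page defines TRAIN_CROP_SIZE and TEST_CROP_SIZE instead. The sketch below is one possible defensive lookup, not the repository's own fix. It uses types.SimpleNamespace as a stand-in for the real config node, the helper name resolve_crop_size is hypothetical, and treating TEST_CROP_SIZE as the right substitute is an assumption.

```python
from types import SimpleNamespace


def resolve_crop_size(data_cfg):
    """Return the first crop-size key that is present on the DATA node.

    Hypothetical helper: prefers CROP_SIZE and falls back to
    TEST_CROP_SIZE (an assumed substitute for inference runs).
    """
    for key in ("CROP_SIZE", "TEST_CROP_SIZE"):
        value = getattr(data_cfg, key, None)
        if value is not None:
            return value
    raise AttributeError("neither CROP_SIZE nor TEST_CROP_SIZE is set")


# Stand-in for cfg.DATA with only the train/test keys defined.
data_cfg = SimpleNamespace(TRAIN_CROP_SIZE=224, TEST_CROP_SIZE=256)
crop_size = resolve_crop_size(data_cfg)
print(crop_size)
```

If the installed SlowFast version and this extractor disagree on the key name, aligning the two versions is likely the cleaner fix; the fallback above only hides the mismatch.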

Unable to extract feature vectors other than frame by frame

Describe the bug
Feature extraction is performed frame by frame instead of in clips. Regardless of the values I use for NUM_FRAMES and SAMPLING_RATE, I always get as many feature vectors as the video has frames.

To Reproduce
Execute using the following configuration:

TRAIN:
  ENABLE: False
  DATASET: epickitchens
  BATCH_SIZE: 50
  EVAL_PERIOD: 2
  CHECKPOINT_PERIOD: 1
  CHECKPOINT_FILE_PATH: "SlowFast.pyth"
  CHECKPOINT_TYPE: pytorch
  AUTO_RESUME: True
DATA:
  NUM_FRAMES: 32
  SAMPLING_RATE: 2
  PATH_TO_DATA_DIR: "path to videos"
  TRAIN_JITTER_SCALES: [256, 320]
  TRAIN_CROP_SIZE: 224
  TEST_CROP_SIZE: 256
  READ_VID_FILE: False
  IMG_FILE_EXT: ".jpg"
  IN_FPS: 30
  OUT_FPS: 30
  TARGET_FPS: 30
  IMG_FILE_FORMAT: "frame_{:010d}.jpg"
  INPUT_CHANNEL_NUM: [3, 3]
  VID_FILE_EXT: ""
SLOWFAST:
  ALPHA: 8
  BETA_INV: 8
  FUSION_CONV_CHANNEL_RATIO: 2
  FUSION_KERNEL_SZ: 7
RESNET:
  ZERO_INIT_FINAL_BN: True
  WIDTH_PER_GROUP: 64
  NUM_GROUPS: 1
  DEPTH: 50
  TRANS_FUNC: bottleneck_transform
  STRIDE_1X1: False
  NUM_BLOCK_TEMP_KERNEL: [[3, 3], [4, 4], [6, 6], [3, 3]]
  SPATIAL_STRIDES: [[1, 1], [2, 2], [2, 2], [2, 2]]
  SPATIAL_DILATIONS: [[1, 1], [1, 1], [1, 1], [1, 1]]
NONLOCAL:
  LOCATION: [[[], []], [[], []], [[], []], [[], []]]
  GROUP: [[1, 1], [1, 1], [1, 1], [1, 1]]
  INSTANTIATION: dot_product
BN:
  USE_PRECISE_STATS: True
  NUM_BATCHES_PRECISE: 200
  WEIGHT_DECAY: 0.0
SOLVER:
  BASE_LR: 0.01
  LR_POLICY: steps_with_relative_lrs
  STEPS: [0, 20, 25]
  LRS: [1, 0.1, 0.01]
  MAX_EPOCH: 30
  MOMENTUM: 0.9
  WEIGHT_DECAY: 1e-4
  WARMUP_EPOCHS: 1.0
  WARMUP_START_LR: 0.001
  OPTIMIZING_METHOD: sgd
MODEL:
  NUM_CLASSES: 97
  ARCH: slowfast
  MODEL_NAME: SlowFast
  LOSS_FUNC: cross_entropy
  DROPOUT_RATE: 0.5
TEST:
  ENABLE: True
  DATASET: epickitchens
  BATCH_SIZE: 1
  NUM_SPATIAL_CROPS: 1
  CHECKPOINT_FILE_PATH: "SlowFast.pyth"
  CHECKPOINT_TYPE: pytorch
DATA_LOADER:
  NUM_WORKERS: 8
  PIN_MEMORY: True
NUM_GPUS: 1
NUM_SHARDS: 1
RNG_SEED: 0
OUTPUT_DIR: "./bsh-features"

Expected behavior
Instead of outputting a feature vector for each frame, I should be able to process the video divided into clips.

Desktop:

  • OS: [e.g. Ubuntu 22.04 LTS]
  • Python Version: 3.10.4
  • PyTorch version: 1.12.1+cu113

Video frame_list

start = int(index - self.step_size * self.out_size / 2)
end = int(index + self.step_size * self.out_size / 2)

The docstring says that index (int) is the video index provided by the PyTorch sampler. What does this equation mean?
How can a video index be combined with 'self.step_size * self.out_size / 2'?
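
One way to read the two quoted lines is that index is treated as a frame position and the window of step_size * out_size frames is centred on it. The sketch below works through that reading. It is an interpretation, not the repository's code: the function name frame_window and the num_frames parameter are hypothetical, and clamping out-of-range positions to the first or last frame is an assumed boundary policy.

```python
def frame_window(index, step_size, out_size, num_frames):
    """Return out_size frame positions centred on `index`.

    start/end follow the two lines quoted in the issue; the clamping
    to [0, num_frames - 1] is an assumption made for this sketch.
    """
    start = int(index - step_size * out_size / 2)
    end = int(index + step_size * out_size / 2)
    # Take every step_size-th frame between start and end.
    positions = range(start, end, step_size)
    return [min(max(p, 0), num_frames - 1) for p in positions]


# With SAMPLING_RATE 2 and NUM_FRAMES 32, the window spans 64 frames.
window = frame_window(index=100, step_size=2, out_size=32, num_frames=5000)
print(window[0], window[-1], len(window))
```

Under this reading, half of the window lies before index and half after it, which is why the term step_size * out_size / 2 is both subtracted from and added to the index.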

Investigate incorrect feature output

Describe the bug
User @ZhengLeon claims no output is being generated for a video.

To Reproduce
Details not provided

Screenshots/Code Output
Loading Video List ...
Done
1 videos to be processed...
0. Processing /home/work/slowfast_feature_extractor/video/0a244c71991f386b161bcd787c7da607...

Code integrity

When I ran the program on its own, some packages were missing. So I copied this code into my SlowFast source tree, and after I modified it, the configuration file raised an error and I got the following:
AssertionError: An object named 'SlowFast' was already registered in 'MODEL' registry!
Is there another way to run the program? Could you also tell me what the final feature format looks like?
Looking forward to your reply. Thanks!
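
The assertion message suggests that two classes named SlowFast were registered in the same model registry, which can happen when the extractor's model file is copied into a SlowFast source tree that already defines one. The sketch below reproduces the pattern with a minimal stand-in registry. It is not the registry implementation SlowFast uses, and the name SlowFastFeat is hypothetical.

```python
class Registry:
    """Minimal name-to-object registry that rejects duplicate names."""

    def __init__(self, name):
        self._name = name
        self._obj_map = {}

    def register(self, obj):
        name = obj.__name__
        assert name not in self._obj_map, (
            f"An object named '{name}' was already registered "
            f"in '{self._name}' registry!"
        )
        self._obj_map[name] = obj
        return obj


MODEL_REGISTRY = Registry("MODEL")


@MODEL_REGISTRY.register
class SlowFast:  # stands in for the model shipped with SlowFast
    pass


try:
    # A second class with the same name triggers the assertion.
    @MODEL_REGISTRY.register
    class SlowFast:  # noqa: F811
        pass
except AssertionError as err:
    message = str(err)
    print(message)


# A distinct (hypothetical) class name registers without conflict.
@MODEL_REGISTRY.register
class SlowFastFeat:
    pass
```

Keeping the extractor outside the SlowFast source tree, or giving its model class a distinct name, would avoid the collision in this simplified picture.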

About the window of extractor and numbers of features

Thanks for your great work! I read your code, but I am not sure I understand it correctly. For every frame in the video, the extractor generates a 1-D feature vector; the window length equals step_size * out_size, where step_size is the drop ratio of the window. I also have a question about what 8x8 means in the pre-trained model. Previously I thought a feature should be output for every 8*8 frames, but now it seems to refer to the size of the window? Looking forward to your reply!

Extraction stop

I can extract video features,
but it always stops after a few extractions.
Why does this occur and how do I fix it? ㅜㅜ

TypeError: _construct() missing 10 required positional arguments: 'dim_in', 'dim_out', 'stride', 'dim_inner', 'num_groups', 'trans_func_name', 'stride_1x1', 'inplace_relu', 'nonlocal_inds', and 'instantiation'

When I configured my code as you described, I got a TypeError:

Traceback (most recent call last):
  File "run_net.py", line 131, in <module>
    main()
  File "run_net.py", line 127, in main
    test(cfg=cfg)
  File "/media/acodec/media/code/slowfast_feature_extractor/test_net.py", line 90, in test
    model = model_builder.build_model(cfg)
  File "/media/acodec/media/code/slowfast_feature_extractor/models/model_builder.py", line 34, in build_model
    model = _MODEL_TYPES[cfg.MODEL.ARCH](cfg)
  File "/media/acodec/media/code/slowfast_feature_extractor/models/video_model_builder.py", line 146, in __init__
    self._construct_network(cfg)
  File "/media/acodec/media/code/slowfast_feature_extractor/models/video_model_builder.py", line 211, in _construct_network
    trans_func_name=cfg.RESNET.TRANS_FUNC,
  File "/media/acodec/media/code/slowfast/slowfast/models/resnet_helper.py", line 390, in __init__
    super(ResStage, self).__init__()
  File "/home/acodec/anaconda2/envs/python3_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 72, in __init__
    self._construct()
TypeError: _construct() missing 10 required positional arguments: 'dim_in', 'dim_out', 'stride', 'dim_inner', 'num_groups', 'trans_func_name', 'stride_1x1', 'inplace_relu', 'nonlocal_inds', and 'instantiation'

What is wrong? How do I get rid of this problem?

AttributeError: 'Namespace' object has no attribute 'cfg_file'

Hi, I got this error when running the following command:

python run_net.py --cfg ./configs/SLOWFAST_8x8_R50.yaml

Could you please check how to get past this error?

Traceback (most recent call last):
  File "run_net.py", line 25, in <module>
    main()
  File "run_net.py", line 17, in main
    cfg = load_config(args)
  File "/home/ddf/PycharmProjects/sfex/slowfast/slowfast_feature_extractor/configs/custom_config.py", line 57, in load_config
    if args.cfg_file is not None:
AttributeError: 'Namespace' object has no attribute 'cfg_file'
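
The error says the parsed arguments carry no cfg_file attribute, even though --cfg was passed on the command line. A plausible cause, stated here as an assumption, is that the argument parser in use stores the --cfg value under a different attribute name than load_config expects. The sketch below reproduces that mismatch with argparse and shows a tolerant lookup. The attribute name cfg_files and the helper first_cfg_path are hypothetical.

```python
import argparse


def parse_args(argv):
    """Parser whose --cfg option is stored under `cfg_files` (assumed name)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--cfg", dest="cfg_files", nargs="+", default=None)
    return parser.parse_args(argv)


def first_cfg_path(args):
    """Return a config path whether the parser used cfg_file or cfg_files."""
    cfg_file = getattr(args, "cfg_file", None)
    if cfg_file is not None:
        return cfg_file
    cfg_files = getattr(args, "cfg_files", None)
    return cfg_files[0] if cfg_files else None


args = parse_args(["--cfg", "./configs/SLOWFAST_8x8_R50.yaml"])
print(hasattr(args, "cfg_file"))  # the attribute load_config looks for
print(first_cfg_path(args))
```

Printing vars(args) just before the failing line would confirm which attribute name the installed parser actually produces.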

Does it support distributed inference?

Describe the bug
When I use 6 GPUs for inference on a video, it is no faster than with 1 GPU. Does the repo support distributed inference?

Can I extract video feature based on 1FPS video frames?

For some reason, I can't get the raw videos for feature extraction; I only have frames sampled at 1 FPS. Can I extract video features from these 1 FPS frames?

RGB features only?

Hi,
Thanks for your great work!
I wonder whether flow features can be extracted from videos with your code?

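A question about configuration (regarding the sampling rate)

I downloaded an 8x8 SlowFast model. My understanding is that one centre frame is picked every 8 frames, and feature extraction is run once on the 8 frames around that centre frame, so the resulting shape[0] should be total frames // 8.
I ran a 5000-frame, 25 fps video with IN_FPS=OUT_FPS=25 and NUM_FRAMES=SAMPLING_RATE=8, and the final feature shape is (5000, 2304). shape[0] is not 5000//8. Does this mean that the sample rate of taking one sample every 8 frames is not applied?
Thanks!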
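
The reported shape (5000, 2304) for a 5000-frame video is consistent with one feature vector per frame rather than one per 8-frame stride. If that is what the extractor produces, the expected 5000 // 8 rows can be recovered afterwards by keeping every 8th row. The sketch below illustrates only that post-processing step, using plain lists and a feature width of 4 as a stand-in for 2304. The helper name subsample_rows is hypothetical.

```python
def subsample_rows(features, stride):
    """Keep every `stride`-th row of a per-frame feature matrix."""
    return features[::stride]


total_frames = 5000
feature_dim = 4  # stand-in for the 2304-dimensional vectors in the issue

# One placeholder feature vector per frame, mimicking the reported output.
features = [[0.0] * feature_dim for _ in range(total_frames)]

kept = subsample_rows(features, stride=8)
print(len(features), len(kept))
```

This keeps the rows for frames 0, 8, 16 and so on; whether those are the centre frames the asker had in mind depends on how the extractor positions each window.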
