sense-x / x-temporal
A general video understanding codebase from SenseTime X-Lab
License: MIT License
Your project is very well written, but it seems it has not been updated since TSM. mmaction exists, but its underlying mmcv layer feels very unfriendly.
Running ./easy_setup.sh failed on my Mac. The error is:
Traceback (most recent call last):
  File "setup.py", line 28, in <module>
    torch.utils.cpp_extension.CUDA_HOME = _find_cuda_home()
  File "setup.py", line 20, in _find_cuda_home
    nvcc = subprocess.check_output(['which', 'nvcc']).decode().rstrip('\r\n')
  File "/opt/anaconda3/lib/python3.7/subprocess.py", line 395, in check_output
    **kwargs).stdout
  File "/opt/anaconda3/lib/python3.7/subprocess.py", line 487, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['which', 'nvcc']' returned non-zero exit status 1.
It seems that CUDA is required, but my Mac has no CUDA-capable GPU.
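A minimal sketch of one way to get past the failing lookup, not the repository's actual fix: return None when nvcc is absent instead of letting `subprocess.check_output` raise. The function name echoes `_find_cuda_home` from the traceback; the `CUDA_HOME`/`CUDA_PATH` environment fallback and the directory logic are assumptions. This only avoids the crash in the lookup itself; the cuda_shift extension would still need CUDA to build.

```python
import os
import shutil


def find_cuda_home():
    """Return the CUDA install directory, or None when CUDA is not available."""
    # Respect an explicit setting first (assumed convention, not taken from setup.py).
    cuda_home = os.environ.get("CUDA_HOME") or os.environ.get("CUDA_PATH")
    if cuda_home:
        return cuda_home
    # shutil.which returns None instead of raising when nvcc is not on PATH.
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    # nvcc normally lives in <cuda_home>/bin/nvcc, so go up two levels.
    return os.path.dirname(os.path.dirname(os.path.realpath(nvcc)))
```

The caller can then skip building the CUDA extension when the result is None rather than aborting the whole setup.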
Hi,
Thanks for sharing the code and model zoo. Could you tell me how to modify this code to output predictions on single video files (e.g. in .mp4 format), as is done in the SlowFast code here?
Thanks.
When I set workers > 0, I get the error "RuntimeError: DataLoader worker (pid(s) 1995171, 1996371) exited unexpectedly".
It seems to happen at "batch = next(iterator)" in ".../X-Temporal/x_temporal/interface/temporal_helper.py".
(Note: I was running the "tin" experiment.)
I encountered an issue similar to this: the AverageMeter in utils.py, which is used in temporal_helper.py, is not correct.
Could you please fix it?
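The report does not say what exactly is wrong with the meter. For comparison, here is a minimal sketch of the usual running-average pattern, weighted by the number of samples per update. It is an illustration, not the code from utils.py.

```python
class AverageMeter:
    """Track the latest value and the running, sample-weighted average."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.val = 0.0
        self.sum = 0.0
        self.count = 0
        self.avg = 0.0

    def update(self, val, n=1):
        # `n` is the number of samples that `val` was averaged over.
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count if self.count else 0.0
```

Passing the batch size as `n` keeps the average correct when the last batch of an epoch is smaller than the others.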
THCState_getCurrentStream seems to be deprecated in PyTorch 1.5.0 and higher.
I read that it can be replaced with at::cuda::getCurrentCUDAStream, but my attempt failed. Any idea how to fix it?
The error output is below:
./X-Temporal/x_temporal/cuda_shift/src/shift_cuda.cpp: In function ‘at::Tensor shift_featuremap_cuda_backward(const at::Tensor&, const at::Tensor&, const at::Tensor&)’:
./X-Temporal/x_temporal/cuda_shift/src/shift_cuda.cpp:41:27: error: ‘THCState_getCurrentStream’ was not declared in this scope
ShiftDataCudaBackward(THCState_getCurrentStream(state),
^~~~~~~~~~~~~~~~~~~~~~~~~
Thanks for your feedback. If the batch size on each GPU is >= 6, you can set trainer.no_partial_bn = True and retry; this will not affect the accuracy. That module has a bug with distributed training, which we will fix soon.
Thanks for the reply, but it doesn't solve my problem.
When I set no_partial_bn = True, the log file stops at 'save_dir: checkpoint/' and is never updated again, and GPU memory usage is still about 800~900M.
The only settings I changed in my YAML file are dataset related:
root_dir:
train:
  meta_file: /home/renb/project/action_recognition/X-Temporal/data_labels/sthv1/train_videofolder.txt
val:
  meta_file: /home/renb/project/action_recognition/X-Temporal/data_labels/sthv1/val_videofolder.txt
test:
  meta_file: /home/renb/project/action_recognition/X-Temporal/data_labels/sthv1/test_videofolder.txt
I am very confused about this.
Thanks again; I look forward to your suggestions.
Originally posted by @Amazingren in #1 (comment)
When I train on my own dataset, I run into this problem: AttributeError: 'EasyDict' object has no attribute 'meta_file'
Could anybody help me? Thank you.
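An error like this usually means that one of the dataset sections in the YAML file has no `meta_file` entry, although the report does not confirm the cause. The sketch below checks a loaded config for that case before training starts. It assumes the `dataset.train/val/test.meta_file` layout seen in the configs posted on this page; `missing_meta_files` is a hypothetical helper, and plain dicts stand in for EasyDict.

```python
def missing_meta_files(config, splits=("train", "val", "test")):
    """Return the dataset splits that have no usable `meta_file` entry."""
    dataset = config.get("dataset") or {}
    missing = []
    for split in splits:
        section = dataset.get(split) or {}
        if not section.get("meta_file"):
            missing.append(split)
    return missing


example = {
    "dataset": {
        "train": {"meta_file": "/my_path/train_videofolder.txt"},
        "val": {},  # meta_file forgotten here; "test" is missing entirely
    }
}
print(missing_meta_files(example))
```

Running such a check right after loading the YAML file turns the late AttributeError into an early, readable list of the sections that need fixing.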
Hi. When I run the test, I cannot find the output file, and I cannot see how to output multi-label predictions. In the multi-label case, how do I decide whether my predicted labels are correct? Thanks.
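One common way to score multi-label predictions is mean average precision (mAP), the metric quoted for MMiT elsewhere on this page. The pure-Python sketch below shows the idea: rank samples by score for each class and average the precision at each true positive. Whether the repository's test script computes it exactly this way is an assumption, and both function names are hypothetical.

```python
def average_precision(scores, labels):
    """AP for one class: mean precision at the rank of each positive sample."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precisions = []
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(score_rows, label_rows):
    """mAP over classes; rows are samples, columns are classes."""
    num_classes = len(score_rows[0])
    aps = []
    for c in range(num_classes):
        scores = [row[c] for row in score_rows]
        labels = [row[c] for row in label_rows]
        if any(labels):  # skip classes with no positive sample
            aps.append(average_precision(scores, labels))
    return sum(aps) / len(aps) if aps else 0.0
```

With this metric no per-label threshold is needed: only the ranking of the scores within each class matters.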
Here the data_loaders is always set to "test".
I'm not sure whether this is a bug. Maybe I misunderstand something?
How can I get the TIN model pretrained on Kinetics-600? I would like to use the pretrained model. Thanks.
Hi, thanks for the great codebase.
Could you kindly provide the code to extract features from custom videos using pre-trained models?
Thank you for your beautiful code.
I ran this model on the Kinetics dataset, with videos in .mp4 format.
1st error: RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
Reason: when training on multiple GPUs, the tensor may not be contiguous in memory.
Solution: in /../utils.py line 42, use contiguous().view(1,-1), the same as on line 46.
After that, training starts, but at iteration 40 it hits another error.
2nd ERROR: decord._ffi.base.DECORDError: [14:51:44] /io/decord/src/video/video_reader.cc:125: Check failed: st_nb >= 0 (-1381258232 vs. 0) ERROR cannot find video stream with wanted index: -1
It also prints: UserWarning: resource_tracker: There appear to be 122 leaked semaphore objects to clean up at shutdown
Although I tried many things, I still couldn't solve the problem.
Could you kindly tell me what the problem is? Thanks a lot.
In linear_sampler(), the shape of bias is N*T*G. Why is it not N*G?
Thanks for your nicely styled codebase.
However, when I try to train TSM with it, a problem stops me from training.
(1) The log file stops at: 2020-04-10 xxxx094-models.py#177: Freezing BatchNorm2D except the first one
and after waiting 10 minutes there is still no further update.
(2) When I check GPU usage with 'gpustat', it shows only about 800M used on each GPU (I use 8 in total).
Sorry to bother you; I am a newcomer and would appreciate any pointers.
Thank you for your nice code.
I set these values in default.yaml:
gpus: 4
dataset:
  img_prefix: '{:05d}.jpg'
  video_source: True  # because I use video as input data
  modality: Flow
  train:
    meta_file: /my_path/train_videofolder.txt
trainer:
  no_partial_bn: True
My training script, train.sh, is:
T=`date +%m%d%H%M`
ROOT=../..
cfg=default.yaml
export PYTHONPATH=$ROOT:$PYTHONPATH
CUDA_VISIBLE_DEVICES=4,5,6,7 python $ROOT/x_temporal/train.py --config $cfg | tee log.train.$T
When I run train.sh, the log file stops at 'save_dir: checkpoint/' and is never updated again.
Could you kindly tell me what the problem is? Thanks a lot.
The previous issue was closed: #14 (comment)
Following your suggestion, I used multi-crops and a bigger input, but I still cannot reproduce the reported results on Multi-Moments in Time with the TIN and SlowFast models (TIN: 57 vs. 62 in the report). However, I get a slightly better result with the TSN model (59.7 vs. 58.9). Do you have any idea how to solve this?
The TensorBoard writer should be closed when training ends, or an EOFError will be raised.
if self.rank == 0: self.tb_logger.close()
should be added to temporal_helper.py, at the end of train(), to avoid this error.
torch.multiprocessing.set_start_method("forkserver")
might not work well for Windows users.
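Windows only supports the "spawn" start method, so a hard-coded "forkserver" fails there. A minimal sketch of a portable choice follows. It uses the standard-library `multiprocessing` module so that it runs without PyTorch; `torch.multiprocessing` wraps the same API. `pick_start_method` is a hypothetical helper, not part of the codebase.

```python
import multiprocessing as mp


def pick_start_method(preferred="forkserver", fallback="spawn"):
    """Return `preferred` if this platform supports it, else `fallback`."""
    available = mp.get_all_start_methods()
    return preferred if preferred in available else fallback


if __name__ == "__main__":
    # force=True avoids a RuntimeError if a start method was already set.
    mp.set_start_method(pick_start_method(), force=True)
    print(mp.get_start_method())
```

On Linux this keeps "forkserver"; on Windows it falls back to "spawn" instead of raising.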
Hi, I am trying to test the model-zoo model: TIN trained on MMit dataset based on resnet 50. I changed the default.yaml in ./experiments/tin folder to the following:
version: 1.0
config:
  gpus: 4
  seed: 2020
  dataset:
    workers: 4
    num_class: 313
    num_segments: 16
    batch_size: 8
    img_prefix: '{:05d}.jpg'
    video_source: True
    dense_sample: False
    modality: RGB
    flow_prefix: ''
    root_dir: ""
    flip: False
    input_mean: [0.485, 0.456, 0.406]
    input_std: [0.229, 0.224, 0.225]
    crop_size: 224
    scale_size: 256
    train:
      meta_file: /path
    val:
      meta_file: /workdir/wwn/Multi_Moments_in_Time/mit-val.txt
    test:
      meta_file: /workdir/wwn/Multi_Moments_in_Time/mit-val.txt
    multi_class: True
  net:
    arch: resnet50
    model_type: 2D
    tin: True
    shift_div: 4
    consensus_type: avg
    dropout: 0.8
    img_feature_dim: 256
    pretrain: True  # imagenet pretrain for 2D network
  trainer:
    print_freq: 20
    eval_freq: 1
    epochs: 35
    start_epoch: 0
    loss_type: bce
    no_partial_bn: True
    clip_gradient: 20
    lr_scheduler:
      warmup_epochs: 1
      warmup_type: linear
      type: CosineAnnealingLR
      kwargs:
        T_max: 30
    optimizer:
      type: SGD
      kwargs:
        lr: 0.02
        momentum: 0.9
        weight_decay: 0.0005
        nesterov: True
  evaluate:
    spatial_crops: 1
    temporal_samples: 1
  saver:
    #save_dir: 'checkpoint/'
    #pretrain_model: '/path'
    resume_model: /home/hadoop-mtcv/cephfs/data/wangwanneng/X-Temporal-master/X-Temporal-master/pretrained/tin_mit_16.pth.tar
but the test result is 14.4 mAP.
I think something may be wrong in the model configuration, because when testing the model there are missing keys.
The missing keys are as follows:
module.base_model.layer3.4.bn1.num_batches_tracked
module.base_model.layer2.1.bn2.num_batches_tracked
module.base_model.layer3.2.bn3.num_batches_tracked
module.base_model.layer3.5.bn1.num_batches_tracked
module.base_model.bn1.num_batches_tracked
module.base_model.layer4.2.bn3.num_batches_tracked
module.base_model.layer4.1.bn2.num_batches_tracked
module.base_model.layer1.2.bn2.num_batches_tracked
module.base_model.layer2.2.bn1.num_batches_tracked
module.base_model.layer3.5.bn2.num_batches_tracked
module.base_model.layer4.2.bn2.num_batches_tracked
module.base_model.layer4.0.downsample.1.num_batches_tracked
module.base_model.layer1.0.bn3.num_batches_tracked
module.base_model.layer3.0.downsample.1.num_batches_tracked
module.base_model.layer3.3.bn3.num_batches_tracked
module.base_model.layer3.3.bn2.num_batches_tracked
module.base_model.layer4.0.bn1.num_batches_tracked
module.base_model.layer3.2.bn1.num_batches_tracked
module.base_model.layer2.3.bn2.num_batches_tracked
module.base_model.layer1.0.bn2.num_batches_tracked
module.base_model.layer4.1.bn1.num_batches_tracked
module.base_model.layer2.1.bn3.num_batches_tracked
module.base_model.layer2.0.downsample.1.num_batches_tracked
module.base_model.layer3.4.bn3.num_batches_tracked
module.base_model.layer1.0.downsample.1.num_batches_tracked
module.base_model.layer1.2.bn1.num_batches_tracked
module.base_model.layer4.1.bn3.num_batches_tracked
module.base_model.layer4.0.bn3.num_batches_tracked
module.base_model.layer3.1.bn1.num_batches_tracked
module.base_model.layer3.3.bn1.num_batches_tracked
module.base_model.layer1.0.bn1.num_batches_tracked
module.base_model.layer1.1.bn3.num_batches_tracked
module.base_model.layer3.0.bn2.num_batches_tracked
module.base_model.layer3.0.bn3.num_batches_tracked
module.base_model.layer2.1.bn1.num_batches_tracked
module.base_model.layer1.2.bn3.num_batches_tracked
module.base_model.layer2.3.bn1.num_batches_tracked
module.base_model.layer3.1.bn2.num_batches_tracked
module.base_model.layer1.1.bn1.num_batches_tracked
module.base_model.layer2.0.bn3.num_batches_tracked
module.base_model.layer2.0.bn2.num_batches_tracked
module.base_model.layer1.1.bn2.num_batches_tracked
module.base_model.layer3.4.bn2.num_batches_tracked
module.base_model.layer4.0.bn2.num_batches_tracked
module.base_model.layer3.5.bn3.num_batches_tracked
module.base_model.layer2.2.bn2.num_batches_tracked
module.base_model.layer3.1.bn3.num_batches_tracked
module.base_model.layer3.2.bn2.num_batches_tracked
module.base_model.layer2.3.bn3.num_batches_tracked
module.base_model.layer3.0.bn1.num_batches_tracked
module.base_model.layer4.2.bn1.num_batches_tracked
module.base_model.layer2.2.bn3.num_batches_tracked
module.base_model.layer2.0.bn1.num_batches_tracked
So could you share the config file you used when testing on the MMit dataset?
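Every key in the list above ends in `num_batches_tracked`, a BatchNorm buffer that PyTorch added in version 0.4.1. It only counts training steps, and to my knowledge it is not read at inference time, so these keys alone are unlikely to explain the low mAP. The sketch below separates such counters from any other missing keys, which would deserve a closer look. `split_missing_keys` is a hypothetical helper, and the second key in the example is made up for illustration.

```python
def split_missing_keys(missing_keys):
    """Separate BatchNorm step counters from every other missing key."""
    suffix = ".num_batches_tracked"
    counters = [k for k in missing_keys if k.endswith(suffix)]
    others = [k for k in missing_keys if not k.endswith(suffix)]
    return counters, others


keys = [
    "module.base_model.bn1.num_batches_tracked",  # from the list above
    "module.new_fc.weight",  # hypothetical key, for illustration only
]
counters, others = split_missing_keys(keys)
print(others)
```

If `others` comes back empty for the real list, the cause of the low score is more likely in the data or evaluation settings than in the loaded weights.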