kinetics-dataset's Issues
Decoding some Kinetics-400 videos fails with "moov atom not found"
Hi, thanks for your dataset! It's very helpful.
I have successfully downloaded the Kinetics-400 dataset and annotations and have two questions.
- Some of the videos in the Kinetics-400 train set are corrupted. Decoding them with cv2.VideoCapture("/path/to/video") reports "moov atom not found". The list of corrupted video files is in test.txt.
- There are 240k videos in the Kinetics-400 train set. However, it seems that not all of them (only about 238k-239k) are in the Kinetics-400 train annotations.
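The "moov atom not found" message usually means the MP4 container has no top-level `moov` box, which is typical of truncated downloads. A minimal sketch, assuming the clips are standard MP4 files, that scans the top-level boxes without decoding any video. The names `top_level_boxes`, `has_moov`, and `find_missing_moov` are hypothetical helpers, not part of the repository.

```python
import struct
from pathlib import Path


def top_level_boxes(path):
    """Yield the four-character type of each top-level MP4 box in the file."""
    size_on_disk = Path(path).stat().st_size
    with open(path, "rb") as f:
        offset = 0
        while offset + 8 <= size_on_disk:
            f.seek(offset)
            size, box_type = struct.unpack(">I4s", f.read(8))
            min_size = 8
            if size == 1:
                # A size of 1 means a 64-bit "largesize" follows the type field.
                ext = f.read(8)
                if len(ext) < 8:
                    return
                size = struct.unpack(">Q", ext)[0]
                min_size = 16
            yield box_type.decode("latin-1")
            if size == 0 or size < min_size:
                # size 0: box runs to end of file; smaller than header: malformed.
                return
            offset += size


def has_moov(path):
    """Return True if the file contains a top-level 'moov' box."""
    return "moov" in top_level_boxes(path)


def find_missing_moov(root):
    """Return sorted paths of *.mp4 files under root that lack a 'moov' box."""
    return sorted(str(p) for p in Path(root).rglob("*.mp4") if not has_moov(p))
```

Calling `find_missing_moov("/path/to/train")` would give a candidate list of files to re-download. A file that passes this check can still fail to decode for other reasons.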
Missing videos in downloaded validation and train folder.
Hello,
It looks like a lot of samples found in https://s3.amazonaws.com/kinetics/600/annotations/train.csv and val.csv are missing. In my downloaded val folder there are about 29k video samples but only 15k of them can be found in the https://s3.amazonaws.com/kinetics/600/annotations/val.csv file.
Burak
Does Kinetics-700 include Kinetics-600 and Kinetics-400?
Fix the space in link
Change wget $one to wget "$one" in download.sh. Otherwise, a space in the link causes the download to fail.
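For anyone driving the downloads from Python instead of the shell script, the equivalent precaution is to percent-encode the path before requesting it. A minimal sketch using only the standard library; `encode_url` is a hypothetical helper, and the example URL is the one quoted in another issue on this page.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_url(url):
    """Percent-encode the path of a URL so spaces and parentheses are safe.

    '%' is kept as-is so that a URL that is already encoded is not
    encoded a second time.
    """
    parts = urlsplit(url.strip())
    safe_path = quote(parts.path, safe="/%")
    return urlunsplit((parts.scheme, parts.netloc, safe_path, parts.query, parts.fragment))


# Example with a class name that contains spaces and parentheses.
encoded = encode_url(
    "https://s3.amazonaws.com/kinetics/600/val/mountain climber (exercise).tar.gz"
)
```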
What does is_cc mean?
In k400_train, the header is:
label,youtube_id,time_start,time_end,split,is_cc
So what does is_cc mean?
Thanks!
Follow-up on the held-out set for Kinetics-600
Hi @kinetics-cvdf ,
Is there now a path list for the held-out test set of Kinetics-600, like the https://s3.amazonaws.com/kinetics/600/test/k600_test_path.txt you provide for the test set?
This is related to #10. I am facing the same issue and was hoping to avoid downloading all the videos from YouTube.
Some videos listed in the annotations could not be found
While using the dataset, we found that some videos listed in train.csv, test.csv, and val.csv under the annotations folder are missing. After putting the 1200 replacement videos into the train folder, there are still 8000 videos that cannot be found.
Extracting into subdirectories rather than everything in one dir
extract.sh untars everything into one folder. Wouldn't it be better to extract each archive into its own destination folder? I.e.
$ more ../extract2.sh
for file in *.tar.gz;
do
mkdir -p "${file%.tar.gz}"
tar -zxf "$file" -C "${file%.tar.gz}"
done
and extract with
DeepMind/Kinetics-600/train$ bash ../extract2.sh k600_train_path.txt
gives
drwxr-sr-x 2 torel users 993 Apr 29 2018 abseiling/
-rw-r--r-- 1 torel users 2000227200 Apr 17 12:49 abseiling.tar.gz
drwxr-sr-x 2 torel users 708 Apr 29 2018 acting in play/
-rw-r--r-- 1 torel users 930025532 Apr 17 12:49 acting in play.tar.gz
drwxr-sr-x 2 torel users 649 Apr 29 2018 adjusting glasses/
Brgds,
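The same per-class extraction can be sketched in Python with the standard `tarfile` module, which avoids shell quoting problems with class names such as "acting in play". This is an illustration under the assumption that every archive in the directory is a gzip-compressed tar; `extract_per_class` is a hypothetical name.

```python
import tarfile
from pathlib import Path


def extract_per_class(archive_dir):
    """Extract each *.tar.gz in archive_dir into a folder named after the archive."""
    extracted = []
    for archive in sorted(Path(archive_dir).glob("*.tar.gz")):
        # "abseiling.tar.gz" -> "abseiling/", next to the archive.
        target = archive.with_name(archive.name[: -len(".tar.gz")])
        target.mkdir(parents=True, exist_ok=True)
        # Use the safer "data" extraction filter where this Python supports it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(target, **kwargs)
        extracted.append(target)
    return extracted
```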
Missing videos
It is known that some YouTube videos are no longer available. I wonder whether the Kinetics-400 release you provide contains these missing videos.
Size of Some Videos is 0 KB
When I used ffmpeg to change the resolution of these videos, I got some errors, so I checked the file sizes and found that some videos are 0 KB in size:
1. E2kUsRIj4tM_000317_000327.mp4
2. E2NeSaQieHk_000087_000097.mp4
3. MVWayhNpHr0_000065_000075.mp4
4. N74EWF0fs5c_000182_000192.mp4
5. QhF1i23vwps_000379_000389.mp4
6. YCQlaH_Vy8I_000245_000255.mp4
7. _cbZlhduYJY_000503_000513.mp4
8. aCcAcCE7Ixo_000034_000044.mp4
9. gKBhQ-oe_9Q_000177_000187.mp4
10. lm6qgrfJGmw_000027_000037.mp4
11. 28bTQiuymgs_000031_000041.mp4
12. 8iED0lhyrN8_000038_000048.mp4
13. B6GxQKcL7IY_000213_000223.mp4
14. Df6CGDjUkAA_000151_000161.mp4
15. GkGS69GCx4Q_000319_000329.mp4
16. J5xNIJlfBAw_000156_000166.mp4
17. QzmhrYx15_E_000059_000069.mp4
18. ZtCk_0cMZ9U_000347_000357.mp4
19. d_vQWquKtBg_000015_000025.mp4
20. rba-NkJjSNg_000167_000177.mp4
21. wL1Bit-Gv40_000305_000315.mp4
22. GN37yfNvQwM_000132_000142.mp4
23. bOU2oGVBM_o_000030_000040.mp4
24. du6bfkBEfVs_000155_000165.mp4
25. fXRNY6-s-7U_000112_000122.mp4
I don't know whether an error occurred during downloading or decompression, or whether the videos themselves are corrupted.
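Empty files like the ones listed above can be found before any ffmpeg processing by checking file sizes. A minimal sketch; `find_empty_videos` is a hypothetical helper, and the directory layout is an assumption.

```python
from pathlib import Path


def find_empty_videos(root, pattern="*.mp4"):
    """Return sorted paths of files under root that match pattern and are 0 bytes."""
    return sorted(
        p for p in Path(root).rglob(pattern)
        if p.is_file() and p.stat().st_size == 0
    )
```

Comparing the result against a fresh download of the same archive would show whether the empty files come from the archive itself or from a failed extraction.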
Label missing in K400 test set annotation
The original test set downloaded from https://deepmind.com/research/open-source/kinetics contains labels for each video, which are missing in this release.
From DeepMind:
label,youtube_id,time_start,time_end,split
drinking beer,--6bJUbfpnQ,17,27,test
climbing tree,--8YXc8iCt8,2,12,test
surfing water,--coBvtS-eQ,57,67,test
stomping grapes,--q6ElFyVq0,148,158,test
...
Downloaded from AWS (the label is missing):
youtube_id,time_start,time_end,split
--6bJUbfpnQ,17,27,test
--8YXc8iCt8,2,12,test
--coBvtS-eQ,57,67,test
--q6ElFyVq0,148,158,test
...
Missing files in the data folder
Hi,
While checking the test folder, I found that test.csv has 39805 records. When I checked the files against the records in test.csv, 1120 of the 39805 do not exist. Is this normal?
Thank you.
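A check like the one described can be sketched as follows. It assumes the CSV has the `youtube_id,time_start,time_end` columns shown elsewhere on this page and that clips are named `<youtube_id>_<start>_<end>.mp4` with six-digit zero-padded times, as in file names such as E2kUsRIj4tM_000317_000327.mp4. `expected_name` and `missing_clips` are hypothetical helpers.

```python
import csv
from pathlib import Path


def expected_name(row):
    """Build the clip file name that corresponds to one annotation row."""
    start = int(row["time_start"])
    end = int(row["time_end"])
    return f"{row['youtube_id']}_{start:06d}_{end:06d}.mp4"


def missing_clips(csv_path, video_dir):
    """Return the expected file names from csv_path that are absent under video_dir."""
    present = {p.name for p in Path(video_dir).rglob("*.mp4")}
    with open(csv_path, newline="") as f:
        return [
            name
            for name in map(expected_name, csv.DictReader(f))
            if name not in present
        ]
```

`len(missing_clips("annotations/test.csv", "test"))` would then give the number of annotated clips with no file on disk. The comparison is case-sensitive.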
Videos inside train and test folder may have different names. (K400)
The following is the annotations file for training: k400_train.csv
The following is the list of videos present inside the train folder: origtrain.txt
Many videos that are present in the annotations file are either missing or have a different name. Some examples:
absent: ['abseiling' 'lqciwm6gDrk' 659 669 'train' 0]
absent: ['abseiling' 'Lwti_IVm-Bc' 39 49 'train' 0]
absent: ['abseiling' 'LwyKxe85UWI' 88 98 'train' 0]
absent: ['abseiling' 'lXnebafO2cI' 2145 2155 'train' 0]
absent: ['abseiling' 'LY02AE6XK5I' 381 391 'train' 0]
absent: ['abseiling' 'M-hBdj62g9Y' 48 58 'train' 0]
absent: ['abseiling' 'm-iKFbNcLYM' 30 40 'train' 0]
absent: ['abseiling' 'M1QFHoC4o3A' 78 88 'train' 0]
absent: ['abseiling' 'm25BcZ3B0Hs' 219 229 'train' 0]
absent: ['abseiling' 'M6yv0dy8lYE' 297 307 'train' 0]
absent: ['abseiling' 'm8Pm5kmCuqI' 64 74 'train' 0]
absent: ['abseiling' 'MIIbU2xZcUY' 32 42 'train' 0]
absent: ['abseiling' 'mjsrWa2olhk' 35 45 'train' 0]
absent: ['abseiling' 'MP-Op52e84g' 176 186 'train' 0]
absent: ['abseiling' 'MqBaIW3qmuM' 98 108 'train' 0]
absent: ['abseiling' 'mRdyYMPlJ_8' 73 83 'train' 0]
Can someone please confirm whether they have the same issue, or am I missing something?
Thank you
No videos found after running k700_2020_extractor.sh!
I am trying to download the Kinetics-700 dataset.
I followed the instructions provided, but no videos were downloaded.
Are there any suggestions?
k700_2020_downloader.sh
This file is all kinds of wrong. First off, it uses sudo for everything, and it also downloads all the files twice. I will fix it and open a pull request. This is really, really bad.
Many videos in the Kinetics700-2020 are shorter than 10 seconds
Hi, many videos in the Kinetics700-2020 are shorter than 10 seconds, but they are supposed to be 10 seconds long. In the test split, the percentage is over 25%. Here are some examples that are shorter than 8 seconds.
Kinetics700-2020-test/v55ikd_-Rc4_000141_000151.mp4
Kinetics700-2020-test/52mb2tRzayU_000106_000116.mp4
Kinetics700-2020-test/9k3bdcoMTVY_000013_000023.mp4
Kinetics700-2020-test/f9FftpAwmws_000074_000084.mp4
Kinetics700-2020-test/714LsaiTVVk_000002_000012.mp4
Kinetics700-2020-test/7WtqdnyTXjY_000004_000014.mp4
Kinetics700-2020-test/bbaRarfa-X0_000073_000083.mp4
Kinetics700-2020-test/xKnk1UYdgac_000000_000010.mp4
Kinetics700-2020-test/Pf5jowvNpiE_000013_000023.mp4
Kinetics700-2020-test/A1CQslN-Xbw_000010_000020.mp4
Kinetics700-2020-test/aJw7fScmOGo_000007_000017.mp4
Kinetics700-2020-test/2bI8oYlrWjs_000000_000010.mp4
Kinetics700-2020-test/KV8RVTRTAL0_000007_000017.mp4
Kinetics700-2020-test/rAgdt5mqCwA_000048_000058.mp4
Kinetics700-2020-test/LhW0hADHePo_000000_000010.mp4
Kinetics700-2020-test/Fo7EYCBwDaw_000135_000145.mp4
Kinetics700-2020-test/72PEZjijk8o_000002_000012.mp4
Kinetics700-2020-test/3-3e71B5yBo_000000_000010.mp4
Kinetics700-2020-test/d61S7amsWsM_000003_000013.mp4
Kinetics700-2020-test/191VnlH8z68_000002_000012.mp4
Kinetics700-2020-test/QV6D9MoUlH4_000042_000052.mp4
Kinetics700-2020-test/flXQJFDjw1E_000001_000011.mp4
Kinetics700-2020-test/iCgHfcLhnDU_000318_000328.mp4
Kinetics700-2020-test/6vemGexYgHI_000003_000013.mp4
Kinetics700-2020-test/2AxfjxBvh10_000000_000010.mp4
Kinetics700-2020-test/4LFQuxKfFIQ_000261_000271.mp4
Kinetics700-2020-test/4QYmCBN1nHQ_000046_000056.mp4
Kinetics700-2020-test/cPd1GhGV4Fg_000011_000021.mp4
Kinetics700-2020-test/4V7JPYZBnCM_000014_000024.mp4
Kinetics700-2020-test/3xcQj9HZP5Y_000000_000010.mp4
Kinetics700-2020-test/1LaRLvgZTjI_000114_000124.mp4
Kinetics700-2020-test/8uGAZkuoXVg_000078_000088.mp4
Kinetics700-2020-test/42vZ8I-jRPg_000034_000044.mp4
Kinetics700-2020-test/1f-5jxwtibg_000262_000272.mp4
Kinetics700-2020-test/6_T1NJTMNuc_000000_000010.mp4
Kinetics700-2020-test/1F4REb4pqo0_000001_000011.mp4
Kinetics700-2020-test/3OPqFdZlaNY_000075_000085.mp4
Kinetics700-2020-test/JE8h-yGd25w_000000_000010.mp4
Kinetics700-2020-test/9PVi6qiS7zM_000006_000016.mp4
Kinetics700-2020-test/0hMk37By7t4_000021_000031.mp4
Kinetics700-2020-test/Pd_gOf0TY7M_000050_000060.mp4
Kinetics700-2020-test/KdD5HVxwaQE_000018_000028.mp4
Kinetics700-2020-test/caBITzNkOis_000014_000024.mp4
Kinetics700-2020-test/3lGPnnsf9Y8_000004_000014.mp4
Kinetics700-2020-test/1OvQ9_ZgnIA_000000_000010.mp4
Kinetics700-2020-test/AkIhOrNcbUA_000020_000030.mp4
Kinetics700-2020-test/M45S-HkcwTM_000049_000059.mp4
Kinetics700-2020-test/FOa1tk1Isi0_000038_000048.mp4
Kinetics700-2020-test/OgXl2BKdUoU_000012_000022.mp4
Kinetics700-2020-test/uaKPPePpSY0_000006_000016.mp4
Kinetics700-2020-test/-_D7UCii3FU_000021_000031.mp4
Kinetics700-2020-test/3Hr-2TpgVEE_000057_000067.mp4
Kinetics700-2020-test/1Je9mL8Uudo_000000_000010.mp4
Kinetics700-2020-test/N1IGDSJoia0_000000_000010.mp4
Kinetics700-2020-test/9EiQCNi4bOA_000023_000033.mp4
Kinetics700-2020-test/0C9EO_A2PIY_000004_000014.mp4
Kinetics700-2020-test/B0n-nS4Y6xs_000000_000010.mp4
Kinetics700-2020-test/45E3EdNaoHg_000013_000023.mp4
Kinetics700-2020-test/6hpPVBBGZ74_000009_000019.mp4
Kinetics700-2020-test/a1jyH4CJJR4_000000_000010.mp4
Kinetics700-2020-test/AzQ6mn_6ZKc_000000_000010.mp4
Kinetics700-2020-test/0zr5-JyS0Xc_000047_000057.mp4
Kinetics700-2020-test/43D0gnE5Z7o_000083_000093.mp4
Kinetics700-2020-test/IVW_Yk2lyDg_000000_000010.mp4
Kinetics700-2020-test/2R45XkkgbAQ_000045_000055.mp4
Kinetics700-2020-test/8N6-DeT6mXs_000048_000058.mp4
Kinetics700-2020-test/6ATIhv4DFjo_000034_000044.mp4
Kinetics700-2020-test/3E9AdPkiz9o_000000_000010.mp4
Kinetics700-2020-test/5XgnD4P9B-M_000005_000015.mp4
Kinetics700-2020-test/AB305H8Np48_000040_000050.mp4
Kinetics700-2020-test/3OezYSbd_n4_000064_000074.mp4
Kinetics700-2020-test/Z8e-EfVlIx0_000000_000010.mp4
Kinetics700-2020-test/6m_8FNc2scg_000137_000147.mp4
Kinetics700-2020-test/K43n8RqxbFQ_000101_000111.mp4
Kinetics700-2020-test/kgIEx-OjPG0_000000_000010.mp4
Kinetics700-2020-test/0nLH52UNKhw_000000_000010.mp4
Kinetics700-2020-test/5V7GTuihlQQ_000002_000012.mp4
Kinetics700-2020-test/1hZV-H5yl6s_000000_000010.mp4
Kinetics700-2020-test/COZqe2f1Axg_000031_000041.mp4
Kinetics700-2020-test/29GNPtZaqS4_000001_000011.mp4
Kinetics700-2020-test/83J0uf8cJlI_000025_000035.mp4
Kinetics700-2020-test/6Zl5jX9fjKE_000139_000149.mp4
Kinetics700-2020-test/1AGYst8AKCc_000000_000010.mp4
Kinetics700-2020-test/cEmdLm8cBNE_000037_000047.mp4
Kinetics700-2020-test/1FUiMeIu7sE_000011_000021.mp4
Kinetics700-2020-test/5JBC5X0O73k_000005_000015.mp4
Kinetics700-2020-test/Cpn-XAerL5I_000011_000021.mp4
Kinetics700-2020-test/aFqlkvgQKho_000000_000010.mp4
Kinetics700-2020-test/aUOo5M67Itc_000010_000020.mp4
Kinetics700-2020-test/BJaHpp_K148_000190_000200.mp4
Kinetics700-2020-test/AivUke09tz8_000019_000029.mp4
Kinetics700-2020-test/f5WiwscpVlE_000000_000010.mp4
Kinetics700-2020-test/44esMhYjLRs_000019_000029.mp4
Kinetics700-2020-test/p-koaErOtiI_000075_000085.mp4
Kinetics700-2020-test/aC6__nAesz8_000103_000113.mp4
Kinetics700-2020-test/CrnGGdO3C4M_000030_000040.mp4
Kinetics700-2020-test/8YbNZ3lm7Ts_000000_000010.mp4
Kinetics700-2020-test/CAUQyTTat2M_000011_000021.mp4
Kinetics700-2020-test/CZbXx9UW2FE_000146_000156.mp4
Kinetics700-2020-test/7JYYa4C5u4A_000003_000013.mp4
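A list like the one above can be produced without decoding frames by reading the duration that the MP4 container declares in its `mvhd` box. The sketch below assumes standard MP4 files and only handles the common 32-bit box sizes inside `moov`. The declared duration can differ slightly from what a decoder reports, so a tool such as ffprobe remains the more reliable check. `mp4_duration_seconds` and `clips_shorter_than` are hypothetical helpers.

```python
import struct


def _mvhd_duration(moov):
    """Return duration in seconds from the 'mvhd' box inside a 'moov' payload."""
    offset = 0
    while offset + 8 <= len(moov):
        size, box_type = struct.unpack_from(">I4s", moov, offset)
        if size < 8:
            return None  # 64-bit or open-ended child boxes are not handled here
        if box_type == b"mvhd":
            body = moov[offset + 8 : offset + size]
            try:
                if body[0] == 1:
                    # Version 1: 64-bit creation and modification times come first.
                    timescale, duration = struct.unpack_from(">IQ", body, 4 + 16)
                else:
                    # Version 0: 32-bit creation and modification times come first.
                    timescale, duration = struct.unpack_from(">II", body, 4 + 8)
            except (IndexError, struct.error):
                return None
            return duration / timescale if timescale else None
        offset += size
    return None


def mp4_duration_seconds(path):
    """Return the container-declared duration, or None if it cannot be read."""
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if len(header) < 8:
                return None  # reached end of file without finding 'moov'
            size, box_type = struct.unpack(">I4s", header)
            header_len = 8
            if size == 1:
                ext = f.read(8)
                if len(ext) < 8:
                    return None
                size = struct.unpack(">Q", ext)[0]
                header_len = 16
            if size != 0 and size < header_len:
                return None  # malformed box
            if box_type == b"moov":
                payload = f.read() if size == 0 else f.read(size - header_len)
                return _mvhd_duration(payload)
            if size == 0:
                return None  # last box runs to end of file and is not 'moov'
            f.seek(size - header_len, 1)


def clips_shorter_than(paths, seconds):
    """Return the paths whose declared duration is below the given threshold."""
    short = []
    for path in paths:
        duration = mp4_duration_seconds(path)
        if duration is not None and duration < seconds:
            short.append(path)
    return short
```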
Kinetics 600 validation set mountain climber empty tar file
Hi,
Thanks for making the kinetics dataset publicly available. I found the following link to be empty under Kinetics 600 validation set. Could you please look into it?
https://s3.amazonaws.com/kinetics/600/val/mountain climber (exercise).tar.gz
Thanks
Question about dir `replacement`
I have downloaded the data and extracted the tar.gz files using the .sh scripts. How should I use the replacement data? Should I move all the files in replacement to another dir?
K600 has videos not in the original release?
Hi,
Thanks for your effort archiving the videos.
I found that some videos in the provided K600 do not exist in the original release.
In particular, the val set provided here is quite different: around 30k videos were downloaded, but only ~17k of them are in the original release.
I wonder whether this is an official updated version, or whether something went wrong, for example videos from other sources being included by accident?
Thanks,
k600_extract script
find $curr_dl -type f | while read file; do mv "$file" `echo $file | tr ' ' '_'`done
should be
find $curr_dl -type f | while read file; do mv "$file" `echo $file | tr ' ' '_'`; done
The missing semicolon leads to an error.
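The renaming step that this one-liner performs can also be sketched in Python, where no quoting or semicolons are involved. It assumes the intended replacement character for spaces is an underscore, and it renames only the file name itself, not the parent directories. `replace_spaces` is a hypothetical helper.

```python
from pathlib import Path


def replace_spaces(root):
    """Rename files under root so their names use '_' instead of spaces."""
    renamed = []
    # sorted() materialises the listing before any file is renamed.
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and " " in path.name:
            target = path.with_name(path.name.replace(" ", "_"))
            if target.exists():
                continue  # never overwrite an existing file
            path.rename(target)
            renamed.append(target)
    return renamed
```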
k600_extractor.sh throws errors during extraction
Running bash k600_extractor.sh gives the following output (first 11 lines):
Extracting k600_targz/train/abseiling.tar.gz to k600/train
Extracting k600_targz/train/play.tar.gz to k600/train
tar (child): k600_targz/train/play.tar.gz: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
Extracting k600_targz/train/glasses.tar.gz to k600/train
tar (child): k600_targz/train/glasses.tar.gz: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
...
Expected behaviour: no errors should be thrown during extraction.
Missing annotations directory in arrange_by_classes.py
kinetics-dataset/arrange_by_classes.py
Line 22 in 7fddcaa
Hi, I just tested this script with the K400 version of the dataset and found that the 'annotations' folder, where the CSVs reside, is missing from the path in the script. I'm not sure whether this fix would be correct for all the Kinetics versions.
fix:
split_csv = load_label(path / 'annotations' / f'{split}.csv')
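Since it is unclear whether every Kinetics version keeps its CSVs under 'annotations', one option is to look in both places. A minimal sketch, not the script's actual code; `find_split_csv` is a hypothetical helper whose result would be passed to `load_label`.

```python
from pathlib import Path


def find_split_csv(root, split):
    """Return the CSV for a split, preferring root/annotations over root itself."""
    root = Path(root)
    candidates = [
        root / "annotations" / f"{split}.csv",
        root / f"{split}.csv",
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    tried = ", ".join(str(c) for c in candidates)
    raise FileNotFoundError(f"No CSV found for split '{split}'; tried: {tried}")
```

The call in the script would then read `split_csv = load_label(find_split_csv(path, split))`.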
GPU settings
Hello, Thanks for the great contribution !
Could you please describe the required GPU memory and the number of GPUs for the experiments in the table?
label missing for k600?
Hi, I am wondering where we can get the ground-truth labels for K600. They are not in the link you provided in the README. I also checked DeepMind's csv and json, but it seems that not every video in the K600 test set is in their csv or json; for example, the id 0y-r_p-0TwM (in part_2.tar.gz) is not in the csv.
Thank you!
Sizes of The Video Datasets
Hi, could you (or anyone who has downloaded the datasets) provide an estimate of the dataset sizes so we can better plan the disk space for them? Thanks! I'm planning to download K400 and K600.
Download part of kinetics dataset
No held-out test in Kinetics-600
I have downloaded all the videos in https://s3.amazonaws.com/kinetics/600/test/k600_test_path.txt, but found the "held-out test set" missing. There are 72,924 videos in https://s3.amazonaws.com/kinetics/600/annotations/test.csv, but only 59,608 downloaded.
Could you tell us how to get the held-out test set in Kinetics-600? Thanks! @kinetics-cvdf
part_120.tar.gz is not a tar.gz file but a tar file
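A mislabeled archive like this can be detected from its first two bytes, and Python's `tarfile` can open it either way with the transparent `"r:*"` mode. A minimal sketch; `is_gzip` and `list_members` are hypothetical helpers.

```python
import tarfile

# Every gzip stream starts with these two magic bytes.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzip(path):
    """Return True if the file starts with the gzip magic number."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def list_members(path):
    """List archive members, letting tarfile detect the compression itself."""
    with tarfile.open(path, "r:*") as tar:
        return tar.getnames()
```

With the shell tools, the corresponding workaround for such a file would be `tar xf` instead of `tar zxf`.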
Label missing in K600 test set annotation
The original test set downloaded from https://deepmind.com/research/open-source/kinetics contains labels for each video, which are missing in this release.
From DeepMind:
label,youtube_id,time_start,time_end,split
drinking beer,--6bJUbfpnQ,17,27,test
climbing tree,--8YXc8iCt8,2,12,test
surfing water,--coBvtS-eQ,57,67,test
stomping grapes,--q6ElFyVq0,148,158,test
...
Downloaded from AWS (the label is missing):
youtube_id,time_start,time_end,split
--6bJUbfpnQ,17,27,test
--8YXc8iCt8,2,12,test
--coBvtS-eQ,57,67,test
--q6ElFyVq0,148,158,test
...
Please provide the required csv file as soon as possible.
No videos are downloaded
I followed the instructions, but no videos were downloaded; the folders are empty.
Question Regarding kinetics-400 Dataset: What are test videos?
Hello, I'm new to the field of Action recognition and have a question regarding the dataset split. Specifically for the kinetics-400 dataset, in the paper "Unmasked Teacher: Towards Training-Efficient Video Foundation Models," they provide the following summary for the number of training and validation data:
In the Video Swin Transformer paper, they also describe the kinetics-400 dataset as follows:
Both papers state that Kinetics-400 consists of approximately 240k training videos and 20k validation videos. However, the CSV file provided in this GitHub repository contains around 40k test videos that are not mentioned in the papers. Could you please clarify what these test videos are?
Additionally, the link to https://deepmind.com/research/open-source/kinetics is not working correctly. Has the official project page been removed?
I would appreciate insights from those who have continued their research in the field of action recognition and are familiar with the Kinetics dataset.
Error when using mmaction2's RGB frame extraction tool
Very serious problem with k600 test set
After downloading, the training set contains a lot of videos, but the actual number of correct videos is only a little over 3,000.
k400
issue
Hello, I didn't find kinetics-dataset on the official website
md5sum
Thanks for this great repo.
It would be much better if md5sum checksums were provided for verifying the files.
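Until official checksums exist, anyone with a known-good copy can generate them locally. A minimal sketch with the standard `hashlib` module that writes lines in the layout `md5sum -c` reads (hash, two spaces, file name). `md5sum` and `manifest_line` are hypothetical helper names.

```python
import hashlib
import os


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest_line(path):
    """Format one line the way the md5sum command-line tool prints it."""
    return f"{md5sum(path)}  {os.path.basename(path)}"
```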
High resolution of videos: is it necessary?
I would like to ask about the resolution of the videos in the dataset. I downloaded some tar.gz files to inspect the video resolutions and noticed that some videos are in 720p. I think this is unnecessary. In most models I know, the first operation performed on a dataset is to resize it, for example to 227x227, so a resolution of 720p is a waste of space. So:
- Is there a reason for having such high resolutions in the dataset?
- Have you tried running a model on videos at the original high resolution and at a lower resolution, and did you notice any difference?
In a similar situation, for the Sports1M dataset from YouTube, the authors suggest storing the low-resolution videos.
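For readers who want to store smaller copies locally, a downscaling pass with ffmpeg is one option. The sketch below only builds the command line, using ffmpeg's `scale` filter with `-2` so the width follows the aspect ratio and stays even. The target height of 256 and the name `downscale_command` are assumptions for illustration, not a recommendation from the dataset authors.

```python
def downscale_command(src, dst, height=256):
    """Build an ffmpeg command that rescales src to the given height."""
    return [
        "ffmpeg",
        "-i", src,
        "-vf", f"scale=-2:{height}",  # -2: keep aspect ratio, even width
        "-c:a", "copy",               # leave the audio stream untouched
        dst,
    ]


# The list can be passed to subprocess.run(command, check=True) to run it.
command = downscale_command("in.mp4", "out.mp4")
```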
HTTP request sent, awaiting response... 404 Not Found
bash download.sh k400_..._path.txt always returns a 404 Not Found error.
It looks like wget "$one" doesn't work.
https://stackoverflow.com/questions/7623698/wget-cant-download-404-error
I have tried those methods and it just didn't work.
Could you give me some advice?
No K600 train.csv and val.csv provided
@kinetics-cvdf Could you provide train.csv and val.csv instead of train.txt and val.txt please? Thanks in advance!