ferreirafabio / video2tfrecord

Easily convert RGB video data (e.g. .avi) to the TensorFlow tfrecords file format for training, e.g., a neural network in TensorFlow. This implementation allows you to limit the number of frames per video that are stored in the tfrecords.

License: MIT License

Python 96.52% Shell 3.48%
video-understanding deep-learning preprocessor tensorflow tensorflow-tfrecords opencv neural-network optical-flow

video2tfrecord's Introduction


Description

Easily convert RGB video data (tested with .avi and .mp4) to the TensorFlow tfrecords file format for training, e.g., a neural network in TensorFlow. Because of common hardware/GPU RAM limitations in deep learning, this implementation lets you limit the number of frames per video stored in the tfrecords, or simply use all video frames. The code automatically chooses the frame step size such that the selected frames are spaced evenly across the video.
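
For illustration, evenly spaced frame indices can be computed as in the following minimal sketch (this mirrors the idea described above; the exact selection logic inside video2tfrecord.py may differ):

import numpy as np

def select_frame_indices(total_frames, n_frames_per_video):
    # pick n_frames_per_video indices spread evenly from the first to the last frame
    if n_frames_per_video == "all" or n_frames_per_video >= total_frames:
        return list(range(total_frames))
    return np.linspace(0, total_frames - 1, num=n_frames_per_video, dtype=int).tolist()

print(select_frame_indices(100, 5))  # -> [0, 24, 49, 74, 99]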

The implementation offers the option to include optical flow (currently OpenCV's calcOpticalFlowFarneback) as an additional channel in the tfrecords data. It can easily be extended in this regard, for example by swapping the currently used optical flow algorithm for a different one. Accompanying the code, we've also added a small example with two .mp4 files from which two tfrecords batches are created (1 video per tfrecords file). To access the examples, make sure to use the GitHub repo instead of the pip package.

This implementation was created during a research project and grew organically over time. We therefore invite users who encounter bugs to submit pull requests with fixes.

Installation

Run the following command:

pip install video2tfrecord 

Writing (video) to tfrecord

After installing the package, execute the following example code to start the video-to-tfrecord conversion:

from video2tfrecord import convert_videos_to_tfrecord

convert_videos_to_tfrecord(source_path, destination_path, n_videos_in_record, n_frames_per_video, "*.avi") 

where n_videos_in_record is the number of videos in a single tfrecord file, n_frames_per_video is the number of frames to be stored per video, and source_path is the directory containing your .avi video files. Set n_frames_per_video="all" if you want all video frames stored in the tfrecord file (keep in mind that the tfrecord files can become very large).
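
For example, a complete call could look like the following (the paths and parameter values are illustrative only):

from video2tfrecord import convert_videos_to_tfrecord

# convert all .avi files in ./videos, writing 5 videos per tfrecords file
# and storing 20 frames per video; output goes to ./output
convert_videos_to_tfrecord(source_path="./videos",
                           destination_path="./output",
                           n_videos_in_record=5,
                           n_frames_per_video=20,
                           file_suffix="*.avi")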

Reading from tfrecord

See test.py for an example.
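
As a rough orientation, the stored frames can also be read back without the queue-based pipeline used in test.py. The following is a minimal sketch, assuming the TF 1.x API this package was tested with and the feature layout described under "Parameters and storage details" below; the feature keys, frame counts and image dimensions in the call are placeholders and should be verified against test.py:

import numpy as np
import tensorflow as tf

def read_videos_from_tfrecord(tfrecord_path, n_frames, height, width, depth):
    # returns a list of numpy arrays of shape (n_frames, height, width, depth)
    videos = []
    for record in tf.python_io.tf_record_iterator(path=tfrecord_path):
        example = tf.train.Example()
        example.ParseFromString(record)
        frames = []
        for i in range(n_frames):
            blob = example.features.feature["blobs/" + str(i)].bytes_list.value[0]
            frames.append(np.frombuffer(blob, dtype=np.uint8).reshape(height, width, depth))
        videos.append(np.stack(frames))
    return videos

videos = read_videos_from_tfrecord("./example/output/batch_1_of_2.tfrecords",
                                   n_frames=5, height=128, width=128, depth=4)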

Manual installation

If you want to set up your installation manually, use the install scripts provided in the repository.

The package has been successfully tested with:

  • Python 3.4, 3.5 and 3.6
  • tensorflow 1.5.0
  • opencv-python 3.4.0.12
  • numpy 1.14.0

OpenCV troubleshooting

If you encounter issues with OpenCV (e.g. because you use a different version), you can build OpenCV locally from the repository [1] (e.g. refer to the StackOverflow thread under [2]). Make sure to use the specified version, since functions within the OpenCV framework may change between versions.

Parameters and storage details

By adjusting the parameters at the top of the code you can control:

  • input dir (containing all the video files)
  • output dir (to which the tfrecords should be saved)
  • resolution of the images
  • video file suffix (e.g. *.avi) as a glob pattern (include the asterisk!)
  • number of frames per video that are actually stored in the tfrecord entries (can be smaller than the real number of frames)
  • image color depth
  • if optical flow should be added as a 4th channel
  • number of videos a tfrecords file should contain

The videos are stored as features in the tfrecords. Every video instance contains the following data (a sketch of how such an entry could be assembled follows the list):

  • feature[path] (a byte string, where path is "blobs/i" with 0 <= i <= number of images per video)
  • feature['height'] (where height is the image height, e.g. 128)
  • feature['width'] (where width is the image width, e.g. 128)
  • feature['depth'] (where depth is the image depth, e.g. 4 if optical flow is used)
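
A minimal sketch of how one video could be packed into such an Example, assuming uint8 frames and the feature keys listed above (the actual implementation in video2tfrecord.py may differ in details such as the metadata feature types):

import tensorflow as tf

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def video_to_example(frames):
    # frames: uint8 numpy array of shape (n_frames, height, width, depth)
    n_frames, height, width, depth = frames.shape
    feature = {"height": _int64_feature(height),
               "width": _int64_feature(width),
               "depth": _int64_feature(depth)}
    for i in range(n_frames):
        feature["blobs/" + str(i)] = _bytes_feature(frames[i].tostring())
    return tf.train.Example(features=tf.train.Features(feature=feature))

# writing one example to disk (hypothetical file name):
# writer = tf.python_io.TFRecordWriter("batch_1_of_1.tfrecords")
# writer.write(video_to_example(frames).SerializeToString())
# writer.close()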

Future work:

  • supervised learning: allow including a label file (e.g. .csv) that specifies the <videoid>-to-<label> relationship in each row, and store the label information in the records
  • use compression mode in TFRecordWriter (options = tf.python_io.TFRecordOptions(tf.python_io.TFRecordCompressionType.GZIP))
  • improve documentation
  • add the option to use all video frames instead of just a subset (use n_frames_per_video="all")
  • write small exemplary script for loading the tfrecords + meta-data into a TF QueueRunner (see test.py)
  • replace Farneback optical flow with a more sophisticated method, say dense trajectories
  • Question to users: would it make sense to offer video2tfrecord as a web service (i.e. upload videos, get tfrecords back)?

Additional contributors: Jonas Rothfuss (https://github.com/jonasrothfuss/)

video2tfrecord's People

Contributors

alvarocavalcante, dependabot[bot], energee, ferreirafabio


video2tfrecord's Issues

Unknown command line flag 'f'

Hi.
I am trying to run the unit test with the 2 included videos using
convert_videos_to_tfrecord(source_path=localdir, destination_path=local_out,
n_videos_in_record=n_videos_per_record,
n_frames_per_video=n_frames,
dense_optical_flow=True,
width=width,
height=height,
file_suffix="*.mp4")
and I get this error:

in create(self)
      7     width=width,
      8     height=height,
----> 9     file_suffix="*.mp4")
     10
     11 filenames = gfile.Glob(os.path.join(local_out, "*.tfrecords"))

/srv/hops/anaconda/anaconda/envs/demo_tensorflow_jdowlin0/lib/python3.6/site-packages/video2tfrecord/video2tfrecord.py in convert_videos_to_tfrecord(source_path, destination_path, n_videos_in_record, n_frames_per_video, file_suffix, dense_optical_flow, width, height, color_depth, video_filenames)
    174 total_batch_number = int(math.ceil(len(filenames) / n_videos_in_record))
    175 print('Batch ' + str(i + 1) + '/' + str(total_batch_number) + " completed")
--> 176 assert data.size != 0, 'something went wrong during video to numpy conversion'
    177 save_numpy_to_tfrecords(data, destination_path, 'batch_',
    178                         n_videos_in_record, i + 1, total_batch_number,

AssertionError: something went wrong during video to numpy conversion

Any ideas what this could be?

AssertionError: something went wrong during video to numpy conversion

Hi,
While running it on Colab I am getting the assertion error.
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)

in ()
      1 from video2tfrecord import convert_videos_to_tfrecord
      2
----> 3 convert_videos_to_tfrecord('/content/drive/My Drive/XVD/videos/', 'content', 3, 10, "*.mp4")

/usr/local/lib/python3.6/dist-packages/video2tfrecord/video2tfrecord.py in convert_videos_to_tfrecord(source_path, destination_path, n_videos_in_record, n_frames_per_video, file_suffix, dense_optical_flow, width, height, color_depth, video_filenames)
    174 total_batch_number = int(math.ceil(len(filenames) / n_videos_in_record))
    175 print('Batch ' + str(i + 1) + '/' + str(total_batch_number) + " completed")
--> 176 assert data.size != 0, 'something went wrong during video to numpy conversion'
    177 save_numpy_to_tfrecords(data, destination_path, 'batch_',
    178                         n_videos_in_record, i + 1, total_batch_number,

AssertionError: something went wrong during video to numpy conversion

Any idea as to what might be causing this and how to fix it?

High memory use

Hello guys, I found another issue when trying to use the project in my context. The problem is that I'm running the code on a computer with only 8 GB of RAM, and I would like to put as many videos as possible into each tfrecords file.

It turns out that the memory consumption increases as the program runs, until it simply crashes. The "data" array keeps occupying memory even when a new iteration of the for loop starts...

To solve this, I basically added the following lines of code:

data = None

for i, batch in enumerate(filenames_split):
    if data is not None:
        data = None  # drop the reference to the previous batch so its memory can be freed
When the "data" array is assigned to None, the memory is released instantly. With this simple modification, I'm able to put about 140 videos into each file!

Do you think that I could create a PR with this modification?

Too much storage required

Hello everyone, the video2tfrecord library is working very well, great work!

I've made just a few small modifications to run the code on TF 2, and one of my suggestions is that we could detect the TensorFlow version and adjust the code dynamically.

Another interesting point is the compression of the video frames. Using the default code, each tfrecords file with 5 videos and 25 frames is about 340 MB. At this rate, just my test set would take about 21 GB...

To solve this, I changed the code that generates the image bytes from image.tostring() to tf.image.encode_jpeg(image).numpy(). With this change, the same tfrecords file is only about 18 MB, which is much more efficient for storing video datasets. This could be a second modification...
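
For illustration, the size difference between the two encodings can be seen with a small sketch like this (assuming TF 2.x eager execution and uint8 frames; where exactly this hooks into video2tfrecord.py is not shown):

import numpy as np
import tensorflow as tf

frame = np.zeros((128, 128, 3), dtype=np.uint8)  # dummy uint8 RGB frame

# default behaviour: raw bytes (equivalent to the image.tostring() call mentioned above)
raw_bytes = frame.tobytes()

# suggested change: JPEG-compressed bytes, usually far smaller
jpeg_bytes = tf.image.encode_jpeg(frame).numpy()

print(len(raw_bytes), len(jpeg_bytes))

Frames stored this way would then have to be decoded with tf.io.decode_jpeg instead of tf.decode_raw when reading the tfrecords back.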

What do you think about this?

assertion error

I use video2tfrecord to process some short videos; I want to turn them into tfrecords so that I can feed them into TensorFlow.
However, it responds with the error
"AssertionError: something went wrong during video to numpy conversion"

Could you please help me with this issue?
Thank you in advance.

Problem in numpy

I use video2tfrecord to process some short videos; I want to turn them into tfrecords so that I can feed them into TensorFlow. I use the video2tfrecord code from GitHub: https://github.com/ferreirafabio/video2tfrecord But when I run it, the error says something went wrong with numpy. Here is my code (Python):

from video2tfrecord import convert_videos_to_tfrecord

convert_videos_to_tfrecord(source_path="video", destination_path="video", n_videos_in_record=1, n_frames_per_video=5, file_suffix="*.mp4")

Here is the error:

AssertionError                            Traceback (most recent call last)

in ()
      1 from video2tfrecord import convert_videos_to_tfrecord
      2
----> 3 convert_videos_to_tfrecord(source_path="video", destination_path="video", n_videos_in_record=1, n_frames_per_video=5, file_suffix="*.mp4")

/usr/local/lib/python3.6/dist-packages/video2tfrecord/video2tfrecord.py in convert_videos_to_tfrecord(source_path, destination_path, n_videos_in_record, n_frames_per_video, file_suffix, dense_optical_flow, width, height, color_depth, video_filenames)
    174 total_batch_number = int(math.ceil(len(filenames) / n_videos_in_record))
    175 print('Batch ' + str(i + 1) + '/' + str(total_batch_number) + " completed")
--> 176 assert data.size != 0, 'something went wrong during video to numpy conversion'
    177 save_numpy_to_tfrecords(data, destination_path, 'batch_',
    178                         n_videos_in_record, i + 1, total_batch_number,

AssertionError: something went wrong during video to numpy conversion

new problem

Here is the error info:

pi@Hans:~/Desktop/testtrans/video2tfrecord$ python3 test.py
Total videos found: 2
1 of 1 videos within batch processed:  ./example/input/100998.mp4
Batch 1/2 completed
Writing ./example/output/batch_1_of_2.tfrecords
1 of 1 videos within batch processed:  ./example/input/100999.mp4
Batch 2/2 completed
Writing ./example/output/batch_2_of_2.tfrecords
2018-10-03 23:59:48.840133: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
WARNING:tensorflow:From /home/pi/.local/lib/python3.6/site-packages/tensorflow/python/training/input.py:187: QueueRunner.__init__ (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
WARNING:tensorflow:From /home/pi/.local/lib/python3.6/site-packages/tensorflow/python/training/input.py:187: add_queue_runner (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
WARNING:tensorflow:From test.py:106: start_queue_runners (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
E
======================================================================
ERROR: test_example1 (__main__.Testvideo2tfrecord)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/pi/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1292, in _do_call
    return fn(*args)
  File "/home/pi/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1277, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/home/pi/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1367, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 3686400 values, but the requested shape has 691200
	 [[{{node Reshape_5}} = Reshape[T=DT_UINT8, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](DecodeRaw_1, Reshape_2/shape)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 30, in test_example1
    get_number_of_records(filenames, n_frames))
  File "test.py", line 109, in get_number_of_records
    video = sess_valid.run([image_seq_tensor_val])
  File "/home/pi/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 887, in run
    run_metadata_ptr)
  File "/home/pi/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1110, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/pi/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1286, in _do_run
    run_metadata)
  File "/home/pi/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1308, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 3686400 values, but the requested shape has 691200
	 [[{{node Reshape_5}} = Reshape[T=DT_UINT8, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](DecodeRaw_1, Reshape_2/shape)]]

Caused by op 'Reshape_5', defined at:
  File "test.py", line 123, in <module>
    unittest.main()
  File "/usr/lib/python3.6/unittest/main.py", line 95, in __init__
    self.runTests()
  File "/usr/lib/python3.6/unittest/main.py", line 256, in runTests
    self.result = testRunner.run(self.test)
  File "/usr/lib/python3.6/unittest/runner.py", line 176, in run
    test(result)
  File "/usr/lib/python3.6/unittest/suite.py", line 84, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib/python3.6/unittest/suite.py", line 122, in run
    test(result)
  File "/usr/lib/python3.6/unittest/suite.py", line 84, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib/python3.6/unittest/suite.py", line 122, in run
    test(result)
  File "/usr/lib/python3.6/unittest/case.py", line 653, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib/python3.6/unittest/case.py", line 605, in run
    testMethod()
  File "test.py", line 30, in test_example1
    get_number_of_records(filenames, n_frames))
  File "test.py", line 100, in get_number_of_records
    image_seq_tensor_val = read_and_decode(filename_queue_val, n_frames)
  File "test.py", line 74, in read_and_decode
    image = tf.reshape(image, [1, height, width, num_depth])
  File "/home/pi/.local/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 6296, in reshape
    "Reshape", tensor=tensor, shape=shape, name=name)
  File "/home/pi/.local/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/pi/.local/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/home/pi/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3272, in create_op
    op_def=op_def)
  File "/home/pi/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1768, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Input to reshape is a tensor with 3686400 values, but the requested shape has 691200
	 [[{{node Reshape_5}} = Reshape[T=DT_UINT8, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](DecodeRaw_1, Reshape_2/shape)]]


----------------------------------------------------------------------
Ran 1 test in 11.783s

FAILED (errors=1)

I tried to convert only 2 videos, so there should be no problem with the numpy array, but I get this problem instead.
I hope you can help me solve it.
Best wishes to you and your video2tfrecord.
