
Driver Attention Prediction Model

Downloading Dataset:

The Berkeley DeepDrive Attention dataset can be downloaded here: https://bdd-data.berkeley.edu/. Click on "Download Dataset" to get to the user portal, where you will find the BDD-Attention dataset listed together with the other Berkeley DeepDrive video datasets.

Project Introduction:

This project accompanies the paper Predicting Driver Attention in Critical Situations (https://arxiv.org/abs/1711.06406).

Demo:

Demo image

Video demos

Video demo cover

Model structure

Model structure image

Using Our Code:

Dependencies

The code was written with TensorFlow 1.5, Keras 2.1.5 and some other common packages. A Docker image (blindgrandpa/tf150_kr215) was prepared for running the code. The Dockerfile of that image is at ./docker_images/tf150_kr215/ in this repo and lists all the dependencies. To use this Docker image to run our code, you need to have nvidia-docker installed.

Use our model to do inference on your videos

If you want to use our model to generate predicted driver attention maps for your videos, please follow the steps below.

  1. Put your videos into the directory ./data/inference/camera_videos/

  2. Parse your videos into frames by running the following command. video_suffix should be the suffix of your video files. sample_rate sets how many frames per second will get predicted attention maps; 3 Hz is recommended. We assume that there are no underscores in the names of your video files. (For intuition, a sketch of what this step does follows the command below.)

python parse_videos.py \
--video_dir=data/inference/camera_videos \
--image_dir=data/inference/camera_images \
--sample_rate=3 \
--video_suffix=.mp4
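
The following is a minimal sketch of what this parsing step does, assuming OpenCV is available; it is not the repo's parse_videos.py, whose exact logic may differ.

# A sketch only, not the repo's parse_videos.py: sample frames at
# sample_rate Hz and save them as VideoName_TimeInMilliseconds.jpg.
import os
import glob
import cv2

def parse_video(video_path, image_dir, sample_rate=3):
    os.makedirs(image_dir, exist_ok=True)
    name = os.path.splitext(os.path.basename(video_path))[0]  # assumed to contain no '_'
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    step = max(int(round(fps / sample_rate)), 1)  # keep every step-th frame
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % step == 0:
            ms = int(cap.get(cv2.CAP_PROP_POS_MSEC))  # approximate timestamp in ms
            cv2.imwrite(os.path.join(image_dir, '%s_%d.jpg' % (name, ms)), frame)
        index += 1
    cap.release()

for path in glob.glob('data/inference/camera_videos/*.mp4'):
    parse_video(path, 'data/inference/camera_images', sample_rate=3)
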
  3. Convert the video frames to TensorFlow tfrecords files. All the video frames will be divided into n_divides tfrecords files, and the frames of each video will be split into groups of at most longest_seq frames. You can lower longest_seq according to the memory size of your computer. (The grouping rule is sketched after the command below.)

python write_tfrecords_for_inference.py \
--data_dir=data/inference \
--n_divides=2 \
--longest_seq=35
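
The grouping rule itself is straightforward; here is a sketch of the chunking logic only, not the repo's tfrecords writer:

def split_into_sequences(frames, longest_seq=35):
    # Split one video's frame list into chunks of at most longest_seq frames.
    return [frames[i:i + longest_seq] for i in range(0, len(frames), longest_seq)]

# e.g. a 100-frame video yields chunks of 35, 35 and 30 frames
print([len(chunk) for chunk in split_into_sequences(list(range(100)))])  # [35, 35, 30]
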
  4. Download the pre-trained weights. Download this zip file and unzip it to ./

  5. Download the pre-trained weights of AlexNet. Download bvlc_alexnet.npy and put it at ./

  6. Predict driver attention maps by running the following command. The predicted attention maps will be written to ./pretrained_models/model_for_inference/prediction_iter_0/ and named in the pattern "VideoName_TimeInMilliseconds.jpg". (A small helper for working with these names follows the command below.)

python infer.py \
--data_dir=data \
--model_dir=pretrained_models/model_for_inference
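
Since video names contain no underscores, the predictions can be collated with a small helper like this hypothetical one:

import os
import glob

def parse_prediction_name(path):
    # 'VideoName_TimeInMilliseconds.jpg' -> ('VideoName', milliseconds)
    stem = os.path.splitext(os.path.basename(path))[0]
    video_name, ms = stem.rsplit('_', 1)  # safe because video names have no '_'
    return video_name, int(ms)

for path in sorted(glob.glob('pretrained_models/model_for_inference/prediction_iter_0/*.jpg')):
    print(parse_prediction_name(path))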

Train our model from scratch with your data

If you want to train our model from scratch with your videos, please follow the steps below.

  1. Put the camera videos and gaze map videos of your training and validation sets into the following directories: ./data/training/camera_videos/, ./data/training/gazemap_videos/, ./data/validation/camera_videos/ and ./data/validation/gazemap_videos/.

  2. Parse your videos into frames by running the following commands. video_suffix should be the suffix of your video files. sample_rate sets how many frames per second will get predicted attention maps; 3 Hz is recommended. We assume that there are no underscores in the names of your video files.

python parse_videos.py \
--video_dir=data/training/camera_videos \
--image_dir=data/training/camera_images \
--sample_rate=3 \
--video_suffix=.mp4

python parse_videos.py \
--video_dir=data/training/gazemap_videos \
--image_dir=data/training/gazemap_images \
--sample_rate=3 \
--video_suffix=.mp4

python parse_videos.py \
--video_dir=data/validation/camera_videos \
--image_dir=data/validation/camera_images \
--sample_rate=3 \
--video_suffix=.mp4

python parse_videos.py \
--video_dir=data/validation/gazemap_videos \
--image_dir=data/validation/gazemap_images \
--sample_rate=3 \
--video_suffix=.mp4
  3. Convert the video frames to TensorFlow tfrecords files, as in the inference steps above. All the video frames will be divided into n_divides tfrecords files, and the frames of each video will be split into groups of at most longest_seq frames. You can lower longest_seq according to the memory size of your computer.

python write_tfrecords_for_inference.py \
--data_dir=data/training \
--n_divides=2 \
--longest_seq=35

python write_tfrecords_for_inference.py \
--data_dir=data/validation \
--n_divides=2 \
--longest_seq=35
  4. Download the pre-trained weights of AlexNet. Download bvlc_alexnet.npy and put it at ./

  5. Since our model uses a pretrained AlexNet as a feature extractor and keeps this module fixed during training, we first calculate the AlexNet features of the input camera videos to save training time. (The caching pattern is sketched after the commands below.)

python make_feature_maps.py \
--data_dir=data/training \
--model_dir=pretrained_models/model_for_inference

python make_feature_maps.py \
--data_dir=data/validation \
--model_dir=pretrained_models/model_for_inference
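
The general pattern behind this step is to run every frame once through the frozen extractor and cache the result on disk, so training only reads precomputed features. A sketch of that pattern, where extract_alexnet_features is a hypothetical stand-in for the repo's AlexNet forward pass:

import os
import glob
import numpy as np

def cache_features(image_dir, feature_dir, extract_alexnet_features):
    # Compute and store one feature map per frame; skip frames already done.
    os.makedirs(feature_dir, exist_ok=True)
    for path in glob.glob(os.path.join(image_dir, '*.jpg')):
        stem = os.path.splitext(os.path.basename(path))[0]
        out_path = os.path.join(feature_dir, stem + '.npy')
        if os.path.exists(out_path):
            continue
        features = extract_alexnet_features(path)  # e.g. a 256-channel feature map
        np.save(out_path, features)
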
  6. Generate TensorFlow tfrecords files that contain camera frames, AlexNet features and ground-truth gaze maps. image_size is the size of the image shown in TensorBoard; it is not the size of the image fed to the model.

python write_tfrecords.py \
--data_dir=data/training \
--n_divides=2 \
--feature_name=alexnet \
--image_size 288 512 \
--longest_seq=35

python write_tfrecords.py \
--data_dir=data/validation \
--n_divides=2 \
--feature_name=alexnet \
--image_size 288 512 \
--longest_seq=35
  7. Train the model. Replace a_name_for_this_experiment with a folder name of your choice.

python train.py \
--data_dir=data \
--model_dir=logs/a_name_for_this_experiment \
--batch_size=10 \
--n_steps=6 \
--feature_name=alexnet \
--train_epochs=500 \
--epochs_before_validation=3 \
--image_size 288 512 \
--feature_map_channels=256 \
--quick_summary_period=20 \
--slow_summary_period=100
  8. Track the training in TensorBoard.

tensorboard --logdir=logs/a_name_for_this_experiment

In the Images tab you can see the training input camera frame, the ground-truth gaze map and the predicted attention map (for the training set).

Fine-tune our model with your data

If you want to fine-tune our model with your data, follow the steps described in the previous section, Train our model from scratch with your data, and add these additional steps before step 7:

6.1. Download the pre-trained weights. Download this zip file and unzip it to ./

6.2. Copy the pre-trained model to a new folder where you want to save your fine-tuned model:

cp pretrained_models/model_for_finetuning/* logs/a_name_for_this_experiment/

Test a trained model with your testing data

If you want to test a trained model with your own testing data, first prepare the data by following steps 1 to 6 described in the section Train our model from scratch with your data, replacing the folder name training or validation in the instructions with testing, so that all the files related to your testing data are under the folder data/testing. Assuming the log files of the trained model are in the folder logs/a_name_for_this_experiment, run the following command.

python predict.py \
--data_dir=data \
--model_dir=logs/a_name_for_this_experiment \
--batch_size=1 \
--feature_name=alexnet \
--feature_map_channels=256

The predicted attention maps will be at ./logs/a_name_for_this_experiment/prediction_iter_*/. The files will be named in the pattern "VideoName_TimeInMilliseconds.jpg".

driver_attention_prediction's People

Contributors

kevintli, pascalxia, taklee96

driver_attention_prediction's Issues

Inference on videos gives black gazemap

Hello,
I am trying to use your work as a baseline, but when I run inference on my own data the output is black images. I followed the steps described and everything seems right.
My impression is that something is missing in the pre-processing phase (so it may be about my data). Could you please give some feedback on it?
I really appreciate your work,
Best,
Gabriele

Skipped frames on some videos

I have noticed skipped frames in some videos, e.g. training video 1277.mp4.
Is this also the case in the original BDD-100K videos?
Thanks,

Question regarding testing docs

The guidelines for testing a model say to first follow steps 1-6 of the training process with updated folder names and finally to run predict.py with the fine-tuned model.

We did this for our fine-tuned model and it produced valid results. One thing we stumbled over is step 5 of the training process, where the inference model is used:

python make_feature_maps.py \
--data_dir=data/testing \
--model_dir=pretrained_models/model_for_inference

Is this the correct usage, or should we use our fine-tuned model for this step as well? I already gave it a try but it results in a similar error as in #7.

'Applications' directory in the data directory

I am trying to train the model from scratch on the same dataset.
In step 5:

python make_feature_maps.py \
--data_dir=data/training \
--model_dir=pretrained_models/model_for_inference

the directory 'applications' is required. I am not sure what that means. Also, I get "feature map not found" for each image.

Inference fails when using finetuned models

We use your model as a baseline and fine-tuned it with our own data following the guidelines for training and fine-tuning. We tested our model following the guidelines for testing and it produces valid results.

However, if we want to use our fine-tuned model to predict attention maps on unseen data, the inference fails with this error: NotFoundError (see above for traceback): Key encoder/Variable_1 not found in checkpoint.

Is this caused by wrong usage on our side, or is it a bug in the code?

For completeness, here is the entire stack trace of the failure:

Convert frames to tf records...
/usr/local/lib/python3.5/dist-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters

  0%|          | 0/33 [00:00<?, ?it/s]
100%|##########| 33/33 [00:17<00:00,  1.89it/s]

  0%|          | 0/33 [00:00<?, ?it/s]
100%|##########| 33/33 [00:15<00:00,  2.07it/s]
No. of /tmp/data/inference videos: 66

Generate ROI predictions...
/usr/local/lib/python3.5/dist-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
INFO:tensorflow:Using config: {'_log_step_count_steps': 10, '_is_chief': True, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_num_ps_replicas': 0, '_task_type': 'worker', '_tf_random_seed': None, '_num_worker_replicas': 1, '_model_dir': '/tmp/weights/finetuned/', '_task_id': 0, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f78493e7080>, '_session_config': None, '_master': '', '_keep_checkpoint_every_n_hours': 10000, '_service': None, '_keep_checkpoint_max': 5, '_save_summary_steps': inf}
2020-04-22 14:47:05.293285: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2020-04-22 14:47:05.384798: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:895] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-22 14:47:05.385236: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties:
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.392
pciBusID: 0000:01:00.0
totalMemory: 3.94GiB freeMemory: 3.62GiB
2020-04-22 14:47:05.385257: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-04-22 14:47:05.962379: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
INFO:tensorflow:Restoring parameters from /tmp/weights/finetuned/best_ckpt/model.ckpt-3133
2020-04-22 14:47:06.014079: W tensorflow/core/framework/op_kernel.cc:1198] Not found: Key encoder/Variable_1 not found in checkpoint
2020-04-22 14:47:06.014691: W tensorflow/core/framework/op_kernel.cc:1198] Not found: Key encoder/Variable not found in checkpoint
2020-04-22 14:47:06.015068: W tensorflow/core/framework/op_kernel.cc:1198] Not found: Key encoder/Variable_2 not found in checkpoint
2020-04-22 14:47:06.015023: W tensorflow/core/framework/op_kernel.cc:1198] Not found: Key encoder/Variable_4 not found in checkpoint
2020-04-22 14:47:06.015783: W tensorflow/core/framework/op_kernel.cc:1198] Not found: Key encoder/Variable_5 not found in checkpoint
2020-04-22 14:47:06.015942: W tensorflow/core/framework/op_kernel.cc:1198] Not found: Key encoder/Variable_6 not found in checkpoint
2020-04-22 14:47:06.016060: W tensorflow/core/framework/op_kernel.cc:1198] Not found: Key encoder/Variable_7 not found in checkpoint
2020-04-22 14:47:06.016010: W tensorflow/core/framework/op_kernel.cc:1198] Not found: Key encoder/Variable_3 not found in checkpoint
2020-04-22 14:47:06.016549: W tensorflow/core/framework/op_kernel.cc:1198] Not found: Key encoder/Variable_9 not found in checkpoint
2020-04-22 14:47:06.017427: W tensorflow/core/framework/op_kernel.cc:1198] Not found: Key encoder/Variable_8 not found in checkpoint
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1350, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1329, in _run_fn
    status, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: Key encoder/Variable_1 not found in checkpoint
	 [[Node: save/RestoreV2_5 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_5/tensor_names, save/RestoreV2_5/shape_and_slices)]]
	 [[Node: save/RestoreV2_13/_25 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_74_save/RestoreV2_13", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "infer.py", line 209, in <module>
    main(argv=sys.argv)
  File "infer.py", line 185, in main
    for res in predict_generator:
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 430, in predict
    hooks=input_hooks + hooks) as mon_sess:
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 787, in __init__
    stop_grace_period_secs=stop_grace_period_secs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 511, in __init__
    self._sess = _RecoverableSession(self._coordinated_creator)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 972, in __init__
    _WrappedSession.__init__(self, self._create_session())
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 977, in _create_session
    return self._sess_creator.create_session()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 668, in create_session
    self.tf_sess = self._session_creator.create_session()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 440, in create_session
    init_fn=self._scaffold.init_fn)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/session_manager.py", line 273, in prepare_session
    config=config)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/session_manager.py", line 189, in _restore_checkpoint
    saver.restore(sess, checkpoint_filename_with_path)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1686, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1128, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1344, in _do_run
    options, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1363, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key encoder/Variable_1 not found in checkpoint
	 [[Node: save/RestoreV2_5 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_5/tensor_names, save/RestoreV2_5/shape_and_slices)]]
	 [[Node: save/RestoreV2_13/_25 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_74_save/RestoreV2_13", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

Caused by op 'save/RestoreV2_5', defined at:
  File "infer.py", line 209, in <module>
    main(argv=sys.argv)
  File "infer.py", line 185, in main
    for res in predict_generator:
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 430, in predict
    hooks=input_hooks + hooks) as mon_sess:
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 787, in __init__
    stop_grace_period_secs=stop_grace_period_secs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 511, in __init__
    self._sess = _RecoverableSession(self._coordinated_creator)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 972, in __init__
    _WrappedSession.__init__(self, self._create_session())
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 977, in _create_session
    return self._sess_creator.create_session()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 668, in create_session
    self.tf_sess = self._session_creator.create_session()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 431, in create_session
    self._scaffold.finalize()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 210, in finalize
    self._saver = training_saver._get_saver_or_default()  # pylint: disable=protected-access
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 821, in _get_saver_or_default
    saver = Saver(sharded=True, allow_empty=True)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1239, in __init__
    self.build()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1248, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1284, in _build
    build_save=build_save, build_restore=build_restore)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 759, in _build_internal
    restore_sequentially, reshape)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 471, in _AddShardedRestoreOps
    name="restore_shard"))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 428, in _AddRestoreOps
    tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 268, in restore_op
    [spec.tensor.dtype])[0])
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_io_ops.py", line 1031, in restore_v2
    shape_and_slices=shape_and_slices, dtypes=dtypes, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1625, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

NotFoundError (see above for traceback): Key encoder/Variable_1 not found in checkpoint
	 [[Node: save/RestoreV2_5 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_5/tensor_names, save/RestoreV2_5/shape_and_slices)]]
	 [[Node: save/RestoreV2_13/_25 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_74_save/RestoreV2_13", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

UnboundLocalError: local variable 'ckpt_path' referenced before assignment

I am facing this issue while generating AlexNet features.

To reproduce:

python make_feature_maps.py \
--data_dir=data/training \
--model_dir=pretrained_models/model_for_inference

The traceback:

Traceback (most recent call last):
  File "make_feature_maps.py", line 186, in <module>
    main(argv=sys.argv)
  File "make_feature_maps.py", line 163, in main
    checkpoint_path=ckpt_path)
UnboundLocalError: local variable 'ckpt_path' referenced before assignment
