tobias-fischer / rt_gene

RT-GENE: Real-Time Eye Gaze and Blink Estimation in Natural Environments

Home Page: http://www.imperial.ac.uk/personal-robotics

License: Other

CMake 0.63% Python 80.15% Jupyter Notebook 14.81% MATLAB 4.31% Shell 0.10%
gaze-estimation gaze head-pose-estimation robots robotics human-robot-interaction blink-detection-algorithm rt-gene dataset ros

rt_gene's Introduction

RT-GENE & RT-BENE: Real-Time Eye Gaze and Blink Estimation in Natural Environments


This repository contains code and dataset references for two papers: RT-GENE (Gaze Estimation; ECCV2018) and RT-BENE (Blink Estimation; ICCV2019 Workshops).

RT-GENE (Gaze Estimation)

License + Attribution

The RT-GENE code is licensed under CC BY-NC-SA 4.0. Commercial usage is not permitted. If you use this dataset or the code in a scientific publication, please cite the following paper:

Paper abstract

@inproceedings{FischerECCV2018,
author = {Tobias Fischer and Hyung Jin Chang and Yiannis Demiris},
title = {{RT-GENE: Real-Time Eye Gaze Estimation in Natural Environments}},
booktitle = {European Conference on Computer Vision},
year = {2018},
month = {September},
pages = {339--357}
}

This work was supported in part by the Samsung Global Research Outreach program, and in part by the EU Horizon 2020 Project PAL (643783-RIA).

Overview + Accompanying Dataset

The code is split into four parts, each with its own README. There is also an accompanying dataset (alternative link) to the code. For more information, other datasets and more open-source software please visit the Personal Robotics Lab's website: https://www.imperial.ac.uk/personal-robotics/software/.

RT-GENE ROS package

The rt_gene directory contains a ROS package for real-time eye gaze and blink estimation. This contains all the code required at inference time.

RT-GENE inference example

RT-GENE Standalone Version

The rt_gene_standalone directory contains instructions for eye gaze estimation given a set of images. It shares code with the rt_gene package (above), in particular the code in rt_gene/src/rt_gene.

RT-GENE Inpainting

The rt_gene_inpainting directory contains code to inpaint the region covered by the eyetracking glasses.

Inpainting example

RT-GENE Model Training

The rt_gene_model_training directory allows using the inpainted images to train a deep neural network for eye gaze estimation.

Accuracy on RT-GENE dataset

RT-BENE (Blink Estimation)

License + Attribution

The RT-BENE code is licensed under CC BY-NC-SA 4.0. Commercial usage is not permitted. If you use our blink estimation code or dataset, please cite the relevant paper:

@inproceedings{CortaceroICCV2019W,
author={Kevin Cortacero and Tobias Fischer and Yiannis Demiris},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision Workshops},
title = {RT-BENE: A Dataset and Baselines for Real-Time Blink Estimation in Natural Environments},
year = {2019},
}

RT-BENE was supported by the EU Horizon 2020 Project PAL (643783-RIA) and a Royal Academy of Engineering Chair in Emerging Technologies to Yiannis Demiris.

Overview + Accompanying Dataset

The code is split into several parts, each with its own README. There is also an associated RT-BENE dataset. For more information, other datasets and more open-source software please visit the Personal Robotics Lab's website: https://www.imperial.ac.uk/personal-robotics/software/. Please note that a lot of the code is shared with RT-GENE (see above), hence there are many references to RT-GENE below.

Paper overview

RT-BENE ROS package

The rt_gene directory contains a ROS package for real-time eye gaze and blink estimation. This contains all the code required at inference time. For blink estimation, please refer to the estimate_blink.py file.

RT-BENE inference example

RT-BENE Standalone Version

The rt_bene_standalone directory contains instructions for blink estimation given a set of images. It makes use of the code in rt_gene/src/rt_bene.

RT-BENE Model Training

The rt_bene_model_training directory contains the code required to train models with the labels contained in the RT-BENE dataset (see below). We will soon add evaluation code to this directory, too.

RT-BENE Dataset

RT-BENE labels

We manually annotated images contained in the "noglasses" part of the RT-GENE dataset. The RT-BENE dataset on Zenodo contains the eye image patches and associated annotations to train the blink models.


rt_gene's Issues

Incorporate blink estimation

We will present a method for blink estimation based on the RT-GENE dataset at the ICCV2019 workshops. We'll need to clean the code and merge it into this repo.

setup instructions

I am exploring eye tracking solutions for one of my research projects.
Your solution seems interesting.
Could you let me know the setup instructions?

Thanks,

Questions about output data

I currently have two problems.

Problem 1: I'm running the estimate_gaze_standalone file and printing the gaze and headpose data to a txt file. But I find that both the gaze and headpose outputs are two-element arrays like [0.14779022336006165, 0.401142060756778335]. I would like to know what the two numbers in each array represent: are they the directions of the gaze and head pose respectively? Shouldn't a direction be 3D data? (I mean there should be three numbers in each array.)
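For reference, a minimal sketch assuming the two numbers are yaw/pitch-style angles in radians (the same arctan2/arcsin convention that appears in the UT Multi-view snippet later on this page); the function and variable names are illustrative, not taken from the repo:

    import numpy as np

    def angles_to_vector(phi, theta):
        # Assumption: phi is the horizontal angle and theta the vertical angle, in radians.
        # This is the inverse of yaw = arctan2(-x, -z), pitch = arcsin(-y) for a unit vector (x, y, z).
        x = -np.cos(theta) * np.sin(phi)
        y = -np.sin(theta)
        z = -np.cos(theta) * np.cos(phi)
        return np.array([x, y, z])

    phi, theta = 0.148, 0.401   # e.g. the two numbers printed by the script
    print(angles_to_vector(phi, theta))   # a 3D unit direction vector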

Problem 2: The method I'm currently using is to extract frames from the video and store the jpg files in the samples_gaze folder. But the script displays each image after processing it; how can I automatically process all of the stored images without having to manually close the display window for each photo?
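A generic OpenCV sketch of the non-blocking approach (this is not a flag of the existing script; it only illustrates saving visualisations instead of showing them, with hypothetical folder names):

    import glob
    import os
    import cv2

    in_dir, out_dir = "samples_gaze", "samples_gaze_output"
    os.makedirs(out_dir, exist_ok=True)

    for path in sorted(glob.glob(os.path.join(in_dir, "*.jpg"))):
        image = cv2.imread(path)
        # ... run the gaze estimation on `image` and draw the result here ...
        cv2.imwrite(os.path.join(out_dir, os.path.basename(path)), image)
    # No cv2.imshow()/cv2.waitKey(0), so no window has to be closed manually.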

Estimate_gaze.launch

Hi Tobias

I am only using a webcam; are there any other things I need to change in my estimate_gaze.launch file? I keep getting the message below (the bold text in red) when I run the file. I also had trouble (a memory error) installing the torch modules, i.e. torch, torchvision and face-alignment, due to a lack of RAM I think, and hence installed them without the cache, like so: pip --no-cache-dir install face-alignment. I'm not sure if that is also a reason. Thank you

Best wishes,

Godfrey

Traceback (most recent call last):
  File "/home/godfrey/wheelchair/src/rt_gene/rt_gene/scripts/extract_landmarks_node.py", line 19, in <module>
    landmark_extractor = rt_gene.extract_landmarks_method.LandmarkMethod()
  File "/home/godfrey/wheelchair/src/rt_gene/rt_gene/src/rt_gene/extract_landmarks_method.py", line 118, in __init__
    self.face_net = FaceDetector(device=self.device_id_facedetection)
  File "/home/godfrey/.local/lib/python2.7/site-packages/face_alignment/detection/sfd/sfd_detector.py", line 39, in __init__
    os.path.join(path_to_temp_detector))
  File "/usr/lib/python2.7/urllib.py", line 98, in urlretrieve
    return opener.retrieve(url, filename, reporthook, data)
  File "/usr/lib/python2.7/urllib.py", line 249, in retrieve
    tfp = open(filename, 'wb')
IOError: [Errno 2] No such file or directory: '/home/godfrey/.face_alignment/data/s3fd_convert.pth.download'
2019-06-30 16:13:57.814582: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-06-30 16:13:57.818264: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2699870000 Hz
2019-06-30 16:13:57.818466: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5641d151a150 executing computations on platform Host. Devices:
2019-06-30 16:13:57.818503: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): ,
Load model model_nets/Model_allsubjects1.h5

[gaze/extract_landmarks_new-2] process has died [pid 8301, exit code 1, cmd /home/godfrey/wheelchair/src/rt_gene/rt_gene/scripts/extract_landmarks_node.py /camera_info:=/kinect2/hd/camera_info /image:=/kinect2/hd/image_color_rect __name:=extract_landmarks_new __log:=/home/godfrey/.ros/log/a9d158b6-9b49-11e9-8a16-c4b301d0eb11/gaze-extract_landmarks_new-2.log].
log file: /home/godfrey/.ros/log/a9d158b6-9b49-11e9-8a16-c4b301d0eb11/gaze-extract_landmarks_new-2.log

2019-06-30 16:13:58.874737: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set. If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU. To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
/home/godfrey/.local/lib/python2.7/site-packages/keras/engine/saving.py:327: UserWarning: Error in loading the saved optimizer state. As a result, your model is starting with a freshly initialized optimizer.
warnings.warn('Error in loading the saved optimizer '

Unstable head pose

A student of @hjchang reported the following issue:

When I cover half of my face with my hand, rt_gene gives me an incorrect head pose estimation. I guess the cause of the problem is that the program keeps the rotation matrix and translation matrix from the previous frame: when I occlude half of the face, the face detection module still detects the face normally, but gives face landmarks with large errors (see the attached picture for details). If we then use these face landmarks to calculate the rotation matrix and translation matrix, we get the wrong answer, and it influences the next round of calculation.

He suspects the issue is in the following code: https://github.com/Tobias-Fischer/rt_gene/blob/master/rt_gene/src/rt_gene/extract_landmarks_method.py#L335-L347

/cc @twarz @ngageorange

Error with multiple subjects

The following error is produced when multiple people are looking at the camera; it recovers when there is only one person.

[ERROR] [1550842837.937656]: bad callback: <bound method LandmarkMethod.callback of <rt_gene.extract_landmarks_method.LandmarkMethod object at 0x7f9a51898910>>
Traceback (most recent call last):
  File "/opt/ros/kinetic/lib/python2.7/dist-packages/rospy/topics.py", line 750, in _invoke_callback
    cb(msg)
  File "/home/ahmed/catkin_ws/src/rt_gene/rt_gene/src/rt_gene/extract_landmarks_method.py", line 392, in callback
    self.process_image(color_msg)
  File "/home/ahmed/catkin_ws/src/rt_gene/rt_gene/src/rt_gene/extract_landmarks_method.py", line 270, in process_image
    self.publish_subject_list(timestamp, self.subjects)
  File "/home/ahmed/catkin_ws/src/rt_gene/rt_gene/src/rt_gene/extract_landmarks_method.py", line 337, in publish_subject_list
    subject_list_message = self.__subject_bridge.images_to_msg(subjects, timestamp)
  File "/home/ahmed/catkin_ws/src/rt_gene/rt_gene/src/rt_gene/subject_ros_bridge.py", line 70, in images_to_msg
    msg.subjects.append(self.__subject_bridge.images_to_msg(subject_id, s))
  File "/home/ahmed/catkin_ws/src/rt_gene/rt_gene/src/rt_gene/subject_ros_bridge.py", line 51, in images_to_msg
    msg.right_eye_img = self.__cv_bridge.cv2_to_imgmsg(subject.right_eye_color, "rgb8")
  File "/opt/ros/kinetic/lib/python2.7/dist-packages/cv_bridge/core.py", line 246, in cv2_to_imgmsg
    raise TypeError('Your input type is not a numpy array')
TypeError: Your input type is not a numpy array


Download models if not existing

Allow running estimate_gaze.py etc without having to run download_models.py; in that case, download the models at runtime.
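A possible shape for this, sketched with requests; the URL and paths below are placeholders, not the actual model locations:

    import os
    import requests

    def ensure_model(url, target_path):
        # Download the model only if it is not already on disk.
        if os.path.isfile(target_path):
            return target_path
        os.makedirs(os.path.dirname(target_path), exist_ok=True)
        response = requests.get(url, timeout=30)
        response.raise_for_status()
        with open(target_path, "wb") as f:
            f.write(response.content)
        return target_path

    # hypothetical usage:
    # ensure_model("https://example.org/Model_allsubjects1.h5", "model_nets/Model_allsubjects1.h5")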

Train, Test and Validation division of data

Hi, I'm not able to understand how the division of data was done for the 3-fold evaluation on the RT-GENE dataset in your paper. Let me elaborate.

I'm clear on the division of persons for training data and validation data. Validation data is always [s014, s015, s016]. Training data is any two of the three groups below.

1. 's001', 's002', 's008', 's010'
2. 's003', 's004', 's007', 's009'
3. 's005', 's006', 's011', 's012', 's013'

Issue

  1. For the test data, I see conflicting descriptions. At the dataset link, it says that from the three divisions mentioned above, you pick two divisions for training and one for testing. There is no 's000' there. However, in your PyTorch version of the code, https://github.com/Tobias-Fischer/rt_gene/blob/master/rt_gene_model_training/pytorch/train_model.py#L180, I see 's000' included in the test data in all three folds.
  2. In the TensorFlow implementation, https://github.com/Tobias-Fischer/rt_gene/blob/master/rt_gene_model_training/tensorflow/prepare_dataset.m#L137, the code divides the data for each subject into train and test.

I'm interested in the configuration of the 3-fold evaluation with which you obtained the 3D gaze estimation results on the RT-GENE dataset in your paper.
Thanks !! :)
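For illustration, a sketch of the fold construction as described in this issue (two groups for training, the remaining group plus s000 for testing, s014-s016 always for validation); treat it as a reading of the linked train_model.py, not an authoritative protocol:

    groups = [
        ['s001', 's002', 's008', 's010'],
        ['s003', 's004', 's007', 's009'],
        ['s005', 's006', 's011', 's012', 's013'],
    ]
    validation = ['s014', 's015', 's016']

    folds = []
    for i in range(3):
        train = [s for j, g in enumerate(groups) if j != i for s in g]
        test = ['s000'] + groups[i]   # s000 appears in the test split per the PyTorch code
        folds.append({'train': train, 'test': test, 'validation': validation})

    for fold in folds:
        print(fold)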

Use trained model for inference

I have trained the model as per the rt_gene_training folder and saved the final models for each fold. I want to use these models to test on new images. How do I do that, and which model should I use?

I don't want to go through the whole ros pipeline.
Thanks

gaze vector

We have two gaze vectors, one from each eye; do you compute the mean of both gaze vectors and use it as the label?

unable to run catkin build

pc-136@linux:~/catkin_ws$ cd $HOME/catkin_ws && catkin build

Profile: default
Extending: [cached] /opt/ros/melodic
Workspace: /home/pc-136/catkin_ws

Build Space: [exists] /home/pc-136/catkin_ws/build
Devel Space: [exists] /home/pc-136/catkin_ws/devel
Install Space: [unused] /home/pc-136/catkin_ws/install
Log Space: [exists] /home/pc-136/catkin_ws/logs
Source Space: [exists] /home/pc-136/catkin_ws/src
DESTDIR: [unused] None

Devel Space Layout: linked
Install Space Layout: None

Additional CMake Args: None
Additional Make Args: None
Additional catkin Make Args: None
Internal Make Job Server: True
Cache Job Environments: False

Whitelisted Packages: None
Blacklisted Packages: None

Workspace configuration appears valid.

[build] Found '1' packages in 0.0 seconds.
[build] Package table is up to date.
Starting >>> rt_gene

Errors << rt_gene:make /home/pc-136/catkin_ws/logs/rt_gene/build.make.001.log
Traceback (most recent call last):
  File "/home/pc-136/catkin_ws/src/rt_gene/rt_gene/cfg/ModelSize.cfg", line 4, in <module>
    from dynamic_reconfigure.parameter_generator_catkin import *
  File "/opt/ros/melodic/lib/python2.7/dist-packages/dynamic_reconfigure/__init__.py", line 38, in <module>
    import roslib
  File "/opt/ros/melodic/lib/python2.7/dist-packages/roslib/__init__.py", line 50, in <module>
    from roslib.launcher import load_manifest
  File "/opt/ros/melodic/lib/python2.7/dist-packages/roslib/launcher.py", line 42, in <module>
    import rospkg
ModuleNotFoundError: No module named 'rospkg'
make[2]: *** [/home/pc-136/catkin_ws/devel/.private/rt_gene/include/rt_gene/ModelSizeConfig.h] Error 1
make[1]: *** [CMakeFiles/rt_gene_gencfg.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
make: *** [all] Error 2
cd /home/pc-136/catkin_ws/build/rt_gene; catkin build --get-env rt_gene | catkin env -si /usr/bin/make --jobserver-fds=6,7 -j; cd -
...............................................................................
Failed << rt_gene:make [ Exited with code 2 ]
Failed <<< rt_gene [ 0.2 seconds ]
[build] Summary: 0 of 1 packages succeeded.
[build] Ignored: None.
[build] Warnings: None.
[build] Abandoned: None.
[build] Failed: 1 packages failed.
[build] Runtime: 0.2 seconds total.

Cross-dataset validation

Hi @Tobias-Fischer
I implemented the algorithm and achieved a similar result on the rt_gene dataset (about 7.9 degrees compared with 7.7 degrees in the paper). But when I do cross-dataset validation, the error is very high (e.g. Columbia, MPII). I think it's the eye-warping process that affects the final results. Would you share the implementation details with us?
Thank you for your good work.

error when running estimate_gaze_standalone.py with the PyTorch model

I am trying to run estimate_gaze_standalone.py with the PyTorch model "Model_allsubjects1_pytorch.model", and some errors occur. It looks like something related to the model definition:
Traceback (most recent call last):
  File "estimate_gaze_standalone.py", line 241, in <module>
    gaze_estimator = GazeEstimator("cuda:0", args.models)
  File "../rt_gene/src/rt_gene/estimate_gaze_pytorch.py", line 32, in __init__
    _model.load_state_dict(torch.load(ckpt))
  File "/home/thanhtm/anaconda3/envs/eye_gaze/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1045, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for GazeEstimationAllCNNModel:
Missing key(s) in state_dict: "_left_module.0.weight", "_left_module.0.bias", "_left_module.2.weight", "_left_module.2.bias", "_left_module.4.weight", "_left_module.4.bias", "_left_module.6.weight", "_left_module.6.bias", "_left_module.8.weight", "_left_module.8.bias", "_left_module.10.weight", "_left_module.10.bias", "_left_module.12.weight", "_left_module.12.bias", "_left_module.14.weight", "_left_module.14.bias", "_left_module.16.weight", "_left_module.16.bias", "_left_module.18.weight", "_left_module.18.bias", "_left_module.20.weight", "_left_module.20.bias", "_left_module.22.weight", "_left_module.22.bias", "_right_module.0.weight", "_right_module.0.bias", "_right_module.2.weight", "_right_module.2.bias", "_right_module.4.weight", "_right_module.4.bias", "_right_module.6.weight", "_right_module.6.bias", "_right_module.8.weight", "_right_module.8.bias", "_right_module.10.weight", "_right_module.10.bias", "_right_module.12.weight", "_right_module.12.bias", "_right_module.14.weight", "_right_module.14.bias", "_right_module.16.weight", "_right_module.16.bias", "_right_module.18.weight", "_right_module.18.bias", "_right_module.20.weight", "_right_module.20.bias", "_right_module.22.weight", "_right_module.22.bias", "concat.2.weight", "concat.2.bias", "concat.4.weight", "concat.4.bias", "concat.5.weight", "concat.5.bias", "concat.7.weight", "concat.7.bias".
Unexpected key(s) in state_dict: "left_features.0.weight", "left_features.0.bias", "left_features.2.weight", "left_features.2.bias", "left_features.5.weight", "left_features.5.bias", "left_features.7.weight", "left_features.7.bias", "left_features.10.weight", "left_features.10.bias", "left_features.12.weight", "left_features.12.bias", "left_features.14.weight", "left_features.14.bias", "left_features.17.weight", "left_features.17.bias", "left_features.19.weight", "left_features.19.bias", "left_features.21.weight", "left_features.21.bias", "left_features.24.weight", "left_features.24.bias", "left_features.26.weight", "left_features.26.bias", "left_features.28.weight", "left_features.28.bias", "right_features.0.weight", "right_features.0.bias", "right_features.2.weight", "right_features.2.bias", "right_features.5.weight", "right_features.5.bias", "right_features.7.weight", "right_features.7.bias", "right_features.10.weight", "right_features.10.bias", "right_features.12.weight", "right_features.12.bias", "right_features.14.weight", "right_features.14.bias", "right_features.17.weight", "right_features.17.bias", "right_features.19.weight", "right_features.19.bias", "right_features.21.weight", "right_features.21.bias", "right_features.24.weight", "right_features.24.bias", "right_features.26.weight", "right_features.26.bias", "right_features.28.weight", "right_features.28.bias", "xl.0.weight", "xl.0.bias", "xl.1.weight", "xl.1.bias", "xl.1.running_mean", "xl.1.running_var", "xl.1.num_batches_tracked", "xr.0.weight", "xr.0.bias", "xr.1.weight", "xr.1.bias", "xr.1.running_mean", "xr.1.running_var", "xr.1.num_batches_tracked", "fc.0.weight", "fc.0.bias", "fc.2.weight", "fc.2.bias", "concat.0.weight", "concat.0.bias", "concat.1.running_mean", "concat.1.running_var", "concat.1.num_batches_tracked".
size mismatch for concat.1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([2048, 4098, 1, 1]).
size mismatch for concat.1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([2048]).
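A quick way to diagnose this kind of mismatch is to compare the keys stored in the checkpoint with the keys the instantiated model expects; a small hedged sketch (the wrapping under a 'state_dict' entry is an assumption, not guaranteed for this file):

    import torch

    checkpoint = torch.load("Model_allsubjects1_pytorch.model", map_location="cpu")
    # Some checkpoints wrap the weights, e.g. under a 'state_dict' entry; unwrap if present.
    state_dict = checkpoint.get("state_dict", checkpoint) if isinstance(checkpoint, dict) else checkpoint

    print(sorted(state_dict.keys())[:10])            # keys stored in the file
    # print(sorted(model.state_dict().keys())[:10])  # keys the instantiated model class expects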

Dataset mirror

Provide an alternative download besides Box, e.g. Zenodo.

RT_BENE CSV duplicates introducing sampling bias

Hi all,
I've noticed that there are duplicates in the CSV files related to training; an example of this is s000_blink_labels.csv rows 13983 and 13984:

  • left_017863_rgb.png,0.5
  • left_017863_rgb.png,0.0

While RTBeneDataset.py removes indeterminately labeled samples (i.e. any label of 0.5 is discarded), the following row is labeled as 0.0 and is thus used for training.

I'm not sure how this affects results, but I would suggest that the 0.5 and 0.0 labels be removed altogether?

I'll sanitise the data myself and create a pull request, but I thought I would discuss it here first...
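As a sketch of the kind of sanitising discussed here (drop every image name that ever carries a 0.5 label, then drop remaining duplicates), assuming the two-column image,label layout shown above with no header row:

    import pandas as pd

    df = pd.read_csv("s000_blink_labels.csv", header=None, names=["image", "label"])

    # Remove every image that was labelled 0.5 at least once, including its other duplicates.
    ambiguous = set(df.loc[df["label"] == 0.5, "image"])
    clean = df[~df["image"].isin(ambiguous)].drop_duplicates(subset="image")

    clean.to_csv("s000_blink_labels_clean.csv", index=False, header=False)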

Enhance device_id

Allow different devices for FaceDetector and FaceAlignment, and allow usage of CPU in addition to GPU.
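A minimal sketch of the requested behaviour using PyTorch device strings (separate, independently configurable devices for detection and alignment, with a CPU fallback); the names are illustrative, not the package's actual parameters:

    import torch

    def resolve_device(requested):
        # Fall back to CPU when CUDA is requested but unavailable.
        if requested.startswith("cuda") and not torch.cuda.is_available():
            return torch.device("cpu")
        return torch.device(requested)

    face_detector_device = resolve_device("cuda:0")   # e.g. read from a ROS/config parameter
    face_alignment_device = resolve_device("cpu")     # can differ from the detector device
    print(face_detector_device, face_alignment_device)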

details about other dataset performance

Hi. @Tobias-Fischer

I noticed there are some evaluation results on MPIIGaze and UT Multi-view dataset on this link: https://paperswithcode.com/paper/rt-gene-real-time-eye-gaze-estimation-in

Task             Dataset        Model                     Metric name    Metric value   Global rank
Gaze Estimation  MPII Gaze      RT-GENE 2 model ensemble  Angular Error  4.6            # 2
Gaze Estimation  MPII Gaze      RT-GENE single model      Angular Error  4.8            # 3
Gaze Estimation  MPII Gaze      RT-GENE 4 model ensemble  Angular Error  4.3            # 1
Gaze Estimation  RT-GENE        RT-GENE 4 model ensemble  Angular Error  7.7            # 1
Gaze Estimation  UT Multi-view  RT-GENE 4 model ensemble  Angular Error  5.1            # 1

But I could not find details about these datasets in this repo. Can you provide more details or code?
That would help us a lot.

Thanks for this excellent work!

UT Multi-view gaze dataset usage

Hi Sir,
Excited to see your gaze project. The readme says you use the UT Multi-view dataset; could you please provide some code for the pre-processing of this dataset? I am facing a problem converting the gaze vector to yaw and pitch angles.
In the dataset, s00/test/000_left.csv contains lines such as the following:

0.657246  0.0937712  -0.74782  -0.0511348  -0.548227  -0.0143822  -7.6967E-06  1.92417E-06  600

The first 3 numbers '0.657246 0.0937712 -0.74782' are the gaze direction in camera coordinates; I convert them to yaw and pitch directly like this:

# gaze direction (gx, gy, gz)
yaw = np.arctan2(-gx, -gz)
pitch = np.arcsin(-gy)

But it seems not correct when I plot the gaze vector on the original image. Could you give some tips or code?

Thanks very much.
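One common pitfall when drawing such a vector on the original image is projecting it without the camera intrinsics. A hedged sketch of projecting the gaze ray with OpenCV; the camera matrix and the 3D eye position below are placeholders, not values from the UT Multi-view dataset:

    import numpy as np
    import cv2

    gaze = np.array([0.657246, 0.0937712, -0.74782])   # gaze direction in camera coordinates (from the CSV row)
    eye_3d = np.array([0.0, 0.0, 600.0])                # placeholder 3D eye position in camera coordinates (mm)
    camera_matrix = np.array([[960.0, 0.0, 640.0],      # placeholder intrinsics, not from the dataset
                              [0.0, 960.0, 360.0],
                              [0.0, 0.0, 1.0]])

    points_3d = np.vstack([eye_3d, eye_3d + 100.0 * gaze])   # eye and a point 100 mm along the gaze ray
    points_2d, _ = cv2.projectPoints(points_3d, np.zeros(3), np.zeros(3), camera_matrix, None)
    p0, p1 = points_2d.reshape(-1, 2).astype(int)
    print(p0, p1)   # draw a line between these two pixels with cv2.line on the image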

GPU

Hi Tobias,

I am doing my MSc Project with Hyung Jin and am planning on using rt_gene for eye tracking. My laptop doesn't have a GPU; I am just wondering whether it will still run, and whether it would just be a lot slower on the CPU?

Best wishes,

Godfrey

Labels in Readme.md in wrong order

I noticed some descriptions of the labels in README_Dataset.md:

label_combined.txt

...
seq_number, [head pose: down(pos) / up (neg), left(pos) / right(neg)], [gaze: up(pos) / down(neg), right(pos) / left(neg)], timestamp

I compared some images with their corresponding labels, and I think it should be in this form:
[head pose: left(pos) / right(neg), down(neg) / up (pos)], [gaze: right(pos) / left(neg), up(pos) / down(neg)]

Face boxes on the side can be discarded

Hey,

I had a problem where face boxes in one half of my image (from a webcam) were discarded.

This comes from the following line:

if x_left_top > 0 and y_left_top > 0 and x_right_bottom < image.shape[0] and y_right_bottom < image.shape[1] and confidence > 0.8:

You have to swap the indexes when you check x and y against the borders of the image:
if x_left_top > 0 and y_left_top > 0 and x_right_bottom < image.shape[1] and y_right_bottom < image.shape[0] and confidence > 0.8:

Best,
Kévin

RT_BENE Dataset left/right folder

Hi all,
Just wondering about the dataset used for the RT_BENE training. The README and the paper stipulate that the "sXYZ_noglasses" public dataset is used rather than the "sXYZ_glasses" one. However, the "sXYZ_noglasses" dataset doesn't have left/right eye patches/folders; it only has face images for the GAN training of RT_GENE.

  • Does that mean left/right patches were extracted from the face? If so, which landmark extractor was used?

download_models.py

When I run this script (rt_gene/rt_gene/scripts/download_models.py), it raises a ConnectTimeout error as below:
requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='imperialcollegelondon.app.box.com', port=443): Max retries exceeded with url: /public/static/zblg37jitf9q245k3ytad8nv814nz9o8.model (Caused by ConnectTimeoutError(<urllib3.connection.VerifiedHTTPSConnection object at 0x7fcb5097e950>, 'Connection to imperialcollegelondon.app.box.com timed out. (connect timeout=10)'))

I have also failed to open the URL directly.

How to fix it? Thanks for your time.

no visual output after estimate gaze launch

Hi,

Once I run roslaunch rt_gene estimate_gaze.launch, I do not see any visual GUI showing the real-time video overlaid with the gaze vector. The images from the webcam are not even saved anywhere.

All I see is this form of output on the terminal,

Time diff: -3.10000000114e-08
Time now: 1570652893.51 message color: 1570652893.47 diff: 0.0460674762726
est_gaze_c: [ 0.05357103 -0.04007208]
Face Detector Frequency: 72.17Hz for 1 Faces
Elapsed after detecting transformed_landmarks: 0.0429921150208
New get_eye_image time: 0.000105142593384
Head Rotation from last frame: 0.00
Translation based on landmarks [ 0.61903054 -0.0703548 -0.02048527]
Elapsed total: 0.0445210933685

I am new to ROS, am I doing something wrong?

Thanks,
Anil

train & test environment

Hi,
I tried to run the train_model in rt_gene_model_training, but it crashed with "MemoryError" when moving on to the 2nd fold.

Error message shows:
Traceback (most recent call last):
  File "train_model.py", line 88, in <module>
    train_images_L, train_images_R, train_gazes, train_headposes, train_num = get_train_test_data_twoeyes(train_files, 'train')
  File "/home/janghak/HDD/workspace/github/rt_gene/rt_gene_model_training/train_tools.py", line 171, in get_train_test_data_twoeyes
    images_r = np.vstack([files[idx][label]['imagesR'] for idx in range(len(files))])
  File "/home/janghak/anaconda3/envs/ge/lib/python3.7/site-packages/numpy/core/shape_base.py", line 283, in vstack
    return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
MemoryError

I have tried running with a batch size of 4; it still crashes.
I am using CUDA 9.1 and want to know if matching the CUDA and cuDNN versions would solve the problem.

Thank you.

headpose in dataset from 3D to 2D

The head pose in label_combined.txt is 2-dimensional, while in label_headpose_detailed.txt it is 3-dimensional. If I want to train on MPIIGaze, how do I transform the head pose from 3D to 2D?
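A hedged sketch of one way to do this, assuming the 3-dimensional head pose is a unit direction vector in camera coordinates (the same arctan2/arcsin convention as used for the gaze vectors elsewhere on this page); if it is instead three Euler angles, this does not apply:

    import numpy as np

    def vector_to_angles(v):
        # Convert a direction vector (x, y, z) into the 2D (yaw, pitch) representation.
        x, y, z = v / np.linalg.norm(v)
        yaw = np.arctan2(-x, -z)
        pitch = np.arcsin(-y)
        return yaw, pitch

    print(vector_to_angles(np.array([0.1, -0.2, -0.97])))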

type of left-right eyes image

I am wondering about the data type of the left/right eye images. I used matplotlib to visualise the left and right eyes and this is the result (see attached image); can you explain it?

Error when running estimate_gaze_standalone

I've been trying to run the standalone file for the last few days, but I'm still getting errors now that the environment is fully configured.

After I run python2 estimate_gaze_standalone.py samples_gaze

it reports an error as follows:

Loading networks
Using device cuda:0 for face detection

Traceback (most recent call last):
  File "estimate_gaze_standalone.py", line 182, in <module>
    gaze_estimator = GazeEstimator("/gpu:0", args.models)
TypeError: __new__() takes exactly 4 arguments (3 given)

It looks like I am just missing one argument when calling GazeEstimator.

My current configuration is Nvidia driver 450.57, CUDA 9.0, cuDNN 7.3.

I'm not quite sure why this error is occurring, so hopefully, you can give me some guidance.

model architecture

I was reading the paper and the source code for creating the model using PyTorch. I could not find the head pose estimation network mentioned in Figure 2 (the one that takes the face as input). Secondly, I just wanted to confirm whether you use ResNet-18 and/or VGG-16 on the left and right eye images and then pass the outputs to GazeEstimationAbstractModel.

Attribute error

When I am running the standalone gaze estimation script on an image, I am getting the following error: AttributeError: 'TrackedSubject' object has no attribute 'marks'. Does it mean the landmarks are not being detected in that image?

Issues with installation

I'm having trouble building the rt_gene project. I have followed the software installation steps (including installing ros melodic and all the necessary packages) and received the following error messages:

Installation attempt on an Ubuntu 18.04 Virtual Machine: (screenshot attached)

Installation attempt on the Windows Subsystem for Linux (Ubuntu 18.04): (screenshot attached)

reproduce

Hi, I ran your model training code, changing only the path, and I get the following results:
Ensemble: 13.480204036924533 +- 3.772117425419047
Model 0: 15.14554381498352 +- 3.4026110207019453
Model 1: 14.49058686821486 +- 3.9057199613231455
Model 2: 13.60103630178998 +- 3.4098169707825843
Model 3: 14.496607138990932 +- 3.3971626371824732
Ensemble: 12.86649487961544 +- 3.6739599920738577
Model 0: 13.41469958155073 +- 3.516076889661513
Model 1: 13.237533686031147 +- 3.5218109462077685
Model 2: 13.376636419685994 +- 3.341763248689219
Model 3: 13.861938083633717 +- 3.30115446948625
Ensemble: 12.639686620568382 +- 3.6341994290148607
Model 0: 13.336198507628263 +- 3.2624630095315488
Model 1: 13.110821038670409 +- 3.4787565847257684
Model 2: 13.355155726032125 +- 3.0450093915309906
Model 3: 13.533675092731762 +- 3.055964720850351
Ensemble: 13.048184325211764 +- 3.711550885325419
Model 0: 13.621509926156996 +- 3.2997626846039445
Model 1: 13.280787940634553 +- 3.5138811741676848
Model 2: 13.611282873524203 +- 3.4179222462072456
Model 3: 13.632872017170953 +- 3.4504936315324413

Is this roughly the correct result? Your paper reports the within-dataset result on RT-GENE as 7.7 degrees. What did I miss?

Implement face_encoding so that tracking maintains subject_id's related to facial features

Hello,

In a recent pull request the issue of subject_id's came up; UUIDs were used initially, but that broke downstream projects, so we reverted to counting.

The issue, aptly raised by @twarz, is that if subject_1 and subject_2 are being tracked and then subject_1 is lost, the next time they appear they'll be labelled as subject_3.

IMHO, they should appear as subject_1 again.

One particular solution to this is to use a facial encoding - something similar to facial_encoding from dlib.

Would this be something that would be useful to people?
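A rough sketch of the idea with the face_recognition package (which wraps dlib's 128-d face descriptors); the threshold, storage scheme and ID counting below are illustrative assumptions, not a proposed final design:

    import numpy as np
    import face_recognition

    known_encodings = {}   # subject_id -> 128-d face descriptor

    def assign_subject_id(face_image_rgb):
        # Re-use an existing subject_id when the new face matches a stored encoding.
        encodings = face_recognition.face_encodings(face_image_rgb)
        if not encodings:
            return None
        encoding = encodings[0]
        for subject_id, known in known_encodings.items():
            if np.linalg.norm(known - encoding) < 0.6:   # typical dlib distance threshold
                return subject_id
        new_id = len(known_encodings) + 1
        known_encodings[new_id] = encoding
        return new_id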

detailed training setup

Hi, I ran your code without any modification, but the evaluation results on your dataset are not ideal; could you provide some hints? Thanks a lot.
