pydsgz / deepvog Goto Github PK

View Code? Open in Web Editor NEW

148.0 148.0 66.0 178.91 MB

Pupil segmentation and gaze estimation using fully convolutional neural networks

License: GNU General Public License v3.0

Python 99.63% TeX 0.37%

deepvog's People

Contributors

Stargazers

Watchers

Forkers

yqianjiang keyky keishatsai jason-sunjiankang cgh2797 zuxfoucault zhaozz-lab robingong ilxbxco steveseguin thinkronize suleymanveli thomasstephan1000 damon88888 zhang405744522 msharrock hewie8023 ieyer jaredquekjz xiewt86 renmengyuan spokv andreofner yujie-tao andygong2018 biotrillion saccadic yuan776 marenan borislevant nawwaradam vhcg77 ashwoods shelltitan allenhuang0708 sudhir-guardex bobdeng1974 a154609 burdettadam adas-eye choi612 xiamenwcy kamranbinaee wujinlonglovezhangmiao1314 thiwankajayasiri rworld-ravi vicwk jimchh junjie2008v sumsuddin yujin0827 git933 aditya-gupta21 cancan101 rajsrimanth6 kunalchaturvedi28 yakeworld dcnieho easychart jessechai krasner yyhhoi

deepvog's Issues

Blank output

I tried running the "test_if_model_work.py" file. The test_image.png file included with the python and h5 files didn't work, so I just took an image off google images. However, the output image is black. The numbers in the outputted array are on the order of 10^-5, 10^-6 and 10^-7. When I tried scaling these to the 0-255 range, the output was just grayscale noise. I've included the image for completeness. I only slightly modified test_if_model_works, and did not modify anything else. Here is the slightly modified test_if_model_work code I used:

(Edit: I downloaded all of the codes in the DeepVOG folder, but it is not clear which ones are dependencies and which is the main code that runs everything.)

def test_if_model_work():
model = load_DeepVOG()
img = np.zeros((1, 240, 320, 3))
reader=ski.imread("test_image.png")/255
reader.resize((240,320))
img[:,:,:,:] = reader.reshape(1, 240, 320, 1)
prediction = model.predict(img)
ski.imsave("test_prediction.png", prediction[0,:,:,1])
#print(prediction)
viewer = ImageViewer(np.uint8(prediction[0,:,:,1]*10000000))
viewer.show()

deepVOG

Hi there,
I need the dataset for that project to use it in my own project

Does it run on systems without GPUs ?

Training parameters of the model

Hi,

Really impressive work and many thanks for making it open-source. I was trying to replicate your model by re-training using another dataset, but I never reached comparable with your published pre-trained weights. While I understand that you might not be able to share your training data, could you please reveal some of the hyperparameters you used for training, e.g. learning rate, optimizer, batch size, epochs, regularization etc. (augmentation was kindly described in the paper).

Thank you

Enforce Aspect Ratio

I assume that the aspect ratio here should be the same:

DeepVOG/deepvog/inferer.py

Lines 23 to 25 in 7bad240

    
                       ori_video_shape (tuple or list or np.ndarray): Original video shape from your camera, (height, width) in pixel. If you cropped the video before, use the "original" shape but not the cropped shape 
        
                       sensor_size (tuple or list or np.ndarray): Sensor size of your camera, (height, width) in mm. For 1/3 inch CMOS sensor, it should be (3.6, 4.8). Further reference can be found in https://en.wikipedia.org/wiki/Image_sensor_format and you can look up in your camera product menu

i.e ori_video_shape[0] / ori_video_shape[1] ~= sensor_size [0] / sensor_size[1] (within floating point tolerance)?

If that is the case, then the inputs can be checked with an assertion, etc

Hello, what should I do if I want to fine-tune it?

Hello, what should I do if I want to fine-tune it? Please be more detailed, thank you.

DeepVOG annotation

How exactly do we upload the annotation to the DeepVOG model once images are annotated? Are we to make a separate script for it or can you provide one?

ValueError

I've managed to run the code on the demo versions successfully. I then tried to use my own video (in .mp4 format) and receive this error:

ValueError: No way to determine width or height from video. Need -sininputdict. Consult documentation on I/O.

I'm not sure whether my understanding of the documentation is wrong. I believe I have set the video size and sensor size correctly, and don't fully understand the above error. I have tried running the program using:

python -m deepvog --fit ./output.mp4 ./demo_eyeball_model.json -v ./demo_visualization_fitting.mp4 -b 32

as well as one command I found in the readme:

python -m deepvog --fit ./output.mp4 ./demo_eyeball_model.json --flen 12 -vs 300,400 -s 0.005,0.005 -b 32

Do you have any advice you could offer?

raise ValueError('You are trying to load a weight file ' ValueError: You are trying to load a weight file containing 35 layers into a model with 18 layers.

Demo Videos

The demo videos can not be played, is there a way to play them?

the initialization parameters

deepvog/inferer.py Line 46
self.mm2px_scaling = np.linalg.norm(self.ori_video_shape) / np.linalg.norm(self.sensor_size)#mm转换为像素 self.model = model self.confidence_fitting_threshold = 0.96 self.eyefitter = SingleEyeFitter(focal_length=self.flen * self.mm2px_scaling, pupil_radius=2 * self.mm2px_scaling, initial_eye_z=50 * self.mm2px_scaling)#眼睛模型
Hello, I would like to ask a question about initialization parameters. Because the test data I am using is synthetic image data, I am unable to determine the camera parameters.

mm2px_scaling：What is the meaning of this parameter generation? It seems that it has no actual physical meaning.
initial_eye_z：It seems that initializing the z-coordinates of the eyeball in this way cannot be used as the actual z-coordinate calculation.

wrong unit?

I just downloaded the code and tested the fit on the demo video included in the repo and I get the following eye model
{ "eye_centre": [ [ -189.91085622309689 ], [ 129.4567167957034 ], [ 3333.3333333333335 ] ], "aver_eye_radius": 1286.1201062769812 }
Shouldn't the units be in mm? Do these numbers make sense? Also I was expecting the algorithm to estimate the depth (z) of the eye center but it always reports the given initial z value. Is that correct?

Visualization of eye center

When trying out the visualization branch, i noticed that the eye center was off from what it supposed to be. The pupil ellipse and Gaze vector were accurate but the blue line from the gaze to eye center did not form a continuous vector as it should have.

I am assuming the error is in the unprojection but I have not tracked through the math.

Hello, i need a labeling tool, please tell me how to use. Then I want to use your Deepvog to fine-tune my data. What can I do?

How can I run this on platform like Nvidia Jetson nano?

Whenever I try to run it on Jetson nano, the device starts throttling and the program starts reporting about memory issue. What can I do to resolve this?

Pupil diameter

Good day!

Thanks for your research and implementation. Can I get the pupil diameter?

error about "No unprojected gaze lines or ellipse centres were added (not yet initalized). Use add_to_fitting() function to add them first"

hello, thank you for sharing your project. I installed it as read.me, and started fit step, but I got the error "No unprojected gaze lines or ellipse centres were added (not yet initalized). Use add_to_fitting() function to add them first ". Besides, i want konw that whether can i utilize the project to estimate gaze of video from DMS(driver monitor system) NIR camera ?

Issue in inferer.py with resizing.

The variable resizing is passed into _preprocess_image but in the body of the code, the variable shape_correct is passed which has opposite definition as resizing. Suggest changing resizing in the function definition to be shape_correct and change line 277 to be if not shape_correct:.

Low GPU usage / Performance

I am running fitting on your demo script using the command:

python -m deepvog --fit ./demo.mp4 ./demo_eyeball_model.json -m -b 32

I've seen that in your paper (Section 3.1.5 - Inference Speed) you run your program at 130Hz for batch sizes of 32, however when I run your program on your demo files (even without visualisation) I am averaging around 15Hz.

I am using a machine with the following specs:
CPU - Intel Xeon 12-core 2.5Ghz w/ Windows 10
GPU - Nvidia GeForce RTX 2080 Ti
RAM - 64GB

Python - 3.6.1
Tensorflow-gpu - 1.15.0
CUDA - 10.0
cuDNN - 7.6.5

Is there anything obvious that I am missing here that could lead to the weak performance?

tensorflow.python.framework.errors_impl.OperatorNotAllowedInGraphError: using a `tf.Tensor` as a Python `bool` is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.

Hi,

I'm trying to run the demo as instructed and I get the following:
tensorflow.python.framework.errors_impl.OperatorNotAllowedInGraphError: using a tf.Tensoras a Pythonbool is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function..

I'm using windows 7 (with an i7 processor and w/o GPU)... not sure if it has any connection.

Thanks in advance!

Keras type error : symbolic inputs/outputs donot match

raise TypeError('Keras symbolic inputs/outputs do not '
TypeError: Keras symbolic inputs/outputs do not implement __len__. You may be trying to pass Keras symbolic inputs/outputs to a TF API that does not register dispatching, preventing Keras from automatically converting the API call to a lambda layer in the Functional Model. This error will also get raised if you try asserting a symbolic input/output directly

Request for dataset availability

Hey,

Good work. We are looking to similar lines but with really low dimensional data. We are currently getting data from Webcam.

Can we please have dataset links as well? We and the community can greatly benefit from the same.

import deepvog

import deepvog is not working

.model.DeepVOG_model cannot be found

running on Google Colab

	ori_video_shape (tuple or list or np.ndarray): Original video shape from your camera, (height, width) in pixel. If you cropped the video before, use the "original" shape but not the cropped shape

	sensor_size (tuple or list or np.ndarray): Sensor size of your camera, (height, width) in mm. For 1/3 inch CMOS sensor, it should be (3.6, 4.8). Further reference can be found in https://en.wikipedia.org/wiki/Image_sensor_format and you can look up in your camera product menu