I'm trying to use the calibration information provided with the dataset, but I cannot get good correspondence between coordinates in the world space and the pixel space. I use the provided intrinsics matrix, distortion coefficients, and for the camera rotation I use the pitch of 10 degrees (nothing else is provided with the data).
I am using the code below, which should draw horizontal lines at distances of 5 meters, 10 meters, etc., but as you can see the line distances are way off — the first line should be 5 meters away, yet in the image it is very close to the car.
What is even more worrying is that these drawn lines should converge to the vanishing point (the left edge of the lines should correspond to the world coordinates [X=0, Y=0, Z=Inf]), but this point in pixel space appears higher and a lot more to the left than where it actually is. Am I missing something here?
import math

import cv2
import matplotlib.pyplot as plt
import numpy as np
def world_points_to_pixel_space(keypoints_3d, proj_mat):
    """Project Nx3 world points into pixel coordinates.

    Appends a homogeneous 1 to each point, applies the 3x4 projection
    matrix, and divides by the resulting depth (third) coordinate.

    Args:
        keypoints_3d: (N, 3) array of world-space points.
        proj_mat: (3, 4) projection matrix (e.g. K @ [R | t]).

    Returns:
        (N, 2) array of pixel coordinates.
    """
    num_points = keypoints_3d.shape[0]
    ones = np.ones((num_points, 1), dtype=np.float32)
    homogeneous = np.concatenate([keypoints_3d, ones], axis=1)
    projected = (proj_mat @ homogeneous.T).T
    # Perspective divide: (x, y) = (u/w, v/w).
    return projected[:, :2] / projected[:, 2:3]
# Load calibration details: camera height (mm), pitch (deg), image size,
# pinhole intrinsics K, and distortion coefficients D.
calib = {"cam_height_mm": 1270, "cam_pitch_deg": -10, "dim": [1920, 1080], "K": [[1004.8374471951423, 0.0, 960.1025514993675], [0.0, 1004.3912782107128, 573.5538287373604], [0.0, 0.0, 1.0]], "D": [[-0.02748054291929438], [-0.007055051080370751], [-0.039625194298025156], [0.019310795479533783]]}
mtx = np.array(calib['K'])
# Camera height above the road plane, in meters, to match the world-point units.
cam_height = np.array(calib['cam_height_mm']) / 1000.
distCoeffs = np.array(calib['D'])
# NOTE(review): D has exactly 4 entries. If these are fisheye coefficients
# (k1..k4), cv2.undistort — which interprets a 4-vector as (k1, k2, p1, p2) —
# is the wrong model; confirm against the dataset docs and use
# cv2.fisheye.undistortImage if so. A wrong distortion model would explain
# part of the mis-projection.
h, w = (1080, 1920)
pitch = calib['cam_pitch_deg'] * math.pi / 180.

# Load image and convert BGR -> RGB for matplotlib display.
image_path = '$PIE_PATH/images/set01/video_0002/04951.png'
img_data = cv2.imread(image_path)
if img_data is None:
    # cv2.imread returns None (not an exception) on a missing/unreadable path;
    # fail loudly here instead of crashing later in cvtColor.
    raise FileNotFoundError('Could not read image: {}'.format(image_path))
img_data = cv2.cvtColor(img_data, cv2.COLOR_BGR2RGB)

# Undistort image. alpha=0 keeps only valid pixels; the principal point of the
# new camera matrix is re-centered, so use newcameramtx (not mtx) for projection.
newcameramtx, roi = cv2.getOptimalNewCameraMatrix(mtx, distCoeffs, (w, h), 0, (w, h), centerPrincipalPoint=True)
img_data = cv2.undistort(img_data, mtx, distCoeffs, None, newcameramtx)

# Projection matrix P = K * [R | 0] (camera at the world origin).
# NOTE(review): a rotation built from the camera's pose maps camera->world;
# projecting world points requires the world->camera rotation (its transpose).
# With a rotation about X the two differ only in the sign of the pitch, so
# verify which convention cam_pitch_deg = -10 uses (the question quotes +10):
# a sign flip here moves the vanishing point up/down, matching the symptom.
K = np.hstack([newcameramtx, np.zeros((3, 1))])
rot = np.array([[1, 0, 0, 0],
                [0, np.cos(pitch), -np.sin(pitch), 0],
                [0, np.sin(pitch), np.cos(pitch), 0],
                [0, 0, 0, 1]])
proj_mat = np.matmul(K, rot)

# Draw lines on the road, 2 meters wide, left edge on the camera centerline
# (X = 0 m). Y = +cam_height because the camera frame's Y axis points down,
# so the road plane lies cam_height below the camera.
line_width_in_meters = 2
for distance in range(5, 200, 5):
    points_3d = np.asarray([[0, cam_height, distance], [line_width_in_meters, cam_height, distance]], dtype='float32')
    points_pixel = world_points_to_pixel_space(points_3d, proj_mat).astype(np.int32)
    cv2.line(img_data, tuple(points_pixel[0, :]), tuple(points_pixel[1, :]), (255, 0, 0), lineType=cv2.LINE_AA)
    # Label only the nearest few lines to avoid cluttering the horizon.
    if distance <= 15:
        cv2.putText(img_data, '{}m'.format(distance), tuple(points_pixel[1, :]), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 0, 0))
plt.imshow(img_data)
plt.show(block=True)