rafaelpadilla / Object-Detection-Metrics
Most popular metrics used to evaluate object detection algorithms.
License: MIT License
I was checking your logic and two questions arose:
Detection bounding boxes can be numerous, so we usually keep only those above a confidence threshold (for me, typically 0.5, meaning I discard all boxes with lower confidence). Does that threshold have any impact on the mAP? Since I need to write the box coordinates to a txt file, it makes sense to filter them out, but I want to know whether this affects the mAP.
You seem to calculate the mAP in a continuous manner rather than with the proposed 11-recall-level approach. Is that so? Why did you choose this approach (I guess it's simpler to just compute the areas of the recall rectangles)? It doesn't seem to affect the resulting mAP much, but it is still a deviation from the original paper.
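For context, the two interpolation strategies can be sketched on a toy precision/recall list; the function names below are illustrative, not the repo's API:

```python
import numpy as np

def ap_every_point(rec, prec):
    """All-point interpolation: area under the monotone precision envelope."""
    mrec = np.concatenate(([0.0], rec, [1.0]))
    mpre = np.concatenate(([0.0], prec, [0.0]))
    # Make precision monotonically decreasing from right to left
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    # Sum rectangle areas where recall changes
    idx = np.where(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))

def ap_11_point(rec, prec):
    """11-point interpolation: mean of max precision at 11 recall levels."""
    rec, prec = np.asarray(rec), np.asarray(prec)
    ap = 0.0
    for r in np.linspace(0, 1, 11):
        mask = rec >= r
        ap += prec[mask].max() if mask.any() else 0.0
    return ap / 11.0

rec = [0.2, 0.4, 0.4, 0.8]
prec = [1.0, 0.5, 0.33, 0.4]
print(ap_every_point(rec, prec), ap_11_point(rec, prec))
```

On most curves the two values are close but not identical, which is the deviation the question refers to.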
I have ground-truth boxes and predicted boxes from the YOLO, DPM, and OpenPose algorithms. Their format is [x y w h] and I do not have the confidence scores. Is it possible to use your Python code to get the precision, recall, and the curve? I get the error below:
Metrics-master/lib/BoundingBox.py", line 45, in __init__
    'For bbType='Detection', it is necessary to inform the classConfidence value.')
OSError: For bbType='Detection', it is necessary to inform the classConfidence value.
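If the detector gives no scores, a common workaround is to write a constant confidence for every box, since the detection-file format expects the score in the second column. A minimal sketch (file handling omitted; the helper name is mine):

```python
def add_dummy_confidence(line, conf=1.0):
    """Turn '<class> <x> <y> <w> <h>' into '<class> <conf> <x> <y> <w> <h>'."""
    parts = line.split()
    return " ".join([parts[0], str(conf)] + parts[1:])

print(add_dummy_confidence("person 10 20 50 80"))  # person 1.0 10 20 50 80
```

Note that with a single shared confidence, all detections tie in the ranking, so the precision/recall curve collapses to a single operating point.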
Could you tell me your environment? Thank you.
Since I am new to Python, I could not figure out what the problem is. I provided the ground-truth and detection files as per the instructions. Can you check what the problem might be? I wasted a day on this error and still could not find it. Please help me here.
########################################
$ python pascalvoc.py -gtformat xyrb or $ python pascalvoc.py
  File "pascalvoc.py", line 292
    [print(e) for e in errors]
          ^
SyntaxError: invalid syntax
################################
Why is it IOU < threshold?
I think a FP is a detection whose IOU > threshold but with the wrong label.
Thanks
In the README the following is said about True Negatives:
Does not apply. It would represent a corrected misdetection. In the object detection task there are many possible bounding boxes that should not be detected within an image. Thus, TN would be all the possible bounding boxes that were correctly not detected (a huge number of possible boxes within an image). That's why it is not used by the metrics.
Can we cite a (formal enough) source for this statement? I tried to find evidence supporting it, but I couldn't find any solid support.
Thanks so much.
How should the data be prepared if a prediction file or ground-truth file is empty?
Hi,
I am trying to get the AR (average recall) defined by COCO. I would like to know whether I can get this value directly from the results returned after running python pascalvoc.py.
I noticed that it returns a list 'recall' when running python pascalvoc.py. However, I am not sure about the relationship between this 'recall' and the AR defined by COCO. Can you give me some explanation?
Thanks.
Thank you for the great repo!
How does the code account for the case where the model could not detect anything?
Suppose there is an image with some objects in it, but the model could not detect anything. Should I pass an empty .txt file for that image?
Hello, I've read some other closed issues, but I still don't understand how confidence works in this context.
For instance, for one image my model predicts 3 outputs:
A: confidence 0.9
B: confidence 0.7
C: confidence 0.2
Should I use all of them, even though the confidence for C is low, or should I filter them beforehand (e.g., with a threshold of 0.5)?
I was hoping to find the best confidence threshold to use with my model.
Thanks.
Hi, I really appreciate your efforts on this project. I am writing a paper about my project and would like to use your code to validate my datasets. Could you please add some comparisons and explain how to cite this project?
Thanks for the useful library, this is much needed!
I'm wondering if you would be interested in converting this into a pip module, so that other users would just have to import object_detection_metrics to run your code, instead of vendoring your code into their repos.
We could push a PR for this.
Hi, your example is very clear and I like it very much!
But I have a question: in your example, when more than one detection overlaps a ground truth, the detection with the highest IOU is taken as the TP (e.g., detection E is taken as the TP in Image 2). However, I think that when the IOU condition is satisfied, you should take the one with the highest confidence as the TP (e.g., detection D). I am referring to the first answer from here.
Looking forward to your reply, thank you!
Hello, I really think standardized metrics are needed for OD, nice work.
Are there plans of expanding the project to include the COCO metric?
I don't understand the meaning of, and how to calculate, the accumulated TP (acc TP) and accumulated FP (acc FP) in Table 2.
Can you explain it in detail?
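The accumulated values are running sums over the detections sorted by decreasing confidence; precision and recall are derived from them row by row. A sketch with made-up TP/FP flags:

```python
import numpy as np

# Detections sorted by confidence (descending); 1 = TP, 0 = FP
tp = np.array([1, 0, 1, 1, 0])
fp = 1 - tp
npos = 4  # total number of ground-truth boxes

acc_tp = np.cumsum(tp)               # running count of true positives
acc_fp = np.cumsum(fp)               # running count of false positives
precision = acc_tp / (acc_tp + acc_fp)
recall = acc_tp / npos
print(acc_tp, acc_fp)
print(precision, recall)
```

Each row of the table is then one (precision, recall) point of the curve.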
Hi! I have all my annotations in XML format. I wish to evaluate a model on my own custom dataset (whose classes do not belong to the original VOC classes). Is there a quick way to convert the annotations to the format required for evaluation?
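One possible conversion, assuming the usual Pascal-VOC XML layout (object/name and bndbox with xmin/ymin/xmax/ymax) and targeting the repo's xyrb text format; the function name is mine:

```python
import xml.etree.ElementTree as ET

def voc_xml_to_lines(xml_text):
    """Convert one Pascal-VOC annotation to '<class> <left> <top> <right> <bottom>'
    lines, i.e., the ground-truth format used with -gtformat xyrb."""
    root = ET.fromstring(xml_text)
    lines = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        box = obj.find("bndbox")
        xmin, ymin = box.findtext("xmin"), box.findtext("ymin")
        xmax, ymax = box.findtext("xmax"), box.findtext("ymax")
        lines.append(f"{name} {xmin} {ymin} {xmax} {ymax}")
    return lines

sample = """<annotation><object><name>dog</name>
<bndbox><xmin>10</xmin><ymin>20</ymin><xmax>60</xmax><ymax>90</ymax></bndbox>
</object></annotation>"""
print(voc_xml_to_lines(sample))  # ['dog 10 20 60 90']
```

Writing one such txt file per image (named like the image) matches the repo's expected directory layout.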
Hi,
First of all, thanks for your tools!
I want to execute the example. I just git cloned the project and then typed:
python3 pascalvoc.py
I got an image (different from yours).
After I closed this image, I got the following error:
Traceback (most recent call last):
  File "pascalvoc.py", line 328, in <module>
    cl = metricsPerClass['class']
TypeError: 'NoneType' object is not subscriptable
The result.txt file has this content:
Object Detection Metrics
https://github.com/rafaelpadilla/Object-Detection-Metrics
Average Precision (AP), Precision and Recall per class:
Any suggestion?
Firstly, thank you for providing this code.
I want to know how to draw the average-precision graph for all classes.
I think it would be possible by updating some code in GetPascalVOCMetrics.
Hi. Thanks a lot for this repo. I just wanted to verify/test the code with the example you provide in the Sample2 folder. When I run the code with an IOU threshold of 0.5, I get a mAP of 2.22%. Is this correct?
It just seems like a very low value for an example that I would have assumed had more overlap between ground truths and detections. I just wanted to double-check that this value is expected and correct, as I couldn't find anything in the documentation about the expected mAP for class 'object' in the example folder.
Thanks!
Hi, I read through your example, which is a really nice explanation. However, I don't understand why the precision all becomes zero after recall > 0.466. Can you give some intuition for this?
We can call false negatives those detections we lose after a high confidence threshold.
But at the same time we get FNs from the IoU-matching stage, as rejected ground-truth boxes. For example, on an image with 3 ground-truth boxes we detect only one prediction box, and it has a large enough IoU; doesn't the mAP metric care about the 2 undetected boxes?
Great work! Thanks!
If part of my predicted bounding box is outside the image, should I trim it before running pascalvoc.py?
For example, one of my predicted boxes is described by x, y, w, h. If x < 0, should I set x = 0 before running the script, or should I do nothing and just run it?
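For what it's worth, one common preprocessing choice (not something the repo mandates) is to clip each box to the image bounds before writing the txt file:

```python
def clip_box_xywh(x, y, w, h, img_w, img_h):
    """Clip an (x, y, w, h) box to the image rectangle [0, img_w] x [0, img_h]."""
    x2, y2 = min(x + w, img_w), min(y + h, img_h)  # clip right/bottom edges
    x, y = max(x, 0), max(y, 0)                    # clip left/top edges
    return x, y, x2 - x, y2 - y                    # back to width/height form

print(clip_box_xywh(-5, 10, 50, 40, 100, 100))  # (0, 10, 45, 40)
```

Whether clipping is appropriate depends on how the ground-truth boxes were annotated; if they were clipped at the image border, clipping the detections keeps the comparison consistent.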
Thanks!
Dear @rafaelpadilla ,
I am using your code to evaluate a YOLOv3 model. In your instructions about -gtcoords and -detcoords, you say I should use '-gtcoords rel' for the YOLO model, because the coordinates are relative to the image size.
But when I use the default settings, the code runs successfully. I think it is not necessary, because my ground-truth bounding-box format is the same as my detected bounding-box format.
Can you give me some explanation?
Thanks.
Thank you for the extremely helpful repo.
Is there any way to sort the FP and TP images in the results?
This will be helpful. Thank you.
Hi @rafaelpadilla , good work here.
However, I don't think your implementation penalizes multiple detections of the same object like PASCAL VOC, as described: "However, if multiple detections of the same object are detected, it counts the first one as a positive while the rest as negatives."
Am I missing something? I'd like to know your thoughts on this.
Thank you.
It seems that duplicated detections are currently discarded and not marked as false positives, as they should be.
The code extracted from here:
if iouMax >= IOUThreshold:
    if det[dects[d][0]][jmax] == 0:
        TP[d] = 1  # count as true positive
        # print("TP")
        det[dects[d][0]][jmax] = 1  # flag as already 'seen'
        # - A detected "cat" is overlaped with a GT "cat" with IOU >= IOUThreshold.
else:
    FP[d] = 1  # count as false positive
    # print("FP")
Should be:
if iouMax >= IOUThreshold:
    if det[dects[d][0]][jmax] == 0:
        TP[d] = 1  # count as true positive
        # print("TP")
    else:  ## ADDED
        FP[d] = 1  # count as false positive  ## ADDED
    det[dects[d][0]][jmax] = 1  # flag as already 'seen'
    # - A detected "cat" is overlaped with a GT "cat" with IOU >= IOUThreshold.
else:
    FP[d] = 1  # count as false positive
    # print("FP")
As the original code*:
% assign detection as true positive/don't care/false positive
if ovmax>=VOCopts.minoverlap
    if ~gt(i).diff(jmax)
        if ~gt(i).det(jmax)
            tp(d)=1;            % true positive
            gt(i).det(jmax)=true;
        else                    %% THIS SHOULD BE ADDED
            fp(d)=1;            % false positive (multiple detection)
        end
    end
else
    fp(d)=1;                    % false positive
end
*Download it here and take a look at line 93 of the file VOCevaldet.m inside the VOCcode folder.
Or am I missing something in your code that justifies this?
Thank you.
I opened the project in PyCharm 2017.3 (Community), but there are several red underlines below the code. Does anyone else have the same situation?
Hey there, your implementation is nice and provides a detailed performance evaluation. I want to use it to evaluate my results for moving-object detection in a video, but I am using background subtraction, which outputs a segmentation of the moving object.
Refer to the image below for a simple illustration:
.......ORIGINAL IMAGE.......DETECTION.......GROUND-TRUTH
So in this case, how can I use the approach, since I am not using bounding boxes?
Thank you in advance!
I have ground-truth and detection files in the required format, but I get the error below.
I changed the default values of gtformat and detformat to 'xyrb', as my data is in that format.
Traceback (most recent call last):
  File "pascalvoc.py", line 331, in <module>
    showGraphic=showPlot)
  File "/home/rotu/Downloads/final/keras-frcnn-master/metrics/lib/Evaluator.py", line 187, in PlotPrecisionRecallCurve
    results = self.GetPascalVOCMetrics(boundingBoxes, IOUThreshold, method)
  File "/home/rotu/Downloads/final/keras-frcnn-master/metrics/lib/Evaluator.py", line 106, in GetPascalVOCMetrics
    iou = Evaluator.iou(dects[d][3], gt[j][3])
  File "/home/rotu/Downloads/final/keras-frcnn-master/metrics/lib/Evaluator.py", line 390, in iou
    assert iou >= 0
AssertionError
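An assertion like this typically fires when the format flags do not match the files (e.g., xywh data parsed as xyrb, so right < left and the computed areas go negative). A minimal IoU for corner-format boxes, useful for sanity-checking inputs by hand (illustrative, not the repo's exact implementation):

```python
def iou_xyrb(a, b):
    """IoU for boxes given as (left, top, right, bottom)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))  # intersection width
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))  # intersection height
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou_xyrb((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175, about 0.1429
```

With well-formed corner boxes this value is always in [0, 1]; a swapped format produces negative areas, which is what the assert catches.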
Hi sir, I know how to calculate the precision in the table, but I can't figure out how to calculate the recall in the second table. Recall needs FN; how do I get FN?
Hi Rafael,
I am pretty sure I followed all the instructions properly, but I still can't get any result other than 0 mAP for all my classes. I am attaching the detection and ground-truth files and the CSVs that I used to generate the txt files.
I issued the following command, as my boxes are in the xyrb configuration:
python pascalvoc.py -gt groundtruths_blindenhund_test -det detections_blindenhund_test -gtformat xyrb -detformat xyrb
Really appreciate any help.
csvs with the boxes.zip
detections_blindenhund_test.zip
groundtruths_blindenhund_test.zip
Hi Rafael, thanks for this great explanation. I just wanted to confirm whether the way you explained it matches the way I think is correct.
Considering the example below, with a minimum IOU of 20% and with one ground-truth object and two detections, the one with the higher IOU is counted and the other is considered a FP. When we rank the detections by confidence, we get:
| Detection | Confidence | AccTP | AccFP | Precision | Recall |
|---|---|---|---|---|---|
| Green | .99 | 0 | 1 | 0 | 0 |
| Blue | .30 | 1 | 1 | 0.5 | 1 |
The first row does not make sense to me at all, because we are thresholding the detections by the one with the top confidence and neglecting all the others, so there is no reason to count the green one as a FP. It was counted as a FP only because there was another detection with a higher IOU. Does that make sense to you?
How can I find (left, top, width, height) from the detection images?
Firstly, your project is awesome!
Why do you cut off the last element of mrec in ElevenPointInterpolatedAP (Evaluator.py, line 332)?
argGreaterRecalls = np.argwhere(mrec[:-1] >= r)
You lose one recall point; why?
I think it should be argGreaterRecalls = np.argwhere(mrec[:] >= r)
This can reduce the mAP, especially for a small set of detections.
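The effect of the slice is easy to see on a toy mrec array: for r = 1.0 the [:-1] version finds no recall point at all, while the full array finds one:

```python
import numpy as np

mrec = np.array([0.0, 0.5, 1.0])  # recall values reached by the detections

r = 1.0  # the highest of the 11 recall levels
with_slice = np.argwhere(mrec[:-1] >= r)    # current code: drops the last point
without_slice = np.argwhere(mrec[:] >= r)   # proposed: keeps it
print(with_slice.size, without_slice.size)  # 0 1
```

Whether dropping the last element is intentional (it may be a sentinel appended by the surrounding code) is exactly what the question asks.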
Darknet's output:
('obj_label', confidence, (bounding_box_x_px, bounding_box_y_px, bounding_box_width_px, bounding_box_height_px))
The X and Y coordinates are the center of the bounding box. Subtract half of the width or height to get the left/top corner.
needed output:
<class_name>
What are the left, top, right, and bottom values from Darknet's output?
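Assuming Darknet's (x, y, w, h) with x and y at the box center, the corner values follow directly (a sketch, not Darknet's own code):

```python
def darknet_to_corners(cx, cy, w, h):
    """Convert a center-based (cx, cy, w, h) box to (left, top, right, bottom)."""
    left, top = cx - w / 2, cy - h / 2
    right, bottom = cx + w / 2, cy + h / 2
    return left, top, right, bottom

print(darknet_to_corners(50, 40, 20, 10))  # (40.0, 35.0, 60.0, 45.0)
```

These corner values are what the xyrb text format expects.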
Hello @rafaelpadilla
Thank you for all your support. This repo is extremely helpful.
I have a few questions; could you please help me?
I have 1 class and used Tiny YOLO.
Q1) How should we know which threshold is best to choose?
At a 0.1 threshold I got mAP: 63.07%, lamr: 0.52, FP: 3942 and TP: 1460.
At a 0.3 threshold, mAP: 59.72%, lamr: 0.52, FP: 861 and TP: 1325.
At a 0.5 threshold, mAP: 52.24%, lamr: 0.57, FP: 861 and TP: 1121.
So, is there a way to find an optimal threshold in one go?
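One common way to pick a single operating threshold is to maximize F1 over such runs; a sketch using the TP/FP counts quoted above and a hypothetical ground-truth count npos (not given in the question):

```python
def f1(tp, fp, npos):
    """F1 score from raw counts; npos is the total number of ground-truth boxes."""
    precision = tp / (tp + fp)
    recall = tp / npos
    return 2 * precision * recall / (precision + recall)

npos = 1600  # hypothetical total ground-truth count, for illustration only
runs = {0.1: (1460, 3942), 0.3: (1325, 861), 0.5: (1121, 861)}
best = max(runs, key=lambda t: f1(*runs[t], npos))
print(best)
```

The chosen threshold depends on npos and on how much you weight precision versus recall, so it is a per-application decision rather than a property of the metric.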
Q2) What is the difference between lamr and ROC? (I am not clear on either term.)
Q3) Is there any difference between IOUThreshold and threshold? If so, could you please explain?
Sorry if the questions look dumb or illogical; I am new to the topic.
Thank you for your time.
I am trying out this tool, but it gives a very low mAP (about 2.25%) even though the detection and ground-truth folders contain the same files. I put identical text files in both folders, and it still gives a very low mAP. How is that possible? Is this a problem with the data or with the tool itself?
I am getting the following error while executing the file.
~/Documents/Object-Detection-Metrics $ python pascalvoc.py -h
  File "pascalvoc.py", line 197
    [print(e) for e in errors]
          ^
SyntaxError: invalid syntax
arm@arm-nb-t470p ~/Documents/Object-Detection-Metrics $ python pascalvoc.py -v
  File "pascalvoc.py", line 197
    [print(e) for e in errors]
          ^
SyntaxError: invalid syntax
Therefore, I am unable to execute the script on my generated GT and Detections.
I want to know the procedure for calculating the accumulated TP and accumulated FP.
Thank you for the useful measurement tool!
Is it possible to input more than one prediction folder and plot them on the same picture?
@rafaelpadilla Thanks very much for your contribution. Maybe your code assumes a fixed image size, like
......CoordinatesType.Absolute, (200, 200),........
Would you please explain how to handle different sizes?
How can I calculate FN (false negatives)? Any suggestions?
Thanks
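On the FN question: for a single class, every ground-truth box that is never matched counts as a false negative, so FN can be derived from the totals (the numbers below are illustrative):

```python
# FN = npos - TP, where npos is the total number of ground-truth boxes
# and TP is the number of matched detections for that class.
npos, tp = 162, 150  # example counts
fn = npos - tp
recall = tp / npos
print(fn, round(recall, 4))  # 12 0.9259
```

This is why the metrics never need an explicit FN count: npos and TP already determine it.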
Object-Detection-Metrics/lib/Evaluator.py
Line 102 in a9bb67c
@rafaelpadilla First, thanks for creating this repository. Excellent code, and well explained.
I was wondering whether by any chance you tried running this repo against detections obtained through YOLOv3 COCO with the official weights loaded. I'm only getting mAP@0.5: 40% compared to the advertised mAP@0.5: 55% in the paper. What **nms/confidence/iou** thresholds should be set in order to get the proper mAP as stated in the paper?
Hi,
I ran your code with detections from the darknet detection framework (AlexeyAB branch) using AUC mode. Your code returns the same number of TP and FP as darknet (and obviously the same number of positives), but the mAP is different.
With your repo:
maize - mAP: 91.39 %, TP: 150, FP: 24, npos: 162
bean - mAP: 85.93 %, TP: 151, FP: 41, npos: 171
carrot - mAP: 74.80 %, TP: 112, FP: 51, npos: 134
npos = 467
Darknet map output:
detections_count = 1469, unique_truth_count = 467
name = maize, ap = 94.02%, TP = 150, FP = 24
name = bean, ap = 91.14%, TP = 151, FP = 41
name = carrot, ap = 79.26%, TP = 112, FP = 51
I can't figure out in which repo the error is hiding, if there is one. Do you have any idea?
Thank you for sharing your work; it saves me a lot of time ^^
However, I have a suggestion to improve the computational performance. In Evaluator.py, you call GetPascalVOCMetrics() every time you need to calculate the mAP of a new class. This process is time-consuming, and the returned results are the same for every class.
So you could compute self.results = self.GetPascalVOCMetrics(boundingboxes, IOUThreshold) once in __init__(), and in PlotPrecisionRecallCurve() use for res in self.results: ....
This can save a lot of computation time.
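The suggested memoization could be sketched like this (the class and parameter names are mine, not the repo's):

```python
class CachedEvaluator:
    """Sketch: compute the per-class metrics once per IOU threshold and reuse them."""

    def __init__(self, compute_metrics):
        self._compute = compute_metrics  # e.g., a wrapper around GetPascalVOCMetrics
        self._cache = {}

    def results(self, iou_threshold):
        # Recompute only on a cache miss; subsequent calls are free
        if iou_threshold not in self._cache:
            self._cache[iou_threshold] = self._compute(iou_threshold)
        return self._cache[iou_threshold]

calls = []
ev = CachedEvaluator(lambda t: calls.append(t) or {"iou": t})
ev.results(0.5); ev.results(0.5); ev.results(0.75)
print(len(calls))  # 2
```

The underlying computation runs once per distinct threshold, which is exactly the saving the suggestion describes.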
Hi Rafael,
I get an error at the def's parameter list, where the CoordinatesType class attribute is referenced.
I think utils should be imported first.
kaggle/working/Object-Detection-Metrics/lib/Evaluator.py in ()
     16 import numpy as np
     17
---> 18 from BoundingBox import *
     19 from BoundingBoxes import *
     20 from utils import *

/kaggle/working/Object-Detection-Metrics/lib/BoundingBox.py in ()
      2
      3
----> 4 class BoundingBox:
      5     def __init__(self,
      6                  imageName,

/kaggle/working/Object-Detection-Metrics/lib/BoundingBox.py in BoundingBox()
     10                  w,
     11                  h,
---> 12                  typeCoordinates=CoordinatesType.Absolute,
     13                  imgSize=None,
     14                  bbType=BBType.GroundTruth,