visceral-project / EvaluateSegmentation
A program to evaluate the quality of image segmentations.
License: Apache License 2.0
If the ground truth mask is empty (contains no positive voxels, which can clearly happen in real life), the tool shows this error.
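For anyone hitting this: when the ground truth is empty, the Dice denominator (|GT| + |Seg|) can be zero, so the value has to be defined by convention rather than computed. A minimal numpy sketch of one common convention (my own helper, not the tool's code; the `empty_value` default is an assumption):

```python
import numpy as np

def dice(gt, pred, empty_value=1.0):
    """Dice coefficient that handles empty masks without dividing by zero."""
    gt = np.asarray(gt, bool)
    pred = np.asarray(pred, bool)
    denom = gt.sum() + pred.sum()
    if denom == 0:
        # both masks empty: define the score instead of dividing by zero
        return empty_value
    return 2.0 * np.logical_and(gt, pred).sum() / denom
```

With this convention two empty masks score 1.0 (perfect agreement); other conventions return 0.0 or NaN for that case, which is exactly why a tool needs to handle it explicitly.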
I have two different images:
human.png
model.png
Then I do:
$ EvaluateSegmentation human.png model.png -xml out.xml -use DICE,JACRD,ICCORR
Similarity:
DICE = 0.658228 Dice Coefficient (F1-Measure)
JACRD = 0.490566 Jaccard Coefficient
ICCORR = 0.658228 Interclass Correlation
DICE == ICCORR: coincidence or bug?
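As a side check on the numbers above: Dice and Jaccard are tied by the identity J = D / (2 - D), and the reported values satisfy it. A quick numpy sketch (my own helper, not the tool's code):

```python
import numpy as np

def dice_and_jaccard(gt, pred):
    """Overlap measures for two binary masks."""
    gt = np.asarray(gt, bool)
    pred = np.asarray(pred, bool)
    inter = np.logical_and(gt, pred).sum()
    dice = 2.0 * inter / (gt.sum() + pred.sum())
    jaccard = inter / np.logical_or(gt, pred).sum()
    return dice, jaccard

# the identity linking the two reported values:
d = 0.658228
print(d / (2 - d))  # ≈ 0.490566, the reported JACRD
```

Whether ICCORR must equal DICE on binary input is a question for the authors; at least the DICE/JACRD pair above is internally consistent, so the overlap computation itself is not obviously broken.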
Dear Abdel,
First, thank you very much for your implementation and great review of evaluation metrics.
I may have found a problem with your Hausdorff distance (HD) implementation. In your paper (An Efficient Algorithm for Calculating the Exact Hausdorff Distance), you write that the implementation is easily extendible to the quantile HD (Section 3.6). Unfortunately, I found that the quantile HD based on your algorithm does not yield deterministic results and the results are different compared to other implementations (note that the exact HD seems to be correct). I think the non-deterministic results might arise due to random sampling and the different results to other algorithms due to the early breaking in your algorithm.
I created three ground truth / segmentation examples, which all yield different 95th-percentile HDs in different runs. You can download the examples here (mha and png images) and run your tool (version 2017.04.25) to reproduce my findings:
./EvaluateSegmentation ./EXAMPLE1_GROUNDTRUTH.mha ./EXAMPLE1_SEGMENTATION.mha -use HDRFDST@0.95@
Can you please comment on this and do you have a fix for this issue?
Best,
Fabian
I have been using the 95th percentile Hausdorff distance function in this tool, and have seen that while it calculates the max HD correctly, it seems to have problems with calculating percentile distances, drastically over-reporting the actual value. For example,
$ EvaluateSegmentation label1.nii.gz label2.nii.gz -use HDRFDST@0.95@
this command will give me an HD95 of 12.845233 (units: voxel). The average HD of these two labels (-use AVGDIST) is 0.159714 voxels. Looking at a different HD95 implementation (https://github.com/deepmind/surface-distance) for the same example, it gives the same max HD as this tool, but a much lower 95th percentile HD of 2.23 voxels (which seems to be correct, as this implementation also outputs the raw distance values generated and I could calculate the percentile myself). Could you please look into this issue?
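Both percentile-HD reports above can be checked against a brute-force reference. The sketch below (my own code, not the tool's algorithm) collects the full set of surface-to-mask distances via distance transforms and takes an exact percentile, so it is deterministic by construction; the naming and the surface definition (foreground voxels with a background 4-neighbour) are my assumptions:

```python
import numpy as np
from scipy import ndimage

def percentile_hd(a, b, q=95.0):
    """q-th percentile Hausdorff distance between two binary masks.

    Deterministic reference: no sampling and no early termination; it
    computes every surface-voxel-to-other-mask distance and takes an
    exact percentile of each directed distance set.
    """
    a = np.asarray(a, bool)
    b = np.asarray(b, bool)
    # surface voxels: foreground voxels with at least one background neighbour
    sa = a & ~ndimage.binary_erosion(a)
    sb = b & ~ndimage.binary_erosion(b)
    # distance of every voxel to the nearest foreground voxel of the other mask
    d_to_b = ndimage.distance_transform_edt(~b)
    d_to_a = ndimage.distance_transform_edt(~a)
    return max(np.percentile(d_to_b[sa], q), np.percentile(d_to_a[sb], q))
```

With q=100 this reproduces the classic (maximum) HD; comparing its q=95 output against the tool's HDRFDST@0.95@ on the same masks would show directly whether the tool over-reports.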
Dear sir,
How do I use your code?
Thank you very much for providing this tool. It would be helpful if versions of it were published; this can be done with GitHub releases.
Releasing versions of this tool would have several benefits.
What I am suggesting should not take too much effort, and I am happy to help if people would find it useful. Having even one published version would be particularly helpful.
Average displacement is among the best measures of segmentation accuracy. Adding it to the tool's output would round out its set of accuracy measures.
I am suggesting this because the average displacement I compute with other software does not match the one currently given. The best would be the average distance between both sides.
Dorian
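For reference, the "average distance between both sides" described above is usually called the average symmetric surface distance (ASSD). A minimal numpy/scipy sketch (my naming and surface definition, not the tool's code):

```python
import numpy as np
from scipy import ndimage

def assd(a, b):
    """Average symmetric surface distance between two binary masks:
    the mean, over all surface voxels of both masks, of the distance
    to the nearest surface voxel of the other mask."""
    a = np.asarray(a, bool)
    b = np.asarray(b, bool)
    sa = a & ~ndimage.binary_erosion(a)  # surface voxels of a
    sb = b & ~ndimage.binary_erosion(b)  # surface voxels of b
    d_a_to_b = ndimage.distance_transform_edt(~sb)[sa]  # a-surface -> nearest b-surface
    d_b_to_a = ndimage.distance_transform_edt(~sa)[sb]  # b-surface -> nearest a-surface
    # pool both directed distance sets, then average ("between both sides")
    return np.concatenate([d_a_to_b, d_b_to_a]).mean()
```

Note there are competing conventions (mean of the two directed means vs. mean over the pooled set, as here), which may explain part of the mismatch with other software.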
Hi, while trying both -use all and -use ...,COEFVAR,... along with other options, I found that the generated xml file contains only 21 evaluation metrics; the missing one is COEFVAR. Could you please fix this issue?
Thanks
Naved
First, let me thank you for this great tool.
I am using your tool for some automatic lesion segmentation. My question is: what do the Hausdorff values represent, a number of voxels or the metric distance in millimeters?
Dorian
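In general, whether a surface-distance value comes out in voxels or in millimetres depends on whether the voxel spacing from the image header is applied. As an illustration (my own sketch, unrelated to the tool's internals), scipy's Euclidean distance transform takes the spacing via its `sampling` parameter:

```python
import numpy as np
from scipy import ndimage

# Two single-voxel masks, one step apart along the last axis.
a = np.zeros((1, 1, 4), bool); a[0, 0, 0] = True
b = np.zeros((1, 1, 4), bool); b[0, 0, 1] = True

# Distance in voxel units vs. in millimetres with 0.5 mm spacing on that axis.
d_vox = ndimage.distance_transform_edt(~b)[a].max()                        # 1.0 voxel
d_mm = ndimage.distance_transform_edt(~b, sampling=(1, 1, 0.5))[a].max()   # 0.5 mm
```

The same mask pair thus yields 1.0 or 0.5 depending only on whether spacing is applied, which is why the units question matters.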
Hey,
I noticed that the Rand Index here is calculated as (a + d)/(a + b + c + d).
But shouldn't it be (a + b)/(a + b + c + d)?
Best regards
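For what it's worth, both formulas can denote the same quantity depending on how the four pair counts are named: the Rand index is the fraction of point pairs on which the two partitions agree (pairs grouped together in both, plus pairs separated in both), and whether the "separated in both" count is called b or d is purely notational. A brute-force sketch of the definition itself (a hypothetical helper, not the tool's code):

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two labelings agree
    (together in both, or apart in both)."""
    agree = total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        total += 1
        agree += (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
    return agree / total
```

For example, `rand_index([0, 0, 1, 1], [1, 1, 0, 0])` is 1.0 even though the label values differ, because every pairing decision agrees. Checking the implementation against a brute-force version like this would settle whether it matches the definition regardless of variable naming.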
I have tried to run the Windows pre-compiled exe on a large volume (1024, 1024, 500). The output is "Memory allocation 2 !" I assume this means that the program cannot allocate sufficient memory. Is there any workaround (other than using smaller datasets)?
When I run EvaluateSegmentation -help, I do not see the full help message. I did some debugging, and the problem is related to the for loop referenced below:
EvaluateSegmentation/source/EvaluateSegmentation.cxx
Lines 64 to 68 in a10b393
in other words, the following section is not shown in the help message:
-default or -def =reads default options from a file default.txt in the current folder. All the options above except image filenames can be used as defaults. Default options are overridden by options given in the command line.
Example:
/usr/local/bin/EvaluateSegmentation groundtruth.nii segment.nii -use RNDIND,HDRFDST@0.96@,FMEASR@0.5@ -xml result.xml
2)For help on evaluation of landmark, type: /usr/local/bin/EvaluateSegmentation -loc
3)For help on lesion detection evaluation, type: /usr/local/bin/EvaluateSegmentation -det
---** VISCERAL 2013, www.visceral.eu **---
I was able to fix it with the following, but this isn't a true fix, because there must be an underlying issue somewhere else.
if (i == (METRIC_COUNT - 1)) {
    break;  // workaround only, not a root-cause fix
}
Previous versions of this tool did not have this issue (for example, the earlier builds in the builds directory), but the most recent Ubuntu build does.
I don't know C/C++, so I'm afraid I can't help much with further debugging.
When I run the exe on my Windows machine, I get this error:
itk::ExceptionObject (01B8F780)
Location: "unknown"
File: c:\itk 4.4.2\source\modules\core\common\include\itkMatrix.h
Line: 240
Description: itk::ERROR: Singular matrix. Determinant is 0.
Can you help me out?
I noticed the Ubuntu build in EvaluateSegmentation-2020.08.28-Ubuntu.zip crashes when comparing two .nii volumes without providing the -xml argument. When providing the -xml argument, it executes successfully. I am using Ubuntu 20.04.2.
./EvaluateSegmentation femur_left_manual.nii femur_left_automatic.nii -help -use DICE
returns:
Similarity:
Segmentation fault (core dumped)
./EvaluateSegmentation femur_left_manual.nii femur_left_automatic.nii -help -use DICE -xml results.xml
returns:
Similarity:
DICE = 0.942233 Dice Coefficient (F1-Measure)
Distance:
Classic Measures:
Total execution time= 1671 milliseconds
---** VISCERAL 2013, www.visceral.eu **---
Hi guys, great little tool that usually works well. However, I sometimes encounter a situation where, after starting to process two labels, the tool seems to do something but does not yield any results and returns to a blinking cursor without any output. Any idea what this signifies?
Hello,
I want to evaluate a multiclass segmentation and I used your tool (which is efficient and easy to use, thanks!), but I guess I found an error when computing the metrics between the background of my ground truth segmentation and the background of my test segmentation. I created a nifti file containing 1 for the background and 0 for the other classes. Then since I evaluate small classes, this segmentation (background) fills nearly all the field of view of my nifti file.
But when I compute the metrics between these two segmentations (ground truth background and test background), I get an xml file with wrong values, for instance a DICE of 0.5, which is clearly an underestimate, or a number of false positive voxels that is way too high.
It seems that the 2 segmentations files are seen as shifted in relation to each other. Of course I checked their volume, resolution, and transform matrix but all these properties are identical for the two images. I opened the 2 files with FSL and ITKSnap, to ensure their overlap and they are definitely not shifted.
So I don't really know where this error comes from, and I am afraid it could also have occurred when comparing the other classes of my multiclass experiment.
Do you have an idea about how to fix this issue?
Thanks a lot for your help !
PS: In order to reproduce this issue, I attach the 2 nifti files containing the segmentations and a txt file containing the resulting metrics (I just converted the output xml file to txt since GitHub doesn't support xml attachments).
GroundThruth_Seg.nii.gz
Test_Seg.nii.gz
Metric_computed.txt
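To rule out the mask-construction step as the culprit, the per-class Dice values can be recomputed directly on the label arrays with numpy, ignoring all header information (a sketch with hypothetical names; it assumes the two volumes are already loaded as equally-shaped arrays):

```python
import numpy as np

def per_class_dice(gt_labels, pred_labels, classes):
    """Dice per class, treating each label value (background 0 included)
    as its own binary mask."""
    scores = {}
    for c in classes:
        gt = (np.asarray(gt_labels) == c)
        pred = (np.asarray(pred_labels) == c)
        denom = gt.sum() + pred.sum()
        scores[c] = 2.0 * np.logical_and(gt, pred).sum() / denom if denom else 1.0
    return scores
```

If this plain voxel-wise computation gives the expected background Dice while the tool reports 0.5, the discrepancy most likely comes from how the two headers (orientation/affine) are interpreted, not from the voxel data itself.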
When running your pre-compiled Ubuntu package, the -use METHOD flag does not seem to have any effect on which measures are actually computed.
I tried with simply:
./EvaluateSegmentation labels.tif segmentation.tif -use RNDIND -xml result.xml
And wanting to exclude the time-intensive AVGDIST with:
./EvaluateSegmentation labels.tif segmentation.tif -use DICE,JACRD,GCOERR,VOLSMTY,KAPPA,AUC,RNDIND,ADJRIND,ICCORR,MUTINF,FALLOUT,COEFVAR,VARINFO,PROBDST,MAHLNBS,SNSVTY,SPCFTY,PRCISON,FMEASR@0.5@,ACURCY -xml result.xml
and in both cases all measures are being computed.
Thanks anyway for the excellent tool that came really handy in evaluating our segmentations :)