I've just downloaded the TCGA@Focus dataset from Zenodo. It contains 19346 image files, which is more than the number of entries in the file list provided at data/[email protected]. Are the extra image files usable for testing? Here's the folder structure:
Image Patches Database/
├── In Focus [14489 png files]
├── Out of Focus [4006 png files]
└── Out of Focus (Marker) [851 png files]
Additionally, what does the second-to-last column in data/[email protected] mean? As far as I can tell, the first column is the file path, the second is the tissue or organ name, and the last column is the label for out-of-focus classification. Here are some examples from the list:
In Focus/Adrenal Gland/TCGA-OR-A5JB-01Z-00-DX3.A71196CF-F710-49EC-8F1A-2F90F60CDA0C/patch_i_32381_j_37784.png,Adrenal Gland,0,0
Out of Focus/Hypopharynx/TCGA-BB-A5HY-01Z-00-DX1.0BDC44E8-17E5-4907-9C9B-417FF335AEC8/patch_i_17777_j_38759.png,Hypopharynx,0,1
Out of Focus (Marker)/Tonsil/TCGA-DQ-7590-01Z-00-DX1.8319DF99-1663-48DE-BB6A-A156FBAC2EA1/patch_i_50229_j_25458.png,Tonsil,1,1
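
In case it helps to reproduce the mismatch, here is a minimal sketch of how the extra files could be identified. It assumes the list is plain comma-separated text with the relative image path in the first column, as in the examples above. The function names are just placeholders I made up, not anything from the repository.

```python
import csv
from pathlib import Path


def read_listed_paths(list_file):
    """Return the set of relative image paths found in the first column of the list file."""
    with open(list_file, newline="") as f:
        # Skip empty rows; the first field of each row is assumed to be the image path.
        return {row[0] for row in csv.reader(f) if row}


def find_unlisted_images(root, listed):
    """Return the sorted relative paths of PNG files under root that are not in listed."""
    root = Path(root)
    # Use forward slashes so paths compare equal to the list entries on any OS.
    on_disk = {p.relative_to(root).as_posix() for p in root.rglob("*.png")}
    return sorted(on_disk - listed)
```

Calling `find_unlisted_images("Image Patches Database", read_listed_paths(list_file))`, with `list_file` pointing at the file list, should print out exactly the images that appear on disk but not in the list.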