
ai-tod's People

Contributors

chasel-tsui, jwwangchn

ai-tod's Issues

How to convert the AI-TOD dataset to YOLO format

Hi, following the steps in the README I ran generate_aitod_imgs.py, which generated the aitodtoolkit/xview/xview_aitod_sets folder.
Does this automatically update the dataset in the aitod folder to the complete dataset?

If so, the structure under the aitod folder is 1) annotations 2) images.
If I want to generate label files with a .txt suffix, what should I do next? Does your code provide a method for this?
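In case it helps, the AI-TOD annotations are COCO-style JSON, so a minimal conversion sketch could look like the following (the annotation path and the 1-based category ids are assumptions to adapt; re-running appends to existing .txt files):

import json
import os

with open('AI-TOD/annotations/aitod_trainval_v1_1.0.json') as f:  # assumed path
    coco = json.load(f)

id_to_img = {img['id']: img for img in coco['images']}
os.makedirs('labels', exist_ok=True)
for ann in coco['annotations']:
    img = id_to_img[ann['image_id']]
    x, y, w, h = ann['bbox']                 # COCO stores top-left corner + size
    cx = (x + w / 2) / img['width']          # YOLO wants normalized centers
    cy = (y + h / 2) / img['height']
    nw, nh = w / img['width'], h / img['height']
    cls = ann['category_id'] - 1             # YOLO class ids are 0-based
    txt = os.path.splitext(img['file_name'])[0] + '.txt'
    with open(os.path.join('labels', txt), 'a') as f:
        f.write(f'{cls} {cx:.6f} {cy:.6f} {nw:.6f} {nh:.6f}\n')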

ModuleNotFoundError: No module named 'lxml'

When generating the AI-TOD images by running python generate_aitod_imgs.py, I get the following error:

File "/root/autodl-tmp/mmdet-rfla-main/data/AI-TOD/aitodtoolkit/wwtool/wwtool/datasets/dump.py", line 3, in
import lxml.etree as ET
ModuleNotFoundError: No module named 'lxml'

Any pointers would be appreciated, thanks!
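Assuming a standard pip environment, the likely fix is simply to install the missing package:

pip install lxml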

How to adapt the AI-TOD dataset to other frameworks

Hello! I am very interested in the contribution this dataset makes, but I have only found support for it in mmdetection.
I would like to train on this dataset in other codebases, e.g. faster-rcnn.pytorch. How should I go about this? Could you point me in the right direction? The source code only supports the COCO and VOC datasets, and AI-TOD differs slightly in its evaluation metrics.
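One possible direction: the AI-TOD annotations are standard COCO-format JSON, so most codebases with a COCO data loader (faster-rcnn.pytorch included) can usually be pointed at them directly; only the tiny-object metrics (AP_vt, AP_t) need a custom evaluator. A quick check that the files parse as COCO (a sketch; the path is an assumption):

from pycocotools.coco import COCO

coco = COCO('AI-TOD/annotations/aitod_trainval_v1_1.0.json')  # assumed path
print(len(coco.imgs), 'images,', len(coco.anns), 'annotations')
print([cat['name'] for cat in coco.loadCats(coco.getCatIds())])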

Emergency!!! Error using the AI-TOD dataset

Greetings! I have downloaded the AI-TOD dataset correctly, but when I train on it, I get the following error:

……………… Illegal image file: /root/autodl-tmp/AI-TOD/train/3b444d792.png, and it will be ignored Traceback (most recent call last)
assert ct > 0, 'not found any coco record in %s' % (anno_path) AssertionError: not found any coco record in /root/autodl-tmp/AI-TOD/annotations/aitod_trainval_v1.json

I can confirm that the dataset path is correct and the framework is fine, because switching to a different dataset trains without problems. I'm not sure what is going on: many of the images are ruled illegal and therefore ignored. The JSON file won't open either, so I cannot locate the problem.
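A quick way to check whether the annotation file itself is intact (a minimal sketch; the path is taken from the error message above):

import json

with open('/root/autodl-tmp/AI-TOD/annotations/aitod_trainval_v1.json') as f:
    annos = json.load(f)  # raises JSONDecodeError if the file is truncated
print(annos.keys())
print(len(annos.get('images', [])), len(annos.get('annotations', [])))

If json.load fails here, the annotation download is likely incomplete, which would also be consistent with the JSON file refusing to open.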

Very much looking forward to your reply!

Python tool generate_aitod_imgs.py doesn't generate a combined AI-TOD and xView dataset with JSON annotations for use in MMDET-AITOD training

I was able to use the AI-TOD V2 dataset with MMDET-AITOD training by referencing the AI-TOD V2 JSON annotations. All worked well: the model converges and inferences look good. But you describe using the 'generate_aitod_imgs.py' tool to create a full AI-TOD and xView dataset; I assume this is to get an even bigger dataset? When I run the tool, it creates many directories, but the only new directory that looks like the full set is 'xview_aitod_sets' in the xview folder, not the aitod folder, which is not what the tool description says. The 'xview_aitod_sets' folder does contain what looks like a full set of splits (train, val, trainval, and test), but it is NOT in JSON format like AI-TOD. It is in an images/labels layout (text files corresponding to image files), which is more like YOLO. There is no annotations JSON file with the AI-TOD and xView labels together.

This is NOT what your MMDET-AITOD seems to be looking for in the 'mmdet-aitod/mmdet-nwdrka/configs_nwdrka/nwd_rka' V2 config files; they expect JSON annotation files.

Questions:

  1. Is AI-TOD just the vehicles from xView, or is it a different dataset that can be COMBINED with the xView vehicles using your tool?
  2. If it is a different dataset, then why is the full set that is written out when running your 'generate_aitod_imgs.py' tool not in the AI-TOD format so it can also be used with MMDET-AITOD?
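In case others hit the same mismatch, here is a minimal sketch of wrapping such an images/labels layout back into a COCO-style JSON. The paths and the per-line label layout ('class x1 y1 x2 y2') are assumptions; adjust them to whatever the toolkit actually writes:

import glob
import json
import os

from PIL import Image

images, annotations = [], []
ann_id = 0
for img_id, img_path in enumerate(
        sorted(glob.glob('xview_aitod_sets/trainval/images/*.png'))):
    with Image.open(img_path) as im:
        w, h = im.size
    images.append(dict(id=img_id, file_name=os.path.basename(img_path),
                       width=w, height=h))
    label_path = img_path.replace('/images/', '/labels/').rsplit('.', 1)[0] + '.txt'
    with open(label_path) as f:
        for line in f:
            cls, x1, y1, x2, y2 = line.split()[:5]
            x1, y1, x2, y2 = map(float, (x1, y1, x2, y2))
            annotations.append(dict(
                id=ann_id, image_id=img_id, category_id=int(cls),
                bbox=[x1, y1, x2 - x1, y2 - y1],  # COCO stores [x, y, w, h]
                area=(x2 - x1) * (y2 - y1), iscrowd=0))
            ann_id += 1

coco = dict(images=images, annotations=annotations,
            categories=[dict(id=i, name=str(i)) for i in range(1, 9)])
with open('xview_aitod_trainval_coco.json', 'w') as f:
    json.dump(coco, f)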

The question about M-CenterNet

Hello! There are four center points in M-CenterNet. In CenterNet, the center point target is modeled as a Gaussian distribution; how do you deal with this?
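For context, standard CenterNet builds its training target by splatting a 2D Gaussian around each object center on the heatmap; M-CenterNet's exact handling of its four points is not spelled out here, so the sketch below only shows the widely used single-center version:

import numpy as np

def gaussian2d(shape, sigma=1.0):
    """Return a 2D Gaussian kernel of the given (h, w) shape."""
    m, n = [(s - 1) / 2 for s in shape]
    y, x = np.ogrid[-m:m + 1, -n:n + 1]
    return np.exp(-(x * x + y * y) / (2 * sigma * sigma))

def draw_gaussian(heatmap, center, radius):
    """Splat a Gaussian peak at `center`, keeping the element-wise max."""
    diameter = 2 * radius + 1
    g = gaussian2d((diameter, diameter), sigma=diameter / 6)
    x, y = center
    h, w = heatmap.shape
    left, right = min(x, radius), min(w - x, radius + 1)
    top, bottom = min(y, radius), min(h - y, radius + 1)
    masked_hm = heatmap[y - top:y + bottom, x - left:x + right]
    masked_g = g[radius - top:radius + bottom, radius - left:radius + right]
    np.maximum(masked_hm, masked_g, out=masked_hm)
    return heatmap

hm = np.zeros((128, 128), dtype=np.float32)
draw_gaussian(hm, center=(40, 60), radius=2)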

It would be nice to publish the source code of the proposed detector (M-CenterNet)

In my view, if you publish a new dataset and propose a detector that achieves better accuracy on it, you should also release that detector's code. Then we can train it or load your pretrained weights, see how well it works, and decide whether to use your dataset and follow your training technique. It is also worth noting that your paper proposes a detector specifically for this dataset.

Metric computation question

Hi, when computing metrics with pycocotools I can only get AP_small and AP_medium. Could you advise how to compute AP_tiny and AP_vt?
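For reference, one way to get the extra size buckets with plain pycocotools is to override the evaluation area ranges (a sketch; the 2/8/16/32-pixel thresholds follow the AI-TOD paper's size definitions, the paths are assumptions, and the authors' modified pycocotools may differ in details):

from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO('aitod_test_v1_1.0.json')
coco_dt = coco_gt.loadRes('results.bbox.json')
ev = COCOeval(coco_gt, coco_dt, 'bbox')
# Areas are in squared pixels: very tiny 2-8 px, tiny 8-16, small 16-32, medium 32+.
ev.params.areaRng = [[0, 1e10], [2 ** 2, 8 ** 2], [8 ** 2, 16 ** 2],
                     [16 ** 2, 32 ** 2], [32 ** 2, 1e10]]
ev.params.areaRngLbl = ['all', 'verytiny', 'tiny', 'small', 'medium']
ev.evaluate()
ev.accumulate()
# summarize() hardcodes the default labels, so read the results directly:
prec = ev.eval['precision']  # [iou_thrs, recall, classes, areas, max_dets]
for a, lbl in enumerate(ev.params.areaRngLbl):
    vals = prec[:, :, :, a, -1]
    print(lbl, vals[vals > -1].mean())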

Unexpected results from the DotD paper

Hi,

I came across your interesting paper
Dot Distance for Tiny Object Detection in Aerial Images.

I tried to implement it in the mmdetection framework and tested it on our in-house dataset; however, I got unexpected results. Replacing the IoU metric used for assignment with DotD showed a degradation of 0.8 mAP.

I attached here the code I wrote.

import math

import torch
# Registry import path for mmdet 2.x (an assumption about the target version).
from mmdet.core.bbox.iou_calculators.builder import IOU_CALCULATORS


@IOU_CALCULATORS.register_module()
class DotDistOverlaps:

    def __init__(self, average_size, scale=1., dtype=None):
        # DotD normalizes center distances by the dataset-wide average
        # object size S; here the constructor receives the average *area*,
        # so take the square root to obtain a length.
        self.average_size = math.sqrt(average_size)
        self.scale = scale
        self.dtype = dtype

    def __call__(self, bboxes1, bboxes2):
        """Calculate DotD between 2D bboxes.

        Args:
            bboxes1 (Tensor): bboxes of shape (m, 4) in <x1, y1, x2, y2>
                format, or shape (m, 5) in <x1, y1, x2, y2, score> format.
            bboxes2 (Tensor): bboxes of shape (n, 4) in <x1, y1, x2, y2>
                format, shape (n, 5) in <x1, y1, x2, y2, score> format, or
                empty.

        Returns:
            Tensor: shape (m, n)
        """
        assert bboxes1.size(-1) in [0, 4, 5]
        assert bboxes2.size(-1) in [0, 4, 5]
        if bboxes2.size(-1) == 5:
            bboxes2 = bboxes2[..., :4]
        if bboxes1.size(-1) == 5:
            bboxes1 = bboxes1[..., :4]

        # Box centers, each of shape (m,) / (n,).
        centroids1_cx = (bboxes1[..., 0] + bboxes1[..., 2]) / 2.0
        centroids1_cy = (bboxes1[..., 1] + bboxes1[..., 3]) / 2.0
        # (m, 2)
        centroids1 = torch.stack((centroids1_cx, centroids1_cy), dim=1)

        centroids2_cx = (bboxes2[..., 0] + bboxes2[..., 2]) / 2.0
        centroids2_cy = (bboxes2[..., 1] + bboxes2[..., 3]) / 2.0
        # (n, 2)
        centroids2 = torch.stack((centroids2_cx, centroids2_cy), dim=1)

        # Pairwise Euclidean distance between centers, shape (m, n).
        distances = (centroids1[:, None, :] -
                     centroids2[None, :, :]).pow(2).sum(-1).sqrt()
        # DotD = exp(-distance / S)
        dotd = torch.exp(-distances / self.average_size)
        return dotd

I am not sure if I missed something there. Could you please check that?

Moreover, I tried to download the AI-TOD dataset to test my implementation on it, but I could not. Could you please advise me on how to get the data?
Thanks,

Mohammed Jabreel

About the annotation files of AI-TOD

Hi,

I followed the original instructions and the aitodtoolkit and obtained some annotation files, but I find the instance counts are not consistent with those reported in the AI-TOD paper. So I would like to know whether there are any processed annotation files that can be used directly.

[Screenshot: the instance-count table from the AI-TOD paper, for comparison]

My results:

category_id   aitod_train.json   aitod_val.json   aitod_trainval.json   aitod_test.json
1                    623              170                 793               745
2                    512              140                 652               689
3                   5269             2477                7746              5860
4                  13539             3791               17330             17633
5                    293               34                 327               292
6                 248051            59906              307957            306678
7                  14126             3841               17967             15443
8                    176               67                 243               290
total             282589            70426              353015            347630
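For anyone wanting to reproduce such counts, a minimal sketch (assuming the standard COCO-style JSON files):

import json
from collections import Counter

with open('aitod_train.json') as f:
    annos = json.load(f)
counts = Counter(ann['category_id'] for ann in annos['annotations'])
for cat_id in sorted(counts):
    print(cat_id, counts[cat_id])
print('total', sum(counts.values()))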

Clarification Needed on Bounding Box Coordinate Conversion for AI-TOD Dataset in YOLO Format

Hello,

I have converted the AI-TOD dataset to the YOLO format. In your paper, you describe the bounding box representations as bi = (cxi, cyi, wi, hi), where cxi and cyi are the center coordinates. During the conversion process, I interpreted cxi and cyi as the actual center coordinates. However, I encountered an issue during training: YOLO does not accept negative values, and some of my bounding box representations contain negative values.

I believe this issue arose because I might have misunderstood the representation of cxi and cyi. To clarify, should I have calculated these center coordinates by summing the two values and then dividing by two? Or was my initial interpretation of using cxi and cyi as direct center coordinates correct?

I'm seeking clarification to understand if my conversion approach was incorrect. Any guidance would be greatly appreciated.
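For what it's worth, the usual source of negative values in such conversions is treating the stored bbox as (cx, cy, w, h) when COCO-style JSON actually stores (x_min, y_min, w, h). Assuming that layout (ann, img_width, and img_height stand in for values read from the JSON):

x, y, w, h = ann['bbox']       # top-left corner plus size
cx = (x + w / 2) / img_width   # normalized center for YOLO
cy = (y + h / 2) / img_height
nw, nh = w / img_width, h / img_height

Clamping each box to the image bounds before normalizing also guards against boxes that slightly overhang the border.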

train on my own dataset

Hello! Thanks for your work, but how do I train on my own dataset? Should I just organize the dataset like this?
[Screenshot of a dataset directory layout]

Error in tiff images while executing python generate_aitod_imgs.py.

Hello Sir,

I followed the steps as described and got to the step of running "python generate_aitod_imgs.py". However, my terminal shows the error depicted in the screenshot below. Please let me know how I can fix it, and whether the error lies only in some of the tiff images or in something else. Thanks.

[Screenshot of the error output]
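A quick way to find out which .tif files (if any) are unreadable before running the toolkit (a sketch using Pillow; the directory is an assumption):

import glob
from PIL import Image

for path in sorted(glob.glob('xview/train_images/*.tif')):
    try:
        with Image.open(path) as im:
            im.verify()  # cheap integrity check, does not decode the full image
    except Exception as e:
        print('corrupt:', path, e)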

AI-TOD labels

Hi, where can I download the aitod_training_v1.0.json and aitod_validation_v1.0.json files mentioned in the config files?
After unpacking complete_annotations, I could not find aitod_training_v1.0.json or aitod_validation_v1.0.json.
Thanks, looking forward to your reply.

AP does not reach the result reported in the paper

Hi, I ran detectors_cascade_rcnn_r50_aitod_rpn_nwd.py on the AI-TOD test set, but the performance does not reach the accuracy reported in the paper (20.8 AP).
I made the following modifications to detectors_cascade_rcnn_r50_aitod_rpn_nwd.py:
1) For training, I changed detectors_cascade_rcnn_r50_aitod_rpn_nwd to run on two GPUs with 4 images per GPU, keeping the batch size at 8.

2) For inference, I changed the image and label paths used for inference:
ann_file='data/AI-TOD/annotations/aitod_test_v1_1.0.json',
img_prefix='data/AI-TOD/test/',

I obtained the test-set performance with the following command:
python tools/test.py work_dirs/nwd/detectors_cascade_rcnn_r50_aitod_rpn_nwd/detectors_cascade_rcnn_r50_aitod_rpn_nwd.py work_dirs/nwd/detectors_cascade_rcnn_r50_aitod_rpn_nwd/epoch_12.pth --eval bbox

3) The detailed AP results are:
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 14018/14018, 2.3 task/s, elapsed: 6169s, ETA: 0s
Evaluating bbox...
Loading and preparing results...
DONE (t=6.76s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type bbox
DONE (t=2665.67s).
Accumulating evaluation results...
DONE (t=29.74s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=1500 ] = 0.187
Average Precision (AP) @[ IoU=0.25 | area= all | maxDets=1500 ] = -1.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=1500 ] = 0.451
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=1500 ] = 0.126
Average Precision (AP) @[ IoU=0.50:0.95 | area=verytiny | maxDets=1500 ] = 0.041
Average Precision (AP) @[ IoU=0.50:0.95 | area= tiny | maxDets=1500 ] = 0.175
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1500 ] = 0.266
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1500 ] = 0.352
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.283
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=300 ] = 0.300
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=1500 ] = 0.303
Average Recall (AR) @[ IoU=0.50:0.95 | area=verytiny | maxDets=1500 ] = 0.054
Average Recall (AR) @[ IoU=0.50:0.95 | area= tiny | maxDets=1500 ] = 0.317
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1500 ] = 0.383
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1500 ] = 0.424
Optimal LRP @[ IoU=0.50 | area= all | maxDets=1500 ] = -1.000
Optimal LRP Loc @[ IoU=0.50 | area= all | maxDets=1500 ] = -1.000
Optimal LRP FP @[ IoU=0.50 | area= all | maxDets=1500 ] = -1.000
Optimal LRP FN @[ IoU=0.50 | area= all | maxDets=1500 ] = -1.000
#Class-specific LRP-Optimal Thresholds #
[-1. -1. -1. -1. -1. -1. -1. -1.]

4) My environment:
Python: 3.7.15 (default, Nov 24 2022, 21:12:53) [GCC 11.2.0]
CUDA available: True
GPU 0,1: NVIDIA A100 80GB PCIe
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.1.TC455_06.29069683_0
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
PyTorch: 1.10.0+cu111
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • LAPACK is enabled (usually provided by MKL)
  • NNPACK is enabled
  • CPU capability usage: AVX512
  • CUDA Runtime 11.1
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  • CuDNN 8.0.5
  • Magma 2.5.2
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, …

TorchVision: 0.11.0+cu111
OpenCV: 4.6.0
MMCV: 1.3.5
MMCV Compiler: GCC 9.4
MMCV CUDA Compiler: 11.1
MMDetection: 2.13.0+

What could be causing this? Looking forward to your reply, thanks!

No reply to evaluation email.

Could you provide me with a sample file for evaluation (such as a sample.zip for submission)? Should it be in DOTA format or COCO format?
The evaluation email address has not responded to my message so far.
Hope to hear from you, thanks.

xView dataset problem

I downloaded the xView dataset from the official website and found that in train_images.zip, many images from 1604.tif onward are corrupted. Could anyone provide a working download link?

Test set contaminated with Trainval objects!

Hi there!

I was hoping to use the AI-TOD dataset, but noticed that both v1 and v2 have contaminated the test set with trainval objects and parts of trainval images.

This is happening because the overlapping crops from each xView dataset image are split across the train, val, and test sets.

Additionally, this inflates the total number of objects.

Here are the stats I’ve found:

V1:
Total number of contaminating xview bboxes: 84704
Total number of unique xview bboxes: 308286
Total number of xview bboxes: 443943
Total number of bboxes: 700621

V2:
Total number of contaminating xview bboxes: 44801
Total number of unique xview bboxes: 419747
Total number of xview bboxes: 475857
Total number of bboxes: 752746

Given this overlap on xView, I'm concerned there may be additional contamination in the non-xView images. But without access to the script that creates the non-xView image crops, I'm not able to check this.

I sent an email 2 months ago regarding this but haven't heard back yet. Let me know if I’m mistaken!

Here's the script I created to measure these numbers:

import json
import os
import numpy as np
from tqdm import tqdm


def process_annos(annos):

    id_to_filename = dict()
    print('Process Images:')
    for img_ann in tqdm(annos['images']):
        filename = img_ann['file_name']
        filename_arr = filename.split('_')
        if len(filename_arr) == 4:
            id_ = img_ann['id']
            id_to_filename[id_] = filename

    xview_id_to_ori_bboxes = dict()
    print('Process Annotations:')
    for ann in tqdm(annos['annotations']):
        if ann['iscrowd']:
            continue
        if ann['image_id'] not in id_to_filename:
            continue

        filename = id_to_filename[ann['image_id']]
        filename = os.path.splitext(filename)[0]
        filename_arr = filename.split('_')
        xview_id = filename_arr[0]
        offsets = filename_arr[-2:]
        offsets = np.array([float(coord) for coord in offsets])

        bbox = ann['bbox']
        bbox = np.array(bbox)
        bbox[:2] += offsets

        if xview_id not in xview_id_to_ori_bboxes:
            xview_id_to_ori_bboxes[xview_id] = []

        xview_id_to_ori_bboxes[xview_id].append(bbox)

    return xview_id_to_ori_bboxes


aitod_test_filename = './aitod_test_v1_1.0.json'
# aitod_test_filename = './AI-TOD-v2/aitodv2_test.json'
with open(aitod_test_filename, 'r') as f:
    annos_test = json.load(f)
print('Process Test:')
xview_id_to_ori_bboxes_test = process_annos(annos_test)


aitod_trainval_filename = './aitod_trainval_v1_1.0.json'
# aitod_trainval_filename = './AI-TOD-v2/aitodv2_trainval.json'
with open(aitod_trainval_filename, 'r') as f:
    annos_trainval = json.load(f)
print('Process Trainval:')
xview_id_to_ori_bboxes_trainval = process_annos(annos_trainval)


total_num_bboxes = len(annos_test['annotations']) + len(annos_trainval['annotations'])

total_num_contaminating_bboxes = 0
total_num_xview_bboxes = 0
total_num_unique_xview_bboxes = 0
for xv_id, bboxes_test in xview_id_to_ori_bboxes_test.items():
    if xv_id not in xview_id_to_ori_bboxes_trainval:
        continue
    bboxes_train = xview_id_to_ori_bboxes_trainval[xv_id]

    num_bboxes_test = len(bboxes_test)
    num_bboxes_train = len(bboxes_train)
    total_num_xview_bboxes += num_bboxes_test
    total_num_xview_bboxes += num_bboxes_train

    # duplicates also exist among test bboxes, remove these
    bboxes_test = np.unique(bboxes_test, axis=0)
    # duplicates also exist among trainval bboxes, remove these
    bboxes_train = np.unique(bboxes_train, axis=0)

    num_unique_bboxes_test = len(bboxes_test)
    num_unique_bboxes_train = len(bboxes_train)
    total_num_unique_xview_bboxes += num_unique_bboxes_test
    total_num_unique_xview_bboxes += num_unique_bboxes_train

    all_bboxes = np.concatenate((bboxes_test, bboxes_train))
    unique_bboxes, counts = np.unique(all_bboxes, axis=0, return_counts=True)
    contaminating_bboxes = unique_bboxes[counts > 1]
    num_contaminating_bboxes = len(contaminating_bboxes)
    if num_contaminating_bboxes > 0:
        print(f'Xview id: {xv_id} \t Contaminating bboxes: {num_contaminating_bboxes}')
        total_num_contaminating_bboxes += num_contaminating_bboxes

print(f'Total number of contaminating xview bboxes: {total_num_contaminating_bboxes}')
print(f'Total number of unique xview bboxes: {total_num_unique_xview_bboxes}')
print(f'Total number of xview bboxes: {total_num_xview_bboxes}')
print(f'Total number of bboxes: {total_num_bboxes}')

Not able to find the xview_train.geojson file

Hey sir,

Could you please help me with the .geojson file? I am not able to figure out where to download it from. I have downloaded the rest of the files and images and arranged them as mentioned, but am stuck on just this one. Any help would be appreciated.

NameError: name 'wasserstein_nms' is not defined

Hi, when I train atss_r50_aitod_nwd.py, there is no error.
But when I train detectors_cascade_rcnn_r50_aitod_rpn_nwd.py, the program raises NameError: name 'wasserstein_nms' is not defined. What could be the cause?
My environment:
Python: 3.7.15 (default, Nov 24 2022, 21:12:53) [GCC 11.2.0]
CUDA available: True
GPU 0,1: Tesla V100-SXM2-32GB
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.1, V10.1.105
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.5.0+cu101
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2019.0.5 Product Build 20190808 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 10.1
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  • CuDNN 7.6.3
  • Magma 2.5.2
TorchVision: 0.6.0+cu101
OpenCV: 4.6.0
MMCV: 1.3.5
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 10.1
MMDetection: 2.13.0+9775ac2
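One way to narrow this down (my guess: wasserstein_nms is defined inside the NWD codebase rather than in stock MMCV/MMDetection, so the NameError usually means that module was never imported, e.g. because the repo was not installed with pip install -v -e .). Locating the definition shows which package has to be importable:

grep -rn "def wasserstein_nms" .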
