jwwangchn / ai-tod
Official code for "Tiny Object Detection in Aerial Images".
License: MIT License
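Hello, I followed the README steps and ran generate_aitod_imgs.py, which generated the aitodtoolkit/xview/xview_aitod_sets folder.
Does this mean the dataset in the aitod folder has been automatically updated to the complete dataset?
If it is the complete dataset, the structure under the aitod folder is 1) annotations 2) images.
If I want to generate label files with the .txt extension, what should I do next? Does your code provide a method for this?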
While obtaining the AI-TOD images, I ran python generate_aitod_imgs.py and got the following error:
File "/root/autodl-tmp/mmdet-rfla-main/data/AI-TOD/aitodtoolkit/wwtool/wwtool/datasets/dump.py", line 3, in
import lxml.etree as ET
ModuleNotFoundError: No module named 'lxml'
I would appreciate any guidance, thank you!
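Hello, Doctor! I am very interested in the contribution this dataset makes, but I could only find MMDetection support for it.
I would like to train on this dataset in other code frameworks, such as faster-rcnn.pytorch. How should I go about it? Could you give me some guidance or a direction? That source code only supports the COCO and VOC datasets, and AI-TOD differs slightly in its metrics.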
Greetings! I have downloaded the AI-TOD dataset correctly, but when I train on it I get an error:
……………… Illegal image file: /root/autodl-tmp/AI-TOD/train/3b444d792.png, and it will be ignored Traceback (most recent call last)
assert ct > 0, 'not found any coco record in %s' % (anno_path) AssertionError: not found any coco record in /root/autodl-tmp/AI-TOD/annotations/aitod_trainval_v1.json
I am sure that the dataset path and the framework are correct, because training works fine when I switch to a different dataset. I'm not sure what is causing this; many of the images are reported as illegal and therefore ignored. The JSON file won't open either, so I cannot track down the problem.
Very much looking forward to your reply!
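One way to narrow down an error like "not found any coco record" is to load the annotation file directly and check whether it parses and whether its records line up with the images on disk. The following is only a minimal sketch, assuming the file follows the usual COCO layout with "images", "annotations", and "categories" keys; the function names `summarize_coco` and `load_and_summarize` and the demo data are hypothetical, not part of the AI-TOD toolkit.

```python
import json
import os


def summarize_coco(anno, image_dir=None):
    """Return basic sanity-check counts for a COCO-style annotation dict."""
    images = anno.get("images", [])
    annotations = anno.get("annotations", [])
    image_ids = {img["id"] for img in images}
    # Annotations that point at an image id not listed under "images".
    orphans = sum(1 for a in annotations if a.get("image_id") not in image_ids)
    # Listed image files that are absent on disk (checked only if image_dir is given).
    missing = []
    if image_dir is not None:
        missing = [
            img["file_name"]
            for img in images
            if not os.path.isfile(os.path.join(image_dir, img["file_name"]))
        ]
    return {
        "num_images": len(images),
        "num_annotations": len(annotations),
        "num_categories": len(anno.get("categories", [])),
        "orphan_annotations": orphans,
        "missing_files": missing,
    }


def load_and_summarize(anno_path, image_dir=None):
    # json.load raises json.JSONDecodeError if the file is truncated or corrupt.
    with open(anno_path, "r") as f:
        return summarize_coco(json.load(f), image_dir)


# Tiny in-memory example (hypothetical data, not taken from AI-TOD).
demo = {
    "images": [{"id": 1, "file_name": "a.png"}],
    "annotations": [{"image_id": 1}, {"image_id": 2}],
    "categories": [{"id": 1, "name": "vehicle"}],
}
print(summarize_coco(demo))
```

If `load_and_summarize` fails with a decode error, the download is probably incomplete; if it reports zero annotations or many missing files, the annotation file and the image folder likely do not belong together.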
I was able to use the AI-TOD V2 dataset with MMDET-AITOD training by referencing the AI-TOD V2 JSON annotations. All worked well: the model converges and the inferences look good. But you describe using the 'generate_aitod_imgs.py' tool to create a full AI-TOD and xView dataset. I assume this is to get an even bigger dataset? When I run the tool, it creates many directories, but the only new directory that looks like the full set is the 'xview_aitod_sets' directory in the xview folder, not the aitod folder. This is not what the tool description says. The problem is that the 'xview_aitod_sets' folder does have what looks like a full set of data (train, val, trainval, and test), but it is NOT in JSON format like AI-TOD. It is in images/labels format (text files that correspond to image files), which is more like YOLO. There is no annotations JSON file with AI-TOD and xView labels together.
This is NOT what your MMDET-AITOD seems to be looking for in the 'mmdet-aitod/mmdet-nwdrka/configs_nwdrka/nwd_rka' V2 config files. They are looking for JSON annotation files.
Questions:
Hello! There are four centers in M-CenterNet. The center point in CenterNet follows a Gaussian distribution; how do you handle that?
In my view, if you publish a new dataset and claim that detectors achieve better accuracy on it, then you should release the code of that detector. That way we can train it or load your pretrained weights and see how well it works, and then decide whether to use your dataset and follow your training technique. It is also important to note that your paper proposes a detector for this dataset.
Hello. Could you please share the code used to convert the AI-TOD annotations into COCO format?
I find it difficult to download the AI-TOD dataset from OneDrive. Is there any other way to download it, such as Baidu Netdisk?
Hello, when I compute the metrics with pycocotools I can only get AP small and AP medium. How can I compute AP tiny and AP very tiny (AP vt)?
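As far as I can tell, stock pycocotools hardcodes the small/medium/large labels in its summary step, so the extra size buckets need custom area ranges. The sketch below only illustrates how objects could be bucketed by area. The pixel thresholds (2 to 8, 8 to 16, 16 to 32, 32 to 64 pixels per side) are an assumption based on my reading of the AI-TOD paper, the half-open boundaries are my own choice, and `AITOD_AREA_RANGES` and `size_bucket` are hypothetical names.

```python
# Area ranges in squared pixels. The side-length thresholds are an assumption
# taken from my reading of the AI-TOD paper, not from the official evaluation code.
AITOD_AREA_RANGES = {
    "verytiny": (2 ** 2, 8 ** 2),
    "tiny": (8 ** 2, 16 ** 2),
    "small": (16 ** 2, 32 ** 2),
    "medium": (32 ** 2, 64 ** 2),
}


def size_bucket(width, height):
    """Return the size bucket for a box, or None if it falls outside all ranges."""
    area = width * height
    for name, (low, high) in AITOD_AREA_RANGES.items():
        # Half-open interval [low, high); this boundary handling is a choice made here.
        if low <= area < high:
            return name
    return None


print(size_bucket(4, 4))    # area 16
print(size_bucket(10, 10))  # area 100
```

Ranges like these could in principle be assigned to the `areaRng` and `areaRngLbl` fields of `COCOeval.params`, but the printed summary would still need to be adapted to the new labels.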
Hi,
I came across your interesting paper, "Dot Distance for Tiny Object Detection in Aerial Images".
I tried to implement it in the MMDetection framework and tested it on our in-house dataset, but I got unexpected results. Replacing the IoU metric used for assignment with DotD reduced mAP by 0.8.
The code I wrote is attached here.
import math

import torch
# Registry location in MMDetection 2.x (assumed version).
from mmdet.core.bbox.iou_calculators.builder import IOU_CALCULATORS


@IOU_CALCULATORS.register_module()
class DotDistOverlaps:

    def __init__(self, average_size, scale=1., dtype=None):
        # average_size is treated as an area here, so its square root is used.
        self.average_size = math.sqrt(average_size)
        self.scale = scale
        self.dtype = dtype

    def __call__(self, bboxes1, bboxes2):
        """Calculate DotD between 2D bboxes.

        Args:
            bboxes1 (Tensor): bboxes have shape (m, 4) in <x1, y1, x2, y2>
                format, or shape (m, 5) in <x1, y1, x2, y2, score> format.
            bboxes2 (Tensor): bboxes have shape (n, 4) in <x1, y1, x2, y2>
                format, shape (n, 5) in <x1, y1, x2, y2, score> format, or be
                empty.

        Returns:
            Tensor: shape (m, n)
        """
        assert bboxes1.size(-1) in [0, 4, 5]
        assert bboxes2.size(-1) in [0, 4, 5]
        if bboxes2.size(-1) == 5:
            bboxes2 = bboxes2[..., :4]
        if bboxes1.size(-1) == 5:
            bboxes1 = bboxes1[..., :4]

        centroids1_cx = (bboxes1[..., 0] + bboxes1[..., 2]) / 2.0
        centroids1_cy = (bboxes1[..., 1] + bboxes1[..., 3]) / 2.0
        # [M, 2]
        centroids1 = torch.stack((centroids1_cx, centroids1_cy), dim=1)

        centroids2_cx = (bboxes2[..., 0] + bboxes2[..., 2]) / 2.0
        centroids2_cy = (bboxes2[..., 1] + bboxes2[..., 3]) / 2.0
        # [N, 2]
        centroids2 = torch.stack((centroids2_cx, centroids2_cy), dim=1)

        # Pairwise Euclidean distance between centers, shape [M, N]
        distances = (centroids1[:, None, :] -
                     centroids2[None, :, :]).pow(2).sum(-1).sqrt()
        dotd = torch.exp(-distances / self.average_size)
        return dotd
I am not sure if I missed something there. Could you please check that?
Moreover, I tried to download the AI-TOD dataset to test my implementation on it, but I could not. Could you please advise how I can get the data?
Thanks,
Mohammed Jabreel
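For reference, the quantity discussed in this issue can be written for a single pair of boxes without any framework. This is a minimal sketch of DotD as I understand it from the paper title and the code above: the exponential of the negative center distance divided by an average object size. The function names are hypothetical, and `average_size` is assumed to be a length in pixels (for example, the square root of the mean box area).

```python
import math


def box_center(box):
    """Center (cx, cy) of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return (x1 + x2) / 2.0, (y1 + y2) / 2.0


def dotd(box_a, box_b, average_size):
    """exp(-D / S): D is the center distance, S the average object size in pixels."""
    ax, ay = box_center(box_a)
    bx, by = box_center(box_b)
    distance = math.hypot(ax - bx, ay - by)
    return math.exp(-distance / average_size)


# Centers (1, 1) and (4, 5) are 5 pixels apart, so with S = 5 this is exp(-1).
print(dotd((0, 0, 2, 2), (3, 4, 5, 6), average_size=5.0))
```

Because this value lies in (0, 1] and depends only on the centers, it is distributed differently from IoU. Assigner thresholds tuned for IoU may therefore need retuning when the metric is swapped, which could be one reason for a drop in mAP.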
Hi,
I followed the original instructions and aitodtoolkit and obtained the annotation files, but the numbers of instances are not consistent with those reported in the AI-TOD paper. Are there any processed annotation files that can be used directly?
My results are listed below:
category_id | aitod_train.json | aitod_val.json | aitod_trainval.json | aitod_test.json |
---|---|---|---|---|
1 | 623 | 170 | 793 | 745 |
2 | 512 | 140 | 652 | 689 |
3 | 5269 | 2477 | 7746 | 5860 |
4 | 13539 | 3791 | 17330 | 17633 |
5 | 293 | 34 | 327 | 292 |
6 | 248051 | 59906 | 307957 | 306678 |
7 | 14126 | 3841 | 17967 | 15443 |
8 | 176 | 67 | 243 | 290 |
total | 282589 | 70426 | 353015 | 347630 |
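Per-category counts like those in the table can be reproduced from a COCO-style annotation file with a few lines of Python. This is a minimal sketch, assuming each entry under "annotations" carries a "category_id" field; `count_instances` is a hypothetical helper and the demo data is made up.

```python
from collections import Counter


def count_instances(anno):
    """Count annotations per category_id in a COCO-style annotation dict."""
    counts = Counter(a["category_id"] for a in anno["annotations"])
    return dict(sorted(counts.items()))


# Hypothetical in-memory example; for a real file, pass the result of json.load.
demo = {"annotations": [{"category_id": 6}, {"category_id": 6}, {"category_id": 1}]}
counts = count_instances(demo)
print(counts, "total:", sum(counts.values()))
```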
Hello,
I have converted the AI-TOD dataset to the YOLO format. In your paper, you describe the bounding box representations as b_i = (cx_i, cy_i, w_i, h_i), where cx_i and cy_i are the center coordinates. During the conversion process, I interpreted cx_i and cy_i as the actual center coordinates. However, I encountered an issue during training: YOLO does not accept negative values, and some of my bounding box representations contain negative values.
I believe this issue arose because I might have misunderstood the representation of cx_i and cy_i. To clarify, should I have calculated these center coordinates by summing the two values and then dividing by two? Or was my initial interpretation of using cx_i and cy_i as direct center coordinates correct?
I'm seeking clarification to understand if my conversion approach was incorrect. Any guidance would be greatly appreciated.
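One possible cause is the box convention of the annotation file rather than of the paper. In the standard COCO format, a bbox is stored as [x_min, y_min, width, height], so the first two values are the top-left corner, not the center. The sketch below converts such a box to normalized YOLO values and clips it to the image first. It assumes the AI-TOD JSON files follow the COCO convention, which I have not verified, and `coco_to_yolo` is a hypothetical helper.

```python
def coco_to_yolo(bbox, img_w, img_h):
    """Convert a COCO box [x_min, y_min, w, h] to normalized YOLO (cx, cy, w, h)."""
    x, y, w, h = bbox
    # Clip the corners to the image so no normalized value falls outside [0, 1].
    x1 = min(max(x, 0.0), img_w)
    y1 = min(max(y, 0.0), img_h)
    x2 = min(max(x + w, 0.0), img_w)
    y2 = min(max(y + h, 0.0), img_h)
    cx = (x1 + x2) / 2.0 / img_w
    cy = (y1 + y2) / 2.0 / img_h
    return cx, cy, (x2 - x1) / img_w, (y2 - y1) / img_h


# A box that starts 5 pixels left of the image is clipped to the visible part.
print(coco_to_yolo([-5, 0, 10, 10], 100, 100))
```

Treating the top-left corner as the center and then subtracting half the width and height is one way to end up with negative coordinates for boxes near the image border.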
Hello Sir,
I followed the steps as described and got to the step of running "python generate_aitod_imgs.py". However, my terminal shows the error in the image below. Please let me know how I can fix it, and whether the error affects only some of the TIFF images or is caused by something else. Thanks.
The file structure in aitod looks a lot like COCO (images plus JSON annotations). Is it in COCO format? Thanks!
@jwwangchn Hi. Great dataset! Just wondering where I can find the leaderboard for the AI-TOD dataset?
Hello, are the AI-TOD annotations from before the merge available? I only want to use part of the data.
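If only the merged annotation file is at hand, a subset can still be cut out of it by filtering on image ids. This is a minimal sketch, assuming a COCO-style dict with "images" and "annotations" lists; `subset_by_images` is a hypothetical helper, and how to pick the image ids (for example by file name pattern) is left open.

```python
def subset_by_images(anno, keep_image_ids):
    """Return a copy of a COCO-style dict restricted to the given image ids."""
    keep = set(keep_image_ids)
    subset = {k: v for k, v in anno.items() if k not in ("images", "annotations")}
    subset["images"] = [img for img in anno["images"] if img["id"] in keep]
    subset["annotations"] = [a for a in anno["annotations"] if a["image_id"] in keep]
    return subset


# Hypothetical example: keep only image 1 and its annotations.
demo = {
    "images": [{"id": 1, "file_name": "a.png"}, {"id": 2, "file_name": "b.png"}],
    "annotations": [{"id": 10, "image_id": 1}, {"id": 11, "image_id": 2}],
    "categories": [{"id": 1, "name": "vehicle"}],
}
print(subset_by_images(demo, [1]))
```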
Hi. I found that the training set cannot be downloaded with commands like 'wget' plus the download URL. How can I download it from a non-GUI terminal?
Hello, where can I download the aitod_training_v1.0.json and aitod_validation_v1.0.json files mentioned in the config file?
After extracting complete_annotations, I could not find aitod_training_v1.0.json or aitod_validation_v1.0.json.
Thank you, I look forward to your reply.
The link to the dataset is invalid, could you update it? Thanks.
Thanks for your great work! Where can I find the source code of "M-CenterNet"?
I found that AI-TOD_wo_xview requires a keyword.
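Hello, I ran detectors_cascade_rcnn_r50_aitod_rpn_nwd.py on the AI-TOD test set, but the performance did not reach the accuracy reported in the paper (20.8 AP).
I made the following changes to detectors_cascade_rcnn_r50_aitod_rpn_nwd.py:
1) For training, I changed detectors_cascade_rcnn_r50_aitod_rpn_nwd to run on two GPUs with 4 images per GPU, keeping the batch size at 8.
2) For inference, I changed the image and label paths: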
ann_file='data/AI-TOD/annotations/aitod_test_v1_1.0.json',
img_prefix='data/AI-TOD/test/',
I used the following command to get the performance on the test set:
python tools/test.py work_dirs/nwd/detectors_cascade_rcnn_r50_aitod_rpn_nwd/detectors_cascade_rcnn_r50_aitod_rpn_nwd.py work_dirs/nwd/detectors_cascade_rcnn_r50_aitod_rpn_nwd/epoch_12.pth --eval bbox
3) The detailed AP results are as follows:
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 14018/14018, 2.3 task/s, elapsed: 6169s, ETA: 0s
Evaluating bbox...
Loading and preparing results...
DONE (t=6.76s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type bbox
DONE (t=2665.67s).
Accumulating evaluation results...
DONE (t=29.74s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=1500 ] = 0.187
Average Precision (AP) @[ IoU=0.25 | area= all | maxDets=1500 ] = -1.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=1500 ] = 0.451
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=1500 ] = 0.126
Average Precision (AP) @[ IoU=0.50:0.95 | area=verytiny | maxDets=1500 ] = 0.041
Average Precision (AP) @[ IoU=0.50:0.95 | area= tiny | maxDets=1500 ] = 0.175
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1500 ] = 0.266
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1500 ] = 0.352
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.283
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=300 ] = 0.300
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=1500 ] = 0.303
Average Recall (AR) @[ IoU=0.50:0.95 | area=verytiny | maxDets=1500 ] = 0.054
Average Recall (AR) @[ IoU=0.50:0.95 | area= tiny | maxDets=1500 ] = 0.317
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1500 ] = 0.383
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1500 ] = 0.424
Optimal LRP @[ IoU=0.50 | area= all | maxDets=1500 ] = -1.000
Optimal LRP Loc @[ IoU=0.50 | area= all | maxDets=1500 ] = -1.000
Optimal LRP FP @[ IoU=0.50 | area= all | maxDets=1500 ] = -1.000
Optimal LRP FN @[ IoU=0.50 | area= all | maxDets=1500 ] = -1.000
#Class-specific LRP-Optimal Thresholds #
[-1. -1. -1. -1. -1. -1. -1. -1.]
4) The environment is configured as follows:
Python: 3.7.15 (default, Nov 24 2022, 21:12:53) [GCC 11.2.0]
CUDA available: True
GPU 0,1: NVIDIA A100 80GB PCIe
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.1.TC455_06.29069683_0
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
PyTorch: 1.10.0+cu111
PyTorch compiling details: PyTorch built with:
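What could be causing this? I look forward to your reply, thank you!
Could you provide a sample file for evaluation (such as a sample.zip for submission)? Is it in DOTA format or COCO format?
The test email address has not responded to my message so far.
Hope to hear from you, thanks.
I downloaded the xView dataset from the official website and found that much of the data in train_images.zip from 1604.tif onward is corrupted. Could someone please provide a download link?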
I can't see the link to the dataset. When will it be available? Thanks
Hi there!
I was hoping to use the AI-TOD dataset, but noticed that both v1 and v2 have contaminated the test set with trainval objects and parts of trainval images.
This is happening because the overlapping crops from each xView dataset image are split across the train, val, and test sets.
Additionally this inflates the total number of objects.
Here are the stats I’ve found:
V1:
Total number of contaminating xview bboxes: 84704
Total number of unique xview bboxes: 308286
Total number of xview bboxes: 443943
Total number of bboxes: 700621
V2:
Total number of contaminating xview bboxes: 44801
Total number of unique xview bboxes: 419747
Total number of xview bboxes: 475857
Total number of bboxes: 752746
Given this overlap on xView, I'm concerned there may be additional contamination in the non-xView images. But without access to the script creating the non-xView images crops, I'm not able to check this.
I sent an email 2 months ago regarding this but haven't heard back yet. Let me know if I’m mistaken!
Here's the script I created to measure these numbers:
import json
import os

import numpy as np
from tqdm import tqdm


def process_annos(annos):
    id_to_filename = dict()
    print('Process Images:')
    for img_ann in tqdm(annos['images']):
        filename = img_ann['file_name']
        filename_arr = filename.split('_')
        if len(filename_arr) == 4:
            id_ = img_ann['id']
            id_to_filename[id_] = filename

    xview_id_to_ori_bboxes = dict()
    print('Process Annotations:')
    for ann in tqdm(annos['annotations']):
        if ann['iscrowd']:
            continue
        if not ann['image_id'] in id_to_filename.keys():
            continue
        filename = id_to_filename[ann['image_id']]
        filename = os.path.splitext(filename)[0]
        filename_arr = filename.split('_')
        xview_id = filename_arr[0]
        offsets = filename_arr[-2:]
        offsets = np.array([float(coord) for coord in offsets])
        bbox = ann['bbox']
        bbox = np.array(bbox)
        bbox[:2] += offsets
        if xview_id not in xview_id_to_ori_bboxes.keys():
            xview_id_to_ori_bboxes[xview_id] = []
        xview_id_to_ori_bboxes[xview_id].append(bbox)
    return xview_id_to_ori_bboxes


aitod_test_filename = './aitod_test_v1_1.0.json'
# aitod_test_filename = './AI-TOD-v2/aitodv2_test.json'
with open(aitod_test_filename, 'r') as f:
    annos_test = json.load(f)
print('Process Test:')
xview_id_to_ori_bboxes_test = process_annos(annos_test)

aitod_trainval_filename = './aitod_trainval_v1_1.0.json'
# aitod_trainval_filename = './AI-TOD-v2/aitodv2_trainval.json'
with open(aitod_trainval_filename, 'r') as f:
    annos_trainval = json.load(f)
print('Process Trainval:')
xview_id_to_ori_bboxes_trainval = process_annos(annos_trainval)

total_num_bboxes = len(annos_test['annotations']) + len(annos_trainval['annotations'])
total_num_contaminating_bboxes = 0
total_num_xview_bboxes = 0
total_num_unique_xview_bboxes = 0
for xv_id, bboxes_test in xview_id_to_ori_bboxes_test.items():
    if not xv_id in xview_id_to_ori_bboxes_trainval.keys():
        continue
    bboxes_train = xview_id_to_ori_bboxes_trainval[xv_id]
    num_bboxes_test = len(bboxes_test)
    num_bboxes_train = len(bboxes_train)
    total_num_xview_bboxes += num_bboxes_test
    total_num_xview_bboxes += num_bboxes_train
    # duplicates also exist among test bboxes, remove these
    bboxes_test = np.unique(bboxes_test, axis=0)
    # duplicates also exist among trainval bboxes, remove these
    bboxes_train = np.unique(bboxes_train, axis=0)
    num_unique_bboxes_test = len(bboxes_test)
    num_unique_bboxes_train = len(bboxes_train)
    total_num_unique_xview_bboxes += num_unique_bboxes_test
    total_num_unique_xview_bboxes += num_unique_bboxes_train
    all_bboxes = np.concatenate((bboxes_test, bboxes_train))
    unique_bboxes, counts = np.unique(all_bboxes, axis=0, return_counts=True)
    contaminating_bboxes = unique_bboxes[counts > 1]
    num_contaminating_bboxes = len(contaminating_bboxes)
    if num_contaminating_bboxes > 0:
        print(f'Xview id: {xv_id} \t Contaminating bboxes: {num_contaminating_bboxes}')
    total_num_contaminating_bboxes += num_contaminating_bboxes

print(f'Total number of contaminating xview bboxes: {total_num_contaminating_bboxes}')
print(f'Total number of unique xview bboxes: {total_num_unique_xview_bboxes}')
print(f'Total number of xview bboxes: {total_num_xview_bboxes}')
print(f'Total number of bboxes: {total_num_bboxes}')
Hey sir,
Could you please help me with the .geojson file? I cannot figure out where to download it. I have downloaded the rest of the files and images and arranged them as described, but I am stuck on this one. Any help would be appreciated.
Hello, when I train atss_r50_aitod_nwd.py, the program runs without errors.
But when I train detectors_cascade_rcnn_r50_aitod_rpn_nwd.py, it fails with NameError: name 'wasserstein_nms' is not defined. What is going on?
The environment is as follows:
Python: 3.7.15 (default, Nov 24 2022, 21:12:53) [GCC 11.2.0]
CUDA available: True
GPU 0,1: Tesla V100-SXM2-32GB
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.1, V10.1.105
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.5.0+cu101
PyTorch compiling details: PyTorch built with: