GenAI UAV

AI CUP 2024 Spring Official Competition Website: https://tbrain.trendmicro.com.tw/Competitions/Details/34

Generative-AI Navigation Information Competition for UAV Reconnaissance in Natural Environments I:Image Data Generation

以生成式AI建構無人機於自然環境偵察時所需之導航資訊競賽 I - 影像資料生成競賽

🚀 Check workshop.ipynb to reproduce our results.

🤗 Or follow the Usage to customize your workflow!

📈 Check Result or refer to the Submission History section for more details.

Team ID: TEAM_5333

Placement:

  • Public: 18th
  • Private: 13th

Members:

  • Chen-Yang Yu, NCKU (Team Leader)
  • Yuan-Chun Chiang, NTU
  • Yu-Hao Chiang, NCKU
  • Xin-Xian Lin, NCKU

Introduction

Our task is to translate the black-and-white draft imagery into drone imagery.

For each domain (Road and River), the repository README shows an example pair: a black-and-white draft image and the corresponding drone image.

Dataset

Training Dataset Format

The dataset contains 2 domains:

  • label_img: black-and-white draft imagery.
  • img: drone imagery.

Training Dataset Folder Structure:

training_dataset
├── label_img (trainA)
│   ├── TRA_RI_1000000.png
│   ├── TRA_RI_1000001.png
│   └── ...
└── img (trainB)
    ├── TRA_RO_1000000.jpg
    ├── TRA_RO_1000001.jpg
    └── ...

Training Dataset Preprocessing

We provide several preprocessing methods in our code, including:

  • Data Filtering (remove low-quality images from img):

    We remove images that are too blurry (an example low-quality image is shown in the repository README; a minimal filtering sketch follows this list).

  • Data Augmentation:

    We apply horizontal and vertical flips to augment the dataset (example flipped images are shown in the repository README; see the same sketch after this list).
  • Dataset Split (for the enhanced model architecture):

    Split the dataset into ROAD and RIVER domains (a splitting sketch is shown at the end of the Testing Dataset Format section).

    dataset
    ├── train_ROAD
    │   ├── trainA (Draft Images)
    │   └── trainB (Drone Images)
    └── train_RIVER
        ├── trainA (Draft Images)
        └── trainB (Drone Images)
    

Note: applying all of the above methods together did not give us our best result (see the Result section).
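
Below is a minimal sketch of the filtering and augmentation steps described above, assuming OpenCV (cv2) is available. The blur threshold, paths, and output naming are illustrative only; the exact preprocessing code lives in dataset/preprocess_dataset.ipynb.

# Minimal preprocessing sketch (illustrative values; not the exact notebook code).
from pathlib import Path

import cv2

IMG_DIR = Path("training_dataset/img")  # drone imagery (trainB)
BLUR_THRESHOLD = 100.0                  # illustrative; tune on a few known-blurry samples

def is_blurry(path: Path, threshold: float = BLUR_THRESHOLD) -> bool:
    """Variance of the Laplacian: low variance means few edges, i.e. a blurry image."""
    gray = cv2.imread(str(path), cv2.IMREAD_GRAYSCALE)
    return cv2.Laplacian(gray, cv2.CV_64F).var() < threshold

# 1. Data filtering: drop low-quality (blurry) drone images.
for p in list(IMG_DIR.glob("*.jpg")):
    if is_blurry(p):
        p.unlink()

# 2. Data augmentation: horizontal and vertical flips.
#    For paired training, the same flip must also be applied to the matching draft image.
for p in list(IMG_DIR.glob("*.jpg")):
    img = cv2.imread(str(p))
    cv2.imwrite(str(p.with_name(p.stem + "_hflip.jpg")), cv2.flip(img, 1))  # horizontal flip
    cv2.imwrite(str(p.with_name(p.stem + "_vflip.jpg")), cv2.flip(img, 0))  # vertical flip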

Testing Dataset Format

The testing dataset contains only the label_img folder, which holds the black-and-white draft imagery.

Testing Dataset Folder Structure:

testing_dataset
└── label_img (testA)
    ├── PRI_RI_1000000.png
    ├── PRI_RI_1000001.png
    └── ...

After Dataset Split:

dataset
├── test_ROAD
│   └── testA (Draft Images)
│       ├── PRI_RO_1000000.png
│       ├── PRI_RO_1000001.png
│       └── ...
└── test_RIVER
    └── testA (Draft Images)
        ├── PRI_RI_1000000.png
        ├── PRI_RI_1000001.png
        └── ...
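
A minimal splitting sketch, assuming the filename prefixes shown above (RO = road, RI = river) are what distinguish the two domains; paths follow the folder trees in this section, and the exact code is in dataset/preprocess_dataset.ipynb.

# Split training/testing images into ROAD and RIVER domains by filename prefix.
import shutil
from pathlib import Path

def split_by_domain(src: Path, road_dst: Path, river_dst: Path) -> None:
    road_dst.mkdir(parents=True, exist_ok=True)
    river_dst.mkdir(parents=True, exist_ok=True)
    for p in src.iterdir():
        if "_RO_" in p.name:    # road images, e.g. TRA_RO_1000000.jpg
            shutil.copy(p, road_dst / p.name)
        elif "_RI_" in p.name:  # river images, e.g. TRA_RI_1000000.png
            shutil.copy(p, river_dst / p.name)

# Training set: draft images (trainA) and drone images (trainB).
split_by_domain(Path("training_dataset/label_img"),
                Path("dataset/train_ROAD/trainA"), Path("dataset/train_RIVER/trainA"))
split_by_domain(Path("training_dataset/img"),
                Path("dataset/train_ROAD/trainB"), Path("dataset/train_RIVER/trainB"))

# Testing set: draft images only (testA).
split_by_domain(Path("testing_dataset/label_img"),
                Path("dataset/test_ROAD/testA"), Path("dataset/test_RIVER/testA"))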

Model Pipeline

We propose 2 methods to train the model.

  1. Baseline (ROAD-RIVER at same time)
  2. Enhanced (2 domain-specific models)

Baseline (ROAD-RIVER at same time)

At first, we trained a single conditional GAN model on the combined ROAD and RIVER data. However, the results were not good enough.

Enhanced (2 domain-specific models)

Hence, we trained two domain-specific models, one for the ROAD data and one for the RIVER data.

Other Methods

Hyperparameter Tuning

We tuned several hyperparameters, including n_epochs, n_epochs_decay, batch_size, and netG. Our best result came from training with the following settings:

n_epochs = 200 
n_epochs_decay = 200
batch_size = 1
netG = unet_256
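
These option names match the pytorch-CycleGAN-and-pix2pix codebase; assuming that codebase, a training run for the two domain-specific models looks roughly like the sketch below. Paths and checkpoint names are illustrative; train_model.ipynb contains the exact commands we used.

# Illustrative sketch: train one pix2pix model per domain with the settings above.
# Assumes the pytorch-CycleGAN-and-pix2pix train.py and its standard options. Note that
# its pix2pix model expects aligned A|B image pairs, which that repo's
# datasets/combine_A_and_B.py helper can build from the trainA/trainB folders shown earlier.
import subprocess

for domain in ["ROAD", "RIVER"]:
    subprocess.run([
        "python", "train.py",
        "--dataroot", f"dataset/train_{domain}",  # illustrative path
        "--name", f"pix2pix_{domain}",            # illustrative checkpoint name
        "--model", "pix2pix",
        "--direction", "AtoB",                    # draft (A) -> drone (B)
        "--netG", "unet_256",
        "--n_epochs", "200",
        "--n_epochs_decay", "200",
        "--batch_size", "1",
    ], check=True)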

Super Resolution

Since the pix2pix model outputs 256x256 images, we tried super-resolution methods to upscale them to 428x240. However, this did not improve the score by much. (The super-resolution code is in other/super_resolution.ipynb; a rough sketch of one possible approach follows.)
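
The exact upscaler is in the notebook; as one possible approach (an assumption, not necessarily what the notebook uses), OpenCV's dnn_superres module can run a pre-trained ESPCN model and the result can then be resized to 428x240:

# One possible upscaling sketch (requires opencv-contrib-python and the ESPCN_x2.pb
# weights downloaded separately); paths are illustrative.
import cv2

sr = cv2.dnn_superres.DnnSuperResImpl_create()
sr.readModel("ESPCN_x2.pb")   # pre-trained super-resolution model file
sr.setModel("espcn", 2)       # model name and scale factor

img = cv2.imread("results/PRI_RO_1000000.png")  # 256x256 pix2pix output (illustrative path)
upscaled = sr.upsample(img)                     # 512x512 after x2 super-resolution
final = cv2.resize(upscaled, (428, 240), interpolation=cv2.INTER_CUBIC)  # dsize is (width, height)
cv2.imwrite("submission/PRI_RO_1000000.png", final)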

Potential Methods to Improve

We believe that the result can be improved by using pix2pixHD or img2img-turbo.

However, due to limited hardware resources and the competition time limit, we did not try these methods.

Result

We show the results of the baseline and enhanced models in the following table.

FID (Frechet Inception Distance) is the evaluation metric: the lower the score, the better the result.
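
For reference, FID compares the Inception-feature statistics (mean and covariance) of the generated images against those of the real drone images; the standard definition is

$$\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \operatorname{Tr}\!\left(\Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2}\right)$$

where the subscripts r and g denote the real and generated image statistics.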

Model                                           Public Testing   Private Testing
Baseline                                        141.6813         x
Enhanced                                        129.4026         128.060178996
Enhanced + data filtering + data augmentation   206.5882         206.667928949

Unfortunately, when we added more data preprocessing, the result got worse. We trained that model with batch_size 64, which made the GAN training unstable. Given more time, we would retrain it with a smaller batch size.

Setup

git clone https://github.com/LittleFish-Coder/gen-ai-uav
cd gen-ai-uav
pip install -r requirements.txt

Make sure you download the dataset from the AI CUP website and put it in the gen-ai-uav/dataset folder.

Usage

Run workshop.ipynb to directly reproduce our results.

Before you start, make sure you have finished the Setup section.

This section has 3 steps for you to follow (you can also customize your own workflow based on them):

  1. Prepare The Dataset
  2. Train The Model (optional)
  3. Test The Model

In each notebook, we provide both the baseline and enhanced methods. (You can complete just the baseline part for quick testing.)

1. Prepare The Dataset

Run dataset/preprocess_dataset.ipynb to download and preprocess the dataset.

2. Train The Model (optional)

We provide pre-trained models, so you can move directly to the next step.

If you want to train the model, please run train_model.ipynb

3. Test The Model

With the pre-trained models, you can directly run test_model.ipynb to test on the baseline dataset.
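
If you prefer running inference outside the notebook, the sketch below shows roughly what the per-domain test commands look like, again assuming the pytorch-CycleGAN-and-pix2pix codebase; test_model.ipynb has the exact options we used.

# Illustrative sketch: run each domain-specific generator on its draft-only test set.
# --model test with --dataset_mode single generates from one-sided (testA) inputs.
import subprocess

for domain in ["ROAD", "RIVER"]:
    subprocess.run([
        "python", "test.py",
        "--dataroot", f"dataset/test_{domain}/testA",  # illustrative path
        "--name", f"pix2pix_{domain}",                 # checkpoint name from training
        "--model", "test",
        "--dataset_mode", "single",
        "--netG", "unet_256",
        "--norm", "batch",
        "--results_dir", "results",
        "--num_test", "100000",  # large enough to cover every test image
    ], check=True)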

Submission History

Our full submission history is listed below.

Time Filename Public Score Private Score Description
4/24 submission.zip Format Error x Inference with the AI CUP pretrained weights
5/04 submission1.zip 178.4705 x Inference with pre-trained weights; preprocessing: invert black and white
5/04 submission2.zip 182.4264 x Test the model with trained weights (epoch 40)
5/04 submission3.zip 181.2201 x Test the model with trained weights (epoch 170)
5/05 submission400.zip 172.6293 x Test the model with trained weights (epoch 400)
5/05 submission200.zip 142.2167 x Retrain the model for 200 epochs (the previous run misused the training set)
5/06 submission_road_river.zip 134.3143 x Train 2 domain-specific models (road and river) for 200 epochs
5/17 submission_retrain200.zip 142.1900 x Use re-trained weights on the full dataset (200 epochs); test in single_test_mode
5/17 submission_road_river_80epochs.zip 144.3565 x Train 2 domain-specific models for 80 epochs; test in single mode
5/17 submission_all_load_size_256.zip 141.6813 x Test in single_test_mode with load_size 256
5/18 submission_road_river_400epochs.zip 124.7482 x Train 2 domain-specific models for 400 epochs; test in single mode
5/21 submission_retrain200_resnet.zip 172.1164 1000.0 Retrain the model with a ResNet-based generator
5/21 submission_private_resnet.zip 1000.0 173.808621769 Use the ResNet-trained model for inference on the private testing dataset
5/21 submission_private_unet256.zip 1000.0 138.084645591 Use the unet_256-trained model for inference on the private testing dataset
5/25 submission_road_river_400.zip 129.4026 128.060178996 Test on the public and private datasets; 2 domain-specific models trained for 400 epochs
5/26 upscaled_images.zip 126.9314 128.301406203 Use super resolution to upscale images from 256x256 to 420x240
5/26 submission_428_240.zip 133.3959 132.658006179 Upscale both domains' data to 428x240 (previously mis-resized to 420x240)
5/26 upscaled_images_428_240.zip 127.3133 129.260890304 Super resolution upscaling to 428x240
5/27 submission_1000epoch.zip 132.2360 131.510869954 Retrain on the filtered data for 1000 epochs; resize with cubic interpolation
5/27 submission_400_interpolation.zip 133.2471 130.429431557 Use the pre-trained 400-epoch netG; resize with cubic interpolation
5/28 submission_400_1000.zip 206.5882 206.667928949 Continue training from the 400-epoch checkpoint to 1000 epochs
5/28 upscaled_images.zip 1000.0 156.563343145 Upscale images (private only), 400-epoch model
5/28 submission_20.zip 147.1295 147.788939653 Refinetune the dataset and train for 20 epochs

Acknowledgements & Reference

@inproceedings{CycleGAN2017,
  title={Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks},
  author={Zhu, Jun-Yan and Park, Taesung and Isola, Phillip and Efros, Alexei A},
  booktitle={Computer Vision (ICCV), 2017 IEEE International Conference on},
  year={2017}
}


@inproceedings{isola2017image,
  title={Image-to-Image Translation with Conditional Adversarial Networks},
  author={Isola, Phillip and Zhu, Jun-Yan and Zhou, Tinghui and Efros, Alexei A},
  booktitle={Computer Vision and Pattern Recognition (CVPR), 2017 IEEE Conference on},
  year={2017}
}
