
Deep Automatic Natural Image Matting [IJCAI-21]

This is the official repository of the paper Deep Automatic Natural Image Matting.

Jizhizi Li, Jing Zhang, and Dacheng Tao

Introduction | Network | AIM-500 | Results | Installation | Inference code | Training code | Statement


📆 News

The training code will be released soon.

[2021-10-02]: Published the network, the inference code, and the pretrained models.

[2021-07-16]: Published the validation dataset AIM-500. Please follow the readme.txt for details.

Introduction

Different from previous methods that focus only on images with salient opaque foregrounds such as humans and animals, in this paper we investigate the difficulties of extending automatic matting methods to natural images with salient transparent/meticulous foregrounds or non-salient foregrounds.

To address the problem, we propose a novel end-to-end matting network, which can predict a generalized trimap for any image of the above types as a unified semantic representation. Simultaneously, the learned semantic features guide the matting network to focus on the transition areas via an attention mechanism.

We also construct a test set AIM-500 that contains 500 diverse natural images covering all of the above types along with manually labeled alpha mattes, making it feasible to benchmark the generalization ability of AIM models. Experimental results demonstrate that our network, trained on available composite matting datasets, outperforms existing methods both objectively and subjectively.

Network - AimNet

The proposed method consists of:

  • Improved Backbone for Matting: an advanced max-pooling version of ResNet-34, pretrained on ImageNet, which serves as the backbone of the matting network;

  • Unified Semantic Representation: a type-wise semantic representation to replace the traditional trimaps;

  • Guided Matting Process: an attention-based mechanism that guides the matting process by leveraging the learned semantic features from the semantic decoder, so that detail extraction focuses only on the transition area (a minimal sketch follows this list).
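
To make the guided matting process concrete, here is a minimal PyTorch sketch of the idea: the semantic branch predicts the 3-class generalized trimap, its features gate the matting features through a learned spatial attention, and the final alpha uses the local matting prediction only inside the transition area. This is an illustrative sketch, not the authors' exact architecture; the module names and channel sizes are assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Illustrative sketch only: the semantic decoder's features predict the
    # 3-class generalized trimap and a spatial attention map that gates the
    # matting decoder's features, so detail prediction concentrates on the
    # transition area. Channel sizes are assumptions.
    class GuidedMattingHead(nn.Module):
        def __init__(self, sem_ch=64, mat_ch=32):
            super().__init__()
            self.trimap_head = nn.Conv2d(sem_ch, 3, kernel_size=3, padding=1)
            self.attn = nn.Sequential(nn.Conv2d(sem_ch, mat_ch, kernel_size=1), nn.Sigmoid())
            self.alpha_head = nn.Conv2d(mat_ch, 1, kernel_size=3, padding=1)

        def forward(self, sem_feat, mat_feat):
            trimap = self.trimap_head(sem_feat)        # B x 3 x H x W logits
            mat_feat = mat_feat * self.attn(sem_feat)  # semantic-guided attention
            alpha_local = torch.sigmoid(self.alpha_head(mat_feat))
            prob = F.softmax(trimap, dim=1)            # (background, transition, foreground)
            # Semantic prediction outside the transition area, local matting inside it.
            alpha = prob[:, 2:3] + prob[:, 1:2] * alpha_local
            return trimap, alpha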

The backbone pretrained on ImageNet, the model pretrained on the DUTS dataset, and the model pretrained on the synthetic matting dataset can be downloaded from the table below.

| Pretrained Backbone on ImageNet | Pretrained Model on DUTS Dataset | Pretrained Model on Synthetic Matting Dataset (update) |
| :---: | :---: | :---: |
| Click to download | Click to download | Click to download |

AIM-500

We propose AIM-500 (Automatic Image Matting-500), the first natural image matting test set, which contains 500 high-resolution real-world natural images covering all three types (SO: salient opaque, STM: salient transparent/meticulous, NS: non-salient) and many categories, along with manually labeled alpha mattes. Some examples and the number of images in each category are shown below. The AIM-500 dataset is now published and can be downloaded directly from [this link](https://drive.google.com/drive/folders/1IyPiYJUp-KtOoa-Hsm922VU3aCcidjjz?usp=sharing). Please follow the readme.txt for more details.

| Portrait | Animal | Transparent | Plant | Furniture | Toy | Fruit |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 100 | 200 | 34 | 75 | 45 | 36 | 10 |
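
For reference, a minimal sketch of iterating over the image/alpha pairs might look as follows; the folder names original and mask are assumptions, so please verify them against the dataset's readme.txt.

    from pathlib import Path
    from skimage import io

    # Hypothetical layout: <root>/original/*.jpg and <root>/mask/*.png.
    # Check readme.txt for the actual structure before relying on this.
    root = Path("/dataset/path/AIM-500")
    for img_path in sorted((root / "original").glob("*.jpg")):
        alpha_path = root / "mask" / (img_path.stem + ".png")
        image = io.imread(img_path)    # H x W x 3, uint8
        alpha = io.imread(alpha_path)  # H x W, uint8 in [0, 255]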

Results on AIM-500

We test our network on different types of images in AIM-500 and compare it with previous SOTA methods; the results are shown below.

Installation

Requirements:

  • Python 3.6.5+ with Numpy and scikit-image
  • Pytorch (version 1.4.0)
  • Torchvision (version 0.5.0)
  1. Clone this repository and enter its directory

    git clone https://github.com/JizhiziLi/aim.git

    cd aim

  2. Create a conda environment and activate it

    conda create -n aim python=3.6.5

    conda activate aim

  3. Install dependencies; install PyTorch and torchvision separately if needed

    pip install -r requirements.txt

    conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch

Our code has been tested with Python 3.6.5, Pytorch 1.4.0, Torchvision 0.5.0, CUDA 10.1 on Ubuntu 18.04.
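
A quick way to verify that your environment matches the tested configuration:

    import torch
    import torchvision

    # Versions the code was tested with; CUDA should be available for GPU inference.
    print(torch.__version__)          # expected: 1.4.0
    print(torchvision.__version__)    # expected: 0.5.0
    print(torch.cuda.is_available())  # expected: True on a CUDA 10.1 machine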

Inference Code

Test and Evaluate on AIM-500

Here is the procedure for testing on the AIM-500 dataset and obtaining evaluation results with our pretrained model:

  1. Download the AIM-500 dataset from here, unzip it to your path /dataset/path/, and set the paths AIM_DATASET_ROOT_PATH and REPOSITORY_ROOT_PATH in the file core/config.py (see the sketch after this list);

  2. Download the pretrained AimNet model from here and unzip to the folder models/pretrained/;

  3. Set up parameters in the file scripts/test_dataset.sh and run:

    chmod +x scripts/test_dataset.sh

    ./scripts/test_dataset.sh

  4. The output will be generated in the folder args.test_result_dir, and the log file along with the evaluation results will be saved in logs/test_logs/args.logname.
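
For step 1, the two variables in core/config.py would be set along these lines (the paths below are placeholders; replace them with your own):

    # core/config.py -- placeholder values, replace with your own paths
    REPOSITORY_ROOT_PATH = "/path/to/aim/"
    AIM_DATASET_ROOT_PATH = "/dataset/path/"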

Please note that due to a breakdown of the SSD in our lab's computer, the model we used to report the results in the paper has been corrupted. Thus, we re-trained a model on the synthetic matting dataset and achieved even better results on AIM-500. The updated pretrained model is released in the section Network.

We also report the results in the following table, where AimNet (paper) denotes the model used to report results in the paper and AimNet (update) denotes the re-trained one. With the same test strategy Hybrid (1/3 & 1/4), the updated version performs even better than the one in the paper. We also report the objective results and some subjective results of the test strategy Hybrid (1/2 & 1/4), which preserves more details in the transition area.

In the table below, SAD, MSE, MAD, Conn., and Grad. are computed on the whole image; Tran. SAD is the SAD within the transition area; SO, STM, NS, and their Avg. are SAD by image type; Animal through Fruit and their Avg. are SAD by category.

| Model | Test | SAD | MSE | MAD | Conn. | Grad. | Tran. SAD | SO | STM | NS | Avg. | Animal | Human | Transp. | Plant | Furni. | Toy | Fruit | Avg. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| AimNet (paper) | Hybrid (1/3 & 1/4) | 43.92 | 0.0161 | 0.0262 | 43.18 | 33.05 | 30.74 | 31.80 | 94.02 | 134.31 | 86.71 | 26.39 | 24.68 | 148.68 | 54.03 | 62.70 | 53.15 | 37.17 | 58.11 |
| AimNet (update) | Hybrid (1/3 & 1/4) | 41.72 | 0.0153 | 0.0248 | 41.23 | 33.97 | 28.47 | 30.57 | 86.94 | 125.97 | 81.16 | 24.97 | 24.11 | 131.20 | 52.11 | 65.09 | 50.11 | 35.26 | 54.69 |
| AimNet (update) | Hybrid (1/2 & 1/4) | 44.25 | 0.0165 | 0.0263 | 43.78 | 34.66 | 28.89 | 32.48 | 91.86 | 133.46 | 86.27 | 26.05 | 25.09 | 137.31 | 57.75 | 68.88 | 53.81 | 36.87 | 57.97 |
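
For reference, SAD, MSE, and MAD in the table are the standard matting error metrics between the predicted and ground-truth alpha mattes; a minimal NumPy sketch is below. SAD is conventionally reported in units of 1000; Conn. and Grad. use more involved definitions from the matting literature and are omitted here.

    import numpy as np

    # Minimal sketch of the pixel-wise matting metrics; pred and gt are
    # float arrays in [0, 1] with the same shape.
    def matting_errors(pred, gt):
        diff = pred.astype(np.float64) - gt.astype(np.float64)
        sad = np.abs(diff).sum() / 1000.0  # sum of absolute differences, /1000 by convention
        mse = (diff ** 2).mean()           # mean squared error
        mad = np.abs(diff).mean()          # mean absolute difference
        return sad, mse, mad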

Test on Your Sample Images

Here is the procedure for testing on your own sample images with our pretrained model:

  1. Set the path REPOSITORY_ROOT_PATH in the file core/config.py, as in the sketch above;

  2. Download the pretrained AimNet model from here and unzip to the folder models/pretrained/;

  3. Save your sample images in the folder samples/original/;

  4. Set up parameters in the file scripts/test_samples.sh and run:

    chmod +x scripts/test_samples.sh

    ./scripts/test_samples.sh

  5. The predicted alpha mattes and the transparent color images will be saved in the folders samples/result_alpha/ and samples/result_color/ (a minimal compositing sketch follows).
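
The transparent color image can be obtained by attaching the predicted alpha matte to the input as a fourth channel; a minimal sketch (the file names are illustrative):

    import numpy as np
    from skimage import io

    # Attach the predicted alpha as the alpha channel of the original image.
    # File names are illustrative; the alpha must match the image resolution.
    image = io.imread("samples/original/demo.jpg")      # H x W x 3, uint8
    alpha = io.imread("samples/result_alpha/demo.png")  # H x W, uint8
    rgba = np.dstack([image, alpha])                    # H x W x 4
    io.imsave("samples/result_color/demo.png", rgba)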

Below we show some sample images from the internet, their predicted alpha mattes, and the transparent results. We use the pretrained model from the section Network with the Hybrid (1/2 & 1/4) test strategy.

Statement

If you are interested in our work, please consider citing the following:

@inproceedings{ijcai2021-111,
  title     = {Deep Automatic Natural Image Matting},
  author    = {Li, Jizhizi and Zhang, Jing and Tao, Dacheng},
  booktitle = {Proceedings of the Thirtieth International Joint Conference on
               Artificial Intelligence, {IJCAI-21}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},
  editor    = {Zhi-Hua Zhou},
  pages     = {800--806},
  year      = {2021},
  month     = {8},
  note      = {Main Track},
  doi       = {10.24963/ijcai.2021/111},
  url       = {https://doi.org/10.24963/ijcai.2021/111},
}

This project is under the MIT license. For further questions, please contact Jizhizi Li at [email protected].

Relevant Projects

[1] End-to-end Animal Image Matting, arxiv, 2020 | Paper | Github
    Jizhizi Li, Jing Zhang, Stephen J. Maybank, Dacheng Tao

[2] Privacy-Preserving Portrait Matting, ACM MM, 2021 | Paper | Github
    Jizhizi Li∗, Sihan Ma∗, Jing Zhang, and Dacheng Tao
