Official PyTorch implementation of “Spatial-Semantic Collaborative Cropping for User Generated Content” (AAAI24).

Spatial-Semantic Collaborative Cropping for User Generated Content

A large amount of User Generated Content (UGC) is uploaded to the Internet daily and displayed to people worldwide through client-side devices (e.g., mobile and PC). This requires cropping algorithms to produce aesthetic thumbnails within a specific aspect ratio on different devices. However, existing image cropping works mainly focus on landmark or landscape images and fail to model the relations among multiple objects against the complex backgrounds found in UGC. Moreover, previous methods consider only the aesthetics of the cropped images and ignore content integrity, which is crucial for UGC cropping. In this paper, we propose a Spatial-Semantic Collaborative cropping network (S2CNet) for arbitrary user generated content, accompanied by a new cropping benchmark. Specifically, we first mine the visual genes of the potential objects. Then, the proposed adaptive attention graph recasts the task as a procedure of information association over visual nodes. The underlying spatial and semantic relations are ultimately centralized to the crop candidate through differentiable message passing, which helps our network efficiently preserve both aesthetics and content integrity. Extensive experiments on the proposed UGCrop5K and other public datasets demonstrate the superiority of our approach over state-of-the-art counterparts.
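
To give a feel for what "centralizing relations to the crop candidate through message passing" can mean, here is a minimal, dependency-free sketch of one attention-weighted aggregation step. It is not the S2CNet architecture: the function names, the plain scaled dot-product scoring, and the residual update are all assumptions made for illustration, standing in for the learned adaptive attention graph described above.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Dot product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))

def aggregate_to_crop(crop_feature, object_features):
    """One attention-weighted message-passing step from object nodes to a crop node.

    Hypothetical helper: scores each object node against the crop node with a
    scaled dot product, normalizes the scores with softmax, and adds the
    weighted sum of object features to the crop feature.
    """
    scale = math.sqrt(len(crop_feature))
    weights = softmax([dot(crop_feature, f) / scale for f in object_features])
    message = [
        sum(w * f[i] for w, f in zip(weights, object_features))
        for i in range(len(crop_feature))
    ]
    updated = [c + m for c, m in zip(crop_feature, message)]
    return weights, updated

# Toy example: the first object is aligned with the crop feature,
# so it receives the larger attention weight.
weights, updated = aggregate_to_crop([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

Every operation here is differentiable, which is the property that lets a step like this be trained end to end when written in PyTorch.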

AAAI24

Usage

Requirement

torch >= 1.1.0
torchvision >= 0.7.0
python3
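
Before building the extensions, it can help to confirm that the installed packages meet the minimums listed above. The following is a small sketch using only the standard library (Python 3.8+ for `importlib.metadata`); the helper names `parse_version`, `meets_minimum`, and `check_environment` are made up for this example and are not part of the repository.

```python
import re

# Minimum versions copied from the requirement list above.
MINIMUMS = {"torch": "1.1.0", "torchvision": "0.7.0"}

def parse_version(text):
    """Turn '1.13.1+cu117' into (1, 13, 1); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)

def meets_minimum(installed, minimum):
    """True if the installed version string is at least the minimum."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '0.7' compares equal to '0.7.0'.
    a = a + (0,) * (width - len(a))
    b = b + (0,) * (width - len(b))
    return a >= b

def check_environment():
    """Return {package: (installed_version_or_None, ok)} for each requirement."""
    from importlib import metadata
    report = {}
    for name, minimum in MINIMUMS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        report[name] = (installed, meets_minimum(installed, minimum))
    return report

if __name__ == "__main__":
    for name, (installed, ok) in check_environment().items():
        print(name, installed, "OK" if ok else "missing or too old")
```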

Installation

Build and install the roi_align_api and rod_align_api extensions from source.

bash make_all.sh

Preparation

  1. Download the datasets (GAICv1 and GAICv2) and arrange them as follows:

    dataset
     --GAIC
           --annotations
           --bbox
           --images
     --GAIC-journal
           --annotations
           --bbox
           --images
    
  2. Download the pre-trained weights (GAICv1 and GAICv2)
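
A quick way to confirm the folder layout from step 1 before running anything is sketched below. The folder names are taken from the tree above; the helper `missing_folders` and the assumption that `dataset` sits in the current working directory are illustrative only.

```python
from pathlib import Path

# Folder names taken from the layout in step 1.
DATASETS = ("GAIC", "GAIC-journal")
SUBFOLDERS = ("annotations", "bbox", "images")

def missing_folders(root):
    """Return the expected sub-directories (relative to root) that do not exist."""
    root = Path(root)
    expected = [root / d / s for d in DATASETS for s in SUBFOLDERS]
    return [p.relative_to(root).as_posix() for p in expected if not p.is_dir()]

if __name__ == "__main__":
    # Assumes the dataset folder is in the current working directory.
    missing = missing_folders("dataset")
    if missing:
        print("Missing folders:")
        for name in missing:
            print("  " + name)
    else:
        print("Dataset layout looks complete.")
```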

Inference

python test.py --cfg {path_to_config} --pretrained {path_to_pretrained_weight}
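
If you prefer to launch inference from a script, the command above can be wrapped as in the sketch below. Only `test.py`, `--cfg`, and `--pretrained` come from this README; the helper names are hypothetical, and the config and weight paths in the usage comment are placeholders, not files shipped with the repository.

```python
import subprocess
import sys
from pathlib import Path

def build_test_command(cfg, pretrained, script="test.py"):
    """Assemble the argv list for the inference command shown above."""
    return [sys.executable, script, "--cfg", str(cfg), "--pretrained", str(pretrained)]

def run_inference(cfg, pretrained):
    """Check that both input files exist, then run test.py and wait for it."""
    for label, path in (("config", cfg), ("weights", pretrained)):
        if not Path(path).is_file():
            raise FileNotFoundError(f"{label} file not found: {path}")
    return subprocess.run(build_test_command(cfg, pretrained), check=True)

# Usage (placeholder paths, substitute your own):
# run_inference("path/to/config", "path/to/pretrained_weight")
```

Checking the paths first turns a missing download into an immediate, readable error instead of a failure partway through model loading.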

Citation

If you find the code useful, please consider citing our paper using the following BibTeX entry.

@article{su2024spatial,
  title={Spatial-Semantic Collaborative Cropping for User Generated Content},
  author={Su, Yukun and Cao, Yiwen and Deng, Jingliang and Rao, Fengyun and Wu, Qingyao},
  journal={arXiv e-prints},
  pages={arXiv--2401},
  year={2024}
}

Acknowledgement

Our project references code from the following repositories.
