iccv2023-papers-with-code's Introduction

ICCV2023-Papers-with-Code

A collection of ICCV 2023 papers and open-source projects (papers with code)!

2160 papers accepted!

ICCV 2023 accepted paper IDs: https://t.co/A0mCH8gbOi

Note 1: Everyone is welcome to submit an issue to share ICCV 2023 papers and open-source projects!

Note 2: For papers from previous top CV conferences, other high-quality CV papers, and roundups, see: https://github.com/amusi/daily-paper-computer-vision

ICCV 2021

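If you want to keep up with the latest high-quality CV papers, open-source projects, and learning materials, scan the QR code to join the CVer academic exchange group! Let's learn from each other and improve together.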

[ICCV 2023 Open-Source Paper Index]

Avatars

Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control

Paper: https://arxiv.org/abs/2303.17606

Code: https://github.com/songrise/AvatarCraft

Backbone

Rethinking Mobile Block for Efficient Attention-based Models

CLIP

PromptStyler: Prompt-driven Style Generation for Source-free Domain Generalization

CLIPTrans: Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation

NeRF

IntrinsicNeRF: Learning Intrinsic Neural Radiance Fields for Editable Novel View Synthesis

Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control

FlipNeRF: Flipped Reflection Rays for Few-shot Novel View Synthesis

Tri-MipRF: Tri-Mip Representation for Efficient Anti-Aliasing Neural Radiance Fields

Diffusion Models

PoseDiffusion: Solving Pose Estimation via Diffusion-aided Bundle Adjustment

FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model

BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion

BeLFusion: Latent Diffusion for Behavior-Driven Human Motion Prediction

DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion

DIRE for Diffusion-Generated Image Detection

Prompt

Read-only Prompt Optimization for Vision-Language Few-shot Learning

Introducing Language Guidance in Prompt-based Continual Learning

Vision-Language

Read-only Prompt Optimization for Vision-Language Few-shot Learning

Object Detection

FemtoDet: An Object Detection Baseline for Energy Versus Performance Tradeoffs

Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment

Integrally Migrating Pre-trained Transformer Encoder-decoders for Visual Object Detection

ASAG: Building Strong One-Decoder-Layer Sparse Detectors via Adaptive Sparse Anchor Generation

Visual Tracking

Cross-modal Orthogonal High-rank Augmentation for RGB-Event Transformer-trackers

Semantic Segmentation

Segment Anything

MARS: Model-agnostic Biased Object Removal without Additional Supervision for Weakly-Supervised Semantic Segmentation

FreeCOS: Self-Supervised Learning from Fractals and Unlabeled Images for Curvilinear Object Segmentation

Residual Pattern Learning for Pixel-wise Out-of-Distribution Detection in Semantic Segmentation

Disentangle then Parse: Night-time Semantic Segmentation with Illumination Disentanglement

Video Object Segmentation

Towards Robust Referring Video Object Segmentation with Cyclic Relational Consensus

Video Instance Segmentation

DVIS: Decoupled Video Instance Segmentation Framework

Medical Image Classification

BoMD: Bag of Multi-label Descriptors for Noisy Chest X-ray Classification

Medical Image Segmentation

CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection

Low-level Vision

Self-supervised Learning to Bring Dual Reversed Rolling Shutter Images Alive

Super-Resolution

Spherical Space Feature Decomposition for Guided Depth Map Super-Resolution

3D Point Cloud

Robo3D: Towards Robust and Reliable 3D Perception against Corruptions

Instance-aware Dynamic Prompt Tuning for Pre-trained Point Cloud Models

Point Contrastive Prediction with Semantic Clustering for Self-Supervised Learning on Point Cloud Videos

3D Object Detection

PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images

DQS3D: Densely-matched Quantization-aware Semi-supervised 3D Detection

SparseFusion: Fusing Multi-Modal Sparse Representations for Multi-Sensor 3D Object Detection

StreamPETR: Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection

Cross Modal Transformer: Towards Fast and Robust 3D Object Detection

MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation

Revisiting Domain-Adaptive 3D Object Detection by Reliable, Diverse and Class-balanced Pseudo-Labeling

SA-BEV: Generating Semantic-Aware Bird's-Eye-View Feature for Multi-view 3D Object Detection

3D Semantic Segmentation

Rethinking Range View Representation for LiDAR Segmentation

3D Object Tracking

MBPTrack: Improving 3D Point Cloud Tracking with Memory Networks and Box Priors

Video Understanding

Unmasked Teacher: Towards Training-Efficient Video Foundation Models

Image Generation

FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model

BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion

Video Generation

Simulating Fluids in Real-World Still Images

Image Editing

Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing

Video Editing

FateZero: Fusing Attentions for Zero-shot Text-based Video Editing

Human Motion Generation

BeLFusion: Latent Diffusion for Behavior-Driven Human Motion Prediction

Low-light Image Enhancement

Implicit Neural Representation for Cooperative Low-light Image Enhancement

Scene Text Detection

Scene Text Recognition

Self-supervised Character-to-Character Distillation for Text Recognition

MRN: Multiplexed Routing Network for Incremental Multilingual Text Recognition

Image Retrieval

Zero-Shot Composed Image Retrieval with Textual Inversion

Image Fusion

DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion

Trajectory Prediction

EigenTrajectory: Low-Rank Descriptors for Multi-Modal Trajectory Forecasting

Crowd Counting

Point-Query Quadtree for Crowd Counting, Localization, and More

Video Quality Assessment

Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives

Others

MotionBERT: A Unified Perspective on Learning Human Motion Representations

Graph Matching with Bi-level Noisy Correspondence

LDL: Line Distance Functions for Panoramic Localization

Active Neural Mapping

Reconstructing Groups of People with Hypergraph Relational Reasoning

iccv2023-papers-with-code's People

Contributors

amusi, masterbin-iiau, wkcn

iccv2023-papers-with-code's Issues

ICCV 2023 paper: PoseDiffusion: Solving Pose Estimation via Diffusion-aided Bundle Adjustment

Paper name/title: PoseDiffusion: Solving Pose Estimation via Diffusion-aided Bundle Adjustment
Paper link: https://arxiv.org/pdf/2306.15667.pdf
Code link: https://github.com/facebookresearch/PoseDiffusion

Topic: Diffusion model / Structure from Motion (SfM)
Personally I think SfM is a more appropriate field for this paper though I did not find it in the readme file :)

Thanks for maintaining such a repository.
