
I am a master's student in the Information Processing Lab at the University of Washington. I am currently working on embodied agents and video understanding. Have a look at my homepage for more details.

When I am not doing research, I like photography, traveling, and singing.





Updates:

  • 06/2024: One technical report accepted to CVPR 2024 workshop @ NTIRE.
  • 06/2024: We are working with Pika Lab to develop next-generation video understanding and generation models.
  • 05/2024: One paper accepted to CVPR 2024 workshop @ Embodied AI.
  • 04/2024: We are hosting CVPR 2024 Long-form Video Understanding Challenge @ LOVEU.
  • 04/2024: Invited talk at AgentX seminar about our STEVE series works.
  • 03/2024: One paper accepted to ICLR 2024 workshop at LLM Agents.
  • 02/2024: Two papers accepted to CVPR 2024.
  • 02/2024: Invited talk at AAAI 2024 workshop at IMAGEOMICS.
  • 12/2023: One paper accepted to ICASSP 2024.
  • 12/2023: One paper accepted to AAAI 2024.
  • 11/2023: Two papers accepted to WACV 2024 and its workshop at CV4Smalls.
  • 09/2023: One paper accepted to ICCV 2023 workshop at TNGCV-DataComp.
  • 09/2023: One paper accepted to IEEE T-MM.
  • 08/2023: One paper accepted to BMVC 2023.
  • 07/2023: Two papers accepted to ACM MM 2023.
  • 07/2023: Finished my research internship at Microsoft Research Asia (MSRA), Beijing.
  • 07/2023: Two papers accepted to ICCV 2023.

Wenhao Chai's Projects

3d-vista

Official implementation of ICCV 2023 paper "3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment"

all-seeing

This is the official implementation of the paper "The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World"

arxiv-daily

🎓 Automatically updates papers in selected fields daily using GitHub Actions (updates every 12 hours)

awesome-3d-gaussian-splatting

Curated list of papers and resources focused on 3D Gaussian Splatting, intended to keep pace with the anticipated surge of research in the coming months.

awesome-detection-transformer

A collection of papers on transformers for detection and segmentation. Awesome Detection Transformer for Computer Vision (CV)

awesome-drivelm

📚 A collection of resources and papers on Large Language Models in autonomous driving

awesome-llm-3d

Awesome-LLM-3D: a curated list of resources on multi-modal large language models in the 3D world

awesome-long-context

A curated list of resources about long context in large language models and video understanding.

awesome-nerf

A curated list of awesome neural radiance fields papers

awesome-video-diffusion

A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.

awesome-visual-question-answering

A curated list of Visual Question Answering (VQA) (Image/Video Question Answering), Visual Question Generation, Visual Dialog, Visual Commonsense Reasoning, and related areas.

awesome-vqvae

📚 A collection of resources and papers on Vector Quantized Variational Autoencoder (VQ-VAE) and its applications

bmprinciples

A collection of phenomena observed during the scaling of big foundation models, which may be developed into consensus, principles, or laws in the future

citygen

🏙️🌆🌃 Try Infinite and Controllable 3D City Layout Generation!

cvinw_readings

A collection of papers on the topic of "Computer Vision in the Wild (CVinW)"

ed-pose

[ICLR 2023] Official implementation of the paper "Explicit Box Detection Unifies End-to-End Multi-Person Pose Estimation"

leetcode

✏️ LeetCode solutions in C++11 and Python 3

llama-efficient-tuning

Easy-to-use fine-tuning framework using PEFT (PT+SFT+RLHF with QLoRA) (LLaMA-2, BLOOM, Falcon, Baichuan)
