rese1f

followers: 176 following: 169 repos: 57 gists: 0

Name: Wenhao Chai

Type: User

Company: University of Washington

Bio: Univ. of Washington

Twitter: re5e1f

Location: Seattle, US

Blog: https://rese1f.github.io/

I am a master's student in the Information Processing Lab at the University of Washington. I am currently working on embodied agents and video understanding. See my homepage for more details.

When I am not doing research, I like photography, traveling, and singing.

My GPTs:

Academic Paper Writing Assistant: For AI academic papers.
Paper Search Engine: Expert in latest academic paper search and summary.

Updates:

06/2024: One technical report accepted to CVPR 2024 workshop @ NTIRE.
06/2024: We are working with Pika Lab to develop next-generation video understanding and generation models.
05/2024: One paper accepted to CVPR 2024 workshop @ Embodied AI.
04/2024: We are hosting CVPR 2024 Long-form Video Understanding Challenge @ LOVEU.
04/2024: Invited talk at AgentX seminar on our STEVE series of works.
03/2024: One paper accepted to ICLR 2024 workshop at LLM Agents.
02/2024: Two papers accepted to CVPR 2024.
02/2024: Invited talk at AAAI 2024 workshop at IMAGEOMICS.
12/2023: One paper accepted to ICASSP 2024.
12/2023: One paper accepted to AAAI 2024.
11/2023: Two papers accepted to WACV 2024 and its workshop at CV4Smalls.
09/2023: One paper accepted to ICCV 2023 workshop at TNGCV-DataComp.
09/2023: One paper accepted to IEEE T-MM.
08/2023: One paper accepted to BMVC 2023.
07/2023: Two papers accepted to ACM MM 2023.
07/2023: Finished my research internship at Microsoft Research Asia (MSRA), Beijing.
07/2023: Two papers accepted to ICCV 2023.

Wenhao Chai's Projects

3d-vista

Official implementation of ICCV 2023 paper "3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment"

ai-hackathon-molecular-dynamics

Codes for AI Hackathon Molecular Dynamics.

all-seeing

This is the official implementation of the paper "The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World"

aperture-jekyll-template

📷 Photography portfolio Jekyll template

arxiv-daily

🎓 Automatically Update Some Fields Papers Daily using Github Actions (Update Every 12 hours)

awesome-3d-gaussian-splatting

Curated list of papers and resources focused on 3D Gaussian Splatting, intended to keep pace with the anticipated surge of research in the coming months.

awesome-colorful-llm

Learn the colorful world (Vision/Audio/Robotic) from LLM

awesome-cvpr2022

Workshops, tutorials, orals, and posters with notes from CVPR 2022

awesome-detection-transformer

A collection of papers on transformers for detection and segmentation. Awesome Detection Transformer for Computer Vision (CV)

awesome-drivelm

📚 A collection of resources and papers on Large Language Models in autonomous driving

awesome-foundation-models

A curated list of foundation models for vision and language tasks

awesome-llm-3d

Awesome-LLM-3D: a curated list of resources on multi-modal large language models in the 3D world

awesome-long-context

A curated list of resources about long-context in large-language models and video understanding.

awesome-mllm-hallucination

📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).

awesome-multimodal-large-language-models

Latest Papers and Datasets on Multimodal Large Language Models

awesome-nerf

A curated list of awesome neural radiance fields papers

awesome-shapley-value

Reading list for "The Shapley Value in Machine Learning" (IJCAI 2022)

awesome-skeleton-based-action-recognition

A curated paper list of awesome skeleton-based action recognition.

awesome-video-diffusion

A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.

awesome-visual-question-answering

A curated list of Visual Question Answering (VQA) (Image/Video Question Answering), Visual Question Generation, Visual Dialog, Visual Commonsense Reasoning, and related areas.

awesome-vqvae

📚 A collection of resources and papers on Vector Quantized Variational Autoencoder (VQ-VAE) and its application

awesome_prompting_papers_in_computer_vision

A curated list of prompt-based papers in computer vision and vision-language learning.

bmprinciples

A collection of phenomena observed during the scaling of big foundation models, which may be developed into consensus, principles, or laws in the future