rese1f Goto Github PK

followers: 178 following: 177 repos: 57 gists: 0

Name: Wenhao Chai

Type: User

Company: University of Washington

Bio: Univ. of Washington

Twitter: re5e1f

Location: Seattle, US

Blog: https://rese1f.github.io/

I am currently an intern at Pika Lab and a master's student in the Information Processing Lab at the University of Washington. I work on video understanding and generation, as well as embodied agents. Have a look at my homepage for more details.

When I am not doing research, I like photography, traveling, and singing.

My GPTs:

Academic Paper Writing Assistant: For AI academic papers.
Paper Search Engine: Expert in latest academic paper search and summary.

Updates:

07/2024: Two papers accepted to ACM MM 2024.
07/2024: Two papers accepted to ECCV 2024.
06/2024: One technical report accepted to CVPR 2024 workshop @ NTIRE.
06/2024: We are working with Pika Lab to develop next-generation video understanding and generation models.
05/2024: One paper accepted to CVPR 2024 workshop @ Embodied AI.
04/2024: We are hosting CVPR 2024 Long-form Video Understanding Challenge @ LOVEU.
04/2024: Invited talk at AgentX seminar on our STEVE series of works.
03/2024: One paper accepted to ICLR 2024 workshop at LLM Agents.
02/2024: Two papers accepted to CVPR 2024.
02/2024: Invited talk at AAAI 2024 workshop at IMAGEOMICS.
12/2023: One paper accepted to ICASSP 2024.
12/2023: One paper accepted to AAAI 2024.
11/2023: Two papers accepted to WACV 2024 and its workshop at CV4Smalls.
09/2023: One paper accepted to ICCV 2023 workshop at TNGCV-DataComp.
09/2023: One paper accepted to IEEE T-MM.
08/2023: One paper accepted to BMVC 2023.
07/2023: Two papers accepted to ACM MM 2023.
07/2023: Finished my research internship at Microsoft Research Asia (MSRA), Beijing.
07/2023: Two papers accepted to ICCV 2023.

Wenhao Chai's Projects

llm-agent-paper-list

The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.

minisora

The Mini Sora project aims to explore the implementation path and future development direction of Sora.

missing-label-detection

With imperfect bounding-box annotations (30% of labels are missing in this project), standard detection methods such as YOLOv5 do not achieve good results. This project uses the COCO dataset and greatly reduces the negative influence of missing labels by using a modified loss function and a dynamic weight.

mmlab-cn-ntu

test

moviechat

[CVPR 2024] 🎬💭 chat with over 10K frames of video!

multi-modality-arena

Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!

muse-maskgit-pytorch

Implementation of Muse: Text-to-Image Generation via Masked Generative Transformers, in Pytorch

old_web

Personal website built on Beautiful Jekyll; feel free to clone and modify.

openscene

3D Occupancy Prediction Benchmark in Autonomous Driving

pose2img

pose-driven human natural image generation based on latent diffusion model

poseda

[ICCV 2023] Global Adaptation meets Local Generalization: Unsupervised Domain Adaptation for 3D Human Pose Estimation

random-bridge-generator

a blender platform for developing computer vision-based structural inspection algorithms

rese1f

Config files for my GitHub profile.

self-supervised-cross-view-3d-human-pose-estimation-and-localization-in-video

An algorithm to process 3D multi-view (cross-view) datasets based on Human3.6M or others, and to map 2D joint locations to 3D in our dataset.

spinal-segmentation-and-3d-reconstruction

[ICSU 2022 best] Code for the ICSU 2022 best paper "Automatic Spinal Ultrasound Image Segmentation and Deployment for Real-time Spine Volumetric Reconstruction"

stablevideo

[ICCV 2023] StableVideo: Text-driven Consistency-aware Diffusion Video Editing

steve

⛏💎 STEVE in Minecraft is for See and Think: Embodied Agent in Virtual Environment

structural-health-monitoring-hrnet

Codes for the competition IC-SHM 2021.

tuning_playbook

A playbook for systematically maximizing the performance of deep learning models.

uiuc-cs357-22sp

Workspace for CS357

uniap

[AAAI 2024] UniAP: Towards Universal Animal Perception in Vision via Few-shot Learning

univhp

Unified Human-centric Perception Model and Benchmark in Sports

vfd-2000

[ICTAI 2022] VFD-2000 Dataset and official page for "Weakly Supervised Two-Stage Training Scheme for Deep Video Fight Detection Model"

video-dataset-maker

A pipeline that covers downloading videos from YouTube and extracting frames using ffmpeg.
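
As a rough illustration of the frame-extraction step described above, the sketch below builds an ffmpeg command that samples frames at a fixed rate. It is not taken from the repository: the function name `build_ffmpeg_frame_command`, the `frame_%06d.jpg` naming pattern, and the default rate are all assumptions made for this example, and the YouTube download step is left out because the description does not say which tool the pipeline uses.

```python
from pathlib import Path


def build_ffmpeg_frame_command(video_path: str, out_dir: str, fps: float = 1) -> list[str]:
    """Build an ffmpeg argument list that writes `fps` frames per second as JPEGs.

    Hypothetical helper for illustration; the output naming pattern is an assumption.
    """
    # %06d is expanded by ffmpeg into a zero-padded frame counter.
    pattern = str(Path(out_dir) / "frame_%06d.jpg")
    # -i selects the input file; -vf fps=N applies ffmpeg's fps filter to resample frames.
    return ["ffmpeg", "-i", video_path, "-vf", f"fps={fps}", pattern]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without ffmpeg installed.
    print(" ".join(build_ffmpeg_frame_command("clip.mp4", "frames", fps=2)))
```

With ffmpeg installed and the output directory already created, the returned list can be passed to `subprocess.run(cmd, check=True)`.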

video_captioning_datasets

A summary of video-to-text datasets. This repository is part of the review paper *Bridging Vision and Language from the Video-to-Text Perspective: A Comprehensive Review*.

xpretrain

Multi-modality pre-training

zju-ncov-hitcarder-sample

Sample of https://github.com/Long0x0/ZJU-nCov-Hitcarder.
