MotionBERT: A Unified Perspective on Learning Human Motion Representations

This is the official PyTorch implementation of the paper "MotionBERT: A Unified Perspective on Learning Human Motion Representations" (ICCV 2023).

Inference

Please refer to docs/inference.md for 3D Pose Estimation and Human Mesh Recovery.

Inference implementation in Google Colab

Refer to the MotionBERT_Colab_Implementation file for the Colab inference implementation.

Cloning GitHub Repositories

Clone the MotionBERT repository, which contains the pretrained models and checkpoints.
Clone the AlphaPose repository.
Install and import the required libraries.
Select a T4 GPU runtime in Colab.
Check whether a CUDA-compatible GPU is available and set the device accordingly.
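
The GPU check in the last step can be sketched as follows. This is a minimal illustration, not code from the notebook: the helper name pick_device is hypothetical, and the fallback to "cpu" when PyTorch is not installed is an assumption added so the snippet also runs outside Colab.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # available in the Colab runtime; may be missing elsewhere
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Using device: {device}")
```

If this prints "cpu" in Colab, the runtime type is probably not set to a GPU.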

Setup AlphaPose

Install the additional dependencies for AlphaPose.
Set up the environment to run AlphaPose.
Authenticate with Google Drive, download the pretrained models, and store them in newly created directories inside the AlphaPose folder in Colab.
Upload the custom video that AlphaPose will process.
Copy the path of the uploaded video, for example /content/pose.mp4.

Generate the 2D keypoints from AlphaPose

Change directory to AlphaPose and generate the 2D keypoints for the custom video using the code below:

!python scripts/demo_inference.py --cfg configs/halpe_26/resnet/256x192_res50_lr1e-3_1x.yaml --checkpoint pretrained_models/halpe26_fast_res50_256x192.pth --indir examples/res/vis --video {vid_ff} --save_video
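
Before passing alphapose-results.json to MotionBERT, it can help to check what AlphaPose wrote. The sketch below assumes the usual AlphaPose result layout: a JSON list of detections, each with an "image_id" and a flat "keypoints" list of x, y, score triples. The function names are hypothetical and the layout should be verified against your own output file.

```python
import json
from collections import defaultdict


def group_detections(detections):
    """Group detections by frame and unflatten each keypoint list."""
    frames = defaultdict(list)
    for det in detections:
        flat = det["keypoints"]  # assumed layout: x0, y0, s0, x1, y1, s1, ...
        if len(flat) % 3 != 0:
            raise ValueError("keypoints length is not a multiple of 3")
        joints = [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]
        frames[det["image_id"]].append(joints)
    return dict(frames)


def load_alphapose_results(json_path):
    """Read an AlphaPose result file and group its detections by frame."""
    with open(json_path, "r", encoding="utf-8") as f:
        return group_detections(json.load(f))
```

For example, load_alphapose_results("/content/AlphaPose/examples/res/alphapose-results.json") would return a dictionary mapping each frame identifier to the list of people detected in it.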

The output video with AlphaPose keypoints is saved. Example output:

Run 3D Pose Estimation Inference

Run MotionBERT 3D pose estimation inference using the code below:

%cd /content/MotionBERT
!python /content/MotionBERT/infer_wild.py \
--vid_path {vid_ff} \
--json_path /content/AlphaPose/examples/res/alphapose-results.json \
--out_path /content/MHFormer_out
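
Outside a notebook, the same call can be assembled in a plain Python script. The sketch below only reuses the three flags shown in the command above; the helper names and the input-file check are illustrative additions, not part of MotionBERT.

```python
import subprocess
import sys
from pathlib import Path


def build_infer_command(vid_path, json_path, out_path, script="infer_wild.py"):
    """Return the argument list for the infer_wild.py call shown above."""
    return [
        sys.executable, script,
        "--vid_path", str(vid_path),
        "--json_path", str(json_path),
        "--out_path", str(out_path),
    ]


def run_inference(vid_path, json_path, out_path, repo_dir="/content/MotionBERT"):
    """Check that the inputs exist, then run the script from the repository root."""
    for p in (vid_path, json_path):
        if not Path(p).is_file():
            raise FileNotFoundError(p)
    cmd = build_infer_command(vid_path, json_path, out_path)
    subprocess.run(cmd, cwd=repo_dir, check=True)
```

Passing the arguments as a list avoids shell quoting problems when the video path contains spaces.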

Example Output of 3D Pose Estimation for custom video.

Install the additional dependencies MotionBERT needs for Human Mesh Recovery inference.
Download the datasets here and put them in data/mesh/. We use Human3.6M, COCO, and PW3D for training and testing. Descriptions of the joint regressors can be found in SPIN.
Download the SMPL model (basicModel_neutral_lbs_10_207_0_v1.0.0.pkl) from SMPLify, put it in data/mesh/, and rename it to SMPL_NEUTRAL.pkl.

Run Human Mesh Recovery Inference

Run MotionBERT Human Mesh Recovery inference using the code below:

%cd /content/MotionBERT
!python /content/MotionBERT/infer_wild_mesh.py \
--vid_path {vid_ff} \
--json_path /content/AlphaPose/examples/res/alphapose-results.json \
--out_path /content/MeshOut \
--ref_3d_motion_path /content/MHFormer_out/X3D.npy

Example Output of Human Mesh Recovery for custom video.

MotionBERT Testing

Install the necessary packages.
Download the H36M dataset and extract the H36M pkl data into the /content/ directory in Colab.
Split the dataset to prepare it for evaluation.
Run the evaluation:

!python train.py \
--config configs/pose3d/MB_train_h36m.yaml \
--evaluate checkpoint/pose3d/MB_train_h36m/best_epoch.bin

Testing output:
Protocol #1 Error (MPJPE): 39.21385534671374 mm
Protocol #2 Error (P-MPJPE): 32.93466426644884 mm
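
For reference, MPJPE is the mean Euclidean distance between predicted and ground-truth joint positions, and P-MPJPE is the same error computed after rigidly aligning the prediction to the ground truth. The sketch below illustrates MPJPE for a single frame in pure Python. It is not the repository's evaluation code, and the function name is hypothetical.

```python
import math


def mpjpe(pred, target):
    """Mean per-joint position error for one frame.

    pred and target are equal-length lists of (x, y, z) joints in the same unit.
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and the same length")
    total = 0.0
    for p, t in zip(pred, target):
        # Euclidean distance between the predicted and true joint
        total += math.sqrt(sum((a - b) ** 2 for a, b in zip(p, t)))
    return total / len(pred)


# Two joints: one exact, one off by a 3-4-5 triangle, so the mean error is 2.5
error = mpjpe([(0, 0, 0), (3, 4, 0)], [(0, 0, 0), (0, 0, 0)])
print(error)
```

A benchmark figure such as the one above additionally averages this quantity over all frames of the test set.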

Implementation in Local Environment

Installation

Download and install the latest version of Anaconda.
Download and install the latest version of Python.
Download and install the CUDA Toolkit version that matches your machine and GPU.

Create a MotionBERT environment with Anaconda and install PyTorch in it.

Then clone the MotionBERT GitHub repository and install its requirements from the repository root.

conda create -n motionbert python=3.7 anaconda
conda activate motionbert
# Please install PyTorch according to your CUDA version.
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
pip install -r requirements.txt

Getting Started

Task	Document
Pretrain	docs/pretrain.md
3D human pose estimation	docs/pose3d.md
Skeleton-based action recognition	docs/action.md
Mesh recovery	docs/mesh.md

Download Models

Model	Download Link	Config	Performance
MotionBERT (162MB)	OneDrive	pretrain/MB_pretrain.yaml	-
MotionBERT-Lite (61MB)	OneDrive	pretrain/MB_lite.yaml	-
3D Pose (H36M-SH, scratch)	OneDrive	pose3d/MB_train_h36m.yaml	39.2mm (MPJPE)
3D Pose (H36M-SH, ft)	OneDrive	pose3d/MB_ft_h36m.yaml	37.2mm (MPJPE)
Mesh (with 3DPW, ft)	OneDrive	mesh/MB_ft_pw3d.yaml	88.1mm (MPVE)

Inference implementation in Local Environment

Refer to the MotionBERT_local_Implementation file for the local inference implementation.

Install PyTorch version 2.1.0

!pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121

Install required packages openmim and mmengine

Install mmcv version 2.1.0

!pip install mmcv==2.1.0 -f https://download.openmmlab.com/mmcv/dist/cu121/torch2.1/index.html

Clone the MMPose repository, which includes a MotionBERT implementation.

!git clone https://github.com/open-mmlab/mmpose.git

Clone MMDetection Repository

!git clone https://github.com/open-mmlab/mmdetection.git

Import required libraries.

Run MMPose for 3D Pose Estimation

!python demo/body3d_pose_lifter_demo.py \
demo/mmdetection_cfg/rtmdet_m_640-8xb32_coco-person.py \
https://download.openmmlab.com/mmpose/v1/projects/rtmpose/rtmdet_m_8xb32-100e_coco-obj365-person-235e8209.pth \
configs/body_2d_keypoint/rtmpose/body8/rtmpose-m_8xb256-420e_body8-256x192.py \
https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/rtmpose-m_simcc-body7_pt-body7_420e-256x192-e48f03d0_20230504.pth \
configs/body_3d_keypoint/motionbert/h36m/motionbert_dstformer-243frm_8xb32-240e_h36m-original.py \
https://download.openmmlab.com/mmpose/v1/body_3d_keypoint/pose_lift/h36m/motionbert_ft_h36m-d80af323_20230531.pth \
--input /pose.mp4 \
--output-root  vis_results

Example Output of 3D Pose Estimation for custom video.

Hints

The model can handle different input lengths (up to 243 frames), so the input length does not need to be specified elsewhere.
The model uses 17 body keypoints (H36M format). If you use another format, convert the keypoints before feeding them to MotionBERT.
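
For videos longer than 243 frames, one option is to process the sequence in consecutive clips. The sketch below shows only the index arithmetic. The function name is hypothetical, and the repository's own inference scripts may already split long inputs, so treat this as an illustration rather than a required step.

```python
def split_into_clips(n_frames, clip_len=243):
    """Return (start, end) index pairs covering n_frames, each at most clip_len long."""
    if n_frames < 0 or clip_len <= 0:
        raise ValueError("n_frames must be >= 0 and clip_len must be > 0")
    return [
        (start, min(start + clip_len, n_frames))
        for start in range(0, n_frames, clip_len)
    ]


clips = split_into_clips(500)
print(clips)  # [(0, 243), (243, 486), (486, 500)]
```

Each pair can be used as a slice, for example keypoints[start:end], and the last clip is simply shorter than the others.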

Benchmarking Results

Benchmarking results of AlphaPose and MotionBERT 3D Pose Inference for 5 iterations in Google Colab.

Fig: AlphaPose Benchmark Performance

Fig: MotionBERT Benchmark Performance

Real-time Application

Build the real-time application using Streamlit.

First, clone both repositories and install the required libraries and dependencies in Colab.
Write separate functions that obtain 2D keypoints with AlphaPose and produce the 3D pose and mesh with MotionBERT.
Write a script, app.py, that calls these functions, takes a video as input, and shows the output videos of 3D pose estimation and mesh recovery.
Streamlit does not work directly in Colab: the Network URL it prints is not reachable from your browser because the app runs on the Colab machine.
We therefore integrate ngrok with Streamlit so that the app can be opened in a browser.
Create an ngrok account and get an API_KEY. Use it in the Streamlit application code to connect ngrok and access the Streamlit app.
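
One way to wire the two together is sketched below. It assumes the third-party pyngrok package (installed with pip install pyngrok) as the ngrok client, which the steps above do not name, and it assumes the API_KEY is the ngrok auth token. The function names and port 8501 are illustrative.

```python
import subprocess
import sys


def streamlit_command(script="app.py", port=8501):
    """Argument list that starts a Streamlit app on the given local port."""
    return [
        sys.executable, "-m", "streamlit", "run", script,
        "--server.port", str(port),
    ]


def expose_app(auth_token, script="app.py", port=8501):
    """Start Streamlit in the background and open an ngrok tunnel to it."""
    from pyngrok import ngrok  # assumption: pyngrok is the ngrok client in use

    ngrok.set_auth_token(auth_token)
    server = subprocess.Popen(streamlit_command(script, port))
    tunnel = ngrok.connect(str(port))
    return server, tunnel.public_url
```

Calling expose_app with your token returns the Streamlit process and a public URL that can be opened in a browser. Stop the process when you are done so the Colab runtime is not left serving the app.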
