MotionBERT: A Unified Perspective on Learning Human Motion Representations

This is the official PyTorch implementation of the paper "MotionBERT: A Unified Perspective on Learning Human Motion Representations" (ICCV 2023).


Inference

Inference implementation in Google Colab

Refer to the MotionBERT_Colab_Implementation file for the Colab inference implementation.

Cloning GitHub Repositories

  1. Clone the MotionBERT repository, which contains the pretrained models and checkpoints.

  2. Clone the AlphaPose repository.

  3. Install and import the required libraries.

  4. Select a T4 GPU runtime in Colab, then check that a CUDA-compatible GPU is available and set the device.
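
The GPU check in the last step can be sketched as follows. This is a minimal sketch, assuming PyTorch is available in the Colab runtime; `pick_device` is a hypothetical helper name, not part of the repository, and it falls back to the CPU when PyTorch or a CUDA GPU is missing.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch sees a CUDA GPU, otherwise "cpu"."""
    try:
        import torch  # assumed to be installed in the runtime
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Using device: {device}")
```

If this prints `cpu` in Colab, the runtime type is probably not set to a GPU.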

Setup AlphaPose

  1. Install additional dependencies for AlphaPose.

  2. Set up the environment to run AlphaPose.

  3. Authenticate with Google Drive and download the pre-trained models, then create the required directories in the AlphaPose folder and store the models there in Colab.

  4. Upload the custom video that AlphaPose should process.

  5. Copy the path of the video, for example /content/pose.mp4.

Generate the 2D key-points from AlphaPose

  1. Change to the AlphaPose directory and generate the 2D keypoints for the custom video with the command below:
!python scripts/demo_inference.py --cfg configs/halpe_26/resnet/256x192_res50_lr1e-3_1x.yaml --checkpoint pretrained_models/halpe26_fast_res50_256x192.pth --indir examples/res/vis --video {vid_ff} --save_video
  2. Save the video with the AlphaPose keypoints. Example output:
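
To inspect the 2D keypoints before passing them to MotionBERT, the results file can be read with the standard library. This is a hedged sketch, not code from either repository: it assumes the default AlphaPose JSON layout (a list of detections, each with an `image_id` and a flat `keypoints` list of x, y, score values), and `load_keypoints` is a hypothetical helper name.

```python
import json
from collections import defaultdict


def load_keypoints(json_path):
    """Group detections by frame; each detection becomes a list of (x, y, score)."""
    with open(json_path) as f:
        detections = json.load(f)
    frames = defaultdict(list)
    for det in detections:
        flat = det["keypoints"]
        # Assumed flat layout: x0, y0, s0, x1, y1, s1, ...
        triples = [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]
        frames[det["image_id"]].append(triples)
    return dict(frames)
```

Calling `load_keypoints("/content/AlphaPose/examples/res/alphapose-results.json")` would return one entry per frame, each holding one keypoint list per detected person.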


Run 3D Pose Estimation Inference

  1. Run MotionBERT 3D pose estimation inference with the code below:
%cd /content/MotionBERT
!python /content/MotionBERT/infer_wild.py \
--vid_path {vid_ff} \
--json_path /content/AlphaPose/examples/res/alphapose-results.json \
--out_path /content/MHFormer_out
  2. Example output of 3D pose estimation for the custom video.
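
The mesh recovery command later in this document reads /content/MHFormer_out/X3D.npy, so a quick sanity check of that file can catch a failed run early. The sketch below assumes the array is shaped (frames, joints, 3); that shape and the helper name `summarize_pose` are assumptions for illustration, not taken from the repository.

```python
import numpy as np


def summarize_pose(npy_path):
    """Load a saved 3D pose sequence and report its basic dimensions."""
    poses = np.load(npy_path)
    # Assumed layout: one (x, y, z) triple per joint per frame.
    if poses.ndim != 3 or poses.shape[-1] != 3:
        raise ValueError(f"expected (frames, joints, 3), got {poses.shape}")
    frames, joints, _ = poses.shape
    return {"frames": frames, "joints": joints}
```

For example, `summarize_pose("/content/MHFormer_out/X3D.npy")` should report 17 joints if the output follows the H36M keypoint format.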



  1. Install additional dependencies for MotionBERT to run the Human Mesh Recovery inference.

  2. Download the datasets and put them in data/mesh/. We use Human3.6M, COCO, and 3DPW for training and testing. Descriptions of the joint regressors can be found in SPIN.

  3. Download the SMPL model (basicModel_neutral_lbs_10_207_0_v1.0.0.pkl) from SMPLify, put it in data/mesh/, and rename it to SMPL_NEUTRAL.pkl.
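
The copy-and-rename in the last step can be scripted. This is a small sketch using only the standard library; `install_smpl` is a hypothetical helper, and the download location passed to it is whatever path the file was saved to.

```python
import shutil
from pathlib import Path


def install_smpl(downloaded_pkl, mesh_dir="data/mesh"):
    """Copy the downloaded SMPL model into mesh_dir as SMPL_NEUTRAL.pkl."""
    src = Path(downloaded_pkl)
    if not src.is_file():
        raise FileNotFoundError(f"SMPL model not found: {src}")
    dest_dir = Path(mesh_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / "SMPL_NEUTRAL.pkl"
    shutil.copyfile(src, dest)
    return dest
```

Run it from the MotionBERT repository root so that the default `data/mesh` path resolves correctly.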

Run Human Mesh Recovery Inference

  1. Run MotionBERT Human Mesh Recovery inference with the code below:
%cd /content/MotionBERT
!python /content/MotionBERT/infer_wild_mesh.py \
--vid_path {vid_ff} \
--json_path /content/AlphaPose/examples/res/alphapose-results.json \
--out_path /content/MeshOut \
--ref_3d_motion_path /content/MHFormer_out/X3D.npy
  2. Example output of Human Mesh Recovery for the custom video.


MotionBERT Testing

  1. Install necessary packages.

  2. Download the H36M dataset and extract the H36M .pkl data into the /content/ directory in Colab.

  3. Split the dataset to get it ready for evaluation.

  4. Run the evaluation:

!python train.py \
--config configs/pose3d/MB_train_h36m.yaml \
--evaluate checkpoint/pose3d/MB_train_h36m/best_epoch.bin
Testing output:
Protocol #1 Error (MPJPE): 39.21385534671374 mm
Protocol #2 Error (P-MPJPE): 32.93466426644884 mm
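
For reference, Protocol #1 (MPJPE) is the mean Euclidean distance between predicted and ground-truth joint positions. The sketch below illustrates the metric in plain Python; it is not the repository's evaluation code, and `mpjpe` is a name chosen here for illustration. Protocol #2 (P-MPJPE) additionally aligns the prediction to the ground truth with a rigid Procrustes transform before measuring, which is not shown.

```python
import math


def mpjpe(pred, target):
    """Mean per-joint position error over all frames and joints.

    pred and target are nested lists shaped [frames][joints][3], in mm.
    """
    total, count = 0.0, 0
    for pred_frame, target_frame in zip(pred, target):
        for p, t in zip(pred_frame, target_frame):
            total += math.dist(p, t)  # Euclidean distance for one joint
            count += 1
    if count == 0:
        raise ValueError("no joints to compare")
    return total / count


target = [[[0.0, 0.0, 0.0], [100.0, 0.0, 0.0]]]
pred = [[[3.0, 4.0, 0.0], [100.0, 0.0, 0.0]]]
print(mpjpe(pred, target))  # 2.5
```

In the toy example one joint is off by 5 mm and the other is exact, so the mean error is 2.5 mm.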

Implementation in Local Environment

Installation

  1. Download and install the latest version of Anaconda.

  2. Download and install the latest version of Python.

  3. Download and install the CUDA Toolkit version that matches your machine and GPU.


Now create the MotionBERT environment with Anaconda and install PyTorch in it.

Clone the MotionBERT GitHub repository and work inside that environment.

conda create -n motionbert python=3.7 anaconda
conda activate motionbert
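# Please install PyTorch according to your CUDA version.
# Note: CUDA 12.1 builds of PyTorch need Python 3.8 or newer; raise the Python version above if conda cannot resolve the packages.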
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
pip install -r requirements.txt

Getting Started

| Task | Document |
| --- | --- |
| Pretrain | docs/pretrain.md |
| 3D human pose estimation | docs/pose3d.md |
| Skeleton-based action recognition | docs/action.md |
| Mesh recovery | docs/mesh.md |

Download Models

| Model | Download Link | Config | Performance |
| --- | --- | --- | --- |
| MotionBERT (162MB) | OneDrive | pretrain/MB_pretrain.yaml | - |
| MotionBERT-Lite (61MB) | OneDrive | pretrain/MB_lite.yaml | - |
| 3D Pose (H36M-SH, scratch) | OneDrive | pose3d/MB_train_h36m.yaml | 39.2mm (MPJPE) |
| 3D Pose (H36M-SH, ft) | OneDrive | pose3d/MB_ft_h36m.yaml | 37.2mm (MPJPE) |
| Mesh (with 3DPW, ft) | OneDrive | mesh/MB_ft_pw3d.yaml | 88.1mm (MPVE) |

Inference implementation in Local Environment

Refer to the MotionBERT_local_Implementation file for the local inference implementation.

  1. Install PyTorch version 2.1.0

    !pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
    
  2. Install the required packages openmim and mmengine.

  3. Install mmcv version 2.1.0

    !pip install mmcv==2.1.0 -f https://download.openmmlab.com/mmcv/dist/cu121/torch2.1/index.html
    
  4. Clone the MMPose repository, which includes a MotionBERT-based 3D pose lifter.

    !git clone https://github.com/open-mmlab/mmpose.git
    
  5. Clone the MMDetection repository.

    !git clone https://github.com/open-mmlab/mmdetection.git
    
  6. Import required libraries.

  7. Run MMPose for 3D pose estimation:

    !python demo/body3d_pose_lifter_demo.py \
    demo/mmdetection_cfg/rtmdet_m_640-8xb32_coco-person.py \
    https://download.openmmlab.com/mmpose/v1/projects/rtmpose/rtmdet_m_8xb32-100e_coco-obj365-person-235e8209.pth \
    configs/body_2d_keypoint/rtmpose/body8/rtmpose-m_8xb256-420e_body8-256x192.py \
    https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/rtmpose-m_simcc-body7_pt-body7_420e-256x192-e48f03d0_20230504.pth \
    configs/body_3d_keypoint/motionbert/h36m/motionbert_dstformer-243frm_8xb32-240e_h36m-original.py \
    https://download.openmmlab.com/mmpose/v1/body_3d_keypoint/pose_lift/h36m/motionbert_ft_h36m-d80af323_20230531.pth \
    --input /pose.mp4 \
    --output-root vis_results
    
  8. Example Output of 3D Pose Estimation for custom video.



Hints

  1. The model can handle different input lengths (no more than 243 frames). There is no need to specify the input length explicitly elsewhere.

  2. The model uses 17 body keypoints (H36M format). If you are using other formats, please convert them before feeding to MotionBERT.
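
For videos longer than the 243-frame limit, one option is to split the frame range into consecutive clips before inference. The sketch below shows that idea only; non-overlapping windows and the helper name `split_into_clips` are assumptions made here, and the repository's own data loader may chunk frames differently.

```python
MAX_FRAMES = 243  # upper bound on clip length stated in the hints


def split_into_clips(num_frames, clip_len=MAX_FRAMES):
    """Return (start, end) index pairs covering num_frames in order."""
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    return [(start, min(start + clip_len, num_frames))
            for start in range(0, num_frames, clip_len)]


print(split_into_clips(500))  # [(0, 243), (243, 486), (486, 500)]
```

Each pair is a half-open range, so the last clip is simply shorter when the video length is not a multiple of 243.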

Benchmarking Results

Benchmarking results of AlphaPose and MotionBERT 3D Pose Inference for 5 iterations in Google Colab.

Fig: AlphaPose Benchmark Performance

Fig: MotionBERT Benchmark Performance
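
The README does not show how the timings were collected. A minimal timing harness for five iterations could look like the sketch below; `benchmark` is a hypothetical helper, and the lambda is a stand-in workload where a call to AlphaPose or MotionBERT inference would go.

```python
import statistics
import time


def benchmark(fn, iterations=5):
    """Run fn repeatedly and return per-run wall-clock times in seconds."""
    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return timings


# Stand-in workload; replace with the inference call being measured.
timings = benchmark(lambda: sum(range(100_000)))
print(f"mean {statistics.mean(timings):.4f}s, stdev {statistics.stdev(timings):.4f}s")
```

Reporting the spread alongside the mean helps show how much the first run (model loading, caching) differs from later runs.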

Real-time Application

Build the real-time application using Streamlit.

  1. Clone both repositories and install the required libraries and dependencies in Colab.

  2. Write separate functions to get 2D keypoints using AlphaPose and to produce the 3D pose and mesh using MotionBERT in Colab.

  3. Write a script, app.py, that runs these functions, takes the video input, and shows the output videos of 3D pose estimation and mesh recovery.

  4. Streamlit does not work directly in Colab: the Network URL it prints cannot be opened from your browser, because the app runs inside the Colab runtime.

  5. Integrate ngrok with Streamlit so that the Streamlit app can be opened in a browser.

  6. Create an ngrok account and get an API_KEY. Use it in the Streamlit application code to integrate ngrok with Streamlit and access the app.
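
The ngrok integration can be sketched as follows. This assumes the third-party `pyngrok` package as the ngrok client and Streamlit's default port 8501; the README does not say which client it uses, and the helper names here are hypothetical.

```python
import subprocess


def streamlit_command(script="app.py", port=8501):
    """Build the command line that serves the Streamlit app on a fixed port."""
    return ["streamlit", "run", script, "--server.port", str(port)]


def open_tunnel(auth_token, port=8501):
    """Expose the local port through ngrok and return the public URL."""
    from pyngrok import ngrok  # assumption: installed with `pip install pyngrok`

    ngrok.set_auth_token(auth_token)
    return ngrok.connect(str(port)).public_url


def launch(auth_token, script="app.py", port=8501):
    """Start Streamlit in the background, then open the tunnel to it."""
    process = subprocess.Popen(streamlit_command(script, port))
    return process, open_tunnel(auth_token, port)
```

Calling `launch(API_KEY)` in a Colab cell would return the background process and a public URL that can be opened in any browser.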

Contributors

mohithreddy1
