fabiopoiesi / 4dm

This is the repository of the paper "Multi-view data capture for dynamic object reconstruction using handheld augmented reality mobiles"

License: MIT License

Python 2.80% C# 82.62% ShaderLab 1.13% Objective-C++ 0.04% Starlark 1.52% C++ 11.74% C 0.05% Dockerfile 0.09% Shell 0.01%

Multi-view data capture for dynamic object reconstruction using handheld augmented reality mobiles

This repository contains a system for capturing nearly-synchronous frame streams from multiple moving handheld mobiles, suitable for 3D reconstruction of dynamic objects. Each mobile runs Simultaneous Localisation and Mapping (SLAM) on board to estimate its pose, and uses a wireless communication channel to send or receive synchronisation triggers. We use the SLAM algorithm integrated in Android ARCore. The system harvests frames and mobile poses in real time using a decentralised triggering strategy and a data-relay architecture that can be deployed either at the Edge or in the Cloud. We show the effectiveness of the system by employing it for 3D skeleton and volumetric reconstructions. The triggering strategy performs on par with an NTP-based synchronisation approach but offers greater flexibility, as it can be adjusted online based on application needs.
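As a rough illustration of how trigger-based capture yields multi-view frame sets, the sketch below groups frames by the trigger that caused them and keeps only the triggers for which every mobile delivered a frame. The `Frame` record, its field names and `group_by_trigger` are hypothetical and introduced only for this example; they are not the repository's data format or API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Frame:
    """Hypothetical frame record; the field names are assumptions."""
    mobile_id: str     # which mobile captured the frame
    trigger_id: int    # index of the synchronisation trigger
    timestamp_s: float # local capture time, in seconds


def group_by_trigger(frames, expected_mobiles):
    """Return {trigger_id: {mobile_id: Frame}} for complete triggers only."""
    groups = defaultdict(dict)
    for frame in frames:
        groups[frame.trigger_id][frame.mobile_id] = frame
    expected = set(expected_mobiles)
    # Keep a trigger only if every expected mobile contributed a frame.
    return {
        trigger_id: views
        for trigger_id, views in sorted(groups.items())
        if set(views) == expected
    }


# Two mobiles; mobile "B" missed trigger 1, so only trigger 0 is complete.
frames = [
    Frame("A", 0, 0.000),
    Frame("B", 0, 0.004),
    Frame("A", 1, 0.100),
]
complete = group_by_trigger(frames, expected_mobiles=["A", "B"])
print(sorted(complete))
```

Discarding incomplete triggers is one possible policy; a reconstruction pipeline could equally work with whatever subset of views is available.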

Paper (pdf)

Modules

This project is divided into two software blocks, the capturing system and the reconstruction software, each of which is composed of several modules. Specifically,

  • app: Android ARCore-based mobile application to capture the frames
  • data-manager: server to process the captured frames
  • mlapi-server: server to manage synchronisation and enrolment of the mobiles
  • reconstruction_sw: Python scripts to perform 3D pose and volumetric reconstructions using the captured data

Getting started

Please check the documentation

The 4DM dataset

The 4DM dataset involves six people using their mobiles to record a person acting out table tennis in an outdoor setting. It is characterised by cluttered backgrounds, cast shadows and people appearing in each other's view, who are therefore likely distractors for object detection and human pose estimation.

4DM is composed of three sequences:

  • 4DM-Easy: all mobiles are stably held by people during capture
  • 4DM-Medium: three out of six mobiles are stably held, the others undergo motion
  • 4DM-Hard: all mobiles undergo motion

The host mobile generates triggers at 10 Hz. Frames have a resolution of 640x480 and an average size of about 160 KB. The latency between the mobiles and the Relay Server was about 5 ms.
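The figures above allow a back-of-the-envelope estimate of the data rate involved. The sketch below assumes that each of the six mobiles sends exactly one frame per trigger, that no frames are dropped, that protocol overhead is ignored, and that 1 MB = 1000 KB; none of these assumptions is stated in the dataset description.

```python
# Figures quoted in the dataset description.
TRIGGER_RATE_HZ = 10   # triggers per second generated by the host mobile
FRAME_SIZE_KB = 160    # approximate average frame size
NUM_MOBILES = 6        # mobiles used in the 4DM recordings


def data_rate_kb_per_s(trigger_rate_hz, frame_size_kb, num_mobiles=1):
    """Payload rate in KB/s, assuming one frame per mobile per trigger."""
    return trigger_rate_hz * frame_size_kb * num_mobiles


per_mobile = data_rate_kb_per_s(TRIGGER_RATE_HZ, FRAME_SIZE_KB)
total = data_rate_kb_per_s(TRIGGER_RATE_HZ, FRAME_SIZE_KB, NUM_MOBILES)

print(f"per mobile: {per_mobile / 1000:.1f} MB/s")  # 1.6 MB/s
print(f"all mobiles: {total / 1000:.1f} MB/s")      # 9.6 MB/s
```

Under these assumptions the relay would need to sustain roughly 9.6 MB/s of frame payload in total, about 1.6 MB/s per mobile.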

Download (zip)

Citing our work

Please cite the following paper if you use our code or our dataset:

@article{Bortolon2021,
    title = {Multi-view data capture for dynamic object reconstruction using handheld augmented reality mobiles},
    author = {Bortolon, Matteo and Bazzanella, Luca and Poiesi, Fabio},
    journal = {Journal of Real-Time Image Processing},
    volume = {18},
    pages = {345--355},
    month = {Mar},
    year = {2021}
}

Acknowledgements

This research has received funding from the Fondazione CARITRO - Ricerca e Sviluppo programme 2018-2020.
