pbattaglia / physicsvision
License: MIT License
Our plan is to extend Yibiao's existing visual parsing module so that it uses physical constraints to improve 3D scene interpretations. The visual parser will use physical constraints in two ways: 1) by respecting a "physics prior", which penalizes objects that are unstable or that interpenetrate other objects, and 2) by incorporating physically plausible proposals into the parser's search scheme, so that most of the effort is spent on scene states that the physics prior will not penalize. Both uses rely on forward simulation of physical dynamics with a physics engine (Bullet).
We have three goals for this week:
*** Tentative
When evaluating the physics score, we can factor the penalty on a per-object basis. Subsequent physical-adjustment proposals can then selectively target the objects that scored poorly on physical plausibility.
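A minimal sketch of the per-object factorization, assuming that each object's displacement during a short forward simulation (for example, one run in Bullet) is already available. Using displacement as the penalty is an assumption made here for illustration, and the names `physics_penalties`, `total_log_prior`, `pick_adjustment_target`, and the weight `lam` are hypothetical rather than part of the parser.

```python
import math
import random

def physics_penalties(displacements, lam=1.0):
    """Per-object penalty: lam * distance the object moved under simulation.

    `displacements` maps object id -> (dx, dy, dz) after a short forward
    simulation. A stable object barely moves, so its penalty is near 0.
    (Displacement as the penalty measure is an assumption of this sketch.)
    """
    return {obj: lam * math.sqrt(dx * dx + dy * dy + dz * dz)
            for obj, (dx, dy, dz) in displacements.items()}

def total_log_prior(penalties):
    """The scene-level log physics prior factors as a sum over objects."""
    return -sum(penalties.values())

def pick_adjustment_target(penalties, rng=random):
    """Sample an object with probability proportional to its penalty."""
    objs = list(penalties)
    weights = [penalties[o] for o in objs]
    if sum(weights) == 0:
        return rng.choice(objs)  # every object is stable: pick uniformly
    return rng.choices(objs, weights=weights, k=1)[0]
```

Because the penalty is a sum over objects, a proposal that moves one object only needs that object's term (and those of objects it touches) to be re-evaluated.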
Proposal details:
a. Given the parser's existing 2D detection, we will develop a physical contact detection function that pushes the 2D detection through the 3D scene and searches for 3D support and contact points. This "push" is aligned with a conical projection outward from the camera that inscribes the object, and it allows the proposal to select a z-coordinate at which the object is supported by (contact from below) or attached to (contact from a side) an existing object in the scene. The delete move is the reverse of an add move, which keeps the sampler reversible.
b. For existing objects in the scene, the depth can be modified using a procedure similar to (a).
c. For existing objects, we can change the location or pose by "bumping" the object, so that the resulting proposed state is at a physically plausible static equilibrium.
d. We assume that floating blocks have probability 0. A valid proposal must leave the object either supported from below (on its own or by a second, lower object) or attached to another object in the scene.
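The depth search in (a) and (b) can be sketched under strong simplifying assumptions: objects are axis-aligned boxes, only the central ray of the cone is used, and only support from below is considered (side attachment and interpenetration checks are left out). The `Box` class and `supported_depths` function are hypothetical names for illustration, with y as the up axis and z pointing away from the camera.

```python
from dataclasses import dataclass

@dataclass
class Box:
    """Axis-aligned box; y is up, z points away from the camera."""
    cx: float
    cy: float
    cz: float
    hx: float  # half-extents along x, y, z
    hy: float
    hz: float

    @property
    def top(self):
        return self.cy + self.hy

def supported_depths(cam, direction, half_height, scene, ground_y=0.0, eps=1e-9):
    """Return (t, support) pairs at which the object rests on a surface.

    The object centre is cam + t * direction. For each candidate surface
    (the ground plane or a box top) solve for the t at which the object's
    bottom face reaches that height, then keep t only if the centre lies
    over the surface. `support` is None for the ground plane.
    """
    ox, oy, oz = cam
    dx, dy, dz = direction
    if abs(dy) < eps:
        return []  # horizontal ray: height never changes along it
    hits = []
    for support in [None] + list(scene):
        surface_y = ground_y if support is None else support.top
        t = (surface_y + half_height - oy) / dy
        if t <= 0:
            continue  # candidate lies behind the camera
        x, z = ox + t * dx, oz + t * dz
        if support is None or (abs(x - support.cx) <= support.hx
                               and abs(z - support.cz) <= support.hz):
            hits.append((t, support))
    return sorted(hits, key=lambda hit: hit[0])
```

Each returned depth is a non-floating placement consistent with (d); a proposal could then choose among them, for instance uniformly or weighted by the physics score.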
Evaluate our algorithm:
The methodology:
a. Evaluate the 3D detection rate by finding correspondences between the parsing results and the ground truth.
b. Evaluate the 2D segmentation accuracy by comparing 2D projected segmentations of inferred sample scenes with 2D projected segmentations of ground truth, where each projection is taken from various camera angles around the scene.
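A rough sketch of both measures, under assumptions not stated in the plan: correspondences are found by greedy nearest-centre matching within a distance threshold, and segmentation accuracy is measured as intersection-over-union averaged across views. The function names and the `max_dist` threshold are hypothetical.

```python
import math

def detection_rate(predicted, ground_truth, max_dist=0.5):
    """Fraction of ground-truth centres matched to a distinct prediction.

    Greedy nearest-centre matching; each prediction is used at most once.
    """
    unused = list(predicted)
    matched = 0
    for gt in ground_truth:
        best = min(unused, key=lambda p: math.dist(p, gt), default=None)
        if best is not None and math.dist(best, gt) <= max_dist:
            unused.remove(best)
            matched += 1
    return matched / len(ground_truth) if ground_truth else 1.0

def mask_iou(a, b):
    """Intersection-over-union of two binary masks given as nested lists."""
    inter = union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            inter += bool(pa and pb)
            union += bool(pa or pb)
    return inter / union if union else 1.0

def mean_view_iou(pred_views, gt_views):
    """Average IoU over masks rendered from matching camera angles."""
    scores = [mask_iou(p, g) for p, g in zip(pred_views, gt_views)]
    return sum(scores) / len(scores)
```

Greedy matching is order-dependent; an optimal assignment (for example, the Hungarian algorithm) could be substituted if ties between nearby blocks matter.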
The dataset:
We can generate the synthetic red/yellow scene and the tower scene with ground truth.
We also plan to build some real block scenes from wooden toy blocks and photograph them. These can be used in the paper for qualitative evaluation.