Robot Grasping - Generalising to Novel Objects

Repository to track the progress in robot grasping, including the datasets and the current state-of-the-art.

Methods

Detecting Grasp Poses

Summary: Given two or more images, the algorithm finds a small set of points that indicate good grasping locations. These points are then triangulated to compute a 3D grasping position. It is a supervised learning method, trained on synthetic data, and effectively grasps a wide range of (unseen) objects.

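The triangulation step can be illustrated with the standard direct linear transform (DLT); this is a generic sketch, not the paper's exact method, and the projection matrices and matched detections are assumed inputs.

```python
# Generic DLT triangulation sketch (not the authors' exact method):
# recover a 3D grasp point from matched detections in two views.
import numpy as np

def triangulate(P1, P2, uv1, uv2):
    """P1, P2: 3x4 camera projection matrices; uv1, uv2: matched pixel detections."""
    (u1, v1), (u2, v2) = uv1, uv2
    A = np.array([
        u1 * P1[2] - P1[0],
        v1 * P1[2] - P1[1],
        u2 * P2[2] - P2[0],
        v2 * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)      # least-squares solution: last right singular vector
    X = Vt[-1]
    return X[:3] / X[3]              # homogeneous -> 3D grasp location
```
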
Summary: Introduces a 5-dimensional grasp representation and presents a two-step cascaded system. The first network has fewer features and effectively prunes unlikely grasps; the second network only handles the few remaining good candidates. The input is a single RGB-D image. A small network evaluates potential grasps, and the best ones are passed to the second, larger network, which outputs the best grasp. This is then converted to a robot grasp consisting of a grasping point and an approach vector, computed from the rectangle's parameters and the surface normal at the rectangle's center. The network is trained on the Cornell Dataset, which is specifically set up for parallel-gripper robots.

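A minimal sketch of the 5D rectangle representation and its conversion to a robot grasp; the `deproject` helper and per-pixel normal map are hypothetical stand-ins for a pinhole camera model and depth-based normal estimation.

```python
# Minimal sketch, assuming hypothetical helpers: the 5D grasp rectangle
# and its conversion to a grasping point plus approach vector.
from dataclasses import dataclass

@dataclass
class GraspRectangle:
    x: float      # rectangle centre, image column (pixels)
    y: float      # rectangle centre, image row (pixels)
    theta: float  # gripper orientation in the image plane (radians)
    h: float      # rectangle height, i.e. gripper jaw size (pixels)
    w: float      # rectangle width, i.e. gripper opening (pixels)

def to_robot_grasp(rect, depth, deproject, normals):
    """deproject(u, v, z) -> 3D point (hypothetical pinhole helper);
    normals: per-pixel surface normals estimated from the depth image."""
    u, v = int(rect.x), int(rect.y)
    point = deproject(u, v, depth[v, u])   # 3D grasping point
    approach = -normals[v, u]              # approach along the inward surface normal
    return point, approach
```
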
Summary: Presents single-stage regression to grasp bounding boxes, avoiding sliding-window methods. Runs at 13 fps on a GPU. Can also predict multiple grasps, which works better, especially for objects that can be grasped in multiple ways. Also uses the 5D representation. A standard convolutional network outputs 6 neurons; trained on the Cornell Dataset, pretrained on ImageNet. Best so far: 88 percent accuracy.

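A minimal sketch of such a single-stage regressor, assuming PyTorch and a ResNet-18 backbone. One plausible reading of the six outputs is (x, y, sin 2θ, cos 2θ, h, w), which keeps the 180°-symmetric grasp angle continuous; that encoding is an assumption here.

```python
# Sketch of a single-stage grasp regressor with a 6-neuron head
# (assumed encoding: x, y, sin2θ, cos2θ, h, w).
import torch
import torch.nn as nn
import torchvision.models as models

class DirectGraspRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Linear(backbone.fc.in_features, 6)  # 6 grasp outputs
        self.net = backbone

    def forward(self, rgb):                    # rgb: (B, 3, H, W)
        out = self.net(rgb)                    # (B, 6)
        x, y, s2t, c2t, h, w = out.unbind(dim=1)
        theta = 0.5 * torch.atan2(s2t, c2t)    # recover the grasp angle
        return x, y, theta, h, w
```
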
Summary: Implements ResNet-50. Cornell Dataset; pretrained on ImageNet; 5D pose. Best so far: 89.1 percent accuracy. Does not test on a real robot.

Summary: Predicts multiple grasp poses. The network has two parts: a feature extractor (DNN) and a multi-grasp predictor, which regresses grasp rectangles from oriented anchor boxes and classifies them as graspable or not. Cornell Dataset. Best so far: 97.74 percent accuracy. Does not test on a real robot.

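A sketch of the oriented-anchor idea (the stride, size, and angle set here are illustrative, not the paper's): anchor rectangles at fixed orientations are tiled over every feature-map cell, then regressed and classified.

```python
# Illustrative oriented grasp anchors: k fixed-orientation rectangles
# per feature-map cell, later regressed to grasps and scored graspable/not.
import numpy as np

def oriented_anchors(feat_h, feat_w, stride=16, size=54,
                     angles_deg=(-75, -45, -15, 15, 45, 75)):
    """Return anchors as rows of (y, x, theta, h, w) in image coordinates."""
    ys, xs = np.meshgrid(np.arange(feat_h), np.arange(feat_w), indexing="ij")
    centres = np.stack([ys.ravel(), xs.ravel()], axis=1) * stride + stride / 2
    anchors = []
    for cy, cx in centres:
        for a in angles_deg:
            anchors.append((cy, cx, np.deg2rad(a), size, size / 2))
    return np.array(anchors)    # shape: (feat_h * feat_w * len(angles_deg), 5)
```
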
Future work: Detect grasp locations for all objects in an image. Handle overlapping objects.

Summary: Proposes a rotation ensemble module (REM): convolutions that rotate the network weights. 5D poses; Cornell Dataset: 99.2 percent accuracy. Tested on a real (4-axis) robot: 93.8 percent success rate (on 8 small objects).

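A simplified sketch of the rotation-ensemble idea (not the paper's exact REM): the same convolution is applied at several weight rotations and the responses are pooled over orientation. Restricting to 90° rotations keeps the sketch exact and short.

```python
# Simplified rotation-ensemble convolution: share one filter bank across
# four 90° rotations and max-pool the responses over orientation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Rot90EnsembleConv(nn.Module):
    def __init__(self, cin, cout, k=3):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(cout, cin, k, k) * 0.05)

    def forward(self, x):
        responses = [
            F.conv2d(x, torch.rot90(self.weight, r, dims=(2, 3)), padding=1)
            for r in range(4)                        # 0°, 90°, 180°, 270°
        ]
        return torch.stack(responses, 0).max(0).values   # pool over orientations
```
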
Surveys

Summary: Covers object localization, object segmentation, 6D pose estimation, grasp detection, end-to-end methods, motion planning, and datasets.

Deep Reinforcement Learning

Each image pixel corresponds to a movement (either a push or a grasp) executed at the 3D location of that pixel in the scene.

The input (to an FCN) is a single image. The network makes dense pixel-wise predictions of future expected reward: fully convolutional action-value functions.

Each state st is an RGB-D heightmap image representation at time step t. "Each individual FCN φψ takes as input the heightmap image representation of the state st and outputs a dense pixel-wise map of Q values with the same image size and resolution as that of st, where each individual Q value prediction at a pixel p represents the future expected reward of executing primitive ψ at 3D location q, where q ↦ p ∈ st. Note that this formulation is a direct amalgamation of Q-learning with visual affordance-based manipulation."

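A sketch of the greedy action selection this formulation implies (array shapes and names assumed): each primitive's FCN produces a Q map the size of the heightmap, and the action is the argmax over (primitive, pixel).

```python
# Greedy action selection over pixel-wise Q maps, one map per primitive.
import numpy as np

def select_action(q_maps):
    """q_maps: dict primitive_name -> (H, W) array of Q values."""
    best = None
    for name, q in q_maps.items():
        p = np.unravel_index(np.argmax(q), q.shape)   # best pixel for this primitive
        if best is None or q[p] > best[2]:
            best = (name, p, q[p])
    primitive, (row, col), _ = best
    return primitive, (row, col)   # the pixel maps to a 3D location in the heightmap
```
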
Future Work: 1) Use a representation other than the heightmap. 2) Train on a larger variety of shapes. 3) Add more motions/manipulation primitives.

Surveys

Summary: Reviews deep RL methods in a realistic, simulated environment: off-policy Q-learning, regression with Monte-Carlo returns, corrected Monte-Carlo, Deep Deterministic Policy Gradient, and Path Consistency Learning. DQL performs best in low-data regimes; Monte-Carlo performs slightly better in high-data regimes.

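The contrast between bootstrapped Q-learning and Monte-Carlo regression comes down to the regression target; a notational sketch (not any paper's code):

```python
# The two target types being compared: one-step bootstrapped Q-learning
# vs a full Monte-Carlo return.
def q_learning_target(r, q_next_max, gamma=0.9, done=False):
    # bootstrap: immediate reward plus discounted max Q of the next state
    return r if done else r + gamma * q_next_max

def monte_carlo_target(rewards, gamma=0.9):
    # full discounted return of the observed episode, no bootstrapping
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```
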
Future Work: 1) Combine the best of bootstrapping and multistep returns. 2) Evaluate similar methods on real robots.

Other

Summary: Makes pixel-wise probability predictions for four different grasping primitives. Uses a manually annotated dataset in which each pixel is labeled 0, 1, or neither.

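The "0, 1, or neither" labeling implies a masked per-pixel loss; a minimal sketch under assumed shapes and mask convention:

```python
# Masked per-pixel binary loss: unlabeled ("neither") pixels are excluded.
import torch.nn.functional as F

def masked_bce(logits, labels, mask):
    """logits/labels/mask: (B, 4, H, W); mask=1 where a 0/1 label exists."""
    loss = F.binary_cross_entropy_with_logits(logits, labels, reduction="none")
    return (loss * mask).sum() / mask.sum().clamp(min=1)
```
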
Summary: Kit assembly. Since there is little real training data, it is generated by time-reversing disassembly sequences. Learns policies for robotic assembly that can generalize to new objects.

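The time-reversal trick itself is simple; a toy sketch with illustrative data structures:

```python
# Time-reversed disassembly as data generation: a recorded disassembly
# sequence, played backwards, yields assembly demonstrations.
def disassembly_to_assembly(disassembly):
    """disassembly: list of (observation, part_id, part_pose), time-ordered.
    The last part removed becomes the first part to place during assembly."""
    return list(reversed(disassembly))
```
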