felixchenfy / voice_control_turtlebot--masters_final

A toy project that uses voice commands to tell a Turtlebot robot to detect a target and move to it, built from 4 components: (1) speech classification, (2) object detection, (3) plane detection, and (4) control of wheel motion.

License: MIT License

ros-packages speech-command-recognition object-detection data-augmentation ransac plane-detection turtlebot3 motion-control

Simon Says - Master's Final Project

Introduction

Background: This is the final project for my Master's degree in Robotics at Northwestern University. Date: 2019/12/15.

Abstract: The project is called Simon Says: after I say a number, the Turtlebot robot detects that number on the whiteboard in front of it, then drives to it and crashes into it.

Demo:

I recorded the demo video above in 2019/06. The project's repo is here. However, that code is messy and hardware dependent, so I do not recommend reading it.

In 2019/10~2019/12, I divided this big project into 4 separate ROS packages and refactored the code. Each package is self-contained and uses ROS topics/services as its interface. See below. (I haven't yet combined them to record another video demo.)

Four ROS Packages

1. Speech Commands Classification

Method: MFCC features + LSTM.

Abstract: (1) Press a key to record audio; (2) Speak a word into the microphone; (3) See the classification result on the GUI and on a ROS topic.

Repo: https://github.com/felixchenfy/ros_speech_commands_classification

2. Object Detection

Method: Data augmentation + YOLOv3.

Abstract: Run 3 scripts to (1) synthesize images (by pasting a few template images onto backgrounds), (2) train YOLOv3, and (3) detect objects in one image, multiple images, a video, a webcam stream, or a ROS topic.

Repo: https://github.com/felixchenfy/ros_yolo_as_template_matching
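The image synthesis step can be illustrated with a minimal sketch. This is not the code from the repo above: it assumes grayscale images stored as nested lists, pastes a template at a random position without any scaling or rotation, and writes the bounding box in the usual YOLO label layout (class id, then box center and size normalized by the image size). The function names `paste_template`, `yolo_label`, and `synthesize` are hypothetical.

```python
import random

def paste_template(background, template, top, left):
    """Return a copy of `background` with `template` pasted at (top, left)."""
    out = [row[:] for row in background]
    for r, row in enumerate(template):
        for c, pixel in enumerate(row):
            out[top + r][left + c] = pixel
    return out

def yolo_label(class_id, top, left, tpl_h, tpl_w, img_h, img_w):
    """YOLO-style label: class id, box center (x, y) and size (w, h), all in [0, 1]."""
    x_center = (left + tpl_w / 2) / img_w
    y_center = (top + tpl_h / 2) / img_h
    return (class_id, x_center, y_center, tpl_w / img_w, tpl_h / img_h)

def synthesize(background, template, class_id, rng):
    """Paste the template at a random position that keeps it fully inside the image."""
    img_h, img_w = len(background), len(background[0])
    tpl_h, tpl_w = len(template), len(template[0])
    top = rng.randint(0, img_h - tpl_h)
    left = rng.randint(0, img_w - tpl_w)
    image = paste_template(background, template, top, left)
    return image, yolo_label(class_id, top, left, tpl_h, tpl_w, img_h, img_w)

# Toy data (assumption): a 6x8 black background and a 3x2 white template.
rng = random.Random(0)
background = [[0] * 8 for _ in range(6)]
template = [[255] * 2 for _ in range(3)]
image, label = synthesize(background, template, class_id=1, rng=rng)
print(label)
```

Because the paste position is known, every synthesized image comes with an exact label for free, which is what makes training a detector from only a few templates practical.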

3. Plane Detection

Method: Depth Image + RANSAC

Abstract: A Python node that detects planes in a depth image using the RANSAC algorithm.

Repo: https://github.com/felixchenfy/ros_detect_planes_from_depth_img
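As a rough illustration of the idea (not the implementation in the repo above), the sketch below back-projects a depth image to 3D points with a pinhole camera model and then fits a single plane with RANSAC. The camera intrinsics `fx`, `fy`, `cx`, `cy`, the inlier threshold, and all function names are assumptions chosen for the example.

```python
import math
import random

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image (meters) to 3D points; zero depth is treated as invalid."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z > 0:
                points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points

def plane_from_points(p1, p2, p3):
    """Plane (a, b, c, d) with unit normal through three points, or None if collinear."""
    ux, uy, uz = (p2[i] - p1[i] for i in range(3))
    vx, vy, vz = (p3[i] - p1[i] for i in range(3))
    a = uy * vz - uz * vy
    b = uz * vx - ux * vz
    c = ux * vy - uy * vx
    norm = math.sqrt(a * a + b * b + c * c)
    if norm < 1e-9:
        return None
    a, b, c = a / norm, b / norm, c / norm
    d = -(a * p1[0] + b * p1[1] + c * p1[2])
    return a, b, c, d

def ransac_plane(points, n_iters=200, threshold=0.01, rng=None):
    """Keep the plane through 3 random points that has the most inliers."""
    rng = rng or random.Random()
    best_plane, best_inliers = None, []
    for _ in range(n_iters):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue
        a, b, c, d = plane
        inliers = [i for i, (x, y, z) in enumerate(points)
                   if abs(a * x + b * y + c * z + d) <= threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers

# Synthetic depth image (assumption): a flat wall 1 m away, plus a few bad pixels.
depth = [[1.0] * 12 for _ in range(10)]
depth[0][0] = 0.0   # invalid pixel, skipped
depth[3][4] = 2.5   # outlier
depth[7][9] = 1.8   # outlier
points = depth_to_points(depth, fx=100.0, fy=100.0, cx=6.0, cy=5.0)
plane, inliers = ransac_plane(points, rng=random.Random(0))
print(plane, len(inliers))
```

To find several planes, one common approach is to remove the inliers of the best plane and run the same search again on the remaining points.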

4. Turtlebot Control

Method: "Move to Pose" algorithm

Abstract: ROS services for driving the Turtlebot3 to a target pose.

Repo: https://github.com/felixchenfy/ros_turtlebot_control
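The sketch below assumes that "Move to Pose" refers to the common polar-coordinate controller for differential-drive robots, in which the linear velocity is proportional to the distance to the goal and the angular velocity is a weighted sum of two heading errors. The gains, the simulation loop, and the function names are assumptions for illustration, and the repo above may implement it differently.

```python
import math

def wrap_angle(a):
    """Wrap an angle to [-pi, pi)."""
    return (a + math.pi) % (2 * math.pi) - math.pi

def move_to_pose_step(x, y, theta, gx, gy, gtheta,
                      k_rho=0.5, k_alpha=1.5, k_beta=-0.6):
    """Return (v, w, rho): linear velocity, angular velocity, distance to goal.

    The gains are assumed values that satisfy the usual stability conditions
    k_rho > 0, k_beta < 0, k_alpha - k_rho > 0.
    """
    dx, dy = gx - x, gy - y
    rho = math.hypot(dx, dy)                          # distance to goal
    alpha = wrap_angle(math.atan2(dy, dx) - theta)    # heading error toward the goal point
    beta = wrap_angle(gtheta - theta - alpha)         # remaining error toward the goal heading
    v = k_rho * rho
    w = k_alpha * alpha + k_beta * beta
    return v, w, rho

def simulate(x, y, theta, gx, gy, gtheta, dt=0.01, tol=0.01, max_steps=5000):
    """Integrate a simple unicycle model until the robot is within `tol` of the goal."""
    for _ in range(max_steps):
        v, w, rho = move_to_pose_step(x, y, theta, gx, gy, gtheta)
        if rho < tol:
            break
        x += v * math.cos(theta) * dt
        y += v * math.sin(theta) * dt
        theta = wrap_angle(theta + w * dt)
    return x, y, theta

final_x, final_y, final_theta = simulate(0.0, 0.0, 0.0, 1.0, 1.0, math.pi / 2)
print(final_x, final_y, final_theta)
```

On a real robot, the computed `v` and `w` would be published as velocity commands and would also need to be clamped to the robot's velocity limits, which this sketch leaves out.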

Other Packages

I have developed several other small packages that might be helpful when dealing with image data:

Acknowledgement

The idea for this Simon Says project was proposed by Prof. Ying Wu. Many thanks to Prof. Wu for giving me the chance to join his lab, work on this interesting project, and share papers in the group meetings.

I'd also like to thank Matthew, the Vice Director of the MS Robotics program, for his guidance and help with my project throughout the whole program.
