
uwarg / computer-vision


A set of libraries and projects which will be used in all computer vision applications.

License: Other

Languages: C 2.22%, C++ 86.17%, CMake 6.20%, XSLT 4.35%, Shell 1.06%
Topics: computer-vision, warg

computer-vision's People

Contributors

chrishajduk84, zheyuanzhang


computer-vision's Issues

Setting Up Jenkins

We need someone to set up Jenkins on the WARG server and configure it for our project.
It needs to be configured to poll GitHub for updates (it looks like we won't be able to use webhooks on the university network) and regenerate our Doxygen documentation.
The documentation needs to be generated into the gh-pages branch and pushed to the repository.
It also needs to be configured to run our tests (which are so far largely non-existent), and we will need to update the testing class to generate results compatible with xUnit.

Target Geolocation [Target Analysis]

Geolocate individual pixeltargets (converting error from pixels to degrees of latitude/longitude) and create a Target to wrap each pixeltarget. Each target should only contain one pixeltarget and duplicates will later be merged.
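As a starting point, the pixel-to-degrees conversion could look something like the sketch below. It assumes a nadir-pointing camera over flat ground, and every name and parameter in it (field of view, image size, the function itself) is illustrative rather than part of the actual module:

```cpp
#include <cmath>

// Hypothetical sketch: project a pixel offset from the image centre onto the
// ground, assuming a nadir-pointing camera over flat terrain.
struct GeoPoint { double lat, lon; };

GeoPoint geolocate_pixel(double px, double py,   // pixel offset from image centre
                         int img_w, int img_h,
                         double fov_deg,          // horizontal field of view
                         double alt_m,            // altitude above ground
                         double plane_lat, double plane_lon) {
    const double pi = 3.14159265358979323846;
    const double deg_per_metre_lat = 1.0 / 111320.0;  // rough metres-to-degrees
    double fov = fov_deg * pi / 180.0;
    double metres_per_px = 2.0 * alt_m * std::tan(fov / 2.0) / img_w;
    double dx = px * metres_per_px;                   // east offset in metres
    double dy = -py * metres_per_px;                  // north offset (image y grows down)
    GeoPoint g;
    g.lat = plane_lat + dy * deg_per_metre_lat;
    g.lon = plane_lon + dx * deg_per_metre_lat / std::cos(plane_lat * pi / 180.0);
    return g;
}
```

The per-pixel error in degrees would fall out of the same conversion: an error of one pixel corresponds to `metres_per_px * deg_per_metre_lat` degrees of latitude.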

Testing [All Modules]

At some point we need to set up testing suites for all of the modules. Ideally these tests should be set up for each individual module. We should start setting up black-box tests as soon as possible, as they will be very useful when we start implementing the modules (after which we can add white-box tests).
Since a lot of the functionality we will be testing does not give exact results, tests should measure and record the deviation of results rather than a simple pass/fail, as well as the time taken. Furthermore, if we are profiling the time taken by tests, we should probably run them on the hardware that will be used at the competition (i.e. the WARG desktop).
This also means that tests have to be geared towards timing as well (testing scenarios that may take more or less time, testing large amounts of data, etc.).
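As a sketch of the deviation-and-timing idea (all names here are hypothetical, not the existing testing class):

```cpp
#include <chrono>
#include <cmath>
#include <string>

// Illustrative sketch only: record a test's deviation from the expected value
// and how long it took, instead of a bare pass/fail.
struct TestResult {
    std::string name;
    double deviation;   // |actual - expected|, in whatever unit the test uses
    double seconds;     // wall-clock time taken
};

template <typename Fn>
TestResult run_deviation_test(const std::string& name, Fn fn, double expected) {
    auto start = std::chrono::steady_clock::now();
    double actual = fn();
    auto end = std::chrono::steady_clock::now();
    return TestResult{
        name,
        std::fabs(actual - expected),
        std::chrono::duration<double>(end - start).count()
    };
}
```

Results in this shape could then be logged and compared across commits to spot regressions in either accuracy or speed.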

I haven't yet figured out exactly what the best way to set up the tests is, but CMake (which I'd like to use for the purposes of building the project) has some interesting testing facilities which we should look at (found here).

TODO:

  • Create a testing program that can handle testing for all the modules (as well as record and document results)
  • Set up an environment for the computer vision software as well as nightly automated testing on the WARG desktop computer
  • Write tests for all of the modules

Image Import Module

This issue is for more detailed information on the Image Import module as well as for discussion of its implementation and assignment of duties.

Description

Module for creating frames from image and video files and matching them with telemetry information that corresponds to them.

Implementation Ideas

We have two ways of determining capture time for images: the internal timestamp stored in the exif metadata, and the time recorded in the telemetry that a signal was sent to the camera. The issue is that the first only has one-second resolution, and the second is unreliable if any of the telemetry is missing (as we don't know how many photos were taken in the missing period). However, we should be able to use both to get fairly accurate timing: use the exif timestamp to find a frame's approximate location in the telemetry, then use the telemetry's recorded signal time plus whatever delay we measure between when the signal is sent and when the camera takes the picture. That way we should be able to get 1/5-second resolution while bypassing the issues of relying solely on the telemetry.

See http://www.exiv2.org/ for more information on reading exif metadata.
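The timestamp-fusion idea above could be sketched roughly like this (function and variable names, and the shutter-delay value, are made up for illustration):

```cpp
#include <cmath>
#include <vector>

// Sketch: use the coarse (1 s resolution) exif timestamp to select the
// matching camera-trigger entry in the telemetry, then refine it with the
// measured trigger-to-shutter delay.
double refine_capture_time(double exif_seconds,
                           const std::vector<double>& trigger_times,
                           double shutter_delay) {
    double best = -1.0;
    double best_diff = 1e18;
    for (double t : trigger_times) {
        double diff = std::fabs(t - exif_seconds);
        if (diff < best_diff) { best_diff = diff; best = t; }
    }
    // Only trust the match if the trigger falls within the exif second;
    // otherwise fall back to the coarse timestamp.
    if (best_diff <= 1.0) return best + shutter_delay;
    return exif_seconds;
}
```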

Creating Metadata Objects from Timestamps [Image Import]

We need someone to add the part of the Image Import module that fills in metadata objects using a photo timestamp. It should also calculate appropriate error estimates so that we have some idea of how close geolocation estimates are; more specifically, so that we can tell whether inaccurate geolocation is caused by the geolocation algorithm or by a lack of good photographs.

Goals:

  • Create a function/class that, when passed a numeric timestamp, returns a metadata object by filling in the appropriate fields using the telemetry log entries, extrapolating if possible.
  • Test the camera to find the time difference between when the microcontroller sends a trigger signal to the camera and when the camera takes a picture (make sure the camera is in manual focus mode).
  • Include measurement error estimates using the error associated with the estimated difference in time between the image and the closest telemetry log entries. This should take into account pitch/roll/yaw rates (angular error) as well as airspeed (linear error).

Implementation Ideas (from original module issue)

We have two ways of determining capture time for images: the internal timestamp stored in the exif metadata, and the time recorded in the telemetry that a signal was sent to the camera. The issue is that the first only has one-second resolution, and the second is unreliable if any of the telemetry is missing (as we don't know how many photos were taken in the missing period). However, we should be able to use both to get fairly accurate timing: use the exif timestamp to find a frame's approximate location in the telemetry, then use the telemetry's recorded signal time plus whatever delay we measure between when the signal is sent and when the camera takes the picture. That way we should be able to get 1/5-second resolution while bypassing the issues of relying solely on the telemetry.

I believe @rorico was working on this last term, but I don't remember how far he got.

Target Analysis Module

This issue is for more detailed information on the Target Analysis module as well as for discussion of its implementation and assignment of duties.

Description

Module for geolocating targets using their pixel locations and photo metadata, determining target type, and calculating possible error. As targets are processed, unique targets will be identified and their data combined into a single object.

Implementation Ideas

Some work on the geolocation had been done for the last competition. The code doesn't seem to be on github (correct me if I'm wrong), though I do have a copy if anyone wants to look at it.

Assignees

@antoniosehk
@chrishajduk84
@francisli91

Reading Input [Image Import]

The image import class is completely unimplemented in its current state.

ImageImport needs to be able to read from image files, video files and video devices and populate a buffer full of frames which can be removed from the buffer with next_frame() (next_frame() will be queried by multiple threads and thus must be thread safe!).
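A minimal sketch of a thread-safe buffer with a blocking next_frame(), assuming a generic Frame type (the blocking policy is just one option, not a decided behaviour):

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>

// Sketch of a thread-safe frame buffer along the lines described above.
// "Frame" is a placeholder for the real frame type.
template <typename Frame>
class FrameBuffer {
public:
    void add_frame(Frame f) {
        std::lock_guard<std::mutex> lock(mutex_);
        frames_.push(std::move(f));
        cv_.notify_one();
    }
    // Safe to call from multiple consumer threads; blocks until a frame
    // is available.
    Frame next_frame() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return !frames_.empty(); });
        Frame f = std::move(frames_.front());
        frames_.pop();
        return f;
    }
private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<Frame> frames_;
};
```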

Get pictures and video

We currently need images and videos to test and train our proposed algorithms. The more data, the better. Captures of both target and non-target scenes would be good.

Where is this data currently stored? What is a good, fault tolerant location to store this data?

I'm proposing we put it in the Google Drive or whatever we currently use for sharing files.

TODO

  • Get video streams from camera
  • Get images of target and non-target data
  • Determine where to store current and future training data

Test Result Graphs

Currently we have a testing module that logs test results in a CSV file. Eventually these tests will be run every time a commit is made. We would like to be able to auto-generate graphs which will allow us to view regressions/improvements in the code visually.
This should be created as a separate program and thus can be written in any programming language.
It should also be written so that it is easily extensible.
I would suggest looking at various graphing APIs first before dealing with CSV parsing.

Merging Duplicate Targets [Target Analysis]

We need someone to write a class/function that takes a list of targets and merges duplicates by comparing the targets' locations and types.
Also note that there may be similar targets grouped closely together, so it may be wise to revisit the results of the template matching at this point so that close but somewhat dissimilar targets can be distinguished (e.g. square- vs pentagon-shaped fields). For close, nearly identical targets we will have to rely on location estimates.
When merging targets, the location should be updated to incorporate the location data from the other targets. Ideally, if our error estimates are accurate, we should be able to compute the new location and error by taking the overlap between each individual target's location and error.
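If the error estimates behave like standard deviations, one way to combine overlapping estimates is an inverse-variance weighted average. A one-dimensional sketch (illustrative only, not the module's decided scheme):

```cpp
#include <cmath>

// Merge two location estimates with error bounds via an inverse-variance
// weighted average, treating each target's error as a standard deviation.
struct Estimate { double value, error; };

Estimate merge(const Estimate& a, const Estimate& b) {
    double wa = 1.0 / (a.error * a.error);
    double wb = 1.0 / (b.error * b.error);
    Estimate m;
    m.value = (a.value * wa + b.value * wb) / (wa + wb);
    m.error = 1.0 / std::sqrt(wa + wb);  // combined error is always smaller
    return m;
}
```

For real targets this would be applied per axis (latitude and longitude), and merging more than two duplicates just repeats the same step.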

Reading exif metadata from Images [Image Import]

The image import module, in addition to loading frames from disk and from video devices, also needs to fill each frame's metadata object with information from the telemetry file. This is facilitated by the exif metadata in the pictures.
(This is already discussed in more detail in the issue describing the ImageImport module; this issue is specifically for the exif metadata.)

We need someone to create a helper function that will read the timestamp from an image file using exiv2.
Exiv2 will also need to be added as a dependency to the project.
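Exif stores timestamps such as Exif.Photo.DateTimeOriginal as strings of the form "YYYY:MM:DD HH:MM:SS"; once exiv2 has read the value, it still needs converting to a numeric time. A sketch of that conversion step (the helper name is made up):

```cpp
#include <cstdio>
#include <ctime>

// Parse an Exif "YYYY:MM:DD HH:MM:SS" timestamp (e.g. the value of
// Exif.Photo.DateTimeOriginal as read via exiv2) into a Unix time.
// Returns (time_t)-1 on a malformed string; treats the value as local time.
time_t parse_exif_timestamp(const char* s) {
    struct tm tm = {};
    if (std::sscanf(s, "%d:%d:%d %d:%d:%d",
                    &tm.tm_year, &tm.tm_mon, &tm.tm_mday,
                    &tm.tm_hour, &tm.tm_min, &tm.tm_sec) != 6) {
        return static_cast<time_t>(-1);
    }
    tm.tm_year -= 1900;  // struct tm counts years from 1900
    tm.tm_mon -= 1;      // and months from 0
    tm.tm_isdst = -1;    // let mktime determine DST
    return std::mktime(&tm);
}
```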

Conventions

This Issue will document the conventions to be used in the project.
If you have any suggestions, please mention them.

Variable Names:
varName

Constant Names:
CONST_NAME

Function Names:
function_name

Class Names:
ClassName

File Names:
file_name.ext

Indentation:
4 spaces

Brackets:

void foo(int i) {
    int x = (i + 2)/2;
}

Target Templates [Target Analysis]

The idea as it stands for the target analysis module is to compare targets identified by the targetid module against target templates to determine their type. We thought that we would store the target templates as JSON objects which can be read in at runtime.

We need someone to create a C++ object to represent the target that can be deserialized from a JSON object (with jsoncpp; see here for an example).
We also need to determine how we should describe targets so that we can compare them to the data stored in PixelTarget objects (i.e. what fields should be stored in the JSON). Colour is the obvious one, but we also need to describe the shape of the target, identifying characteristics such as QR codes, whether it is a solid shape or just a border, etc.
Furthermore, the pixeltargets need to be updated, since they shouldn't really contain a type field and they need a way of exposing the contour.

Update Code and Doxygen Formatting

The code is currently in various states of formatting; for example, most headers don't include a Doxygen class comment, and the information that should be in the class comment is in a file comment instead.
All code should be updated to conform to the code conventions as specified on the wiki page. Example Doxygen comments are also on the wiki page.

Ranking Targets [Target Analysis]

We need someone to add a class/function to the target analysis module that determines the type of a pixeltarget by comparing and ranking it against the target templates.
The highest rank should obviously determine the type; however, we may want to analyze the results of the ranking so that we can make use of how certain we are of the rank and use that information to weed out false positives.
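A sketch of the ranking-with-confidence idea; the similarity score, the fields, and the margin are all placeholders for whatever comparison we settle on:

```cpp
#include <cmath>
#include <string>
#include <vector>

// Rank a detected target against templates and keep a confidence margin:
// a small gap between the best and second-best score suggests a weak match
// and a possible false positive.
struct Template { std::string type; double hue, area; };

struct RankResult {
    std::string type;
    double best_score;
    double margin;      // gap to the runner-up
};

RankResult rank_target(double hue, double area,
                       const std::vector<Template>& templates) {
    RankResult r{"", -1.0, 0.0};
    double second = -1.0;
    for (const auto& t : templates) {
        // Toy similarity in (0, 1]: closer hue and area means a higher score.
        double score = 1.0 / (1.0 + std::fabs(t.hue - hue) + std::fabs(t.area - area));
        if (score > r.best_score) {
            second = r.best_score;
            r.best_score = score;
            r.type = t.type;
        } else if (score > second) {
            second = score;
        }
    }
    r.margin = (second < 0.0) ? r.best_score : r.best_score - second;
    return r;
}
```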

Target Identification Module

This issue is for more detailed information on the Target Identification module as well as for discussion of its implementation and assignment of duties.

Description

Module for analyzing frames using OpenCV tools, locating objects of interest and gathering information about the objects such as target colour, pixel area, perimeter and shape.

Implementation Ideas

This module can largely be based off of the code that I wrote for the 2015 competition, which is currently stored in the 2015-attempt branch.
The above-mentioned code uses the OpenCV function kmeans to create a new image that contains only the n most common colours in the image. This image is then subtracted from the original image to isolate all of the uncommonly coloured areas, and shapes are identified by applying OpenCV's contour detection to the final image.
It is worth noting that kmeans is a very resource-intensive function, and will be more costly than any other single part of this project.
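For intuition, here is a tiny self-contained k-means over colours; the real module would call OpenCV's cv::kmeans rather than this toy version:

```cpp
#include <cmath>
#include <vector>

// Toy k-means over 3-channel colours, illustrating the "reduce the image to
// its n most common colours" step. Naive seeding, fixed iteration count.
struct Colour { double b, g, r; };

static double dist2(const Colour& a, const Colour& b) {
    double db = a.b - b.b, dg = a.g - b.g, dr = a.r - b.r;
    return db * db + dg * dg + dr * dr;
}

// Returns k cluster centres; labels[i] is the centre assigned to pixel i.
std::vector<Colour> kmeans_colours(const std::vector<Colour>& pixels, int k,
                                   std::vector<int>& labels, int iters = 10) {
    std::vector<Colour> centres(pixels.begin(), pixels.begin() + k);
    labels.assign(pixels.size(), 0);
    for (int it = 0; it < iters; ++it) {
        // Assignment step: nearest centre for each pixel.
        for (size_t i = 0; i < pixels.size(); ++i) {
            int best = 0;
            for (int c = 1; c < k; ++c)
                if (dist2(pixels[i], centres[c]) < dist2(pixels[i], centres[best]))
                    best = c;
            labels[i] = best;
        }
        // Update step: each centre becomes the mean of its assigned pixels.
        std::vector<Colour> sums(k, Colour{0, 0, 0});
        std::vector<int> counts(k, 0);
        for (size_t i = 0; i < pixels.size(); ++i) {
            sums[labels[i]].b += pixels[i].b;
            sums[labels[i]].g += pixels[i].g;
            sums[labels[i]].r += pixels[i].r;
            ++counts[labels[i]];
        }
        for (int c = 0; c < k; ++c)
            if (counts[c] > 0)
                centres[c] = Colour{sums[c].b / counts[c],
                                    sums[c].g / counts[c],
                                    sums[c].r / counts[c]};
    }
    return centres;
}
```

The cost noted above comes from the assignment step: every pixel is compared against every centre on every iteration, which is why kmeans dominates the pipeline's runtime.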

It may also be useful to look at Isaac and Rees's target identification code when looking at implementation improvements. (No clue where that code is.)

Notes

While it is probably a good idea to start with the implementation that we already have, another possibility we can look into is using machine learning for target identification. Unless there is someone who already has experience with machine learning and would be willing to implement it, other parts of the project should take precedence. But once the rest of the project is on track to completion, if there is someone eager to spend the time to learn (which I may eventually do myself, having the time being the key issue), it may well be worth the time and effort.

Useful Links

Here's a link to the OpenCV 3.0 Documentation which includes tutorials.
OpenCV Doc
Histogram Based Image Segmentation Article

Assignees

@benjaminwinger
@Rich143
@francisli91

QR code identification

The code as it is doesn't have any structure in place for reading QR codes.
see here for a stand-alone implementation of the ZBar library for reading QR codes that we used at the last competition.

I recommend using ZBar for now, though feel free to look into other libraries later and compare them to ZBar.

How to implement:

  • A QR code handling class should be added to the targetid module.
  • Fields to store QR code data should be added to the PixelTarget class (in the core module).
  • A cropped image of the target should also be added to PixelTarget (for now, assume it will be created by some other part of the library; we'll get someone else to implement the cropping).
  • Appropriate libraries (ZBar) need to be added to modules/targetid/CMakeLists.txt.
  • A suite of tests should be written covering QR codes of different sizes and qualities (include tests with blurry images and small bar code images).

This should be a good general introduction to the codebase since it touches most aspects of the project.

Build System

I was thinking again about build systems. At the meeting today I mentioned that we should use make, though note that I actually meant to say CMake (a cross platform system which can generate makefiles for various systems such as GNU Make, Xcode or MS Visual Studio).
An alternative would be to just use Make, which can run normally in OS X and through Cygwin on Windows.

The other question, I guess, is what environment the end product will be running in. If we want to eventually run the program on a processor on the plane, then the software will have to run on Linux; but are we also going to use Linux on the desktop for post-processing? For the past competition we had to use Windows because we were using Microsoft ICE and Correlator3D for image stitching and volume calculation respectively, but if we end up having working in-house software for those purposes before the next competition then that will not be an issue.
If we end up just running the software on Linux, then targeting it specifically at Linux makes some sense, as we wouldn't have to worry about making everything cross-platform, but it may make things more frustrating for those of you who develop on Windows if you have to deal with everything through Cygwin (I think OS X handles Linux stuff better, but I'm not sure). Then again, I've never used Cygwin, so I don't really know.

Being someone who just uses Linux, I'm perfectly fine with targeting it solely at Linux, but what do the rest of you think?
In essence, CMake will allow for all development environments, but will require that we make a few sacrifices in implementation and would make the build system somewhat more complicated (it really only adds one more step to the build process though).
