uwarg / computer-vision
A set of libraries and projects which will be used in all computer vision applications.
License: Other
We need someone to set up Jenkins on the WARG server and configure it for our project.
It needs to be configured to poll for github updates (it looks like we won't be able to use webhooks on the university network) and regenerate our doxygen documentation.
The documentation needs to be generated into the gh-pages branch and pushed to the repository.
It also needs to be configured to run our tests (which are so far largely non-existent) and we will need to update the testing class to generate results compatible with xUnit.
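To make the testing class's results consumable by Jenkins, one option is to emit the xUnit/JUnit XML layout. A minimal sketch (the TestResult fields are assumptions, not the existing testing class's API):

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result record; field names are illustrative.
struct TestResult {
    std::string name;
    double seconds;
    bool passed;
    std::string message;  // failure detail, empty on success
};

// Emit results in the JUnit/xUnit XML layout that Jenkins plugins ingest.
std::string to_xunit_xml(const std::string& suite,
                         const std::vector<TestResult>& results) {
    int failures = 0;
    for (const auto& r : results)
        if (!r.passed) ++failures;

    std::ostringstream xml;
    xml << "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        << "<testsuite name=\"" << suite << "\" tests=\"" << results.size()
        << "\" failures=\"" << failures << "\">\n";
    for (const auto& r : results) {
        xml << "  <testcase name=\"" << r.name
            << "\" time=\"" << r.seconds << "\"";
        if (r.passed) {
            xml << "/>\n";
        } else {
            xml << ">\n    <failure message=\"" << r.message
                << "\"/>\n  </testcase>\n";
        }
    }
    xml << "</testsuite>\n";
    return xml.str();
}
```

Writing that string to a file per run would let the Jenkins xUnit plugin chart pass/fail history for us.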
Geolocate individual pixeltargets (converting error from pixels to degrees of latitude/longitude) and create a Target to wrap each pixeltarget. Each target should only contain one pixeltarget and duplicates will later be merged.
At some point we need to set up testing suites for all of the modules. Ideally these tests should be set up for each individual module. We should start setting up black box tests as soon as possible as they will be very useful when starting to implement the modules (after which we can add white box tests).
Since a lot of the functionality we will be testing does not give exact results, we should have tests measure and record the deviation of test results rather than pass or fail, as well as the time taken for testing. Furthermore, if we are profiling the time taken for tests, we should probably set up the tests to be run on the hardware that will be used at the competition (i.e. the WARG desktop).
This also means that tests have to be geared towards time as well (meaning testing scenarios that may take more/less time, testing large amounts of data, etc.)
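The deviation-plus-timing idea above could be captured with a small helper; a sketch using std::chrono (the Measurement struct and measure() are illustrative names, not existing project API):

```cpp
#include <chrono>
#include <cmath>

// Instead of pass/fail, record how far the measured result is from the
// expected value, plus wall-clock time taken by the operation under test.
struct Measurement {
    double deviation;  // |actual - expected|
    double seconds;    // time taken by the operation under test
};

template <typename Fn>
Measurement measure(Fn fn, double expected) {
    auto start = std::chrono::steady_clock::now();
    double actual = fn();  // run the operation being tested
    auto end = std::chrono::steady_clock::now();
    return {std::fabs(actual - expected),
            std::chrono::duration<double>(end - start).count()};
}
```

A test runner could then log both numbers to the CSV so regressions in either accuracy or speed show up over time.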
I haven't yet figured out exactly what the best way to set up the tests is, but cmake (which I'd like to use for the purposes of building the project) has some interesting stuff to use for testing which we should look at (found here)
This issue is for more detailed information on the Image Import module as well as for discussion of its implementation and assignment of duties.
Module for creating frames from image and video files and matching them with telemetry information that corresponds to them.
We have two ways of determining capture time for images: the internal timestamp stored in the exif metadata, and the time recorded in the telemetry that a signal was sent to the camera. The issue is that the first only has one-second resolution, and the second is unreliable if any of the telemetry is missing (as we don't know how many photos were taken in the missing period). However, we should be able to combine both to get fairly accurate timing: use the exif timestamp to find a frame's approximate location in the telemetry, then take the telemetry's recorded trigger time plus whatever delay we measure between when the signal is sent and when the camera takes the picture. That way we should be able to get 1/5-second resolution while bypassing the issues of relying solely on the telemetry.
See http://www.exiv2.org/ for more information on reading exif metadata.
We need someone to add the part of the Image Import module that fills in metadata objects using a photo timestamp. It should also calculate appropriate error so that we can have some idea of how close geolocation estimates are; more specifically, so that we can tell whether inaccurate geolocation is caused by the geolocation algorithm or by a lack of good photographs.
(The two ways of determining capture time, and how to combine them, are described in the Image Import module issue above.)
I believe @rorico was working on this last term, but I don't remember how far he got.
This issue is for more detailed information on the Target Analysis module as well as for discussion of its implementation and assignment of duties.
Module for geolocating targets using their pixel locations and photo metadata, determining target type, and calculating possible error. As targets are processed, unique targets will be identified and their data combined into a single object.
Some work on the geolocation had been done for the last competition. The code doesn't seem to be on github (correct me if I'm wrong), though I do have a copy if anyone wants to look at it.
The image import class is completely unimplemented in its current state.
ImageImport needs to be able to read from image files, video files and video devices and populate a buffer full of frames which can be removed from the buffer with next_frame() (next_frame() will be queried by multiple threads and thus must be thread safe!).
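A minimal sketch of the thread-safe buffer behind next_frame(), using a mutex and condition variable (Frame here is a placeholder for the project's real frame type):

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>

struct Frame { int id; };  // placeholder for the real frame type

class FrameBuffer {
public:
    // Called by the reader thread as frames are decoded.
    void push(Frame f) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            frames_.push(std::move(f));
        }
        ready_.notify_one();
    }

    // Blocks until a frame is available; safe to call from many threads.
    Frame next_frame() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !frames_.empty(); });
        Frame f = std::move(frames_.front());
        frames_.pop();
        return f;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<Frame> frames_;
};
```

The real implementation would also need an end-of-stream signal so consumers don't block forever once the video file runs out.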
Currently we'll need images and videos to test and train our proposed algorithms. The more data the better. Captures of both target and non-target data would be good.
Where is this data currently stored? What is a good, fault tolerant location to store this data?
I'm proposing we put it in the Google Drive or whatever we currently use for sharing files.
Currently we have a testing module that logs test results in a csv file. Eventually these tests will be run every time a commit is made. We would like to be able to auto-generate graphs which will allow us to view regressions/improvements in the code visually.
This should be created as a separate program and thus can be written in any programming language.
It should also be written so that it is easily extensible.
I would suggest looking at various graphing APIs first before dealing with CSV parsing.
We need someone to write a class/function that takes a list of targets and merges duplicate targets by comparing the location and type of the targets.
Also note that there may be similar targets grouped closely together, so it may be wise to revisit the results of the template matching at this point so that close but somewhat dissimilar targets can be distinguished (e.g. square- vs pentagon-shaped fields). For close, nearly identical targets we will have to rely on location estimates.
When merging targets, the location should be updated to include the additional location data from the other targets. Ideally if our error estimates are accurate we should be able to compute the new location and error by taking the overlap between each individual target's location and error.
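One way to realize that overlap idea is inverse-variance weighting of the two estimates; a sketch (the Location fields are illustrative, and treating the error radius as a 1-sigma value is an assumption):

```cpp
#include <cmath>

// Location estimate with a 1-sigma error radius in degrees.
struct Location {
    double lat, lon;
    double error;  // 1-sigma radius, degrees
};

// Combine two estimates: weight each by 1/error^2, so tighter estimates
// dominate, and the merged error shrinks as estimates agree.
Location merge(const Location& a, const Location& b) {
    double wa = 1.0 / (a.error * a.error);
    double wb = 1.0 / (b.error * b.error);
    Location out;
    out.lat = (a.lat * wa + b.lat * wb) / (wa + wb);
    out.lon = (a.lon * wa + b.lon * wb) / (wa + wb);
    out.error = std::sqrt(1.0 / (wa + wb));
    return out;
}
```

Merging more than two targets is just a fold over this function, though order shouldn't matter if the weighting is done consistently.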
The image import module, as well as loading frames from the disk and from video devices, also needs to fill the frame's metadata object with information from the telemetry file. This is facilitated by the use of exif metadata from the pictures.
(this is already discussed in more detail in the issue describing the ImageImport module; however, this issue is specifically for the exif metadata)
We need someone to create a helper function that will read the timestamp from an image file using exiv2.
Exiv2 will also need to be added as a dependency to the project.
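Exif stores Exif.Photo.DateTimeOriginal as a "YYYY:MM:DD HH:MM:SS" string, which exiv2 hands back as text. The sketch below only covers parsing that string into a time value; the exiv2 calls that fetch it from the file are omitted:

```cpp
#include <ctime>
#include <iomanip>
#include <sstream>
#include <string>

// Parse an exif "YYYY:MM:DD HH:MM:SS" timestamp into a time_t.
// Note: mktime() interprets the broken-down time as local time.
bool parse_exif_timestamp(const std::string& value, std::time_t& out) {
    std::tm tm = {};
    std::istringstream in(value);
    in >> std::get_time(&tm, "%Y:%m:%d %H:%M:%S");
    if (in.fail()) return false;
    tm.tm_isdst = -1;  // let mktime decide daylight saving
    out = std::mktime(&tm);
    return out != static_cast<std::time_t>(-1);
}
```

This only gives one-second resolution, which is why it needs to be combined with the telemetry timing as described in the Image Import issue.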
This Issue will document the conventions to be used in the project.
If you have any suggestions, please mention them.
Variable Names:
varName
Constant Names:
CONST_NAME
Function Names:
function_name
Class Names:
ClassName
File Names:
file_name.ext
Indentation:
4 spaces
Brackets:
void foo(int i) {
    int x = (i + 2)/2;
}
The idea as it stands for the target analysis module is to compare targets identified by the targetid module to target templates to determine their type. We thought that we would store the target templates as json objects which can be read in at runtime.
We need someone to create a c++ object to represent the target that can be serialized from a json object (with json cpp. See here for an example)
We also need to determine how we should describe targets so that we can compare them to the data stored in PixelTarget objects (i.e. what fields should be stored in the json). Colour is the obvious one, but we also need to describe the shape of the object, identifying characteristics such as QR codes, whether it is a solid shape or just a border etc.
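As a starting point for discussion, one possible template layout covering the fields mentioned above (every field name here is a suggestion, not a decided schema):

```json
{
  "name": "square_field",
  "shape": "square",
  "colour": [120, 200, 80],
  "filled": true,
  "has_qr_code": false
}
```

Whatever we settle on, the fields should map cleanly onto what PixelTarget can actually measure, otherwise the comparison step has nothing to rank against.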
Furthermore, the pixeltargets need to be updated since they shouldn't really contain a type field and need a way of exposing the contour.
The code is currently in various states of formatting; for example, most headers don't include a Doxygen class comment, and the information that should be in the class comment is instead in a file comment.
All code should be updated to conform to the code conventions as specified on the wiki page. Example Doxygen comments are also on the wiki page.
We need someone to add a class/function to the target analysis module that will determine the type of a pixeltarget by comparing and ranking it against the target templates
Highest rank should obviously determine type; however, we may want to consider analyzing the results of the ranking so that we can make use of how certain we are of the rank, and use that information to weed out false positives.
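One simple certainty check is a margin rule: the best-scoring template wins only if it beats the runner-up by some threshold. A sketch (names and the score convention are illustrative):

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct Ranked {
    std::string type;
    double score;  // higher score = better match against the template
};

// Returns the best-ranked type, or "" when the margin over the runner-up
// is too small to call -- a cheap guard against false positives.
std::string classify(std::vector<Ranked> ranking, double min_margin) {
    if (ranking.empty()) return "";
    std::sort(ranking.begin(), ranking.end(),
              [](const Ranked& a, const Ranked& b) { return a.score > b.score; });
    if (ranking.size() > 1 &&
        ranking[0].score - ranking[1].score < min_margin)
        return "";  // too close to call
    return ranking[0].type;
}
```

The threshold itself would have to be tuned against the test image set, since it trades missed detections against false positives.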
This issue is for more detailed information on the Target Identification module as well as for discussion of its implementation and assignment of duties.
Module for analyzing frames using OpenCV tools, locating objects of interest and gathering information about the objects such as target colour, pixel area, perimeter and shape.
This module can largely be based off of the code that I wrote for the 2015 competition which is currently stored in the 2015-attempt branch.
The above-mentioned code uses the OpenCV function kmeans to create a new image that contains only the n most common colours in the image. This image is then subtracted from the original image to isolate all of the uncommon coloured areas and shapes are identified by applying OpenCV's contour detection on the final image.
It is worth noting that kmeans is a very resource intensive function, and will be more costly than any other single part of this project.
It may also be useful to look at Isaac and Rees's target identification when looking at implementation improvements. (no clue where the code is)
While it is probably a good idea to start with the implementation that we already have, another possibility we could look into is using machine learning for target identification. Unless someone who already has experience with machine learning is willing to implement it, other parts of the project should take precedence. But once the rest of the project is on track to completion, if someone is eager to spend the time to learn (which I may eventually do myself, time being the key issue), it may well be worth the effort.
Here's a link to the OpenCV 3.0 Documentation which includes tutorials.
OpenCV Doc
Histogram Based Image Segmentation Article
Check the Pull request
@benjaminwinger Would you have any idea why this happens? Is it the file format?
The code as it is doesn't have any structure in place for reading QR codes.
see here for a stand-alone implementation of the ZBar library for reading QR codes that we used at the last competition.
I recommend using ZBar for now, though feel free to look into other libraries later and compare them to ZBar.
How to implement:
A QR code handling class should be added to the targetid module.
Fields to store QR code data should be added to the PixelTarget class (in the core module)
A cropped image of the target should also be added to PixelTarget (for now assume that it will be created by some other part of the library, we'll get someone else to implement the cropping of the image)
Appropriate libraries (ZBar) need to be added to modules/targetid/CMakeLists.txt
A suite of tests should be written testing QR codes of different sizes and qualities (include tests with blurry images, small bar code images)
This should be a good general introduction to the codebase since it touches most aspects of the project.
I was thinking again about build systems. At the meeting today I mentioned that we should use make, though note that I actually meant to say CMake (a cross-platform system which can generate build files for various environments such as GNU Make, Xcode or MS Visual Studio).
An alternative would be to just use Make, which can run normally in OS X and through Cygwin on Windows.
The other question, I guess, is what environment the end product will be running in. If we want to eventually run the program on a processor on the plane, then the software will have to run in Linux; however, are we also going to use Linux on the desktop for post-processing? For the past competition we had to use Windows because we were using Microsoft ICE and Correlator3D for image stitching and volume calculation respectively, but if we end up having working in-house software for those purposes before the next competition then that will not be an issue.
If we end up just running the software in Linux, then targeting it specifically at Linux makes some sense, as we wouldn't have to worry about making everything cross-platform, but it may make things more frustrating for those of you who develop in Windows if you have to deal with everything through Cygwin (I think OS X handles Linux stuff better, but I'm not sure). Then again, I've never used Cygwin, so I don't really know.
Being someone who just uses Linux, I'm perfectly fine with targeting it solely at Linux, but what do the rest of you think?
In essence, CMake will allow for all development environments, but will require that we make a few sacrifices in implementation and will make the build system somewhat more complicated (it really only adds one more step to the build process, though).
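For reference, a minimal top-level CMakeLists.txt might look like the following; the module directory names are assumptions based on the modules discussed in these issues:

```cmake
cmake_minimum_required(VERSION 2.8)
project(computer-vision)

find_package(OpenCV REQUIRED)

# Hypothetical module layout; adjust to the actual directory names.
add_subdirectory(modules/core)
add_subdirectory(modules/imageimport)
add_subdirectory(modules/targetid)
add_subdirectory(modules/targetanalysis)

# Hooks into CTest, the testing support mentioned above.
enable_testing()
```

The "one more step" is just running cmake once to generate native build files; after that, everyone builds with whatever their platform's tool is.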