paloukari / orcadetector
A VGGish-based DNN trained on the Watkins Marine Mammal Sound Database, with transfer learning from Audioset, to detect multiple marine mammal species.
License: MIT License
We can pass in a CLI arg to specify which model to run (e.g. `vggish`, `cnn`, or `logreg`), then conditionally build the desired model before running `model.fit_generator()`.
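A minimal sketch of the dispatch described above, using `argparse`. The builder function names and their return values here are placeholders for illustration; the real project would wire these to its actual Keras model constructors before calling `model.fit_generator()`.

```python
import argparse

# Hypothetical builders standing in for the project's real Keras constructors.
def build_vggish():
    return "vggish-model"

def build_cnn():
    return "cnn-model"

def build_logreg():
    return "logreg-model"

MODEL_BUILDERS = {
    "vggish": build_vggish,
    "cnn": build_cnn,
    "logreg": build_logreg,
}

def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Train an orca detector model")
    parser.add_argument("--model", choices=sorted(MODEL_BUILDERS),
                        default="vggish", help="which architecture to build")
    return parser.parse_args(argv)

def build_model(name):
    # Conditionally construct the requested architecture by name.
    return MODEL_BUILDERS[name]()
```

Using `choices=` makes argparse reject unknown model names with a helpful error message for free.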
Right now, too much computation is done in real time as the generator loads each individual sample (and it all appears to happen on the CPU). We probably need to pregenerate numpy arrays and save them to HDF5 files that can be loaded without additional processing.
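A sketch of the pregeneration idea, assuming `h5py` and dataset names of my own choosing (`"features"`, `"labels"`): compute the arrays once, write them to HDF5, and let the generator read them back with no per-sample processing.

```python
import h5py
import numpy as np

def save_features(path, features, labels):
    """Write precomputed feature arrays once, so the training generator
    can load them without recomputing spectrograms on the CPU."""
    with h5py.File(path, "w") as f:
        f.create_dataset("features", data=features.astype(np.float32),
                         compression="gzip")
        f.create_dataset("labels", data=labels.astype(np.int64))

def load_features(path):
    """Read the arrays back; [:] pulls the full dataset into memory."""
    with h5py.File(path, "r") as f:
        return f["features"][:], f["labels"][:]
```

For datasets too large for memory, the generator could instead keep the file open and slice batches directly from the `h5py.Dataset`.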
By plugging into the Keras callback framework, we can record train/val loss and accuracy after each training epoch, and then generate plots after each training run.
NOTE: we can take this code directly from Ram's and my 266 project and plug it in here. It doesn't need new development.
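A framework-agnostic sketch of such a callback. In the real code this class would subclass `keras.callbacks.Callback` and be passed to `fit_generator()` via its `callbacks=` argument; the metric names assumed here (`loss`, `val_loss`, `acc`, `val_acc`) follow common Keras conventions but depend on how the model is compiled.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for training servers
import matplotlib.pyplot as plt

class HistoryLogger:
    """Records train/val loss and accuracy after each epoch, then plots them.
    In real code: subclass keras.callbacks.Callback and pass via callbacks=."""

    def __init__(self):
        self.history = {"loss": [], "val_loss": [], "acc": [], "val_acc": []}

    def on_epoch_end(self, epoch, logs=None):
        # Keras calls this hook with a dict of the epoch's metrics.
        for key in self.history:
            if logs and key in logs:
                self.history[key].append(logs[key])

    def plot(self, path="training_curves.png"):
        fig, ax = plt.subplots()
        for key, values in self.history.items():
            if values:
                ax.plot(values, label=key)
        ax.set_xlabel("epoch")
        ax.legend()
        fig.savefig(path)
        plt.close(fig)
```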
We need to investigate the best way to reduce the total number of classes in the dataset.
I can set this up (I've done it several times). We can push parameters, metrics, and artifacts (e.g. trained weights, loss plots, etc.) to the logging server, which makes it much easier to keep track of experimental runs and retrieve the data or assets associated with them if/when we need them later.
Our code can emit debugging output with something simple like:

```python
if verbose:
    print(...)
```
Record noise data in random segments over the course of a day or two.
Unfortunately, I don't think we can ask Keras to do its own train/val split when we use `model.fit_generator()`. We will have to implement our own validation generator (as the initial codebase indicates), but that requires us to do an initial split of our dataset.
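A minimal sketch of what such a validation generator could look like, assuming features and labels are already loaded as numpy arrays (the real code would read samples from the val directory instead). `fit_generator()` expects its generators to yield `(X, y)` batches indefinitely.

```python
import numpy as np

def batch_generator(features, labels, batch_size=32, shuffle=True, seed=0):
    """Yields (X, y) batches forever, as fit_generator() expects for both
    its training generator and validation_data= argument."""
    rng = np.random.default_rng(seed)
    idx = np.arange(len(features))
    while True:
        if shuffle:
            rng.shuffle(idx)
        for start in range(0, len(idx), batch_size):
            batch = idx[start:start + batch_size]
            yield features[batch], labels[batch]
```

Validation generators are typically run with `shuffle=False` so every epoch evaluates the same, complete set of batches.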
I'd suggest creating a directory structure with `/data/train`, `/data/val`, and `/data/test` directories at the top level, with subdirectories for the various species classes.
Probably a 70/20/10 stratified split? We can use `sklearn.model_selection.train_test_split()`.
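One way to get the 70/20/10 stratified split mentioned above: since `train_test_split()` only produces two partitions per call, apply it twice, first peeling off 70% for train, then splitting the remaining 30% into val and test at a 2:1 ratio. The function name here is my own.

```python
from sklearn.model_selection import train_test_split

def split_files(paths, labels, seed=42):
    """70/20/10 stratified split of (file path, label) pairs."""
    # First pass: 70% train vs. 30% remainder, stratified by species label.
    X_train, X_rest, y_train, y_rest = train_test_split(
        paths, labels, test_size=0.30, stratify=labels, random_state=seed)
    # Second pass: split the 30% remainder 2:1 into val (20%) and test (10%).
    X_val, X_test, y_val, y_test = train_test_split(
        X_rest, y_rest, test_size=1 / 3, stratify=y_rest, random_state=seed)
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)
```

The resulting lists can then be copied or symlinked into the `/data/train`, `/data/val`, and `/data/test` directory layout.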
But before we do this, we need to have decided which of the species will get the "Other" label.
We need to decide which species to explicitly classify -- the N species with the most samples, where I'm thinking N is ~3-4. Then we should classify everything else as "Other" (i.e. random sea noises from animals we don't care about).
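The relabeling step described above is a few lines with `collections.Counter`; a sketch, where the function name and the `"Other"` default are illustrative:

```python
from collections import Counter

def relabel_topn(labels, n=4, other="Other"):
    """Keep the n most frequent species labels; map everything else
    (random sea noises of animals we don't care about) to 'Other'."""
    top = {species for species, _ in Counter(labels).most_common(n)}
    return [s if s in top else other for s in labels]
```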
Right now, everything works with 5-sec clips, but I ran into matrix dimensionality mismatch errors in the model when trying to drop to 2 sec. This is probably work worth doing, though, as it would give us more training examples.
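Independent of the model-side dimension fix, the waveform-side slicing is straightforward; a sketch with assumed window/hop parameters (2-sec windows with 1-sec overlap would roughly quadruple the examples per 5-sec clip):

```python
import numpy as np

def slice_clip(x, sr=16000, win_sec=2.0, hop_sec=1.0):
    """Slice a waveform into fixed-length overlapping windows to
    multiply the number of training examples per recording."""
    win, hop = int(sr * win_sec), int(sr * hop_sec)
    return [x[i:i + win] for i in range(0, len(x) - win + 1, hop)]
```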
Fill in the stubbed-out code for creating the vggish model, based on:
Apply the resampling logic from our code to see what the audio waveforms look like after resampling. (The current value of the `SAMPLE_RATE` constant in mel_params.py is 16000.)
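The project's own resampling code may differ; as a stand-in sketch, `scipy.signal.resample_poly` does polyphase resampling to the 16 kHz target, with the 44.1 kHz source rate here being an assumption about the recordings:

```python
from math import gcd

import numpy as np
from scipy.signal import resample_poly

SAMPLE_RATE = 16000  # matches the mel_params.py constant noted above

def resample_to_model_rate(x, orig_sr):
    """Polyphase resampling of a waveform to the model's expected rate."""
    g = gcd(orig_sr, SAMPLE_RATE)
    return resample_poly(x, SAMPLE_RATE // g, orig_sr // g)
```

Plotting `x` and the resampled result side by side (e.g. with matplotlib) then shows what the waveforms look like after resampling.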
Generate a mel spectrogram of the resampled audio and plot that. That way we will be able to visualize the image in the same format that the model will train on. I think that will help us pick some species to classify which show some visual distinction.
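A minimal numpy sketch of the log-mel computation for such plots. The parameters here (64 mel bands, 512-point FFT, 125-7500 Hz) are illustrative defaults, not necessarily what mel_params.py specifies, and the real pipeline presumably uses the VGGish mel-features code instead:

```python
import numpy as np

def hz_to_mel(f):
    return 1127.0 * np.log1p(np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * np.expm1(np.asarray(m, dtype=float) / 1127.0)

def mel_spectrogram(x, sr=16000, n_fft=512, hop=160, n_mels=64,
                    fmin=125.0, fmax=7500.0):
    """Log-mel spectrogram of a mono waveform; returns (frames, n_mels)."""
    # Frame the signal and take a windowed power spectrum per frame.
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop: i * hop + n_fft] for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1)) ** 2
    # Triangular mel filterbank between fmin and fmax.
    mel_points = np.linspace(hz_to_mel(fmin), hz_to_mel(fmax), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        if center > left:
            fbank[m - 1, left:center] = (
                (np.arange(left, center) - left) / (center - left))
        if right > center:
            fbank[m - 1, center:right] = (
                (right - np.arange(center, right)) / (right - center))
    return np.log(power @ fbank.T + 1e-6)
```

The returned array can be shown with `plt.imshow(S.T, origin="lower", aspect="auto")` to see each clip in the same image format the model trains on.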
We may need a separate Dockerfile for the TX2 (or we may get lucky and be able to share our `orca_dev` image).
Low priority, but nice to have:
Before we wrap up this project and make the repo public, we should remove links to the audio file archives since we don't have the right to redistribute. (Instructions are in the setup.md doc for our own sake for now.)
We shouldn't be regularly running against our test set, but we will eventually need to add support for evaluating a trained model on it.
We set up a proxy for the live stream; the TX2 connects to this to read audio input.
Later, we plug in audio sample overlays.
Rather than assuming that the entire audio file contains a "positive" example of a class, I think we may need to manually annotate the appropriate portions.
Right now, we have several standalone scripts. We can create a central CLI entry point that triggers each as appropriate.
- Crawl the file system
- Count files per label
- Total and average sample length for each label
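The stats pass above is mostly stdlib; a sketch assuming a `root/<label>/*.wav` layout and WAV files readable by the stdlib `wave` module (the function name is illustrative):

```python
import wave
from pathlib import Path

def dataset_stats(root):
    """Crawl root/<label>/*.wav; report per-label file counts and
    total/average sample length in seconds."""
    stats = {}
    for label_dir in sorted(Path(root).iterdir()):
        if not label_dir.is_dir():
            continue
        durations = []
        for wav_path in label_dir.glob("*.wav"):
            with wave.open(str(wav_path), "rb") as w:
                durations.append(w.getnframes() / w.getframerate())
        if durations:
            stats[label_dir.name] = {
                "files": len(durations),
                "total_sec": sum(durations),
                "avg_sec": sum(durations) / len(durations),
            }
    return stats
```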
@paloukari @mwinton @ram-iyer
Hello,
It seems that the data are no longer available:
```
C:\Users\quentin.hamard>aws s3 cp s3://w251-orca-detector-data/data.tar.gz ./
fatal error: An error occurred (404) when calling the HeadObject operation: Key "data.tar.gz" does not exist

C:\Users\quentin.hamard>aws s3 cp s3://w251-orca-detector-data/vggish_weights.tar.gz ./
fatal error: An error occurred (404) when calling the HeadObject operation: Key "vggish_weights.tar.gz" does not exist

C:\Users\quentin.hamard>aws s3 cp s3://w251-orca-detector-data/orca_weights_616776.hdf5 ~/OrcaDetector/results/orca_weights_latest.hdf5
fatal error: An error occurred (404) when calling the HeadObject operation: Key "orca_weights_616776.hdf5" does not exist
```
I would like to re-use the CNN you trained to use it as a feature extractor.
Add a brief README.md file in the `./orca_detector/vggish/` directory crediting the original Google project that the code in that directory came from.