qlemaire22 / speech-music-detection
Python framework for Speech and Music Detection using Keras.
License: MIT License
Hi,
Glad to see this great project.
Could you please provide the pretrained model so I can test it on some audio?
I would like to try it out.
Thanks
Hi, I successfully installed everything and added my own datasets, but when issuing python prepare_dataset/prepare_audio.py my_own_dataset
I always get:
Traceback (most recent call last):
  File "prepare_dataset/prepare_audio.py", line 224, in <module>
    resample_dataset(args.data_location, args.dataset)
  File "prepare_dataset/prepare_audio.py", line 18, in resample_dataset
    cfg = utils.load_json('../datasets.json')
  File "/home/developer/Documents/speech-music-detection/smd/utils.py", line 99, in load_json
    with open(filename) as f:
FileNotFoundError: [Errno 2] No such file or directory: '../datasets.json'
Of course dataset.json is at the project's root, and contains:
{
  "my_own_dataset": {
    "data_folder": "my_own_dataset/audio",
    "filelists_folder": "my_own_dataset/filelists"
  }
}
What am I missing?
Thanks
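One thing worth checking, sketched below: the traceback shows the script opening the relative path '../datasets.json', and a relative path is resolved against the directory the command is run from, not against the script's location. The sketch assumes the project root is /home/developer/Documents/speech-music-detection (inferred from the path to smd/utils.py in the traceback); the helper name resolve_from is hypothetical. Whether the script is meant to be launched from inside prepare_dataset/ is an assumption, not something the post confirms. Note also that the post calls the file dataset.json while the code opens datasets.json, so the file name is worth verifying too.

```python
import posixpath

# Assumed project root, inferred from the traceback path to smd/utils.py.
PROJECT_ROOT = "/home/developer/Documents/speech-music-detection"


def resolve_from(cwd, relative_path):
    """Return the path that `relative_path` refers to when the process runs in `cwd`.

    Hypothetical helper: pure string manipulation, it does not touch the filesystem.
    """
    return posixpath.normpath(posixpath.join(cwd, relative_path))


# Running `python prepare_dataset/prepare_audio.py ...` from the project root:
from_root = resolve_from(PROJECT_ROOT, "../datasets.json")

# Running `python prepare_audio.py ...` from inside prepare_dataset/:
from_subdir = resolve_from(posixpath.join(PROJECT_ROOT, "prepare_dataset"), "../datasets.json")

print(from_root)    # one level above the project root
print(from_subdir)  # inside the project root
```

If that is indeed the cause, either launching the script from inside prepare_dataset/ or building the path from the script's own location (for example with pathlib.Path(__file__).resolve().parent.parent) would make the lookup independent of the working directory.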
I want to thank you for making this significant contribution to the field. The project is extremely well done, and I have learned a lot about how to structure a large Python project from reading your code.
I have attempted to train a model using the 'all_quality' config in experiments.json using a subset of the datasets (musan, gtzan, muspeak). The training was aborted after 28/50 epochs due to a tf/keras bug, and I was saving the model after each epoch. The last model saved had these results:
1214/1214 [==============================] - 557s 457ms/step - loss: 0.1321 - binary_accuracy: 0.9514 - categorical_accuracy: 0.8334 - val_loss: 0.1099 - val_binary_accuracy: 0.9621 - val_categorical_accuracy: 0.9421
I thought I might be able to use that model to make a prediction on a 30-minute WAV file (mostly music, with 4 segments of speech of about 3 minutes each). The output of predict.py labeled the entire 30 minutes as speech, so I think I'm doing something wrong. I wasn't sure what to pass for --mean_path and --std_path. I only see mean.npy and var.npy files in the filelists_* directories of the datasets. Are these supposed to be combined in some way and the result passed to predict.py?
I would be grateful for any advice you can give about training and running predictions.
Thanks
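For reference, here is a minimal sketch of how per-dataset means and variances could be combined into a single mean and standard deviation, in case that is what --mean_path and --std_path expect. This is an assumption and not the project's confirmed procedure. The function names and the per-dataset frame counts are hypothetical, the statistics are assumed to be population variances computed per feature, and plain Python lists stand in for the arrays that would be loaded from mean.npy and var.npy.

```python
import math


def pool_mean_var(means, variances, counts):
    """Combine per-dataset statistics into global per-feature mean and variance.

    means, variances: one list per dataset, one value per feature.
    counts: number of frames each dataset contributed (hypothetical input,
            needed to weight the datasets correctly).
    Uses Var[x] = E[x^2] - E[x]^2 with population variances.
    """
    total = sum(counts)
    n_features = len(means[0])
    pooled_mean = [
        sum(n * m[j] for n, m in zip(counts, means)) / total
        for j in range(n_features)
    ]
    pooled_var = [
        sum(n * (v[j] + m[j] ** 2) for n, m, v in zip(counts, means, variances)) / total
        - pooled_mean[j] ** 2
        for j in range(n_features)
    ]
    return pooled_mean, pooled_var


def std_from_var(variances):
    """Convert per-feature variances to standard deviations."""
    return [math.sqrt(v) for v in variances]


# Toy check with one feature: dataset A holds [0, 2], dataset B holds [4, 4, 4, 4].
mean, var = pool_mean_var(means=[[1.0], [4.0]], variances=[[1.0], [0.0]], counts=[2, 4])
std = std_from_var(var)
print(mean, var, std)
```

Simply averaging the mean.npy or var.npy files would only be correct if every dataset contributed the same number of frames, which is why the counts appear as an explicit input here. If predict.py is given a variance where it expects a standard deviation, or statistics from only one dataset, the input normalization would be off, which could plausibly produce a constant prediction.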
Hi,
Could you share the trained model? I would like to run a test with it, and it would only be used for testing.
Hi, I would like to add my own dataset, which consists of several mixed speech/music files.
In the README you mention the need for creating two different folders, but I cannot fully understand the "repartition of the data between each set for each type of label" part...
Could you please give us an example dir tree to use as reference?
Thanks