vivjay30 / cone-of-silence
The Cone of Silence:
License: MIT License
Hey, I loved your work. I am trying to replicate it by generating a synthetic dataset, but I ran into some errors and have a few questions.
You mention that the dataset used is VCTK. Its audio is in .flac format, which the program does not recognize, so did you do any preprocessing on the dataset?
Also, there is no data folder in the original VCTK dataset (the folder mentioned in the command to generate the synthetic dataset).
Could you share the dataset you used for training?
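One possible workaround for the .flac question, sketched under assumptions rather than taken from the repository: convert the corpus to WAV before running the generation script. The sketch assumes that ffmpeg is installed and that WAV input is what the script expects. The function names `flac_to_wav_commands` and `convert_all` are hypothetical and not part of the project.

```python
import subprocess
from pathlib import Path


def flac_to_wav_commands(src_root: str, dst_root: str) -> list[list[str]]:
    """Build one ffmpeg command per .flac file, mirroring the folder layout."""
    src, dst = Path(src_root), Path(dst_root)
    commands = []
    for flac_path in sorted(src.rglob("*.flac")):
        wav_path = dst / flac_path.relative_to(src).with_suffix(".wav")
        # "-y" overwrites an existing output file without prompting.
        commands.append(["ffmpeg", "-y", "-i", str(flac_path), str(wav_path)])
    return commands


def convert_all(src_root: str, dst_root: str) -> None:
    """Run the conversions; requires ffmpeg on the PATH (an assumption)."""
    for cmd in flac_to_wav_commands(src_root, dst_root):
        # ffmpeg does not create missing output folders, so create them first.
        Path(cmd[-1]).parent.mkdir(parents=True, exist_ok=True)
        subprocess.run(cmd, check=True)
```

Building the commands separately from running them makes it easy to inspect what would be converted before touching the disk.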
Hi, I'm trying to recreate your results. When I run the code I get an error message saying there's no such module 'mir_eval', and I wasn't able to find it. Perhaps you moved the files somewhere else or renamed the directory? The reference is in 'cos/helpers/eval_utils.py'.
Thanks
Hi, vivjay
Cone-of-Silence/cos/training/train.py
Lines 115 to 124 in a3f27eb
Hi, in your README, you mention that "We even provide a sample 4 channel file for you to run". Where is this file located?
Hi,
Your paper mentions you recorded 3 hours of data into the Seeed 4 mic hat from VCTK corpus.
Would you be willing to make that available?
We'd like to duplicate the results, and we don't get the same level from training on just the synthetic data.
Thanks!!
Richard
Hi. I realize that you have already answered the question of whether the algorithm is real-time in #9. I just want to know whether there is anything we can do to make it run in real time.
From what I understand, the model will work for 4-channel and 6-channel wav files.
Does this model also work on 2-channel recordings? Is there a pretrained model for that?
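Before trying a recording with a multi-channel model, it can help to confirm how many channels the file actually contains. This is a minimal sketch using Python's standard `wave` module. Both function names are hypothetical and not part of the repository, and the sketch says nothing about which channel counts the pretrained models accept.

```python
import wave


def wav_channel_count(path: str) -> int:
    """Return the number of channels stored in an uncompressed PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnchannels()


def write_silent_wav(path: str, channels: int,
                     sample_rate: int = 44100, frames: int = 100) -> None:
    """Write a short silent 16-bit WAV, only so the check can be tried without real data."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)
        # One zero-valued 16-bit sample per channel per frame.
        wav_file.writeframes(b"\x00\x00" * channels * frames)
```

Note that the `wave` module reads only uncompressed PCM files, so a float-encoded WAV would need a different reader.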
Hi @vivjay30 ,
Thanks for sharing this, nice work! Would you be willing to share your training dataset?
Best regards,
PeterPham
Hi @vivjay30
Thanks for your answer. Could you tell me which device you used to record the real speakers?
Best regards,
PeterPham
Hi Vivek,
Thanks for your awesome work.
What is the meaning of FG_VOL_MIN, FG_VOL_MAX, BG_VOL_MIN, and BG_VOL_MAX in generate_dataset.py, and how did you calculate these four values?
Best regards,
KenHuang
Hi,
thanks for this library.
Is it possible to use this in a real-time scenario like conferences?
Best regards,
Dirk
Hi Vivek,
Thanks for open-sourcing this interesting project, nice work! I just have one question about the model size and the performance. I checked your Demucs model implementation and calculated the number of parameters, and with your default hyperparameter settings there are over 260M parameters. I'm not sure whether this is the actual setting you used for training, but if so, it is a very large number of parameters, since other separation models nowadays typically have fewer than 10M. I'm wondering whether you have run any experiments on how the performance changes if you shrink the model, e.g. to the size of the multi-channel Conv-TasNet or TAC reported in your paper. Thanks!
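As a rough illustration of why the parameter count of a stacked convolutional encoder can grow so quickly, here is a minimal sketch that counts the weights of 1-D convolution layers whose width doubles at each level. The layer widths, kernel size, and depths are hypothetical values chosen only to show the trend. They are not the repository's defaults, and the sketch does not reproduce the 260M figure.

```python
def conv1d_params(in_ch: int, out_ch: int, kernel: int, bias: bool = True) -> int:
    """Weights (out_ch * in_ch * kernel) plus one bias per output channel."""
    return out_ch * in_ch * kernel + (out_ch if bias else 0)


def encoder_params(in_ch: int, hidden: int, depth: int,
                   kernel: int, growth: int = 2) -> int:
    """Total parameters of `depth` stacked conv layers whose width grows by `growth`."""
    total = 0
    width = hidden
    for _ in range(depth):
        total += conv1d_params(in_ch, width, kernel)
        # The next layer takes this layer's output and widens it again.
        in_ch, width = width, width * growth
    return total


# Hypothetical settings: each extra level roughly quadruples the layer size.
for depth in (4, 5, 6):
    print(depth, encoder_params(in_ch=4, hidden=64, depth=depth, kernel=8))
```

Because each layer's weight count scales with the product of its input and output widths, doubling the width per level makes the deepest layers dominate the total, so reducing the depth or the initial width is the most direct way to shrink such a model.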