Comments (13)
Hi,
I am confused about the same thing, maybe a little more than you are.
Can you please tell me how you know that the input_feature.py file is generating the development files?
I can't find where it is storing the files it generates.
from 3d-convolutional-speaker-recognition.
@xav12358 @Chegde8
hello both,
I'm confused about the same thing. Did you find a solution?
@xav12358 @peter05010402
Hi,
I did a little reading on hdf5 file types and I think I figured it out.
I tried to follow the file structure of the hdf5 example files. Here is the code snippet:
eval_try.hdf5 is the file which contains evaluation and enrollment data.
"lab" is an array of labels.
"feat" is a matrix of features.
Here I have the same data for evaluation and enrollment. You will just need to make separate arrays of "lab" and "feat" for both of them.
I added this at the end of the original "input_features.py" code.
Note that, for the above code snippet, the hdf5 files will be stored in the folder in which input_features.py is saved, i.e. in /code/0-input.
To use these files for the development, enrollment and evaluation phases, just open them as "fileh" instead of the ones already given in the code.
Do let me know if you need clarification on anything I have just said or if there is an error and I have misunderstood something.
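Since the snippet itself didn't survive here, a minimal sketch of what it could look like with PyTables follows. The array names ("lab", "feat") and node names match the description above, but the dummy data shapes and the exact PyTables calls are my assumption, not necessarily the original code:

```python
import numpy as np
import tables

# dummy stand-ins for the real labels and feature cubes (assumed shapes)
lab = np.array([0, 1], dtype=np.int32)
feat = np.random.rand(2, 1, 20, 80, 40).astype(np.float32)

# eval_try.hdf5 holds both evaluation and enrollment data; here the same
# arrays are stored under both names, as described above
h5file = tables.open_file('eval_try.hdf5', 'w')
h5file.create_carray(where='/', name='label_enrollment', obj=lab, byteorder='little')
h5file.create_carray(where='/', name='label_evaluation', obj=lab, byteorder='little')
h5file.create_earray(where='/', name='utterance_enrollment', obj=feat, byteorder='little')
h5file.create_earray(where='/', name='utterance_evaluation', obj=feat, byteorder='little')
h5file.close()
```

For separate enrollment and evaluation data you would simply build two "lab"/"feat" pairs and pass each to its own node.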
@Chegde8 Thank you for your code.
I got the following error:
could you send me your input_features.py? Thank you!
[email protected]
I think the error is saying that you are storing an empty array into eval_try.hdf5.
I will email my input_features.py to you :)
@Chegde8 Thank you!!!
@Chegde8 , Could you please send your input_features.py?
Hi @Chegde8 , @peter05010402 and @xav12358 ,
I added the following to the input_features.py:
import tables

idx = 0
f = open('file_path_test1.txt', 'r')
for line in f:
    idx = idx + 1
# print(idx)

lab = []
feat = []
for i in range(idx):
    feature, label = dataset.__getitem__(i)
    lab.append(label)
    feat.append(feature)
    print(feature.shape)
    print(label)

########################
## creating hdf5 file ##
########################
h5file = tables.open_file('/root/3D_CNN/3D-convolutional-speaker-recognition/data/evaluation_test.hdf5', 'w')
label_test = h5file.create_carray(where='/', name='label_enrollment', obj=lab, byteorder='little')
label_array = h5file.create_carray(where='/', name='label_evaluation', obj=lab, byteorder='little')
utterance_test = h5file.create_earray(where='/', name='utterance_enrollment', chunkshape=[1, 20, 80, 40], obj=feat, byteorder='little')
utterance_train = h5file.create_earray(where='/', name='utterance_evaluation', chunkshape=[1, 20, 80, 40], obj=feat, byteorder='little')
h5file.close()
It gave me the following error:
ValueError: the shape ((0, 1, 20, 80, 40)) and chunkshape ((1, 20, 80, 40)) ranks must be equal.
I realized that lab and feat are arrays, each with 9 elements (the number of wav files I want to test). Each element of the feat array holds the features of one wav file in my wav list. So I changed the chunkshape values to chunkshape = [9, 20, 80, 40, 1], and the evaluation_test.hdf5 file was created with no error.
When I used the hdf5 file I created, I got bad results, and I'm trying to figure out what the problem is.
Based on the structure of the program, is it allowed to make multiple models from multiple wav files at once, or do I need to enroll each model separately?
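For anyone hitting the same ValueError: it is just PyTables checking that chunkshape has the same rank (number of dimensions) as the stored array. A quick numpy-only sketch of the rank bookkeeping, using the shapes from the traceback above (no PyTables needed to see the mismatch):

```python
import numpy as np

# 9 feature cubes of shape (1, 20, 80, 40), as described above
feat = [np.zeros((1, 20, 80, 40), dtype=np.float32) for _ in range(9)]

# stacking them gives a rank-5 array: (9, 1, 20, 80, 40)
data = np.stack(feat)
print(data.shape)  # (9, 1, 20, 80, 40)

# a chunkshape must have the same rank as the data it chunks
bad_chunkshape = (1, 20, 80, 40)      # rank 4 -> PyTables raises ValueError
good_chunkshape = (9, 20, 80, 40, 1)  # rank 5 -> accepted
print(len(bad_chunkshape) == data.ndim)   # False
print(len(good_chunkshape) == data.ndim)  # True
```

Note that chunkshape only controls the on-disk chunking granularity, so any rank-5 chunkshape passes this check regardless of how its entries are ordered.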
Hi @MSAlghamdi ,
Do you get different results when chunkshape is 1 and when it is 9?
As far as I understand, the only thing that changes when you change chunkshape is the shape of the data written to the hdf5 file in one I/O operation. So instead of writing one feature each time, all 9 are written to the hdf5 file at once. I don't know how this would change the results.
Also I am not sure if changing chunkshape has anything to do with whether multiple models from multiple wav files are created at once. The model creation happens in the training and enrollment phases, not in the input phase. In this script we just produce the files which are used later.
But again, I am really not an expert on how hdf5 files work. So what I have said above might not be completely correct.
Also, your question got me thinking, do the training and enrollment codes make different models for each wav file, or do they create one model for each speaker based on the labels?
Essentially, will the performance be better if I use multiple short wav files per speaker or one long wav file per speaker?
PS: @MSAlghamdi , do you still want me to send you my input_features.py file? I think you have already gotten it, but asking just in case.
No, thank you. I just built main and it worked.
datasetTest = AudioDataset(files_path='file_path_test.txt', audio_dir='Audio',
                           transform=Compose([CMVN(), Feature_Cube(cube_shape=(20, 80, 40), augmentation=True), ToOutput()]))
datasetTrain = AudioDataset(files_path='file_path_train.txt', audio_dir='Audio',
                            transform=Compose([CMVN(), Feature_Cube(cube_shape=(20, 80, 40), augmentation=True), ToOutput()]))

############### TEST DATASET ####################
idx_test = 0
f1 = open('file_path_test.txt', 'r')  # count the test utterances
for line in f1:
    idx_test = idx_test + 1

lab_test = []
feat_test = []
for i in range(idx_test):
    feature, label = datasetTest.__getitem__(i)
    lab_test.append(label)
    # feature.shape = (1, 20, 80, 40).
    # make it like: (1, 80, 40, 20)
    feature = feature.swapaxes(1, 2).swapaxes(2, 3)
    feat_test.append(feature[0, :, :, :])

############### TRAIN DATASET ####################
idx = 0
f = open('file_path_train.txt', 'r')
for line in f:
    idx = idx + 1

lab_train = []
feat_train = []
for i in range(idx):
    feature, label = datasetTrain.__getitem__(i)
    lab_train.append(label)
    feature = feature.swapaxes(1, 2).swapaxes(2, 3)
    feat_train.append(feature[0, :, :, :])

h5file = tables.open_file('/root/3D_CNN/3D-convolutional-speaker-recognition/data/devel_try.hdf5', 'w')
label_test = h5file.create_carray(where='/', name='label_test', obj=lab_test, byteorder='little')
label_array = h5file.create_carray(where='/', name='label_train', obj=lab_train, byteorder='little')
utterance_test = h5file.create_earray(where='/', name='utterance_test', chunkshape=[8, 80, 40, 20], obj=feat_test, byteorder='little')
utterance_train = h5file.create_earray(where='/', name='utterance_train', chunkshape=[13, 80, 40, 20], obj=feat_train, byteorder='little')
h5file.close()
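The double swapaxes in the loops above is just a transpose of the last three axes; a small numpy sketch (shapes taken from the comments in the code):

```python
import numpy as np

feature = np.arange(1 * 20 * 80 * 40).reshape(1, 20, 80, 40)

# swap axes 1<->2, then 2<->3: (1, 20, 80, 40) -> (1, 80, 20, 40) -> (1, 80, 40, 20)
swapped = feature.swapaxes(1, 2).swapaxes(2, 3)
print(swapped.shape)  # (1, 80, 40, 20)

# equivalent single transpose
assert np.array_equal(swapped, feature.transpose(0, 2, 3, 1))

# dropping the leading singleton axis gives the (80, 40, 20) cube that gets appended
cube = swapped[0, :, :, :]
print(cube.shape)  # (80, 40, 20)
```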
The .h5 file was created in good shape. Another issue popped up when I ran the demo, which could be solved by storing the features in a .npy array.
I'll let you know when it's solved.
@Chegde8 hello, I added your code and changed the first line to:
with open("eval_try.hdf5", "w") as h5file:
but got this error:
AttributeError: '_io.TextIOWrapper' object has no attribute 'create_carray'
Hmm, sorry, I'm new to speaker verification. Could you please send your input_features.py? Thanks!
[email protected]
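That AttributeError is what you'd expect from that change: the built-in open() returns a plain text-file object, which has none of the PyTables methods, so the file has to be opened with tables.open_file() instead. A tiny stdlib-only sketch of the first half (the PyTables call is left as a comment so this runs even without PyTables installed):

```python
# built-in open() gives an _io.TextIOWrapper, not a PyTables File
f = open('eval_try.hdf5', 'w')
print(type(f).__name__)             # TextIOWrapper
print(hasattr(f, 'create_carray'))  # False -> hence the AttributeError
f.close()

# the working pattern (assuming PyTables is installed) would be:
#   import tables
#   h5file = tables.open_file('eval_try.hdf5', 'w')
#   h5file.create_carray(...)
#   h5file.close()
```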
could you send me your input_features.py? Thank you! @MSAlghamdi
@Chegde8 could you send me your input_features.py? Thank you! [email protected]