- This repo contains the inference code for a pretrained human voice gender classifier.
- You can also try the 🤗 Hugging Face online demo.
First, clone this repository

```shell
git clone https://github.com/JaesungHuh/voice-gender-classifier.git
```

and install the packages via pip.

```shell
cd voice-gender-classifier
pip install -r requirements.txt
```
```python
import torch

from model import ECAPA_gender

# You can download the model directly from the Hugging Face model hub
model = ECAPA_gender.from_pretrained("JaesungHuh/ecapa-gender")
model.eval()

# If you are using a GPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Load the audio file and call predict() to get the output directly
example_file = "data/00001.wav"
with torch.no_grad():
    output = model.predict(example_file, device=device)
print("Gender : ", output)
```
For those who need the pretrained weights, please download them here.
State-of-the-art speaker verification models already produce good representations of the speaker's gender.
I used the pretrained ECAPA-TDNN from TaoRuijie's repository, added one linear layer to build a two-class classifier, and fine-tuned the model on the VoxCeleb2 dev set.
The model achieved 98.7% accuracy on the VoxCeleb1 identification test split.
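As an illustrative sketch (not the repo's actual code), the "one linear layer on top of the embedding" design described above looks like this. The 192-dimensional embedding size is an assumption based on the common ECAPA-TDNN configuration; the real value and layer names live in `model.py`:

```python
import torch
import torch.nn as nn

# Hypothetical sketch of the two-class head: a single linear layer mapping
# a speaker embedding to male/female logits. emb_dim=192 is assumed, not
# taken from the repo.
class GenderHead(nn.Module):
    def __init__(self, emb_dim=192, num_classes=2):
        super().__init__()
        self.fc = nn.Linear(emb_dim, num_classes)

    def forward(self, emb):
        # emb: (batch, emb_dim) speaker embeddings from the ECAPA-TDNN backbone
        return self.fc(emb)

head = GenderHead()
logits = head(torch.randn(4, 192))  # a batch of 4 dummy embeddings
print(logits.shape)
```

During fine-tuning, a head like this would be trained with cross-entropy on the two gender labels while reusing the pretrained backbone weights.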
Note that the training dataset used for this model (VoxCeleb) may not represent the global human population, so please be mindful of unintended biases when using this model.
- 🤗 Hugging Face Hub link
- I modified the model architecture from TaoRuijie's repository.
- For more details about ECAPA-TDNN, check the paper.