Name: Frederico S. Oliveira
Type: User
Company: UFMT
Bio: NLP researcher and Ph.D. student at UFG, focusing on speech synthesis and recognition using deep learning; also a professor at UFMT.
Twitter: fred_s0
Location: Cuiabá, Mato Grosso - Brazil
Blog: https://www.fredso.com.br
Frederico S. Oliveira's Projects
This repository contains implementations of different AI algorithms, based on the 4th edition of the amazing AI book Artificial Intelligence: A Modern Approach.
Teaching material for the course Algorithms and Data Structures
Classic game using PyGame
Official Code for Assem-VC @ICASSP2022
A simple GUI application that slices audio with silence detection
A list of papers and projects on cutting-edge Speech Synthesis, Text-to-Speech (TTS), Singing Voice Synthesis (SVS), Voice Conversion (VC), Singing Voice Conversion (SVC), and other interesting related work.
BRSpeech: A Portuguese Dataset for Speech Synthesis
A model for predicting MOS that utilizes embeddings of supervised learning and self-supervised learning models, combined with embeddings of speaker verification models, to predict the MOS metric.
A dataset of capybara images for training an object detection model.
This repository shows how to train your own image segmentation model using the TensorFlow Object Detection API.
This repository shows how to train your own object detector using the TensorFlow Object Detection API. It also demonstrates how to use the trained model to annotate data (auto-annotate).
CML-TTS: A Multilingual Dataset for Speech Synthesis
CML-TTS Conversion Tools
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Examples for an AI course following the textbook Artificial Intelligence: A Modern Approach by Russell and Norvig.
A set of audio augmentation techniques to perform noise insertion in datasets used for Automatic Speech Recognition.
Scripts to augment labeled images with bounding boxes
Deep Speaker: an End-to-End Neural Speaker Embedding System.
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it can remove various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and generalization.
TensorFlow Object Detection API for fault detection on power transmission lines.
Object detection demo with separate frontend and backend, built with Flask and TensorFlow 2.x.
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
The official PyTorch implementation of "FullSubNet+: Channel Attention FullSubNet with Complex Spectrograms for Speech Enhancement".