flipkart-grid-2.0's Introduction

Flipkart-GRID-Noise-Cancellation-Solution

Owners: Aditya Das, Nihal John George and Aryan Pandey

This repository contains team Third Degree Burn's solution for Round 3 of Flipkart GRiD 2.0.

Correction to the video: we meant to say that the audio clips are padded to a length of 10 seconds, not 15 seconds. Any audio longer than that is cropped to 10 seconds before being fed into the model.
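The padding and cropping rule above can be sketched as follows. This is only an illustration, not the code used in the repository: the function name `pad_or_crop`, the use of zero padding at the end of the clip, and the caller-supplied sample rate are all assumptions.

```python
import numpy as np

TARGET_SECONDS = 10  # clip length stated in the correction above


def pad_or_crop(audio: np.ndarray, sample_rate: int,
                seconds: float = TARGET_SECONDS) -> np.ndarray:
    """Return a 1-D signal with exactly seconds * sample_rate samples.

    Hypothetical helper: zero padding at the end is an assumption,
    the README does not say how or where the padding is applied.
    """
    target_len = int(seconds * sample_rate)
    if len(audio) >= target_len:
        # Longer clips are cropped to the first 10 seconds.
        return audio[:target_len]
    # Shorter clips are padded with zeros (silence) at the end.
    return np.pad(audio, (0, target_len - len(audio)))
```

For example, at an assumed sample rate of 16000 Hz every clip comes out with 160000 samples, whatever its original duration.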

Link to the Drive folder where we have stored all our predictions on the input files: https://drive.google.com/drive/folders/1ewBhjymSAa-8PkT5S80DMuDDK8_dUK3a?usp=sharing

Link to WER for each file: link

Video link: link

API Usage

Without a GUI

We provide a Flask API that takes the path to an input file, or to a directory containing multiple input files, together with the path to an output directory where the results are stored in WAV format. To use our scripts, run the wsgi script in a terminal, then run the interacting script separately to start interacting with the server.

$ cd FlaskNoGUI

$ python wsgi.py
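Once the server is running, the interacting script sends it the input and output paths. The sketch below shows how such a request could be built. It is a guess at the shape of the exchange, not the repository's actual client: the route `/denoise`, the JSON field names, and the address `http://127.0.0.1:5000` are all hypothetical, so check the interacting script for the real values.

```python
import json
from urllib import request


def build_denoise_request(input_path: str, output_dir: str,
                          base_url: str = "http://127.0.0.1:5000",
                          route: str = "/denoise") -> request.Request:
    """Build (but do not send) a POST request carrying the two paths.

    Everything about the endpoint is an assumption: the route, the
    field names and the address are placeholders for illustration.
    """
    payload = json.dumps({
        "input_path": input_path,   # a single file or a directory
        "output_dir": output_dir,   # where the WAV results are written
    }).encode("utf-8")
    return request.Request(
        base_url + route,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Sending it would look like this (requires the server to be up):
# with request.urlopen(build_denoise_request("in.wav", "out/")) as resp:
#     print(resp.status)
```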

The scripts can be found here:

With a GUI

DISCLAIMER - Move the gbl_model.h5 file from the 'FlaskNoGUI/Models/' folder to the 'FlaskGUI/Models/' folder before running the scripts in this section.

We have also written scripts that add a small tkinter GUI on top of the Flask API. The scripts are run in the same order: first run the wsgi script in a terminal, then run the interacting script separately to start interacting with the server. The only difference is that running the interacting script opens a pop-up window in which you select whether you want to input a single file or a whole directory. In either case, you need to select the input before the output directory, otherwise the code will not run.

The scripts for this can be found here:

Model Building

The following scripts were used to build our model:

Datasets

We created our data manually: 30-odd files containing only background noise and 30-odd files containing clear voice. We then generated the dataset by mixing each clean audio file with each background noise file to create a new audio file, which gave us about 1000 audio files. We also added a function that scales the noise by some amount to produce three sets of audio: one with the noise dimmed, one with the noise at the same level as the clean audio, and one with the noise amplified. In total, this mixing gave us about 3000 audio samples.
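The mixing scheme described above can be sketched as follows. This is an illustration under stated assumptions, not our actual generation script: the scale factors in `NOISE_SCALES`, the helper names, and the choice to loop or trim the noise to the length of the clean clip are all hypothetical. With 30-odd clips of each kind, the pairing gives roughly 1000 combinations, and three noise levels give roughly 3000 samples.

```python
import itertools

import numpy as np

# Hypothetical scale factors: dimmed, unchanged and amplified noise.
NOISE_SCALES = (0.5, 1.0, 2.0)


def match_length(noise: np.ndarray, length: int) -> np.ndarray:
    """Loop the noise if it is too short, then trim it to `length` (assumption)."""
    repeats = -(-length // len(noise))  # ceiling division
    return np.tile(noise, repeats)[:length]


def mix(clean: np.ndarray, noise: np.ndarray, scale: float) -> np.ndarray:
    """Add scaled background noise on top of a clean voice clip."""
    return clean + scale * match_length(noise, len(clean))


def build_dataset(clean_clips, noise_clips, scales=NOISE_SCALES):
    """Yield (clean_index, noise_index, scale, mixed_audio) for every combination."""
    combos = itertools.product(enumerate(clean_clips),
                               enumerate(noise_clips),
                               scales)
    for (ci, clean), (ni, noise), scale in combos:
        yield ci, ni, scale, mix(clean, noise, scale)
```

Keeping the clean and noise indices with each mixed clip makes it easy to pair every noisy sample with its clean target during training.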

The datasets are currently uploaded to Kaggle and are private. Anyone with the following links will be able to access them. The datasets will be made public after the competition is over, if Flipkart allows it.

Contributors

nihalgeorge01, adityadas-iitm, aryanpandey
