
phoneme's Introduction

Prototype usage

  1. Download the git repo with submodules.
git clone --recurse-submodules git@github.com:CMU-MCDS/phoneme.git <TargetDir>

<TargetDir> can be any directory name you choose for the clone. This README assumes it is phoneme.

  1. Set up a Python virtual environment (ideally outside the phoneme directory).
virtualenv phoneme_env
source phoneme_env/bin/activate
pip install -r phoneme/requirements.txt
  1. Set up the git repo for development: check out the mcds branch of the persephone submodule, then start the server.
cd phoneme/modules/persephone
git checkout -b mcds
git merge origin/mcds

cd ../..
python server.py
  1. In a browser, go to
http://127.0.0.1:5000

The training files must be packaged as <TrainData>.zip, where <TrainData> is the name of the dataset. The zip file should contain a directory with the same name, <TrainData>. That directory should in turn contain a directory called wav holding the wav files and a directory called label holding the corresponding label files.

The untranscribed files must be prepared in the same way as the training files, except that the label directory is not needed.

Sample datasets are provided in the directory sample_data.
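
The layout described above can be produced with a short script. The following is a minimal sketch, not part of this repo: the function name build_dataset_zip and its parameters are hypothetical, and it assumes the dataset already sits in a directory named after the dataset with wav and label subdirectories.

```python
import zipfile
from pathlib import Path


def build_dataset_zip(dataset_dir, out_dir, include_labels=True):
    """Write <out_dir>/<name>.zip containing <name>/wav/ and, optionally, <name>/label/.

    Hypothetical helper: <name> is taken from the last component of dataset_dir.
    """
    dataset_dir = Path(dataset_dir)
    name = dataset_dir.name
    # Untranscribed data only needs the wav directory.
    subdirs = ["wav", "label"] if include_labels else ["wav"]
    for sub in subdirs:
        if not (dataset_dir / sub).is_dir():
            raise FileNotFoundError(f"missing directory: {dataset_dir / sub}")

    zip_path = Path(out_dir) / f"{name}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for sub in subdirs:
            for path in sorted((dataset_dir / sub).iterdir()):
                if path.is_file():
                    # Archive names look like <name>/wav/<file> or <name>/label/<file>.
                    zf.write(path, arcname=f"{name}/{sub}/{path.name}")
    return zip_path
```

For untranscribed data, calling the function with include_labels=False leaves the label directory out of the archive.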

  1. When done, exit the virtual environment with
deactivate

Potential next steps

  1. Integrate the interface with Moses
  2. Add a back-end API for transcribing new wav audio
  3. Improve the design/implementation API:
    • TODO: output the model performance and progress back to the interface
    • TODO: allow the user to specify batch size, num_train, num_layers, and hidden size
    • TODO: implement functions to check the format of the uploaded data
    • TODO: consider which kinds of files are supported and what happens if users upload multiple files
    • TODO: add functions to organize the uploaded data and manage different experiments (if the user uploads multiple batches of data, the interface needs to let the user select a particular training batch)
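
As one possible starting point for the format-check item above, here is a rough sketch of a layout check for an uploaded zip. It is an illustration only: check_dataset_zip is a hypothetical name, it assumes the dataset name equals the zip file name without its extension, and it checks only the directory layout described in this README, not the contents of the wav or label files.

```python
import zipfile
from pathlib import Path


def check_dataset_zip(zip_path, require_labels=True):
    """Return a list of layout problems found in the archive (empty if none)."""
    # Assumption: the dataset name is the zip file name without ".zip".
    name = Path(zip_path).stem
    with zipfile.ZipFile(zip_path) as zf:
        # Skip explicit directory entries; keep only file members.
        files = [n for n in zf.namelist() if not n.endswith("/")]

    problems = []
    outside = [n for n in files if not n.startswith(f"{name}/")]
    if outside:
        problems.append(f"files outside top-level directory {name}/: {outside}")
    wavs = [
        n for n in files
        if n.startswith(f"{name}/wav/") and n.lower().endswith(".wav")
    ]
    if not wavs:
        problems.append(f"no .wav files found under {name}/wav/")
    if require_labels and not any(n.startswith(f"{name}/label/") for n in files):
        problems.append(f"no label files found under {name}/label/")
    return problems
```

Returning a list of messages rather than raising on the first problem would let the interface report every issue with an upload at once.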

Citations for the resources used

This system uses Persephone. For its source, please see: https://github.com/oadams/persephone. For its publication, please see:

Oliver Adams, Trevor Cohn, Graham Neubig, Hilaria Cruz, Steven Bird, and Alexis Michaud, Evaluating phonemic transcription of low-resource tonal languages for language documentation, Proceedings of LREC 2018.

The sample_data contains subsets of the Na dataset. For its source, please see: http://lacito.vjf.cnrs.fr/pangloss/languages/Na_en.php.

The sample_data also contains the Griko dataset. For its source, please see: http://griko.project.uoi.gr/. For its publication, please see:

Marika Lekakou, Valeria Baldissera, and Antonis Anastasopoulos (2013). Documentation and analysis of an endangered language: aspects of the grammar of Griko. University of Ioannina.
