
dsfsi / masakhane-web


Masakhane Web is a translation web application solely for African languages.

Home Page: http://translate.masakhane.io

License: MIT License

Makefile 1.59% Python 22.03% Dockerfile 1.09% HTML 0.95% JavaScript 16.77% CSS 0.12% Shell 0.28% Jupyter Notebook 57.18%
african african-languages africanlp africannlp docker joeynmt machine-translation masakhane masakhane-web natural-language-processing translation-models

masakhane-web's People

Contributors

banqomania, categitau, chrisemezue, dependabot[bot], espoirmur, freshia, idzingirai, juliakreutzer, kabongosalomon, lastrucci01, peaceaz, ruohoruotsi, tunde99, vukosim

masakhane-web's Issues

Update endpoint

Description

I'm seeing this error when adding an additional model

image

It might be related to the time it takes to download the model.

Start the server with default languages.

The service starts with no languages in memory, so we need to run curl --request GET 'http://127.0.0.1:5000/update' to load into memory the language pairs available in the database (see python manage.py all_languages).

Edit the AddResource class so that the default models are loaded when docker-compose up -d is run.
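
As a stopgap until AddResource loads defaults on its own, the manual curl step could be automated by a small startup script. The following is only a sketch of that alternative, not the change the issue proposes: the function names, the retry count, and the delay are assumptions, and only the /update URL comes from the issue text.

```python
import time
import urllib.error
import urllib.request

# URL taken from the issue text; host and port may differ per deployment.
UPDATE_URL = "http://127.0.0.1:5000/update"


def http_get(url, timeout=5.0):
    """Send a GET request and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def load_default_languages(url=UPDATE_URL, fetch=http_get, retries=5,
                           delay=2.0, sleep=time.sleep):
    """Call the update endpoint until it answers 200 or retries run out.

    `fetch` and `sleep` are injectable so the loop can be exercised
    without a running server. Returns True on success, False otherwise.
    """
    for attempt in range(1, retries + 1):
        try:
            if fetch(url) == 200:
                return True
        except (urllib.error.URLError, OSError):
            # Server not reachable yet (or returned an HTTP error); retry.
            pass
        if attempt < retries:
            sleep(delay)
    return False
```

Such a script would be run once after docker-compose up -d has started the containers; retrying matters because the server may need some time before it accepts requests.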

Combine Add_Language & Update

Description

Combine the steps of adding and updating loaded language models.

Tasks

  • Combine add & update

Background

Masakhane Translate has a Manage CLI that lets you add language model info to the database, among other things. Language model info refers to the data stored in the database (Fig 1); the actual language models are downloaded in another step. The database seems to serve as a reference for which language models are downloaded. The add_lang command (Fig 2) reads languages.json, which is just a file storing the language names and shorthands, and builds the name_tag composition to be stored in the database using the user-defined Language DB model. Note that the database is implemented with Flask-SQLAlchemy.

Once model info has been loaded into the database, the /update endpoint, which is linked to the AddResource class (Fig 3), must be hit. It goes through the name_tags in the database and downloads the model files from Zenodo if necessary. Once a model has been downloaded, it is loaded into memory in self.models, which is a dictionary. The endpoint also removes any models in memory that no longer have a reference in the DB.
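
The update behaviour described above can be sketched as a single synchronisation step over a dictionary. This is a minimal illustration under assumptions, not the code in AddResource: the function name sync_models, its callback parameters, and the example name tags are hypothetical stand-ins for the real download and loading logic.

```python
def sync_models(models, db_name_tags, is_downloaded, download, load):
    """Bring the in-memory `models` dict in line with the database.

    models        -- dict mapping name_tag -> loaded model (mutated in place)
    db_name_tags  -- iterable of name_tags currently stored in the database
    is_downloaded -- callable(name_tag) -> bool, True if files are on disk
    download      -- callable(name_tag), fetches the model files
    load          -- callable(name_tag) -> model object
    """
    wanted = set(db_name_tags)

    # Drop models that no longer have a reference in the database.
    for tag in list(models):
        if tag not in wanted:
            del models[tag]

    # Download (only if necessary) and load everything that is missing.
    for tag in sorted(wanted):
        if tag in models:
            continue
        if not is_downloaded(tag):
            download(tag)
        models[tag] = load(tag)

    return models
```

Combining add and update would then amount to calling a step like this right after the add command has written to the database, instead of requiring a separate request to /update.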

Files

translate.py
model_load.py
init.py
manage.py

Screenshots

Figure 1 - Language table contents
image

Figure 2 - add_language command
Model info added to the database
image

Figure 3 - AddResource Class
Model info used to download models
image

update the README.md running with Docker part

The instructions for running the app through Docker are not correct. To run the container we should use docker-compose up -d --build rather than docker-compose -f docker-compose.prod.yml up -d --build, which is the production command and does not work in dev.

Dependency Maintenance Documentation

Description

Create documentation on guidelines to go about when updating dependencies.
It will probably involve checking out the branch of the dependency upgrade, figuring out whether, and why, it might break, then updating the code and testing before merging it into master.

Background

There is a dependency bot in the repo that checks for out-of-date dependencies. When it finds one, it creates a new branch, updates the dependency, and opens a PR. Dependency upgrades have a high chance of making packages in the project incompatible with one another and causing errors.

Tasks

Include specific tasks in the order they need to be done in. Include links to specific lines of code where the task should happen at.

  • Figure out the method used for updating dependencies.
  • Create documentation for contributors of the repo to know how to maintain dependencies.

Documentation on the Backend system

Description

After learning about the backend of Masakhane Translate, I'd like to document the details of the system so future developers can get started more easily.

Tasks

  • Create a doc for explaining project structure and implementation
  • Update nested READMEs
  • Add more code comments to important files

BPE markup appearing in output in edge case

It looks like the postprocessing doesn't remove BPE in the invalid (but still possible) case that there's no token after one ending in @@.

I ran into this with this English -> Swahili pair: "I don't speak Swahili well." -> "Mimi huzungumza Kiswahili kwa ukunju@@". Screenshot below:

image
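
One possible fix is to treat a marker at the end of the string the same as a marker followed by a space. The sketch below assumes the current postprocessing only strips the "@@ " sequence (the usual subword convention); that assumption and the function name remove_bpe are not confirmed by the project code.

```python
import re

# Match the BPE continuation marker when it is followed by a space
# (the normal case) or when it sits at the very end of the string
# (the edge case reported in this issue).
_BPE_MARKER = re.compile(r"@@( |$)")


def remove_bpe(text):
    """Merge BPE sub-word units and drop a dangling trailing marker."""
    return _BPE_MARKER.sub("", text)
```

With this pattern, the reported output "Mimi huzungumza Kiswahili kwa ukunju@@" would be cleaned to end in "ukunju", while normal cases such as "uku@@ nju" are still merged into one word.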

Adding more languages to the online system

We need to have these configurations added

  • Igbo -> English
  • English -> Igbo
  • English -> Setswana
  • Hausa -> English
  • English -> Hausa
  • Yoruba -> English
  • Swahili -> English

Prepare 2nd payload for Zenodo

Description

For the models that are missing from Zenodo but are available on Masakhane Web Translate, let's create the zip payloads to be uploaded to Zenodo. Once they are on Zenodo, the links in available models should point to Zenodo.
Refactor

Tasks

Include specific tasks in the order they need to be done in. Include links to specific lines of code where the task should happen at.

  • Create the zip payloads to be uploaded to Zenodo.
  • Point the links in available models to Zenodo
  • Refactor the code to also support Zenodo-based downloads of models
  • Commit, pull request
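
The first task, creating the zip payloads, could be scripted roughly as follows. This is a sketch under assumptions: the function name build_zip_payload is hypothetical, and the layout Zenodo payloads should have for this project (which files, which folder structure) is not specified in the issue, so the code simply archives every file under a given model directory.

```python
import zipfile
from pathlib import Path


def build_zip_payload(model_dir, zip_path):
    """Pack every file under `model_dir` into a zip archive at `zip_path`.

    Paths inside the archive are stored relative to `model_dir`.
    `zip_path` should live outside `model_dir`, otherwise the archive
    would try to include itself.
    """
    model_dir = Path(model_dir)
    zip_path = Path(zip_path)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(model_dir.rglob("*")):
            if path.is_file():
                archive.write(path, arcname=path.relative_to(model_dir).as_posix())
    return zip_path
```

The upload to Zenodo itself is left out on purpose; the archives produced this way could be uploaded manually through the Zenodo web interface.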

Get to run the system

Afternoon @PeaceAz and @ntsakomtombeni

Be able to run the system on your local machine.

Run through Docker Compose

You will need to install docker on your local machine

Read the instructions in the README to get the system running locally.

If you have problems, please tag @Kabongosalomon

Documentation for the Client

FOR SOMEONE WITH KNOWLEDGE OF REACT

Description

Create Docs explaining the structure of the frontend.

Background

The frontend/client is written in React and uses Webpack.

Files

/src/client

Tasks

Include specific tasks in the order they need to be done in. Include links to specific lines of code where the task should happen at.

  • Document React project structure on Client README
  • Document Webpack integration on Client README
