zishan-rahman / dsnote

This project forked from mkiol/dsnote

Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.

License: Mozilla Public License 2.0

Shell 0.25% C++ 87.74% QML 9.40% CMake 2.61%

Speech Note

Linux desktop and Sailfish OS app for note taking, reading and translating with offline Speech to Text, Text to Speech and Machine Translation

Description

Speech Note lets you take, read and translate notes in multiple languages. It uses Speech to Text, Text to Speech and Machine Translation to do so. Text and voice processing take place entirely offline, locally on your computer, without using a network connection. Your privacy is always respected. No data is sent to the Internet.

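Speech Note uses many different processing engines to do its job. Currently these are used: DeepSpeech, Whisper, Vosk and April-ASR for Speech to Text; Piper, RHVoice, espeak, MBROLA, Coqui and Mimic3 for Text to Speech; and Bergamot for Machine Translation.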

Languages and Models

The following languages are supported:

Lang ID Name DeepSpeech (STT) Whisper (STT) Vosk (STT) April-ASR (STT) Piper (TTS) RHVoice (TTS) espeak (TTS) MBROLA (TTS) Coqui (TTS) Mimic3 (TTS) Bergamot (MT)
af Afrikaans
am Amharic ● (e)
ar Arabic
bg Bulgarian
bn Bengali
bs Bosnian
ca Catalan
cs Czech
da Danish
de German
el Greek ● (e)
en English
eo Esperanto
es Spanish
et Estonian ● (e)
eu Basque ● (e)
fa Persian
fi Finnish
fr French
ga Irish
gu Gujarati
ha Hausa
he Hebrew
hi Hindi
hr Croatian
hu Hungarian ● (e)
id Indonesian ● (e)
is Icelandic
it Italian
ja Japanese
jv Javanese
ka Georgian
kk Kazakh
ko Korean
ky Kyrgyz
la Latin
lb Luxembourgish
lt Lithuanian
lv Latvian
mk Macedonian
mn Mongolian ● (e)
ms Malay
mt Maltese
ne Nepali
nl Dutch ● (e)
no Norwegian
pl Polish
pt Portuguese ● (e)
ro Romanian ● (e)
ru Russian
sk Slovak
sl Slovenian ● (e)
sq Albanian
sr Serbian
sv Swedish
sw Swahili
te Telugu
th Thai ● (e)
tl Tagalog
tn Tswana
tr Turkish ● (e)
tt Tatar
uk Ukrainian
uz Uzbek
vi Vietnamese
yo Yoruba ● (e)
zh Chinese

(e) experimental, most likely doesn't work well
(*) Coqui TTS models are only available on x86-64

Language models can be downloaded directly from the app.

Details of models which are currently configured for download are described in models.json (GitHub) or models.json (GitLab).

Contributions

Any contribution is very welcome!

The project is hosted on both GitHub and GitLab. Feel free to open a PR/MR, report an issue or request a new feature on the platform you prefer.

Translation

Translation files in Qt format are in the translations dir (GitHub) or translations dir (GitLab).

The preferred way to contribute a translation is via the Transifex service, but direct PRs/MRs are also welcome.

Install

The Flatpak package (published on Flathub) includes almost all the dependencies needed to run every feature of the application. This includes CUDA, ROCm, Torch and Python libraries. As a result, the size of the package and the space required after installation are significant. If you don't need all the features, you can use the much smaller "Tiny" package (available on the Releases page), which contains only the basic features.

Comparison between "Flathub" and "Tiny" Flatpak packages:

Feature                  Flathub  Tiny
Coqui/DeepSpeech STT     ✔️       ✔️
Vosk STT                 ✔️       ✔️
Whisper STT              ✔️       ✔️
Whisper STT GPU          ✔️
Faster Whisper STT       ✔️
April-ASR STT            ✔️       ✔️
eSpeak TTS               ✔️       ✔️
MBROLA TTS               ✔️       ✔️
Piper TTS                ✔️       ✔️
RHVoice TTS              ✔️       ✔️
Coqui TTS                ✔️
Mimic3 TTS               ✔️
Punctuation restoration  ✔️
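
As a rough sketch of the Flathub route, the commands below install and launch the app. The application ID net.mkiol.SpeechNote is an assumption inferred from the Flatpak manifest file name used in the build section, and the flathub remote is assumed to be configured already. The run helper is hypothetical: it only prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Assumption: the Flathub application ID matches the manifest name
# (net.mkiol.SpeechNote.yaml) used in the Flatpak build section.
APP_ID="net.mkiol.SpeechNote"

# Hypothetical helper: print the command, and execute it only when DRY_RUN=0.
run() {
    echo "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

run flatpak install flathub "$APP_ID"   # install from the flathub remote
run flatpak run "$APP_ID"               # start the app
```

Running the script with DRY_RUN=0 in the environment executes the two commands instead of just printing them.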

Building from sources

Arch Linux

It is also possible to build and install the latest development (git) or latest stable (release) version from the repository using the provided PKGBUILD file (please note that the same remarks about building on Linux apply):

git clone <git repository url>

cd dsnote/arch/git      # build latest git version
# or
cd dsnote/arch/release  # build latest release version

makepkg -si

Flatpak

git clone <git repository url>

cd dsnote/flatpak

flatpak-builder --user --install-deps-from=flathub --repo="/path/to/local/flatpak/repo" "/path/to/output/dir" net.mkiol.SpeechNote.yaml

Sailfish OS

git clone <git repository url>

cd dsnote
mkdir build
cd build

sfdk config --session specfile=../sfos/harbour-dsnote.spec
sfdk config --session target=SailfishOS-4.4.0.58-aarch64
sfdk cmake ../ -DCMAKE_BUILD_TYPE=Release -DWITH_SFOS=ON -DWITH_PY=OFF
sfdk package

Linux (direct build, not recommended)

Speech Note has many build-time and run-time dependencies. These include shared and static libraries, 3rd-party executables, and Python and Perl scripts. Because of this complexity, the recommended way to build is to use the Flatpak toolchain (a Flatpak manifest file and flatpak-builder). A direct build (i.e. without Flatpak) is also possible, but it is more complicated.

git clone <git repository url>

cd dsnote
mkdir build
cd build

cmake ../ -DCMAKE_BUILD_TYPE=Release -DWITH_DESKTOP=ON
make

To build without support for Python components, add -DWITH_PY=OFF to the cmake step.
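
For illustration, the configure step without Python components might look like the sketch below. It uses only the flags named in this section; the variable name CONFIGURE_CMD is an assumption made for the example. The snippet prints the command so it can be checked before it is run.

```shell
#!/bin/sh
# Sketch: the direct-build configure command with Python components disabled.
# All flags come from the build steps in this section.
CONFIGURE_CMD="cmake ../ -DCMAKE_BUILD_TYPE=Release -DWITH_DESKTOP=ON -DWITH_PY=OFF"

# Inspect the command first.
echo "$CONFIGURE_CMD"

# To run it, execute the printed command from the build directory, then run make.
```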

To see other build options, search for option(BUILD_XXX) in the CMakeLists.txt file.
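
One way to run that search is with grep. The sketch below defines a hypothetical helper, list_build_options, which prints every option(...) declaration in a CMake file together with its line number. The helper name and the default file name CMakeLists.txt are assumptions made for the example, not part of the project.

```shell
#!/bin/sh
# Hypothetical helper: list option(...) declarations in a CMake file.
# Usage (from the repository root): list_build_options CMakeLists.txt
list_build_options() {
    # -n prints line numbers; the pattern matches lines that begin with
    # optional whitespace followed by "option(".
    grep -nE '^[[:space:]]*option\(' "${1:-CMakeLists.txt}"
}
```

Each printed line has the form line-number:declaration, so an option name found this way can be passed to cmake as -DNAME=ON or -DNAME=OFF.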

Libraries

Speech Note relies on a number of open source projects.

Reviews and demos

License

Speech Note is an open source project. Source code is released under the Mozilla Public License Version 2.0.

3rd party libraries:

The files in the directory nonbreaking_prefixes were copied from the mosesdecoder project and are distributed under the GNU Lesser General Public License v2.1.

Contributors

mkiol, lfd3v, karry, zishan-rahman, albanobattistella, dashinfantry, popanz