mozilla / firefox-translations

Firefox Translations is a webextension that enables client side translations for web browsers.

License: Mozilla Public License 2.0

JavaScript 98.14% CSS 0.25% HTML 0.86% Python 0.65% Shell 0.08% Smarty 0.03%
firefox translation webextension javascript deep-neural-networks nmt nlp

firefox-translations's Introduction

This WebExtension is NOT maintained anymore. Development of Firefox Translations has moved into Firefox itself, and the feature will be available to everyone from Firefox 108.

Please open new issues in https://bugzilla.mozilla.org/enter_bug.cgi?product=Firefox&component=Translation

Firefox Translations

Firefox Translations was a WebExtension that enabled client side in-page translations for web browsers.

Firefox Translations was developed with The Bergamot Project Consortium, coordinated by the University of Edinburgh with partners Charles University in Prague, the University of Sheffield, University of Tartu, and Mozilla. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 825303. 🇪🇺

Release version

Desktop

The current release version is available for installation on Mozilla Add-ons

AMO

Android

Follow the steps below to install the extension on Firefox Nightly or Beta for Android:

  • Follow the steps described in this article, skipping the section Create a collection on AMO (we already provide a collection here) and starting from the section Enable general extension support setting in Nightly
  • On step 5, enter 17436609 in the Collection owner field and fxt in the Collection name field
  • Your browser should restart.
  • After restarting, tap the three-dot menu and select Add-ons
  • The Add-ons page should be displayed, with Firefox Translations at the top of the list. Tap the + icon to install it
  • The add-on should now be installed in your browser. Please refer to this video on how to use the extension.
  • You can then remove the Custom Add-on collection by tapping it and clearing the fields, so that the stock add-ons are listed again.

Supported languages

Production

  • Spanish
  • Estonian
  • English
  • German
  • Czech
  • Bulgarian
  • Portuguese
  • Italian
  • French
  • Polish

Development

  • Russian
  • Persian (Farsi)
  • Icelandic
  • Norwegian Nynorsk
  • Norwegian Bokmål
  • Ukrainian
  • Dutch

Testing

Nightly builds

Desktop

You can test nightly builds of the extension in Firefox Nightly or Developer Edition in one of the supported languages by following the steps below:

  • Type about:config in the navigation bar and set the following preferences:
    xpinstall.signatures.required to false
    extensions.experiments.enabled to true
  • Then install the extension by clicking here Firefox Translations - Install Nightly
  • You may need to restart your browser and Firefox Translations will be ready to use. Just browse to a website in one of the supported languages and the option to translate should be displayed.
Demo
Firefox.Translations.v2.mov

Android

You can test the addon on Android by following the steps below:

  1. Clone this repo and execute npm install
  2. Install Firefox Nightly for Android on your phone
  3. Connect your phone to your computer via USB
  4. Follow these steps to set up your phone and browser to install the addon
  5. You might need to execute adb shell pm grant org.mozilla.fenix android.permission.READ_EXTERNAL_STORAGE in your terminal so the addon can be pushed to your phone
  6. Execute adb devices in your terminal, copy the device id, and replace the string <device id from adb devices> in package.json with it
  7. Execute npm run android -- --android-device=<ANDROID_DEVICE_ID> in your terminal to install the addon on your phone and have the browser automatically started (or npm run android-win -- --android-device=<ANDROID_DEVICE_ID> if developing on a Windows system)

That should be enough to have the addon installed in Firefox on your Android device. Follow the steps in the video below to learn how to use it.

Demo
screen-20230302-100129.mp4

Development

3rd party dependencies

The extension does not utilize any npm modules, and the only vendored dependencies within are:

  • Bergamot Translator

    • A WebAssembly wrapper around the actual Neural Machine Translator, Marian. The code to build the WASM module can be found on its repository
  • Fasttext

    • We bundle the WebAssembly port of fasttext along with its compressed model in order to detect the page's language. Instructions to build the WebAssembly module can be found here
  • Sentry

  • serialize-error

    • The code of the serialize-error npm package is bundled to serialize exceptions, so that errors can be reported from content scripts to the background script

How to run

  • Install Firefox Nightly
  • Clone this repo and run npm install
  • Run npm run once and wait until Nightly starts
  • Go to about:config and set extensions.experiments.enabled to true
  • Browse to a page in any of the supported languages to have the translation option appear

Updating telemetry schema

After adding new metrics to extension/model/telemetry/metrics.yaml or pings to extension/model/telemetry/pings.yaml, run

bash scripts/update-telemetry-schema.sh

to regenerate JS telemetry schema.

Updating bergamot-translator WASM module

Replace

  • extension/controller/translation/bergamot-translator-worker.js
  • extension/model/static/translation/bergamot-translator-worker.wasm

with the new artifacts and then execute:

bash scripts/update-bergamot-translator.sh

to regenerate JS version file. This version is reported in telemetry.

Discussions

Firefox translations channel on Matrix

firefox-translations's People

Contributors

abhi-agg, andrenatal, calixteman, dependabot[bot], emilio, eu9ene, flodolo, gitoffthelawn, jcristau, jelmervdl, kpu, lonnen, marco-c, midgleyc, migmanu, nerixyz, relud, rob--w, rpl, sylvestre, tex2002ans, veyndan

firefox-translations's Issues

I can't label issues

From @andrenatal's e-mail: "I created a label to categorize it, so please use it when doing so." However, I lack permission to add labels to GitHub issues here.

Form translation component should be always on top

Even though the Outbound/Form translation component cannot be implemented as a native panel in the chrome (since extensions are not allowed to use XUL/XPCOM), it should behave like one and resemble it as much as possible. One of the expected features is that it always stays on top.

The following webpages show that in the implementation that I tested (https://github.com/mozilla/firefox-translations/actions/runs/1458463171), it is not always the case.

https://karriere.deutschebahn.com/karriere-de/jobs/Kontaktformular-4199212
image

https://www.tajmac-zps.cz/kontakte (then click on "Schreiben Sie uns/Write to us")
image

Empty info/notification popup showing

I have Firefox Translations enabled by setting extensions.translations.disabled on Firefox Nightly 97.0a1 (2021-12-17) (64-bit) running on Pop!_OS.

Clicking the translate icon in the URL bar will bring up an empty info bar like so:
Screenshot from 2021-12-24 21-56-32

Inspecting the browser console, I see the following errors when clicking the translate button:

notif.init is not a function TranslationBrowserChromeUi.js:313
    showTranslationInfoBar moz-extension://93675596-b356-4a58-bf90-e7e021642882/experiment-apis/translateUi/TranslationBrowserChromeUi.js?cachebuster=1640342430870:313
    onClickCallback moz-extension://93675596-b356-4a58-bf90-e7e021642882/experiment-apis/translateUi/TranslationBrowserChromeUi.js?cachebuster=1640342430870:263
    PopupNotifications_fireCallback resource://gre/modules/PopupNotifications.jsm:1693
    notificationsToShow resource://gre/modules/PopupNotifications.jsm:1213
    filter self-hosted:242
    PopupNotifications_showPanel resource://gre/modules/PopupNotifications.jsm:1208
    PopupNotifications_update resource://gre/modules/PopupNotifications.jsm:1422
    PopupNotifications_reshowNotifications resource://gre/modules/PopupNotifications.jsm:1628
    PopupNotifications_onIconBoxCommand resource://gre/modules/PopupNotifications.jsm:1604
    handleEvent resource://gre/modules/PopupNotifications.jsm:802

Followed by:

TypeError: can't access property "split", n.message is null PopupNotifications.jsm:939:17
    _formatDescriptionMessage resource://gre/modules/PopupNotifications.jsm:939
    PopupNotifications_refreshPanel resource://gre/modules/PopupNotifications.jsm:987
    forEach self-hosted:208
    PopupNotifications_refreshPanel resource://gre/modules/PopupNotifications.jsm:967
    PopupNotifications_showPanel resource://gre/modules/PopupNotifications.jsm:1224
    PopupNotifications_update resource://gre/modules/PopupNotifications.jsm:1422
    PopupNotifications_reshowNotifications resource://gre/modules/PopupNotifications.jsm:1628
    PopupNotifications_onIconBoxCommand resource://gre/modules/PopupNotifications.jsm:1604
    handleEvent resource://gre/modules/PopupNotifications.jsm:802

I tried having a look around the code but don't know enough about the workings of internal browser extensions to figure out the issue!

Improved language detection

We should have a more expanded language detection heuristic in terms of content extraction and analysis to determine the page's language instead of traversing just through divs.
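
One possible direction, sketched under assumptions: walk every text node rather than only divs, skip elements that never hold user-visible prose, and cap the sample handed to the detector. The function name collectTextSample, the skipped tag list, and the 4096-character cap are hypothetical and not taken from the extension.

```javascript
// Hypothetical sketch: gather a text sample from all text nodes, not only <div>s.
// Works on real DOM nodes or on plain objects exposing the same properties.
const SKIPPED_TAGS = new Set(["SCRIPT", "STYLE", "NOSCRIPT", "TEMPLATE"]);
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

function collectTextSample(root, maxChars = 4096) {
  const parts = [];
  let length = 0;
  const stack = [root];
  while (stack.length > 0 && length < maxChars) {
    const node = stack.pop();
    if (node.nodeType === TEXT_NODE) {
      const text = (node.nodeValue || "").trim();
      if (text) {
        parts.push(text);
        length += text.length + 1; // +1 for the joining space
      }
    } else if (node.nodeType === ELEMENT_NODE && !SKIPPED_TAGS.has(node.nodeName)) {
      // Push children in reverse so they are visited in document order.
      const children = Array.from(node.childNodes || []);
      for (let i = children.length - 1; i >= 0; i--) {
        stack.push(children[i]);
      }
    }
  }
  return parts.join(" ").slice(0, maxChars);
}
```

In a content script this could be called as collectTextSample(document.body) and the result passed to the language detector; visibility checks and weighting by element type are left out of the sketch.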

Stabilize and polish Outbound translation UI

  • Investigate why it takes a long time to load and inject the form widget into the page.
  • The form widget should always have the topmost z-index. Some STRs here: #16
  • Polished UI with some usage instructions
  • Translations are broken in the latest messaging batched mechanism
  • Add dynamic scrolling to the backtranslation textarea
  • Display the widget when input texts are selected besides only textareas
  • Listen to mutations that add new editable forms to the DOM after the first scan (#12). Basically, unhook from the in-page mutation listener and give OT its own, looking only for what's pertinent.
  • Outbound Translation messages to the translation worker should have top priority.
  • Error when widget loses focus: #57
  • Rebase
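
The priority requirement in the list above can be sketched as a two-lane queue in which outbound (form) messages are always dequeued before in-page ones. The class name TranslationQueue and the "outbound" type tag are assumptions made for illustration, not the extension's actual message format.

```javascript
// Hypothetical sketch: outbound translation messages jump ahead of in-page ones.
class TranslationQueue {
  constructor() {
    this.outbound = []; // high priority lane
    this.inpage = [];   // everything else
  }

  enqueue(message) {
    const lane = message.type === "outbound" ? this.outbound : this.inpage;
    lane.push(message);
  }

  // Returns the next message to send to the worker, or undefined when empty.
  dequeue() {
    if (this.outbound.length > 0) {
      return this.outbound.shift();
    }
    return this.inpage.shift();
  }

  get size() {
    return this.outbound.length + this.inpage.length;
  }
}
```

Within each lane the order stays first-in, first-out, so a user typing in a form would not wait behind a backlog of page content.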

New translation request required per page

Moving from one page to another (e.g. following a link on a translated page) requires a translation to be requested again from the infobar. It's reasonable to assume that if a user translated a page, and a successive page is in the same language or is in the same domain, they'd want to see the next page translated too. It may be worth adding an "always translate" function, or translating pages that follow on from previously translated pages.
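
The rule proposed above could look roughly like the sketch below. The function name shouldAutoTranslate and the shape of the page records (translated, url, language) are assumptions for illustration only.

```javascript
// Hypothetical sketch: translate the next page automatically when the previous
// page was translated and the next one shares its language or its domain.
function shouldAutoTranslate(previousPage, nextPage) {
  if (!previousPage || !previousPage.translated) {
    return false;
  }
  const sameLanguage = previousPage.language === nextPage.language;
  const sameDomain =
    new URL(previousPage.url).hostname === new URL(nextPage.url).hostname;
  return sameLanguage || sameDomain;
}
```

Whether "same domain" should compare full hostnames, as here, or registrable domains is a design choice the issue leaves open.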

Firefox crashing when running on a manually built gecko-dev

I'm getting the following crash when running the extension in a custom-built Firefox from gecko-dev on OSX. No issues on Linux.

Assertion failure: !loadingPrincipal || loadingPrincipal->GetIsNullPrincipal() || principal->GetIsNullPrincipal() || loadingPrincipal->Subsumes(principal), at /Users/anatal/projects/mozilla/gecko/dom/workers/ScriptLoader.cpp:1223
#01: mozilla::dom::(anonymous namespace)::LoaderListener::OnStreamComplete(nsIStreamLoader*, nsISupports*, nsresult, unsigned int, unsigned char const*)[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0x3f6671f]
#02: mozilla::net::nsStreamLoader::OnStopRequest(nsIRequest*, nsresult)[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0x53efb4]
#03: nsBaseChannel::OnStopRequest(nsIRequest*, nsresult)[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0x4a3aae]
#04: non-virtual thunk to nsBaseChannel::OnStopRequest(nsIRequest*, nsresult)[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0x4a3ba0]
#05: nsJARChannel::OnStopRequest(nsIRequest*, nsresult)[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0x114357c]
#06: non-virtual thunk to nsJARChannel::OnStopRequest(nsIRequest*, nsresult)[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0x114646d]
#07: nsInputStreamPump::OnStateStop()[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0x4cdef1]
#08: nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream*)[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0x4cd44f]
#09: non-virtual thunk to nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream*)[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0x4ce21d]
#10: nsInputStreamReadyEvent::Run()[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0x2acaa9]
#11: mozilla::RunnableTask::Run()[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0x31518b]
#12: mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&)[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0x2ef365]
#13: mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&)[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0x2ede4a]
#14: mozilla::TaskController::ProcessPendingMTTask(bool)[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0x2ee11e]
#15: mozilla::detail::RunnableFunction<mozilla::TaskController::InitializeInternal()::$_0>::Run()[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0x31a4c7]
#16: nsThread::ProcessNextEvent(bool, bool*)[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0x302dbd]
#17: NS_ProcessNextEvent(nsIThread*, bool)[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0x30a0ac]
#18: mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*)[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0xb97e61]
#19: MessageLoop::Run()[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0xb101e4]
#20: nsBaseAppShell::Run()[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0x439b509]
#21: nsAppShell::Run()[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0x441f7e2]
#22: XRE_RunAppShell()[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0x5f013f4]
#23: mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*)[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0xb98901]
#24: MessageLoop::Run()[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0xb101e4]
#25: XRE_InitChildProcess(int, char**, XREChildData const*)[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/toolkit/library/build/XUL +0x5f00e2d]
#26: main[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin21.2.0/dist/NightlyDebug.app/Contents/MacOS/plugin-container.app/Contents/MacOS/plugin-container +0x3f43]

Outbound translation dialogues should be present at the bottom of the screen when activated, regardless of scrolling

The outbound translation (aka form filling) feature is unusable if one has to scroll to the bottom of the page for each form element. The UI specification currently is two text areas at the bottom of the viewport (not page), one for typing and one for feedback. Ideally with a draggable and closeable pane. Something like the Firefox Web Developer tools.

Apparently WebExtensions can't access XUL so the textareas are injected to the HTML page by the extension. (Aside: is there a security problem/resource abuse issue with this if the page can write to the textareas?) Injecting notional Firefox UI elements (at least from the user perspective, but not from a XUL perspective) into the page seems error prone, but doable in most cases.

It should be possible to use position: fixed and z-order to create a bottom pane within the HTML page.
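
A minimal sketch of that idea, assuming the pane is appended directly to the page body. The function name createBottomPane and the default height are hypothetical. Note that a high z-index only wins within its own stacking context, which is one reason to attach the pane to body rather than to a nested element.

```javascript
// Hypothetical sketch: a pane pinned to the bottom of the viewport (not the page).
function createBottomPane(doc, heightPx = 160) {
  const pane = doc.createElement("div");
  Object.assign(pane.style, {
    position: "fixed", // stays in place while the page scrolls
    left: "0",
    right: "0",
    bottom: "0",
    height: `${heightPx}px`,
    zIndex: "2147483647", // 2^31 - 1, a commonly used "always on top" value
  });
  doc.body.appendChild(pane);
  return pane;
}
```

In a content script this would be called as createBottomPane(document); the two text areas, drag handle, and close button described above would then be added as children of the returned element.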

Example of roughly what I'm looking for from the Guardian's cookie warning:
guardian
Except without the veil over the top of the page and without preventing me from seeing the footer of the page.

Error when outbound translation widget loses focus

Firefox stderr: JavaScript error: moz-extension://b44b0730-9e1c-41b1-8f15-5215e602e4ff/view/js/OutboundTranslation.js, line 82: NotFoundError: Node.removeChild: The node to be removed is not a child of this node
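
Node.removeChild throws NotFoundError whenever the node passed in is not currently a child of the parent, for instance if a focus handler runs after the widget was already detached (that trigger is a guess, not confirmed by the report). A defensive sketch, with the hypothetical helper name safeRemove:

```javascript
// Hypothetical sketch: only remove the node when it is still attached to this parent.
function safeRemove(parent, child) {
  if (child && child.parentNode === parent) {
    parent.removeChild(child);
    return true;
  }
  return false; // already detached, or attached elsewhere; nothing to do
}
```

The guard hides the symptom only; if the handler is firing twice, the underlying listener bookkeeping would still need a look.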

Set versions in telemetry

  • record BERGAMOT_VERSION_FULL from bergamot-translator-worker.js
  • introduce and record extensionBuildId, like in the legacy extension:
extensionBuildId: `${process.env.VERSION}-${process.env.extensionBuildEnvironment}#${process.env.BRANCH}`

Collect telemetry metrics "words in viewport"

This should be the initial number of words in the viewport, before scrolling. I guess the idea was to estimate how soon a user will see the translated text. Not sure how critical this metric is for us.
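
A rough sketch of how such a count might be computed. The function names and the block record shape (text, top, bottom) are assumptions; whitespace splitting is also only an approximation and does not suit languages written without spaces.

```javascript
// Hypothetical sketch: count words in text blocks that intersect the initial viewport.
function countWords(text) {
  const trimmed = text.trim();
  return trimmed ? trimmed.split(/\s+/).length : 0;
}

// blocks: [{ text, top, bottom }], with top/bottom relative to the viewport.
function countWordsInViewport(blocks, viewportHeight) {
  return blocks
    .filter((block) => block.bottom > 0 && block.top < viewportHeight)
    .reduce((sum, block) => sum + countWords(block.text), 0);
}
```

In a content script the block records could be built from each element's getBoundingClientRect() and the viewport height taken from window.innerHeight, measured once before the user scrolls.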

[META] Addon signing

Actionable items:

Translating non-German webpages to German doesn't work

Steps to reproduce:

  1. Download Firefox Nightly (German version)
  2. Start extension (latest main branch)
  3. Navigate to page: https://andrenatal.com/translations-playground/?lang=pt
  4. Infobar correctly displays the page language
  5. Click on Übersetzen
  6. The content is not translated

Replacing pt with es, et, en, it, ru and cs in the link https://andrenatal.com/translations-playground/?lang=pt gives the same behavior, with the following error message in the web console:

Error: We did not find an alpha in the model named: F0::Wemb_QuantMultA. bergamot-translator-worker.js:1217:12
Error: Aborted from auto marian::cpu::integer::fetchAlphaFromModelNodeOp::forwardOps()::(anonymous class)::operator()() const in /root/checkout/3rd_party/marian-dev/src/tensors/cpu/intgemm_interface.h:583 bergamot-translator-worker.js:1217:12
Callstacks not supported in WASM builds currently bergamot-translator-worker.js:1217:12
undefined bergamot-translator-worker.js:649:9
Translation error:  RuntimeError: abort(undefined). Build with -s ASSERTIONS=1 for more info. translationWorker.js:120:37
RuntimeError: abort(undefined). Build with -s ASSERTIONS=1 for more info. bergamot-translator-worker.js:653:14

`We did not find an alpha in the model named: F0::Wemb_QuantMultA.` when translating from `pt` to `de` on outbound translations.

I'm getting the crash below [2] in the model/engine when translating from pt to de using outbound translations (although this seems to be a generalized issue). It's not happening from pt to the other languages, but it might be happening with different combinations. There's a screen recording of the error at [1].

STR:
1 - Download Nightly pt-br
2 - Load the extension from master
3 - Navigate to: http://andrenatal.github.io/translations-playground
4 - Choose de
5 - Click on Traduzir
6 - Click on Enable translation of form Yes
7 - Click on the textarea
8 - Type something in the textarea within the outbound translations widget

[1] https://www.dropbox.com/s/7vw5vhykcnlnwjv/ptde_crash.mov?dl=1

[2]

Using fallback gemm implementation bergamot-translator-worker.js:6245:17
Wasm Runtime initialized Successfully (preRun -> onRuntimeInitialized) in 0.011 secs translationWorker.js:67:29
Creating Translation Service with config: [object Object] translationWorker.js:230:25
Translation Service created successfully translationWorker.js:232:25
Constructing model 'dept' via pivoting: 'deen' and 'enpt' translationWorker.js:252:25
Total Download time for all files of 'enpt': 0.185 secs translationWorker.js:321:21
Constructing Aligned memory. Size: 17140836 bytes, Alignment: 256 translationWorker.js:481:21
Aligned memory construction done translationWorker.js:483:21
Aligned memory initialized translationWorker.js:486:21
Constructing Aligned memory. Size: 4472528 bytes, Alignment: 64 translationWorker.js:481:21
Aligned memory construction done translationWorker.js:483:21
Aligned memory initialized translationWorker.js:486:21
Constructing Aligned memory. Size: 812781 bytes, Alignment: 64 translationWorker.js:481:21
Aligned memory construction done translationWorker.js:483:21
Aligned memory initialized translationWorker.js:486:21
Aligned vocab memory1 size: 812781 translationWorker.js:343:23
Aligned model memory size: 17140836 translationWorker.js:345:21
Aligned shortlist memory size: 4472528 translationWorker.js:346:21
Translation Model config: 
            beam-size: 1
            normalize: 1.0
            word-penalty: 0
            max-length-break: 128
            mini-batch-words: 1024
            workspace: 128
            max-length-factor: 2.0
            skip-cost: true
            cpu-threads: 0
            quiet: true
            quiet-translation: true
            gemm-precision: int8shiftAlphaAll
            translationWorker.js:347:21
[2022-01-27 13:15:45] [data] Loading SentencePiece vocabulary from buffer bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:46] Missing list of protected prefixes for sentence splitting. Set with --ssplit-prefix-file. bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:46] [memory] Extending reserved space to 128 MB (device cpu0) bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:46] Loaded model config bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:46] Loading scorer of type transformer as feature F0 bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:46] Memory mapping model at 0x682a00 bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:46] [memory] Reserving 31 MB, device cpu0 bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:46] [memory] Reserving 8 MB, device cpu0 bergamot-translator-worker.js:1217:12
Total Download time for all files of 'deen': 0.803 secs translationWorker.js:321:21
Constructing Aligned memory. Size: 17140837 bytes, Alignment: 256 translationWorker.js:481:21
Aligned memory construction done translationWorker.js:483:21
Aligned memory initialized translationWorker.js:486:21
Constructing Aligned memory. Size: 5047568 bytes, Alignment: 64 translationWorker.js:481:21
Aligned memory construction done translationWorker.js:483:21
Aligned memory initialized translationWorker.js:486:21
Constructing Aligned memory. Size: 784269 bytes, Alignment: 64 translationWorker.js:481:21
Aligned memory construction done translationWorker.js:483:21
Aligned memory initialized translationWorker.js:486:21
Aligned vocab memory1 size: 784269 translationWorker.js:343:23
Aligned model memory size: 17140837 translationWorker.js:345:21
Aligned shortlist memory size: 5047568 translationWorker.js:346:21
Translation Model config: 
            beam-size: 1
            normalize: 1.0
            word-penalty: 0
            max-length-break: 128
            mini-batch-words: 1024
            workspace: 128
            max-length-factor: 2.0
            skip-cost: true
            cpu-threads: 0
            quiet: true
            quiet-translation: true
            gemm-precision: int8shiftAlphaAll
            translationWorker.js:347:21
[2022-01-27 13:15:46] [data] Loading SentencePiece vocabulary from buffer bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:46] Missing list of protected prefixes for sentence splitting. Set with --ssplit-prefix-file. bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:46] [memory] Extending reserved space to 128 MB (device cpu0) bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:46] Loaded model config bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:46] Loading scorer of type transformer as feature F0 bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:46] Memory mapping model at 0xa89bf00 bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:47] [memory] Reserving 31 MB, device cpu0 bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:47] [memory] Reserving 8 MB, device cpu0 bergamot-translator-worker.js:1217:12
Model 'dept' successfully constructed. Time taken: 1.273 secs translationWorker.js:201:23
loadLanguageModel function complete translationWorker.js:223:21
Constructing model 'ptde' via pivoting: 'pten' and 'ende' translationWorker.js:252:25
Total Download time for all files of 'ende': 0.144 secs translationWorker.js:321:21
Constructing Aligned memory. Size: 17140498 bytes, Alignment: 256 translationWorker.js:481:21
Aligned memory construction done translationWorker.js:483:21
Aligned memory initialized translationWorker.js:486:21
Constructing Aligned memory. Size: 3062492 bytes, Alignment: 64 translationWorker.js:481:21
Aligned memory construction done translationWorker.js:483:21
Aligned memory initialized translationWorker.js:486:21
Constructing Aligned memory. Size: 797501 bytes, Alignment: 64 translationWorker.js:481:21
Aligned memory construction done translationWorker.js:483:21
Aligned memory initialized translationWorker.js:486:21
Aligned vocab memory1 size: 797501 translationWorker.js:343:23
Aligned model memory size: 17140498 translationWorker.js:345:21
Aligned shortlist memory size: 3062492 translationWorker.js:346:21
Translation Model config: 
            beam-size: 1
            normalize: 1.0
            word-penalty: 0
            max-length-break: 128
            mini-batch-words: 1024
            workspace: 128
            max-length-factor: 2.0
            skip-cost: true
            cpu-threads: 0
            quiet: true
            quiet-translation: true
            gemm-precision: int8shiftAlphaAll
            translationWorker.js:347:21
[2022-01-27 13:15:47] [data] Loading SentencePiece vocabulary from buffer bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:47] Missing list of protected prefixes for sentence splitting. Set with --ssplit-prefix-file. bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:47] [memory] Extending reserved space to 128 MB (device cpu0) bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:48] Loaded model config bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:48] Loading scorer of type transformer as feature F0 bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:48] Memory mapping model at 0xd8f0f00 bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:48] [memory] Reserving 31 MB, device cpu0 bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:48] [memory] Reserving 8 MB, device cpu0 bergamot-translator-worker.js:1217:12
Total Download time for all files of 'pten': 0.454 secs translationWorker.js:321:21
Constructing Aligned memory. Size: 17140836 bytes, Alignment: 256 translationWorker.js:481:21
Aligned memory construction done translationWorker.js:483:21
Aligned memory initialized translationWorker.js:486:21
Constructing Aligned memory. Size: 5001420 bytes, Alignment: 64 translationWorker.js:481:21
Aligned memory construction done translationWorker.js:483:21
Aligned memory initialized translationWorker.js:486:21
Constructing Aligned memory. Size: 812889 bytes, Alignment: 64 translationWorker.js:481:21
Aligned memory construction done translationWorker.js:483:21
Aligned memory initialized translationWorker.js:486:21
Aligned vocab memory1 size: 812889 translationWorker.js:343:23
Aligned model memory size: 17140836 translationWorker.js:345:21
Aligned shortlist memory size: 5001420 translationWorker.js:346:21
Translation Model config: 
            beam-size: 1
            normalize: 1.0
            word-penalty: 0
            max-length-break: 128
            mini-batch-words: 1024
            workspace: 128
            max-length-factor: 2.0
            skip-cost: true
            cpu-threads: 0
            quiet: true
            quiet-translation: true
            gemm-precision: int8shiftAlphaAll
            translationWorker.js:347:21
[2022-01-27 13:15:48] [data] Loading SentencePiece vocabulary from buffer bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:48] Missing list of protected prefixes for sentence splitting. Set with --ssplit-prefix-file. bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:48] [memory] Extending reserved space to 128 MB (device cpu0) bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:48] Loaded model config bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:48] Loading scorer of type transformer as feature F0 bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:48] Memory mapping model at 0x1c560600 bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:48] [memory] Reserving 31 MB, device cpu0 bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:48] [memory] Reserving 8 MB, device cpu0 bergamot-translator-worker.js:1217:12
Outbound Model 'ptde' successfully constructed. Time taken: 0.791 secs translationWorker.js:175:25
[2022-01-27 13:15:54] Error: We did not find an alpha in the model named: F0::Wemb_QuantMultA. bergamot-translator-worker.js:1217:12
[2022-01-27 13:15:54] Error: Aborted from auto marian::cpu::integer::fetchAlphaFromModelNodeOp::forwardOps()::(anonymous class)::operator()() const in /root/checkout/3rd_party/marian-dev/src/tensors/cpu/intgemm_interface.h:583 bergamot-translator-worker.js:1217:12
Callstacks not supported in WASM builds currently bergamot-translator-worker.js:1217:12
undefined bergamot-translator-worker.js:649:9
Translation error:  RuntimeError: abort(undefined). Build with -s ASSERTIONS=1 for more info. translationWorker.js:117:37
RuntimeError: abort(undefined). Build with -s ASSERTIONS=1 for more info.
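
The log above shows the worker building 'ptde' from 'pten' and 'ende', and 'dept' from 'deen' and 'enpt'. The naming pattern can be sketched as follows; the function name modelsForPair is hypothetical, and English as the pivot is inferred from the log rather than from the extension's code.

```javascript
// Hypothetical sketch: which model names are needed for a language pair,
// pivoting through English when neither side is English.
function modelsForPair(from, to, pivot = "en") {
  if (from === to) {
    return []; // nothing to translate
  }
  if (from === pivot || to === pivot) {
    return [`${from}${to}`]; // direct model
  }
  return [`${from}${pivot}`, `${pivot}${to}`];
}
```

For the failing case this yields pten followed by ende, so the abort could come from either of the two models; which one lacks the expected quantization alpha is not established by the log.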

Translation bar not appearing on pages when URL is typed / pasted

Peek 2022-02-03 19-59

  1. Use 25503dc with directions in the README.md and an en-US speaking nightly
  2. Paste a Russian URL into the bar (I used Linux middle click) say https://ru.wikipedia.org/wiki/%D0%A1%D0%B0%D0%BC%D1%87%D0%B5%D0%BD%D0%BA%D0%BE,_%D0%93%D0%B5%D0%BE%D1%80%D0%B3%D0%B8%D0%B9_%D0%94%D0%BC%D0%B8%D1%82%D1%80%D0%B8%D0%B5%D0%B2%D0%B8%D1%87
  3. Hit enter.
  4. Page renders but no bar appears.
  5. Go to another tab, type es.wikipedia.org and again no bar appears
  6. Click a link and observe that the extension is active.
  7. Return to that original Russian URL and hit enter in the URL field. The bar appears.

Use HTML translation feature including inline tag movement

Currently the code sends text snippets for translation:

submitTranslation(node, key) {
    if (this.messagesSent.has(key)) {
        // if we already sent this message, we just skip it
        return;
    }
    const text = node.textContent;
    if (text.trim().length) {
        /*
         * send the content back to mediator in order to have the translation
         * requested by it
         */
        const payload = {
            text,
            type: "inpage",
            attrId: [
                this.processingNodeMap,
                key
            ],
        };
        this.notifyMediator("translate", payload);
        this.messagesSent.add(key);
    }
}

This leads to very poor translation quality because the system does not have sentence context. Even with sentence context, translating each text span in place prevents reordering, which makes this an impossible translation problem. For example, chien translates to dog. In this HTML, what is the translation of h?
<span id="0">c</span><span id="1">h</span><span id="2">i</span><span id="3">e</span><span id="4">n</span>

Since block elements are sentence-breaking, individual block elements can be sent for translation using their innerHTML. The HTML parser also knows to break sentences at block boundaries, so larger elements can also be sent in. It does assume well-formed HTML, though; Firefox is better at fixing HTML, and relying on it keeps rendering consistent with how the engine perceives tags. Well-formed means that tags opened inside a block of text also close inside the same block; #23 is a blocker.
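
A minimal sketch of what building a payload from a block element's markup, rather than from its text content, might look like. The function name `buildHtmlPayload`, the `isHtml` flag, the flat `key` field, and the short list of block tags are all assumptions made for illustration; they are not taken from the extension's code. The element only needs the standard DOM `tagName` and `innerHTML` properties.

```javascript
// Hypothetical, incomplete subset of sentence-breaking (block-level) tags.
const BLOCK_TAGS = new Set(["P", "DIV", "LI", "H1", "H2", "H3", "TD", "BLOCKQUOTE"]);

/*
 * Build a translation payload from a block element's innerHTML.
 * Returns null when the element is not in BLOCK_TAGS or has no visible text.
 */
function buildHtmlPayload(element, key) {
    if (!BLOCK_TAGS.has(element.tagName)) {
        return null;
    }
    const html = element.innerHTML;
    // Drop the tags only to check whether any text is left to translate.
    const visibleText = html.replace(/<[^>]*>/g, "");
    if (!visibleText.trim().length) {
        return null;
    }
    return {
        text: html,     // inline tags stay in place so the engine can move them
        isHtml: true,   // assumed flag name telling the engine to parse HTML
        type: "inpage",
        key,
    };
}

// Usage with a stand-in object; in a page this would be a real DOM element.
const examplePayload = buildHtmlPayload(
    { tagName: "P", innerHTML: "Le <b>chien</b> dort." },
    7
);
```

Sending the whole block keeps the sentence together, so the engine is free to reorder words and reposition the inline tags in its output.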

#51 is a partial blocker. Specifically this part needs to be fixed first:

Even if HTML was being submitted, it would not be properly used (and cause an abort()) because the model doesn't produce alignment information. In the model configuration yaml, the line alignment: soft is missing.

Once that is fixed, HTML processing coming out of the engine should be consistent with https://translate.ikhoefgeen.nl/ .

Quality issues with HTML processing should be raised on https://github.com/browsermt/bergamot-translator

non-HTML input is treated as HTML

It looks like all input is marked as HTML for the translator, even though the nodes' text content is what gets submitted. If a node contains something like <p>Hello &lt; world</p>, it would submit Hello < world, which is invalid input when parsed as HTML and would cause an exception.
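
One way to avoid this, sketched here under the assumption that the input stays flagged as HTML, is to re-escape the text content before submitting it. `escapeHtml` is a hypothetical helper, not a function from the extension.

```javascript
/*
 * Escape the characters that are significant to an HTML parser.
 * "&" is replaced first so that the entities produced for "<" and ">"
 * are not escaped a second time.
 */
function escapeHtml(text) {
    return text
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;");
}

// node.textContent of <p>Hello &lt; world</p> is "Hello < world"
const safeInput = escapeHtml("Hello < world"); // "Hello &lt; world"
```

The alternative is to leave the text unescaped and mark it as plain text instead, if the engine interface allows that.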

Even if HTML was being submitted, it would not be properly used (and cause an abort()) because the model doesn't produce alignment information. In the model configuration yaml, the line alignment: soft is missing.
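
A quick check for the missing line could look like the sketch below. It assumes the model configuration is available as a YAML string; `hasSoftAlignment` is a hypothetical helper, and the `beam-size` key in the usage lines is only a placeholder for whatever else the config contains.

```javascript
/*
 * Return true when the YAML text contains an uncommented
 * "alignment: soft" line. This is a plain line match, not a YAML parser.
 */
function hasSoftAlignment(configYaml) {
    return /^\s*alignment:\s*soft\s*$/m.test(configYaml);
}

const withAlignment = hasSoftAlignment("beam-size: 1\nalignment: soft\n");
const withoutAlignment = hasSoftAlignment("beam-size: 1\n");
```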

Form translation dialogues hidden behind page's modal veil

On booking.com when trying to use "Form translation", two text fields appear behind the grey overlay that booking.com uses to give the impression of a modal dialogue.

  1. Install current extension
  2. https://www.booking.com/hotel/gb/park-plaza-westminster-bridge.es.html
  3. Click translate
  4. Click Yes to enable translations of forms (and wait an unusually long time for "Loading form translation")
  5. Click "Ask a question." lower down in the page to open a modal dialogue.
  6. Click on the textarea.

[meta] Implement Basic Quality Estimation

We need to be able to enable the quality estimation functionality by switching a pref. This same pref should also disable the functionality that causes a performance penalty in the engine, and not only the UI feature.

List of tasks:

  • #132
  • Modify extension to encode plain text and send it as HTML to the engine whenever QE feature is on
    • This is required because the engine always expects HTML input in order to return QE scores (as per this)
    • The translation doesn't need to be decoded back, as the engine always returns the translation as HTML whenever QE is on, and this output can be used directly to show colors
  • #133
  • #179
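
The pref-gated encoding task could be sketched roughly as follows. The function name `prepareEngineInput`, the `qeEnabled` argument (standing in for the pref value), and the `isHtml` field are assumptions for illustration only.

```javascript
/*
 * Decide how plain text is handed to the engine.
 * With QE off the text goes through untouched, so no HTML handling is paid for.
 * With QE on the text is entity-encoded and flagged as HTML.
 */
function prepareEngineInput(text, qeEnabled) {
    if (!qeEnabled) {
        return { text, isHtml: false };
    }
    const entities = { "&": "&amp;", "<": "&lt;", ">": "&gt;" };
    const encoded = text.replace(/[&<>]/g, (ch) => entities[ch]);
    return { text: encoded, isHtml: true };
}

const qeOn = prepareEngineInput("1 < 2", true);   // { text: "1 &lt; 2", isHtml: true }
const qeOff = prepareEngineInput("1 < 2", false); // { text: "1 < 2", isHtml: false }
```

Gating on a single flag keeps the engine-side cost and the UI feature tied to the same switch.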
