[Schema] L10n proposal about browser-compat-data HOT 13 CLOSED

mdn commented on April 28, 2024 3

[Schema] L10n proposal

from browser-compat-data.

Comments (13)

SebastianZ commented on April 28, 2024 1

[{"en-US": "text1"}, {"en-US":"text2"}] sounds reasonable, and we're already using this schema in other places like l10n/css.json, but you're right that it's hard for translators to know when they have to update their translation.

One idea to get rid of this issue is to add some kind of number versioning to the strings, e.g.

[
  {
    "en-US": {
      "version": 3,
      "text": "In Internet Explorer 8 and 9, there is a bug where a computed <code>background-color</code> of <code>transparent</code> causes <code>click</code> events to not get fired on overlaid elements."
    }
  }
]
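
A minimal sketch of how such version numbers could be used to spot outdated translations. It assumes something the proposal above does not spell out: that every translation also records, in its own "version" field, the en-US version it was translated from. The function name `stale_locales` and the sample strings are hypothetical.

```python
def stale_locales(note, source="en-US"):
    """Return the locales whose recorded version lags behind the source string."""
    current = note[source]["version"]
    return sorted(
        locale
        for locale, entry in note.items()
        if locale != source and entry["version"] < current
    )


# Hypothetical note: "de" was translated from version 2 of the English text.
note = {
    "en-US": {"version": 3, "text": "text1"},
    "de": {"version": 2, "text": "Text eins"},
    "fr": {"version": 3, "text": "texte un"},
}
print(stale_locales(note))  # prints ['de']
```

The cost of this design is the one raised later in the thread: every string carries an extra field, and whoever edits the English text has to remember to bump the number.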

Though this may be overkill. A simpler solution would be to note in the commit message which language(s) were changed, e.g. "en-US: Clarified compatibility note of 'background-color' CSS property for IE". Then translators could filter the commit messages by "en-US" to see what has changed since the last edit in their language.
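
As a rough illustration of the commit-message idea, the filter could be as small as the sketch below. The prefix convention comes from the example above; the function name `source_changes` and the sample subject lines are made up for illustration. In practice the subjects could come from something like `git log --format=%s`.

```python
def source_changes(subjects, prefix="en-US:"):
    """Keep only the commit subjects flagged as changing the English source text."""
    return [s for s in subjects if s.startswith(prefix)]


# Hypothetical commit subjects, newest first.
subjects = [
    "en-US: Clarified compatibility note of 'background-color' CSS property for IE",
    "de: Translated notes of 'background-color' CSS property",
    "Fixed version number for 'border-radius'",
]
for subject in source_changes(subjects):
    print(subject)
```

This only works if every contributor follows the prefix convention, which is not something the data files themselves can enforce.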

Sebastian


jwhitlock commented on April 28, 2024 1

If I had time to work on this, I would:

Define all source strings as English in the spec, as well as which data elements are plain text, HTML, etc. Avoiding HTML is a good idea, but being clear about it is necessary if you can't avoid it.
Create a script to extract strings into the standard gettext format
Manage translation using gettext conventions, like Kuma, perhaps even using Pontoon to translate the strings. With gettext, you get fuzzy translations, notifications of changed strings, and so on for free.
Create a second script to export gettext-formatted files to a JSON data structure.
Implement a gettext-like translation in KumaScript (I'm pretty sure this is already done, and multiple times).

I think using existing gettext standards will be less painful than building dashboards, versioning strings, etc.
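
The extraction step could look roughly like the sketch below. It assumes notes are plain English strings stored under a "notes" key, as in the schema snippet from the original proposal; the helper names (`po_quote`, `collect_notes`, `to_pot`) and the sample data are hypothetical. It also leaves out the header entry that a complete .pot template would carry.

```python
import json


def po_quote(text):
    """Quote a string for a .po file (JSON escaping covers the common cases)."""
    return json.dumps(text, ensure_ascii=False)


def collect_notes(node, path=()):
    """Yield (path, note) for every string found under a "notes" key."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "notes":
                notes = [value] if isinstance(value, str) else value
                for note in notes:
                    yield path + (key,), note
            else:
                yield from collect_notes(value, path + (key,))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from collect_notes(item, path + (str(index),))


def to_pot(data):
    """Render one msgid entry per distinct note, with its locations as '#:' comments."""
    locations = {}
    for path, note in collect_notes(data):
        locations.setdefault(note, []).append(".".join(path))
    entries = []
    for note, paths in locations.items():
        refs = "".join(f"#: {p}\n" for p in paths)
        entries.append(f'{refs}msgid {po_quote(note)}\nmsgstr ""\n')
    return "\n".join(entries)


# Hypothetical data shaped like the schema snippet in this thread.
data = {
    "background-color": {
        "__compat": {
            "Internet Explorer": {"support": "4.0", "notes": ["text1", "text2"]}
        }
    }
}
print(to_pot(data))
```

Identical notes are merged into a single entry so that a translator only sees each string once, which is how gettext catalogs usually behave.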


queengooborg commented on April 28, 2024 1

This issue has been sitting around for a long time and is one of the oldest issues we have open. Unfortunately, localizing the notes in BCD has not been discussed or even mentioned for quite a while, and I don't think it will be a priority for us any time soon. As such, I'm going to close this issue, but I am happy to revisit it in the future!


meisamkhengul commented on April 28, 2024 1

Hi!

Our current proposal for the schema has notes: these are textual comments.

E.g.
...
  "__compat": {
    ...,
      "Internet Explorer": {
        "support": "4.0",
        "notes": ["In Internet Explorer 8 and 9, there is a bug where a computed <code>background-color</code> of <code>transparent</code> causes <code>click</code> events to not get fired on overlaid elements."]

There may be several notes (hence the []).

How do we want to translate them? We would like something simple, that is, something that doesn't force us to build anything outside GitHub.

One way could be to have an object. Instead of: ["text1", "text2"] we would have: [{"en-US": "text1"}, {"en-US":"text2"}]

This would allow translated strings to be stored from the start and would let macros use them easily. It would not make maintenance easy, though: if the en-US text changes, there is no easy way for a translator to know about it (besides watching the file), and there is no way of knowing whether a translation is up to date.

This is a basic proposal. Does anybody have a better idea?

Best writer


meisamkhengul commented on April 28, 2024 1

Special thanks to GitHub.


Elchi3 commented on April 28, 2024

I also think that [{"en-US": "text1", "de": "Text eins"}, {"en-US":"text2", "de": "Text zwei" }] makes sense.

To help localizers, I would build an external dashboard that does the checks (it might be like doc status pages or completely on its own). I think there are two things:
a) The language key is not present. So if you are a French localizer and an object is {"en-US": "text1", "de": "Text eins"} this will show up as untranslated for you.

b) There is an update to the English string. In this case I think the person who updates the English string should invalidate the localizations. So, for example {"en-US": "text42", "de": "#NEEDSUPDATE# Text eins"}. This would then show up in the German dashboard as "update needed". And when rendering the data, it could fall back to English, as the translation is invalid.
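
The two checks could be sketched as follows. The "#NEEDSUPDATE#" marker and the English fallback are taken from the description above; the function names `status` and `render` and the status labels are only illustrative.

```python
NEEDS_UPDATE = "#NEEDSUPDATE#"


def status(note, locale):
    """Classify one note object for a given locale, as a dashboard might."""
    text = note.get(locale)
    if text is None:
        return "untranslated"      # case a): the language key is missing
    if text.startswith(NEEDS_UPDATE):
        return "update needed"     # case b): the translation was invalidated
    return "ok"


def render(note, locale, fallback="en-US"):
    """Return the text to display, falling back to English unless the status is ok."""
    if status(note, locale) == "ok":
        return note[locale]
    return note[fallback]


note = {"en-US": "text42", "de": "#NEEDSUPDATE# Text eins"}
print(status(note, "de"), status(note, "fr"), render(note, "de"))
```

Every consumer has to know about the marker and check for it before displaying a string, which is the parsing burden raised in the next comment.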


teoli2003 commented on April 28, 2024

I'm concerned about the verbosity of adding a version number for each string. Also, having flags inside a string seems to make consumption of these complex, as they need to be parsed.

What about:

[{"en-US": "text1",
  "de": "Text eins"},
 {"en-US": "text2",
  "de": {"up-to-date": false, "string": "Text two"}}]


SebastianZ commented on April 28, 2024

[{"en-US": "text1",
  "de": "Text eins"},
 {"en-US": "text2",
  "de": {"up-to-date": false, "string": "Text two"}}]

I like that approach and the idea of a dashboard.
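
For illustration, a consumer of the object form quoted above might resolve a note like this. The key names "up-to-date" and "string" come from the quoted structure. Two things are assumptions: that an out-of-date or missing translation falls back to en-US, and that a missing "up-to-date" key means the translation is current. The function name `resolve` is hypothetical.

```python
def resolve(note, locale, fallback="en-US"):
    """Pick the text for a locale; entries are either a string or a flagged object."""
    entry = note.get(locale)
    if isinstance(entry, str):
        return entry
    if isinstance(entry, dict) and entry.get("up-to-date", True):
        return entry["string"]
    # Missing or outdated translation: show the source text instead.
    return note[fallback]


notes = [
    {"en-US": "text1", "de": "Text eins"},
    {"en-US": "text2", "de": {"up-to-date": False, "string": "Text two"}},
]
print([resolve(n, "de") for n in notes])  # prints ['Text eins', 'text2']
```

No string parsing is needed, since the flag is ordinary JSON data.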

Though I wonder whether you both missed my second solution about adding 'en-US' to the commit messages instead of putting the info into the files, because there was no feedback to it.

Sebastian


Elchi3 commented on April 28, 2024

Though I wonder whether you both missed my second solution about adding 'en-US' to the commit messages instead of putting the info into the files, because there was no feedback to it.

I saw it, but it doesn't sound compelling to me. It would require reviewing or validating commit messages. Forgetting to add "up-to-date":false is easy as well, but I assume it's a bit better, because it is in the code and could be caught more easily when reviewing.


SebastianZ commented on April 28, 2024

Define all source strings as English in the spec, as well as which data elements are plain text, HTML, etc. Avoiding HTML is a good idea, but being clear about it is necessary if you can't avoid it.

If HTML is avoided, it needs to be clarified whether and how the entries should be formatted. Allowing formatted strings has advantages as well as disadvantages.

I think using existing gettext standards will be less painful than building dashboards, versioning strings, etc.

I'm not familiar with gettext. I assume you mean the GNU-related project gettext, right?

Sebastian


jwhitlock commented on April 28, 2024

I think using existing gettext standards will be less painful than building dashboards, versioning strings, etc.

I'm not familiar with gettext. I assume you mean the GNU-related project gettext, right?

Yes. I think Wikipedia's gettext page is a better introduction than the GNU docs. Once you have text in the .po format, you can use existing tools to translate the strings, or the format is easy enough to translate by hand (see Kuma's German javascript.po).

We would have to write a tool to extract strings from JSON to the template .pot file, like javascript.pot. We could then script the gettext tools to update each locale's .po files and check them in. After translation, our second custom tool would extract translated strings from the .po files and put them back in JSON or KumaScript. For example, in Kuma, we use Django's tools to generate a JavaScript file, javascript.js, that implements the gettext functions with the translated strings, for use in client-side UI.
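
A rough sketch of what the second tool might do, under heavy simplifications: it reads only single-line msgid/msgstr pairs, ignores plurals and contexts, and skips entries flagged fuzzy so that they fall back to English. The function names `parse_po` and `translate_notes` and the sample catalog are hypothetical. A real tool would more likely use an existing .po library than this hand-rolled parser.

```python
import json


def parse_po(text):
    """Build {msgid: msgstr} from single-line entries, skipping fuzzy ones."""
    catalog, msgid, fuzzy = {}, None, False
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#,"):
            fuzzy = "fuzzy" in line
        elif line.startswith("msgid "):
            # .po strings use C-style quoting; JSON decoding handles common escapes.
            msgid = json.loads(line[len("msgid "):])
        elif line.startswith("msgstr ") and msgid is not None:
            msgstr = json.loads(line[len("msgstr "):])
            if msgid and msgstr and not fuzzy:
                catalog[msgid] = msgstr
            msgid, fuzzy = None, False
    return catalog


def translate_notes(notes, catalog):
    """Replace each English note by its translation, keeping English if none exists."""
    return [catalog.get(note, note) for note in notes]


# Hypothetical German catalog; the second entry is marked fuzzy.
po_de = '''
msgid ""
msgstr ""

msgid "text1"
msgstr "Text eins"

#, fuzzy
msgid "text2"
msgstr "Text zwei"
'''
catalog = parse_po(po_de)
print(translate_notes(["text1", "text2"], catalog))  # prints ['Text eins', 'text2']
```

Because the English string itself is the lookup key, changing the source text automatically orphans the old translation, which is how gettext signals that an update is needed.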


Elchi3 commented on April 28, 2024

Sounds promising to me. Found these relevant resources:
https://www.npmjs.com/package/jsxgettext
http://stackoverflow.com/questions/39586651/json-and-translation

I think one general aspect to decide is whether l10n is offered by the data provider (this repo) or whether data consumers (in our case Kuma/KumaScript) have to deal with l10n themselves. It seems like we are aiming for the former, and thus translations would somehow live in this repository.


teoli2003 commented on April 28, 2024

I think that we should avoid multiple translations: translations should live in this repository.

Basically, if we go for @jwhitlock's idea, we will have the .json file containing only English, and a script to create a .po file from it.

In other words, from the .json file's point of view, it means that we don't translate in the file, but we consider all translatable strings as English (we need to define the format of these strings, though).

