Comments (69)
On the other hand, there are several l10n plugins for Jekyll, such as https://github.com/Anthony-Gaudino/jekyll-multiple-languages-plugin . That one handles translations in .yml files, which would at least give us string-based translations. I'm not sure our localizers are used to .yml files, though. I haven't found a plugin that uses .properties files, but there may be some.
from activate.mozilla.community.
Can we get #126 in to not destroy all the efforts the Brazil community did? :)
from activate.mozilla.community.
Adding @a-kilroy
from activate.mozilla.community.
Would love more details on this since I think there could be some overlap in what the foundation is working on to better integrate localizers. We might be able to piggy back off some of that work.
from activate.mozilla.community.
@a-kilroy thanks for jumping in, let me try to answer your question.
For the activate.mozilla.community website, we're using this repository here, which also includes all texts that are displayed on the page. The English, original source files, are located in the _pages folder directly: https://github.com/mozilla/activate.mozilla.community/tree/gh-pages/_pages
For every localization there are the following steps to do:
- Create a subfolder in the _pages folder named after the locale, for example "es", which already exists: https://github.com/mozilla/activate.mozilla.community/tree/gh-pages/_pages/es
- Copy all the English folders and Markdown documents into it and start translating them, so in the end we have one complete file in EN and one complete file in ES
- Create a file for the general GUI-specific translations: https://github.com/mozilla/activate.mozilla.community/blob/gh-pages/_data/l10n/es.yml, which also specifies the URL for the language, like /es/ for Spanish
- Add your locale to the Language switch
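The manual steps above could be scripted roughly as follows. This is only a sketch: `scaffold_locale` and its `known_locales` parameter are hypothetical names, the folder layout is taken from the links above, and the contents of the `.yml` file and the language switch entry are left to be filled in by hand.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def scaffold_locale(repo: Path, locale: str, known_locales: set[str]) -> list[Path]:
    """Copy the English pages into _pages/<locale>/ and create an empty
    _data/l10n/<locale>.yml. Returns the copied top-level entries."""
    pages = repo / "_pages"
    target = pages / locale
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for entry in sorted(pages.iterdir()):
        # Skip folders that are themselves locales (including the new one),
        # so only the English source is copied.
        if entry.is_dir() and (entry.name in known_locales or entry.name == locale):
            continue
        dest = target / entry.name
        if entry.is_dir():
            shutil.copytree(entry, dest, dirs_exist_ok=True)
        else:
            shutil.copy2(entry, dest)
        copied.append(dest)

    # Placeholder for the GUI strings file; its keys are not guessed here.
    l10n_file = repo / "_data" / "l10n" / f"{locale}.yml"
    l10n_file.parent.mkdir(parents=True, exist_ok=True)
    l10n_file.touch(exist_ok=True)
    return copied
```

For example, `scaffold_locale(Path("."), "fr", {"es"})` run from the repository root would create `_pages/fr/` with a copy of every English page, ready to be translated.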
Guillermo created a Pull Request with a template folder which can be copied to make it easier to add a new language. In hindsight, I'm not sure if that is helping us with this problem here. But let me first describe the translation problem.
As you see above, we have 2 different documents, EN and ES. Now there are two possibilities:
- Somebody changes a link or something generic in the English document -> this person could change this in the other languages as well, as long as there are no specific language skills needed
- There is new text, or any change on existing text -> The person changing it can only do it in EN (assuming the source always gets changed first)
In both cases, if any language skill is involved, a localizer will need to change the text in the language-specific file as well. As it is, the change in the source file is made as a normal Git commit, which by default does not notify anybody watching the repository. So currently there is no way to automatically notify all localizers that something changed and that a re-translation is needed.
This is basically the same as if you have a Google Doc with the English text on it and send it out to somebody to translate, they most probably will copy the English one and write the text for the other language. If the English document gets changed, it won't notify the localizer about it.
One possibility would be to keep a list of people to notify, and to make whoever changes the text directly or merges a Pull Request responsible for notifying each of them. This seems cumbersome to me, but right now I don't have a better suggestion. Another possibility would be to use Pontoon (the l10n tool Mozilla localizers already use), but I don't know how well it would work with full Markdown files instead of small strings. Maybe @mathjazz could enlighten us here?
In any case, we need to make sure that the locales are getting updated as well, not only the English source.
I hope that is a clear (even though very long) problem statement. Feel free to ask if something is not phrased clearly or if you spot any mistakes.
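To make the notification gap concrete, here is a minimal sketch of one way outdated translations could be detected. It assumes each locale records a fingerprint of the English text it was translated from; nothing like this exists in the repository today, and `fingerprint` and `stale_locales` are hypothetical names.

```python
from __future__ import annotations

import hashlib


def fingerprint(text: str) -> str:
    """Stable hash of a source document's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def stale_locales(source_text: str, translated_from: dict[str, str]) -> list[str]:
    """Return the locales whose recorded fingerprint no longer matches the
    current English source.

    translated_from maps a locale code to the fingerprint of the English
    text that the translation was based on (an assumed bookkeeping file).
    """
    current = fingerprint(source_text)
    return sorted(loc for loc, seen in translated_from.items() if seen != current)
```

The returned list is what a person (or a bot) would use to decide which localizers to ping after a change to the English file.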
from activate.mozilla.community.
There are numerous reasons why embedding text directly to the code rarely works for localization purposes.
- Notifying localizers about source string updates is one and @MichaelKohler has explained it brilliantly. Even if you come up with a solution, please note that some localizers are not comfortable dealing with diffs, because they aren't technical.
- It slows down the localization process, because localizers can't use localization tools they are familiar with.
- It makes consistency of the translation difficult if not impossible to maintain, because text needs to be broken down into logical segments (strings) for that to work.
- The process is very brittle, because localizers can break files, even ones as simple as Markdown files with a YAML header.
So I suggest you internationalize the site using one of the i18n libraries, which will create resource files. It's a one-time task that will take significantly less time in the long term than the current solution.
from activate.mozilla.community.
At the risk of oversimplifying it, an ideal scenario would be to integrate Pontoon, right? I believe this is something MoFo is already working on, and since they have several sites that use GitHub Pages, I expect there could be some overlap/coordination opportunities. I was trying to understand what we'd like to do and the specific problem so that I can figure out the right people to talk to/connect. I think I understand the problem but not the ideal solution. And honestly, if it's not helpful I can drop it.
from activate.mozilla.community.
Yes, if the site is internationalized, we can plug it into Pontoon easily, which is used to localize most if not all Mozilla (MoFo & MoCo) websites. We have best practices and docs for this. The contact person for website localization is @peiying2.
from activate.mozilla.community.
Given this discussion, let's put a HOLD on integrating any more locales right now. Happy to see the conversation happening though!
from activate.mozilla.community.
@mathjazz what can you provide us to have Jekyll adapted to what you are suggesting? We don't have the resources to build something here on top of vanilla Jekyll.
Right now we have a couple of P1 languages we need to deliver where people just need to localize a markdown file, we can improve in the future ;-)
from activate.mozilla.community.
Let's do what we can to not fragment the l10n process/tool chain. This will only make it harder for the community to engage on these types of projects.
from activate.mozilla.community.
I agree, that's why we are asking for your help here :)
In the meantime, we know our current process is not perfect, but we wanted to deploy something fast and scrappy. We can improve as we go ;-)
from activate.mozilla.community.
One possible short-term plan may be extracting strings from your md files and converting those to xliff for use in Pontoon. It seems that there's already a utility out there that can perform that conversion: https://github.com/tadatuta/md2xliff
The long-term strategy would be to convert everything over to HTML and use the l20n framework.
from activate.mozilla.community.
I see. We want to use markdown to allow non-technical people to add/update content on the site directly from the GitHub UI; that's the whole purpose of using Jekyll (along with the built-in GitHub Pages support).
I don't know if we can have markdown for English and then extract strings for other locales?
from activate.mozilla.community.
This short term plan allows you to continue using markdown, while using a
standard localization format that Pontoon supports and that preserves the
document structure.
from activate.mozilla.community.
Cool, any guides on how we should provide the xliff files so people can use pontoon and how to integrate them back, thanks! :-)
from activate.mozilla.community.
I would experiment with that script I linked to in a previous comment to
convert between the two formats. Once you're comfortable that there's no data
loss, if you set up a strings repo with a directory per locale containing
the xliff files, Pontoon only needs the URL to the en-US repo directory and
can pull them in.
from activate.mozilla.community.
From the foundation:
https://github.com/MozillaFoundation/Advocacy/wiki/Localization:-How-it-happens-during-Copyright
This is not exactly the same set up but maybe is helpful. I also believe
they have something for their github sites though I can't find it on their
wiki so might be worth reaching out to them.
from activate.mozilla.community.
@nukeador Converting MD to XLIFF and back to MD as @gueroJeff suggested sounds like your best bet.
You will have to convert MD to XLIFF every time there's a change in the source language, and also convert back from XLIFF to MD regularly so that translations are deployed to production.
You can use this existing repository or a separate one for storing XLIFF files. See Section A of Pontoon docs for more details:
https://developer.mozilla.org/en-US/docs/Mozilla/Implementing_Pontoon_in_a_Mozilla_website
from activate.mozilla.community.
@mathjazz I've done a quick test with the md2xliff tool and I have found two issues:
- Header variables are extracted with no linebreaks and that breaks the file when you reconstruct it.
- I can't find a way to add a modification to the original English md file and extract just the changes to the xlf file.
Ideas? :-)
What I managed to do was:
- Extract a md file to xlf
- Make some translations
- Reconstruct that into a md file in the destination language
from activate.mozilla.community.
Discussed this in France with nikos - what are your thoughts @comzeradd ? We should aim to have the same system for Clubs site.
from activate.mozilla.community.
Yes, we also had a brief conversation with @nukeador about this. Since we have the requirement of keeping markdown, this limits our options, for instance something like webL10n or the solution @a-kilroy posted above from the advocacy page.
I'm not very familiar with pontoon, but why is md2xliff not good enough? Is it much of a problem that it doesn't produce diffs and re-creates the whole file?
from activate.mozilla.community.
We could have a script to provide diffs and recreate but I was wondering if this is something that Pontoon is able to handle.
from activate.mozilla.community.
To sum up, I see a few requirements here:
- Keep the content in markdown, to make it easy for non-technical people to edit it.
- Avoid using Pull Requests for localizing content.
- Use standard mozilla l10n tools (pontoon), to increase project and content changes discoverability from localizers.
All jekyll's localization plugins and methods involve opening Pull Requests for localized content, which is not desired in our case.
I did some tests with md2xliff and, besides the issues with the metadata headers, it works nicely. So my suggested course of action would be:
- Add this project to pontoon.
- Create a locales folder in this repository to hold the xliff files, in the structure the documentation suggests, and give write access to pontoon.
- Manually create the xliff files from the English markdown pages. I'd suggest we remove headers on extract (and re-add them on reconstruct) to avoid problems.
- Periodically reconstruct localized xliff files to markdown.
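The "remove headers on extract, re-add them on reconstruct" step could look something like the sketch below. It assumes the pages use the usual Jekyll front matter (a block delimited by `---` lines at the very top of the file); the function names are hypothetical and this is not part of md2xliff.

```python
def split_front_matter(text: str) -> tuple[str, str]:
    """Return (header, body).

    header includes both '---' delimiter lines, or is empty when the file
    does not start with front matter.
    """
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "".join(lines[: i + 1]), "".join(lines[i + 1 :])
    # Unterminated header: treat the whole file as body rather than guess.
    return "", text


def join_front_matter(header: str, body: str) -> str:
    """Re-attach a saved header to a (translated) body."""
    return header + body
```

Only the body would be passed to the extraction tool; the saved header is glued back on after the localized markdown is reconstructed, so its line breaks never pass through the XLIFF round trip.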
from activate.mozilla.community.
@comzeradd I agree. What would you need?
@mathjazz @gueroJeff Is this something you can support us with?
Thanks!
from activate.mozilla.community.
Sounds like a plan!
What matters for Pontoon is that files in a supported file format are available at the right place in the repository it can write to. And that's covered by the plan proposed by @comzeradd already!
I'm no expert in XLIFF files, but since it's a bilingual file format, I suspect every time a new en-US XLIFF file is generated, we'd also need to merge those changes into localized XLIFF files. There must be scripts that do this. I'll add @gueroJeff and @flodolo to comment on that (both of them are currently on conferences). Please note that Pontoon can work without this step, but your application might not.
from activate.mozilla.community.
> I'm no expert in XLIFF files, but since it's a bilingual file format, I suspect every time a new en-US XLIFF file is generated, we'd also need to merge those changes into localized XLIFF files. There must be scripts that do this. I'll add @gueroJeff and @flodolo to comment on that (both of them are currently on conferences). Please note that Pontoon can work without this step, but your application might not.
Wait. Your app doesn't use XLIFF files directly, it uses MD files. So as long as the xliff2md script can create valid localized MD files, this step is not needed.
from activate.mozilla.community.
First step would be to enable Pontoon, so I opened a bug.
from activate.mozilla.community.
That's the last step I believe. Requirements under Section A need to be met first:
https://developer.mozilla.org/en-US/docs/Mozilla/Implementing_Pontoon_in_a_Mozilla_website
That's basically steps 2 and 3 from your list.
from activate.mozilla.community.
Yes, good point. I started creating the locales files. One thing I'm not sure about is whether I should include the original (en-US) files too, since xlf files have a source and target locale anyway.
from activate.mozilla.community.
Yup, we need the en-US folder with original files.
To give you an idea, here's the (only) xliff-based project we currently localize:
https://github.com/mozilla-l10n/firefoxios-l10n/
from activate.mozilla.community.
Thanks
I added the locales files, gave write access to the mozilla-pontoon bot, and updated the bug :)
from activate.mozilla.community.
Thanks @comzeradd! Could you use the .xliff file extension?
from activate.mozilla.community.
🎉
from activate.mozilla.community.
Thanks!
XML parser is throwing an error:
Traceback (most recent call last):
  File "/app/.heroku/python/lib/python2.7/site-packages/celery/app/trace.py", line 240, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/app/.heroku/python/lib/python2.7/site-packages/newrelic-2.50.0.39/newrelic/hooks/application_celery.py", line 66, in wrapper
    return wrapped(*args, **kwargs)
  File "/app/.heroku/python/lib/python2.7/site-packages/celery/app/trace.py", line 438, in __protected_call__
    return self.run(*args, **kwargs)
  File "/app/pontoon/sync/core.py", line 59, in wrapped_func
    return func(self, *args, **kwargs)
  File "/app/pontoon/sync/tasks.py", line 226, in sync_translations
    vcs_project.resources
  File "/app/.heroku/python/lib/python2.7/site-packages/django/utils/functional.py", line 33, in __get__
    res = instance.__dict__[self.name] = self.func(instance)
  File "/app/pontoon/sync/vcs/models.py", line 291, in resources
    resources[path] = VCSResource(self, path, locales=locales)
  File "/app/pontoon/sync/vcs/models.py", line 413, in __init__
    resource_file = formats.parse(resource_path, source_resource_path, locale)
  File "/app/pontoon/sync/formats/__init__.py", line 44, in parse
    return SUPPORTED_FORMAT_PARSERS[extension](path, source_path=source_path, locale=locale)
  File "/app/pontoon/sync/formats/xliff.py", line 127, in parse
    xliff_file = xliff.xlifffile(f)
  File "/app/.heroku/python/lib/python2.7/site-packages/translate/storage/xliff.py", line 549, in __init__
    lisa.LISAfile.__init__(self, *args, **kwargs)
  File "/app/.heroku/python/lib/python2.7/site-packages/translate/storage/lisa.py", line 282, in __init__
    self.parse(inputfile)
  File "/app/.heroku/python/lib/python2.7/site-packages/translate/storage/lisa.py", line 358, in parse
    self.document = etree.fromstring(xml, parser).getroottree()
  File "lxml.etree.pyx", line 3103, in lxml.etree.fromstring (src/lxml/lxml.etree.c:70569)
  File "parser.pxi", line 1828, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:106403)
  File "parser.pxi", line 1716, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:105194)
  File "parser.pxi", line 1086, in lxml.etree._BaseParser._parseDoc (src/lxml/lxml.etree.c:99876)
  File "parser.pxi", line 580, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:94350)
  File "parser.pxi", line 690, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:95786)
  File "parser.pxi", line 620, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:94853)
XMLSyntaxError: xmlParseEntityRef: no name, line 145, column 165
Seems like &'s need to be escaped, e.g.:
https://github.com/mozilla/activate.mozilla.community/blob/gh-pages/locales/es-ES/webvr-camp.xliff#L145
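A minimal sketch of how the generated files could be post-processed so that bare ampersands no longer trip the XML parser. The regular expression and the function name are illustrative, not something md2xliff or Pontoon provides; anything that already looks like an entity reference is left alone so the fix can be run more than once.

```python
import re

# An ampersand that does not start something shaped like an entity
# reference (&name;, &#123; or &#x1F;). Note that this checks the shape
# only: a name such as &copy; is left untouched even though plain XML
# does not predefine it.
_BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)")


def escape_bare_ampersands(text: str) -> str:
    """Replace unescaped '&' characters with '&amp;'."""
    return _BARE_AMP.sub("&amp;", text)
```

Because already-escaped text is skipped, running the function over a file that is partly fixed does not produce `&amp;amp;`.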
from activate.mozilla.community.
Has anyone tried to go back to .md from these files, localizing a few random strings?
I only checked a couple of files: one looked fine, but others look full of unnecessary fragments. Examples:
- https://github.com/mozilla/activate.mozilla.community/blob/gh-pages/locales/fr/activities.xliff#L57
- https://github.com/mozilla/activate.mozilla.community/blob/gh-pages/locales/fr/activities.xliff#L57 (what's this strange markup?)
- https://github.com/mozilla/activate.mozilla.community/blob/gh-pages/locales/fr/activities.xliff#L193
I was honestly expecting something else: no markup, just the text and some form of template to inject translations into. This seems really brittle, and given the size of the content, it would be great to do a proper testing before asking people to work on it, and potentially lose work.
from activate.mozilla.community.
> Seems like &'s need to be escaped, e.g.:
@mathjazz I substituted & with &amp;. Could you check that this works?
Thanks
from activate.mozilla.community.
> Has anyone tried to go back to .md from these files, localizing a few random strings?
Yep. But you need the skeleton files that md2xliff created to reverse the process properly. I can add them to the repo if this doesn't create any problem for the pontoon bot (because they live inside the same folders as the xliff files).
> what's this strange markup?
This is the way we add specific css classes and markup to the content. There is no way to avoid this if we want the reverse process of reconstructing the markdown files to work without someone having to spend a lot of time manually adding markup code again. On pontoon we just have to copy this to the localized target.
from activate.mozilla.community.
> @mathjazz I substituted & with &amp;. Could you check that this work?
It seems like some &s in locale files are not escaped yet:
https://github.com/mozilla/activate.mozilla.community/blob/gh-pages/locales/pt-PT/test-pilot.xliff#L129
BTW, for URLs you should probably use %26 instead of &:
https://github.com/mozilla/activate.mozilla.community/blob/gh-pages/locales/en-US/test-pilot.xliff#L133
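As a small illustration of the %26 suggestion, the standard library can percent-encode a value for use inside a URL. This is a sketch with a hypothetical helper name; it only makes sense where the ampersand is data inside a parameter value, since an ampersand that separates query parameters has to stay a literal `&` (written as `&amp;` in XML).

```python
from urllib.parse import quote


def encode_param_value(value: str) -> str:
    """Percent-encode a single URL parameter value, including '&'."""
    # safe="" means no characters are exempt from encoding except the
    # unreserved set (letters, digits, '-', '.', '_', '~').
    return quote(value, safe="")
```

The encoded value contains no `&` at all, so it cannot break the XLIFF file regardless of how the file is escaped.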
from activate.mozilla.community.
Thanks. I made the substitutions in URLs for all locales.
from activate.mozilla.community.
Thanks @comzeradd!
I've successfully set the test project up on Pontoon stage server (the link will be broken in a few weeks from now):
https://mozilla-pontoon-staging.herokuapp.com/fr/activate-test/all-resources/?string=159738
I was also able to make a test commit to the repository: f4aa015. It would be great if you could use the same whitespace as Pontoon, so the diff would be easier to read, but that's the lowest possible priority.
The next step could be for someone to review the original strings and see if we can simplify them as flod suggested. There's lots of markup, and there are strings that don't need to be translated.
from activate.mozilla.community.
Everything looks ok. We indeed have some markup in there. One option would be to copy them to the localized side once it hits production, to make it easier for people to ignore them. If we remove them, then the reconstructing process would need much more manual work from someone from this team and would probably lead to slow updates on the localized content.
from activate.mozilla.community.
What are the next steps here?
from activate.mozilla.community.
I think we are good to move this to production Pontoon.
from activate.mozilla.community.
Thanks, @comzeradd!
Leaving it to the project management team. /cc @peiying2
from activate.mozilla.community.
Thanks everyone for brainstorming and finalizing a process so we can proceed.
I went through some of the strings and saw a need for an explicit list of instructions on which strings are for localization and which should be ignored. I need to compile this list and include it in my email communication to the localizers.
from activate.mozilla.community.
I would go even further: all strings that are supposed to remain identical should be pre-translated to avoid a mess, and reduce the amount of copy and paste for localizers.
This might give you some ideas
https://github.com/mozilla-mobile/firefox-ios-build-tools/blob/master/scripts/update-xliff.py
from activate.mozilla.community.
I just pushed a commit to pre-fill all the strings that contain only mark-up. That will hopefully reduce the complexity for localizers.
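For reference, pre-filling could be done along these lines. This is a sketch, not the commit mentioned above: it assumes XLIFF 1.2 files with `trans-unit`, `source`, and `target` elements, and the "markup only" test is a heuristic chosen for illustration (HTML-style tags and `{: ...}` attribute blocks are ignored, and a string counts as markup only if no letters or digits remain).

```python
import re
import xml.etree.ElementTree as ET

NS = "urn:oasis:names:tc:xliff:document:1.2"  # XLIFF 1.2 namespace
# Heuristic: HTML-style tags and "{: ...}" attribute blocks count as markup.
_MARKUP = re.compile(r"<[^>]+>|\{:[^}]*\}")


def is_markup_only(text: str) -> bool:
    """True when nothing translatable (letters or digits) is left after
    stripping the markup patterns above."""
    stripped = _MARKUP.sub("", text)
    return not any(ch.isalnum() for ch in stripped)


def prefill_markup_only(xliff: str) -> str:
    """Copy <source> into <target> for units that contain only markup."""
    ET.register_namespace("", NS)  # keep the default namespace on output
    root = ET.fromstring(xliff)
    for unit in root.iter(f"{{{NS}}}trans-unit"):
        source = unit.find(f"{{{NS}}}source")
        # Only plain-text sources are handled; inline child elements
        # inside <source> are outside the scope of this sketch.
        if source is None or not is_markup_only(source.text or ""):
            continue
        target = unit.find(f"{{{NS}}}target")
        if target is None:
            target = ET.SubElement(unit, f"{{{NS}}}target")
        target.text = source.text
    return ET.tostring(root, encoding="unicode")
```

Units with real text keep an empty target, so localizers still see them as untranslated in Pontoon.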
from activate.mozilla.community.
@peiying2 @mathjazz Can the strings go live now? We can also document the process for localisers here in github and elsewhere as needed.
from activate.mozilla.community.
LGTM.
from activate.mozilla.community.
We can make it go live. To get localization started, we need to draft a list of DO NOTs now and update it as we monitor the kind of mistakes people make.
from activate.mozilla.community.
Is there a link we can access for it?
from activate.mozilla.community.
@mathjazz , after some discussions with @brianking and @comzeradd, this is a go. I believe the files need to be pre-processed before you expose all the locales in Pontoon. I will let @comzeradd chime in on what's next.
from activate.mozilla.community.
Cool. Just let me know which locales to enable when the files are ready.
from activate.mozilla.community.
@mathjazz @peiying2 We are good to go. I regenerated all locales files to reflect recent content changes and added all the initial locales @brianking asked for.
from activate.mozilla.community.
Thanks! Is this the list of locales that need to be enabled initially?
https://github.com/mozilla/activate.mozilla.community/tree/gh-pages/locales
Of course all other locales will be able to request to be enabled through Pontoon.
from activate.mozilla.community.
Yes, they are the ones.
from activate.mozilla.community.
First draft of documentation of process, and initial comms to localisers about the Pontoon project:
https://docs.google.com/document/d/17q6JwjFH1zDq8fVQGHJhD7rhSh4H3poCZXIoToCoWDQ/edit?usp=sharing
from activate.mozilla.community.
@brianking some comments on the process doc
from activate.mozilla.community.
Any update on this?
from activate.mozilla.community.
@peiying2 we are ready to go on our end. Once the strings are in Pontoon, we will send the email to localisers.
from activate.mozilla.community.
Great, let's go!
from activate.mozilla.community.
> Thanks @comzeradd! Could you use the .xliff file extension?
This never happened; the files are still called .xlf.
I see files showing up on Pontoon though, so I assume we're OK.
from activate.mozilla.community.
Actually it did happen: 0c97135.
And then we added support for .xlf in Pontoon: mozilla/pontoon@12bf17f.
from activate.mozilla.community.
> And then we added support for .xlf in Pontoon:
Thanks, that explains things ;-)
from activate.mozilla.community.
Thanks for the info. I don't see them show up in Pontoon yet, am I looking at the wrong place?
I will document our "merge-back" process soon in the README file.
from activate.mozilla.community.
It's not exposed yet, it's only on the stage server. I think @peiying2 still needs to coordinate messaging too
from activate.mozilla.community.
We have gone back to our old way of doing things for now. Obviously that is not the way we want, as we won't get a lot of traction with that approach. However we could not figure out yet how to deal with markdown files and xliff (or anything else for that matter). You can find my exploration at https://github.com/MichaelKohler/md2xliff-exploration/blob/master/README.md . I'm closing this issue for now, we need to discuss if we want to keep markdown files or if we want to go for something more suitable (and then using l20n). Thanks for all your help, we will get back to you.
from activate.mozilla.community.