jason-fox / fox.jason.translate.xliff

DITA-OT plug-in to create, auto-translate and re-merge XLIFF files, generating translated documentation in a targeted foreign language.

Home Page: https://jason-fox.github.io/dita-ot-plugins/translate.xliff

License: Apache License 2.0

XSLT 70.27% Java 29.73%
dita-ot-plugin dita xliff dita-ot language-translation watson-translator yandex-translate bing-translator deepl

fox.jason.translate.xliff's Introduction

DITA-OT Translate Plug-in


DITA-OT Translate Plug-in is a DITA-OT Plug-in to create, auto-translate and re-merge XLIFF files, generating translated documentation in a targeted foreign language. It can create and consume files using either XLIFF 1.2 or XLIFF 2.1 format.

This plug-in consists of three DITA-OT transforms:

  • The xliff-create transform creates XLIFF and skeleton files from the *.dita files.
  • The xliff-translate transform populates the <target> texts using an automatic translation service.
  • The xliff-dita transform recreates the DITA project using the translated texts.

▶️ Video from DITA-OT Day 2019

Install

The DITA-OT Translate Plug-in has been tested against DITA-OT 4.x. It is recommended that you upgrade to the latest version.

Installing DITA-OT

The DITA-OT Translate Plug-in is a plug-in for the DITA Open Toolkit.

  • Full installation instructions for downloading DITA-OT can be found on the project website.

    1. Download the dita-ot-4.2.zip package from the project website at dita-ot.org/download
    2. Extract the contents of the package to the directory where you want to install DITA-OT.
    3. Optional: Add the absolute path for the bin directory to the PATH system variable.

    This defines the necessary environment variable to run the dita command from the command line.

curl -LO https://github.com/dita-ot/dita-ot/releases/download/4.2/dita-ot-4.2.zip
unzip -q dita-ot-4.2.zip
rm dita-ot-4.2.zip

Installing the Plug-in

  • Run the plug-in installation commands:
dita install https://github.com/doctales/org.doctales.xmltask/archive/master.zip

The dita command line tool requires no additional configuration.


Signing up for an Automatic Translation Service

Several publicly available automatic translation cloud services can be used. They typically offer a try-before-you-buy option, with sample access to the service at no cost. Upgrading to a paid version will be necessary when transforming larger documents.

IBM Cloud Services

The IBM Language Translator allows you to translate text programmatically from one language into another.

Introduction: Getting Started

Create an instance of the service:

  1. Go to the Language Translator page in the IBM Cloud Catalog.
  2. Sign up for a free IBM Cloud account or log in.
  3. Click Create.

Copy the credentials to authenticate to your service instance:

  1. From the IBM Cloud dashboard, click on your Language Translator service instance to go to the Language Translator service dashboard page.
  2. On the Manage page, click Show to view your credentials.
  3. Copy the API Key and URL values.
  4. Within the plug-in alter the file cfg/configuration.properties to hold your API Key and URL.

By default, the Frankfurt translation service URL is used: https://gateway-fra.watsonplatform.net/language-translator/api/v3/translate. Amend this when using a different regional instance.


Microsoft Azure

Microsoft Translator provides multi-language support for translation, transliteration, language detection, and dictionaries.

Introduction: Overview

Create an instance of the service:

  1. Go to Try Cognitive Services
  2. Select the Translator Text APIs tab.
  3. Under Translator Text, select the Get API Key button.
  4. Agree to the terms and select your locale from the drop-down menu.
  5. Sign in by using your Microsoft, Facebook, LinkedIn, or GitHub account.

You can sign up for a free Microsoft account at the Microsoft account portal. To get started, click Sign in with Microsoft and then, when asked to sign in, click Create one. Follow the steps to create and verify your new Microsoft account.

After you sign in to Try Cognitive Services, your free trial begins. The displayed webpage lists all the Azure Cognitive Services for which you currently have trial subscriptions. Two subscription keys are listed beside Speech Services. You can use either key in your applications.

Copy the credentials to authenticate to your service instance:

  1. Copy each of the API Key and Endpoint values.
  2. Within the plug-in alter the file cfg/configuration.properties to hold your API Key and URL.

By default, the global translation service URL is used: https://api.cognitive.microsofttranslator.com/translate. Amend this when using a regional instance.


Yandex Translate

The API provides access to the Yandex online machine translation service. It supports more than 90 languages and can translate separate words or complete texts.

Introduction: Overview

To sign up for the service:

  1. Review the user agreement and rules for formatting translation results.
  2. Get a free API key.
  3. Read the documentation, where you will find instructions on enabling the API and detailed descriptions of its features.

After you sign in to your account, select API Keys and create a new key as necessary. The latest endpoint can be found in the documentation:

https://translate.yandex.net/api/v1.5/tr/translate

Copy the credentials to authenticate to your service instance:

  1. Copy each of the API Key and Endpoint values.
  2. Within the plug-in alter the file cfg/configuration.properties to hold your API Key and URL.

DeepL API

The DeepL API is accessible with a DeepL Pro subscription (DeepL API plan) only. The API is an interface that allows other computer programs to send texts to the DeepL servers and receive high-quality translations.

Introduction: Overview

To sign up for the service:

  1. Open a DeepL API developers account. Note that not all accounts offer access to the DeepL API. It is essential that the account type includes REST API access.
  2. Fill out the application details and add a credit card. No payments are required for the first 30 days. You can cancel the card and still maintain free access for the trial period.
  3. Read the documentation, where you will find instructions on enabling the API and detailed descriptions of its features.

After you sign in to your account, select API Keys and create a new key as necessary. The latest endpoint can be found in the documentation:

https://api.deepl.com/v2/translate

Copy the credentials to authenticate to your service instance:

  1. Copy each of the API Key and Endpoint values.
  2. Within the plug-in alter the file cfg/configuration.properties to hold your API Key and URL.
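
Before storing the key in cfg/configuration.properties, it can help to check it against the endpoint directly. The following Python sketch only builds a request (it does not send it). It assumes the DeepL v2 REST interface accepts the key in a `DeepL-Auth-Key` authorization header and form-encoded `text` and `target_lang` fields; the helper name `build_deepl_request` is hypothetical and not part of the plug-in.

```python
import urllib.parse
import urllib.request

DEEPL_URL = "https://api.deepl.com/v2/translate"  # endpoint quoted above


def build_deepl_request(api_key, text, target_lang, url=DEEPL_URL):
    """Build (but do not send) a POST request for one text snippet.

    Hypothetical helper for checking credentials; not the plug-in's code.
    """
    body = urllib.parse.urlencode({"text": text, "target_lang": target_lang})
    request = urllib.request.Request(url, data=body.encode("utf-8"), method="POST")
    # Assumption: the API key is passed in the Authorization header.
    request.add_header("Authorization", "DeepL-Auth-Key " + api_key)
    request.add_header("Content-Type", "application/x-www-form-urlencoded")
    return request
```

Passing the returned object to `urllib.request.urlopen` would send it; a rejected key should surface as an HTTP error rather than a translated response.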

Usage

XLIFF 1.2 Invocation from the command line

  1. To create an XLIFF 1.2 file and associated skeletons, run:
PATH-TO-DITA-OT/bin/dita -f xliff-create -i document.ditamap  -o out  --xliff.version=1

Result

A translate.xlf file will appear in the out directory along with a series of skeleton files.

<?xml version="1.0" encoding="UTF-8"?>
<xliff xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <file datatype="xml" original="/document.ditamap" source-language="en" target-language="es">
    <header xmlns="urn:oasis:names:tc:xliff:document:1.2" xmlns:dita="http://www.dita-ot.org">
      <skl>
        <external-file href="./skl/document.ditamap.skl" />
      </skl>
    </header>
    <body xmlns="urn:oasis:names:tc:xliff:document:1.2" xmlns:dita="http://www.dita-ot.org">
        <trans-unit xmlns="" xmlns:dita="dita-ot.org" approved="no" id="42094" xml:space="preserve">
          <source xml:lang="en">
            Loves or pursues or desires to obtain pain of itself, because it
            is pain, but occasionally circumstances occur in which toil and
            pain can procure him some great pleasure. To take a trivial
            example,  <x ctype="x-dita-b" id="d3e14">which of us ever undertakes
            laborious physical exercise,</x> except to obtain some advantage from it?
            But who has any right to find fault with a man who chooses to enjoy a pleasure
            that has no annoying consequences, or one who avoids a pain that produces no
            resultant pleasure?
          </source>
          <target xml:lang="la"/>
        </trans-unit>
        ... etc
      </body>
   </file>
...etc

Note: if the translate.cachefile parameter is used, unchanged text with previously approved translations will be copied over to the <target> elements.

  2. To populate an existing XLIFF 1.2 file with auto-translated text, run:
PATH-TO-DITA-OT/bin/dita -f xliff-translate \
    -i translate.xlf --translate.service=[bing|deepl|watson|yandex] \
    --translate.apikey=<api-key> \
    --xliff.version=1

Result

The XLIFF 1.2 File is auto-translated in place, with translated text as shown:

Note: only <trans-unit> elements which are approved="no" will be auto-translated.

<?xml version="1.0" encoding="UTF-8"?>
<xliff xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <file datatype="xml" original="/document.ditamap" source-language="en" target-language="es">
    <header xmlns="urn:oasis:names:tc:xliff:document:1.2" xmlns:dita="http://www.dita-ot.org">
      <skl>
        <external-file href="./skl/document.ditamap.skl" />
      </skl>
    </header>
    <body xmlns="urn:oasis:names:tc:xliff:document:1.2" xmlns:dita="http://www.dita-ot.org">
        <trans-unit xmlns="" xmlns:dita="dita-ot.org" approved="no" id="42094" xml:space="preserve">
          <source xml:lang="en">
            Loves or pursues or desires to obtain pain of itself, because it
            is pain, but occasionally circumstances occur in which toil and
            pain can procure him some great pleasure. To take a trivial
            example, <x ctype="x-dita-b" id="d3e14">which of us ever undertakes
            laborious physical exercise,</x> except to obtain some advantage from it?
            But who has any right to find fault with a man who chooses to enjoy a pleasure
            that has no annoying consequences, or one who avoids a pain that produces no
            resultant pleasure?
          </source>
          <target xml:lang="la">
            Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do
            eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
            enim ad minim veniam, <x ctype="x-dita-b" id="d3e14">quis nostrud exercitation
            ullamco laboris,</x> nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor
            in reprehenderit in voluptate velit esse cillum dolore eu fugiat
            nulla pariatur. Excepteur sint occaecat cupidatat non proident,
            sunt in culpa qui officia deserunt mollit anim id est laborum.
          </target>
        </trans-unit>
        ...etc
      </body>
   </file>
...etc
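
To see which units a run of xliff-translate would still touch, the file can be scanned for `<trans-unit>` elements that are not yet approved. This is a minimal Python sketch using only the standard library; the function names are hypothetical. Because the sample above mixes namespaced and un-namespaced elements, it matches on the local element name and treats a missing `approved` attribute as "no".

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def unapproved_ids(xliff_text):
    """Return the ids of <trans-unit> elements that are not approved="yes"."""
    root = ET.fromstring(xliff_text)
    return [
        element.get("id")
        for element in root.iter()
        if local_name(element.tag) == "trans-unit"
        and element.get("approved", "no") != "yes"
    ]
```

For a file on disk, `ET.parse(path).getroot()` can replace the `ET.fromstring` call.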

XLIFF 2.1 Invocation from the command line

  1. To create an XLIFF 2.1 file and associated skeletons, run:
PATH-TO-DITA-OT/bin/dita -f xliff-create -i document.ditamap  -o out  --xliff.version=2

Result

A translate.xlf file will appear in the out directory along with a series of skeleton files.

<?xml version="1.0" encoding="UTF-8"?>
<xliff srcLang="en" trgLang="la" version="2.0" xmlns="urn:oasis:names:tc:xliff:document:2.0">
  <file id="2" original="/topic.dita">
    <skeleton href="./skl/topic.dita.skl"></skeleton>
    <unit fs:fs="p" id="9962" xmlns:fs="urn:oasis:names:tc:xliff:fs:2.0">
      <originalData>
        <data id="sd4e14">&lt;b&gt;</data>
        <data id="ed4e14">&lt;/b&gt;</data>
      </originalData>
      <segment state="initial">
        <source xml:lang="en" xml:space="preserve">Loves or pursues or desires to obtain pain of
            itself, because it is pain, but occasionally circumstances occur in which toil and pain
            can procure him some  great pleasure. To take a trivial example, <pc dataRefEnd="ed4e14"
            dataRefStart="sd4e14" fs:fs="b" id="d4e14">which of us ever undertakes laborious physical
            exercise,</pc>except to obtain some advantage from it? But who has any right to find fault
            with a man who chooses to enjoy a pleasure that has no annoying consequences, or one who avoids
            a pain that produces no resultant pleasure?
          </source>
          <target xml:lang="la"></target>
      </segment>
    </unit>
    ...etc
  </file>
  ...etc

  2. To populate an existing XLIFF 2.1 file with auto-translated text, run:
PATH-TO-DITA-OT/bin/dita -f xliff-translate \
    -i translate.xlf --translate.service=[bing|deepl|watson|yandex] \
    --translate.apikey=<api-key> \
    --xliff.version=2

Result

The XLIFF 2.1 File is auto-translated in place, with translated text as shown:

Note: any <segment> elements which are state="final" will not be re-translated.

<?xml version="1.0" encoding="UTF-8"?>
<xliff srcLang="en" trgLang="la" version="2.0" xmlns="urn:oasis:names:tc:xliff:document:2.0">
  <file id="2" original="/topic.dita">
    <skeleton href="./skl/topic.dita.skl"></skeleton>
    <unit fs:fs="p" id="9962" xmlns:fs="urn:oasis:names:tc:xliff:fs:2.0">
      <originalData>
        <data id="sd4e14">&lt;b&gt;</data>
        <data id="ed4e14">&lt;/b&gt;</data>
      </originalData>
      <segment state="translated">
        <source xml:lang="en" xml:space="preserve">Loves or pursues or desires to obtain pain of
            itself, because it is pain, but occasionally circumstances occur in which toil and pain
            can procure him some  great pleasure. To take a trivial example, <pc dataRefEnd="ed4e14"
            dataRefStart="sd4e14" fs:fs="b" id="d4e14">which of us ever undertakes laborious physical
            exercise</pc>except to obtain some advantage from it? But who has any right to find fault with
            a man who chooses to enjoy a pleasure that has no annoying consequences, or one who avoids a pain
            that produces no resultant pleasure?
        </source>
        <target xml:lang="la">
            Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do
            eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
            enim ad minim veniam, <pc dataRefEnd="ed4e14" dataRefStart="sd4e14" fs:fs="b" id="d4e14">
            quis nostrud exercitation ullamco laboris,</pc> nisi ut aliquip ex ea commodo consequat.
            Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat
            nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia
            deserunt mollit anim id est laborum.
        </target>
      </segment>
    </unit>
    ...etc
  </file>
  ...etc
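
The same kind of check works for the XLIFF 2 layout shown above. The Python sketch below lists the units that still contain a segment whose state is not "final". It assumes the `urn:oasis:names:tc:xliff:document:2.0` namespace from the sample and treats a missing `state` attribute as "initial"; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

XLIFF2 = "{urn:oasis:names:tc:xliff:document:2.0}"  # namespace used in the sample


def pending_unit_ids(xliff_text):
    """Return ids of <unit> elements with at least one non-final <segment>."""
    root = ET.fromstring(xliff_text)
    pending = []
    for unit in root.iter(XLIFF2 + "unit"):
        states = [
            segment.get("state", "initial")
            for segment in unit.iter(XLIFF2 + "segment")
        ]
        if any(state != "final" for state in states):
            pending.append(unit.get("id"))
    return pending
```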

Populating Skeletons from the command line

  1. To recreate *.dita files from an XLIFF file and its associated skeletons, run:
PATH-TO-DITA-OT/bin/dita -f xliff-dita -i translate.xlf -o out --xliff.version=[1|2]

Result

The translated *.dita files are generated into the out directory.
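
The three transforms can be chained from a script. The Python sketch below only assembles the argument lists, using the flags shown in the commands above. The helper name `build_pipeline`, the separate `translated_dir` output folder and the assumption that translate.xlf sits directly in the first output directory are illustrative choices, not something the plug-in prescribes.

```python
def build_pipeline(dita, ditamap, out_dir, translated_dir, service, api_key,
                   xliff_version="1"):
    """Return the three dita invocations as argument lists (illustrative)."""
    version = "--xliff.version=" + xliff_version
    # Assumption: xliff-create writes translate.xlf directly into out_dir.
    xlf = out_dir + "/translate.xlf"
    return [
        # 1. Create the XLIFF file and skeletons.
        [dita, "-f", "xliff-create", "-i", ditamap, "-o", out_dir, version],
        # 2. Auto-translate the XLIFF file in place.
        [dita, "-f", "xliff-translate", "-i", xlf,
         "--translate.service=" + service,
         "--translate.apikey=" + api_key, version],
        # 3. Rebuild the *.dita files from the XLIFF file and skeletons.
        [dita, "-f", "xliff-dita", "-i", xlf, "-o", translated_dir, version],
    ]
```

Each list can be passed to `subprocess.run(step, check=True)` so that a failing transform stops the chain.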

Note

Any machine translation is by definition imperfect. A typical translation workflow would send the generated XLIFF files to the translation agency (also known as a "localisation service provider") and receive back verified translated content integrated into the XLIFF. For XLIFF 1.2, each <trans-unit> should be marked approved="yes" when the <target> element has been verified. Similarly, for XLIFF 2.1, each <segment> should be marked as state="final".
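
If a reviewer returns a list of verified unit ids rather than an edited file, the approval flags can be set with a short script. This Python sketch covers the XLIFF 1.2 case only; the function name `approve_units` is hypothetical, and matching is done on the local element name so that namespaced and un-namespaced `<trans-unit>` elements are both found.

```python
import xml.etree.ElementTree as ET


def approve_units(root, verified_ids):
    """Set approved="yes" on verified <trans-unit> elements; return the count."""
    count = 0
    for element in root.iter():
        is_trans_unit = element.tag.rsplit("}", 1)[-1] == "trans-unit"
        if is_trans_unit and element.get("id") in verified_ids:
            element.set("approved", "yes")
            count += 1
    return count
```

Note that writing the tree back out with ElementTree may rename namespace prefixes, so the result is worth checking before it is fed to xliff-dita.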

Parameter Reference

  • translate.from - Source language to use. Defaults to the value in configuration.properties
  • translate.to - Target language. Defaults to the value in configuration.properties
  • translate.cachefile - Specifies the (absolute) location of a previously translated XLIFF file to be used. If the id matches to a previously translated text snippet in the cache file, the text will be copied over and the snippet marked as approved.
  • translate.service - Decides which translation service to use:
    • bing - Connects to the Microsoft Azure Translation service
    • custom - Sends the translation request to an arbitrary URL using POST - use this to connect to proxies for Google Cloud Translate
    • deepl - Connects to the DeepL API Translation service
    • dummy - Avoids accessing a translation service and copies the source text to the target language directly without amendment.
    • watson - Connects to the IBM Cloud Translation service
    • yandex - Connects to the Yandex Translation service
  • translate.authentication.url - URL for creating an OAuth token if needed for a service. Defaults to the value in configuration.properties
  • translate.apikey - API Key for the Translation service. Defaults to the value in configuration.properties
  • translate.region - Subscription region for a Microsoft multi-service text API subscription
  • translate.url - URL for a Translation service. Defaults to the value in configuration.properties
  • xliff.version - Decides which XLIFF format to use. Defaults to the value in configuration.properties:
    • 1 - XLIFF 1.2 format
    • 2 - XLIFF 2.1 format
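
The translate.cachefile behaviour described above amounts to a lookup by id. The Python sketch below illustrates that idea on plain dictionaries. It is not the plug-in's implementation, and the data layout (`id`, `target` and `approved` keys) is an assumption made for the example.

```python
def apply_cache(units, cache):
    """Copy cached translations into units with a matching id.

    units: list of dicts with "id", "target" and "approved" keys (assumed layout)
    cache: dict mapping unit id to a previously translated text
    Returns the number of units that were filled from the cache.
    """
    reused = 0
    for unit in units:
        cached = cache.get(unit["id"])
        if cached is not None:
            unit["target"] = cached
            unit["approved"] = "yes"  # cached text is treated as approved
            reused += 1
    return reused
```

Units without a cache hit keep approved="no" and are therefore still picked up by xliff-translate.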

License

Apache 2.0 © 2019 - 2024 Jason Fox

The Program includes the following additional software components which were obtained under license:

fox.jason.translate.xliff's People

Contributors

actions-user, electronix, jason-fox

fox.jason.translate.xliff's Issues

Questions around @translate

Three questions received from @xephon2

Does it take the @translate attribute into account?
Are there any limitations on complex file structures?
What about profiling attributes? I could not find any info about this in the docs.

xliff-translate error

I have installed DITA-OT 3.6.1 and translate.xliff and face the following warnings and errors:
xliff-create is running with some warnings which seem to cause errors during xliff-translate.

xliff-create:

[copy] Copying 68 files to C:\Users\Michael\Documents\CE\Documentation\eLABin1 xml (eng)\eLABin1\SCPI\out2\skl
 [copy] Copied 10 empty directories to 2 empty directories under C:\Users\Michael\Documents\CE\Documentation\eLABin1 xml (eng)\eLABin1\SCPI\out2\skl
[xliff-gen] Transforming into C:\Users\Michael\AppData\Local\Temp\temp20210427162002286\xliff
[xliff-gen] Processing C:\Users\Michael\AppData\Local\Temp\temp20210427162002286\md5\SCPI.ditamap.xml to C:\Users\Michael\AppData\Local\Temp\temp20210427162002286\xliff\SCPI.ditamap.xlf.xml
[xliff-gen] Loading stylesheet C:\Users\Michael\Documents\CE\Documentation\dita-ot-3.6.1\plugins\fox.jason.translate.xliff\xsl\2\dita-xliff.xsl
[xliff-gen] C:\Users\Michael\Documents\CE\Documentation\dita-ot-3.6.1\plugins\fox.jason.translate.xliff\xsl\2\dita-xliff.xsl:28:9: Warning! Exception thrown by URIResolver Cause: java.lang.reflect.InvocationTargetException

Later (many times):

[xliff-gen] Processing C:\Users\Michael\AppData\Local\Temp\temp20210427162002286\md5\topics\c_scpi_basics.dita.xml to C:\Users\Michael\AppData\Local\Temp\temp20210427162002286\xliff\topics\c_scpi_basics.dita.xlf.xml
[xliff-gen] C:\Users\Michael\Documents\CE\Documentation\dita-ot-3.6.1\plugins\fox.jason.translate.xliff\xsl\2\dita-xliff.xsl:28:9: Warning! Document has been marked not available: file:/C:/Users/Michael/Documents/CE/Documentation/dita-ot-3.6.1/plugins/fox.jason.translate.xliff/xsl/2/dita-xliff.xsl/../C:\Users\Michael\AppData\Local\Temp\temp20210427162002286\null464186473.xml
[xliff-gen] Processing C:\Users\Michael\AppData\Local\Temp\temp20210427162002286\md5\topics\c_scpi_prog.dita.xml to C:\Users\Michael\AppData\Local\Temp\temp20210427162002286\xliff\topics\c_scpi_prog.dita.xlf.xml
[xliff-gen] C:\Users\Michael\Documents\CE\Documentation\dita-ot-3.6.1\plugins\fox.jason.translate.xliff\xsl\2\dita-xliff.xsl:28:9: Warning! Document has been marked not available: file:/C:/Users/Michael/Documents/CE/Documentation/dita-ot-3.6.1/plugins/fox.jason.translate.xliff/xsl/2/dita-xliff.xsl/../C:\Users\Michael\AppData\Local\Temp\temp20210427162002286\null464186473.xml

xliff-translate:

Error: The following error occurred while executing this line:
jar:file:/C:/Users/Michael/Documents/CE/Documentation/dita-ot-3.6.1/bin/../plugins/fox.jason.translate.xliff/lib/translate-1.1.jar!/fox/jason/translate/antlib.xml:445: C:\Users\Michael\AppData\Local\Temp\temp20210427163100231\null1168852439.txt doesn't exist

The whole dita-document seems ok, as I don't have any issues creating PDFs.

Question: will there be a new release soon?

The current release (3.5.0) does not work with the current release of org.doctales.xmltask (1.16.1).
Specifically, the command dita -f xliff-dita %restofarguments% throws the following error:

jar:file:/opt/dita/plugins/fox.jason.translate.xliff/lib/translate-1.2.jar!/fox/jason/translate/antlib.xml:574: java.lang.IllegalAccessError: class com.oopsconsultancy.xmltask.jdk15.XPathAnalyser15 (in unnamed module @0x45283ce2) cannot access class com.sun.org.apache.xpath.internal.XPathAPI (in module java.xml) because module java.xml does not export com.sun.org.apache.xpath.internal to unnamed module @0x45283ce2

Edit: I use openjdk-17 since DITA-OT requires version 17 or newer

These are the configurations I tested:
fox.jason.translate.xliff 3.5.0 & org.doctales.xmltask 1.16.1 -> is the current configuration in the DITA-OT registry, does not work due to the stated error
fox.jason.translate.xliff 3.5.0 & org.jung.xmltask master HEAD -> does not work because the name changed from org.doctales.xmltask to org.jung.xmltask, can be fixed easily
fox.jason.translate.xliff master HEAD (with dependency correction in plugin.xml) & org.jung.xmltask master HEAD -> WORKS!

Since org.jung.xmltask is changed frequently at the moment it would be nice to coordinate a release for both of them.
Feel free to reach out to me if my description isn't precise enough :)

net.sf.saxon.trans.XPathException error

Hello - when I try to execute the xliff-create transformation, the console prints this error message:
Error: net.sf.saxon.trans.XPathException: I/O error reported by XML parser processing file:/Users/atrujillo/Core/prototype/en/main-product-maps/topics/common-topics/parasoft-settings/dtp-settings/dtp.autoconfig.dita: /Users/atrujillo/Core/prototype/en/main-product-maps/topics/common-topics/parasoft-settings/dtp-settings/topic.dtd (No such file or directory)

I've successfully validated all of the dita and ditamap files in my project and do not understand what might be throwing the error. I am able to execute the other transformations (e.g., html5) without problem.

I am using DITA-OT 3.5.1 and have installed the org.doctales.xmltask plugin per the instructions.

Please let me know if any other information is necessary to help determine if this is a bug.

Thanks,
Adam

'glossrefs' are missing from the translation

After translating a longer document, I realize that:
'glossrefs' are missing from the translation.

I saw the file 'no-translate-elements.xsl' which seems to specify elements which are not translated.
After the translation the <codeph>-tag is missing

<p> inside <ul> <li> are missing: everything is 1 long line

translate.xlf does not contain contents since DITA-OT 4.1

It looks like the plugin isn't working anymore since DITA-OT 4.1. With DITA-OT 4.0.2 the result looks as expected.
The skl files are created correctly but the translate.xlf file contains nothing more than

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0"
        xmlns:fs="urn:oasis:names:tc:xliff:fs:2.0"
        srcLang="en"
        trgLang="de"
        version="2.0"/>

For now that's all I found out. I'll update on this when I know more about it.
