simonech / xlifflib
Library to generate xliff file to export content to the standard xml format for CAT tools
License: MIT License
Find a way to remove the ugly "$type" property from the JSON output
Either in the Bundle or as additional output parameter of the merge operation, the caller must know the target language
Currently inline codes only support formatting elements, but should also support links and images at least.
Wrap the extraction process behind a RESTful service:
Input: generic document structure with metadata
Output: Xliff 2
Across the library, there are many points where content is moved from a unit to a group and the other way around.
Each move transfers the metadata and segments from one to the other.
There are small differences between the instances of this code, but it would be better to find a common way of dealing with them and refactor to just one method instead of 4 or more.
Reason is that when extracting content from 3rd party systems extractors might return both a property or a propertygroup depending on how the content is structured.
This also requires changing Document to have a list of PropertyContainers instead of a separate list of properties and propertygroups.
Now they are inside the merger and extractor but they should live outside of them
Right now all tests are a bit mixed. It would be better to split the various files into merging and extraction.
If you import an XLIFF document with an empty target, e.g. <target></target>, then the .FromXliff function fails because Target.Text is null or empty (so the Target.Text[0] as PlainText access fails).
PR on its way - just wanted an issue number for the branch.
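To illustrate the guard this issue asks for, here is a minimal Python sketch (not the library's own code, which is .NET). It assumes XLIFF 2.0 markup, and the `target_text` helper is hypothetical. The point is only that an empty or missing target yields an empty string instead of an index error.

```python
import xml.etree.ElementTree as ET

# XLIFF 2.0 core namespace.
XLIFF_NS = "urn:oasis:names:tc:xliff:document:2.0"


def target_text(segment_xml: str) -> str:
    """Return the text of <target>, or "" when it is missing or empty.

    Hypothetical helper: it shows the guard only, not the real FromXliff logic.
    """
    segment = ET.fromstring(segment_xml)
    target = segment.find(f"{{{XLIFF_NS}}}target")
    if target is None:
        return ""
    # itertext() yields nothing for <target></target>, so the join gives "".
    return "".join(target.itertext())


EMPTY = f'<segment xmlns="{XLIFF_NS}"><source>Hello</source><target></target></segment>'
FILLED = f'<segment xmlns="{XLIFF_NS}"><source>Hello</source><target>Ciao</target></segment>'
MISSING = f'<segment xmlns="{XLIFF_NS}"><source>Hello</source></segment>'
```

Calling `target_text(EMPTY)` or `target_text(MISSING)` returns an empty string, so a caller can treat "not translated yet" as a normal case rather than an error.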
At the moment the target language is passed as a parameter to the extraction, but the extraction is not doing anything with it.
Maybe something else should add the target language to the Xliff document.
In some cases custom XML elements are needed for other tools to work on.
The library should allow adding extension points
Currently only one level of formatting is supported; formatting inside formatting, like <b>this bold and <u>underlined</u></b>, is not.
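One way to handle nesting is to keep a stack of open inline codes while parsing, so each inner tag becomes a child of the code that encloses it. The sketch below is in Python and only illustrates that idea; the `InlineCodeBuilder` class, the `html_to_source` function and the `SUBTYPES` table are hypothetical names, not part of the library. It emits XLIFF 2.0 `<pc>` elements with `type="fmt"` and an `xlf:` subtype.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Assumed mapping from HTML formatting tags to XLIFF 2.0 fmt subtypes.
SUBTYPES = {"b": "xlf:b", "strong": "xlf:b", "i": "xlf:i", "em": "xlf:i", "u": "xlf:u"}


class InlineCodeBuilder(HTMLParser):
    """Builds a <source> element, nesting one <pc> per open formatting tag."""

    def __init__(self):
        super().__init__()
        self.root = ET.Element("source")
        self.stack = [self.root]  # innermost open element is last
        self.next_id = 1

    def handle_starttag(self, tag, attrs):
        if tag not in SUBTYPES:
            return  # tags outside the mapping are ignored in this sketch
        pc = ET.SubElement(
            self.stack[-1],
            "pc",
            {"id": str(self.next_id), "type": "fmt", "subType": SUBTYPES[tag]},
        )
        self.next_id += 1
        self.stack.append(pc)

    def handle_endtag(self, tag):
        if tag in SUBTYPES and len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        current = self.stack[-1]
        if len(current):
            # Text that follows a child element belongs in that child's tail.
            last = current[-1]
            last.tail = (last.tail or "") + data
        else:
            current.text = (current.text or "") + data


def html_to_source(html: str) -> str:
    builder = InlineCodeBuilder()
    builder.feed(html)
    builder.close()
    return ET.tostring(builder.root, encoding="unicode")
```

Feeding it the example from this issue produces an outer `<pc>` for the bold run with an inner `<pc>` for the underlined run. The sketch does not attempt to repair mismatched or unclosed tags.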
During an export, it must be compulsory to specify a target language
Make the conversion from html tags to inline codes a single mechanism, to avoid two classes and two lists of mappings.
Allow configuration of inline code processing to choose between putting the enclosing tags as originaldata, type/subtype or both.
For example an id for the source system that provided the extract
Reason is that the CData splitter fails if it's trying to split a unit which doesn't have a CData section
Add CLI tool for exporting/merging text, XML, JSON, MD
Check that metadata are added to nested properties
Check that CDataSplitter copies metadata to parent group when splitting units
An Xliff file must be converted back to a Bundle so that the original CMS can import it back.
Add some custom extensions for when import/export is not possible due to validation errors or pre-conditions (like when the code expects just one unit or one segment and for some reason there are more).
It would have been cool to have it work with random sources, but for the moment better to simplify and keep it focused to work with just the Bundle.
Also, remove the logic from within the Model
Since the implementation of issue #42 introduced support for block elements, it is now possible to send tables and other html elements to translation.
However, if a block element contains an attribute (a style or class), it is not included in the XLIFF and thus not imported back.
Currently only HTML text is split into paragraphs, but plain text should be split as well.
In issue #42 I created an html parser that extracts the whole html into one tree in one pass, but it's still used by the paragraph splitter in a recursive approach.
A new html specific "splitter" should be implemented that does the XLIFF extraction in one go.
When an empty p tag or a self-closing element is parsed, a null reference exception is thrown.
A linked list is the best approach for creating a list of items in order while allowing the caller to specify the position at which to add them.
Specifically it must be possible to configure
xlf:b -> b or strong
xlf:i -> i or em
This could be a configuration option such as semantic tags or format tags.
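A rough sketch of what such a configuration could look like, written in Python for brevity. The preset names "semantic" and "format" follow the wording above; `TAG_PRESETS`, `html_tag_for` and the `overrides` parameter are assumptions made for illustration only.

```python
# Hypothetical presets mapping XLIFF fmt subtypes to the HTML tag used on merge.
TAG_PRESETS = {
    "semantic": {"xlf:b": "strong", "xlf:i": "em"},
    "format": {"xlf:b": "b", "xlf:i": "i"},
}


def html_tag_for(sub_type, preset="format", overrides=None):
    """Pick the HTML tag for a subtype, letting callers override single entries."""
    mapping = dict(TAG_PRESETS[preset])  # copy so the presets stay untouched
    mapping.update(overrides or {})
    return mapping[sub_type]
```

With this shape a caller chooses a preset once and can still override a single subtype, for example `html_tag_for("xlf:i", "semantic", {"xlf:i": "i"})`.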
Instead of adding all import and export tests in two classes, put the tests for extracting and merging from Xliff to Bundle and back into the individual elements (Document, PropertyGroup, Property).
The CData splitter class splits multi-paragraph properties into multiple units.
But in order to import it back, it must merge them back together.
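The split and merge steps need to be inverses of each other. The Python sketch below shows that round trip under strong simplifying assumptions: paragraphs are plain `<p>...</p>` pairs with no attributes and nothing between them. The function names are hypothetical and this is not how the CData splitter is implemented.

```python
import re

# Matches the content of each attribute-free <p>...</p> pair.
PARAGRAPH = re.compile(r"<p>(.*?)</p>", re.DOTALL)


def split_paragraphs(html):
    """Return one string per paragraph, i.e. one per translation unit."""
    return PARAGRAPH.findall(html)


def merge_paragraphs(units):
    """Rebuild the original property by re-wrapping each unit in <p> tags."""
    return "".join(f"<p>{unit}</p>" for unit in units)
```

For input such as `<p>First</p><p>Second</p>`, merging the split units gives back the original string. Attributes on the `<p>` tag or text between paragraphs would be lost in this simplified form, so a real merge would need to keep that information alongside each unit.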
It should take the enclosing tag based on the type and subtype or the originaldata if available
Additional data must be added to groups or units, for example to store the field where the content came from in the original system
In the FromXliff methods, explain in the error message which element is causing the problem
Now all metadata are added under a metadata group called XliffLib.
Make it possible for users to define their own groups