The xlifflib from simonech

Make JSON serialization and deserialization a bit nicer

Find a way to remove the ugly "$type" property from the JSON output

TargetLanguage should be returned to the caller after a merge

The caller must know the target language, either through the Bundle or as an additional output parameter of the merge operation.

Support inline elements that are not just formatting

Currently inline codes only support formatting elements, but should also support links and images at least.

Implement RESTful extraction service

Wrap the extraction process behind a RESTful service:

Input: generic document structure with metadata
Output: Xliff 2

Reduce duplication in the code that moves content between units and groups (and back)

Across the library, there are many places where content is moved from a unit to a group and the other way around.
Each move transfers the metadata and segments from one to the other.
There are small differences between the instances of this code, but it would be better to find a common way of dealing with them and refactor to just one method instead of 4 or more.

Make Property and PropertyGroup inherit from the same base class

The reason is that, when extracting content from 3rd party systems, extractors might return either a Property or a PropertyGroup depending on how the content is structured.
This also requires changing Document to hold a single list of PropertyContainers instead of separate lists of properties and property groups.
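
A minimal sketch of the shape this could take, written in Python purely for illustration. The class names follow the issue text, but every field name (`name`, `value`, `containers`) and the helper `count_properties` are assumptions, not the library's actual model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PropertyContainer:
    """Hypothetical common base for anything a Document can hold."""
    name: str


@dataclass
class Property(PropertyContainer):
    value: str = ""


@dataclass
class PropertyGroup(PropertyContainer):
    # A group can nest both properties and further groups.
    containers: List[PropertyContainer] = field(default_factory=list)


@dataclass
class Document:
    # One list replaces the separate property and property-group lists.
    containers: List[PropertyContainer] = field(default_factory=list)


def count_properties(containers: List[PropertyContainer]) -> int:
    """Walk the tree and count leaf properties, whatever the nesting."""
    total = 0
    for item in containers:
        if isinstance(item, PropertyGroup):
            total += count_properties(item.containers)
        else:
            total += 1
    return total
```

With a shared base, an extractor can return either kind of item and callers only need one code path to traverse the result.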

Implement "connector" for JSON

Implement "connector" for generic XML

Refactor Xliff reader and writer to external classes

They currently live inside the merger and extractor, but they should live outside of them.

Restructure testing hierarchy

At the moment all the tests are a bit mixed. It would be better to split the files into merging and extraction.

Uncaught error when target element is blank

If you import an XLIFF document with an empty target, e.g.

<target></target>

then the .FromXliff function fails because Target.Text is null or empty, so the expression Target.Text[0] as PlainText has no element to read.

PR on its way, I just wanted an issue number for the branch.
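
The kind of guard that avoids this failure can be sketched as follows. This is an illustration in Python, not the library's .FromXliff code; the function name `read_target_text` is hypothetical, and the sample XML is assumed to carry no namespace to keep the example short.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def read_target_text(segment_xml: str) -> Optional[str]:
    """Return the target text, or None when the target is missing or blank."""
    segment = ET.fromstring(segment_xml)
    target = segment.find("target")
    if target is None:
        return None
    # ElementTree reports <target></target> as text=None, so check before use.
    text = target.text
    if text is None or text.strip() == "":
        return None
    return text
```

The caller can then decide what an absent translation means instead of hitting an index or null error halfway through the import.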

Find a better way to pass target language

At the moment the target language is passed as a parameter to the extraction, but the extraction does nothing with it.
Maybe something else should add the target language to the XLIFF document.

I want to be able to add custom elements to the root of the file

In some cases custom XML elements are needed for other tools to work on.
The library should allow adding extension points.

Handle nested formatted elements in inlinecodes

Currently only one level of formatting is supported; formatting inside formatting, like <b>this bold and <u>underlined</u></b>, is not.
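
One way to handle nesting is to emit an opening inline code for every start tag and a closing one for every matching end tag, keeping a stack of what is open. The sketch below does that in Python with the standard `html.parser` module. The tag-to-subType mapping, the class name, and the choice of XLIFF 2 `<pc>` elements with `type="fmt"` are assumptions made for illustration, not the library's actual conversion.

```python
from html import escape
from html.parser import HTMLParser

# Assumed mapping from HTML tags to XLIFF 2 subType values.
SUBTYPES = {"b": "xlf:b", "i": "xlf:i", "u": "xlf:u"}


class NestedInlineConverter(HTMLParser):
    """Turns nested formatting tags into nested <pc> inline codes."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts = []       # output fragments, joined at the end
        self.open_tags = []   # stack of currently open, mapped tags
        self.next_id = 1

    def handle_starttag(self, tag, attrs):
        # Tags without a mapping are dropped; their text content is kept.
        if tag in SUBTYPES:
            self.parts.append(
                f'<pc id="{self.next_id}" type="fmt" subType="{SUBTYPES[tag]}">'
            )
            self.open_tags.append(tag)
            self.next_id += 1

    def handle_endtag(self, tag):
        # Only close a code when it matches the innermost open tag.
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
            self.parts.append("</pc>")

    def handle_data(self, data):
        self.parts.append(escape(data, quote=False))


def to_inline_codes(html_fragment: str) -> str:
    converter = NestedInlineConverter()
    converter.feed(html_fragment)
    converter.close()
    return "".join(converter.parts)
```

Because each start tag opens a new `<pc>` inside whatever is already open, the depth of nesting in the output mirrors the depth in the HTML.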

Add possibility to specify target language

During an export, it must be compulsory to specify a target language

Refactor conversion from html tags to inline code

Rework the conversion from HTML tags to inline codes so that it no longer needs two classes and two lists of mappings.

Give option to include originaldata or type/subtype when converting to inline codes

Allow configuration of inline code processing to choose between putting the enclosing tags as originaldata, type/subtype or both.
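
A sketch of what such a setting might look like, in Python for illustration only. The enum `InlineCodeMode`, the function `pc_attributes`, and the `d1s`/`d1e` reference ids are all hypothetical; only the attribute names (`type`, `subType`, `dataRefStart`, `dataRefEnd`) come from XLIFF 2.

```python
from enum import Flag, auto
from typing import Dict


class InlineCodeMode(Flag):
    """Hypothetical setting: which information an inline code carries."""
    ORIGINAL_DATA = auto()
    TYPE_SUBTYPE = auto()
    BOTH = ORIGINAL_DATA | TYPE_SUBTYPE


def pc_attributes(code_id: int, sub_type: str, mode: InlineCodeMode) -> Dict[str, str]:
    """Build the attributes of a <pc> element according to the chosen mode."""
    attrs = {"id": str(code_id)}
    if InlineCodeMode.TYPE_SUBTYPE in mode:
        attrs["type"] = "fmt"
        attrs["subType"] = sub_type
    if InlineCodeMode.ORIGINAL_DATA in mode:
        # References to <data> entries that would hold the enclosing tags.
        attrs["dataRefStart"] = f"d{code_id}s"
        attrs["dataRefEnd"] = f"d{code_id}e"
    return attrs
```

Modelling the choice as flags keeps "both" from needing its own branch: each kind of information is added independently.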

Must be able to add additional metadata to the file

For example an id for the source system that provided the extract

Export with default exporter fails if a unit is plain text

Reason is that the CData splitter fails if it's trying to split a unit which doesn't have a CData section
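
The fix amounts to passing a unit through unchanged when it has no CData section. The sketch below shows that guard in Python. The issue does not describe how the real CData splitter works, so the function `split_cdata` and its regex-based extraction are assumptions that only illustrate the fallback.

```python
import re
from typing import List

# Matches the content of each CDATA section, including across newlines.
CDATA_PATTERN = re.compile(r"<!\[CDATA\[(.*?)\]\]>", re.DOTALL)


def split_cdata(unit_text: str) -> List[str]:
    """Return the content of each CDATA section, or the text itself if none."""
    sections = CDATA_PATTERN.findall(unit_text)
    if not sections:
        # Plain-text unit: nothing to split, so pass it through untouched.
        return [unit_text]
    return sections
```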

Add command line tool

Add CLI tool for exporting/merging text, XML, JSON, MD

Add tests for metadata support

Check that metadata are added to nested properties
Check that CDataSplitter copies metadata to parent group when splitting units

Must be able to import xliff to bundle

An XLIFF file must be converted back to a Bundle so that the original CMS can import it back.

Add custom Exception classes

Add some custom exceptions for when import/export is not possible due to validation errors or unmet pre-conditions (like when the code expects just one unit or one segment and for some reason there are more).
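
A small exception hierarchy along these lines could look like the following Python sketch. All class names and the `single` helper are hypothetical and only illustrate the idea of failing with a specific, descriptive error.

```python
from typing import List, TypeVar

T = TypeVar("T")


class XliffLibError(Exception):
    """Hypothetical base class for all library-specific errors."""


class XliffValidationError(XliffLibError):
    """Raised when a document fails validation before import or export."""


class UnexpectedCountError(XliffLibError):
    """Raised when the code expects a fixed number of items and finds another."""

    def __init__(self, what: str, expected: int, actual: int) -> None:
        super().__init__(f"expected {expected} {what}, found {actual}")
        self.what = what
        self.expected = expected
        self.actual = actual


def single(items: List[T], what: str) -> T:
    """Return the only item, or raise a descriptive error."""
    if len(items) != 1:
        raise UnexpectedCountError(what, 1, len(items))
    return items[0]
```

Callers can catch the base class to handle any library failure, or a specific subclass when they can recover from one particular condition.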

Simplify Export/Merger to just work with Bundle

It would have been cool to have it work with arbitrary sources, but for the moment it is better to simplify and keep it focused on working with just the Bundle.

Also, remove the logic from within the Model

Implement "connector" for HTML fragments

Attributes of block elements are not retained when imported back

Since the implementation of issue #42 introduced support for block elements, it is now possible to send tables and other HTML elements to translation.
However, if a block element contains an attribute (a style or class), it is not included in the XLIFF and thus not imported back.

Plain text paragraph should be split as well

At the moment only HTML text is split into paragraphs, but plain text should be split as well.
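
For plain text, a simple splitting rule can be sketched as below. The issue does not say what should count as a paragraph boundary, so this Python example assumes that paragraphs are separated by one or more blank lines; the function name is hypothetical.

```python
import re
from typing import List


def split_plain_text(text: str) -> List[str]:
    """Split plain text into paragraphs on blank lines (assumed convention)."""
    # Normalize Windows and old Mac line endings first.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    paragraphs = re.split(r"\n\s*\n", normalized)
    # Drop empty fragments and surrounding whitespace.
    return [p.strip() for p in paragraphs if p.strip()]
```

A single line break inside a paragraph is kept, so hard-wrapped text still ends up in one unit.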

Refactor ParagraphSplitter to do conversion to XLIFF in one go

In issue #42 I created an html parser that extracts the whole html into one tree in one pass, but it's still used by the paragraph splitter in a recursive approach.
A new html specific "splitter" should be implemented that does the XLIFF extraction in one go.
