Translation support about devilutionx HOT 55 CLOSED

diasurgical avatar diasurgical commented on April 29, 2024
Translation support

from devilutionx.

Comments (55)

john-tornblom avatar john-tornblom commented on April 29, 2024 3

thanks @AJenbo, that did the trick! I've now gotten it to work with DevilutionX (v1.0.1).

AJenbo avatar AJenbo commented on April 29, 2024 3

Diablo was built with Windows-1252, but in-game it is limited to ASCII. Moving to UTF-8 shouldn't really be an issue and is what we plan on doing. Moving to language-specific code pages like Windows-1258 is problematic, as it locks the binary to a specific language and breaks chat between clients.

The biggest blocker at the moment is getting gettext to build on Mac and Windows. The next step would be to create a TTF version of the Diablo font and figure out how to do font substitution in order to support languages that need glyphs not found in ASCII.

The current progress can be found here: #533
I did just discover that https://github.com/SuperTux/supertux might be the perfect fit for us (gettext, SDL_ttf, CMake, font substitution). So if anyone is willing to work on this, it would be a good place to look for examples of how to implement things.

sheepo99 avatar sheepo99 commented on April 29, 2024 2

It does, thank you for the info.

In regards to language targeting: I would at least say include only fully translated languages in the package milestone builds (incomplete ones could still be in the nightlies or when downloading source for compilation). This is so translators are motivated to finish their work, as incomplete/unpolished translations are a familiar sight in many open source projects.

muziling avatar muziling commented on April 29, 2024 2

zh_cn.po.zip
Chinese translation. The character range is U+4E00-U+9FA5. Thanks.

AJenbo avatar AJenbo commented on April 29, 2024 1

Hi @liberodark
We currently do not have the translation infrastructure in place, but if you have any suggestions we could talk about what solution would be best. I usually have used Gettext on previous software solutions. It generates a .po file for the translator to work on. There is plenty of software for editing them like PoEdit.

The game will upscale to your current resolution, so yes, it will run at 1080p with black bars on the sides and bilinear filtering (same as GOG.com's version). That said, we are currently working on the renderer to improve the visual output, including having it change the output resolution, meaning you will be able to see more of the world at one time (no black bars, etc.).

AJenbo avatar AJenbo commented on April 29, 2024 1

Gettext supports adding comments to messages in the code and then including them in the exported .pot file for the translators.
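As a minimal sketch of what that could look like: xgettext can copy specially tagged comments that sit directly above a marked string into the .pot file. The `TRANSLATORS:` tag, the `ExitLabel` function and the identity `_()` macro below are assumptions made for illustration, not the project's actual setup.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for the real lookup so the sketch compiles on its own;
// a real build would map this macro to gettext() or an equivalent.
#define _(x) (x)

// Hypothetical helper returning a menu label.
const char *ExitLabel()
{
	// TRANSLATORS: Main menu entry; keep it short so it fits the menu width.
	return _("Exit Diablo");
}
```

Running something like `xgettext --keyword=_ --add-comments=TRANSLATORS -o devilutionx.pot source.cpp` (file names here are placeholders) would then emit the comment as a `#.` line above the matching msgid, where translation tools can display it.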

kraileth avatar kraileth commented on April 29, 2024 1

If Transifex is chosen in the end, count me in for the German translation. I've helped with translations of a couple of Open Source projects that I'm far less familiar with than with Diablo. Also I kind of always wanted to do this; it's almost a quarter of a century since I first loaded up Diablo.exe in a hex editor and began translating the strings found there (which was actually really difficult due to length limitations and the fact that English usually is quite a bit shorter). I'd love to eventually do this properly!

john-tornblom avatar john-tornblom commented on April 29, 2024 1

I posted PO files here: https://gist.github.com/john-tornblom/50b9bd4ea3a3ffc36b94f387696a1f0d

john-tornblom avatar john-tornblom commented on April 29, 2024 1

Supertux seems to be using tinygettext.

From a programmer's point of view, it's a simple API choice that has to be made. Using string keys with a simple macro, e.g., #define _(x) gettext(x), seems very popular. I've also seen enum/int keys, which seem to reduce the readability of the source code. However, the latter would fit well with the audio lookup techniques already being leveraged in devilutionX. I guess one could just use a different macro where enum keys are preferable, e.g., #define _(x) gettext(#x), while using the same underlying translation table.

From a translator's point of view, we would like to have a file format with good tool support, and possibly the ability to convert translations to other file formats.
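To make the two key styles concrete, here is a small sketch. `Translate` is a hypothetical stand-in for gettext(), the `_K` macro name and the `TEXT_EXIT` identifier are invented for the example, and the Swedish strings are sample data. For identifier keys the macro relies on the preprocessor's `#` stringification operator, so both styles can share one string-keyed table.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for gettext(): returns the translation if one
// exists, otherwise the key itself.
const char *Translate(const char *key)
{
	static const std::map<std::string, std::string> table = {
		{ "Exit Diablo", "Avsluta Diablo" }, // string-key entry (sample data)
		{ "TEXT_EXIT", "Avsluta Diablo" },   // identifier-key entry (sample data)
	};
	auto it = table.find(key);
	return it != table.end() ? it->second.c_str() : key;
}

// String-key style: the English text is the lookup key.
#define _(x) Translate(x)
// Identifier-key style: '#' turns the identifier into the string "TEXT_EXIT".
#define _K(x) Translate(#x)

enum TextId { TEXT_EXIT };

const char *ExitByString() { return _("Exit Diablo"); }
const char *ExitByKey() { return _K(TEXT_EXIT); }
```

With string keys an untranslated message falls back to readable English, while with identifier keys it falls back to the identifier name, which is one practical reason the string-key style is so common.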

AJenbo avatar AJenbo commented on April 29, 2024 1

@runlevel5 I would really appreciate some sample color fonts in the various OTF formats: SVG | COLR | SBIX | CBDT. If you can help with that, it will be much clearer for us to figure out where to go from here.

What I was initially looking at was to edit the AA rendering of the font for a specific size since that would let me paint a pixel-based grayscale version that could then be recolored at render time. It looks like FontForge is capable of this, but I have been unable to figure out how to achieve this since the option appears grayed out when I try the program.

If your friend knows of any other ways to implement textured fonts that would also be helpful.

john-tornblom avatar john-tornblom commented on April 29, 2024 1

OK, I'll have a look at alternatives to gettext next weekend and see what I can come up with, starting with tinygettext.

john-tornblom avatar john-tornblom commented on April 29, 2024 1

Did some experiments with parsing MO files, see the function mo_parse at https://gist.github.com/john-tornblom/0ba2f1d46a48a288145b9b5ed0fdb501#file-mo2po-c-L57

Do you think this would be sufficient (assuming a std::map instead of an array of string_map_t)? We still need gettext to compile MO files, but I guess that is not a problem?
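For readers who want a feel for what such a parser involves, here is a rough sketch of reading a GNU MO image into a std::map. It is not the `mo_parse` function from the gist: the names `ReadU32` and `ParseMo` are invented here, it follows the header layout described in the gettext manual (magic, revision, string count, offset of the original-string table, offset of the translation table), and it assumes the file was written in the host's byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <map>
#include <string>
#include <vector>

// Read a 32-bit value in host byte order; false if it would run past the end.
static bool ReadU32(const std::vector<char> &buf, std::uint64_t pos, std::uint32_t &out)
{
	if (pos + sizeof(out) > buf.size())
		return false;
	std::memcpy(&out, buf.data() + pos, sizeof(out));
	return true;
}

// Parse a GNU MO image into a msgid -> msgstr map.
// Files written with the opposite byte order are rejected in this sketch.
bool ParseMo(const std::vector<char> &buf, std::map<std::string, std::string> &out)
{
	std::uint32_t magic = 0, count = 0, origTable = 0, transTable = 0;
	if (!ReadU32(buf, 0, magic) || magic != 0x950412deU)
		return false;
	// Offset 4 holds the format revision, which this sketch ignores.
	if (!ReadU32(buf, 8, count) || !ReadU32(buf, 12, origTable) || !ReadU32(buf, 16, transTable))
		return false;

	for (std::uint64_t i = 0; i < count; i++) {
		std::uint32_t idLen = 0, idOff = 0, strLen = 0, strOff = 0;
		// Each table entry is a (length, offset) pair of 32-bit values.
		if (!ReadU32(buf, origTable + 8 * i, idLen) || !ReadU32(buf, origTable + 8 * i + 4, idOff)
		    || !ReadU32(buf, transTable + 8 * i, strLen) || !ReadU32(buf, transTable + 8 * i + 4, strOff))
			return false;
		if (std::uint64_t(idOff) + idLen > buf.size() || std::uint64_t(strOff) + strLen > buf.size())
			return false;
		out[std::string(buf.data() + idOff, idLen)] = std::string(buf.data() + strOff, strLen);
	}
	return true;
}
```

The optional hash table stored in MO files is skipped entirely, since the std::map already provides the lookup. Plural forms and message contexts would need extra handling that this sketch leaves out.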

AJenbo avatar AJenbo commented on April 29, 2024 1

Basic translation support is now enabled in the application. If anyone wants to work on a language, please download Poedit and start a new translation based on the devilutionx.pot file. Then update from the source and you should see a list of the latest texts that are available for translation. Once done, submit a PR with the .po that you have created.

Note that there is currently a limitation in how DevilutionX renders text, which restricts it to the following symbols:
image
This will be addressed separately; until then, your translations may look a bit funny.

Thanks to @john-tornblom for working on the text parser.

liberodark avatar liberodark commented on April 29, 2024

One other small question: is it possible to play in 1080p full screen?

liberodark avatar liberodark commented on April 29, 2024

Great, thank you for your work!

grepwood avatar grepwood commented on April 29, 2024

@AJenbo We could acquire gamedata from around the world and do some basic text analysis. The purpose of this exercise is to find out what encoding tables were used for the gamedata.

Analogously, OpenMW and VCMI allow the user to switch between Windows encoding tables 1250 (Eastern European), 1251 (Cyrillic) and 1252 (Western European). I pitched them an idea a while back that we could do away with manual selection. Morrowind's main data file has a header with a 256-byte comment. In the commercial releases, this comment is in the same language as the rest of the gamedata, so we can compare that 256-byte chunk against known chunks to determine the language and encoding table. This is a little resource-hungry, since each recognized language would take up 256 bytes in the resource segment and the language check is O(n). But it takes the hassle away from the end user.

If Diablo doesn't use UTF-8, then this is an available solution to compare found gamedata against known gamedata - either by part or checksum and size.

AJenbo avatar AJenbo commented on April 29, 2024

Diablo always uses ISO_8859-1; changing this would break multiplayer and save game compatibility. The only exception is the PS1 release, but we are unable to use the data from this release as it is in a mostly unknown format (and has a lower resolution and frame rate).

There are some errors in Diablo's fonts:

005C is / instead of \
2018 includes ‘ from Windows-1252
2019 includes ’ from Windows-1252
00B7 is ? instead of ·
00B8 is ? instead of ¸
00BC is ½ instead of ¼
00BD is ¼ instead of ½

(this is based on the UI font)
Extending the fonts to Windows-1252 should be easy enough if someone is willing to render out the additional letters in a matching style. We could also fix the errors this way.
The in-game fonts should be the same except that they are missing some of the letters, this can be solved the same way.

This has been done at least partially previously:
diasurgical/devilution#32

Expanding to Windows-1252 adds the following symbols:
€ ‚ ƒ „ … † ‡ ˆ ‰ Š ‹ Œ Ž “ ” • – — ˜ ™ š › œ ž Ÿ

sheepo99 avatar sheepo99 commented on April 29, 2024

We currently do not have the translation infrastructure in place, but if you have any suggestions we could talk about what solution would be best. I usually have used Gettext on previous software solutions. It generates a .po file for the translator to work on. There is plenty of software for editing them like PoEdit.

This sounds just lovely save for one thing: we would have to find a way to give the lines context. I would suggest some sort of companion document included in the package from the start. Assuming gettext always generates the lines in the same order, having a context file (like an Excel sheet) would be very easy and it would help prospective translators a lot.

sheepo99 avatar sheepo99 commented on April 29, 2024

Could you describe in detail how this would work? Would it eventually be possible to export the .po files into an online translation environment such as Transifex?

As a translator myself, there are two ways this could be done. Keeping translations in a free online translation platform, or hosting a translation kit including up-to-date translation memories, which would have to be maintained manually.

My suggestion is the following: whether translations are to be included in the package or hosted online, do set targets for priority languages first (giving priority to French, Italian, German and Spanish is an industry standard) and ensure their translations are finished and polished before making a package out of them. For Greek, Hebrew, Arabic, Slavic and Asian languages, first handle font support before allowing linguists to begin working on them. More importantly, do not let incomplete translations make their way into the package.

A few other things that would be nice to have:
1 - context field - a short description of the specific line and where it appears in game
2 - character limit field - a comment indicating the maximum number of characters (including spaces) each line can have in game without bleeding out of its bounding box.
3 - Male/female/plural tags - Romance languages have special cases with male and female words. A good example would be Deckard Cain's "Hello my friend. Stay a while and listen." line. "Friend" can be translated as either "ami" or "amie", for a male or female person respectively. A translation using a genderless word would be possible too (such as "camarade", for example), but flexibility would vary from language to language and good results are not always guaranteed. OpenXCom made a wonderful implementation of this, and I would advise looking into their code for this matter.
4 - Tags for special formatting - For example [newline] for a linebreak, and so on.

Finally, if possible ensure every language has at least a qualified reviewer who is a native speaker and has at least some translation experience.

AJenbo avatar AJenbo commented on April 29, 2024

These should answer most of your questions:
https://docs.transifex.com/formats/gettext
https://www.gnu.org/software/gettext/manual/gettext.html#Names

Regarding targeting specific languages and having native-speaking translators do reviews: we are a hobby project and as such not part of the industry, so there isn't much we can do beyond piquing others' interest, the same way as is the case for the code that has been written.

AJenbo avatar AJenbo commented on April 29, 2024

Appreciate the input

john-tornblom avatar john-tornblom commented on April 29, 2024

I addressed parts of this issue by patching the MPQ with alternative audio files from the PlayStation release of Diablo, see https://github.com/john-tornblom/psx-tools/tree/audio-lang

I got the audio going with diabloweb. However, these audio files are encoded in a different PCM format, which devilutionX (SDL) does not recognize. I assume it's a simple fix though. Re-encoding the WAVs with sox or something might be a quick and crude way forward...

I am not 100% sure if I did the WAV mappings correctly, I used a pretty simple spectrum analysis approach to correlate audio files between the PSX and PC version, so there could be errors here.

Also, there seems to be some kind of locale files on the PSX release, so at least extracting some text is feasible. I'm not sure how complete it is with respect to the PC version though...

AJenbo avatar AJenbo commented on April 29, 2024

@john-tornblom make sure you didn't enable audio compression when generating the MPQ. It's a feature only available in later versions (StarCraft era) of Storm (it was included in some updates) and we do not support it in DevilutionX.

john-tornblom avatar john-tornblom commented on April 29, 2024

Extracting text from a PS1 CDROM is pretty easy if you just neglect the binary preamble of the locale files. I posted gettext msgids from MAINTXT.ENG on pastebin: https://pastebin.com/61GmL3BL

If these msgids are similar to the strings used in devilutionX, translating to French, German, Swedish and Japanese should be quick and straightforward.

AJenbo avatar AJenbo commented on April 29, 2024

Looks like the dialog texts are missing from this. But yeah, it should be easy to merge these with the devilutionx PO files. If only someone knew how to link things on Windows :(

runlevel5 avatar runlevel5 commented on April 29, 2024

@AJenbo I would love to help out with a Vietnamese translation. Ideally a UTF-8 font would be used; however, I am aware of the breaking changes to multiplayer and save game data. Alternatively, I think the fonts could be extended to Windows-1258. Thoughts?

AJenbo avatar AJenbo commented on April 29, 2024

For assets (images and audio) we would use mpq or folders with localized language files. So no changes would be needed there except loading an extra mpq file.

john-tornblom avatar john-tornblom commented on April 29, 2024

For assets (images and audio) we would use mpq or folders with localized language files. So no changes would be needed there except loading an extra mpq file.

Right, ofc. I'm not that well acquainted with the src, I just assumed assets in MPQs are accessed using enums, like the ones in https://github.com/diasurgical/devilutionX/blob/master/enums.h

qndel avatar qndel commented on April 29, 2024

to hell with the mpqs! :P

runlevel5 avatar runlevel5 commented on April 29, 2024

@AJenbo

Next step would be to create a TTF version of the Diablo font and figure out how to do font substitution in order to support languages that need glyphs not found in ASCII.

I know a designer friend who could help with the TTF font. Feel free to let me know once you have sorted out all technical details with the implementation.

AJenbo avatar AJenbo commented on April 29, 2024

OK, change of plans: we will continue to use a bitmap (image) font. The way that we will support Unicode in DevilutionX is by using an image per Unicode block. This allows us to break up the font in a way that doesn't create massive images and swallow all your RAM. You can get an overview of what each block contains here:
https://en.wikibooks.org/wiki/Unicode/Character_reference/0000-0FFF

The first release will probably contain fonts for Basic Latin, Latin-1 Supplement, and Cyrillic. But if people contribute more translations and fonts, we will of course be happy to add them as well.

AJenbo avatar AJenbo commented on April 29, 2024

@runlevel5 would the Tai Viet block cover what you need for a Vietnamese translation?
https://en.wikipedia.org/wiki/Tai_Viet_(Unicode_block)

The text would be UTF-8 encoded. This won't actually break save game support, but any hero that uses a non-US-ASCII name won't appear correctly if loaded in the original Diablo.

AJenbo avatar AJenbo commented on April 29, 2024

@uwodb I noticed you looking into Korean. I actually learned some basics about how to phonetically read the alphabet last week :D

Anyway, the font block that you would need to render for Korean to be supported should be ordered like this:
https://en.wikipedia.org/wiki/Hangul_Jamo_(Unicode_block)

uwodb avatar uwodb commented on April 29, 2024

Are you working on creating a new font? It would be worth checking whether Diablo II uses a bitmap font.

AJenbo avatar AJenbo commented on April 29, 2024

Since we are sticking with bitmap fonts, we will keep the one from D1 and simply add the language-specific fonts as we add translations.
I don't get your comment about D2, it's not really relevant to this project. We can't distribute D2 fonts if that is what you have in mind. Afaik D2 uses a bitmap font but isn't as flexible as what we have planned here.

AJenbo avatar AJenbo commented on April 29, 2024

devilutionx-es.po.zip
Spanish translation by @hiperiondev

hiperiondev avatar hiperiondev commented on April 29, 2024

Thanks @AJenbo for doing that!

There is a problem with the translation that I don't know how to solve, and it may happen in other languages as well.
Since Spanish has no Saxon genitive, phrases that are built with it (there are several in the translation) come out in the wrong order: the possessor should be at the end of the phrase and not at the beginning as in English.
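One common way to handle this kind of word-order difference is to translate the whole phrase as a single format string with numbered placeholders, so the translator can reorder the parts. The sketch below is only an illustration of that idea: `FormatPositional` is a hypothetical helper, the item strings are sample data, and none of this reflects how DevilutionX currently assembles these phrases.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Replace {0}, {1}, ... (single digit) in fmt with the matching entry of args.
// Placeholders with an out-of-range index are copied through unchanged.
std::string FormatPositional(const std::string &fmt, const std::vector<std::string> &args)
{
	std::string out;
	for (std::size_t i = 0; i < fmt.size(); i++) {
		if (fmt[i] == '{' && i + 2 < fmt.size() && fmt[i + 2] == '}'
		    && fmt[i + 1] >= '0' && fmt[i + 1] <= '9') {
			const std::size_t index = static_cast<std::size_t>(fmt[i + 1] - '0');
			if (index < args.size()) {
				out += args[index];
				i += 2; // skip past the closing brace
				continue;
			}
		}
		out += fmt[i];
	}
	return out;
}
```

With this, the English msgid could be "{0}'s {1}" and a Spanish msgstr "{1} de {0}", so the owner's name moves to the end without any change to the calling code.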

AJenbo avatar AJenbo commented on April 29, 2024

Ok, we will look into it once translations have been integrated.

AJenbo avatar AJenbo commented on April 29, 2024

Thanks

john-tornblom avatar john-tornblom commented on April 29, 2024

I played around with the clang Python bindings and made a simple script that replaces string literals that appear in the locale files on the PSX PAL version with macro calls to gettext. Perhaps this can help with the initial translation effort? See https://github.com/john-tornblom/devilutionX/blob/883032eaf8679dc502e178a9d9a176a2168e6f12/patch_gettext.py

I also tried to replace all strings directly with Swedish using a similar script. I committed the end result to john-tornblom@37346a0

It compiles, and the menu system seems to work. Unfortunately, the game crashes while loading the first level:
SourceX/storm/storm.cpp:258: void* dvl::SMemAlloc(unsigned int, const char*, int, int): Assertion `amount != -1u' failed.

AJenbo avatar AJenbo commented on April 29, 2024

If you can help figure out how to link with gettext or something equivalent for all platforms, that would be HUGELY appreciated :D
Even leaving some as English-only is acceptable if the majority of systems have translation support.

john-tornblom avatar john-tornblom commented on April 29, 2024

Alright, I'm not a Windows user so I'm afraid I cannot help with that. What about the C++ alternatives like tinygettext? I could also contribute a custom .po parser that instantiates a hashmap we can do lookups in. Would that be sufficient?

AJenbo avatar AJenbo commented on April 29, 2024

I did look into tinygettext, and indeed it would work if we can get it to build. I was struggling with it, but that was before we went with C++17, so it might be better now. A custom po/mo parser would also totally work.

AJenbo avatar AJenbo commented on April 29, 2024

For text rendering we also need something that can efficiently look up which Unicode block a character belongs to, so that it can find the correct font texture to use.
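A rough sketch of one cheap way to do such a lookup: decode the UTF-8 bytes into a code point, then use the upper bits as a font page index. This assumes the font images are split into fixed pages of 256 code points, which is a simplification (real Unicode blocks vary in size, although Cyrillic, U+0400 to U+04FF, happens to fill exactly one such page). The names `DecodeUtf8`, `FontPage` and `GlyphIndex` are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Decode the UTF-8 sequence starting at text[pos] and advance pos past it.
// Requires pos < text.size(). Returns U+FFFD for malformed input.
// Overlong forms and surrogates are not rejected in this sketch.
std::uint32_t DecodeUtf8(const std::string &text, std::size_t &pos)
{
	const unsigned char lead = static_cast<unsigned char>(text[pos++]);
	if (lead < 0x80)
		return lead; // plain ASCII

	std::uint32_t codePoint = 0;
	int extra = 0; // number of continuation bytes expected
	if ((lead & 0xE0) == 0xC0) {
		codePoint = lead & 0x1F;
		extra = 1;
	} else if ((lead & 0xF0) == 0xE0) {
		codePoint = lead & 0x0F;
		extra = 2;
	} else if ((lead & 0xF8) == 0xF0) {
		codePoint = lead & 0x07;
		extra = 3;
	} else {
		return 0xFFFD;
	}

	for (; extra > 0; extra--) {
		if (pos >= text.size())
			return 0xFFFD;
		const unsigned char next = static_cast<unsigned char>(text[pos]);
		if ((next & 0xC0) != 0x80)
			return 0xFFFD;
		codePoint = (codePoint << 6) | (next & 0x3F);
		pos++;
	}
	return codePoint;
}

// Index of the 256-glyph font image that holds this code point.
std::uint32_t FontPage(std::uint32_t codePoint) { return codePoint >> 8; }

// Position of the glyph inside that image.
std::uint32_t GlyphIndex(std::uint32_t codePoint) { return codePoint & 0xFF; }
```

Because the page index is just a shift, the renderer could keep loaded font textures in a small array or map keyed by page and load a page only the first time a glyph from it is needed.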

AJenbo avatar AJenbo commented on April 29, 2024

In case you didn't already see my attempt: #902

AJenbo avatar AJenbo commented on April 29, 2024

Did some experiments with parsing MO files, see the function mo_parse at https://gist.github.com/john-tornblom/0ba2f1d46a48a288145b9b5ed0fdb501#file-mo2po-c-L57

Looks good.

Do you think this would be sufficient (assuming a std::map instead of an array of string_map_t)? We still need gettext to compile MO files, but I guess that is not a problem?

Yeah that is fine, the po-file is normally compiled in to a mo-file by the translation tool, like Poedit or similar.

john-tornblom avatar john-tornblom commented on April 29, 2024

I pushed an initial version to john-tornblom@8bf129f I apologize for the mix of C and C++, and the formatting of language.cpp. At the time of writing, no strings are actually translated, they need to be passed through the translation macro, e.g., "Exit Diablo" --> _("Exit Diablo"). This works for some strings, but not all, e.g., those defined as global variables cannot be translated at this time since they are initialized before the language subsystem.

Would be nice if this could be tested on more platforms, I only use Ubuntu. By default, en.mo will be used. This can be changed in diablo.ini, and must match the basename of the .mo file, e.g., "sv.mo" with the following ini settings:
[Language]
Code=sv

When the implementation is validated on more platforms, I can automate most of the "string" --> _("string") changes.

AJenbo avatar AJenbo commented on April 29, 2024

This works for some strings, but not all, e.g., those defined as global variables cannot be translated at this time since they are initialized before the language subsystem.

The cheat is to still apply the macro to them, and also to the point where they are being presented.

const char *test = _("Static text");
printf(_(test));

But refactoring would be better.

john-tornblom avatar john-tornblom commented on April 29, 2024

That cheat only works if the static text is translated after LanguageInitialize() has been invoked. Global variables are initialized before main() is called, in which case the language subsystem has not loaded the preferred translation (given the current implementation). I know that in C, it is possible to define a "module constructor" that gets invoked before main(), so in principle it is possible to get your suggested trick working. But I'm not sure how that works with C++, since many of its language constructs rely on a runtime lib that must be initialized before being used. If this approach is taken, we need to be very careful with the order in which different "modules" are initialized.

AJenbo avatar AJenbo commented on April 29, 2024

OK, I'll give you a more verbose example:

// Fake gettext function that we tell xgettext to look for.
#define _F(x) x

const char *test = _F("Static text");
int main(int argc, char **argv)
{
    LanguageInitialize();
    printf(_(test));
}

It's done for a lot of applications and works fine there, despite not being ideal.
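For anyone who wants to try the pattern in isolation, here is a self-contained sketch of it. The translation table, the body of `LanguageInitialize`, the `Translate` helper and the `menuText` variable are stand-ins invented for the example, and the Swedish string is sample data. The gettext manual describes the same idea under the name gettext_noop.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical translation table, empty until LanguageInitialize() runs.
static std::map<std::string, std::string> translations;

void LanguageInitialize()
{
	// A real implementation would load a .mo file here; this entry is sample data.
	translations["Static text"] = "Statisk text";
}

// Runtime lookup; falls back to the msgid when no translation is loaded.
const char *Translate(const char *msgid)
{
	auto it = translations.find(msgid);
	return it != translations.end() ? it->second.c_str() : msgid;
}

#define _(x) Translate(x)
// Marker only: lets xgettext extract the string without translating it yet.
#define _F(x) x

// Initialized before main(), so it can only hold the untranslated msgid.
const char *menuText = _F("Static text");
```

Calling `_(menuText)` before `LanguageInitialize()` returns the English msgid, and calling it afterwards returns the translation, which is why the lookup has to happen at the point of use. For extraction, both macros would be passed to xgettext, for example with `--keyword=_ --keyword=_F`.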

john-tornblom avatar john-tornblom commented on April 29, 2024

Right, global variables can only store msgid, and must be passed through the _() macro each time it is being used, e.g., with strcpy, strlen etc. The _F() macro is just used to generate po files?

john-tornblom avatar john-tornblom commented on April 29, 2024

I integrated with CMake to optionally compile the po files (run cmake with -DUSE_GETTEXT=1), and added a Swedish translation of the main menu. Seems to work fine on Ubuntu, should I submit a PR?

AJenbo avatar AJenbo commented on April 29, 2024

Right, global variables can only store msgid, and must be passed through the _() macro each time it is being used, e.g., with strcpy, strlen etc. The _F() macro is just used to generate po files?

Correct, it's non ideal, but works

I integrated with CMake to optionally compile the po files (run cmake with -DUSE_GETTEXT=1), and added a Swedish translation of the main menu. Seems to work fine on Ubuntu, should I submit a PR?

Sounds good :) Yes please

AJenbo avatar AJenbo commented on April 29, 2024

In case someone here hasn't noticed yet: we now have support for Latin, Cyrillic and Greek (no Greek translation yet) texts in game. We also expect to have support for Korean (55% translated), Japanese (still no translation), and Chinese (fully translated for both CN and TW) later this month.

Bulgarian:
image

AJenbo avatar AJenbo commented on April 29, 2024

@uwodb We just finished the first rendering of the Korean font :)

image

If you have time to help with finishing the translation (55% complete), there are still 1-2 weeks before we do the release.
