
Comments (36)

krperry avatar krperry commented on May 20, 2024 1

from liblouis.

krperry avatar krperry commented on May 20, 2024

This link leads to the discussion of this issue on the liblouis-liblouisxml mailing list (FreeLists):
https://www.freelists.org/post/liblouis-liblouisxml/Emphasis-phrase-question-and-forking-till-fixed,7

from liblouis.

bertfrees avatar bertfrees commented on May 20, 2024

See also https://www.freelists.org/post/liblouis-liblouisxml/Emphasis-phrase

from liblouis.

krperry avatar krperry commented on May 20, 2024

from liblouis.

bertfrees avatar bertfrees commented on May 20, 2024

I'm going to need more tests to see if there is more to it, but it seems there is already one thing we can do: we can change how words are counted to determine the length of phrases. Currently Liblouis only counts whole words (thereby treating unemphasisable characters at the beginning and end as spaces). But since the goal of marking phrases is to reduce the number of indicators, we could perfectly well justify counting half words too, as this won't increase the number of indicators. (Note that this is true only on the condition that endemphphrase after is used, as is the case in UEB. When endemphphrase before, or begemphword, is used and the emphasized part of the last word does not span the whole word, one additional indicator is needed to cancel emphasis.)
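To make the difference between the two counting strategies concrete, here is a minimal Python sketch. It is not Liblouis code: the function name, the `+` mask convention (borrowed from the YAML typeform masks) and the use of `isalnum()` as a stand-in for "emphasisable character" are all assumptions made for illustration.

```python
def count_emphasized_words(text, mask, include_partial=False):
    """Count space-separated words of `text` that carry emphasis.

    `mask` has one character per character of `text`; "+" marks an
    emphasized character. Characters that are not letters or digits are
    ignored, which mimics treating unemphasisable characters as spaces.
    (Hypothetical helper, not part of Liblouis.)
    """
    if len(text) != len(mask):
        raise ValueError("text and mask must have the same length")
    count = 0
    pos = 0
    for word in text.split(" "):
        word_mask = mask[pos:pos + len(word)]
        pos += len(word) + 1  # skip the word and the following space
        flags = [m == "+" for ch, m in zip(word, word_mask) if ch.isalnum()]
        if not flags:
            continue  # nothing emphasisable in this word
        if all(flags) or (include_partial and any(flags)):
            count += 1
    return count


text = "abc abc defg"
mask = "+++ +++ ++  "  # the last word is only half emphasized
print(count_emphasized_words(text, mask))                        # 2
print(count_emphasized_words(text, mask, include_partial=True))  # 3
```

If the table's phrase length were three words, the whole-word count (2) would fall below the threshold while the count that includes half words (3) would reach it, which is the change proposed above.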

from liblouis.

bertfrees avatar bertfrees commented on May 20, 2024

Regarding the all caps issue: there is some code in Liblouis that was added exactly to achieve the opposite of what you say should happen, namely that if the last word of the phrase ends with non-letters (punctuation), the endemphphrase after indicator is inserted after them. The code was included specifically when the noempclass feature was added, in order to preserve the old behavior.

When I disable this code, your "'ABC ABC DEF' defg" example is translated the way you say it should, but several UEB tests start failing because they claim the opposite should happen. @krperry You may want to check them, they are in en-ueb-08-capitalization.yaml.

Apart from the UEB tests, a number of other tests are affected. Some, like Norwegian, are actually improvements, but others, such as Swedish, are regressions. In other words, different braille codes do it differently.

from liblouis.

krperry avatar krperry commented on May 20, 2024

from liblouis.

krperry avatar krperry commented on May 20, 2024

from liblouis.

krperry avatar krperry commented on May 20, 2024

from liblouis.

bertfrees avatar bertfrees commented on May 20, 2024

> I think the three tests pretty much show everything

They may explain the requirement, but for really good coverage we need more tests IMO:

  • tests with other kinds of punctuation
  • tests with punctuation that is not enclosing (only before, only after, or different punctuation before and after)
  • tests with enclosing punctuation combined with other punctuation
  • tests with punctuation at the beginning and/or end included in the emphasis
  • etc. etc.
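As a rough illustration of how such coverage could be enumerated, the following Python sketch generates input strings together with aligned bold masks for several opener/closer combinations, with the punctuation either inside or outside the emphasis. The punctuation lists, the sample phrase and the function name are assumptions; the expected braille would still have to be written by hand for each case.

```python
from itertools import product

# Hypothetical selection of punctuation; extend as needed.
OPENERS = ["", "(", '"', "["]
CLOSERS = ["", ")", '"', "]", "!"]


def make_cases(phrase="abc abc defg", tail=" defg"):
    """Return (text, mask) pairs; "+" in the mask marks bold characters."""
    cases = []
    for opener, closer in product(OPENERS, CLOSERS):
        text = f"{opener}{phrase}{closer}{tail}"
        for include_punct in (False, True):
            if include_punct:
                # Emphasis covers the surrounding punctuation as well.
                start = 0
                end = len(opener) + len(phrase) + len(closer)
            else:
                # Emphasis covers only the phrase itself.
                start = len(opener)
                end = start + len(phrase)
            mask = " " * start + "+" * (end - start) + " " * (len(text) - end)
            cases.append((text, mask))
    # Drop duplicates (no punctuation at all) while keeping the order.
    return list(dict.fromkeys(cases))


for text, mask in make_cases()[:4]:
    print(repr(text))
    print(repr(mask))
```

This covers enclosing, non-enclosing and mismatched punctuation in one pass; combinations with additional punctuation inside the phrase would need a further loop.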

> I am going by what transcribers tell me at APH

The existing tests are coming from the UEB rule book AFAIK. So they are probably correct. But nevertheless I think it's worth checking them out. These are the tests:

  • CAUTION: WET PAINT!
  • IT'S A HOAX! (APRIL FOOL!)
  • V-NECK SWEATERS FOR SALE!

It would be good if you could add some clarifying comments to the tests, and also to your new tests, to explain the different expectations in different cases. The more clarifying comments are included in YAML files, the better.

from liblouis.

jrbowden avatar jrbowden commented on May 20, 2024

Regarding the placement of the caps terminator, there are two cases:

  1. When there is punctuation which does not "balance", then the caps terminator goes at the end. This is the case with the examples from RUEB 8. The examples are correct. e.g.
    CAUTION: WET PAINT!
    ⠠⠠⠠⠉⠁⠥⠰⠝⠒⠀⠺⠑⠞⠀⠏⠁⠔⠞⠖⠠⠄
    Here, the caps terminator comes after the exclamation.

  2. But, when there are balancing punctuation marks (like the brackets or quotes), the capitals terminator should go before the matching punctuation, the principle of "nesting". So @krperry's examples, e.g.:
    "ABC ABC DEFG" defg
    ⠦⠠⠠⠠⠁⠃⠉⠀⠁⠃⠉⠀⠙⠑⠋⠛⠠⠄⠴⠀⠙⠑⠋⠛
    Note I have deliberately changed to double quotes to make the punctuation and caps terminator more obvious.

What I have said for capitals is equally true for typeforms (like bold):

  1. Caution: wet paint!
    ⠘⠶⠠⠉⠁⠥⠰⠝⠒⠀⠺⠑⠞⠀⠏⠁⠔⠞⠖⠘⠄

  2. "abc abc defg" defg
    ⠦⠘⠶⠁⠃⠉⠀⠁⠃⠉⠀⠙⠑⠋⠛⠘⠄⠴⠀⠙⠑⠋⠛

See Rules of Unified English Braille section 9.7 re typeforms and punctuation.
It would be difficult to get all the examples passing - some require "an understanding of the text".
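The two cases can be summarised in a small Python sketch. This is only an illustration of the rule as described in this comment, not how Liblouis implements it: the `PAIRS` table and the function name are assumptions, and mixed examples such as "IT'S A HOAX! (APRIL FOOL!)" are deliberately not handled.

```python
# Hypothetical table of "balancing" punctuation: opener -> closer.
PAIRS = {"(": ")", "[": "]", "{": "}", "<": ">", '"': '"', "'": "'"}


def terminator_index(text, start, end):
    """Return the index in `text` where the terminator would be placed.

    `start` and `end` delimit the emphasized characters as text[start:end].
    """
    # Find the end of the punctuation that directly follows the phrase.
    stop = end
    while stop < len(text) and not text[stop].isspace() and not text[stop].isalnum():
        stop += 1
    trailing = text[end:stop]

    # Look at the character just before the phrase for an opener.
    opener = text[start - 1] if start > 0 else ""
    closer = PAIRS.get(opener)

    if closer and trailing.startswith(closer):
        return end   # case 2: nest, terminator before the matching closer
    return stop      # case 1: terminator after the trailing punctuation


print(terminator_index('"ABC ABC DEFG" defg', 1, 13))   # 13, before the closing quote
print(terminator_index("CAUTION: WET PAINT!", 0, 18))   # 19, after the exclamation mark
```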

I agree with @bertfrees that more examples are needed. I will attempt to write some and add them to this ticket.

from liblouis.

krperry avatar krperry commented on May 20, 2024

from liblouis.

bertfrees avatar bertfrees commented on May 20, 2024

Thank you @jrbowden, that makes things more clear for me. And I agree with you that it might be difficult to get all the tests to pass.

The "IT'S A HOAX! (APRIL FOOL!)" example comes from RUEB section 8.6.2, which seems to be the relevant section for this issue:

> The capitals terminator may precede or follow punctuation and other terminators but it is best that indicators and paired characters such as parentheses, square brackets and quotes be nested. That is, close punctuation and indicators in reverse order of opening.

Strange that this hasn't been referenced by anyone before.

I notice that it says "it is best that", not "it is required that". So that means the behavior of Liblouis should ideally be improved, but is acceptable as it is.

So I guess the "serious bug" that Ken referred to on the mailing list is the emphasis part. Regarding that part: my commit 30df4d9 fixes Ken's first and second test. I'm not sure that means it resolves the issue. It's only two tests.

from liblouis.

krperry avatar krperry commented on May 20, 2024

from liblouis.

bertfrees avatar bertfrees commented on May 20, 2024

@krperry Great, keep me posted.

from liblouis.

krperry avatar krperry commented on May 20, 2024

from liblouis.

bertfrees avatar bertfrees commented on May 20, 2024

The only commit you need is 30df4d9. It's on the phrase-emphasis branch, but don't use the last commit of that branch.

from liblouis.

krperry avatar krperry commented on May 20, 2024

from liblouis.

krperry avatar krperry commented on May 20, 2024

from liblouis.

krperry avatar krperry commented on May 20, 2024

from liblouis.

bertfrees avatar bertfrees commented on May 20, 2024

git fetch https://github.com/liblouis/liblouis.git && git checkout 30df4d99f6

from liblouis.

krperry avatar krperry commented on May 20, 2024

With your fix, this seems to work for double quotes and single quotes. It has mostly fixed the other punctuation, such as (), {}, [], and <>. The only remaining problem is that the closing punctuation is treated as part of the emphasis even though it is not marked. It is hard to post it here, but if you put a paren at the start of a phrase and a paren at the end, then bold the phrase but not the parens, the closing paren should be outside the phrase end mark. I am pretty sure that was in the tests I sent in. I will download them and check. This is the same for all enclosing punctuation. This is much better than it was; now if we can only get that last bit fixed.

from liblouis.

krperry avatar krperry commented on May 20, 2024

Do I need to add more tests to get the enclosing punctuation problems fixed? I know one of the tests I put in the original ticket showed the problem, but maybe more are needed. The case I am talking about is text enclosed in parentheses, brackets, braces, angle brackets, or pretty much any punctuation that a person might use for enclosures other than quotes, where the inside is bolded but the punctuation is not. Liblouis still gets that wrong. If we can get that fixed, this ticket can be closed. As it is, we put the current Liblouis in the current stable BrailleBlaster 2.1, and so far it is good other than this problem.

from liblouis.

bertfrees avatar bertfrees commented on May 20, 2024

Hi Ken. I'm a bit confused. As far as I can tell the issue you describe is fixed. The test that I am running is the following:

table: |
  include tables/unicode.dis
  include tables/spaces.uti
  include tables/en-chardefs.cti
  include tables/en-ueb-g1.ctb
tests:
- - "(abc abc defg) defg"
  - ⠷⠘⠶⠁⠃⠉⠀⠁⠃⠉⠀⠙⠑⠋⠛⠘⠄⠾⠀⠙⠑⠋⠛
  - typeform:
      bold: ' ++++++++++++      '
- - "(abc abc abc defg) defg"
  - ⠷⠘⠶⠁⠃⠉⠀⠁⠃⠉⠀⠁⠃⠉⠀⠙⠑⠋⠛⠘⠄⠾⠀⠙⠑⠋⠛
  - typeform:
      bold: ' ++++++++++++++++      '

It corresponds with your "example 2" in your initial comment:

  2. A bold phrase with non-bolded parentheses:
    (abc abc defg) defg

Can you spot a mistake in the test?
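One mechanical way to check a test like this is to decode the typeform mask and print exactly which characters it marks as bold. The helper below is a small Python sketch written for this purpose (the function name is made up, and it assumes the mask is aligned character for character with the input string).

```python
def emphasized_spans(text, mask, marker="+"):
    """Return the substrings of `text` whose mask characters equal `marker`."""
    if len(text) != len(mask):
        raise ValueError("text and mask must have the same length")
    spans = []
    current = []
    for ch, m in zip(text, mask):
        if m == marker:
            current.append(ch)
        elif current:
            # A marked run just ended; store it and start over.
            spans.append("".join(current))
            current = []
    if current:
        spans.append("".join(current))
    return spans


print(emphasized_spans("(abc abc defg) defg", " ++++++++++++      "))
# ['abc abc defg']
print(emphasized_spans("(abc abc abc defg) defg", " ++++++++++++++++      "))
# ['abc abc abc defg']
```

In both tests the decoded span excludes the parentheses, so the masks do describe a bold phrase inside non-bolded parentheses.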

from liblouis.

krperry avatar krperry commented on May 20, 2024

from liblouis.

bertfrees avatar bertfrees commented on May 20, 2024

I'm sorry, your last comment makes no sense to me...

In the test the parens are not bolded, and the opening and closing bold tags are inside the parens.

The test I'm using is simply taken from the file that you sent us earlier. And as far as I can tell it matches the requirement.

@jrbowden what is your take on this?

from liblouis.

krperry avatar krperry commented on May 20, 2024

from liblouis.

bertfrees avatar bertfrees commented on May 20, 2024

OK (relief).

I'm not entirely sure but I think it was passing before. In any case, it does with the current release.

from liblouis.

krperry avatar krperry commented on May 20, 2024

from liblouis.

krperry avatar krperry commented on May 20, 2024

from liblouis.

jrbowden avatar jrbowden commented on May 20, 2024

Hi @krperry and @bertfrees,
As promised, I have attached a more extensive set of tests.
When I run it here, all but one test pass. The one that fails is possibly debatable, but I think it is correct.
I hope this helps.
The question is where these tests should go.

from liblouis.

bertfrees avatar bertfrees commented on May 20, 2024

Somewhere in tests/braille-specs/en-ueb-rueb.yaml?

from liblouis.

bertfrees avatar bertfrees commented on May 20, 2024

@krperry Do you already have more clarity?

from liblouis.

krperry avatar krperry commented on May 20, 2024

from liblouis.

egli avatar egli commented on May 20, 2024

Yes, if it all works then let's close this :-)

from liblouis.

krperry avatar krperry commented on May 20, 2024

Working; tested in BrailleBlaster.

from liblouis.
