acli / trn

trn + UTF-8 = ♥ // Unicode might look like trash on trn 4.0 test77, but if it’s open-source it can be fixed

Home Page: https://github.com/acli/trn/wiki

License: Other

Shell 14.37% C 74.12% Roff 6.09% C++ 0.22% Yacc 1.17% Perl 0.36% Tcl 3.56% Makefile 0.11%
usenet-client terminal-based old-stuff unofficial unicode-support old-school work-in-progress

trn's Introduction

Unicode-patched Threaded Read News (trn) 4.0 test77

This is a version of trn-4.0-test77 that has been patched to display UTF-8 reasonably correctly. However, the original “character set” conversions are currently disabled. Bugs that have nothing to do with UTF-8 support are also being worked on, as this is the newsreader I’m actually using (yes, I’ve tried tin and nn, and no, tin is not easier and nn is not better).

Posting is half-working, with Content-Transfer-Encoding declared as 8bit. A more proper fix is the next step.

Further along, the original conversions need to be put back in, but in a way that wouldn’t corrupt UTF-8.

The original README from 4.0-test77 is in README.

trn's People

Contributors

acli, eli-the-bearded

Forkers

eli-the-bearded

trn's Issues

Articles sometimes displayed with wrong encoding

Test case: “MrChrisV is the only one marching in step” in news.software.readers

The article is detected as Latin1 and displayed as mojibake, although it is actually tagged as utf-8. “v” causes it to be detected as utf-8, but ^R causes it to be misdetected as Latin1 again.

(The post itself is mojibake, but trn should still honour the Content-Type header; it currently does not.)
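
To illustrate what honouring the declared charset could involve, here is a minimal sketch that pulls the charset parameter out of a Content-Type header value. The function name content_type_charset() is hypothetical and is not taken from trn's source; the sketch also ignores corner cases such as a parameter whose name merely ends in "charset".

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive check that s begins with prefix (prefix must be lower case). */
static int has_prefix_nocase(const char *s, const char *prefix)
{
    for (; *prefix; s++, prefix++)
        if (tolower((unsigned char)*s) != *prefix)
            return 0;
    return 1;
}

/* Hypothetical helper: copy the lower-cased charset parameter of a
 * Content-Type value into out. Returns 1 if a charset was found, else 0. */
int content_type_charset(const char *value, char *out, size_t outsize)
{
    const char *p;
    size_t n = 0;

    if (outsize == 0)
        return 0;
    for (p = value; *p; p++)
        if (has_prefix_nocase(p, "charset="))
            break;
    if (!*p)
        return 0;                       /* no charset parameter present */
    p += strlen("charset=");
    if (*p == '"')
        p++;                            /* the value may be quoted */
    while (*p && *p != '"' && *p != ';' && !isspace((unsigned char)*p)
           && n + 1 < outsize)
        out[n++] = (char)tolower((unsigned char)*p++);
    out[n] = '\0';
    return n > 0;
}
```

If a helper like this reports a charset, the declared value could take priority over any heuristic detection, including when the article is redrawn.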

Intermittent missed tags

Sometimes trn misses lots of tags, to the point the entire post can come out blank. The cause might be buffer size or tags being cut in half by the buffer boundary, as trn seems to never see the missed tags at all.
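
If the buffer-boundary theory is right, one possible remedy is to hold back an unfinished tag until the next read. The sketch below is only an illustration of that idea, not trn's code: complete_prefix_len() is a hypothetical name, and it assumes tags are delimited by '<' and '>' with no '>' inside attribute values.

```c
#include <stddef.h>

/* Return how many leading bytes of buf[0..len) can be processed without
 * cutting a tag in half. If the buffer ends inside an unclosed "<...",
 * the returned length stops just before that '<'. */
size_t complete_prefix_len(const char *buf, size_t len)
{
    size_t i = len;

    while (i > 0) {
        i--;
        if (buf[i] == '>')
            return len;     /* the last tag in the buffer is closed */
        if (buf[i] == '<')
            return i;       /* tag opened but not closed: hold it back */
    }
    return len;             /* no tag delimiters at all */
}
```

The caller would process only the returned prefix and move the remaining bytes to the front of the buffer before the next read. A tag longer than the whole buffer would still need separate handling.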

Names can get “shortened” to mojibake

If a name starts with CJK characters followed by ASCII, the name shortener can corrupt the CJK part.

This obviously happens because the shortener has no notion of UTF-8 or even DBCS.
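
As a rough sketch of what a UTF-8-aware shortener needs, the function below finds a cut point that never lands in the middle of a multi-byte sequence. The name utf8_safe_cut() is hypothetical, and the input is assumed to be valid UTF-8; this is not the code used in trn.

```c
#include <stddef.h>

/* Return the largest length <= max_bytes at which s (valid UTF-8, len bytes)
 * can be truncated without splitting a multi-byte sequence. */
size_t utf8_safe_cut(const char *s, size_t len, size_t max_bytes)
{
    size_t cut = len < max_bytes ? len : max_bytes;

    if (cut == len)
        return cut;         /* nothing is cut off */
    /* Back up while the first dropped byte is a continuation byte (10xxxxxx),
     * which would mean the cut falls inside a character. */
    while (cut > 0 && ((unsigned char)s[cut] & 0xC0) == 0x80)
        cut--;
    return cut;
}
```

For example, shortening a name that begins with a three-byte CJK character to two bytes yields an empty string rather than two stray bytes.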

Are all unit tests passing?

I merged your changes to my fork and they applied cleanly to my master (which is trn4.0test77 from sourceforge). Then I rebased my current work in my cmake branch on top of your changes. In the process of integrating your unit tests into the ones that I had added, I am getting the following failures in gtest:

[ RUN      ] TerminateStringAtVisualIndexTest.iso8859_1
C:\Code\trn\trn\tests\trn\test_utf.cpp(405): error: Expected equality of these values:
  m_after
    Which is: "\xC3\xA1\xC3\xAD\xC3\xBA\xC3\xA9\xC3\xB3"
    As Text: "áíúéó"
  m_buffer
    Which is: "\xC3\xA1\xC3\xAD\xC3\xBA"
    As Text: "áíú"
terminate_string_at_visual_index(5)
[  FAILED  ] TerminateStringAtVisualIndexTest.iso8859_1 (11 ms)
[ RUN      ] TerminateStringAtVisualIndexTest.cjk_basic
C:\Code\trn\trn\tests\trn\test_utf.cpp(416): error: Expected equality of these values:
  m_after
    Which is: "\xAF\xA7\xE5\x8C\x96\xE9\xA3\x9B\xE7\x81\xB0"
  m_buffer
    Which is: "\xAF\xA7\xE5\x8C\x96\xE9\xA3\x9B\xE7\x81\xB0\xE4"
terminate_string_at_visual_index(8)
[  FAILED  ] TerminateStringAtVisualIndexTest.cjk_basic (2977 ms)
[ RUN      ] TerminateStringAtVisualIndexTest.cjk_basic_at_wrong_boundary
C:\Code\trn\trn\tests\trn\test_utf.cpp(427): error: Expected equality of these values:
  m_after
    Which is: "\xE5\xAF\xA7\xE5\x8C\x96\xE9\xA3\x9B\xE7\x81\xB0 "
    As Text: "寧化飛灰 "
  m_buffer
    Which is: "\xE5\xAF\xA7\xE5\x8C\x96\xE9\xA3\x9B\xE7\x81\xB0\xE4"
terminate_string_at_visual_index(9)
[  FAILED  ] TerminateStringAtVisualIndexTest.cjk_basic_at_wrong_boundary (3 ms)

My assumption is that I missed something when I was rebasing my changes on top of yours, but I wanted to check to see if you get a clean test run on your tree. I am developing on Windows, so I don't have ready access to a unix machine right now. I'm going to see what I can do under WSL, but that's a whole other journey....

util.c has external dependencies

util.c contains the very important safemalloc() function but has links to nntp stuff etc. so you can’t use safemalloc() if you want to create a unit-testable module.
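
One way to break the dependency might be a small stand-alone allocation module whose failure behaviour is injected by the caller. The sketch below only illustrates that idea: checked_malloc(), set_oom_handler() and the handler type are hypothetical names, and the real safemalloc() in util.c may behave differently.

```c
#include <stdio.h>
#include <stdlib.h>

/* Called when an allocation fails; a newsreader could install a handler
 * that cleans up the terminal or NNTP state before exiting. */
typedef void (*oom_handler_fn)(size_t requested);

static void default_oom_handler(size_t requested)
{
    fprintf(stderr, "out of memory (requested %lu bytes)\n",
            (unsigned long)requested);
    exit(1);
}

static oom_handler_fn oom_handler = default_oom_handler;

/* Install a custom handler; passing NULL restores the default. */
void set_oom_handler(oom_handler_fn fn)
{
    oom_handler = fn ? fn : default_oom_handler;
}

/* malloc() wrapper that reports failure through the installed handler. */
void *checked_malloc(size_t size)
{
    void *p = malloc(size ? size : 1);  /* avoid a zero-size request */

    if (p == NULL)
        oom_handler(size);
    return p;
}
```

A unit test can then link against this module alone and, if needed, install a handler that records the failure instead of exiting.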

KILL files do not seem to work

KILL file commands that should work (according to articles found by Google), and that do work when typed at the prompt, get “processed” when entering a newsgroup but do not seem to have any effect.

This might be an NNTP-only issue. Need to investigate.

Configure uses the strange term "termlib"

The Configure script uses the strange term "termlib". It should be updated to mention termcap/terminfo since the latter are terms people actually use; "termlib" isn't.

ctrl-g broken

In digest-format articles (as far as I know, still used only for comp.risks), ctrl-g jumps to the next message in the digest. Or it should: in acli-trn it just hangs. I have not yet diagnosed it further.

Alternate text not honoured

Partly because double quotes are removed and partly because tags seem to be truncated, trn does not check for alt text at all. This must be fixed to comply with Ontario laws.

UTF-8 that is not 1ch wide treated as 1ch wide

Some UTF-8 characters are double-width (e.g., East Asian) but trn assumes all characters are 1ch wide so this throws the alignment off.

Other UTF-8 characters are zero-width (e.g., combining diacritics) but also treated as 1ch wide; this also throws the alignment off.
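
The sketch below shows the kind of width calculation this implies. It is a simplified illustration, not trn's code: the function names are hypothetical, the input is assumed to be valid UTF-8, and the range table covers only a small subset of the wide and zero-width ranges.

```c
#include <stddef.h>

/* Decode one UTF-8 sequence at s (assumed valid); store the code point in
 * *cp and return the number of bytes consumed. */
static size_t decode_utf8(const unsigned char *s, unsigned long *cp)
{
    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    if ((s[0] & 0xE0) == 0xC0) {
        *cp = ((s[0] & 0x1FUL) << 6) | (s[1] & 0x3FUL);
        return 2;
    }
    if ((s[0] & 0xF0) == 0xE0) {
        *cp = ((s[0] & 0x0FUL) << 12) | ((s[1] & 0x3FUL) << 6) | (s[2] & 0x3FUL);
        return 3;
    }
    if ((s[0] & 0xF8) == 0xF0) {
        *cp = ((s[0] & 0x07UL) << 18) | ((s[1] & 0x3FUL) << 12)
            | ((s[2] & 0x3FUL) << 6) | (s[3] & 0x3FUL);
        return 4;
    }
    *cp = 0xFFFD;           /* stray byte: treat as a replacement character */
    return 1;
}

/* Simplified width table: 0 for combining diacritics, 2 for common East
 * Asian wide ranges, 1 for everything else. */
static int codepoint_width(unsigned long cp)
{
    if (cp >= 0x0300 && cp <= 0x036F)
        return 0;
    if ((cp >= 0x1100 && cp <= 0x115F) ||
        (cp >= 0x2E80 && cp <= 0xA4CF && cp != 0x303F) ||
        (cp >= 0xAC00 && cp <= 0xD7A3) ||
        (cp >= 0xF900 && cp <= 0xFAFF) ||
        (cp >= 0xFF00 && cp <= 0xFF60) ||
        (cp >= 0xFFE0 && cp <= 0xFFE6))
        return 2;
    return 1;
}

/* Number of terminal columns the NUL-terminated UTF-8 string occupies. */
int utf8_display_width(const char *str)
{
    const unsigned char *s = (const unsigned char *)str;
    int width = 0;

    while (*s) {
        unsigned long cp;
        s += decode_utf8(s, &cp);
        width += codepoint_width(cp);
    }
    return width;
}
```

On systems with a UTF-8 locale, the standard mbrtowc() and wcwidth() functions could replace the hand-written decoder and table.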
