Comments (14)

joncampbell123 commented on July 17, 2024

Good news!

I finally got around to obtaining a NEC PC-9821 laptop to develop this part of DOSLIB on. It's a 486DX 50MHz with 10MB of RAM. Although the seller shipped it with no OS, I was able to reinstall the NEC-98 version of MS-DOS 6.2. Many PC-98 games, including Touhou Project 1-5, run perfectly fine on it, though without any sound.

I will be able to begin developing more code in the hw/necpc98 part of the project.

The RS-232 port on the laptop appears to be something proprietary rather than the familiar 9-pin RS-232C port. Do any adapters exist to bring it out to RS-232C? I would love to port the remote control program to work on PC-98 to aid development, if I can figure out how to program the serial port.

from doslib.

joncampbell123 commented on July 17, 2024

@gingerbeardman That would be helpful, yes!

I've managed to gather a few PDF scans already that could use OCR. Some I found on the Internet Archive.

http://hackipedia.org/browse/Computer/Platform/PC,%20NEC%20PC-98/Collections

joncampbell123 commented on July 17, 2024

I have also updated Hackipedia.org with the PC-98 material I've found so far:

http://hackipedia.org/Platform/x86/NEC%20PC-98/

joncampbell123 commented on July 17, 2024

Next task for PC-98 development: Some quick one-off programs to play with keyboard input via INT 18h. Then, begin the 8251 library to demonstrate talking directly to the 8251 chips in the PC-98 platform that drive a) the keyboard and b) the RS-232C port.
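As a rough starting point for the 8251 side, here is a minimal sketch of building the asynchronous mode instruction byte, based on the bit layout in the Intel 8251A datasheet. The function and enum names are hypothetical (they are not existing DOSLIB identifiers), and no port I/O is shown, since the PC-98 port addresses still need confirming against the documentation.

```c
/* Parity options for the 8251 asynchronous mode instruction. */
enum uart8251_parity {
    UART8251_PARITY_NONE,
    UART8251_PARITY_ODD,
    UART8251_PARITY_EVEN
};

/* Hypothetical helper: build the 8251 async mode byte.
 *   bits 1-0: baud rate factor (01 = x1, 10 = x16, 11 = x64)
 *   bits 3-2: character length (00 = 5 bits ... 11 = 8 bits)
 *   bit 4:    parity enable
 *   bit 5:    1 = even parity, 0 = odd parity
 *   bits 7-6: stop bits (01 = 1, 10 = 1.5, 11 = 2)
 * stop_half_bits is the stop length in half bits: 2 = 1, 3 = 1.5, 4 = 2.
 * Returns the mode byte (0..255), or -1 if an argument is invalid. */
int uart8251_async_mode(unsigned clock_factor, unsigned data_bits,
                        enum uart8251_parity parity, unsigned stop_half_bits)
{
    unsigned mode;

    switch (clock_factor) {
    case 1:  mode = 0x01u; break;
    case 16: mode = 0x02u; break;
    case 64: mode = 0x03u; break;
    default: return -1;
    }

    if (data_bits < 5u || data_bits > 8u)
        return -1;
    mode |= (data_bits - 5u) << 2;

    if (parity == UART8251_PARITY_ODD)
        mode |= 0x10u;          /* parity enabled, odd */
    else if (parity == UART8251_PARITY_EVEN)
        mode |= 0x30u;          /* parity enabled, even */
    else if (parity != UART8251_PARITY_NONE)
        return -1;

    if (stop_half_bits < 2u || stop_half_bits > 4u)
        return -1;
    mode |= (stop_half_bits - 1u) << 6;

    return (int)mode;
}
```

For example, `uart8251_async_mode(16, 8, UART8251_PARITY_NONE, 2)` gives 0x4E, the usual 8N1 setting with a x16 clock. On real hardware the mode byte is written to the control port right after a reset, followed by a command byte.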

I may have to finagle a bit, as the available documentation is in Japanese and not in an OCR'd format I can just copy-paste into Google Translate.

I'm reading from what docs I have that later PC-9821 systems have a proper 16550 UART but emulate the 8251 for backwards compatibility. Is that right?

gingerbeardman commented on July 17, 2024

If you need any Japanese documents run through OCR, let me know! I have software set up to do just that.

joncampbell123 commented on July 17, 2024

I'm also interested in any documentation concerning NEC's ANSI driver. It seems to have a direct interface via INT DCh, but I can only find some documentation on the "extensions" to the interface. Many games and utilities seem to call on it. One call I traced into appears to set/retrieve the function key row text.

gingerbeardman commented on July 17, 2024

I'll OCR them soon.

Also, have you contacted the author of np2kai? I'm sure he'd share documentation.

gingerbeardman commented on July 17, 2024

OK, here we go! This was some heavy work for my little old MBP.

Pre-process

  • remove any existing OCR using PDFpenPro
  • de-skew using "Enhance Scans" in Acrobat
  • split large files in half by duplicating, then deleting unwanted half from each

Post-process

  • re-combine them afterwards, if required

Anyway, here are the OCR'd files. I'd keep them alongside the originals.

I also tried the following, without success:

  • PDFpen Pro (got so very close)
  • FineReader (ABBYY Pro)

Also, I'd like to point you to the Neo Kobe collection and also the Tokugawa Corporate Forums.

Translation Aggregator is a great little app to get multiple translations of whatever you copy into the clipboard. Windows only, so I run it using Wine.

Let me know how you get on with these. Happy to redo/tweak.

joncampbell123 commented on July 17, 2024

I will place these OCR'd PDFs on the private copy of my hackipedia site to work from. I assume you'd rather I not publish them on the site publicly.

I checked over the PDFs and I can confirm the text is selectable, and copying the text to Notepad (Windows) or Leafpad (Linux) shows text that resembles what is on the page. Considering that some of the kanji are fairly blurry, I'm impressed.

gingerbeardman commented on July 17, 2024

I don't mind what you do with them. Feel free to share them publicly. I claim no ownership.

The new files may contain slightly lower quality image data due to the way the OCR apps modify them, so it's still worth keeping the originals around. If I redo them I always work from the originals.

There's some very impressive OCR software available these days, though not every OCR app supports Japanese, and each has its own strengths and weaknesses.

As you work with them I'd appreciate feedback on which set gives more consistent accuracy. Then in future I'll just use that one OCR app to save time!

gingerbeardman commented on July 17, 2024

Updated Translation Aggregator download link

gingerbeardman commented on July 17, 2024

I reinstalled PDFpenPro and managed to get some mediocre results:

http://www.mediafire.com/file/z30acwfyrc5y55a/PC98-OCR-PDFpenPro.7z

My thoughts on comparative quality, first is best:

  1. ScanSnap
  2. Acrobat
  3. PDFpenPro

Interestingly, that is also the order of ease of processing, so I'll stick with ScanSnap for now.

joncampbell123 commented on July 17, 2024

So far so good. The only OCR errors I see are cases where it can't tell between 1 and I (capital i) and l (lowercase L).
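Since most of the numbers in these documents are assembler-style hex values such as 18h, one way to catch that particular confusion might be a small post-processing pass over the extracted text. The following is only a sketch under that assumption: `ocr_fix_hex_token` is a hypothetical helper, and it only repairs tokens that already look like a hex literal (leading digit or digit lookalike, trailing h).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: repair, in place, a token that looks like an
 * assembler-style hex literal in which OCR read '1' as 'I' or 'l'.
 * A token qualifies if it ends in 'h' or 'H', starts with a decimal
 * digit or a lookalike ('I' or 'l'), and every other character before
 * the suffix is a hex digit or a lookalike.
 * Returns the number of characters replaced, or -1 if the token does
 * not look like a hex literal (the token is then left unchanged). */
int ocr_fix_hex_token(char *tok)
{
    size_t len = strlen(tok);
    size_t i;
    int fixed = 0;

    if (len < 2 || (tok[len - 1] != 'h' && tok[len - 1] != 'H'))
        return -1;

    /* First pass: validate without modifying anything. */
    for (i = 0; i + 1 < len; i++) {
        unsigned char c = (unsigned char)tok[i];
        int lookalike = (c == 'I' || c == 'l');

        if (i == 0 && !isdigit(c) && !lookalike)
            return -1;
        if (!isxdigit(c) && !lookalike)
            return -1;
    }

    /* Second pass: replace the lookalikes with '1'. */
    for (i = 0; i + 1 < len; i++) {
        if (tok[i] == 'I' || tok[i] == 'l') {
            tok[i] = '1';
            fixed++;
        }
    }
    return fixed;
}
```

With this, "I8h" becomes "18h" and "4lh" becomes "41h", while ordinary words such as "Each" are rejected because they do not start with a digit. Short prose tokens that happen to fit the pattern could still be changed, so the results would need a manual review.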

gingerbeardman commented on July 17, 2024

Great. I'll see if it's possible to tweak or spell check the text. Maybe use a custom dictionary. We'll see.
