Coder Social home page Coder Social logo

rtf's Introduction

Main Usage

$note = new JoshRibakoff_Note;
$note->setRTF($yourRTFDocumentAsText);
$note->formatHTML(); // returns your RTF converted to HTML as best as possible.

Convert between RTF documents & HTML or plaintext.

I was tasked with the wild job of displaying data from a closed sourced microsoft product in our PHP web app, allowing it to be manipulated in a WYSIWYG editor, and written back. It turned out this product used an MSSQL database which PHP can work with. The problem then was the manipulation and conversion of RTF to HTML and back. So I created this monstrosity using TDD. It was really interesting to compare how elegant HTML is compared to RTF, given that they both do essentially the same thing.

I don't recommend using RTF for anything. Its a bad Microsoft standard. If you do, you can use this library which translates RTF to HTML & back (and of course plaintext) it supports font colors, sizes, bold, italics & underline.

Other Uses & Design Forces

Part of RTF is nested "{" and "}", nested data doesn't work well with regex which is why I wrote a full lexer. There is a class that can match any "{" to its matching "}", called recursively it can create a multi dimensional array. You could re-use this class to create a language parsed by PHP (if you were so inclined)

Short Comings

Given the absurdity of working with a Microsoft product, I did the minimal coding to complete the job. My task only required to support bold, underline, italics, and color. Some other basic stuff is supported but it will have undefined results with more advanced RTF like images & tables. Also my task was to simply take HTML entered in a WYSIWYG editor and prepend it to the top of the RTF document. If you want to actually convert an HTML document to RTF from scratch, you can't with this. Nor should you. The main goal is mostly to go from RTF to HTML to display an RTF document in your web app, while supporting basic manipulation of the RTF.

Tag Order

In RTF, tags can be closed in a different order than they are opened. It is a lexer which means it looks for tokens at the pointer position in the RTF text, when it finds a token it does some work and then advances the pointer, until there is no more RTF data to lex. If the current token closes a tag that wasn't the last tag opened, close as few tags as necessary to allow us to close the current token's tag. As we close each tag, we remove it from the stack of open tags, but push it onto a second stack so we can reopen them again after closing the tag for the current token. THen it Puts an HTML closing tag for the current token and removes that token from the stack of open tags

In order words:

\b bold \i bold+italics \b0 italics \i0

This would translate directly to:

<b>bold<i>bold+italics</b>italics</i>

Which is invalid. So this library "guesses" what the HTML version would look like & would do this:

<b>bold<i>bold+italics</i></b><i>italics</i>

Also RTF has a "font table" where colors are indexed to an ID in the header. This library manipulates that font table header.

Further Examples

Convert simple RTF fragment to HTML

$converter = new JoshRibakoff_Note_RTFToHTML();
$converter->convert('\\ulfoo\\ulnone');

Creates this:

<u>foo</u>

Modifying an existing RTF document to add some text (converted from HTML to RTF)

$startNote = '{\\rtf1\\ansi\\ansicpg1251\\deff0\\deflang1049{\\fonttbl{\\f0\\fnil\\fcharset0 Microsoft Sans Serif;}{\\f1\\fnil\\fcharset204 Microsoft Sans Serif;}}
\\uc1\\pard\\f0\\fs17 4/4/2013 2:36:12 PM - jribakoff: simple note\\par
\\f1\\par
}
';

$note = new JoshRibakoff_Note();
$note->setRTF($startNote);
$note->prependNote(array(
      'text' => '<b style="foo">bold note</b>'
));

Creates this:

\\rtf1\\ansi\\ansicpg1251\\deff0\\deflang1049{\\fonttbl{\\f0\\fnil\\fcharset0 Microsoft Sans Serif;}{\\f1\\fnil\\fcharset204 Microsoft Sans Serif;}}
{\\colortbl }
\\f0\\fs17 \b bold note \b0\\par
\\uc1\\pard\\f0\\fs17 4/4/2013 2:36:12 PM - jribakoff: simple note\\par
\\f1\\par

`

rtf's People

Contributors

joshribakoff avatar alipek avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.