Overview
Right now the i18n API lays the groundwork for some interesting language processing applications. It can detect the language of content on a page (or guess at it and provide a confidence rating), get the UI language of the browser, and let developers publish extensions in multiple supported languages by providing a _locales/lang/messages.json file (among other things).
Users can also download individual language packs, which are then added to their list of languages available for spell checking.
Problematically, there is no way within the browser to remove or manage custom dictionary words (which are tied to the logged-in browser profile and are not, as far as I know, specific to any one dictionary the user has installed). Users must manually edit the persdict.dat file (hidden away in the roaming data folder of the host machine) to clear or edit custom dictionary words.
What's more, although some browser extensions can literally read/consume the content of any web page (given the right permissions), an extension cannot see which custom words a user has saved, nor is the native Hunspell spellchecker exposed.
I point this out because I do not believe that access to a user's custom dictionary list or the browser-bundled spellchecker is any more invasive than permission to read the content of a web page the user is visiting. This proposal would therefore not introduce new or otherwise complex permissions that are foreign to the web extensions landscape, nor go against Mozilla's privacy-first principles.
Summary of Problems
- extensions cannot read or modify a user's custom word list
- extensions cannot call native Hunspell methods (e.g. to let an extension spell check specific text in a different language than the one currently selected for spell checking input fields)
- extensions cannot customize how a browser shows or handles spell checking, so:
  - extensions cannot easily customize the red squiggly underline that indicates a misspelled word (e.g. change the color of the underline, or highlight the whole word instead of just underlining it)
  - extensions cannot limit the number of word suggestions shown based on a user's preference
  - extensions cannot call specific spell check methods and must implement their own spellchecker, which is incredibly wasteful
- extensions cannot automatically switch the native spell checker's language to one of the language packs in use despite the i18n module allowing extensions to detect the language of content on the page
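To make the last point concrete, here is a rough sketch of the glue code an extension could run if it were allowed to switch dictionaries. The DetectionResult shape mirrors what i18n.detectLanguage resolves to; pickDictionary and the list of installed locales are hypothetical names I made up for illustration.

```typescript
// Shape of the value that i18n.detectLanguage() resolves to.
interface DetectedLanguage {
  language: string;   // language code such as "en" or "de"
  percentage: number; // share of the text detected as this language
}

interface DetectionResult {
  isReliable: boolean;
  languages: DetectedLanguage[];
}

// Hypothetical helper: choose an installed dictionary locale (e.g. "de_DE")
// that matches the most prominent detected language.
function pickDictionary(
  result: DetectionResult,
  installed: string[],
): string | undefined {
  // Do not switch dictionaries on a weak guess.
  if (!result.isReliable) return undefined;

  // Try languages in order of decreasing share of the text.
  const ranked = result.languages
    .slice()
    .sort((a, b) => b.percentage - a.percentage);

  for (const detected of ranked) {
    const wanted = detected.language.toLowerCase();
    // Compare only the language part of the locale ("de" in "de_DE").
    const match = installed.find(
      (locale) => locale.toLowerCase().replace(/[_-].*$/, "") === wanted,
    );
    if (match !== undefined) return match;
  }
  return undefined;
}
```

With the proposed API, the returned locale could then be handed to spellcheck.setDictionary, which is exactly the step extensions cannot perform today.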
Proposal
To rectify this I would propose adding a new "spellcheck" API permission scheme that exposes the native Hunspell/spellchecker instance.
Expose parts of the Hunspell/native spell checker
The updated permissions below are based on the following spellcheck API extension proposals:
Types
Something like:
spellcheck.MisspelledWord (Object)
currentSuggestion: string
- the currently suggested word (default to word with closest proximity)
suggestions: array
- strings from 0 to N, 0 having the closest proximity, N having the least proximity
misspelledWord: string
- the misspelled word
node: object
- contains the location of the misspelled word inside the input/textfield (so that we can draw over it)
spellcheck.Options (Object)
limit
- the maximum number of words to suggest for a misspelled word
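As a rough illustration, the two types could be written down in TypeScript like this. None of this exists in any browser today. The shape of node (start and end offsets) is purely my assumption, since the proposal only says "object", and applyOptions is a hypothetical helper showing how limit might be applied.

```typescript
// Assumption: the location is a pair of character offsets inside the field.
interface WordLocation {
  start: number;
  end: number;
}

interface MisspelledWord {
  currentSuggestion: string; // the currently suggested word
  suggestions: string[];     // index 0 is the closest match
  misspelledWord: string;    // the word as the user typed it
  node: WordLocation;        // where the word sits in the input/textfield
}

interface Options {
  limit: number; // maximum number of words to suggest
}

// Hypothetical helper: return a copy of the word with at most
// `options.limit` suggestions, keeping currentSuggestion valid.
function applyOptions(word: MisspelledWord, options: Options): MisspelledWord {
  const suggestions = word.suggestions.slice(0, Math.max(0, options.limit));
  const stillListed = suggestions.indexOf(word.currentSuggestion) !== -1;
  return {
    misspelledWord: word.misspelledWord,
    node: word.node,
    suggestions,
    // Fall back to the closest remaining suggestion (or "" if none are left).
    currentSuggestion: stillListed ? word.currentSuggestion : suggestions[0] ?? "",
  };
}
```

The helper returns a new object rather than mutating its input, so the browser's full suggestion list would stay untouched while the extension shows a shorter one.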
Events
Something like:
spellcheck.onShowWordSuggestions
- fired when a user asks to see misspelled word suggestions
spellcheck.onCycleWordSuggestions
- fired when a user moves up or down through the word suggestions
spellcheck.onSelectWordSuggestion
- fired when a user selects a word
- ?
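A sketch of how these events might look from an extension's point of view, assuming they follow the usual addListener/removeListener/hasListener pattern of other extension events. The SpellcheckEvent class and its emit method are stand-ins I wrote so the example runs on its own, and the assumption that listeners receive a MisspelledWord is mine, not something the proposal pins down.

```typescript
// Assumed payload: the MisspelledWord described under Types (node omitted).
interface MisspelledWord {
  currentSuggestion: string;
  suggestions: string[];
  misspelledWord: string;
}

type Listener<T> = (payload: T) => void;

// Minimal stand-in for an extension event object.
class SpellcheckEvent<T> {
  private listeners: Listener<T>[] = [];

  addListener(listener: Listener<T>): void {
    if (!this.hasListener(listener)) this.listeners.push(listener);
  }

  removeListener(listener: Listener<T>): void {
    this.listeners = this.listeners.filter((l) => l !== listener);
  }

  hasListener(listener: Listener<T>): boolean {
    return this.listeners.indexOf(listener) !== -1;
  }

  // Stands in for the browser firing the event; not part of the proposal.
  emit(payload: T): void {
    this.listeners.slice().forEach((listener) => listener(payload));
  }
}

// Hypothetical namespace mirroring the three proposed events.
const spellcheck = {
  onShowWordSuggestions: new SpellcheckEvent<MisspelledWord>(),
  onCycleWordSuggestions: new SpellcheckEvent<MisspelledWord>(),
  onSelectWordSuggestion: new SpellcheckEvent<MisspelledWord>(),
};
```

An extension would only ever call addListener and friends; the browser would be the one firing the events.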
Methods
Something like:
spellcheck.getCurrentSuggestion(MisspelledWord)
spellcheck.setDictionary(string)
- a four-letter dictionary locale (e.g. en_GB, de_DE); sets the dictionary currently in use
spellcheck.replaceWord(MisspelledWord, string)
- replace a misspelled word with a string
spellcheck.cycleSuggestedWordUp(MisspelledWord)
- cycle the suggested word up (updates MisspelledWord.currentSuggestion)
spellcheck.cycleSuggestedWordDown(MisspelledWord)
- cycle the suggested word down (updates MisspelledWord.currentSuggestion)
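To show the intended behaviour of the cycling and replacement methods, here is an in-memory sketch. It makes a few assumptions the proposal does not spell out: "up" moves towards index 0, cycling wraps around at either end, node holds start/end offsets, and replaceWord takes the field's text as an extra first argument because there is no real input field here.

```typescript
interface MisspelledWord {
  currentSuggestion: string;
  suggestions: string[];
  misspelledWord: string;
  node: { start: number; end: number }; // assumed offsets, end exclusive
}

function getCurrentSuggestion(word: MisspelledWord): string {
  return word.currentSuggestion;
}

// Shared logic: move the current suggestion by `step`, wrapping around.
function cycle(word: MisspelledWord, step: number): void {
  const count = word.suggestions.length;
  if (count === 0) return; // nothing to cycle through
  const index = word.suggestions.indexOf(word.currentSuggestion);
  // If the current suggestion is not in the list, restart at the closest one.
  const next = index === -1 ? 0 : (((index + step) % count) + count) % count;
  word.currentSuggestion = word.suggestions[next] ?? word.currentSuggestion;
}

// "Up" is assumed to mean towards the closest match (lower index).
function cycleSuggestedWordUp(word: MisspelledWord): void {
  cycle(word, -1);
}

function cycleSuggestedWordDown(word: MisspelledWord): void {
  cycle(word, 1);
}

// Replace the misspelled word inside `text` using its assumed offsets.
function replaceWord(text: string, word: MisspelledWord, replacement: string): string {
  return text.slice(0, word.node.start) + replacement + text.slice(word.node.end);
}
```

In the real API the browser would own the text and the MisspelledWord objects, so replaceWord would only need the word and the replacement string, as listed above.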
Add a "spellchecker" permissions scheme that has read and write sections
Similar to how the clipboard permission scheme works, the spellchecker could be updated to have a "write" permission too, which would allow an extension with the write permission to:
- enable or disable spell checking
- switch the current language pack in use to any language pack the user has installed
- manipulate how misspelled words appear on any given page (would bypass the need to have a content script reading all text of all input and text fields, etc)
- limit or otherwise change the number of word suggestions shown for misspelled words
- edit the user's custom word list
- replace misspelled words inside an input or text area
The read permission would be able to:
- read a user's custom words
- read misspelled words and the word suggestions of the misspelled words
- read whether spell checking is enabled or not for the active profile
A nice side effect of setting up the permissions this way is that an extension without a contentScript permission, but with a read/write i18n permission, would not necessarily need or be able to read the textual content of a page (nor would any script need to be injected). By plugging into the native Hunspell instance an extension could, for example, listen for a misspelled word event/message (and do something) without having any idea what the surrounding content is, and without being able to read the content of non-spellcheck-able fields, which allows for more fine-tuned privacy controls.
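The read/write split could be modelled roughly as follows. The permission names spellcheckRead and spellcheckWrite are hypothetical (chosen to echo clipboardRead and clipboardWrite), as are the operation names, and I am assuming that write does not imply read.

```typescript
// Hypothetical permission names, modelled on the clipboard permissions.
type SpellcheckPermission = "spellcheckRead" | "spellcheckWrite";

// Hypothetical operation names covering the read and write lists above.
type Operation =
  | "getCustomWords"
  | "getMisspelledWords"
  | "isEnabled"
  | "setEnabled"
  | "setDictionary"
  | "setAppearance"
  | "setSuggestionLimit"
  | "editCustomWords"
  | "replaceWord";

// Which permission each operation would require.
const requiredPermission: Record<Operation, SpellcheckPermission> = {
  getCustomWords: "spellcheckRead",
  getMisspelledWords: "spellcheckRead",
  isEnabled: "spellcheckRead",
  setEnabled: "spellcheckWrite",
  setDictionary: "spellcheckWrite",
  setAppearance: "spellcheckWrite",
  setSuggestionLimit: "spellcheckWrite",
  editCustomWords: "spellcheckWrite",
  replaceWord: "spellcheckWrite",
};

// `granted` is the list of permissions the extension was given.
function isAllowed(operation: Operation, granted: readonly string[]): boolean {
  return granted.indexOf(requiredPermission[operation]) !== -1;
}
```

Note that nothing in this table requires a content script or host permission, which is the privacy benefit described above.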
Example Project
For an example project (currently on hold) that demonstrates some of the above functionality, see here: https://grayedfox.github.io/multidict/
The above extension bundles nspell (a Hunspell-compatible spellchecker written in JavaScript) and its own dictionaries (mostly clones of those available in the extension store), and the result is a much slower spell check. What's more, each enabled dictionary eats up a lot of resources, and users must go into their settings and disable spell checking (to prevent the red squiggly underline from appearing). All of this just to achieve the simple task of automatically changing which dictionary is used when spell checking and providing some customization of how misspelled words are shown.
Tidbits
One thing I'm not too clear on is how to handle customization of a misspelled word's appearance. Being able to change the color of the squiggly line would be a great start (just use an HTML color picker and pass in a valid hex color value).
A more advanced feature, which would vastly open up customization options, is to allow extensions to provide a CSS object that could style misspelled words. As long as the CSS is valid, the browser could handle styling the words (and could even whitelist certain style options if needed).
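Both ideas are easy to validate on the browser side. Below is a minimal sketch: isHexColor checks a hex color value, and filterStyles keeps only whitelisted properties from a style object. The function names and the list of allowed properties are illustrative assumptions on my part, not a suggestion of what the final whitelist should be.

```typescript
// A style object maps CSS property names to values.
type StyleObject = Record<string, string>;

// Illustrative whitelist only; the real list would be up to the browser.
const ALLOWED_PROPERTIES: readonly string[] = [
  "color",
  "background-color",
  "text-decoration-color",
  "text-decoration-line",
  "text-decoration-style",
];

// Accepts #rgb and #rrggbb, which covers what a color picker returns.
function isHexColor(value: string): boolean {
  return /^#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})$/.test(value);
}

// Return a new object containing only the whitelisted properties.
function filterStyles(
  styles: StyleObject,
  allowed: readonly string[] = ALLOWED_PROPERTIES,
): StyleObject {
  const result: StyleObject = {};
  for (const property of Object.keys(styles)) {
    const value = styles[property];
    if (value !== undefined && allowed.indexOf(property) !== -1) {
      result[property] = value;
    }
  }
  return result;
}
```

Dropping unknown properties silently (rather than rejecting the whole object) is just one possible design; a stricter API could throw instead.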