A company-backend for human language texts based on word frequency dictionaries.
company-wordfreq
is a company backup intended for writing texts in a human
language. The completions it proposes are words already used in the current
(or another open) buffer and matching words from a word list file. This word
list file is supposed to be a simple list of words ordered by the frequency the
words are used in the language. So the first completions are words already
used in the buffer followed by matching words of the language ordered by
frequency.
company-wordfreq
does not come with the word list files directly, but it can
download the files for you for many languages from FrequencyWords. I made a
fork of this repo to make sure, that the original does not change all over
sudden without my noticing.
The directory where the word list files reside is determined by the variable
company-word-freq-path
, default wordfreq-dicts
in your emacs home
directory, usually like ~/.emacs.d/wordfreq-dicts
. Their names must follow
the pattern <language>.txt
where language is the ispell-local-dictionary
value of the current language.
You need rg
alias ripgrep in your $PATH
as company-wordfreq
uses it to
grep into the word list files.
Hopefully from MELPA one day. As long as it is not, it is probably easiest to use straight.el. Put the following lines into your init file.
(straight-use-package '(company-wordfreq :type git :host github :repo "johannes-mueller/company-wordfreq.el"))
company-wordfreq
is supposed to be the one and only company backend and
company-mode
should not transform or sort its candidates. This can be
achieved by setting the variables company-backends
and company-transformers
buffer locally in text-mode
buffers by
(add-hook 'text-mode-hook (lambda () (setq-local company-backends '(company-wordfreq)) (setq-local company-transformers nil)))
Usually you don’t need to configure the language picked to get the word
completions. company-wordfreq
uses the variable ispell-local-dictionary
.
It should work dynamically even if you use auto-dictionary-mode
.
To download a word list use
M-x company-wordfreq-download-list
You are presented a list of languages to choose. For some languages the word lists are huge, which can lead to noticeable latency when the completions are build. Therefore you are asked if you want to use a word list with only the 50k most frequent words.
The file will then be downloaded, processed and put in place.
This is basically the result of a week-end hack. So probably not everything will work under any circumstances. Bug reports and feedback welcome in the issue tracker. Pull requests also, of course.