slusarz / dovecot-fts-flatcurve
Dovecot FTS Flatcurve plugin (Xapian)
Home Page: https://slusarz.github.io/dovecot-fts-flatcurve/
License: GNU Lesser General Public License v2.1
Having installed Dovecot 2.3.17 from repo.dovecot.org for Focal and added the dovecot-dev package, I was able to compile flatcurve successfully. However, after adding flatcurve to mail_plugins and then triggering a reindex, I receive the following:
Fatal: Couldn't load required plugin /usr/lib/dovecot/modules/lib21_fts_flatcurve_plugin.so: dlopen() failed: /usr/lib/dovecot/modules/lib21_fts_flatcurve_plugin.so: undefined symbol: _ZTIN6icu_668ByteSinkE
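The missing symbol is a mangled C++ name, and decoding it narrows down what the loader is looking for. Under the Itanium C++ ABI, `_ZTI` marks a typeinfo object and `N...E` wraps a nested name built from length-prefixed parts, so `_ZTIN6icu_668ByteSinkE` reads as typeinfo for `icu_66::ByteSink`: the plugin references ICU 66, and that symbol is not being resolved at load time. A minimal Python sketch of the decoding follows; the function names are illustrative, and it only handles simple nested names like this one (`c++filt` is the general tool).

```python
import re

# Prefixes for a few special Itanium C++ ABI symbol kinds.
_KINDS = {"_ZTI": "typeinfo for", "_ZTS": "typeinfo name for", "_ZTV": "vtable for"}


def decode_nested_name(symbol):
    """Decode a simple mangled nested name such as _ZTIN6icu_668ByteSinkE."""
    for prefix, kind in _KINDS.items():
        if symbol.startswith(prefix + "N"):
            rest = symbol[len(prefix) + 1:]
            break
    else:
        raise ValueError("only _ZTI/_ZTS/_ZTV nested names are handled here")

    parts = []
    while rest and rest[0] != "E":
        # Each component is "<decimal length><identifier>".
        match = re.match(r"\d+", rest)
        if match is None:
            raise ValueError("unsupported mangling: " + symbol)
        length = int(match.group())
        start = match.end()
        parts.append(rest[start:start + length])
        rest = rest[start + length:]
    return kind, "::".join(parts)


def icu_major_version(symbol):
    """Return the ICU major version embedded in the namespace, if any."""
    _, name = decode_nested_name(symbol)
    match = re.match(r"icu_(\d+)::", name)
    return int(match.group(1)) if match else None


print(decode_nested_name("_ZTIN6icu_668ByteSinkE"))
print(icu_major_version("_ZTIN6icu_668ByteSinkE"))
```

Comparing the decoded ICU version with the ICU library that the running Dovecot binaries are linked against is a reasonable first check for this kind of dlopen() failure.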
A search runs with no issue and gives correct results if I run it against any individual mailbox:
doveadm -D search -u user ( BODY objection ) MAILBOX INBOX
The search crashes if run with:
doveadm -D search -u user ( BODY objection ) MAILBOX virtual/All
All is defined as a virtual mailbox:
*
-Trash
-Trash/*
-Junk
-Junk/*
-Spam
-Spam/*
all
Once I run the search, it searches all mailboxes correctly and gives a correct list of matching UIDs, then:
Feb 17 18:36:13 doveadm(user): Panic: file fts-search.c: line 87 (level_scores_add_vuids): assertion failed: (array_count(&vuids_arr) == array_count(&br->scores))
Feb 17 18:36:13 doveadm(user): Error: Raw backtrace: /usr/lib/dovecot/libdovecot.so.0(backtrace_append+0x41) [0x7efdb8ee1e51] -> /usr/lib/dovecot/libdovecot.so.0(backtrace_get+0x22) [0x7efdb8ee1f72] -> /usr/lib/dovecot/libdovecot.so.0(+0x10c24b) [0x7efdb8eef24b] -> /usr/lib/dovecot/libdovecot.so.0(+0x10c287) [0x7efdb8eef287] -> /usr/lib/dovecot/libdovecot.so.0(+0x5e375) [0x7efdb8e41375] -> /usr/lib/dovecot/lib20_fts_plugin.so(+0xc3ad) [0x7efdb88c63ad] -> /usr/lib/dovecot/lib20_fts_plugin.so(+0x12a44) [0x7efdb88cca44] -> /usr/lib/dovecot/lib20_fts_plugin.so(fts_search_lookup+0xfc) [0x7efdb88cd0ec] -> /usr/lib/dovecot/lib20_fts_plugin.so(+0x16368) [0x7efdb88d0368] -> doveadm(doveadm_mail_iter_init+0x100) [0x558da4ed85e0] -> doveadm(+0x3ffc3) [0x558da4edbfc3] -> doveadm(+0x36c3c) [0x558da4ed2c3c] -> doveadm(doveadm_cmd_ver2_to_mail_cmd_wrapper+0x2de) [0x558da4ed3d3e] -> doveadm(doveadm_cmd_run_ver2+0x50f) [0x558da4ee433f] -> doveadm(doveadm_cmd_try_run_ver2+0x3e) [0x558da4ee43ce] -> doveadm(main+0x1dc) [0x558da4ec36cc] -> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x7efdb8a6a083] -> doveadm(_start+0x2e) [0x558da4ec393e]
Aborted (core dumped)
The same search runs with no issues with other FTS plugins.
I've discovered that email addresses like [email protected] or [email protected] cannot be found. An additional problem is that I can't force Dovecot to use its index files for header search once the FTS index has been built.
Plugin configuration:
plugin {
fts = flatcurve
fts_enforced = no
fts_autoindex_exclude = \Junk
fts_autoindex_exclude2 = \Trash
fts_filters = normalizer-icu snowball stopwords
fts_filters_en = lowercase snowball english-possessive stopwords
fts_languages = ru en
fts_tokenizers = generic email-address
fts_tokenizer_generic = algorithm=simple
}
Sorry for all the issues, but after the compilation and configuration errors, the next problem has come up.
/var/log/error.log now shows:
Error: fts: Failed to initialize backend 'flatcurve': Unknown backend
In /usr/lib/dovecot/modules I have:
-rw-r--r-- 1 root root 1241300 Nov 8 18:42 lib21_fts_flatcurve_plugin.a
-rwxr-xr-x 1 root root 1104 Nov 8 18:42 lib21_fts_flatcurve_plugin.la
-rwxr-xr-x 1 root root 623824 Nov 8 18:42 lib21_fts_flatcurve_plugin.so
I use fts_autoindex = no. When I deliver a small text mail and immediately search for it, the result is as expected: the mail is reported as a match. I'm using doveadm here for demonstration purposes; it's the same with IMAP.
doveadm save -u ewald testmail && doveadm -D search -u ewald mailbox inbox text test
Sep 30 13:30:15 doveadm(ewald): Debug: fts-flatcurve: Query (allhdrs:test* OR test*) mailbox=INBOX matches=1 uids=9
28795a3baf42f160202400006a82f8f2 9
Now I deliver a bigger mail. It has a plain text body and 2 images and 1 PDF as attachments. But I'm not indexing the attachments and the size of the plain text body is actually quite small.
doveadm save -u ewald testmail2 && doveadm -D search -u ewald mailbox inbox text anhang
Sep 30 13:38:08 doveadm(ewald): Debug: fts-flatcurve: Query (allhdrs:anhang* OR anhang*) mailbox=INBOX matches=1 uids=10
The debug log shows that flatcurve found a match, but it is not reported: there's no line with the mailbox GUID and the UID in the output. When I repeat the search, the match is reported:
doveadm -D search -u ewald mailbox inbox text anhang
Sep 30 13:40:11 doveadm(ewald): Debug: fts-flatcurve: Query (allhdrs:anhang* OR anhang*) mailbox=INBOX matches=1 uids=10
28795a3baf42f160202400006a82f8f2 10
Again, it's the same with IMAP. First search doesn't show any match, second one does:
. SEARCH TEXT anhang
* SEARCH
. OK Search completed (0.033 + 0.001 secs).
. SEARCH TEXT anhang
* SEARCH 1
. OK Search completed (0.001 + 0.000 secs).
The log shows a match for both searches:
Sep 30 13:43:03 imap(ewald)<34782><3nF5xDTNVMsAAAAAAAAAAAAAAAAAAAAB>: Debug: fts-flatcurve: Query (allhdrs:anhang* OR anhang*) mailbox=INBOX matches=1 uids=13
...
Sep 30 13:43:05 imap(ewald)<34782><3nF5xDTNVMsAAAAAAAAAAAAAAAAAAAAB>: Debug: fts-flatcurve: Query (allhdrs:anhang* OR anhang*) mailbox=INBOX matches=1 uids=13
Any idea what's going on here? Again, thanks for your help!
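To make the mismatch easy to reproduce, the debug log can be compared mechanically with what the search actually printed. The following Python sketch parses the `fts-flatcurve ... Query (...)` debug lines in the two shapes that appear on this page and extracts the UIDs from `doveadm search` output; the regular expression and helper names are assumptions based only on the log excerpts shown here, not on a documented log format.

```python
import re

# Matches both "fts-flatcurve: Query (...) mailbox=INBOX matches=1 uids=10"
# and "fts-flatcurve(INBOX): Query (...) maybe_matches=1 uids=8".
QUERY_RE = re.compile(
    r"fts-flatcurve(?:\((?P<box1>[^)]*)\))?: Query \((?P<query>.*)\) "
    r"(?:mailbox=(?P<box2>\S+) )?(?P<kind>maybe_matches|matches)=(?P<count>\d+)"
    r"(?: uids=(?P<uids>[\d:,]+))?"
)


def expand_uids(spec):
    """Expand a UID list such as '539,6095:6096' into individual integers."""
    uids = []
    for part in spec.split(","):
        first, _, last = part.partition(":")
        uids.extend(range(int(first), int(last or first) + 1))
    return uids


def parse_query_line(line):
    """Return mailbox, match kind, and UIDs from one debug line, or None."""
    m = QUERY_RE.search(line)
    if m is None:
        return None
    return {
        "mailbox": m.group("box1") or m.group("box2"),
        "definite": m.group("kind") == "matches",
        "count": int(m.group("count")),
        "uids": expand_uids(m.group("uids")) if m.group("uids") else [],
    }


def reported_uids(output):
    """UIDs from doveadm search output lines of the form '<mailbox-guid> <uid>'."""
    uids = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1].isdigit():
            uids.append(int(fields[1]))
    return uids


def unreported(log_line, output):
    """UIDs the backend logged as matches that never reached the output."""
    parsed = parse_query_line(log_line)
    return sorted(set(parsed["uids"]) - set(reported_uids(output)))
```

For the first search above, `unreported()` would return `[10]` because the log line lists UID 10 while the output is empty; for the repeated search it would return an empty list.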
When I search for a phrase, I get the same results with IMAP SEARCH with v0.2.0 and v0.3.0:
2 uid search body "search phrase"
* SEARCH 1003
2 OK Search completed (0.002 + 0.000 + 0.001 secs).
When I do this search with doveadm search, I get the same result with v0.2.0:
# v0.2.0
doveadm search -u ewald body "search phrase" mailbox inbox
1e52731e4b0660614b7201006a82f8f2 1003
With v0.3.0, I don't get any matches with doveadm search:
# v0.3.0
doveadm search -u ewald body "search phrase" mailbox inbox | wc -l
0
Can you confirm that this is a bug in v0.3.0? I would expect this to return the same result as with v0.2.0 and IMAP SEARCH.
I don't know whether this is in scope of your project but I would love to have fts-flatcurve running on macOS. I've got it to compile, but when I run Dovecot with it and run a search I get:
Dec 21 14:56:46 indexer-worker(pid 20949 user kour1er): Fatal: master: service(indexer-worker): child 20949 killed with signal 11 (core dumps disabled - https://dovecot.org/bugreport.html#coredumps)
This is the latest git version running against Dovecot 2.3.17.1 on Intel. I've attached the indexer-worker crash log in case that is of any use.
indexer-worker.txt
Using Dovecot 2.3.18 and Xapian 1.4.18 with the fts_header_excludes and fts_header_includes options, the fts-flatcurve folder inside the Maildir becomes huge.
Taking, for example, a Maildir whose cur folder is 1.1G, the related fts-flatcurve folder grew from 140M to 2.2G:
[root@myserver Maildir]# du -sh cur/
1,1G cur/
[root@myserver Maildir]# du -sh fts-flatcurve/
2,2G fts-flatcurve/
[root@myserver Maildir]# du -sh backup-fts-flatcurve-without-filters/
140M backup-fts-flatcurve-without-filters/
I ran the following procedure before every test:
find . -type f -name 'dovecot.*' -delete
find . -type d -name 'fts-flatcurve' | xargs rm -rf
doveadm fts rescan -u ${account}
doveadm index -u ${account} '*'
This is my configuration of fts-flatcurve plugin:
### flatcurve
fts = flatcurve
fts_flatcurve_substring_search = no
fts_autoindex = no
fts_enforced = body
fts_index_timeout = 5s
fts_decoder = decode2text
fts_languages = it en es de fr nl pt
fts_filters = lowercase snowball stopwords
fts_filters_en = lowercase snowball english-possessive stopwords
fts_tokenizers = generic email-address
fts_tokenizer_generic = algorithm=simple
### These are the two lines which cause the issue
fts_header_excludes = *
fts_header_includes = From To Cc Bcc Subject Message-ID
With the last two lines above, I would expect the FTS indexes to use less space, not more.
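The expectation can be stated precisely. The Python sketch below models the header filter as I understand it from the Dovecot documentation: matching is case-insensitive, a trailing `*` acts as a wildcard, and includes take precedence over excludes. Those semantics are an assumption of this sketch, and the function names are illustrative; it is not the plugin's code.

```python
def matches(header, patterns):
    """True if the header name matches any pattern (case-insensitive)."""
    name = header.lower()
    for pattern in patterns:
        p = pattern.lower()
        if p.endswith("*"):
            # Trailing wildcard: "*" alone matches every header name.
            if name.startswith(p[:-1]):
                return True
        elif name == p:
            return True
    return False


def is_indexed(header, includes, excludes):
    """Includes win over excludes; anything not excluded is indexed."""
    if matches(header, includes):
        return True
    return not matches(header, excludes)


excludes = ["*"]
includes = "From To Cc Bcc Subject Message-ID".split()

for header in ["Subject", "message-id", "Received", "X-Spam-Status"]:
    print(header, is_indexed(header, includes, excludes))
```

Under this reading, only the six listed headers are indexed and everything else is dropped, so the header portion of the index should shrink, which is why a jump from 140M to 2.2G looks like a bug rather than expected behavior.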
Why doesn't flatcurve have an option to store its data in a location separate from the mail content?
It's a huge problem for me, as I keep my Maildirs on a slow NFS drive but could use a fast local SSD for indexing.
The Xapian extension handled such a scenario perfectly well.
dovecot --version
2.3.8
fts-backend-flatcurve.c:21:13: error: ‘event_category_fts’ undeclared here (not in a function);
grep event_category_fts /usr/include/dovecot/*.h
(empty)
I'm not sure when event_category_fts was first included, but dovecot-devel on Oracle Linux 8 gives me no joy. Maybe update the readme or documentation.
Installed Packages
Name : dovecot-devel
Epoch : 1
Version : 2.3.8
Release : 9.el8
Architecture : x86_64
Size : 1.7 M
Source : dovecot-2.3.8-9.el8.src.rpm
Repository : @System
From repo : ol8_codeready_builder
I have a mail with the word "body" in the body and the word "attachment" in an attached PDF (see attached mail.txt). I enabled fts_decoder with decode2text.sh to index the PDF.
Searching for "body" and "attachment" returns the mail as expected:
sudo doveadm -D search -u ewald BODY body MAILBOX INBOX
... fts-flatcurve(INBOX): Query (body:body* OR body:bodi*) matches=1 uids=8
d1c791351b174863987e00006a82f8f2 8
sudo doveadm -D search -u ewald BODY attachment MAILBOX INBOX
... fts-flatcurve(INBOX): Query (body:attachment* OR body:attach*) matches=1 uids=8
d1c791351b174863987e00006a82f8f2 8
A more complicated search also works as expected when searching for "body":
sudo doveadm -D search -u ewald \( BODY body OR HEADER Reply-To body \) MAILBOX INBOX
... fts-flatcurve(INBOX): Query (body:body* OR allhdrs:body* OR body:bodi*) maybe_matches=1 uids=8
d1c791351b174863987e00006a82f8f2 8
Because of the "Reply-To" header it's just a "maybe match", but it works.
But searching for "attachment" doesn't return the mail:
sudo doveadm -D search -u ewald \( BODY attachment OR HEADER Reply-To attachment \) MAILBOX INBOX | wc -l
0
... fts-flatcurve(INBOX): Query (body:attachment* OR allhdrs:attachment* OR body:attach*) maybe_matches=1 uids=8
My understanding is that "maybe match" means that Dovecot searches the message again, but it doesn't find "attachment" because it scans the mail without decoding the PDF.
I checked with Squat and it returns the mail:
sudo doveadm search -u ewald \( BODY attachment OR HEADER Reply-To attachment \) MAILBOX INBOX
d1c791351b174863987e00006a82f8f2 8
I would expect Flatcurve to return the mail as well. I think the problem is that Flatcurve returns either "definite matches" or "maybe matches", but in this case it should probably return both, the BODY match as a "definite match" and the HEADER match as a "maybe match". Dovecot then wouldn't need to scan the mail again, because it knows it has a "definite match" for the mail. This would lead to more consistent (and probably faster) results.
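The proposal can be sketched in a few lines of Python. Each sub-query yields a set of definite UIDs and a set of maybe UIDs, and an OR keeps a UID definite if either side is definite; only the remaining maybe UIDs would need to be rescanned by Dovecot. This illustrates the behavior suggested above, not how Flatcurve or Dovecot actually combine results, and the function name is hypothetical.

```python
def combine_or(definite_a, maybe_a, definite_b, maybe_b):
    """OR two sub-results, each split into definite and maybe UID sets."""
    definite = definite_a | definite_b
    # A UID that is already a definite match never needs a rescan.
    maybe = (maybe_a | maybe_b) - definite
    return definite, maybe


# BODY attachment: found in the decoded PDF, so a definite match for UID 8.
body_definite, body_maybe = {8}, set()
# HEADER Reply-To attachment: only an allhdrs hit, so a maybe match for UID 8.
header_definite, header_maybe = set(), {8}

definite, maybe = combine_or(body_definite, body_maybe, header_definite, header_maybe)
print(definite, maybe)
```

With this combination UID 8 stays in the definite set, so the later rescan (which does not decode the PDF) could no longer drop it.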
Hi,
First of all, thanks for this awesome plugin.
I set it up locally (using Dovecot 2.3.18 from Debian testing) and it seems to cover most of my needs, but unfortunately I can't make text search using BODY work as I expected. Let me give you an example.
I have several emails that contain the following phrase inside them: Hello Jim. If I issue the following command:
# doveadm -D search -u myuser mailbox INBOX BODY "Hello Jim"
I get only two results. They contain the strings I'm looking for, but in weird places (like in the middle of a word), and I don't get any of the messages that actually contain the full string Hello Jim.
Here's the debug output for the search:
# doveadm -D search -u myuser mailbox INBOX BODY "Hello Jim"
Debug: Loading modules from directory: /usr/lib/dovecot/modules/doveadm
Debug: Skipping module doveadm_acl_plugin, because dlopen() failed: /usr/lib/dovecot/modules/doveadm/lib10_doveadm_acl_plugin.so: undefined symbol: acl_user_module (this is usually intentional, so just ignore this message)
Debug: Skipping module doveadm_quota_plugin, because dlopen() failed: /usr/lib/dovecot/modules/doveadm/lib10_doveadm_quota_plugin.so: undefined symbol: quota_user_module (this is usually intentional, so just ignore this message)
Debug: Skipping module doveadm_fts_lucene_plugin, because dlopen() failed: /usr/lib/dovecot/modules/doveadm/lib20_doveadm_fts_lucene_plugin.so: undefined symbol: lucene_index_iter_deinit (this is usually intentional, so just ignore this message)
Debug: Skipping module doveadm_fts_plugin, because dlopen() failed: /usr/lib/dovecot/modules/doveadm/lib20_doveadm_fts_plugin.so: undefined symbol: fts_user_get_language_list (this is usually intentional, so just ignore this message)
Debug: Skipping module doveadm_fts_flatcurve_plugin, because dlopen() failed: /usr/lib/dovecot/modules/doveadm/lib21_doveadm_fts_flatcurve_plugin.so: undefined symbol: fts_flatcurve_user_module (this is usually intentional, so just ignore this message)
Debug: Skipping module doveadm_mail_crypt_plugin, because dlopen() failed: /usr/lib/dovecot/modules/doveadm/libdoveadm_mail_crypt_plugin.so: undefined symbol: mail_crypt_box_get_pvt_digests (this is usually intentional, so just ignore this message)
May 06 14:59:52 Debug: Loading modules from directory: /usr/lib/dovecot/modules
May 06 14:59:52 Debug: Module loaded: /usr/lib/dovecot/modules/lib20_fts_plugin.so
May 06 14:59:52 Debug: Module loaded: /usr/lib/dovecot/modules/lib21_fts_flatcurve_plugin.so
May 06 14:59:52 Debug: Loading modules from directory: /usr/lib/dovecot/modules/doveadm
May 06 14:59:52 Debug: Skipping module doveadm_acl_plugin, because dlopen() failed: /usr/lib/dovecot/modules/doveadm/lib10_doveadm_acl_plugin.so: undefined symbol: acl_user_module (this is usually intentional, so just ignore this message)
May 06 14:59:52 Debug: Skipping module doveadm_quota_plugin, because dlopen() failed: /usr/lib/dovecot/modules/doveadm/lib10_doveadm_quota_plugin.so: undefined symbol: quota_user_module (this is usually intentional, so just ignore this message)
May 06 14:59:52 Debug: Skipping module doveadm_fts_lucene_plugin, because dlopen() failed: /usr/lib/dovecot/modules/doveadm/lib20_doveadm_fts_lucene_plugin.so: undefined symbol: lucene_index_iter_deinit (this is usually intentional, so just ignore this message)
May 06 14:59:52 Debug: Module loaded: /usr/lib/dovecot/modules/doveadm/lib20_doveadm_fts_plugin.so
May 06 14:59:52 Debug: Module loaded: /usr/lib/dovecot/modules/doveadm/lib21_doveadm_fts_flatcurve_plugin.so
May 06 14:59:52 Debug: Skipping module doveadm_mail_crypt_plugin, because dlopen() failed: /usr/lib/dovecot/modules/doveadm/libdoveadm_mail_crypt_plugin.so: undefined symbol: mail_crypt_box_get_pvt_digests (this is usually intentional, so just ignore this message)
May 06 14:59:52 doveadm(myuser)<3323071><OowzGihwdWK/tDIAquyQZQ>: Debug: Namespace : type=private, prefix=myuserdj/, sep=/, inbox=yes, hidden=no, list=yes, subscriptions=yes location=maildir:~/Mail/myuser:INBOX=~/Mail/myuser/INBOX:LAYOUT=fs
May 06 14:59:52 doveadm(myuser)<3323071><OowzGihwdWK/tDIAquyQZQ>: Debug: fs: root=/home/myuser/Mail/myuser, index=, indexpvt=, control=, inbox=/home/myuser/Mail/myuser/INBOX, alt=
May 06 14:59:52 doveadm(myuser)<3323071><OowzGihwdWK/tDIAquyQZQ>: Debug: Namespace : type=private, prefix=, sep=, inbox=no, hidden=yes, list=no, subscriptions=no location=fail::LAYOUT=none
May 06 14:59:52 doveadm(myuser)<3323071><OowzGihwdWK/tDIAquyQZQ>: Debug: none: root=, index=, indexpvt=, control=, inbox=, alt=
May 06 14:59:52 doveadm(myuser)<3323071><OowzGihwdWK/tDIAquyQZQ>: Debug: fts: Indexes disabled for namespace ''
May 06 14:59:52 doveadm(myuser): Debug: Mailbox INBOX: Mailbox opened
May 06 14:59:52 doveadm(myuser): Debug: fts-flatcurve: Xapian library version: 1.4.18
May 06 14:59:52 doveadm(myuser): Debug: fts-flatcurve(INBOX): Opened DB (RO) messages=27019 version=1 shards=6
May 06 14:59:52 doveadm(myuser): Debug: fts-flatcurve(INBOX): Last UID uid=39701
May 06 14:59:52 doveadm(myuser): Debug: fts-flatcurve(INBOX): Last UID uid=39701
May 06 14:59:52 doveadm(myuser): Debug: fts-flatcurve(INBOX): Query (body:Hello AND body:Jim*) maybe_matches=1 uids=6095:6096
May 06 14:59:52 doveadm(myuser): Debug: fts-flatcurve(INBOX): Query (body:Jim* AND body:Hello*) matches=1 uids=6095:6096
May 06 14:59:52 doveadm(myuser): Debug: fts-flatcurve(INBOX): Query (body:Jim* OR body:jim*) matches=15 uids=539,800,1215,1379,2231,2233,2240,2244,3589,4021,6095:6096,6481,12042,24732,29512
May 06 14:59:52 doveadm(myuser): Debug: fts-flatcurve(INBOX): Query (body:Hello* OR body:hello*) matches=80 uids=1078,1661,2542,2678,3534,3543,3595,3673,4034,4038,4040,4043,4121,5345,5413,5443:5444,6095:6096,7729,9006,9427:9428,10784,11159,11170,12217,12429,13423,21527,22435,22719,22954,22965,23340,23764,23846,24031,24304,25362,26870,26987,27074,27384,27505,27582,27684,27696,27880,28143,28559,28592,28748,29571,29948,30012,30254,31103,31567,32257,33025,34597,34801,34818,34820,35089,35101,35216,35225,35230,35233,35467,35537,35873,36334,37345,38560,38577,38945,38982,39215,39263,39266
d86a0b16162afb5afe6b0000aaec9065 6095
d86a0b16162afb5afe6b0000aaec9065 6096
May 06 14:59:52 doveadm(3323071): Debug: auth-master: conn unix:/run/dovecot/auth-userdb (pid=2842173,uid=0): Disconnected: Connection closed (fd=9)
If I grep my mailbox directly for the same string, I get:
$ grep -l 'Hello Jim' * | sort -u | wc -l
7
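The debug output shows the phrase being turned into per-term queries such as `body:Jim* AND body:Hello*`, and the configuration enables substring search, which is a different test from what `grep 'Hello Jim'` performs. The Python sketch below contrasts the two on plain strings; it is only an illustration of term-AND matching versus literal phrase matching, not the plugin's or Xapian's implementation, and the sample texts are made up.

```python
import re


def tokens(text):
    """Split text into lowercase word tokens."""
    return [word.lower() for word in re.findall(r"\w+", text)]


def phrase_match(text, phrase):
    """Literal match, comparable to grep for a fixed string."""
    return phrase in text


def all_terms_match(text, terms, substring=True):
    """True if every term occurs in some word of the text.

    With substring=True a term may match in the middle of a word;
    otherwise it must match at the start of a word (prefix match).
    """
    words = tokens(text)
    for term in terms:
        needle = term.lower()
        if substring:
            found = any(needle in word for word in words)
        else:
            found = any(word.startswith(needle) for word in words)
        if not found:
            return False
    return True


samples = ["Hello Jim, how are you?", "Othello met Jimmy yesterday"]
for text in samples:
    print(text, phrase_match(text, "Hello Jim"), all_terms_match(text, ["Hello", "Jim"]))
```

The second sample matches the term-AND test but not the phrase, which fits the "middle of a word" hits described above; it does not explain why the messages containing the literal phrase are missing from the results.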
Here's my configuration for fts-flatcurve:
# cat /etc/dovecot/conf.d/90-fts.conf
mail_plugins = $mail_plugins fts fts_flatcurve
plugin {
fts = flatcurve
fts_enforced = yes
fts_autoindex = yes
fts_languages = en pt
fts_tokenizers = generic email-address
fts_flatcurve_max_term_size = 200
fts_flatcurve_substring_search = yes
}
Am I missing something here?
Thanks in advance.
Hi @slusarz,
I am the developer of the Xapian plugin for Dovecot (https://github.com/grosjo/fts-xapian).
We are doing duplicate work.
My plugin has already been pushed to Arch Linux and Debian, and will reach Fedora very soon.
Shall we combine our efforts?
Hello everyone,
we from the mailcow: dockerized team are currently integrating the flatcurve FTS plugin into our project.
While doing so, we've come across a weird issue that is similar to issue #14, but sadly the tips from that issue did not work.
To be clear, we're using our own Dockerfile with a self-compiled Dovecot core (as already suggested as a solution).
The Dockerfile:
FROM debian:bullseye-slim
ARG DEBIAN_FRONTEND=noninteractive
ARG DOVECOT=2.3.18
ARG XAPIAN=1.4.19
ENV LC_ALL C
ENV GOSU_VERSION 1.14
# Add groups and users before installing Dovecot to not break compatibility
RUN groupadd -g 5000 vmail \
&& groupadd -g 401 dovecot \
&& groupadd -g 402 dovenull \
&& groupadd -g 999 sogo \
&& usermod -a -G sogo nobody \
&& useradd -g vmail -u 5000 vmail -d /var/vmail \
&& useradd -c "Dovecot unprivileged user" -d /dev/null -u 401 -g dovecot -s /bin/false dovecot \
&& useradd -c "Dovecot login user" -d /dev/null -u 402 -g dovenull -s /bin/false dovenull \
&& touch /etc/default/locale \
&& apt-get update \
&& apt-get -y --no-install-recommends install \
apt-transport-https \
ca-certificates \
cpanminus \
curl \
dnsutils \
dirmngr \
gettext \
gnupg2 \
jq \
libauthen-ntlm-perl \
libcgi-pm-perl \
libcrypt-openssl-rsa-perl \
libcrypt-ssleay-perl \
libdata-uniqid-perl \
libdbd-mysql-perl \
libdbi-perl \
libdigest-hmac-perl \
libdist-checkconflicts-perl \
libencode-imaputf7-perl \
libfile-copy-recursive-perl \
libfile-tail-perl \
libhtml-parser-perl \
libio-compress-perl \
libio-socket-inet6-perl \
libio-socket-ssl-perl \
libio-tee-perl \
libipc-run-perl \
libjson-webtoken-perl \
liblockfile-simple-perl \
libmail-imapclient-perl \
libmodule-implementation-perl \
libmodule-scandeps-perl \
libnet-ssleay-perl \
libpackage-stash-perl \
libpackage-stash-xs-perl \
libpar-packer-perl \
libparse-recdescent-perl \
libproc-processtable-perl \
libreadonly-perl \
libregexp-common-perl \
libsys-meminfo-perl \
libterm-readkey-perl \
libtest-deep-perl \
libtest-fatal-perl \
libtest-mock-guard-perl \
libtest-mockobject-perl \
libtest-nowarnings-perl \
libtest-pod-perl \
libtest-requires-perl \
libtest-simple-perl \
libtest-warn-perl \
libtry-tiny-perl \
libunicode-string-perl \
liburi-perl \
libwww-perl \
lua-sql-mysql \
lua-socket \
mariadb-client \
procps \
python3-pip \
redis-server \
supervisor \
syslog-ng \
syslog-ng-core \
syslog-ng-mod-redis \
wget \
git \
build-essential \
autoconf \
automake \
libtool \
make \
libicu-dev \
libxapian-dev \
libstemmer-dev \
libexttextcat-dev \
zlib1g-dev \
pkg-config \
libsqlite3-dev \
bison \
flex \
valgrind
RUN dpkgArch="$(dpkg --print-architecture | awk -F- '{ print $NF }')" \
&& wget -O /usr/local/bin/gosu "https://github.com/tianon/gosu/releases/download/$GOSU_VERSION/gosu-$dpkgArch" \
&& chmod +x /usr/local/bin/gosu \
&& gosu nobody true \
&& apt-key adv --fetch-keys https://repo.dovecot.org/DOVECOT-REPO-GPG \
&& echo "deb https://repo.dovecot.org/ce-${DOVECOT}/debian/bullseye bullseye main" > /etc/apt/sources.list.d/dovecot.list \
&& cd /tmp && git clone --depth 1 --branch release-2.3 https://github.com/dovecot/core.git \
&& cd core \
&& ./autogen.sh && PANDOC=false ./configure --with-stemmer --with-textcat --with-icu \
&& make install \
&& apt update \
&& apt-get -y --no-install-recommends install \
dovecot-lua \
dovecot-managesieved \
dovecot-sieve \
dovecot-lmtpd \
dovecot-ldap \
dovecot-mysql \
# dovecot-core \
dovecot-pop3d \
dovecot-imapd \
dovecot-dev
RUN cd /tmp && git clone https://github.com/slusarz/dovecot-fts-flatcurve.git && cd dovecot-fts-flatcurve \
&& ./autogen.sh \
&& ./configure --with-dovecot=/usr/lib/dovecot \
&& make \
&& make install
RUN pip3 install mysql-connector-python html2text jinja2 redis \
&& apt-get autoremove --purge -y \
&& apt-get purge -y \
make \
libxapian-dev \
libicu-dev \
zlib1g-dev \
pkg-config \
build-essential \
libsqlite3-dev \
&& apt-get autoclean \
&& rm -rf /var/lib/apt/lists/*
RUN rm -rf /tmp/* /var/tmp/* /root/.cache/
COPY trim_logs.sh /usr/local/bin/trim_logs.sh
COPY clean_q_aged.sh /usr/local/bin/clean_q_aged.sh
COPY syslog-ng.conf /etc/syslog-ng/syslog-ng.conf
COPY syslog-ng-redis_slave.conf /etc/syslog-ng/syslog-ng-redis_slave.conf
COPY imapsync /usr/local/bin/imapsync
COPY imapsync_runner.pl /usr/local/bin/imapsync_runner.pl
COPY report-spam.sieve /usr/lib/dovecot/sieve/report-spam.sieve
COPY report-ham.sieve /usr/lib/dovecot/sieve/report-ham.sieve
COPY rspamd-pipe-ham /usr/lib/dovecot/sieve/rspamd-pipe-ham
COPY rspamd-pipe-spam /usr/lib/dovecot/sieve/rspamd-pipe-spam
COPY sa-rules.sh /usr/local/bin/sa-rules.sh
COPY maildir_gc.sh /usr/local/bin/maildir_gc.sh
COPY docker-entrypoint.sh /
COPY supervisord.conf /etc/supervisor/supervisord.conf
COPY stop-supervisor.sh /usr/local/sbin/stop-supervisor.sh
COPY quarantine_notify.py /usr/local/bin/quarantine_notify.py
COPY quota_notify.py /usr/local/bin/quota_notify.py
COPY repl_health.sh /usr/local/bin/repl_health.sh
ENTRYPOINT ["/docker-entrypoint.sh"]
CMD exec /usr/bin/supervisord -c /etc/supervisor/supervisord.conf
It may be a bit messy, but this is for testing purposes only at the moment :)
Here's the error we got:
dovecot-mailcow_1 | Uptime: 3235 Threads: 12 Questions: 4956 Slow queries: 0 Opens: 52 Open tables: 43 Queries per second avg: 1.531
dovecot-mailcow_1 | sievec: Fatal: Couldn't load required plugin /usr/lib/dovecot/modules/lib21_fts_flatcurve_plugin.so: dlopen() failed: /usr/lib/dovecot/modules/lib21_fts_flatcurve_plugin.so: undefined symbol: _ZTIN6icu_678ByteSinkE
Please let me know if you need anything else to locate the problem or to help us with it.
Hello,
I recently switched to this FTS plugin. I am running a nightly cronjob doing doveadm fts optimize -A.
For one user, I get "filename too long" errors during this indexing, where the error message shows a path that appears to be a concatenation of all paths of the user's mailboxes.
The message looks like this:
doveadm(mailuser): Error: fts-flatcurve: stat(/srv/dovecot/mails/mailuser/mailboxes/BOX1/dbox-Mails/fts-flatcurve//srv/dovecot/mails/mailuser/mailboxes/BOX2/dbox-Mails/fts-flatcurve//srv/dovecot/mails/mailuser/mailboxes/BOX3/dbox-Mails/fts-flatcurve/) failed: File name too long
I have shortened this example message to 3 mailboxes, the actual message contains 57 mailboxes and the resulting path in the error message has a length of 4136 characters. However, since this concatenation is not a valid path and a stat is attempted on it, I suspect this to be an issue in the plugin?
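The arithmetic behind the error is straightforward: if the per-mailbox index directories are joined into one string, the result quickly approaches the system path limit. The Python sketch below rebuilds the shortened path from the message above; the 4096-byte `PATH_MAX` is the typical Linux value and is an assumption here, since the report does not name the operating system, and the helper names are illustrative.

```python
PATH_MAX = 4096  # typical Linux limit; an assumption, not taken from the report


def index_dir(root, mailbox):
    """Per-mailbox index directory in the layout shown in the error message."""
    return f"{root}/{mailbox}/dbox-Mails/fts-flatcurve/"


def concatenated(root, mailboxes):
    """Join all index directories into one string, as the error message does."""
    return "".join(index_dir(root, box) for box in mailboxes)


root = "/srv/dovecot/mails/mailuser/mailboxes"
joined = concatenated(root, ["BOX1", "BOX2", "BOX3"])
print(joined)
print(len(joined), len(joined) > PATH_MAX)

# The reported string for 57 mailboxes was 4136 characters long.
print(4136 > PATH_MAX)
```

Three mailboxes stay far below the limit, but the reported 4136 characters exceed 4096, which is consistent with stat() failing with "File name too long" before the (invalid) path is even looked up.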
I get 28 of these messages during the doveadm execution, but all appear identical (same user, same path). The user has a total of 76 mailboxes.
If there is any helpful information I can provide for this, please let me know.
I am using dovecot 2.3.19.1 with fts-flatcurve 0.3.3. Just moved to 2.3.20 but issue remains.
I can't find a complete config example for fts-flatcurve. When I use just the config lines from the documentation, Dovecot complains about missing language and tokenizer settings. Is there a complete working example somewhere?
I've been running the tagged 0.1.0 version trouble-free for a while and thought I would try HEAD. With that I see sigabrt from indexer-worker, also from imap processes if I do an expunge, relating to dotlock.
2021-12-08T11:52:27.417Z symphytum dovecot: indexer-worker(sthen)<39204><5ifUEHucsGE8cgEA89wJGQ:zXJzGHucsGEkmQAA89wJGQ>: Panic: file file-dotlock.c: line 570 (dotlock_create): assertion failed: (lock_info.fd != -1)
2021-12-08T11:52:27.475Z symphytum dovecot: indexer-worker(sthen)<39204><5ifUEHucsGE8cgEA89wJGQ:zXJzGHucsGEkmQAA89wJGQ>: Fatal: master: service(indexer-worker): child 39204 killed with signal 6 (core dumped)
2021-12-08T11:52:27.533Z symphytum dovecot: indexer-worker(sthen)<8691><5ifUEHucsGE8cgEA89wJGQ:vGaUHXucsGHzIQAA89wJGQ>: Fatal: master: service(indexer-worker): child 8691 killed with signal 11 (core dumped)
indexer-worker<39204>
Core was generated by `indexer-worker'.
Program terminated with signal SIGABRT, Aborted.
#0 thrkill () at /tmp/-:3
3 /tmp/-: No such file or directory.
(gdb) bt
#0 thrkill () at /tmp/-:3
#1 0x0000035aab9143de in _libc_abort () at /usr/src/lib/libc/stdlib/abort.c:51
#2 0x0000035b75800c86 in default_fatal_finish (type=LOG_TYPE_PANIC, status=0) at failures.c:459
#3 0x0000035b757ff0a4 in fatal_handler_real (ctx=0x7f7ffffd5fb0, format=<optimized out>, args=<optimized out>)
at failures.c:471
#4 0x0000035b75800041 in i_internal_fatal_handler (ctx=0x0,
format=0x6 <error: Cannot access memory at address 0x6>, args=0x0) at failures.c:872
#5 0x0000035b757ff2e2 in i_panic (format=0x6 <error: Cannot access memory at address 0x6>) at failures.c:524
#6 0x0000035b758041f0 in dotlock_create (dotlock=<optimized out>,
flags=(DOTLOCK_CREATE_FLAG_CHECKONLY | unknown: 4), write_pid=<optimized out>, lock_path_r=<optimized out>)
at file-dotlock.c:570
#7 0x0000035b75803011 in file_dotlock_create_real (dotlock=0x35b587dcc00,
flags=(DOTLOCK_CREATE_FLAG_CHECKONLY | unknown: 4)) at file-dotlock.c:635
#8 file_dotlock_create (set=0x35b587b14d8,
path=0x35b587c3868 "/home/sthen/mdbox/mailboxes/zBackup/dbox-Mails/fts-flatcurve/flatcurve-dotlock",
flags=(DOTLOCK_CREATE_FLAG_CHECKONLY | unknown: 4), dotlock_r=0x35b587de288) at file-dotlock.c:679
#9 0x0000035a9b892148 in fts_flatcurve_xapian_lock (backend=0x35b587b1400)
at dovecot-fts-flatcurve-a69652485c3c4371ae78086acaec0a5a39f02efe/src/fts-backend-flatcurve-xapian.cpp:516
#10 0x0000035a9b893503 in fts_flatcurve_xapian_db_populate (backend=0x35b587b1400,
opts=(FLATCURVE_XAPIAN_DB_NOCREATE_CURRENT | FLATCURVE_XAPIAN_DB_IGNORE_EMPTY))
at dovecot-fts-flatcurve-a69652485c3c4371ae78086acaec0a5a39f02efe/src/fts-backend-flatcurve-xapian.cpp:610
#11 0x0000035a9b89048c in fts_flatcurve_xapian_read_db (backend=0x35b587b1400,
opts=(FLATCURVE_XAPIAN_DB_NOCREATE_CURRENT | FLATCURVE_XAPIAN_DB_IGNORE_EMPTY))
at dovecot-fts-flatcurve-a69652485c3c4371ae78086acaec0a5a39f02efe/src/fts-backend-flatcurve-xapian.cpp:678
#12 0x0000035a9b8910f5 in fts_flatcurve_xapian_get_last_uid (backend=0x0, last_uid_r=0x7f7ffffd64ec)
at dovecot-fts-flatcurve-a69652485c3c4371ae78086acaec0a5a39f02efe/src/fts-backend-flatcurve-xapian.cpp:1119
#13 0x0000035a9b88e060 in fts_backend_flatcurve_get_last_uid (_backend=0x35b587b1400, box=<optimized out>,
last_uid_r=0x7f7ffffd64ec)
at dovecot-fts-flatcurve-a69652485c3c4371ae78086acaec0a5a39f02efe/src/fts-backend-flatcurve.c:158
#14 0x0000035b4d8cadc4 in fts_backend_get_last_uid (backend=0x0, box=0x6, last_uid_r=0x7f7ffffd64ec) at fts-api.c:106
#15 0x0000035b4d8d5322 in fts_mailbox_get_last_cached_seq (box=0x35b587c8448, seq_r=0x7f7ffffd6524)
at fts-storage.c:84
#16 0x0000035b4d8d30f6 in fts_mailbox_get_status (box=0x35b587c8448,
items=(STATUS_MESSAGES | STATUS_LAST_CACHED_SEQ), status_r=0x7f7ffffd66e0) at fts-storage.c:109
#17 0x0000035b6f75a6c7 in mailbox_get_status (box=0x35b587c8448, items=(STATUS_MESSAGES | STATUS_LAST_CACHED_SEQ),
status_r=0x7f7ffffd66e0) at mail-storage.c:2169
#18 0x00000358889dde1e in index_mailbox_precache (conn=0x35b587be800, box=0x35b587c8448) at master-connection.c:92
#19 index_mailbox (conn=0x35b587be800, user=<optimized out>, mailbox=<optimized out>,
max_recent_msgs=<optimized out>, what=<optimized out>) at master-connection.c:239
#20 master_connection_input_args (_conn=0x35b587be800, args=<optimized out>) at master-connection.c:283
#21 0x0000035b757f3ff2 in connection_input_default (conn=0x35b587be800) at connection.c:95
#22 0x0000035b7581c6c1 in io_loop_call_io (io=0x35b587e4e80) at ioloop.c:737
#23 0x0000035b7581f7fe in io_loop_handler_run_internal (ioloop=0x35b587b2c00) at ioloop-kqueue.c:164
#24 0x0000035b7581cd31 in io_loop_handler_run (ioloop=0x35b587b2c00) at ioloop.c:789
#25 0x0000035b7581cb38 in io_loop_run (ioloop=0x35b587b2c00) at ioloop.c:762
#26 0x0000035b75773a95 in master_service_run (service=0x35b587ba000, callback=0x6) at master-service.c:863
#27 0x00000358889dd828 in main (argc=1, argv=0x7f7ffffd6ad8) at indexer-worker.c:76
indexer-worker<8691>
Core was generated by `indexer-worker'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 fts_flatcurve_xapian_write_db_get (backend=0x54e7c43000, xdb=0x0, wopts=FLATCURVE_XAPIAN_WDB_CREATE)
at /usr/obj/ports/dovecot-fts-flatcurve-0.1.0pl20211202/dovecot-fts-flatcurve-a69652485c3c4371ae78086acaec0a5a39f02efe/src/fts-backend-flatcurve-xapian.cpp:366
366 if (xdb->dbw != NULL)
(gdb) p xdb
$1 = (flatcurve_xapian_db *) 0x0
(gdb) bt
#0 fts_flatcurve_xapian_write_db_get (backend=0x54e7c43000, xdb=0x0, wopts=FLATCURVE_XAPIAN_WDB_CREATE)
at dovecot-fts-flatcurve-a69652485c3c4371ae78086acaec0a5a39f02efe/src/fts-backend-flatcurve-xapian.cpp:366
#1 0x0000005563fe07c5 in fts_flatcurve_xapian_write_db_current (backend=0x54e7c43000,
opts=<error reading variable: Cannot access memory at address 0x7>)
at dovecot-fts-flatcurve-a69652485c3c4371ae78086acaec0a5a39f02efe/src/fts-backend-flatcurve-xapian.cpp:650
#2 fts_flatcurve_xapian_init_msg (ctx=0x54e7c582c8)
at dovecot-fts-flatcurve-a69652485c3c4371ae78086acaec0a5a39f02efe/src/fts-backend-flatcurve-xapian.cpp:1187
#3 0x0000005563fdd4cd in fts_backend_flatcurve_update_set_build_key (_ctx=0x54e7c582c8, key=0x7f7ffffd9a40)
at dovecot-fts-flatcurve-a69652485c3c4371ae78086acaec0a5a39f02efe/src/fts-backend-flatcurve.c:261
#4 0x00000055893e7171 in fts_backend_update_set_build_key (ctx=0x54e7c582c8, key=0x7f7ffffd9a40) at fts-api.c:178
#5 0x00000055893e842a in fts_build_mail_header (ctx=<optimized out>, block=<optimized out>) at fts-build-mail.c:160
#6 fts_build_mail_real (update_ctx=0x54e7c582c8, mail=<optimized out>, retriable_err_msg_r=<optimized out>,
may_need_retry_r=<optimized out>) at fts-build-mail.c:576
#7 fts_build_mail (update_ctx=0x54e7c582c8, mail=<optimized out>) at fts-build-mail.c:625
#8 0x00000055893eed23 in fts_mail_index (_mail=0x54e7c73848) at fts-storage.c:540
#9 fts_mail_precache (_mail=0x54e7c73848) at fts-storage.c:561
#10 0x000000559d79dcb5 in mail_precache (mail=0x54e7c73848) at mail.c:463
#11 0x00000052c560b205 in index_mailbox_precache (conn=0x54e7c69800, box=<optimized out>) at master-connection.c:119
#12 index_mailbox (conn=0x54e7c69800, user=<optimized out>, mailbox=<optimized out>,
max_recent_msgs=<optimized out>, what=<optimized out>) at master-connection.c:239
#13 master_connection_input_args (_conn=0x54e7c69800, args=<optimized out>) at master-connection.c:283
#14 0x00000054ff8d2ff2 in connection_input_default (conn=0x54e7c69800) at connection.c:95
#15 0x00000054ff8fb6c1 in io_loop_call_io (io=0x54e7c56f80) at ioloop.c:737
#16 0x00000054ff8fe7fe in io_loop_handler_run_internal (ioloop=0x54e7c4bd00) at ioloop-kqueue.c:164
#17 0x00000054ff8fbd31 in io_loop_handler_run (ioloop=0x54e7c4bd00) at ioloop.c:789
#18 0x00000054ff8fbb38 in io_loop_run (ioloop=0x54e7c4bd00) at ioloop.c:762
#19 0x00000054ff852a95 in master_service_run (service=0x54e7c40e00, callback=0x0) at master-service.c:863
#20 0x00000052c560a828 in main (argc=1, argv=0x7f7ffffda0c8) at indexer-worker.c:76
2021-12-08T11:56:15.146Z symphytum dovecot: imap(sthen)<78914><1ThBJ6HSdCQKDwV4>: Panic: file file-dotlock.c: line 570 (dotlock_create): assertion failed: (lock_info.fd != -1)
2021-12-08T11:56:15.219Z symphytum dovecot: imap(sthen)<78914><1ThBJ6HSdCQKDwV4>: Fatal: master: service(imap): child 78914 killed with signal 6 (core dumped)
imap<78914>
Core was generated by `imap'.
Program terminated with signal SIGABRT, Aborted.
#0 thrkill () at /tmp/-:3
3 /tmp/-: No such file or directory.
(gdb) bt
#0 thrkill () at /tmp/-:3
#1 0x00000f06e66e63de in _libc_abort () at /usr/src/lib/libc/stdlib/abort.c:51
#2 0x00000f069b2bac86 in default_fatal_finish (type=LOG_TYPE_PANIC, status=0) at failures.c:459
#3 0x00000f069b2b90a4 in fatal_handler_real (ctx=0x7f7fffff66b0, format=<optimized out>, args=<optimized out>)
at failures.c:471
#4 0x00000f069b2ba041 in i_internal_fatal_handler (ctx=0x0,
format=0x6 <error: Cannot access memory at address 0x6>, args=0x0) at failures.c:872
#5 0x00000f069b2b92e2 in i_panic (format=0x6 <error: Cannot access memory at address 0x6>) at failures.c:524
#6 0x00000f069b2be1f0 in dotlock_create (dotlock=<optimized out>,
flags=(DOTLOCK_CREATE_FLAG_CHECKONLY | unknown: 4), write_pid=<optimized out>, lock_path_r=<optimized out>)
at file-dotlock.c:570
#7 0x00000f069b2bd011 in file_dotlock_create_real (dotlock=0xf060b874b80,
flags=(DOTLOCK_CREATE_FLAG_CHECKONLY | unknown: 4)) at file-dotlock.c:635
#8 file_dotlock_create (set=0xf06e64d06d8,
path=0xf06e64cd068 "/home/sthen/mdbox/mailboxes/hackers/dbox-Mails/fts-flatcurve/flatcurve-dotlock",
flags=(DOTLOCK_CREATE_FLAG_CHECKONLY | unknown: 4), dotlock_r=0xf06e64df288) at file-dotlock.c:679
#9 0x00000f0680d98148 in fts_flatcurve_xapian_lock (backend=0xf06e64d0600)
at dovecot-fts-flatcurve-a69652485c3c4371ae78086acaec0a5a39f02efe/src/fts-backend-flatcurve-xapian.cpp:516
#10 0x00000f0680d99503 in fts_flatcurve_xapian_db_populate (backend=0xf06e64d0600,
opts=(FLATCURVE_XAPIAN_DB_NOCREATE_CURRENT | FLATCURVE_XAPIAN_DB_IGNORE_EMPTY))
at dovecot-fts-flatcurve-a69652485c3c4371ae78086acaec0a5a39f02efe/src/fts-backend-flatcurve-xapian.cpp:610
#11 0x00000f0680d9648c in fts_flatcurve_xapian_read_db (backend=0xf06e64d0600,
opts=(FLATCURVE_XAPIAN_DB_NOCREATE_CURRENT | FLATCURVE_XAPIAN_DB_IGNORE_EMPTY))
at dovecot-fts-flatcurve-a69652485c3c4371ae78086acaec0a5a39f02efe/src/fts-backend-flatcurve-xapian.cpp:678
#12 0x00000f0680d970f5 in fts_flatcurve_xapian_get_last_uid (backend=0x0, last_uid_r=0x7f7fffff6c44)
at dovecot-fts-flatcurve-a69652485c3c4371ae78086acaec0a5a39f02efe/src/fts-backend-flatcurve-xapian.cpp:1119
#13 0x00000f0680d94060 in fts_backend_flatcurve_get_last_uid (_backend=0xf06e64d0600, box=<optimized out>,
last_uid_r=0x7f7fffff6c44)
at dovecot-fts-flatcurve-a69652485c3c4371ae78086acaec0a5a39f02efe/src/fts-backend-flatcurve.c:158
#14 0x00000f06f2770dc4 in fts_backend_get_last_uid (backend=0x0, box=0x6, last_uid_r=0x7f7fffff6c44) at fts-api.c:106
#15 0x00000f06f277496b in fts_indexer_init (backend=0xf06e64d0600, box=0xf060b84a448, ctx_r=0xf060b84c9e0)
at fts-indexer.c:239
#16 0x00000f06f27795bc in fts_try_build_init (ctx=0xf060b84de00, fctx=0xf060b84c980) at fts-storage.c:139
#17 fts_mailbox_search_init (t=<optimized out>, args=0xf060b868048, sort_program=<optimized out>,
wanted_fields=<optimized out>, wanted_headers=<optimized out>) at fts-storage.c:257
#18 0x00000f03f3dfe5ef in imap_search_start (ctx=0xf06e64eb220, sargs=0xf060b868048, sort_program=0x0)
at imap-search.c:537
#19 0x00000f03f3ded9d0 in cmd_search (cmd=0xf06e64eb048) at cmd-search.c:48
#20 0x00000f03f3df6872 in command_exec (cmd=0xf06e64eb048) at imap-commands.c:201
#21 0x00000f03f3df50c6 in client_command_input (cmd=0xf06e64eb048) at imap-client.c:1230
#22 0x00000f03f3df51d2 in client_command_input (cmd=<optimized out>) at imap-client.c:1295
#23 0x00000f03f3df51d2 in client_command_input (cmd=<optimized out>) at imap-client.c:1295
#24 0x00000f03f3df3a90 in client_handle_next_command (client=0xf06e64be848, remove_io_r=<optimized out>)
at imap-client.c:1339
#25 client_handle_input (client=0xf06e64be848) at imap-client.c:1353
#26 0x00000f03f3df3666 in client_continue_pending_input (client=0xf06e64be848) at imap-client.c:1139
#27 0x00000f069b2d66c1 in io_loop_call_io (io=0xf06e64c5380) at ioloop.c:737
#28 0x00000f069b2d97fe in io_loop_handler_run_internal (ioloop=0xf06e64e9500) at ioloop-kqueue.c:164
#29 0x00000f069b2d6d31 in io_loop_handler_run (ioloop=0xf06e64e9500) at ioloop.c:789
#30 0x00000f069b2d6b38 in io_loop_run (ioloop=0xf06e64e9500) at ioloop.c:762
#31 0x00000f069b22da95 in master_service_run (service=0xf06e64b7000, callback=0x6) at master-service.c:863
#32 0x00000f03f3e05035 in main (argc=1, argv=<optimized out>) at main.c:564
(gdb) frame 6
#6 0x00000f069b2be1f0 in dotlock_create (dotlock=<optimized out>,
flags=(DOTLOCK_CREATE_FLAG_CHECKONLY | unknown: 4), write_pid=<optimized out>, lock_path_r=<optimized out>)
at file-dotlock.c:570
570 i_assert(lock_info.fd != -1);
(gdb) list
565 now = time(NULL);
566 } while (now < max_wait_time);
567 file_lock_wait_end(dotlock->path);
568
569 if (ret > 0) {
570 i_assert(lock_info.fd != -1);
571 if (fstat(lock_info.fd, &st) < 0) {
572 i_error("fstat(%s) failed: %m", lock_path);
573 ret = -1;
574 } else {
(gdb) p lock_info
$1 = {set = 0xf060b874b80,
path = 0xf060b84c180 "/home/sthen/mdbox/mailboxes/hackers/dbox-Mails/fts-flatcurve/flatcurve-dotlock",
lock_path = 0xf06a70d3530 "/home/sthen/mdbox/mailboxes/hackers/dbox-Mails/fts-flatcurve/flatcurve-dotlock",
temp_path = 0x0, fd = -1, lock_info = {dev = 0, ino = 0, size = 0, ctime = 0, mtime = 0}, file_info = {dev = 0,
ino = 0, size = 0, ctime = 0, mtime = 0}, last_pid_check = 0, last_change = 0, wait_usecs = 0, have_pid = false,
pid_read = false, use_io_notify = true, lock_stated = true}
Any ideas please, or is there more information you'd like me to provide? Currently running Dovecot 2.3.17.1, also seen with 2.3.16, on OpenBSD (amd64).
fts config:
fts = flatcurve
fts_autoindex = yes
fts_decoder = decode2text
fts_filters = lowercase snowball stopwords
fts_language_config = /usr/local/share/libexttextcat/fpdb.conf.dovecot
fts_languages = en fr es pt ru de it nl
fts_tokenizers = generic email-address
protocol lmtp {
mail_plugins = " zlib notify replication fts fts_flatcurve sieve"
}
protocol lda {
mail_plugins = " zlib notify replication fts fts_flatcurve sieve"
postmaster_address = [...]
}
protocol imap {
mail_max_userip_connections = 21
mail_plugins = " zlib notify replication fts fts_flatcurve imap_sieve imap_zlib"
}
Thank you.
We cannot get rid of this message when using a self-compiled dovecot-fts-flatcurve:
Fatal: Couldn't load required plugin /usr/lib/dovecot/modules/lib21_fts_flatcurve_plugin.so: dlopen() failed: /usr/lib/dovecot/modules/lib21_fts_flatcurve_plugin.so: undefined symbol: _ZTIN6icu_668ByteSinkE
We looked at this and this issue.
We compile the module and copy it into Dovecot's module directory (/usr/lib/dovecot/modules). Dovecot itself is installed from the Dovecot bullseye repository in a bullseye container.
What is wrong with us? ;-)
Our environment is this:
Dockerfile: .github/actions/dovecot-fts-flatcurve-test/Dockerfile
FROM debian:bullseye-slim
echo "deb https://repo.dovecot.org/ce-2.3-latest/debian/bullseye bullseye main" >> /etc/apt/sources.list.d/dovecot.list
FROM debian:bullseye-slim
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update && apt-get install -y \
curl \
gpg
RUN curl https://repo.dovecot.org/DOVECOT-REPO-GPG | gpg --import && \
gpg --export ED409DA1 > /etc/apt/trusted.gpg.d/dovecot.gpg && \
echo "deb https://repo.dovecot.org/ce-2.3-latest/debian/bullseye bullseye main" >> /etc/apt/sources.list.d/dovecot.list
RUN apt-get update && apt-get install -y \
git \
automake \
libtool \
wget \
make \
gettext \
build-essential \
bison \
flex \
valgrind \
libssl-dev \
pkg-config \
libstemmer-dev \
libexttextcat-dev \
libicu-dev \
libxapian-dev \
dovecot-imaptest
# We need to build Dovecot ourselves, since "standard" Dovecot does not
# come with necessary ICU libraries built-in
RUN mkdir /dovecot
RUN git clone --depth 1 --branch release-2.3 \
https://github.com/dovecot/core.git /dovecot/core
RUN cd /dovecot/core && \
./autogen.sh && \
PANDOC=false ./configure --with-stemmer --with-textcat --with-icu && \
make install
RUN git clone --depth 1 https://github.com/slusarz/dovecot-fts-flatcurve.git \
/dovecot/fts-flatcurve
RUN cd /dovecot/fts-flatcurve && ./autogen.sh && ./configure --with-dovecot=/usr/local/lib/dovecot && make install
# Users dovecot and dovenull are created by dovecot-imaptest package
RUN useradd vmail && \
mkdir -p /dovecot/sdbox && \
chown -R vmail:vmail /dovecot/sdbox
ADD configs/ /dovecot/configs
RUN chown -R vmail:vmail /dovecot/configs/virtual
ADD imaptest/ /dovecot/imaptest
ADD fts-flatcurve-test.sh /fts-flatcurve-test.sh
RUN chmod +x /fts-flatcurve-test.sh
ENTRYPOINT ["/fts-flatcurve-test.sh"]
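For readers hitting the same dlopen() failure: the undefined symbol in the error message is an Itanium C++ ABI mangled name, and decoding it shows which ICU namespace the plugin expects. The sketch below is a minimal decoder, written in Python purely for illustration. `decode_nested_name` and `icu_major_version` are hypothetical helper names, and the parser only handles a plain nested name (no templates or substitutions). For the symbol from the log it yields the components `icu_66` and `ByteSink`, which suggests the plugin was compiled against ICU 66 headers. Whether that matches the ICU library present at runtime is an assumption that would need checking on the affected system.

```python
import re
from typing import List, Optional


def decode_nested_name(symbol: str) -> List[str]:
    """Decode a simple Itanium-ABI nested name, e.g. _ZTIN6icu_668ByteSinkE.

    Only the form <prefix>N<len><name>...E is handled; templates and
    substitutions are out of scope for this sketch.
    """
    start = symbol.find("N")
    if not symbol.startswith("_Z") or start == -1:
        raise ValueError(f"not a nested mangled name: {symbol!r}")
    pos = start + 1
    parts = []
    while pos < len(symbol) and symbol[pos] != "E":
        match = re.match(r"\d+", symbol[pos:])
        if match is None:
            raise ValueError(f"unsupported mangling at offset {pos}")
        length = int(match.group())
        pos += match.end()
        # Each component is stored as <decimal length><identifier>.
        parts.append(symbol[pos:pos + length])
        pos += length
    return parts


def icu_major_version(symbol: str) -> Optional[int]:
    """Return the ICU major version encoded in a versioned icu_NN namespace."""
    for part in decode_nested_name(symbol):
        match = re.fullmatch(r"icu_(\d+)", part)
        if match:
            return int(match.group(1))
    return None


if __name__ == "__main__":
    sym = "_ZTIN6icu_668ByteSinkE"
    print(decode_nested_name(sym), icu_major_version(sym))
```

Comparing that number with the ICU version installed next to Dovecot (for example the version suffix of the libicuuc shared library) is one way to confirm or rule out a build/runtime mismatch.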
Hi!
I just tried to update my NixOS mail server, but noticed that the last release 0.1.0 doesn't work with dovecot 2.3.17.
Can you please tag 0.1.1 or similar?
Hello,
from a user perspective, I wonder what the differences between this FTS plugin and fts-xapian are, so I can decide which one to use. From the available docs, it seems both do the same thing. I'm currently using fts-xapian simply because this plugin requires a newer Dovecot than is available in Ubuntu. Still, I wonder what motivated you to write a new plugin when fts-xapian has been available for a while. I think it would be worthwhile to highlight the user-relevant differences in the README to help new users decide which plugin to use.
Some searches in virtual mailboxes run into an assertion. The same searches don't crash in backend mailboxes.
Panic: file fts-search.c: line 87 (level_scores_add_vuids): assertion failed: (array_count(&vuids_arr) == array_count(&br->scores))
I created a test case in my fork to reproduce the issue: edieterich@545fe17
Here's a test run with the crash: https://github.com/edieterich/dovecot-fts-flatcurve/actions/runs/8203662289
This search works:
1 search or from user body body
But this search crashes:
2 search or header reply-to user body body
It seems related to header searches combined with maybe-matches; I don't see crashes when there are no maybe-matches.
Query (allhdrs:user* OR body:body* OR body:bodi*) matches=1 uids=1 maybe_matches=1 maybe_uids=1
Report of this error:
Could not write message data: uid=####; InvalidArgumentError: Term too long (> 245): //foo.example.com/?xtl=REDACTED&[email protected]
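To illustrate the limit quoted in this error: the message says a term may not exceed 245, and long URLs or mangled symbol names can easily go past that. The sketch below shows one generic way to clamp a token to a byte budget without cutting a UTF-8 character in half. It is not how fts-flatcurve handles the case; `truncate_term` is a hypothetical helper, and treating the limit as a UTF-8 byte count is an assumption made for this example.

```python
MAX_TERM_BYTES = 245  # figure taken from the error message above


def truncate_term(term: str, max_bytes: int = MAX_TERM_BYTES) -> str:
    """Clamp a term to at most max_bytes of UTF-8 without splitting a character."""
    encoded = term.encode("utf-8")
    if len(encoded) <= max_bytes:
        return term
    # errors="ignore" drops a trailing partial multi-byte sequence, if any.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


if __name__ == "__main__":
    long_url = "//foo.example.com/?" + "x" * 300
    print(len(truncate_term(long_url).encode("utf-8")))
```

Truncating changes what can be matched later, so skipping over-long terms entirely is an equally valid design choice, depending on whether partial matches on such tokens are useful.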
I am currently testing dovecot-fts-flatcurve with dovecot 2.3.19.
I encountered the following error indexing my INBOX:
doveadm -D index -u [email protected] INBOX
<snip>
Nov 17 14:02:07 doveadm([email protected]): Debug: Mailbox INBOX: Mailbox opened
Nov 17 14:02:07 doveadm([email protected]): Debug: fts-flatcurve: Xapian library version: 1.4.18
Nov 17 14:02:07 doveadm([email protected]): Debug: fts-flatcurve(INBOX): Opened DB (RO) messages=0 version=1 shards=1
Nov 17 14:02:07 doveadm([email protected]): Debug: fts-flatcurve(INBOX): Last UID uid=0
Nov 17 14:02:07 doveadm([email protected]): Info: INBOX: Caching mails seq=1..8020
Nov 17 14:02:07 doveadm([email protected]): Debug: fts-flatcurve(INBOX): Last UID uid=0
Nov 17 14:02:07 doveadm([email protected]): Debug: Mailbox INBOX: UID 912: Opened mail because: fts indexing
Nov 17 14:02:07 doveadm([email protected]): Debug: fts-flatcurve(INBOX): Opened DB (RW; current.1668686821134633) messages=0 version=1
Nov 17 14:02:07 doveadm([email protected]): Debug: fts-flatcurve(INBOX): Indexing uid=912
Nov 17 14:02:07 doveadm([email protected]): Debug: Mailbox INBOX: UID 913: Opened mail because: fts indexing
Nov 17 14:02:07 doveadm([email protected]): Debug: fts-flatcurve(INBOX): Indexing uid=913
Nov 17 14:02:07 doveadm([email protected]): Debug: Mailbox INBOX: UID 914: Opened mail because: fts indexing
Nov 17 14:02:07 doveadm([email protected]): Debug: fts-flatcurve(INBOX): Indexing uid=914
Nov 17 14:02:07 doveadm([email protected]): Debug: Mailbox INBOX: UID 915: Opened mail because: fts indexing
Nov 17 14:02:07 doveadm([email protected]): Debug: fts-flatcurve(INBOX): Indexing uid=915
Nov 17 14:02:07 doveadm([email protected]): Debug: Mailbox INBOX: UID 916: Opened mail because: fts indexing
Nov 17 14:02:07 doveadm([email protected]): Debug: fts-flatcurve(INBOX): Indexing uid=916
<snip>
Nov 17 14:02:09 doveadm([email protected]): Debug: fts-flatcurve(INBOX): Indexing uid=1515
Nov 17 14:02:09 doveadm([email protected]): Debug: Mailbox INBOX: UID 1516: Opened mail because: fts indexing
Nov 17 14:02:09 doveadm([email protected]): Debug: fts-flatcurve(INBOX): Indexing uid=1516
Nov 17 14:02:09 doveadm([email protected]): Debug: Mailbox INBOX: UID 1517: Opened mail because: fts indexing
Nov 17 14:02:09 doveadm([email protected]): Debug: fts-flatcurve(INBOX): Indexing uid=1517
Nov 17 13:48:18 doveadm([email protected]): Panic: file fts-filter.c: line 137 (fts_filter_filter): assertion failed: ((*token)[0] != '\0')
Nov 17 13:48:18 doveadm([email protected]): Error: Raw backtrace: /usr/lib/dovecot/libdovecot.so.0(backtrace_append+0x42) [0x7f0ab16ba582] -> /usr/lib/dovecot/libdovecot.so.0(backtrace_get+0x1e) [0x7f0ab16ba69e] -> /usr/lib/dovecot/libdovecot.so.0(+0x1022fb) [0x7f0ab16c72fb] -> /usr/lib/dovecot/libdovecot.so.0(+0x102331) [0x7f0ab16c7331] -> /usr/lib/dovecot/libdovecot.so.0(+0x55589) [0x7f0ab161a589] -> /usr/lib/dovecot/modules/lib20_fts_plugin.so(+0xa40a) [0x7f0ab0f1440a] -> /usr/lib/dovecot/modules/lib20_fts_plugin.so(fts_filter_filter+0x27) [0x7f0ab0f1f447] -> /usr/lib/dovecot/modules/lib20_fts_plugin.so(+0xbc5b) [0x7f0ab0f15c5b] -> /usr/lib/dovecot/modules/lib20_fts_plugin.so(+0xbef5) [0x7f0ab0f15ef5] -> /usr/lib/dovecot/modules/lib20_fts_plugin.so(fts_build_mail+0x5df) [0x7f0ab0f167af] -> /usr/lib/dovecot/modules/lib20_fts_plugin.so(+0x126c8) [0x7f0ab0f1c6c8] -> /usr/lib/dovecot/libdovecot-storage.so.0(mail_precache+0x30) [0x7f0ab17e9fb0] -> doveadm(+0x39df7) [0x557302f5cdf7] -> doveadm(+0x34a45) [0x557302f57a45] -> doveadm(doveadm_cmd_ver2_to_mail_cmd_wrapper+0x2ca) [0x557302f58afa] -> doveadm(doveadm_cmd_run_ver2+0x501) [0x557302f691b1] -> doveadm(doveadm_cmd_try_run_ver2+0x3a) [0x557302f6922a] -> doveadm(main+0x1d4) [0x557302f47e24] -> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xea) [0x7f0ab1275d0a] -> doveadm(_start+0x2a) [0x557302f4808a]
Dovecot configuration:
ii dovecot-core 2:2.3.19.1-2+debian11 amd64 secure POP3/IMAP server - core files
ii dovecot-fts-flatcurve 0.3.3-2 amd64 FTS plugin for dovecot based on Xapian
base_dir = /var/run/dovecot/
deliver_log_format = msgid=%m: from_envelope=%e from=%f subject=%s size=%w %$ delivery_time=%{delivery_time} session_time=%{session_time} storage_id=%{storage_id}
imap_client_workarounds = delay-newmail tb-extra-mailbox-sep tb-lsub-flags
imap_id_log = *
imap_id_retain = yes
imap_id_send = *
lda_mailbox_autocreate = yes
lda_mailbox_autosubscribe = yes
listen = x.x.x.x
lmtp_rcpt_check_quota = yes
lmtp_save_to_detail_mailbox = yes
login_log_format_elements = user=<%u> method=%m rip=%r lip=%l mpid=%e %c %k session=<%{session}>
mail_gid = vmail
mail_location = mbox:~/mail:INBOX=/var/mail/%u
mail_plugins = lazy_expunge acl quota fts fts_flatcurve
mail_uid = vmail
managesieve_notify_capability = mailto
managesieve_sieve_capability = fileinto reject envelope encoded-character vacation subaddress comparator-i;ascii-numeric relational regex imap4flags copy include variables body enotify environment mailbox date index ihave duplicate mime foreverypart extracttext
namespace inbox {
inbox = yes
location =
mailbox .EXPUNGED {
autoexpunge = 1 weeks
autoexpunge_max_mails = 100000
}
mailbox Drafts {
special_use = \Drafts
}
mailbox Junk {
special_use = \Junk
}
mailbox Sent {
special_use = \Sent
}
mailbox "Sent Messages" {
special_use = \Sent
}
mailbox Trash {
special_use = \Trash
}
prefix =
}
passdb {
args = /etc/dovecot/dovecot-sql.conf.ext
driver = sql
}
plugin {
fts = flatcurve
fts_autoindex = no
fts_enforced = yes
fts_filters = snowball stopwords
fts_filters_en = lowercase snowball english-possessive stopwords
fts_flatcurve_commit_limit = 500
fts_flatcurve_max_term_size = 30
fts_flatcurve_min_term_size = 2
fts_flatcurve_optimize_limit = 10
fts_flatcurve_rotate_size = 5000
fts_flatcurve_rotate_time = 5000
fts_flatcurve_substring_search = no
fts_index_timeout = 60s
fts_languages = en es de fr
fts_tokenizer_generic = algorithm=simple
fts_tokenizers = generic email-address
mail_log_events = delete undelete expunge copy mailbox_delete mailbox_rename flag_change append
mail_log_fields = uid box msgid from subject size vsize flags
sieve = file:~/sieve;active=~/.dovecot.sieve
}
protocols = pop3 imap sieve lmtp sieve
service auth {
unix_listener /var/spool/postfix/private/auth {
group = postfix
mode = 0666
user = postfix
}
}
service imap-login {
inet_listener imap {
port = 143
}
inet_listener imaps {
port = 993
ssl = yes
}
}
service lmtp {
unix_listener /var/spool/postfix/private/dovecot-lmtp {
group = postfix
mode = 0600
user = postfix
}
}
service managesieve-login {
inet_listener sieve {
port = 4190
}
}
service pop3-login {
inet_listener pop3 {
port = 110
}
inet_listener pop3s {
port = 995
ssl = yes
}
}
service stats {
unix_listener stats-reader {
group = vmail
mode = 0660
user = vmail
}
unix_listener stats-writer {
group = vmail
mode = 0660
user = vmail
}
}
ssl_cert = </etc/letsencrypt/live/xxxx/fullchain.pem
ssl_client_ca_dir = /etc/ssl/certs
ssl_dh = # hidden, use -P to show it
ssl_key = # hidden, use -P to show it
submission_host = x.x.x.x
submission_relay_host = 127.0.0.1
submission_relay_trusted = yes
userdb {
args = /etc/dovecot/dovecot-sql.conf.ext
driver = sql
}
protocol lmtp {
mail_plugins = lazy_expunge acl sieve
}
protocol lda {
mail_plugins = lazy_expunge acl sieve
}
protocol imap {
mail_max_userip_connections = 30
mail_plugins = lazy_expunge acl imap_sieve imap_filter_sieve
}
protocol sieve {
managesieve_logout_format = bytes=%i/%o
managesieve_max_compile_errors = 5
managesieve_max_line_length = 64 k
}
}
User configuration in sql:
+----------+------------+-----------------------------------------------------------------------------------------------------------+------------+-------+-------+--------+------------------+---------------------+
| username | domain | password | home | uid | gid | active | mail | email |
+----------+------------+-----------------------------------------------------------------------------------------------------------+------------+-------+-------+--------+------------------+---------------------+
| user | domain.tld | xxxxx| /var/vmail | vmail | vmail | Y | mdbox:/var/vmail | [email protected] |
+----------+------------+-----------------------------------------------------------------------------------------------------------+------------+-------+-------+--------+------------------+---------------------+
I am using the mdbox mailbox format.
Thanks
Hi, I have compiled fts-flatcurve v1.0.1 for my Dovecot 2.3.21. libicu-dev and icu-devtools were installed at the ./configure stage. I use Russian by default. For search to be case-insensitive, I have to set fts_filters = normalizer-icu (otherwise the search is case-sensitive). The option works, because search really does become case-insensitive. However, an error is written to the Dovecot logs:
Mar 01 11:57:45 imap([email protected])<242162><AlnFkZUSlLJ/AAAB>: Error: fts-flatcurve: fts_filter_normalizer_icu: libicu support not built in
Mar 01 11:57:45 imap([email protected])<242162><AlnFkZUSlLJ/AAAB>: Error: fts: Failed to initialize backend 'flatcurve': fts-flatcurve: Invalid settings
I don't understand why fts-flatcurve asked for the libicu library at the build stage, but at runtime reports that support is not built in?
I want to note that the search does not work correctly if I remove the normalizer-icu option (the search becomes case-sensitive).
I am attaching part of the dovecot configuration file:
fts_autoindex=yes
fts_autoindex_max_recent_msgs=80
fts_index_timeout=90s
fts = flatcurve
fts_enforced = yes
#fts_decoder = decode2text
fts_autoindex_exclude = \Trash
fts_autoindex_exclude2 = \Junk
fts_languages = ru en
fts_filters_en = lowercase english-possessive stopwords
fts_filters = normalizer-icu snowball stopwords
fts_tokenizer_generic = algorithm=simple
fts_tokenizers = generic email-address
I'm still getting to grips with Dovecot, so please bear with me.
Hi,
For a few days now, I haven't been able to use flatcurve because of the error below:
dovecot[3384677]: indexer-worker(user)<3384699><UehlaS0T9MQAAAAAAAAAAAAAAAAAAAAB:wMLDAQeM62V7pTMAquyQZQ>: Warning: fts-flatcurve(INBOX): Could not write message data: uid=51735; InvalidArgumentError: Term too long (> 245): _znst10_hashtablei7qstringst4pairiks0_iesais3_enst8__detail10_select1stest8equal_tois0_est4hashis0_ens5_18_mod_range_hashingens5_20_default_ranged_hashens5_20_prime_rehash_policyens5_17_hashtable_traitsilb0elb0elb1eeee9_m_rehashe{size_t}rk{size_t}@ba
dovecot[3384677]: indexer-worker(user)<3384699><UehlaS0T9MQAAAAAAAAAAAAAAAAAAAAB:wMLDAQeM62V7pTMAquyQZQ>: Warning: fts-flatcurve(INBOX): Could not write message data: uid=51737; InvalidArgumentError: Term too long (> 245): 9qtprivate18qfunctorslotobjectist5_bindifmn5qcoro6detail17waitoperationbasei10qtcpservereefvnst7__n486116coroutine_handleiveeepns3_14qcorotcpserver29waitfornewconnectionoperationes9_eeli0ens_4listijeeeve4impleipns_15qslotobjectbaseep7qobjectppvpb@
dovecot[3384677]: indexer-worker: Error: terminate called after throwing an instance of 'std::bad_alloc'
dovecot[3384677]: indexer-worker: Error: what(): std::bad_alloc
Mar 08 17:07:06 paluero dovecot[3384677]: imap(user)<3384697><UehlaS0T9MQAAAAAAAAAAAAAAAAAAAAB>: Error: Mailbox INBOX: indexer failed to index mailbox
Mar 08 17:07:06 paluero dovecot[3384677]: indexer-worker(user)<3384699><UehlaS0T9MQAAAAAAAAAAAAAAAAAAAAB:wMLDAQeM62V7pTMAquyQZQ>: Fatal: master: service(indexer-worker): child 3384699 killed with signal 6 (core dumped)
I haven't had the chance to investigate what's going on yet, but maybe this is a known issue?
Hi,
I stumbled upon this project and am curious whether it would be beneficial to adopt for my own server and to include officially in Debian (where I am a developer).
After reading the nice and detailed README file, one important question is still unanswered: how does it compare to other FTS plugins?
I was sort of expecting to find the answer in the section "Why flatcurve", but I understand that it is about "why the name flatcurve", not "why make yet another plugin called flatcurve".
Then I noticed that in Acknowledgements you mention conversations that led you to start the project. Perhaps if those conversations were public and you could link to them, my curiosity would be satisfied.
If I search in a virtual folder with unindexed messages, the index of the underlying real mailbox isn't updated and the search doesn't return all expected matches.
Here's a change to your test suite to reproduce the issue. I simply remove the index of INBOX and repeat the virtual search from before: edieterich@2bd8a1b
The test run fails:
Testing virtual search with unindexed messages
*** Test virtual command 3/10 (line 6)
- failed: Missing 1 untagged replies (1 mismatches)
- first unexpanded: search 1 2 3
- first expanded: search 1 2 3
- best match: SEARCH 1 3
- Command: search body body1
A virtual search should update the underlying indexes if necessary, is this correct?
Thank you for your help.
Hi
Thank you for your work on this plugin!
I've tried compiling it on FreeBSD, but the only way I could get it to compile was by manually changing config.status after configure to include the headers path:
-S["XAPIAN_LIBS"]="-L/usr/local/lib -lxapian"
+S["XAPIAN_LIBS"]="-L/usr/local/lib -lxapian -I/usr/local/include"
When running doveadm index on a never-indexed mailbox, FTS gets locked and eventually times out.
Debug: fts-flatcurve(INBOX): Waiting for DB (RW; current.1643974776251308) lock
Debug: fts-flatcurve(INBOX): Waiting for DB (RW; current.1643974776251308) lock
...
Fatal: fts-flatcurve: Could not obtain DB lock (RW; current.1643974776251308)
Tried with default settings:
fts = flatcurve
fts_flatcurve_commit_limit = 500
fts_flatcurve_max_term_size = 30
fts_flatcurve_min_term_size = 2
fts_flatcurve_optimize_limit = 10
fts_flatcurve_rotate_size = 5000
fts_flatcurve_rotate_time = 5000
fts_flatcurve_substring_search = no
fts_languages = en es de
fts_tokenizer_generic = algorithm=simple
fts_tokenizers = generic email-address
Any hint is appreciated!
After switching from an older Dovecot version with fts-lucene to Dovecot 2.3.16 with fts-flatcurve, I noticed that searches from Thunderbird were a lot slower than before. It turns out Thunderbird does its searches with UNDELETED, and the index simply isn't used for those searches.
Without UNDELETED, the index is being used:
. UID SEARCH BODY onmiddellijk
* SEARCH 499 1160 1175 1187 1196 1198 1265 1341 1631 1838 1840 2323 2327 2331 2344 2346 2347 2349 2416 2488 5126 5135 5844
. OK Search completed (0.033 + 0.000 + 0.032 secs).
With UNDELETED, the index isn't used and the search is a lot slower:
. UID SEARCH UNDELETED BODY onmiddellijk
* OK Searched 55% of the mailbox, ETA 0:07
* SEARCH 499 1160 1175 1187 1196 1198 1265 1341 1631 1838 1840 2323 2327 2331 2344 2346 2347 2349 2416 2488 5126 5135 5844
. OK Search completed (17.956 + 0.001 + 0.072 secs).
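To put a number on the difference between the two runs, the timings can be pulled out of the tagged OK responses and compared. This is only a small sketch: `search_timings` is a hypothetical helper, and it makes no assumption about what each of the three components in the parentheses means, it just returns them in order.

```python
import re
from typing import Tuple

TIMING_RE = re.compile(
    r"OK Search completed \(([\d.]+) \+ ([\d.]+) \+ ([\d.]+) secs\)"
)


def search_timings(line: str) -> Tuple[float, float, float]:
    """Extract the three timing components from a 'Search completed' reply."""
    match = TIMING_RE.search(line)
    if match is None:
        raise ValueError(f"no search timing found in: {line!r}")
    a, b, c = (float(x) for x in match.groups())
    return (a, b, c)


if __name__ == "__main__":
    indexed = search_timings(". OK Search completed (0.033 + 0.000 + 0.032 secs).")
    undeleted = search_timings(". OK Search completed (17.956 + 0.001 + 0.072 secs).")
    # Ratio of the first component of each reply.
    print(f"first component ratio: {undeleted[0] / indexed[0]:.0f}x")
```

With the two replies quoted above, the first component differs by a factor of several hundred, which is consistent with the UNDELETED search not being served from the index.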
I am really looking forward to Dovecot 2.4 with fts-flatcurve, but in the meantime, on Debian Bookworm, I tried to compile it manually.
Per the README, I downloaded the source code for fts-flatcurve 0.2.0, then compiled and installed it, which leads Dovecot 2.3.13 (the default version on Debian Bookworm) to report this:
doveadm fts-flatcurve check
Fatal: Couldn't load required plugin /usr/lib/dovecot/modules/lib21_fts_flatcurve_plugin.so: Module is for different ABI version 2.3.ABIv19(2.3.19.1) (we have 2.3.ABIv13(2.3.13))
Since there is no Dovecot CE version for Debian Bookworm yet, I am kind of stuck on Dovecot 2.3.13. The README suggests that fts-flatcurve 0.2.0 should work with that version, but it doesn't.
Am I missing something?
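The error message itself names both sides of the mismatch: the ABI the module was built for and the ABI of the running Dovecot. A minimal sketch that extracts the two, assuming the message format shown above (`parse_abi_mismatch` is a hypothetical helper, not part of Dovecot):

```python
import re
from typing import Dict

ABI_RE = re.compile(
    r"Module is for different ABI version "
    r"(?P<module_abi>\S+?)\((?P<module_version>[\d.]+)\) "
    r"\(we have (?P<core_abi>\S+?)\((?P<core_version>[\d.]+)\)\)"
)


def parse_abi_mismatch(message: str) -> Dict[str, str]:
    """Return the module's and the running core's ABI and version strings."""
    match = ABI_RE.search(message)
    if match is None:
        raise ValueError("message does not look like an ABI mismatch error")
    return match.groupdict()


if __name__ == "__main__":
    msg = (
        "Fatal: Couldn't load required plugin "
        "/usr/lib/dovecot/modules/lib21_fts_flatcurve_plugin.so: "
        "Module is for different ABI version 2.3.ABIv19(2.3.19.1) "
        "(we have 2.3.ABIv13(2.3.13))"
    )
    info = parse_abi_mismatch(msg)
    print("module built for:", info["module_version"])
    print("running Dovecot: ", info["core_version"])
```

Read this way, the message says the plugin was compiled against Dovecot 2.3.19.1 headers while the running server is 2.3.13, so the build probably picked up development files from a different Dovecot version than the one installed.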
Is it possible to remove the usage of the .la files?
The .la files are removed from the dovecot packages, even the -dev packages built by debian/ubuntu, so if anyone wishes to compile this package against it, they first have to custom compile dovecot, then build this package against that custom version.
Hi!
I saw that fts-flatcurve is recommended in the official Dovecot manual:
Is this project ready to use yet?
It seems to be very new, and I can't find anyone using it.
I would like to package it for NixOS so a version number would be nice.
We started to use fts flatcurve on Debian bookworm with the following components:
Everything seems to work fine, except that imap crashes from time to time for different users:
Feb 16 21:52:06 mail2 kernel: [459605.167041] imap[55238]: segfault at 8 ip 0000620dfcc4aee9 sp 0000713494feb4e0 error 4 in libxapian.so.30.12.3[620dfcc14000+173000]
Enabling coredumps the backtrace shows:
#0 0x00006ffff4529ee9 in Xapian::Database::get_document(unsigned int) const ()
from /lib/x86_64-linux-gnu/libxapian.so.30
#1 0x00006ffff6e67cb0 in fts_flatcurve_xapian_uid_exists_db (uid=uid@entry=8040,
backend=0x1a4f0a594a70) at fts-backend-flatcurve-xapian.cpp:916
#2 0x00006ffff6e6bb34 in fts_flatcurve_xapian_write_db_by_uid (uid=8040,
backend=0x1a4f0a594a70) at fts-backend-flatcurve-xapian.cpp:935
#3 fts_flatcurve_xapian_expunge (backend=0x1a4f0a594a70, uid=8040)
at fts-backend-flatcurve-xapian.cpp:1179
#4 0x00006ffff75050ba in mailbox_sync_notify ()
from /usr/lib/dovecot/libdovecot-storage.so.0
#5 0x00006ffff75292e1 in sdbox_sync_finish ()
It happens consistently at IMAP EXPUNGE, in all cases. The mailboxes are on NFS, but only a single machine accesses them. Any hint is really appreciated.
Hello, I'm running Dovecot 2.3.18 on a Gentoo machine and have installed dovecot-fts-flatcurve 0.2.0.
alpha.test rhering # doveadm fts-flatcurve check -A INBOX
doveadm(1188705): Panic: file doveadm-print.c: line 146: unreached
doveadm(1188705): Error: Raw backtrace: /usr/lib64/dovecot/libdovecot.so.0(backtrace_append+0x3d) [0x7f4237c4260d] -> /usr/lib64/dovecot/libdovecot.so.0(backtrace_get+0x1e) [0x7f4237c4271e] -> /usr/lib64/dovecot/libdovecot.so.0(+0xff5a9) [0x7f4237c4f5a9] -> /usr/lib64/dovecot/libdovecot.so.0(+0xff5e1) [0x7f4237c4f5e1] -> /usr/lib64/dovecot/libdovecot.so.0(+0x5446b) [0x7f4237ba446b] -> /usr/lib64/dovecot/libdovecot.so.0(+0x54807) [0x7f4237ba4807] -> doveadm(+0x21262) [0x56378fc29262] -> doveadm(doveadm_cmd_ver2_to_mail_cmd_wrapper+0x6a4) [0x56378fc3ced4] -> doveadm(doveadm_cmd_run_ver2+0x4df) [0x56378fc4cfaf] -> doveadm(doveadm_cmd_try_run_ver2+0x3a) [0x56378fc4d02a] -> doveadm(main+0x1d4) [0x56378fc2c8b4] -> /lib64/libc.so.6(__libc_start_main+0xcd) [0x7f42378867fd] -> doveadm(_start+0x2a) [0x56378fc2cb2a]
Aborted (core dumped)
alpha.test rhering # coredumpctl gdb
PID: 1188705 (doveadm)
UID: 0 (root)
GID: 65502 (vmail)
Signal: 6 (ABRT)
Timestamp: Wed 2022-02-16 14:05:43 CET (2s ago)
Command Line: doveadm fts-flatcurve check -A INBOX
Executable: /usr/bin/doveadm
Control Group: /user.slice/user-1000.slice/session-29.scope
Unit: session-29.scope
Slice: user-1000.slice
Session: 29
Owner UID: 1000 (rhering)
Boot ID: 29fb1366af184227a36ef9949418b94d
Machine ID: 6c8725f9e2f751645ac1422658e66800
Hostname: alpha.test
Storage: /var/lib/systemd/coredump/core.doveadm.0.29fb1366af184227a36ef9949418b94d.1188705.1645016743000000.zst (present)
Disk Size: 247.0K
Message: Process 1188705 (doveadm) of user 0 dumped core.
Found module linux-vdso.so.1 with build-id: fe5c504f30a69e6b50c2193d4271e0f461e16f18
Found module libnss_myhostname.so.2 without build-id.
Found module libnss_files.so.2 without build-id.
Found module libcap.so.2 without build-id.
Found module libnss_mymachines.so.2 without build-id.
Found module lib21_doveadm_fts_flatcurve_plugin.so without build-id.
Found module lib20_doveadm_fts_plugin.so without build-id.
Found module lib10_doveadm_quota_plugin.so without build-id.
Found module lib10_doveadm_acl_plugin.so without build-id.
Found module libuuid.so.1 without build-id.
Found module librt.so.1 without build-id.
Found module libxapian.so.30 without build-id.
Found module lib21_fts_flatcurve_plugin.so without build-id.
Found module lib20_zlib_plugin.so without build-id.
Found module lib20_virtual_plugin.so without build-id.
Found module lib20_replication_plugin.so without build-id.
Found module libicudata.so.70 without build-id.
Found module libgcc_s.so.1 without build-id.
Found module libstdc++.so.6 without build-id.
Found module libicuuc.so.70 without build-id.
Found module libicui18n.so.70 without build-id.
Found module libexttextcat-2.0.so.0 without build-id.
Found module libstemmer.so.2 without build-id.
Found module lib20_fts_plugin.so without build-id.
Found module lib15_notify_plugin.so without build-id.
Found module lib11_trash_plugin.so without build-id.
Found module lib10_quota_plugin.so without build-id.
Found module lib01_acl_plugin.so without build-id.
Found module libdovecot-sieve.so.0 without build-id.
Found module lib10_doveadm_sieve_plugin.so without build-id.
Found module libdl.so.2 without build-id.
Found module libpthread.so.0 without build-id.
Found module libc.so.6 without build-id.
Found module libm.so.6 without build-id.
Found module libdovecot.so.0 without build-id.
Found module libdovecot-storage.so.0 without build-id.
Found module libcrypt.so.2 without build-id.
Found module liblz4.so.1 without build-id.
Found module liblzma.so.5 without build-id.
Found module libbz2.so.1 without build-id.
Found module libz.so.1 without build-id.
Found module doveadm without build-id.
Stack trace of thread 1188705:
#0 0x00007f423789b5f1 __GI_raise (libc.so.6 + 0x385f1)
#1 0x00007f4237885536 __GI_abort (libc.so.6 + 0x22536)
#2 0x00007f4237ba47ba default_fatal_finish (libdovecot.so.0 + 0x547ba)
#3 0x00007f4237c4f5e1 default_fatal_handler (libdovecot.so.0 + 0xff5e1)
#4 0x00007f4237ba446b i_panic (libdovecot.so.0 + 0x5446b)
#5 0x00007f4237ba4807 i_unreached (libdovecot.so.0 + 0x54807)
#6 0x000056378fc29262 doveadm_print_sticky (doveadm + 0x21262)
#7 0x000056378fc3ced4 doveadm_mail_all_users (doveadm + 0x34ed4)
#8 0x000056378fc4cfaf doveadm_cmd_run_ver2 (doveadm + 0x44faf)
#9 0x000056378fc4d02a doveadm_cmd_try_run_ver2 (doveadm + 0x4502a)
#10 0x000056378fc2c8b4 main (doveadm + 0x248b4)
#11 0x00007f42378867fd __libc_start_main (libc.so.6 + 0x237fd)
#12 0x000056378fc2cb2a _start (doveadm + 0x24b2a)
GNU gdb (Gentoo 11.2 vanilla) 11.2
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/doveadm...
Reading symbols from /usr/lib/debug//usr/bin/doveadm.debug...
[New LWP 1188705]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `doveadm fts-flatcurve check -A INBOX'.
Program terminated with signal SIGABRT, Aborted.
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
49 return ret;
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
#1 0x00007f4237885536 in __GI_abort () at abort.c:79
#2 0x00007f4237ba47ba in default_fatal_finish (status=0, type=LOG_TYPE_PANIC) at failures.c:459
#3 fatal_handler_real (ctx=<optimized out>, format=<optimized out>, args=<optimized out>) at failures.c:471
#4 0x00007f4237c4f5e1 in default_fatal_handler (ctx=<optimized out>, format=<optimized out>, args=<optimized out>) at failures.c:479
#5 0x00007f4237ba446b in i_panic (format=format@entry=0x7f4237cb023d "file %s: line %d: unreached") at failures.c:524
#6 0x00007f4237ba4807 in i_unreached (source_filename=source_filename@entry=0x56378fc7c647 "doveadm-print.c", source_linenum=source_linenum@entry=146) at failures.c:990
#7 0x000056378fc29262 in doveadm_print_sticky (key=key@entry=0x56378fc7c8a3 "username", value=0x56379034eac0 "[email protected]") at doveadm-print.c:146
#8 0x000056378fc3ced4 in doveadm_mail_all_users (wildcard_user=0x0, ctx=0x563790342368) at doveadm-mail.c:533
#9 doveadm_mail_cmd_exec (wildcard_user=0x0, ctx=0x563790342368) at doveadm-mail.c:670
#10 doveadm_cmd_ver2_to_mail_cmd_wrapper (cctx=<optimized out>) at doveadm-mail.c:988
#11 0x000056378fc4cfaf in doveadm_cmd_run_ver2 (argc=3, argv=0x563790310a20, cctx=<optimized out>) at doveadm-cmd.c:465
#12 0x000056378fc4d02a in doveadm_cmd_try_run_ver2 (cmd_name=cmd_name@entry=0x563790310a48 "fts-flatcurve", argc=<optimized out>, argv=<optimized out>, cctx=cctx@entry=0x7fffa4e89430) at doveadm-cmd.c:363
#13 0x000056378fc2c8b4 in main (argc=<optimized out>, argv=<optimized out>) at doveadm.c:361
Another crash, again an abort from a panic rather than a segfault:
alpha.test rhering # doveadm fts-flatcurve check -u [email protected] "?"
doveadm([email protected]): Panic: file fts-backend-flatcurve.c: line 120: unreached
doveadm([email protected]): Error: Raw backtrace: /usr/lib64/dovecot/libdovecot.so.0(backtrace_append+0x3d) [0x7fd7767c760d] -> /usr/lib64/dovecot/libdovecot.so.0(backtrace_get+0x1e) [0x7fd7767c771e] -> /usr/lib64/dovecot/libdovecot.so.0(+0xff5a9) [0x7fd7767d45a9] -> /usr/lib64/dovecot/libdovecot.so.0(+0xff5e1) [0x7fd7767d45e1] -> /usr/lib64/dovecot/libdovecot.so.0(+0x5446b) [0x7fd77672946b] -> /usr/lib64/dovecot/libdovecot.so.0(+0x54807) [0x7fd776729807] -> /usr/lib64/dovecot/lib21_fts_flatcurve_plugin.so(+0x7de2) [0x7fd773dfede2] -> /usr/lib64/dovecot/doveadm/lib21_doveadm_fts_flatcurve_plugin.so(+0x2817) [0x7fd773bab817] -> doveadm(+0x33a5d) [0x560e90ec1a5d] -> doveadm(doveadm_cmd_ver2_to_mail_cmd_wrapper+0x2bc) [0x560e90ec2aec] -> doveadm(doveadm_cmd_run_ver2+0x4df) [0x560e90ed2faf] -> doveadm(doveadm_cmd_try_run_ver2+0x3a) [0x560e90ed302a] -> doveadm(main+0x1d4) [0x560e90eb28b4] -> /lib64/libc.so.6(__libc_start_main+0xcd) [0x7fd77640b7fd] -> doveadm(_start+0x2a) [0x560e90eb2b2a]
Aborted (core dumped)
alpha.test rhering # coredumpctl gdb
PID: 1188745 (doveadm)
UID: 65502 (vmail)
GID: 65502 (vmail)
Signal: 6 (ABRT)
Timestamp: Wed 2022-02-16 14:06:31 CET (2min 36s ago)
Command Line: doveadm fts-flatcurve check -u [email protected] $'?'
Executable: /usr/bin/doveadm
Control Group: /user.slice/user-1000.slice/session-29.scope
Unit: session-29.scope
Slice: user-1000.slice
Session: 29
Owner UID: 1000 (rhering)
Boot ID: 29fb1366af184227a36ef9949418b94d
Machine ID: 6c8725f9e2f751645ac1422658e66800
Hostname: alpha.test
Storage: /var/lib/systemd/coredump/core.doveadm.65502.29fb1366af184227a36ef9949418b94d.1188745.1645016791000000.zst (present)
Disk Size: 245.3K
Message: Process 1188745 (doveadm) of user 65502 dumped core.
Found module linux-vdso.so.1 with build-id: fe5c504f30a69e6b50c2193d4271e0f461e16f18
Found module libnss_myhostname.so.2 without build-id.
Found module libnss_files.so.2 without build-id.
Found module libcap.so.2 without build-id.
Found module libnss_mymachines.so.2 without build-id.
Found module lib21_doveadm_fts_flatcurve_plugin.so without build-id.
Found module lib20_doveadm_fts_plugin.so without build-id.
Found module lib10_doveadm_quota_plugin.so without build-id.
Found module lib10_doveadm_acl_plugin.so without build-id.
Found module libuuid.so.1 without build-id.
Found module librt.so.1 without build-id.
Found module libxapian.so.30 without build-id.
Found module lib21_fts_flatcurve_plugin.so without build-id.
Found module lib20_zlib_plugin.so without build-id.
Found module lib20_virtual_plugin.so without build-id.
Found module lib20_replication_plugin.so without build-id.
Found module libicudata.so.70 without build-id.
Found module libgcc_s.so.1 without build-id.
Found module libstdc++.so.6 without build-id.
Found module libicuuc.so.70 without build-id.
Found module libicui18n.so.70 without build-id.
Found module libexttextcat-2.0.so.0 without build-id.
Found module libstemmer.so.2 without build-id.
Found module lib20_fts_plugin.so without build-id.
Found module lib15_notify_plugin.so without build-id.
Found module lib11_trash_plugin.so without build-id.
Found module lib10_quota_plugin.so without build-id.
Found module lib01_acl_plugin.so without build-id.
Found module libdovecot-sieve.so.0 without build-id.
Found module lib10_doveadm_sieve_plugin.so without build-id.
Found module libdl.so.2 without build-id.
Found module libpthread.so.0 without build-id.
Found module libc.so.6 without build-id.
Found module libm.so.6 without build-id.
Found module libdovecot.so.0 without build-id.
Found module libdovecot-storage.so.0 without build-id.
Found module libcrypt.so.2 without build-id.
Found module liblz4.so.1 without build-id.
Found module liblzma.so.5 without build-id.
Found module libbz2.so.1 without build-id.
Found module libz.so.1 without build-id.
Found module doveadm without build-id.
Stack trace of thread 1188745:
#0 0x00007fd7764205f1 __GI_raise (libc.so.6 + 0x385f1)
#1 0x00007fd77640a536 __GI_abort (libc.so.6 + 0x22536)
#2 0x00007fd7767297ba default_fatal_finish (libdovecot.so.0 + 0x547ba)
#3 0x00007fd7767d45e1 default_fatal_handler (libdovecot.so.0 + 0xff5e1)
#4 0x00007fd77672946b i_panic (libdovecot.so.0 + 0x5446b)
#5 0x00007fd776729807 i_unreached (libdovecot.so.0 + 0x54807)
#6 0x00007fd773dfede2 fts_backend_flatcurve_set_mailbox (lib21_fts_flatcurve_plugin.so + 0x7de2)
#7 0x00007fd773bab817 cmd_fts_flatcurve_mailbox_run_do (lib21_doveadm_fts_flatcurve_plugin.so + 0x2817)
#8 0x0000560e90ec1a5d doveadm_mail_next_user (doveadm + 0x33a5d)
#9 0x0000560e90ec2aec doveadm_mail_cmd_exec (doveadm + 0x34aec)
#10 0x0000560e90ed2faf doveadm_cmd_run_ver2 (doveadm + 0x44faf)
#11 0x0000560e90ed302a doveadm_cmd_try_run_ver2 (doveadm + 0x4502a)
#12 0x0000560e90eb28b4 main (doveadm + 0x248b4)
#13 0x00007fd77640b7fd __libc_start_main (libc.so.6 + 0x237fd)
#14 0x0000560e90eb2b2a _start (doveadm + 0x24b2a)
GNU gdb (Gentoo 11.2 vanilla) 11.2
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/doveadm...
Reading symbols from /usr/lib/debug//usr/bin/doveadm.debug...
[New LWP 1188745]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `doveadm fts-flatcurve check -u [email protected] ?'.
Program terminated with signal SIGABRT, Aborted.
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
49 return ret;
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
#1 0x00007fd77640a536 in __GI_abort () at abort.c:79
#2 0x00007fd7767297ba in default_fatal_finish (status=0, type=LOG_TYPE_PANIC) at failures.c:459
#3 fatal_handler_real (ctx=<optimized out>, format=<optimized out>, args=<optimized out>) at failures.c:471
#4 0x00007fd7767d45e1 in default_fatal_handler (ctx=<optimized out>, format=<optimized out>, args=<optimized out>) at failures.c:479
#5 0x00007fd77672946b in i_panic (format=format@entry=0x7fd77683523d "file %s: line %d: unreached") at failures.c:524
#6 0x00007fd776729807 in i_unreached (source_filename=source_filename@entry=0x7fd773e081ad "fts-backend-flatcurve.c", source_linenum=source_linenum@entry=120) at failures.c:990
#7 0x00007fd773dfede2 in fts_backend_flatcurve_set_mailbox (backend=backend@entry=0x560e91a4b6d0, box=0x560e91a5b328) at fts-backend-flatcurve.c:120
#8 0x00007fd773bab817 in cmd_fts_flatcurve_mailbox_run_do (ctx=0x560e91a11388, user=<optimized out>, backend=0x560e91a4b6d0) at doveadm-fts-flatcurve.c:145
#9 cmd_fts_flatcurve_mailbox_run (_ctx=0x560e91a11388, user=<optimized out>) at doveadm-fts-flatcurve.c:231
#10 0x0000560e90ec1a5d in doveadm_mail_next_user (ctx=0x560e91a11388, error_r=0x7ffff796cc98) at doveadm-mail.c:464
#11 0x0000560e90ec2aec in doveadm_mail_cmd_exec (wildcard_user=0x0, ctx=0x560e91a11388) at doveadm-mail.c:659
#12 doveadm_cmd_ver2_to_mail_cmd_wrapper (cctx=<optimized out>) at doveadm-mail.c:988
#13 0x0000560e90ed2faf in doveadm_cmd_run_ver2 (argc=4, argv=0x560e919dfa20, cctx=<optimized out>) at doveadm-cmd.c:465
#14 0x0000560e90ed302a in doveadm_cmd_try_run_ver2 (cmd_name=cmd_name@entry=0x560e919dfa50 "fts-flatcurve", argc=<optimized out>, argv=<optimized out>, cctx=cctx@entry=0x7ffff796cdf0) at doveadm-cmd.c:363
#15 0x0000560e90eb28b4 in main (argc=<optimized out>, argv=<optimized out>) at doveadm.c:361
Hi everyone,
Very nice plugin you wrote with flatcurve, thanks for that. It works like a charm so far :)
I have a question, or a feature request if flatcurve does not support this yet:
Can the location of the flatcurve index directory (not the Dovecot index) be changed? By default it sits inside the Dovecot index directory, which is fine so far.
It would be nice to be able to move the flatcurve indexes elsewhere, e.g. to a faster disk, without being tied to the location of Dovecot's own index.
Is this already possible and I have overlooked it? Or would it require a change to the flatcurve plugin (if it is possible at all)?
Thanks for your answer :)
I've set up
dovecot-2.3.18-1.fc36.x86_64
and built/installed
dovecot23_fts_flatcurve-git.HEAD-0.20220520_110538.fc36.x86_64
After (re)scan + (re)index, if I run a search in Thunderbird:
Search for messages in: pgnd@example
[X] Search subfolders
[X] Run search on server
[X] Match all of the following
[Body] [contains] [teststring]
with dovecot logging enabled as
log_debug = category=fts-flatcurve
I see in my logs, e.g.,
...
2022-05-22 20:11:36 imap([email protected])<lNL/bKLf1t2sHgsH>: Debug: fts-flatcurve(some/path/subfolder): Query (body:teststr* OR body:teststring*) matches=0 uids=
...
That query,
body:teststr* OR body:teststring*
is an unneeded, basically redundant boolean operation: the prefix teststr* already covers everything teststring* can match.
Is this a bug or a config problem?
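To illustrate why the OR looks redundant, here is a minimal sketch. It assumes the trailing `*` in the logged query means a plain prefix match; the term list and the helper `prefix_matches` are made up for illustration and say nothing about how flatcurve builds the query internally.

```python
def prefix_matches(terms, prefix):
    """Return the set of indexed terms that start with the given prefix."""
    return {t for t in terms if t.startswith(prefix)}

# Hypothetical index terms, for illustration only.
indexed_terms = {"teststr", "teststring", "teststrings", "testing", "tester"}

short = prefix_matches(indexed_terms, "teststr")
long_ = prefix_matches(indexed_terms, "teststring")

# Every term matched by the longer prefix is also matched by the shorter one,
# so the OR of both equals the shorter-prefix result on its own.
union = short | long_
print(sorted(short), sorted(long_), union == short)
```

Under that assumption the second operand can only cost time, never change the result set.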
Hi,
I'm trying to build this on el9, but the build is failing on:
fts-backend-flatcurve-xapian.cpp: In function 'int fts_flatcurve_xapian_lock(flatcurve_fts_backend*)':
fts-backend-flatcurve-xapian.cpp:517:13: error: 'struct file_create_settings' has no member named 'lock_settings'
517 | set.lock_settings.close_on_free = TRUE;
| ^~~~~~~~~~~~~
fts-backend-flatcurve-xapian.cpp:518:13: error: 'struct file_create_settings' has no member named 'lock_settings'
518 | set.lock_settings.unlink_on_free = TRUE;
| ^~~~~~~~~~~~~
fts-backend-flatcurve-xapian.cpp:519:13: error: 'struct file_create_settings' has no member named 'lock_settings'
519 | set.lock_settings.lock_method = backend->parsed_lock_method;
| ^~~~~~~~~~~~~
The distro is shipping dovecot 2.3.16 and xapian 1.4.18, so the requirements are met according to the front page.
Any idea what the problem may be?
Hi,
I have been using the Lucene FTS plugin for a decade now and it has served me well. I thought of upgrading to something new and current and came across your flatcurve plugin, which seems very promising (xapian, on the other hand, was creating indexes larger than my mailboxes themselves). I am using the following configuration in dovecot.conf:
fts = flatcurve
fts_filters_en = lowercase english-possessive stopwords
fts_languages = en
fts_tokenizers = generic email-address
fts_autoindex = no
fts_enforced = yes
A search command like this:
doveadm -D search -u [email protected] mailbox INBOX SUBJECT "/home/johndoe/render.php"
should show the messages with subject: "CRON: /home/johndoe/render.php OK" but produces a lot of extra undesired results and I think the second line in this debug output indicates the reason:
May 23 07:44:13 doveadm([email protected]): Debug: fts-flatcurve(INBOX): Query (hdr_subject:/home/johndoe/render.php*) matches=0 uids=
May 23 07:44:13 doveadm([email protected]): Debug: fts-flatcurve(INBOX): Query (hdr_subject:php* AND hdr_subject:render* AND hdr_subject:johndoe* AND hdr_subject:home*) matches=272 uids=67041,67085,67188,67223,67257,67290,67323,67355,67395,67564,67770,67817,67863,67985,68819,69512,69572,69635,69737,70017,70058,70086,70125,70147,70191,70296,70304,70331,70340,70350,70354,70375,70407,70417,70427,70449,70499,70521:70522,70535:70550,70555,70561:70563,70591,70597:70599,70662,70685,70702,70708,70718:70719,70724,70727:70728,70730:70733,70735,70746:70747,70754,70775,70777,70794,70811:70812,70822,70866,70942,70948,70971,71017,71021,71040,71042,71075,71079,71084,71113,71128:71129,71131,71152,71160,71184,71188,71208,71214,71225,71255,71269,71297,71300,71331,71375,71422,71449,71457,71467,71469,71495,71515,71605,71626,71632,71649,71672,71681:71682,71689,71692,71699,71716,71757,71770,71777,71782:71785,71790,71795,71797,71814,71818:71819,71828,71838:71842,71845,71859:71860,71937,71947,71954,71960,71963:71964,71977,71990,72014,72021:72022,72030,72034:72042,72045:72046,72049,72056,72061,72063,72073:72074,72083,72088,72090,72092,72101,72108,72129,72131:72132,72134,72136:72140,72159,72163,72172:72173,72186,72212,72218:72223,72237,72239,72246,72267,72288,72387,72410,72446,72469,72476:72477,72514,72541,72543,72568:72569,72572:72574,72598,72604,72606,72609,72644,72674,72687,72691,72694,72734,72772,72791,72797,72799,72803,72832:72833,72835:72841,72856:72857,72866:72867,72873:72874,72901,72930,72938,72948,72960,72965,72976,73018,73037,73071,73081,73116,73158,73249,73307,73352,73392,73466,73533,73601,73670,73733,73775,73784:73786,73804,73807,73811,73815,73819,73823,73825,73831,73842,73846,74005,74199,74390,74540,74684,74854,75017,75192,75354,75525,75710,75839:75843,75845,75903,75984:75985,76091,76263,76447,76624,76816,76989,77091:77092,77097,77119,77155,77293,77460,77608,77761,77908,78066,78218,78393,78400:78401,78522:78523,78560,78728,78921,79104,79298,79504,79555,79898,80027,80031:80032,80
034:80035,80037,80056,80071,80073,80077:80079,80082:80084,80086,80089
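One reading of the second debug line is that the subject string was split into four separate terms joined with AND, so any subject containing all four words (as prefixes), in any order, matches. A minimal sketch of that effect follows; the tokenizer here is a simple stand-in written for this example, not Dovecot's generic tokenizer, and the second subject is hypothetical.

```python
import re

def tokenize(text):
    """Split on anything that is not a letter or digit and lowercase the parts.
    This is a stand-in for illustration, not Dovecot's generic tokenizer."""
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", text) if t]

def matches_all_prefixes(subject, query):
    """True if every query token is a prefix of at least one subject token."""
    subject_tokens = tokenize(subject)
    return all(
        any(s.startswith(q) for s in subject_tokens)
        for q in tokenize(query)
    )

query = "/home/johndoe/render.php"

# The subject from the report matches, as intended.
wanted = "CRON: /home/johndoe/render.php OK"
# A hypothetical subject with the same words in a different arrangement.
unwanted = "CRON: /home/johndoe/php/render.log failed, see homepage"

print(tokenize(query))
print(matches_all_prefixes(wanted, query), matches_all_prefixes(unwanted, query))
```

Both subjects match because word order and adjacency are lost once the path is broken into independent terms.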
I tried rebuilding the indexes with fts_flatcurve_substring_search = yes too, but that didn't change anything. It works as expected with the Lucene plugin because in that case the header search is performed via the Dovecot indexes instead of FTS. Maybe I am not configuring this new FTS correctly? I would really appreciate some pointers here.
Thanks,
Sam
Hello,
I'm testing the FTS flatcurve plugin to find out whether I can switch from FTS Solr to flatcurve.
In my configuration I have enabled virtual mailboxes, and to search in all folders I just SEARCH on the Virtual/All folder.
With Solr, if this (virtual) folder is not indexed, Dovecot starts to index it (across all real folders).
However, with FTS flatcurve, when I SEARCH on Virtual/All for the first time, the indexer process does not start and the search returns no data.
Only after I manually run doveadm index -q -u [email protected] '*' does flatcurve find messages.
Can flatcurve have the same feature as Solr for virtual mailboxes?
Here is a sample of my configuration:
namespace Virtual {
hidden = yes
list = no
location = virtual:/etc/dovecot/virtual:INDEX=~/Maildir/virtual
prefix = Virtual/
separator = /
subscriptions = no
}
namespace inbox {
[...]
mailbox virtual/All {
comment = All my messages
special_use = \All
}
}
# cat /etc/dovecot/virtual/All/dovecot-virtual
*
all
# in 90-plugin.conf I have:
fts = flatcurve
fts_flatcurve = commit_limit=500 max_term_size=30 min_term_size=2 \
optimize_limit=10 rotate_size=5000 rotate_time=5000 \
substring_search=no
fts_autoindex = yes
fts_enforced = yes
fts_index_timeout = 5s
fts_filters = normalizer-icu snowball stopwords
fts_filters_en = lowercase snowball english-possessive stopwords
fts_languages = en it es de
fts_tokenizer_generic = algorithm=simple
fts_tokenizers = generic email-address
Thanks
Just testing this software, and found a likely bug in an if statement.
fts-backend-flatcurve-xapian.cpp, line 607:
else if (errno = ENOENT)
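should likely be else if (errno == ENOENT). As written, the condition assigns ENOENT to errno and, because ENOENT is nonzero, always evaluates to true.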
Hey, is it normal that the FTS storage needs ~3 times more space than the mail storage?
du -sh /srv/vmail/<domain>/<local>/mdbox/storage
12G /srv/vmail/<domain>/<local>/mdbox/storage
du -sh /srv/vmail/<domain>/<local>/mdbox/mailboxes/INBOX/dbox-Mails/fts-flatcurve
40G /srv/vmail/<domain>/<local>/mdbox/mailboxes/INBOX/dbox-Mails/fts-flatcurve
Here is my FTS config:
plugin {
fts = flatcurve
fts_flatcurve_substring_search = no
fts_autoindex = yes
fts_enforced = yes
fts_filters = normalizer-icu snowball stopwords
fts_filters_en = lowercase snowball english-possessive stopwords
fts_index_timeout = 60s
fts_languages = en de
fts_tokenizer_generic = algorithm=simple
fts_tokenizers = generic email-address
fts_decoder = decode2text
fts_autoindex_exclude = \Junk
fts_autoindex_exclude2 = \Trash
}
For comparison, the old Lucene indexes of this mailbox:
# du -sh /srv/vmail/<domain>/<local>/mdbox/lucene-indexes
796M /srv/vmail/<domain>/<local>/mdbox/lucene-indexes
I get segmentation faults when I search after expunging mails. Here's a sequence of IMAP commands that causes the segmentation fault. Initially there are 2 mails in the inbox (the number of mails shouldn't matter).
1 SELECT INBOX
* 2 EXISTS
2 SEARCH TEXT test
* SEARCH 1 2
3 STORE 2 +FLAGS (\deleted)
4 EXPUNGE
5 SEARCH TEXT test
-> Segmentation fault
Here's the backtrace from the segmentation fault:
#0 0x00007f9e8f98078f in Xapian::Database::get_doccount() const () from /lib/x86_64-linux-gnu/libxapian.so.30
#1 0x00007f9e8f98d872 in Xapian::Enquire::Internal::get_mset(unsigned int, unsigned int, unsigned int, Xapian::RSet const*, Xapian::MatchDecider const*) const () from /lib/x86_64-linux-gnu/libxapian.so.30
#2 0x00007f9e8f98dcb4 in Xapian::Enquire::get_mset(unsigned int, unsigned int, unsigned int, Xapian::RSet const*, Xapian::MatchDecider const*) const () from /lib/x86_64-linux-gnu/libxapian.so.30
#3 0x00007f9e8fb70c44 in fts_flatcurve_xapian_get_last_uid_query (backend=0x55a95b1c0420, db=0x55a95b243800) at fts-backend-flatcurve-xapian.cpp:766
#4 0x00007f9e8fb70ea1 in fts_flatcurve_xapian_get_last_uid (backend=0x55a95b1c0420, last_uid_r=0x7ffda2ae2148) at fts-backend-flatcurve-xapian.cpp:792
#5 0x00007f9e8fb6d1b0 in fts_backend_flatcurve_get_last_uid (_backend=0x55a95b1c0420, box=0x55a95b1cb308, last_uid_r=0x7ffda2ae2148) at fts-backend-flatcurve.c:137
#6 0x00007f9e921b6c25 in fts_backend_get_last_uid (backend=0x55a95b1c0420, box=0x55a95b1cb308, last_uid_r=0x7ffda2ae2148) at fts-api.c:106
#7 0x00007f9e921bb4a4 in fts_indexer_init (backend=0x55a95b1c0420, box=0x55a95b1cb308, ctx_r=0x55a95b181430) at fts-indexer.c:101
#8 0x00007f9e921c131b in fts_try_build_init (ctx=0x55a95b1d04e0, fctx=0x55a95b1813d0) at fts-storage.c:140
#9 0x00007f9e921c1909 in fts_mailbox_search_init (t=0x55a95b178040, args=0x55a95b1f3518, sort_program=0x0, wanted_fields=(unknown: 0), wanted_headers=0x0) at fts-storage.c:258
#10 0x00007f9e92688d04 in mailbox_search_init (t=0x55a95b178040, args=0x55a95b1f3518, sort_program=0x0, wanted_fields=(unknown: 0), wanted_headers=0x0) at mail-storage.c:2254
#11 0x000055a95991709b in imap_search_start (ctx=0x55a95b1b78a0, sargs=0x55a95b1f3518, sort_program=0x0) at imap-search.c:537
#12 0x000055a9598feb9e in cmd_search (cmd=0x55a95b1b76f8) at cmd-search.c:48
#13 0x000055a95990b650 in command_exec (cmd=0x55a95b1b76f8) at imap-commands.c:201
#14 0x000055a959908a6d in client_command_input (cmd=0x55a95b1b76f8) at imap-client.c:1209
#15 0x000055a959908dae in client_command_input (cmd=0x55a95b1b76f8) at imap-client.c:1276
#16 0x000055a959908f13 in client_handle_next_command (client=0x55a95b1b4878, remove_io_r=0x7ffda2ae24b1) at imap-client.c:1318
#17 0x000055a95990901c in client_handle_input (client=0x55a95b1b4878) at imap-client.c:1332
#18 0x000055a959909206 in client_input (client=0x55a95b1b4878) at imap-client.c:1376
#19 0x00007f9e92555c5b in io_loop_call_io (io=0x55a95b1ca500) at ioloop.c:715
#20 0x00007f9e92558eb3 in io_loop_handler_run_internal (ioloop=0x55a95b176260) at ioloop-epoll.c:222
#21 0x00007f9e92555ee1 in io_loop_handler_run (ioloop=0x55a95b176260) at ioloop.c:767
#22 0x00007f9e92555dbd in io_loop_run (ioloop=0x55a95b176260) at ioloop.c:740
#23 0x00007f9e924722a3 in master_service_run (service=0x55a95b1760c0, callback=0x55a95991fd63 <client_connected>) at master-service.c:862
#24 0x000055a959920194 in main (argc=1, argv=0x55a95b175de0) at main.c:546
This is the latest master branch version of the flatcurve plugin. My configuration looks like this:
plugin {
fts_autoindex = no
fts_languages = de en
fts_tokenizers = generic email-address
fts_tokenizer_generic = algorithm=simple
fts_filters = lowercase stopwords snowball
fts_language_config = /usr/share/libexttextcat/fpdb.conf
}
Can you reproduce this issue and figure out what's happening or do you need more information from me? Thank you for your help!
Hi!
I updated to version 1.0.3 and decided to rebuild the indexes with the command:
doveadm -D force-resync -u '*' '*'
During the process, log entries like this appear:
Native optimize failed, fallback to manual optimization; Invalid Operation Error: when merging databases, --no-number is only currently supported if the databases have disjoint ranges of used document ids: /var/mail/mydomain.com/[email protected]/.Archives.2019.INBOX/fts-flatcurve/index.6426 has range 1-5672, /var/mail/mydomain.com/[email protected]/.Archives.2019.INBOX/fts-flatcurve/index.1003 has range 1-5672
Do I need to take any action to set up flatcurve? Or will flatcurve deal with this error itself?
I added a test for database sharding to my fork: master...edieterich:test-database-sharding
(I only reduced rotate_size to make sharding happen faster.)
The test run fails with Xapian::DatabaseNotFoundError: https://github.com/edieterich/dovecot-fts-flatcurve/runs/3788875522?check_suite_focus=true#step:3:14793
If I run the test locally and check the mailbox, I don't see an index.current directory:
ls -l /dovecot/sdbox/user/sdbox/mailboxes/imaptest/dbox-Mails/fts-flatcurve/
total 4
drwx------ 2 vmail vmail 4096 Oct 4 10:13 index.1017
When I test manually, sharding sometimes works. But then I have the problem that not all shards are considered when searching.
Here's an example for what I mean. I search while delivering mails in parallel. I repeatedly deliver a few 1000 mails again and again. Here's a search result:
. search text backend
* SEARCH 3 40 42 47 69 71 106 109 117 118 173 281 480 555 577 578 985 1003 1040 1042 1047 1069 1071 1106 1109 1117 1118 1173 1281 1480 1555 1577 1578 1985 2066 2222 2328 2331 2353 2354 2436 2456 2669 2864 2904 2907 2953 3016 3027 3090 3180 3210 3224 3491 3492 3557 3660 3738 3740 3742 3747 3748 3756 4047 4097 4130 4281 4630 4635 4647 4650 4672 4673 4674 4675 4676 4677 4681 4682 4699 4700 4701 4704 4756 4806 4909 4912 4914 4975 5003 5040 5042 5047 5069 5071 5106 5109 5117 5118 5173 5281 5480 5555 5577 5578 5985 6003 6040 6042 6047 6069 6071 6106 6109 6117 6118 6173 6281 6480 6555 6577 6578 6985 7066 7222 7328 7331 7353 7354 7436 7456 7669 7864 7904 7907 7953 8016 8027 8090 8180 8210 8224 8491 8492 8557 8660 8738 8740 8742 8747 8748 8756 9003 9040 9042 9047 9069 9071 9106 9109 9117 9118 9173 9281 9480 9555 9577 9578
Some time during all of this, the database is rotated:
ls -l /var/spool/dovecot/user/mailboxes/INBOX/dbox-Mails/fts-flatcurve/
total 12
drwx------ 2 mail mail 4096 Oct 4 13:15 index.4372
drwx------ 2 mail mail 4096 Oct 4 11:25 index.5480
drwx------ 2 mail mail 4096 Oct 4 13:15 index.current
When I repeat the search, my expectation is that I get the same mails as in the first search plus some additional matches from the mails delivered in the meantime. Instead the mails "in the middle" are missing. I assume that I only get matches from the first and the current shard. The matches from the second shard (presumably) are missing:
. search text backend
* SEARCH 3 40 42 47 69 71 106 109 117 118 173 281 480 555 577 578 985 1003 1040 1042 1047 1069 1071 1106 1109 1117 1118 1173 1281 1480 1555 1577 1578 1985 2066 2222 2328 2331 2353 2354 2436 2456 2669 2864 2904 2907 2953 3016 3027 3090 3180 3210 3224 3491 3492 3557 3660 3738 3740 3742 3747 3748 3756 4047 4097 4130 4281 4630 4635 4647 4650 4672 4673 4674 4675 4676 4677 4681 4682 4699 4700 4701 4704 4756 4806 4909 4912 4914 4975 10066 10222 10328 10331 10353 10354 10436 10456
When I open a new IMAP session, my expectation is met:
. search text backend
* SEARCH 3 40 42 47 69 71 106 109 117 118 173 281 480 555 577 578 985 1003 1040 1042 1047 1069 1071 1106 1109 1117 1118 1173 1281 1480 1555 1577 1578 1985 2066 2222 2328 2331 2353 2354 2436 2456 2669 2864 2904 2907 2953 3016 3027 3090 3180 3210 3224 3491 3492 3557 3660 3738 3740 3742 3747 3748 3756 4047 4097 4130 4281 4630 4635 4647 4650 4672 4673 4674 4675 4676 4677 4681 4682 4699 4700 4701 4704 4756 4806 4909 4912 4914 4975 5003 5040 5042 5047 5069 5071 5106 5109 5117 5118 5173 5281 5480 5555 5577 5578 5985 6003 6040 6042 6047 6069 6071 6106 6109 6117 6118 6173 6281 6480 6555 6577 6578 6985 7066 7222 7328 7331 7353 7354 7436 7456 7669 7864 7904 7907 7953 8016 8027 8090 8180 8210 8224 8491 8492 8557 8660 8738 8740 8742 8747 8748 8756 9003 9040 9042 9047 9069 9071 9106 9109 9117 9118 9173 9281 9480 9555 9577 9578 9985 10066 10222 10328 10331 10353 10354 10436 10456
These are all the mails from the first search plus additional matches for newer mails.
I tried to write a test script for this, but I never got past the Xapian::DatabaseNotFoundError. Maybe you can reproduce this issue?
It feels like both problems are related. Opening/closing databases in an IMAP session doesn't seem to work correctly.
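To pin down which results disappear between two searches in the same session, the `* SEARCH` responses can be compared as sets. This is only a sketch: the helper names are made up here, and the two response strings are shortened excerpts of the ones quoted above, with most message numbers omitted.

```python
def parse_search(line):
    """Parse an untagged IMAP '* SEARCH n n n' response into a set of ints."""
    parts = line.split()
    if parts[:2] != ["*", "SEARCH"]:
        raise ValueError("not a SEARCH response: %r" % line)
    return {int(p) for p in parts[2:]}

def missing_range(before, after):
    """Return (lowest, highest, count) of numbers in 'before' but not in
    'after', or None when nothing is missing."""
    gone = sorted(before - after)
    if not gone:
        return None
    return gone[0], gone[-1], len(gone)

# Shortened excerpts from the report (most numbers omitted for brevity).
first = parse_search("* SEARCH 4914 4975 5003 5040 9577 9578")
second = parse_search("* SEARCH 4914 4975 10066 10222")

print(missing_range(first, second))
```

Run against the full responses, the lowest and highest missing numbers should bracket the block of mails that dropped out, which can then be compared with the shard directories listed above.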
xapian's capable of configurable attachment indexing/search, e.g.
https://xapian.org/docs/omega/overview.html
https://github.com/xelkano/redmine_xapian
https://wiki.bcs.rochester.edu/StatsWiki/HelpOnXapian
is dovecot-fts-flatcurve attachment capable, and configurable?
one alternative that (still?) seems to work is to enable fts_decoder,
https://doc.dovecot.org/settings/plugin/fts-plugin/#plugin_setting-fts-fts_decoder
as,
plugin {
...
fts_decoder = decode2text
}
service decode2text {
executable = script /usr/libexec/dovecot/decode2text.sh
user = vmail
unix_listener decode2text {
mode = 0666
}
}
iiuc, the attachment, once converted to text, is scanned/indexed by fts, but without using Xapian/flatcurve native capabilities.