talp-upc / freeling

FreeLing project source code

License: Other

Java 0.02% Makefile 0.01% Perl 0.03% PHP 0.01% Python 0.01% Ruby 0.02% Shell 0.03% Lex 91.23% TeX 0.01% Batchfile 0.01% C 0.89% C++ 7.66% Awk 0.01% Slash 0.01% CMake 0.03% Cuda 0.01% Dockerfile 0.01% SWIG 0.06% Erlang 0.01%

freeling's Introduction

FreeLing

FreeLing project source code

You'll find more information about this project at FreeLing web page

freeling's People

Contributors

ambs, andreaslillebo, arademaker, azh, bryant1410, diegodlh, guillemcordoba, jgsogo, jslauthor, linguista, lluisp, merqlove, owentrigueros, pauarge, peleitor, sergillamas, setzer22

freeling's Issues

locutions or locucions

Some languages (like Catalan, Spanish or English) have a "locucions.dat" file, and the examples use that name. German (and possibly other languages) has a "locutions" file instead, which makes it difficult to write a general solution that takes the language as a parameter.
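
Until the file names are unified, a caller could probe both spellings. The following is a minimal Python sketch, not part of FreeLing: the helper name `find_multiword_file`, the `<data_dir>/<lang>/<name>` layout, and the assumption that the German file is called `locutions.dat` are all illustrative.

```python
import os

# Candidate file names; both spellings come from the report above.
# The ".dat" extension on the second one is an assumption.
CANDIDATES = ("locucions.dat", "locutions.dat")

def find_multiword_file(data_dir, lang):
    """Return the path of the first multiword file found for `lang`, or None."""
    for name in CANDIDATES:
        path = os.path.join(data_dir, lang, name)
        if os.path.isfile(path):
            return path
    return None
```

With this, code that takes the language as a parameter does not need to know which spelling a given language directory uses.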

Freeling 4.0 in ubuntu 14.04 from deb package. Libboost problems.

Hello,

I'm trying to install freeling from the "freeling-4.0-trusty-amd64.deb" package in Ubuntu 14.04. The problem is that this package requires libboost 1.55, but to execute analyze I need libboost 1.54. I worked around this by installing the deb package with libboost 1.55 and, once freeling was installed, replacing libboost 1.55 with libboost 1.54.

Regards,
Jairo.

Error Importing FreeLing to Python

Hi. Not sure if this is the correct place to post this issue... please let me know otherwise.

I’ve installed FreeLing from source as specified in the Installation Manual on a Lenovo ThinkPad running on 64-bit Windows 10. My goal is to use the FreeLing module on Python, but when I type import pyfreeling I am getting the following: ImportError: DLL load failed: The specified module could not be found. Below are the details that should help figure out what is going on.

I have Python 3.6.6, Anaconda custom. In terms of the files and dependencies mentioned in the Installation Manual, I downloaded [FreeLing-4.1], [Visual Studio 2015], [zlib-1.2.11], [boost-v1.61.0], [icu-v57.1], and [swigwin-3.0.9]. The directories I use are: %FLINSTALL%=“C:\Users\loren\Dropbox\Software\FreeLing-4.1”; %SWIGDIR%=“C:\Users\loren\Dropbox\Software\swigwin-3.0.9”; and %PYTHONPATH%= “C:\Users\loren\Anaconda3\envs\mlbook”. Note that the FreeLing and SWIG directories are in Dropbox rather than Anaconda3 (not sure if this may explain the error).

Everything in the download went smoothly except for two things. First, I had to add the following option to the cmake command in order to avoid compiling errors: -DCMAKE_CXX_FLAGS="/FS /EHsc". Second, near the end of installation a fatal error caused by not finding swig.swg and python.swg (both of which are under %SWIGDIR%\Lib) would occur. This was fixed by copying swig.swg and python.swg (as well as all their dependencies) and pasting them to %SWIGDIR%. For some weird reason, the installation would not look into the Lib subfolder and since I do not know how to address that directly, I simply moved the files. After these two fixes, everything ran smoothly and all the files and dependencies mentioned in the Installation Manual were in the appropriate directories.

The problem comes when I try to import the module. Using a Jupyter notebook, I first set as the current directory the folder where pyfreeling.py, _pyfreeling.lib, and _pyfreeling.pyd are found. I then add all the relevant folders mentioned in the Installation Manual to syspath. I then check that pyfreeling is an importable module from syspath using pkgutil. I then execute import pyfreeling which is followed by the ImportError mentioned at the beginning. Below is an image of the error details.

image

Based on what I read on some forums, I used Dependency Walker to find the dependencies of _pyfreeling.pyd. This told me that _pyfreeling.pyd depends on several files that the system cannot find. Most are false positives (e.g., files starting with API-MS-WIN or EXT-MS-WIN) but two may be the source of the problem: FREELING.DLL and PYTHON36.DLL are not found. When I look for these files under the %FLINSTALL% directory, I don’t find PYTHON36.DLL. However, FREELING.DLL is found under %FLINSTALL%\build\src\libfreeling and under %FLINSTALL%\freeling\bin. If this is the source of the problem, I do not know how to address it. Is there something wrong with the installation, or do I need to add more paths to %PYTHONPATH%?

Thanks in advance! (And thank you for developing such a wonderful tool... I'm very excited to start using it)
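
A hedged note on the last question: `PYTHONPATH` and `sys.path` only control where Python looks for `pyfreeling.py` and `_pyfreeling.pyd`. The DLLs that `_pyfreeling.pyd` depends on are resolved by the Windows loader, which for Python versions before 3.8 consults `PATH`. The sketch below assumes FREELING.DLL really is in `%FLINSTALL%\freeling\bin` as reported, and prepends such directories to `PATH` before the import. The helper name `prepend_dll_dirs` is made up for illustration and this has not been tried on the reporter's machine.

```python
import os

def prepend_dll_dirs(dirs, environ=None):
    """Prepend the existing directories in `dirs` to PATH; return the ones added."""
    environ = os.environ if environ is None else environ
    existing = [os.path.abspath(d) for d in dirs if os.path.isdir(d)]
    current = environ.get("PATH", "")
    environ["PATH"] = os.pathsep.join(existing + ([current] if current else []))
    # Python 3.8+ on Windows ignores PATH for extension-module dependencies
    # and needs os.add_dll_directory instead; the function is absent elsewhere.
    add_dll_directory = getattr(os, "add_dll_directory", None)
    if add_dll_directory is not None:
        for d in existing:
            add_dll_directory(d)
    return existing

# Intended use (path taken from the report, adjust as needed):
# prepend_dll_dirs([r"C:\Users\loren\Dropbox\Software\FreeLing-4.1\freeling\bin"])
# import pyfreeling
```

If the import still fails after this, the missing dependency is probably something other than FREELING.DLL.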

Zlib

Hi, again.

From another issue thread:

i found it at http://zlib.net/fossils/
It seems the change happened between 1.2.5.1 and 1.2.5.2
So the code should be

#if ZLIB_VERNUM < 0x1252
#define gzFile gzFile*
#endif

it's already fixed in master

In fact, my Mac has 1.2.5.0 and it doesn't work with this define. Changing it to 0x1250, for example, it starts working. Also, I can't see anybody changing such a thing in a patch version (1.2.5.1 to 1.2.5.2).

So it is probably better to consider 0x1250 to be OK, and only apply the define for versions below it. If somebody with a lower version complains, we can always adjust.

Thanks
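
To make the comparison concrete, here is a small Python sketch of how a zlib version string maps to the number tested by the preprocessor check. It assumes that ZLIB_VERNUM encodes each version component as one hex digit (so 1.2.5 is 0x1250 and 1.2.11 is 0x12b0); the function names are made up for illustration.

```python
def zlib_vernum(version):
    """Encode a zlib version string as one hex digit per component."""
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (4 - len(parts))  # "1.2.5" means 1.2.5.0
    if len(parts) != 4 or any(not 0 <= p <= 15 for p in parts):
        raise ValueError("unsupported zlib version: %r" % version)
    major, minor, revision, subrevision = parts
    return (major << 12) | (minor << 8) | (revision << 4) | subrevision

def needs_gzfile_define(version, threshold=0x1250):
    """True when the compatibility define would be applied for this version."""
    return zlib_vernum(version) < threshold
```

With the threshold at 0x1252, a 1.2.5.0 install falls below it and gets the define, which matches the failure described above; with 0x1250 it does not.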

Problem generating Java APIs

Hi!
I've been trying to get the Java APIs to work, and have been following the instructions, but I was unable to generate the files on OS/X as well as on Ubuntu, with identical error messages. I am most likely doing something wrong, but I can't seem to figure out what.

Output when running make:

rm -rf edu/upc/freeling
mkdir -p edu/upc/freeling
mkdir -p /usr/local/
swig -java -c++ -package edu.upc.freeling -outdir edu/upc/freeling -o freeling_javaAPI.cxx -I/usr/local/share/swig/3.0.12/java -I/usr/local/share/swig/3.0.12/std -I/usr/local/share/swig/3.0.12 freeling_javaAPI.i
/usr/local/share/swig/3.0.12/std/std_set.i:98: Error: Syntax error in input(3).
make: *** [freeling_javaAPI.cxx] Error 1

I have tried using both swig2.0 and the newest version, to no avail.

std_set.i:98 (if relevant): %fragment(SWIG_Traits_frag(std::set< _Key, _Compare, _Alloc >), "header",fragment=SWIG_Traits_frag(_Key), fragment="StdSetTraits")

Any help is appreciated.

Not the same codebase for windows and unix?

Hi!

I'm working with CMake to make the build more robust and compatible for both Windows and Linux, but I have found that the codebase is not the same for Windows and Unix. For example:

libfreeling in msvc provided projects doesn't compile corrector.cc but unix does... and in fact that file doesn't compile in msvc, and I think there are more like this. Is there any mechanical way to select which files should be in MSVC solution and which shouldn't?
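
One mechanical way to at least list the differences would be to compare the source files on disk against the ones the MSVC project compiles. The Python sketch below assumes the project is a `.vcxproj` file that lists sources as `<ClCompile Include="...">` items; the function name is hypothetical, and the result still has to be reviewed by hand, since a file may be excluded on purpose.

```python
import ntpath
import re

# Matches the Include attribute of ClCompile items in a .vcxproj file.
CL_COMPILE = re.compile(r'<ClCompile\s+Include="([^"]+)"')

def sources_missing_from_project(source_files, vcxproj_text):
    """Return the source files that the MSVC project text does not compile."""
    # ntpath.basename splits on both "/" and "\", whatever the host platform.
    in_project = {ntpath.basename(p).lower() for p in CL_COMPILE.findall(vcxproj_text)}
    return sorted(f for f in source_files
                  if ntpath.basename(f).lower() not in in_project)
```

The returned list could then feed a per-platform exclusion list in the CMake files.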

Thanks!

Node.js bindings

Hello!
I've come across your project, and it seems like a great start for us here to go deep down NLP!
First of all thank you for your awesome support on NLP in Spanish, greatly appreciated.

I wanted to ask about the possibility of adding an API binding for node.js (similar to the one in Python)
I've come across many libraries that use external C++ libraries, and it seems given the fact of all the awesome support for other languages, that node.js would fit right in.

Here's some documentation on the matter: https://nodejs.org/api/addons.html

Love from Argentina

Lemmatization and WSD

Given https://github.com/TALP-UPC/FreeLing/blob/master/data/pt/afixos.dat#L781-L782, a Portuguese word such as "regularmente" will be lemmatized to "regular", but it keeps the POS RG (adverb), so the WSD module will not find this sense in WordNet. This is an interesting problem. I can only think of three solutions, none of them perfect:

  1. lemmatize "regularmente" to "regularmente" (adv = http://wnpt.brlcloud.com/wn/synset?id=00195024-r) and "regular" as "regular" (adj = http://wnpt.brlcloud.com/wn/synset?id=01959294-a). But it will probably almost duplicate the lexicon of adverbs.

  2. lemmatize "regularmente" to "regular" and change its POS to ADJ. But it can confuse the parser.

  3. lemmatize "regularmente" to "regular" and keep the POS ADV. This is the current status. It can also confuse the parser, and it requires extra work from the WSD module, since it would not make sense to add "regular" to the 00195024-r synset.
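
The "extra work" in option 3 could look roughly like the following Python sketch: when an adverb's lemma has no adverbial synset, retry the lookup with the surface form. This is only an illustration; the dictionary-based index, the `lookup_senses` name and the `"r"`/`"a"` POS keys are assumptions, not the FreeLing WSD module's API. The two synset ids are the ones quoted above.

```python
def lookup_senses(form, lemma, pos, wordnet):
    """Look up synsets for (lemma, pos); for adverbs, fall back to the surface form."""
    senses = wordnet.get((lemma, pos), [])
    if not senses and pos == "r":
        # Option 3: the lemma is the adjective ("regular") but the POS is adverb,
        # so retry with the original word form ("regularmente").
        senses = wordnet.get((form.lower(), pos), [])
    return senses

# Toy index with the two synsets mentioned in the issue.
TOY_WORDNET = {
    ("regularmente", "r"): ["00195024-r"],
    ("regular", "a"): ["01959294-a"],
}
```

The fallback keeps the adverb lexicon small, at the price of a special case in the sense lookup.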

JAVA: error: use of undeclared identifier 'result'

Hello,
OSX 10.x, Swig 3.0.10/2.0.12,
Latest Freeling via HomeBrew.

rm -rf edu/upc/freeling
mkdir -p edu/upc/freeling
mkdir -p ../../../common/lib
swig -java -c++ -package edu.upc.freeling -outdir edu/upc/freeling -o freeling_javaAPI.cxx -I/usr/local/opt/swig/share/swig/3.0.10/java -I/usr/local/opt/swig/share/swig/3.0.10/std -I/usr/local/opt/swig/share/swig/3.0.10 freeling_javaAPI.i
g++ -dynamiclib -o ../../../common/lib/libfreeling_javaAPI.dylib freeling_javaAPI.cxx -lfreeling -I/usr/local/opt/icu4c/include -L/usr/local/opt/freeling/lib -I/usr/local/opt/freeling/include -I/usr/local/opt/freeling/include/treeler -I/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/include -I/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/include/darwin -I/usr/local/include -L/usr/local/lib -fPIC -std=c++0x -lboost_system-mt
In file included from freeling_javaAPI.cxx:250:
In file included from /usr/local/opt/freeling/include/freeling.h:35:
In file included from /usr/local/opt/freeling/include/freeling/morfo/lang_ident.h:44:
In file included from /usr/local/opt/freeling/include/freeling/morfo/idioma.h:45:
In file included from /usr/local/opt/freeling/include/freeling/morfo/smoothingLD.h:33:
/usr/local/opt/freeling/include/freeling/morfo/util.h:159:20: warning: 'tmpnam' is
      deprecated: This function is provided for compatibility reasons only. Due to
      security concerns inherent in the design of tmpnam(3), it is highly
      recommended that you use mkstemp(3) instead. [-Wdeprecated-declarations]
    err_type err = NEW_TMPNAME(tempfile,L_tmpnam+1);
                   ^
/usr/local/opt/freeling/include/freeling/morfo/util.h:58:29: note: expanded from
      macro 'NEW_TMPNAME'
#define NEW_TMPNAME(buf,sz) tmpnam(buf)
                            ^
/usr/include/stdio.h:276:7: note: 'tmpnam' has been explicitly marked deprecated
      here
char    *tmpnam(char *);
         ^
In file included from freeling_javaAPI.cxx:250:
In file included from /usr/local/opt/freeling/include/freeling.h:50:
In file included from /usr/local/opt/freeling/include/freeling/morfo/dep_treeler.h:49:
In file included from /usr/local/opt/freeling/include/treeler/dep/dependency_parser.h:42:
In file included from /usr/local/opt/freeling/include/treeler/control/models.h:50:
In file included from /usr/local/opt/freeling/include/treeler/tag/tag.h:43:
In file included from /usr/local/opt/freeling/include/treeler/tag/fgen-tag.h:9:
In file included from /usr/local/opt/freeling/include/treeler/base/feature-vector.h:42:
In file included from /usr/local/opt/freeling/include/treeler/base/fidx.h:41:
/usr/local/opt/freeling/include/treeler/base/feature-idx-v0.h:71:7: warning:
      'register' storage class specifier is deprecated and incompatible with C++1z
      [-Wdeprecated-register]
      register uint32_t a = (uint32_t)(t & 0xffffffff);
      ^~~~~~~~~
/usr/local/opt/freeling/include/treeler/base/feature-idx-v0.h:73:7: warning:
      'register' storage class specifier is deprecated and incompatible with C++1z
      [-Wdeprecated-register]
      register uint32_t b = (uint32_t)((t >> 32) & 0xffffffff);
      ^~~~~~~~~
/usr/local/opt/freeling/include/treeler/base/feature-idx-v0.h:75:7: warning:
      'register' storage class specifier is deprecated and incompatible with C++1z
      [-Wdeprecated-register]
      register uint32_t c = 0;
      ^~~~~~~~~
freeling_javaAPI.cxx:12056:27: error: use of undeclared identifier 'result'; did you
      mean 'jresult'?
  freeling::word::Modules result;
                          ^~~~~~
                          jresult
freeling_javaAPI.cxx:12055:8: note: 'jresult' declared here
  jint jresult = 0 ;
       ^
freeling_javaAPI.cxx:12056:26: error: expected ';' after expression
  freeling::word::Modules result;
                         ^
                         ;
freeling_javaAPI.cxx:12056:19: error: no member named 'Modules' in 'freeling::word'
  freeling::word::Modules result;
  ~~~~~~~~~~~~~~~~^
freeling_javaAPI.cxx:12060:3: error: use of undeclared identifier 'result'; did you
      mean 'jresult'?
  result = (freeling::word::Modules)freeling::word::USERMAP;
  ^~~~~~
  jresult
freeling_javaAPI.cxx:12055:8: note: 'jresult' declared here
  jint jresult = 0 ;
       ^
freeling_javaAPI.cxx:12060:37: error: expected ';' after expression
  result = (freeling::word::Modules)freeling::word::USERMAP;
                                    ^
                                    ;
freeling_javaAPI.cxx:12060:29: error: no member named 'Modules' in 'freeling::word'
  result = (freeling::word::Modules)freeling::word::USERMAP;
            ~~~~~~~~~~~~~~~~^
freeling_javaAPI.cxx:12060:53: error: no member named 'USERMAP' in 'freeling::word'
  result = (freeling::word::Modules)freeling::word::USERMAP;
                                    ~~~~~~~~~~~~~~~~^
freeling_javaAPI.cxx:12061:19: error: use of undeclared identifier 'result'
  jresult = (jint)result;
                  ^
freeling_javaAPI.cxx:12056:27: warning: expression result unused [-Wunused-value]
  freeling::word::Modules result;
                          ^~~~~~
freeling_javaAPI.cxx:12068:27: error: use of undeclared identifier 'result'; did you
      mean 'jresult'?
  freeling::word::Modules result;
                          ^~~~~~
                          jresult
freeling_javaAPI.cxx:12067:8: note: 'jresult' declared here
  jint jresult = 0 ;
       ^
freeling_javaAPI.cxx:12068:26: error: expected ';' after expression
  freeling::word::Modules result;
                         ^
                         ;
freeling_javaAPI.cxx:12068:19: error: no member named 'Modules' in 'freeling::word'
  freeling::word::Modules result;
  ~~~~~~~~~~~~~~~~^
freeling_javaAPI.cxx:12072:3: error: use of undeclared identifier 'result'; did you
      mean 'jresult'?
  result = (freeling::word::Modules)freeling::word::NUMBERS;
  ^~~~~~
  jresult
freeling_javaAPI.cxx:12067:8: note: 'jresult' declared here
  jint jresult = 0 ;
       ^
freeling_javaAPI.cxx:12072:37: error: expected ';' after expression
  result = (freeling::word::Modules)freeling::word::NUMBERS;
                                    ^
                                    ;
freeling_javaAPI.cxx:12072:29: error: no member named 'Modules' in 'freeling::word'
  result = (freeling::word::Modules)freeling::word::NUMBERS;
            ~~~~~~~~~~~~~~~~^
freeling_javaAPI.cxx:12072:53: error: no member named 'NUMBERS' in 'freeling::word'
  result = (freeling::word::Modules)freeling::word::NUMBERS;
                                    ~~~~~~~~~~~~~~~~^
freeling_javaAPI.cxx:12073:19: error: use of undeclared identifier 'result'
  jresult = (jint)result;
                  ^
freeling_javaAPI.cxx:12068:27: warning: expression result unused [-Wunused-value]
  freeling::word::Modules result;
                          ^~~~~~
freeling_javaAPI.cxx:12080:27: error: use of undeclared identifier 'result'; did you
      mean 'jresult'?
  freeling::word::Modules result;
                          ^~~~~~
                          jresult
freeling_javaAPI.cxx:12079:8: note: 'jresult' declared here
  jint jresult = 0 ;
       ^
freeling_javaAPI.cxx:12080:26: error: expected ';' after expression
  freeling::word::Modules result;
                         ^
                         ;
freeling_javaAPI.cxx:12080:19: error: no member named 'Modules' in 'freeling::word'
  freeling::word::Modules result;
  ~~~~~~~~~~~~~~~~^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
6 warnings and 20 errors generated.
make: *** [libfreeling_javaAPI.dylib] Error 1

Java API issue on Windows

FreeLing/APIs/java/readme:

HOW TO BUILD THE API IN WINDOWS, USING MSVC

  1. Install java
  2. Download and install swig (http://www.swig.org/)
  3. Open msvc project named msvc/10.0/swig/java/freeling_javaAPI

Unfortunately, the msvc project mentioned there is nowhere to be found.

Download via torrent

Is there any way to download the .deb via torrent? I'm in Venezuela and my internet connection is terrible, but torrents tend to flow a bit better.

French analyzer crashes on "du" with MultiwordsDetection=no

If I go like this:

$ echo du | analyze -f freeling-config/fr.cfg

where freeling-config/fr.cfg contains the default fr.cfg, but with "MultiwordsDetection=no", it crashes with a segmentation fault. This seems to happen on any string that contains the token "du".

Tried this with the latest version from master, on Ubuntu 16.04 64 bit.
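
For anyone trying to reproduce this, the modified config can be derived from the default one with a small script. This Python sketch assumes the config file consists of `Key=value` lines with `#` comments; the helper name `set_option` is made up for illustration.

```python
def set_option(cfg_text, key, value):
    """Return cfg_text with `key=value`, replacing an existing line or appending one."""
    lines = cfg_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        name = line.split("=", 1)[0].strip()
        # Skip comment lines and lines that are not assignments.
        if "=" in line and not line.lstrip().startswith("#") and name == key:
            lines[i] = "%s=%s" % (key, value)
            replaced = True
    if not replaced:
        lines.append("%s=%s" % (key, value))
    return "\n".join(lines) + "\n"
```

Writing the result of `set_option(default_text, "MultiwordsDetection", "no")` to freeling-config/fr.cfg should give the configuration used in the command above.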

Different behaviour than freeling 3.1 in 4.0beta1 (less accurate?)

With 3.1

┌─(~/workspace/freeling/installed-3.1/bin)
└─(19:18:12)──> ./analyze -f ../share/freeling/config/es.cfg --out tagged
El gato come pescado y bebe agua.
El el DA0MS0 1
gato gato NCMS000 1
come comer VMIP3S0 0.994868
pescado pescado NCMS000 0.608233
y y CC 0.999962
bebe beber VMIP3S0 0.994868
agua agua NCCS000 0.99177
. . Fp 1

With 4.0beta1

┌─(~/workspace/freeling/installed/bin)
└─(19:18:19)──> ./analyze -f ../share/freeling/config/es.cfg --outlv tagged
El gato come pescado y bebe agua.
El el DA0MS0 1
gato gato NCMS000 1
come comer VMIP3S0 0.978902
pescado pescar VMP00SM 0.323747    <------------------ THIS
y y CC 0.999989
bebe beber VMIP3S0 0.989241
agua agua NCCS000 0.997446
. . Fp 1

I've been playing with different parameters, but pescado is always recognized as a verb in 4.0beta1.
Another example:

With 3.1

┌─(~/workspace/freeling/installed-3.1/bin)
└─(19:22:24)──> ./analyze -f ../share/freeling/config/es.cfg --out tagged
Yo bajo con el hombre bajo a tocar el bajo bajo la escalera.
Yo yo PP1CSN00 1
bajo bajar VMIP1S0 0.00364964
con con SPS00 1
el el DA0MS0 1
hombre hombre NCMS000 0.961347
bajo bajo AQ0MS0 0.0766423
a a SPS00 0.996023
tocar tocar VMN0000 1
el el DA0MS0 1
bajo bajo NCMS000 0.040146
bajo bajo SPS00 0.879562
la el DA0FS0 0.972269
escalera escalera NCFS000 1
. . Fp 1

With 4.0beta1

┌─(~/workspace/freeling/installed/bin)
└─(19:22:10)──> ./analyze -f ../share/freeling/config/es.cfg --outlv tagged
Yo bajo con el hombre bajo a tocar el bajo bajo la escalera.
Yo yo PP1CSN0 1
bajo bajo AQ0MS00 0.174026             <----------- THIS
con con SP 1
el el DA0MS0 1
hombre hombre NCMS000 0.990108
bajo bajo AQ0MS00 0.174026
a a SP 0.998775
tocar tocar VMN0000 1
el el DA0MS0 1
bajo bajo AQ0MS00 0.174026            <----------- THIS
bajo bajo SP 0.814719
la el DA0FS0 0.98926
escalera escalera NCFS000 1
. . Fp 1

Here bajo is recognized first as an adjective instead of as a verb, and then again as an adjective instead of as a noun.
Is there a config parameter I am missing?

Also, sorry if this is not the appropriate place for such questions.
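
To compare the two versions on more sentences, the tagged outputs can be diffed automatically. The Python sketch below assumes the `form lemma tag probability` line format shown above and that both versions tokenize the sentence identically. Because the tag lengths changed between versions (SPS00 vs SP), it compares only the lemma and the first tag character. The function names are hypothetical.

```python
def parse_tagged(text):
    """Parse `form lemma tag prob` lines; other lines and extra columns are ignored."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        try:
            prob = float(fields[3])
        except ValueError:
            continue  # e.g. the echoed input sentence
        rows.append((fields[0], fields[1], fields[2], prob))
    return rows

def diff_analyses(old_text, new_text):
    """Return (form, (old lemma, old tag), (new lemma, new tag)) for differing tokens."""
    diffs = []
    for old, new in zip(parse_tagged(old_text), parse_tagged(new_text)):
        # Compare the lemma and the tag category (first character) only.
        if old[1] != new[1] or old[2][0] != new[2][0]:
            diffs.append((old[0], (old[1], old[2]), (new[1], new[2])))
    return diffs
```

Run on the first example, this reports only the "pescado" line; on the second, it reports the two "bajo" tokens marked above.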

failed to install macOS branch master

Hi Padro,

Following the instructions for macOS at https://talp-upc.gitbooks.io/freeling-4-1-user-manual/content/installation/installation-mac.html, the compilation failed:

[ 24%] Linking CXX shared library libdynet.dylib
Undefined symbols for architecture x86_64:
  "boost::iostreams::zlib_error::check(int)", referenced from:
      long boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::write<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, char const*, long) in io.cc.o
      void boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::close<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, unsigned int) in io.cc.o
      long boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::read<boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::peekable_source<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > > >(boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::peekable_source<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >&, char*, long) in io.cc.o
      void boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::close<boost::iostreams::non_blocking_adapter<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > > >(boost::iostreams::non_blocking_adapter<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >&, unsigned int) in io.cc.o
  "boost::iostreams::zlib::stream_end", referenced from:
      long boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::write<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, char const*, long) in io.cc.o
      void boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::close<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, unsigned int) in io.cc.o
      long boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::read<boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::peekable_source<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > > >(boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::peekable_source<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >&, char*, long) in io.cc.o
      void boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::close<boost::iostreams::non_blocking_adapter<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > > >(boost::iostreams::non_blocking_adapter<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >&, unsigned int) in io.cc.o
  "boost::iostreams::zlib::sync_flush", referenced from:
      long boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::write<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, char const*, long) in io.cc.o
      void boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::close<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, unsigned int) in io.cc.o
      long boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::read<boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::peekable_source<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > > >(boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::peekable_source<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >&, char*, long) in io.cc.o
      void boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::close<boost::iostreams::non_blocking_adapter<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > > >(boost::iostreams::non_blocking_adapter<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >&, unsigned int) in io.cc.o
  "boost::iostreams::zlib::default_strategy", referenced from:
      boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::make_params(int) in io.cc.o
  "boost::iostreams::zlib::default_compression", referenced from:
      boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::make_params(int) in io.cc.o
  "boost::iostreams::zlib::okay", referenced from:
      long boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::write<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, char const*, long) in io.cc.o
      long boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::read<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, char*, long) in io.cc.o
      void boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::close<boost::iostreams::non_blocking_adapter<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > > >(boost::iostreams::non_blocking_adapter<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >&, unsigned int) in io.cc.o
  "boost::iostreams::zlib::deflated", referenced from:
      boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::make_params(int) in io.cc.o
  "boost::iostreams::detail::gzip_footer::reset()", referenced from:
      boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::basic_gzip_decompressor(int, long) in io.cc.o
      long boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::write<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, char const*, long) in io.cc.o
      long boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::read<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, char*, long) in io.cc.o
  "boost::iostreams::detail::gzip_footer::process(char)", referenced from:
      long boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::write<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, char const*, long) in io.cc.o
      long boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::read<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, char*, long) in io.cc.o
  "boost::iostreams::detail::gzip_header::reset()", referenced from:
      boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::basic_gzip_decompressor(int, long) in io.cc.o
      long boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::write<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, char const*, long) in io.cc.o
      long boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::read<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, char*, long) in io.cc.o
  "boost::iostreams::detail::gzip_header::process(char)", referenced from:
      long boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::write<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, char const*, long) in io.cc.o
      long boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::read<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, char*, long) in io.cc.o
  "boost::iostreams::detail::zlib_base::after(char const*&, char*&, bool)", referenced from:
      long boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::write<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, char const*, long) in io.cc.o
      void boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::close<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, unsigned int) in io.cc.o
      long boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::read<boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::peekable_source<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > > >(boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::peekable_source<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >&, char*, long) in io.cc.o
      void boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::close<boost::iostreams::non_blocking_adapter<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > > >(boost::iostreams::non_blocking_adapter<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >&, unsigned int) in io.cc.o
  "boost::iostreams::detail::zlib_base::reset(bool, bool)", referenced from:
      boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::impl::impl<boost::iostreams::zlib_params>(long, boost::iostreams::zlib_params const&) in io.cc.o
      boost::detail::shared_count::shared_count<boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::impl>(boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::impl*) in io.cc.o
      boost::detail::sp_counted_impl_p<boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::impl>::dispose() in io.cc.o
      void boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::close<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, unsigned int) in io.cc.o
      long boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::read<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, char*, long) in io.cc.o
      void boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::close<boost::iostreams::non_blocking_adapter<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > > >(boost::iostreams::non_blocking_adapter<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >&, unsigned int) in io.cc.o
  "boost::iostreams::detail::zlib_base::before(char const*&, char const*, char*&, char*)", referenced from:
      long boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::write<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, char const*, long) in io.cc.o
      void boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::close<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, unsigned int) in io.cc.o
      long boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::read<boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::peekable_source<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > > >(boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::peekable_source<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >&, char*, long) in io.cc.o
      void boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::close<boost::iostreams::non_blocking_adapter<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > > >(boost::iostreams::non_blocking_adapter<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >&, unsigned int) in io.cc.o
  "boost::iostreams::detail::zlib_base::do_init(boost::iostreams::zlib_params const&, bool, void* (*)(void*, unsigned int, unsigned int), void (*)(void*, void*), void*)", referenced from:
      boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::impl::impl<boost::iostreams::zlib_params>(long, boost::iostreams::zlib_params const&) in io.cc.o
  "boost::iostreams::detail::zlib_base::xinflate(int)", referenced from:
      long boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::write<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, char const*, long) in io.cc.o
      void boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::close<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >(boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> >&, unsigned int) in io.cc.o
      long boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::read<boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::peekable_source<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > > >(boost::iostreams::basic_gzip_decompressor<std::__1::allocator<char> >::peekable_source<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >&, char*, long) in io.cc.o
      void boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::close<boost::iostreams::non_blocking_adapter<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > > >(boost::iostreams::non_blocking_adapter<boost::iostreams::detail::linked_streambuf<char, std::__1::char_traits<char> > >&, unsigned int) in io.cc.o
  "boost::iostreams::detail::zlib_base::zlib_base()", referenced from:
      boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::impl::impl<boost::iostreams::zlib_params>(long, boost::iostreams::zlib_params const&) in io.cc.o
  "boost::iostreams::detail::zlib_base::~zlib_base()", referenced from:
      boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::impl::impl<boost::iostreams::zlib_params>(long, boost::iostreams::zlib_params const&) in io.cc.o
      boost::detail::shared_count::shared_count<boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::impl>(boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::impl*) in io.cc.o
      boost::detail::sp_counted_impl_p<boost::iostreams::symmetric_filter<boost::iostreams::detail::zlib_decompressor_impl<std::__1::allocator<char> >, std::__1::allocator<char> >::impl>::dispose() in io.cc.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [src/libdynet/dynet/libdynet.dylib] Error 1
make[1]: *** [src/libdynet/dynet/CMakeFiles/dynet.dir/all] Error 2
make: *** [all] Error 2
leme:build ar$```

Problem with proper names when using dashes


Hi, we are using FreeLing to annotate Spanish novels and we have found a bug. The POS analyser correctly analyses a sentence like:
"-Estamos desorientados -murmuró el hombre tranquilamente-; nos hemos debido de perder."

In this case FreeLing says that "Estamos" is a verb. But if the sentence uses any kind of dash instead of a hyphen, it says that "Estamos" is a proper name (and when using the NEC, it says that it is a person):

—Estamos desorientados —murmuró el hombre tranquilamente—; nos hemos debido de perder.

You find both hyphens and dashes at the beginning of direct speech in novels (although dashes are actually more correct). It would be great if FreeLing could treat the most frequent dashes (– and —) the same way as hyphens.

Is there a file in my installed Freeling version where I can add the dashes as punctuation easily? Thanks!
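Until the dashes are handled by the analyser itself, one possible workaround is to normalise them before the text reaches FreeLing. The sketch below is plain preprocessing and not a FreeLing option; the function name `normalize_dashes` is made up for illustration, and it assumes that replacing en and em dashes with an ASCII hyphen is acceptable for the corpus.

```python
# Workaround sketch: map the most frequent dashes to the ASCII hyphen
# before the text is passed to the analyser. This happens outside FreeLing.
DASHES = {
    "\u2013": "-",  # en dash
    "\u2014": "-",  # em dash
}
_DASH_TABLE = str.maketrans(DASHES)


def normalize_dashes(text: str) -> str:
    """Return text with en and em dashes replaced by '-'.

    Each dash is replaced by exactly one character, so character
    offsets in the normalised text still match the original text.
    """
    return text.translate(_DASH_TABLE)


sample = ("\u2014Estamos desorientados \u2014murmur\u00f3 el hombre "
          "tranquilamente\u2014; nos hemos debido de perder.")
print(normalize_dashes(sample))
```

Because the replacement is one character for one character, any spans reported for the normalised text can be mapped back to the original novel without adjustment.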

Tag not found for contraction component

Hello, thank you for your work!
I have some trouble:

$ analyze -f en.cfg < test.en 
DICTIONARY: Tag not found for contraction component. Check dictionary entries for 'landain't' and 'land_ai'
$ cat test.en 
landain't everything to me

Latest version from master.

info about the English models

Hi Padro,

I couldn't find any information about how the English POS tagger and dependency parser were trained. What corpus was used?

Best,

4.1 segfault

https://gist.github.com/ilovezfs/389539ede39e49df122089af9d0a883d

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libfreeling.dylib             	0x000000010e601788 freeling::database::dump_database(std::__1::basic_ostream<wchar_t, std::__1::char_traits<wchar_t> >&, bool) const + 26
1   libfreeling.dylib             	0x000000010e6678bc freeling::compounds::compounds(std::__1::basic_string<wchar_t, std::__1::char_traits<wchar_t>, std::__1::allocator<wchar_t> > const&, freeling::dictionary const&) + 3068
2   libfreeling.dylib             	0x000000010e5f37bc freeling::dictionary::dictionary(std::__1::basic_string<wchar_t, std::__1::char_traits<wchar_t>, std::__1::allocator<wchar_t> > const&, std::__1::basic_string<wchar_t, std::__1::char_traits<wchar_t>, std::__1::allocator<wchar_t> > const&, std::__1::basic_string<wchar_t, std::__1::char_traits<wchar_t>, std::__1::allocator<wchar_t> > const&, std::__1::basic_string<wchar_t, std::__1::char_traits<wchar_t>, std::__1::allocator<wchar_t> > const&, bool, bool) + 1692
3   libfreeling.dylib             	0x000000010e5f47a4 freeling::dictionary::dictionary(std::__1::basic_string<wchar_t, std::__1::char_traits<wchar_t>, std::__1::allocator<wchar_t> > const&, std::__1::basic_string<wchar_t, std::__1::char_traits<wchar_t>, std::__1::allocator<wchar_t> > const&, std::__1::basic_string<wchar_t, std::__1::char_traits<wchar_t>, std::__1::allocator<wchar_t> > const&, std::__1::basic_string<wchar_t, std::__1::char_traits<wchar_t>, std::__1::allocator<wchar_t> > const&, bool, bool) + 24
4   libfreeling.dylib             	0x000000010e664c10 freeling::maco::maco(freeling::maco_options const&) + 404
5   libfreeling.dylib             	0x000000010e5ea452 freeling::analyzer::analyzer(freeling::analyzer::analyzer_config_options const&) + 792
6   analyzer                      	0x000000010e5912d0 main + 169
7   libdyld.dylib                 	0x00007fff9acda5ad start + 1

running

  test do
    expected = <<~EOS
      Hello hello NN 1
      world world NN 1
    EOS
    assert_equal expected, pipe_output("#{bin}/analyze -f #{pkgshare}/config/en.cfg", "Hello world").chomp
  end

Candidate Corrections for Misspelled Words not working with Python API

Hi,

I'm trying to use the spell corrector through the Python API, but it seems that the method get_alternatives() doesn't retrieve an appropriate object (the SwigPyObject it returns is not iterable, and does not have any method to get the alternative words either). I attach the error I got when executing a piece of code from the example posted on the tutorial webpage:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-23-5909beee38a3> in <module>()
     12     # print alternative forms proposed by the alternative suggestors
     13     print("   ALTERNATIVE FORMS:")
---> 14     for a in w.get_alternatives() :
     15         print(" ["+a.get_form()+","+str(a.get_distance())+"]")
     16     print("")

TypeError: 'SwigPyObject' object is not iterable

When calling the method 'has_alternatives()', it properly detects whether there are alternatives or not; the problem is how to access the data. I'm using Python 2.7 and Freeling 4.0.

Any idea about what's going on? Thanks in advance.

error path for include:

The file src\libtreeler\treeler\algo\perceptron-v0.1.h has the following include:
#include "treeler/learn/perceptron-v0.1.tcc"
That path doesn't exist anymore. I think it should be #include "treeler/algo/perceptron-v0.1.tcc"

installation on Linux

# autoreconf --install
libtoolize: Consider adding `-I m4' to ACLOCAL_AMFLAGS in Makefile.am.
Makefile.am: error: required file './NEWS' not found
Makefile.am: error: required file './ChangeLog' not found
autoreconf: automake failed with exit status: 1

New error ?

Freeling outputs wrong dependency trees from command line

I'm using Windows 7 (64-bit) and the Freeling-4.0-win64 binary version.

I noticed this when I compared the output with the result of the online demo.

I think this occurs in many cases and maybe in other languages too (I only tried English and Spanish). For example, consider this sentence:

Maintenance of Th1 responses and dendritic cell (DC ) functions are compromised in HIV-1 infected individuals.

My expected result is the one given by the online demo:
image

But although I tried different output formats (freeling, xml, json, naf), I always got the same dependency tree, which I consider wrong and which differs from the one given by the demo. Maybe I'm doing something wrong. This is the command I used: >analyzer.bat -f en.cfg --output xml --outlv dep <ana.txt

So the XML shown in the online demo is actually different from the XML output via the command line. I attach this file and some pictures I made to facilitate comparison between the two results (green annotation marks the sub-trees where both outputs agree).

from demo
from command line
freeling_output_via_commandline.txt

Finally, if you consider it necessary, I could gather some statistics to show how often this occurs.

I appreciate your help with this issue.

locales issue on MacOS

The analyze program is having trouble with encoding.

urca:~ arademaker$ echo 'testando situação.' | analyze -f pt.cfg
testando testar VMG0000 1
situa� situa� RG 0.365785
� � Fz 1
� � NP00000 1
� � Fz 1
o o DA0MS0 0.950254
. . Fp 1

The locale command returns:

urca:~ arademaker$ locale
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL="en_US.UTF-8"

The config file used says:

urca:~ arademaker$ head /usr/local/share/freeling/config/pt.cfg
##
#### default configuration file for Portuguese analyzer
##

#### General options
Lang=pt
Locale=en_US.UTF-8

Compilation problems with GNU Make 4.1

Hi,

I made a change in the code because I had a problem compiling FreeLing with GNU Make 4.1 on an Alpine distribution. I found the problem while trying to dockerize the application. Some bools were written in uppercase, and the compiler only accepts them in lowercase, so I changed the code and sent you a pull request with the changes.

The changes are in these files:

  • src/libfreeling/summarizer/relation.cc
  • src/libfreeling/summarizer/summarizer.cc

Thanks and regards.

new errors in MacOS

I got this from HEAD on the master branch:

$ autoreconf --install
data/Makefile.am:4: error: bad characters in variable name ''
data/Makefile.am:18: error: bad characters in variable name ''
data/Makefile.am:5: warning: variants multiply defined in condition TRUE ...
data/Makefile.am:3: ... 'variants' previously defined here
data/Makefile.am:18: warning:  multiply defined in condition TRUE ...
data/Makefile.am:4: ... '' previously defined here
src/libfreeling/Makefile.am:21: error: bad characters in variable name ''
src/libfreeling/Makefile.am:22: warning: libfreeling_la_SOURCES multiply defined in condition TRUE ...
src/libfreeling/Makefile.am:20: ... 'libfreeling_la_SOURCES' previously defined here
autoreconf: /opt/local/bin/automake failed with exit status: 1

Tokenizer fails if sentence ends in a space

I have a database with sentences that come from a variety of sources, so the structure of the sentences varies and I can't control what each sentence begins and ends with.

In tokenizer.cc at line ~162:

  while (c!=p.end()) {
      // find first non-white space and erase leading whitespaces
      while (iswspace(*c)) {
        ++c;
        ++offset;
      }

I'm getting an assertion that the iterator is not dereferenceable. The cause seems to be that the loop reaches the end of the sentence when the very end of the sentence is a space. My interim solution is to reach into the database in my testing environment and manually erase the whitespace at the end of each sentence it fails on, but in production I won't be able to do this.
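If the assertion does come from reading the iterator at the end of the string, the usual fix is to test for the end before testing for whitespace; in the C++ loop above that would be `while (c != p.end() && iswspace(*c))`, followed by leaving the outer loop when only trailing whitespace remained. The Python sketch below illustrates that guard only. It is not FreeLing code: `skip_whitespace` and `token_starts` are hypothetical names, and the rule matching is replaced by a trivial stand-in.

```python
from typing import List, Tuple


def skip_whitespace(text: str, pos: int, offset: int) -> Tuple[int, int]:
    """Advance pos (and the running offset) past whitespace.

    The bounds check comes first, so text[pos] is never read
    when pos has already reached len(text).
    """
    while pos < len(text) and text[pos].isspace():
        pos += 1
        offset += 1
    return pos, offset


def token_starts(text: str) -> List[int]:
    """Return the start position of every whitespace-delimited token."""
    starts = []
    pos, offset = 0, 0
    while pos < len(text):
        pos, offset = skip_whitespace(text, pos, offset)
        if pos == len(text):
            # Only trailing whitespace was left: nothing more to tokenize.
            break
        starts.append(pos)
        # Stand-in for the rule matching: consume up to the next whitespace.
        while pos < len(text) and not text[pos].isspace():
            pos += 1
            offset += 1
    return starts


print(token_starts("hello world "))
```

As an interim measure on the caller's side, stripping trailing whitespace from each sentence in code before tokenizing avoids editing the database by hand.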


tokenizer.cc

  ///////////////////////////////////////////////////////////////
  /// Split the string into tokens using RegExps from
  /// configuration file, returning a word object list.
  ///////////////////////////////////////////////////////////////

  void tokenizer::tokenize(const std::wstring &p, unsigned long &offset, list<word> &v) const 
  {
    wstring t[10];
    list<pair<wstring, freeling::regexp> >::const_iterator i;
    bool match;
    int j, substr, len=0;
    vector<wstring> results;  // to store match results

    v.clear(); 
    // Loop until line is completely processed. 
    wstring::const_iterator c=p.begin();
    while (c!=p.end()) {
      // find first non-white space and erase leading whitespaces
      while (iswspace(*c)) {
        ++c;
        ++offset;
      }

      TRACE(4,L"Tokenizing ["+wstring(c,p.end())+L"]");
      // find first matching rule
      match=false;    
      for (i=rules.begin(); i!=rules.end() && !match; i++) {
        try {
          TRACE(4,L"  Checking rule "+i->first);
          if (i->second.search(c, p.end(), results, true)) {
            // regexp matches, extract substrings
            match=true; len=0;
            substr = matches.find(i->first)->second;
            for (j=(substr==0? 0 : 1); j<=substr && match; j++) {
              // get each requested  substring
              t[j] = results.at(j);
              len += t[j].length();
              TRACE(2,L"Found match "+util::int2wstring(j)+L" ["+t[j]+L"] for rule "+i->first);
              // if special rule, match must be in abbrev file
              if ((i->first)[0]==L'*') {
                wstring lower = util::lowercase(t[j]);
                if (abrevs.find(lower)==abrevs.end()) {
                  match = false;
                  TRACE(2,L"Special rule and found match not in abbrev list. Rule not satisfied");
                }
              }
            }
          }
        }
        catch (...) {
          // boost::regexp rejects to match an expression if the matched string is too long
          WARNING(L"Match too long for boost buffer: Rule "+i->first+L" skipped.");
          WARNING(L"Provided input doesn't look like text.");
        }
      }

.............

installation errors with .deb

Hi,
I was trying the new .deb binary distribution and found some problems. I'm referring specifically to the trusty-amd64.deb under Debian 8.4 jessie 64.

After installation the /usr/bin/analyzer binary tries to load libboost-*.so version 1.54 libraries whereas the current versions under Debian for libboost are the 1.55.0 ones.

I tried to install the 1.54 ones, but apparently they aren't available (apt-get couldn't find them).

Invalid JSON output

Hello!!

First of all thanks for this amazing job!

I have an issue when I use the JSON output format: the output seems to be invalid. Could you check it?

How to reproduce:

echo "Martin McGuinness, ex viceprimer ministro de Irlanda del Norte y y antiguo comandante del Ejército Republicano Irlandés (IRA), ha fallecido a los 66 años. Fue una figura clave en el proceso de paz en el país y hace sólo dos meses que se apartó de la vida política. Es con profundo pesar y tristeza que hemos sabido de la muerte de nuestro amigo y camarada Martin McGuinness, que falleció en la madrugada en Derry (Irlanda del Norte). Se le echará mucho de menos por todos los que le conocieron”, indicó la nota del partido republicano Sinn Fein en un comunicado. El republicano irlandés ha muerto a consecuencia de una rara enfermedad cardiaca. Martin McGuinness ha fallecido en el hospital de Derry Altnagelvin rodeado de su familia, según informa The Guardian." | analyze -f /usr/local/Cellar/freeling/4.0_4/share/freeling/config/es.cfg --nec --output json

Copy the output and check it with any validator, for instance: http://json.parser.online.fr/
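The same check can be done locally, which also shows where the first syntax error is. This is a minimal sketch using only Python's standard `json` module; the helper name `first_json_error` is made up for illustration, and it assumes the analyzer output has been saved to a file or captured as a string.

```python
import json
from typing import Optional, Tuple


def first_json_error(text: str) -> Optional[Tuple[int, int, str]]:
    """Return (line, column, message) for the first syntax error.

    Returns None when the whole text parses as a single JSON value.
    """
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return err.lineno, err.colno, err.msg
    return None


# A valid document gives None; a broken one gives the error position.
print(first_json_error('{"form": "Martin"}'))
print(first_json_error('{"form": "Martin" "lemma": "martin"}'))
```

A message of "Extra data" from this helper means a second top-level value follows the first one, which is a different problem from a malformed string or a missing comma, so including the reported message in the issue narrows the cause down.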

Thanks!

Error compiling Python API

Hi @lluisp, I found an error when I tried to compile the Python API. I saw that there were changes four or five days ago, and I think they are probably the cause. I attach the error log produced when this instruction was executed:

g++ -shared -o _freeling.so freeling_pythonAPI.cxx -lfreeling -I/usr/local/include -L/usr/local/lib -I/usr/include/python3.4m -fPIC -std=gnu++0x

This is the error log:

freeling_pythonAPI.cxx: In function ‘PyObject* _wrap_alternatives_get_similar_words(PyObject*, PyObject*)’:
freeling_pythonAPI.cxx:88879:94: error: no matching function for call to ‘freeling::alternatives::get_similar_words(const wstring&, std::list<std::pair<std::basic_string<wchar_t>, int> >&) const’
   ((freeling::alternatives const *)arg1)->get_similar_words((std::wstring const &)*arg2,*arg3);
                                                                                              ^
freeling_pythonAPI.cxx:88879:94: note: candidate is:
In file included from /usr/local/include/freeling.h:42:0,
                 from freeling_pythonAPI.cxx:3222:
/usr/local/include/freeling/morfo/alternatives.h:103:10: note: void freeling::alternatives::get_similar_words(const wstring&, std::list<freeling::alternative>&) const
     void get_similar_words(const std::wstring &, std::list<freeling::alternative> &) const;
          ^
/usr/local/include/freeling/morfo/alternatives.h:103:10: note:   no known conversion for argument 2 from ‘std::list<std::pair<std::basic_string<wchar_t>, int> >’ to ‘std::list<freeling::alternative>&’
freeling_pythonAPI.cxx: In function ‘PyObject* _wrap_foma_FSM_get_similar_words(PyObject*, PyObject*)’:
freeling_pythonAPI.cxx:92648:90: error: no matching function for call to ‘freeling::foma_FSM::get_similar_words(const wstring&, std::list<std::pair<std::basic_string<wchar_t>, int> >&) const’
   ((freeling::foma_FSM const *)arg1)->get_similar_words((std::wstring const &)*arg2,*arg3);
                                                                                          ^
freeling_pythonAPI.cxx:92648:90: note: candidate is:
In file included from /usr/local/include/freeling/morfo/compounds.h:36:0,
                 from /usr/local/include/freeling/morfo/dictionary.h:40,
                 from /usr/local/include/freeling/morfo/maco.h:38,
                 from /usr/local/include/freeling.h:39,
                 from freeling_pythonAPI.cxx:3222:
/usr/local/include/freeling/morfo/foma_FSM.h:85:10: note: void freeling::foma_FSM::get_similar_words(const wstring&, std::list<freeling::alternative>&) const
     void get_similar_words(const std::wstring &, std::list<freeling::alternative> &) const;
          ^
/usr/local/include/freeling/morfo/foma_FSM.h:85:10: note:   no known conversion for argument 2 from ‘std::list<std::pair<std::basic_string<wchar_t>, int> >’ to ‘std::list<freeling::alternative>&’
make: *** [_freeling.so] Error 1

I am running this on a CentOS 7 distribution inside a Docker container.

Regards.

installation MacOS

Running Mac OS 10.11.4

Dependencies installed via MacPorts:

boost @1.59.0_2+no_single+no_static+python27 (active)
zlib @1.2.8_0 (active)
autoconf @2.69_5 (active)
automake @1.15_1 (active)
icu @55.1_0 (active)

$ env LDFLAGS="-L/opt/local/lib -L/opt/local/include" CPPFLAGS="-I/opt/local/include -I/opt/local/include/boost" ./configure --enable-boost-locale
$ make

The error below happens in the make command:

libtool: compile:  g++ -DPACKAGE_NAME=\"FreeLing\" -DPACKAGE_TARNAME=\"freeling\" -DPACKAGE_VERSION=\"4.0-beta1\" "-DPACKAGE_STRING=\"FreeLing 4.0-beta1\"" -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -DPACKAGE=\"freeling\" -DVERSION=\"4.0-beta1\" -DUSE_BOOST_LOCALE=1 -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" -DHAVE_BOOST_REGEX_HPP=1 -DHAVE_BOOST_REGEX_ICU_HPP=1 -DHAVE_BOOST_LOCALE_HPP=1 -DHAVE_BOOST_PROGRAM_OPTIONS_HPP=1 -DHAVE_BOOST_THREAD_HPP=1 -DHAVE_BOOST_THREAD_MUTEX_HPP=1 -DHAVE_ZLIB_H=1 -DHAVE_STDBOOL_H=1 -DSTDC_HEADERS=1 -I. -I../../src/libtreeler -I/opt/local/include -I/opt/local/include/boost -I../../src/libtreeler -I/opt/local/include -I/opt/local/include/boost -Wall -fPIC -std=gnu++0x -Wall -fPIC -MT treeler/dep/dependency_parser.lo -MD -MP -MF treeler/dep/.deps/dependency_parser.Tpo -c treeler/dep/dependency_parser.cc  -fno-common -DPIC -o treeler/dep/.libs/dependency_parser.o
In file included from treeler/dep/dependency_parser.cc:2:
In file included from ./treeler/dep/dependency_parser.h:42:
In file included from ./treeler/control/models.h:50:
In file included from ./treeler/tag/tag.h:43:
In file included from ./treeler/tag/fgen-tag.h:9:
In file included from ./treeler/base/feature-vector.h:42:
In file included from ./treeler/base/fidx.h:41:
./treeler/base/feature-idx-v0.h:71:7: warning: 'register' storage class specifier is deprecated [-Wdeprecated-register]
      register uint32_t a = (uint32_t)(t & 0xffffffff);
      ^~~~~~~~~
./treeler/base/feature-idx-v0.h:73:7: warning: 'register' storage class specifier is deprecated [-Wdeprecated-register]
      register uint32_t b = (uint32_t)((t >> 32) & 0xffffffff);
      ^~~~~~~~~
./treeler/base/feature-idx-v0.h:75:7: warning: 'register' storage class specifier is deprecated [-Wdeprecated-register]
      register uint32_t c = 0;
      ^~~~~~~~~
In file included from treeler/dep/dependency_parser.cc:2:
In file included from ./treeler/dep/dependency_parser.h:42:
In file included from ./treeler/control/models.h:50:
In file included from ./treeler/tag/tag.h:44:
In file included from ./treeler/base/scores.h:39:
In file included from ./treeler/base/parameters.h:43:
./treeler/base/base-parameters.h:123:53: error: arithmetic on a pointer to an incomplete type 'const struct Fvec'
      for (int r = 0; r < R; ++r) { S[r] = D::dot(F + r); }
                                                  ~ ^
./treeler/base/base-parameters.h:121:20: note: forward declaration of 'treeler::Fvec'
             const struct Fvec* const F,
                          ^
In file included from treeler/dep/dependency_parser.cc:2:
In file included from ./treeler/dep/dependency_parser.h:42:
In file included from ./treeler/control/models.h:50:
In file included from ./treeler/tag/tag.h:44:
In file included from ./treeler/base/scores.h:151:
./treeler/base/wf-scores.h:138:14: warning: moving a local object in a return statement prevents copy elision
      [-Wpessimizing-move]
      return std::move(s);
             ^
./treeler/base/wf-scores.h:138:14: note: remove std::move call here
      return std::move(s);
             ^~~~~~~~~~ ~
4 warnings and 1 error generated.
make[3]: *** [treeler/dep/dependency_parser.lo] Error 1
make[2]: *** [all-recursive] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all-recursive] Error 1

Installation Ubuntu

When I run autoreconf --install, I get the following error:
aclocal: warning: couldn't open directory 'm4': No such file or directory
libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, '.'.
libtoolize: copying file './ltmain.sh'
libtoolize: putting macros in AC_CONFIG_MACRO_DIRS, 'm4'.
libtoolize: copying file 'm4/libtool.m4'
libtoolize: copying file 'm4/ltoptions.m4'
libtoolize: copying file 'm4/ltsugar.m4'
libtoolize: copying file 'm4/ltversion.m4'
libtoolize: copying file 'm4/lt~obsolete.m4'
configure.ac:122: installing './compile'
configure.ac:11: installing './config.guess'
configure.ac:11: installing './config.sub'
configure.ac:7: installing './install-sh'
configure.ac:7: installing './missing'
src/libfoma/Makefile.am: installing './depcomp'

Many thanks.

How to obtain the absolute position of a token relative to the start of the file

When processing a file line by line using the sentence splitter with flush = False, we end up with words whose get_span_start/get_span_finish values restart at 0 in the middle of a logical sentence. It looks like the span values are relative to the physical line.

Example:

She came from a close-knit family in Alabama, many of whom also moved
to Texas when she married the man who was an accomplished politician
in both Tennessee and Texas, and who had won the Battle of San Jacinto
during the Texas Revolution. The couple had eight children, and she
gave birth to most of them while he was away attending to politics. 

The word "to" at the start of the second line will have a span of 0,2, whereas I would like it to be 71,72.

Is there any suggestion on how to obtain the absolute value of the word span with relation to the beginning of the text? Keeping track of the physical lines doesn't seem to be effective because they do not correspond to the logical sentences.

I guess one brute force way would be to simply remove all newlines from the original text, but I would like to avoid this option.
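One way to keep the newlines is to record the base offset of each physical line while tokenizing, and add it to the line-relative span at that point, before the words are buffered by the splitter. The sketch below shows only that bookkeeping. It uses a whitespace regex as a stand-in tokenizer, and the name `absolute_spans` is hypothetical; in real use the line-relative spans would come from the tokenizer's word objects, and whether the Python binding lets you store the corrected span back on the word is not confirmed here.

```python
import re
from typing import Iterable, Iterator, Tuple


def absolute_spans(lines: Iterable[str]) -> Iterator[Tuple[str, int, int]]:
    """Yield (token, start, finish) counted from the start of the text.

    Each element of `lines` must keep its trailing newline, so that
    the running base offset also counts the line breaks.
    """
    base = 0
    for line in lines:
        # Stand-in tokenizer: every run of non-whitespace is one token,
        # with a span relative to the current physical line.
        for match in re.finditer(r"\S+", line):
            yield match.group(), base + match.start(), base + match.end()
        base += len(line)


text = "She moved\nto Texas. The couple\nhad eight children.\n"
for token, start, finish in absolute_spans(text.splitlines(keepends=True)):
    # The absolute span always indexes the original, unmodified text.
    assert text[start:finish] == token
    print(token, start, finish)
```

Since the base offset is attached per token rather than per sentence, it does not matter that logical sentences cross physical lines.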

Installation Ubuntu

Running "make", I get the following error:
g++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See file:///usr/share/doc/gcc-5/README.Bugs for instructions.
Makefile:985: recipe for target 'util.lo' failed
make[2]: *** [util.lo] Error 1
make[2]: Leaving directory '/home/deturck/Documents/FreeLing-master/src/libfreeling'
Makefile:356: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/home/deturck/Documents/FreeLing-master/src'
Makefile:438: recipe for target 'all-recursive' failed
make: *** [all-recursive] Error 1

Many thanks.

DockerFile for the Java API not working

I am trying to build the Docker image from APIs/java/Dockerfile, but I get some installation errors.
I am building it on Linux Mint 18.1, but the Dockerfile is the original file, which uses Ubuntu trusty.
The error I get is:

Step 8/15 : RUN cd /tmp &&     wget -q --progress=dot:giga https://github.com/TALP-UPC/FreeLing/releases/download/4.0/freeling-4.0-trusty-amd64.deb &&     dpkg -i freeling-4.0-trusty-amd64.deb
 ---> Running in 0a760e5f0bf3
Selecting previously unselected package freeling.
(Reading database ... 32547 files and directories currently installed.)
Preparing to unpack freeling-4.0-trusty-amd64.deb ...
Unpacking freeling (4.0) ...
dpkg: dependency problems prevent configuration of freeling:
 freeling depends on libboost-filesystem1.54.0 (>= 1.54); however:
  Package libboost-filesystem1.54.0 is not installed.

dpkg: error processing package freeling (--install):
 dependency problems - leaving unconfigured
Errors were encountered while processing:
 freeling

So it seems that the package does not correctly detect the installed libboost version (1.55), which is the one specified in the apt-get command:
apt-get install -y automake autoconf libtool wget swig libicu52 libboost-regex1.55.0 libboost-system1.55.0 libboost-program-options1.55.0

If I change the version in apt-get to 1.54 then it is able to install freeling...

RUN locale-gen en_US.UTF-8 && \
    apt-get install -y automake autoconf libtool wget swig \
                       libicu52 libboost-regex1.54.0 \
                       libboost-system1.54.0 libboost-program-options1.54.0 \
                       libboost-thread1.54.0 libboost-filesystem1.54.0 && \

Then dpkg -i freeling-4.0-trusty-amd64.deb works with no issues, but the make command fails.
Here are the first errors:

g++ -shared -o /usr/local/lib/libfreeling_javaAPI.so freeling_javaAPI.cxx -lfreeling -L/usr/lib -lboost_system -I/usr/include -I/usr/include/treeler -I/usr/lib/jvm/java-8-oracle/include -I/usr/lib/jvm/java-8-oracle/include/linux -fPIC -std=c++0x
freeling_javaAPI.cxx:508:12: error: 'analyzer' in namespace 'freeling' does not name a type
    typedef freeling::analyzer::config_options config_options;
            ^
freeling_javaAPI.cxx:509:12: error: 'analyzer' in namespace 'freeling' does not name a type
    typedef freeling::analyzer::invoke_options invoke_options;
            ^
freeling_javaAPI.cxx: In function 'jlong Java_edu_upc_freeling_freelingJNI_new_1ListAlternative_1_1SWIG_10(JNIEnv*, jclass)':
freeling_javaAPI.cxx:1045:14: error: 'alternative' is not a member of 'freeling'
   std::list< freeling::alternative > *result = 0 ;

I have also tried switching the Ubuntu version to xenial, adapting the versions of the parser and libboost accordingly, but make fails as well.

Any help is welcome!

online demo fails: "Error connecting to language identification service"

URL: http://nlp.lsi.upc.edu/freeling/demo/demo.php

Options: default

Detailed error message:

[TS-100] Internal server error
Warning: SimpleXMLElement::__construct(): Entity: line 1: parser error : Start tag expected, '<' not found in /home/operador/public_html/freeling/demo/demo.php on line 265
Warning: SimpleXMLElement::__construct(): [TS-100] Internal server error in /home/operador/public_html/freeling/demo/demo.php on line 265 
Warning: SimpleXMLElement::__construct(): ^ in /home/operador/public_html/freeling/demo/demo.php on line 265 
Fatal error: Uncaught exception 'Exception' with message 'String could not be parsed as XML' in /home/operador/public_html/freeling/demo/demo.php:265 
Stack trace: #0 /home/operador/public_html/freeling/demo/demo.php(265): SimpleXMLElement->__construct('[TS-100] Intern...') 
#1 {main} thrown in /home/operador/public_html/freeling/demo/demo.php on line 265

Is tab separation valid in data files?

I'm working with the Go port of FreeLing, which works fine for English. When I tried Portuguese (PT), it gave me an error while splitting the following line of tokenizer.dat:

NAMES_CODES 0 ({ALPHA}|{SYMNUM})*[0-9]({ALPHA}|[0-9]|{SYMNUM}+{ALPHANUM})*

This turns out to be the only line whose fields are separated by tabs rather than spaces. I could fix that line and the code would work, but should the code support tab separation too?
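For illustration, a parser that splits on any run of whitespace treats the space-separated and tab-separated forms of that rule identically. This is a minimal sketch in Python, not the code of FreeLing or of the Go port; split_fields is a hypothetical helper name.

```python
# The rule quoted above, built once with spaces and once with tabs.
RULE = "({ALPHA}|{SYMNUM})*[0-9]({ALPHA}|[0-9]|{SYMNUM}+{ALPHANUM})*"
space_line = "NAMES_CODES 0 " + RULE
tab_line = "NAMES_CODES\t0\t" + RULE


def split_fields(line):
    """Split a data-file line on runs of spaces and/or tabs."""
    # str.split() with no argument splits on any whitespace and
    # discards empty fields, so mixed separators are handled too.
    return line.split()


fields_from_spaces = split_fields(space_line)
fields_from_tabs = split_fields(tab_line)
print(fields_from_spaces == fields_from_tabs)  # both yield the same three fields
```

This only works because the regular expression in the rule contains no whitespace; a format that allowed spaces inside a field would need a stricter splitter.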

Python API not working

Environment - Mac OS High Sierra
Freeling installation - Homebrew v 4.1
Issue - Python API not working
Issue Detail - I ran make successfully against Python 2.7 with no errors, but when I execute python sample.py I get the following error:

Traceback (most recent call last):
  File "sample.py", line 10, in <module>
    import pyfreeling
  File "/Users/lab/code/FreeLing-4.1/APIs/python2/pyfreeling.py", line 17, in <module>
    _pyfreeling = swig_import_helper()
  File "/Users/lab/code/FreeLing-4.1/APIs/python2/pyfreeling.py", line 16, in swig_import_helper
    return importlib.import_module('_pyfreeling')
  File "/usr/local/Cellar/python@2/2.7.15/Frameworks/Python.framework/Versions/2.7/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
ImportError: No module named _pyfreeling

Below is the Makefile

# You may need to change these paths to match your installation.
# Alternatively, you can call 'make' overriding variable values, E.g.:
#    make FREELINGDIR=/my/freeling/dir PYTHONDIR=/my/python/dir

FREELINGDIR = /usr/local/Cellar/freeling/4.1
PYTHONVER = python2.7
PYTHONDIR = /usr/include/python2.7

_freeling.so: freeling_pythonAPI.cxx
#	g++ -shared -o _freeling.so freeling_pythonAPI.cxx -lfreeling -I$(FREELINGDIR)/include -L$(FREELINGDIR)/lib -I$(PYTHONDIR) -fPIC -std=gnu++0x
# Mac OSX users: Comment the line above, and uncomment the line below:
	g++ -dynamiclib -o _freeling.so freeling_pythonAPI.cxx -lboost_system -l$(PYTHONVER) -lfreeling -I$(FREELINGDIR)/include -L$(FREELINGDIR)/lib -I$(PYTHONDIR) $(CPPFLAGS) $(LDFLAGS) -fPIC  -std=c++0x

freeling_pythonAPI.cxx: freeling_pythonAPI.i ../common/freeling.i ../common/templates.i
	swig -python -c++ -o freeling_pythonAPI.cxx freeling_pythonAPI.i
# Python2 users: Remove option "-py3" from the above line

clean:
	rm -rf __pycache__ freeling_pythonAPI.cxx _freeling.so freeling.py*
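One thing worth checking: the traceback asks for a compiled module named _pyfreeling, while the Makefile above builds _freeling.so. Whether that mismatch is the cause here is an assumption, but a quick look at which compiled modules actually exist in the API directory can confirm or rule it out. The helper below is a hypothetical sketch (find_extension_modules is not part of FreeLing), written to run on both Python 2.7 and Python 3.

```python
import glob
import os


def find_extension_modules(directory, names=("_pyfreeling", "_freeling")):
    """Map each module name to the compiled files found for it in directory."""
    found = {}
    for name in names:
        matches = []
        # Common suffixes for compiled extension modules on Linux/macOS/Windows.
        for ext in ("so", "pyd", "dylib"):
            pattern = os.path.join(directory, name + "*." + ext)
            matches.extend(glob.glob(pattern))
        found[name] = sorted(matches)
    return found


if __name__ == "__main__":
    # Run from the directory that contains sample.py and pyfreeling.py.
    for module, files in sorted(find_extension_modules(".").items()):
        print("{}: {}".format(module, files if files else "not found"))
```

If only _freeling.so shows up, the wrapper that does importlib.import_module('_pyfreeling') has nothing to load under that name.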

Outdated dependencies.

Package dependencies such as libicu and libboost-regex are outdated in some .deb releases (jessie, stretch).

Problem with the path in "Additional Library Directories"

The path to the dependencies directory is wrong in the "libtreeler", "libfoma" and "libfreeling" projects of the solution found in msvc/13.0.
Instead of "..\..\..\..\dependencies\zlib\lib\x64" it should be "..\..\..\dependencies\zlib\lib\x64".
