eflomal's People

Contributors

miau1, robertostling, svirpioj, titsuki, zouharvi

eflomal's Issues

Failed to read token/ failed to read length

Hi,
I am training a huge alignment model on more than 15M sentences and I get "failed to read token: Success". On another occasion, I got the error "failed to read length: Success".
It does not occur with smaller data, nor with small data containing very long sentences.
Could it be a memory problem? Maybe eflomal saves some temporary files that I have not taken into account?
Any idea?
regards and thanks in advance

-i option doesn't work

Would it be hard to add the fast_align format -i option back? It would be useful to avoid having to change scripts :)
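
Until the option is back, one possible workaround is to split a fast_align-style file into separate source and target files and pass those to align.py with the -s and -t options shown elsewhere on this page. The sketch below assumes the usual fast_align layout of one sentence pair per line, separated by " ||| "; the function name split_fast_align is made up for illustration and is not part of eflomal.

```python
import io
from typing import Iterable, TextIO


def split_fast_align(lines: Iterable[str], src_out: TextIO, tgt_out: TextIO) -> int:
    """Split 'source ||| target' lines into two parallel streams.

    Hypothetical helper, not part of eflomal.
    Returns the number of sentence pairs written.
    """
    count = 0
    for line_no, line in enumerate(lines, start=1):
        parts = line.rstrip("\n").split(" ||| ")
        if len(parts) != 2:
            raise ValueError(f"line {line_no}: expected exactly one ' ||| ' separator")
        src_out.write(parts[0].strip() + "\n")
        tgt_out.write(parts[1].strip() + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # In-memory demo; real use would pass open file objects instead.
    src, tgt = io.StringIO(), io.StringIO()
    n = split_fast_align(["das Haus ||| the house\n", "ein Buch ||| a book\n"], src, tgt)
    print(n, src.getvalue().splitlines(), tgt.getvalue().splitlines())
```

Raising on malformed lines rather than skipping them keeps the two output files in step, so that line N of one file always pairs with line N of the other.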

Allowed maximum sentence length

Hi, what if I want to increase MAX_SENT_LEN from 0x400 to, say, 0x1000 (by changing the maximum sentence length in eflomal.c and eflomal.pyx)? Would that work, and what is your recommendation about raising the maximum sentence length above the current limit of 1024 characters?
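
Before changing the constant, it may help to check how many sentences in the corpus actually exceed the limit. The following is a minimal sketch, assuming the limit counts whitespace-separated tokens; the question does not confirm the unit, so swap len(line.split()) for len(line) if it turns out to be characters. The helper name find_long_sentences is hypothetical, and whether a sentence of exactly 1024 units is accepted is not checked here.

```python
from typing import Iterable, List, Tuple

MAX_SENT_LEN = 0x400  # value quoted in the question; unit assumed to be tokens


def find_long_sentences(lines: Iterable[str],
                        limit: int = MAX_SENT_LEN) -> List[Tuple[int, int]]:
    """Return (line_number, token_count) for each line with more than `limit` tokens.

    Hypothetical helper, not part of eflomal.
    """
    too_long = []
    for line_no, line in enumerate(lines, start=1):
        n_tokens = len(line.split())  # whitespace tokenization (assumption)
        if n_tokens > limit:
            too_long.append((line_no, n_tokens))
    return too_long


if __name__ == "__main__":
    sample = ["a short sentence", "tok " * 2000]
    for line_no, n_tokens in find_long_sentences(sample):
        print(f"line {line_no}: {n_tokens} tokens")
```

If only a handful of lines are over the limit, filtering or splitting them may be simpler than rebuilding with a larger constant.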

calculated optimal null-prior value

I wonder whether it is possible to calculate a prior probability for NULL alignment that gives the best balance between recall and precision. I bet the GIZA or MGIZA aligners calculate it; is there any chance eflomal could do something like that?

Re-opening open NamedTemporaryFile files won't work on Windows

I was hoping there was an aligner that actually worked on Windows, but unfortunately it doesn't. The first issue I bump into is the file handling.

On Windows you can't re-open NamedTemporaryFile files that are open - it will give a Permission Denied error.

EDIT: I spoke too soon. The issue seems more intricate than I had assumed. Apologies. The issue remains, though.

Trace:

PS C:\tools\eflomal> python .\align.py -s .\source.txt -t .\target.txt
Traceback (most recent call last):
  File ".\align.py", line 142, in <module>
    if __name__ == '__main__': main()
  File ".\align.py", line 136, in main
    use_gdb=args.debug)
  File "python\eflomal\eflomal.pyx", line 123, in eflomal.align
    with open(source_filename, 'rb') as f:
PermissionError: [Errno 13] Permission denied: 'C:\\Users\\user\\AppData\\Local\\Temp\\tmphqbksy6m'
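
For reference, the usual way around this Windows restriction is to create the temporary file with delete=False, close it, and only then reopen it by name. The sketch below shows that pattern in isolation; it is not a patch against eflomal.pyx, the function name write_temp_then_reopen is made up, and as the EDIT above notes the actual problem may involve more than this.

```python
import os
import tempfile


def write_temp_then_reopen(data: bytes) -> bytes:
    """Write data to a named temp file, close it, then reopen it by name.

    Illustrative helper only; not taken from eflomal.
    """
    # delete=False keeps the file on disk after close(), so it can be reopened.
    tmp = tempfile.NamedTemporaryFile(delete=False)
    try:
        tmp.write(data)
        tmp.close()  # release the handle before opening the file a second time
        with open(tmp.name, "rb") as f:
            return f.read()
    finally:
        if not tmp.closed:
            tmp.close()
        os.remove(tmp.name)  # manual cleanup, since delete=False disables it


if __name__ == "__main__":
    print(write_temp_then_reopen(b"hello world\n"))
```

The cost of this approach is that cleanup becomes the caller's job, hence the try/finally.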

Make error on Ubuntu 20.04

Hi @robertostling, I'm somewhat new to this, but I just tried compiling eflomal on my Ubuntu machine in WSL 2 and got an error while running the make command, and I cannot find any solution on Google.

Here's the output:

cc -Ofast -march=native -Wall --std=gnu99 -Wno-unused-function -g -fopenmp -c eflomal.c
cc1: error: bad value (‘tigerlake’) for ‘-march=’ switch
cc1: note: valid arguments to ‘-march=’ switch are: nocona core2 nehalem corei7 westmere sandybridge corei7-avx ivybridge core-avx-i haswell core-avx2 broadwell skylake skylake-avx512 cannonlake icelake-client icelake-server cascadelake bonnell atom silvermont slm goldmont goldmont-plus tremont knl knm x86-64 eden-x2 nano nano-1000 nano-2000 nano-3000 nano-x2 eden-x4 nano-x4 k8 k8-sse3 opteron opteron-sse3 athlon64 athlon64-sse3 athlon-fx amdfam10 barcelona bdver1 bdver2 bdver3 bdver4 znver1 znver2 btver1 btver2 native
cc1: error: bad value (‘tigerlake’) for ‘-mtune=’ switch
cc1: note: valid arguments to ‘-mtune=’ switch are: nocona core2 nehalem corei7 westmere sandybridge corei7-avx ivybridge core-avx-i haswell core-avx2 broadwell skylake skylake-avx512 cannonlake icelake-client icelake-server cascadelake bonnell atom silvermont slm goldmont goldmont-plus tremont knl knm intel x86-64 eden-x2 nano nano-1000 nano-2000 nano-3000 nano-x2 eden-x4 nano-x4 k8 k8-sse3 opteron opteron-sse3 athlon64 athlon64-sse3 athlon-fx amdfam10 barcelona bdver1 bdver2 bdver3 bdver4 znver1 znver2 btver1 btver2 generic native
make: *** [Makefile:10: eflomal.o] Error 1

Any idea why?

Any help would greatly be appreciated. I just can't get around this issue.

FileNotFoundError

I'm trying to run eflomal, but keep getting the following error:

Traceback (most recent call last):
  File "./align.py", line 142, in <module>
    if __name__ == '__main__': main()
  File "./align.py", line 136, in main
    use_gdb=args.debug)
  File "python/eflomal/eflomal.pyx", line 152, in eflomal.align
    subprocess.call(args)
  File "/home/bene/anaconda3/lib/python3.6/subprocess.py", line 267, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/home/bene/anaconda3/lib/python3.6/subprocess.py", line 707, in __init__
    restore_signals, start_new_session)
  File "/home/bene/anaconda3/lib/python3.6/subprocess.py", line 1333, in _execute_child
    raise child_exception_type(errno_num, err_msg)
FileNotFoundError: [Errno 2] No such file or directory: 'eflomal'

I tried running both eflomal and align.py, from both the eflomal directory and outside it. Am I doing something wrong?
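
The traceback shows subprocess.call failing because no executable named 'eflomal' could be found, which would be consistent with the binary not having been built or not being on PATH (a guess, not a confirmed diagnosis). A quick way to check from Python is sketched below; find_executable and its extra_dirs parameter are hypothetical names for illustration.

```python
import os
import shutil
from typing import Iterable, Optional


def find_executable(name: str, extra_dirs: Iterable[str] = ()) -> Optional[str]:
    """Return the path of `name` if found on PATH or in extra_dirs, else None.

    Hypothetical helper, not part of eflomal.
    """
    found = shutil.which(name)  # searches the directories listed in PATH
    if found:
        return found
    for directory in extra_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    # Also look in the current directory, e.g. a source checkout after `make`.
    path = find_executable("eflomal", extra_dirs=[os.getcwd()])
    if path is None:
        print("eflomal binary not found; build it and add its directory to PATH")
    else:
        print("found:", path)
```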

eflomal vs Static and Contextualized Embeddings

I found a publication that says:

We find that alignments created from embeddings are superior for four and comparable for two language pairs compared to those produced by traditional statistical aligners – even with abundant parallel data; e.g., contextualized embeddings achieve a word alignment F1 for English-German that is 5 percentage points higher than eflomal, a high-quality statistical aligner, trained on 100k parallel sentences

https://arxiv.org/pdf/2004.08728.pdf

What do you think about it?

I started my research a few hours ago and do not have an opinion yet, so I do not know whether I should learn more about eflomal or about "Static and Contextualized Embeddings".

Odd path issue when installing into a prefix

I install eflomal into a prefix, as follows:

cloud-user@opus-rr:~/guarani/source/eflomal$ python3 setup.py install --prefix=$HOME/guarani/local
running install
running bdist_egg
running egg_info
writing requirements to eflomal.egg-info/requires.txt
writing eflomal.egg-info/PKG-INFO
writing dependency_links to eflomal.egg-info/dependency_links.txt
writing top-level names to eflomal.egg-info/top_level.txt
reading manifest file 'eflomal.egg-info/SOURCES.txt'
writing manifest file 'eflomal.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_ext
creating build/bdist.linux-x86_64/egg
copying build/lib.linux-x86_64-3.4/eflomal.cpython-34m.so -> build/bdist.linux-x86_64/egg
creating stub loader for eflomal.cpython-34m.so
byte-compiling build/bdist.linux-x86_64/egg/eflomal.py to eflomal.cpython-34.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying eflomal.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying eflomal.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying eflomal.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying eflomal.egg-info/requires.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying eflomal.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt
zip_safe flag not set; analyzing archive contents...
__pycache__.eflomal.cpython-34: module references __file__
creating 'dist/eflomal-0.1-py3.4-linux-x86_64.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Creating /home/cloud-user/guarani/local/lib/python3.4/site-packages/site.py
Processing eflomal-0.1-py3.4-linux-x86_64.egg
removing '/home/cloud-user/guarani/local/lib/python3.4/site-packages/eflomal-0.1-py3.4-linux-x86_64.egg' (and everything under it)
creating /home/cloud-user/guarani/local/lib/python3.4/site-packages/eflomal-0.1-py3.4-linux-x86_64.egg
Extracting eflomal-0.1-py3.4-linux-x86_64.egg to /home/cloud-user/guarani/local/lib/python3.4/site-packages
eflomal 0.1 is already the active version in easy-install.pth

Installed /home/cloud-user/guarani/local/lib/python3.4/site-packages/eflomal-0.1-py3.4-linux-x86_64.egg
Processing dependencies for eflomal==0.1
Searching for numpy==1.8.2
Best match: numpy 1.8.2
numpy 1.8.2 is already the active version in easy-install.pth

Using /usr/lib/python3/dist-packages
Finished processing dependencies for eflomal==0.1

Everything seems to work, and the prefix is in my PYTHONPATH:

$ echo $PYTHONPATH
:/home/cloud-user/guarani/local/lib/python3.4/site-packages/

But when I try to run it I get the following error:

$ cat ../../iterations/grn-spa.0.eflomal
Traceback (most recent call last):
  File "align.py", line 142, in <module>
    if __name__ == '__main__': main()
  File "align.py", line 136, in main
    use_gdb=args.debug)
  File "eflomal.pyx", line 156, in eflomal.align (python/eflomal/eflomal.c:3657)
  File "/usr/lib/python3.4/subprocess.py", line 537, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/usr/lib/python3.4/subprocess.py", line 859, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.4/subprocess.py", line 1457, in _execute_child
    raise child_exception_type(errno_num, err_msg)
FileNotFoundError: [Errno 2] No such file or directory: '/home/cloud-user/guarani/source/eflomal/eflomal'

Any ideas?

Build error on Ubuntu 14.04

Dear Robert,
after installing the required dependencies and building eflomal, I ran the Python install script:

$ sudo python3 setup.py install
running install
Checking .pth file support in /usr/local/lib/python3.4/dist-packages/
/usr/bin/python3 -E -c pass
TEST PASSED: /usr/local/lib/python3.4/dist-packages/ appears to support .pth files
running bdist_egg
running egg_info
creating eflomal.egg-info
writing requirements to eflomal.egg-info/requires.txt
writing eflomal.egg-info/PKG-INFO
writing dependency_links to eflomal.egg-info/dependency_links.txt
writing top-level names to eflomal.egg-info/top_level.txt
writing manifest file 'eflomal.egg-info/SOURCES.txt'
reading manifest file 'eflomal.egg-info/SOURCES.txt'
writing manifest file 'eflomal.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_ext
building 'eflomal' extension
creating build
creating build/temp.linux-x86_64-3.4
creating build/temp.linux-x86_64-3.4/python
creating build/temp.linux-x86_64-3.4/python/eflomal
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -fPIC -I/usr/lib/python3/dist-packages/numpy/core/include -I/usr/include/python3.4m -c python/eflomal/eflomal.c -o build/temp.linux-x86_64-3.4/python/eflomal/eflomal.o
In file included from /usr/lib/python3/dist-packages/numpy/core/include/numpy/ndarraytypes.h:1761:0,
from /usr/lib/python3/dist-packages/numpy/core/include/numpy/ndarrayobject.h:17,
from /usr/lib/python3/dist-packages/numpy/core/include/numpy/arrayobject.h:4,
from python/eflomal/eflomal.c:353:
/usr/lib/python3/dist-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
#warning "Using deprecated NumPy API, disable it by "
^
python/eflomal/eflomal.c: In function ‘__pyx_f_7eflomal_write_text’:
python/eflomal/eflomal.c:2328:7: error: format not a string literal and no format arguments [-Werror=format-security]
fprintf(__pyx_v_f, __pyx_k_0);
^
...
At the end, the script failed with exit status 1

gcc version 4.8.4
Thank you for help, best, Philippe

Compile error on OSX Mojave

Trying to execute the make command I get the following error:
clang: error: unsupported option '-fopenmp'

I tried to install gcc via brew using:
brew install cmake gcc
but it doesn't change anything.
How to fix it?

Input data format

Hi Robert,
thank you for your help and the nice development you made with your team, I want to give Eflomal a try.
But can you add some indications about the input data format supported by Eflomal (or tell me where it is written)?
I have formatted my data like in the "3rdparty/data/" folder and just learned (see #2 (comment)) it wasn't necessary :-) .
Best, Philippe

library not found for -lrt

When I try to run the make command, I get the following error:
ld: library not found for -lrt
collect2: error: ld returned 1 exit status
make: *** [eflomal] Error 1

I am using gcc 8.3
Any ideas?

Compile error

Hi, I cannot compile eflomal with gcc 4.9.3 using the make command. The error is as follows. I could not find a solution through Google. Have you ever encountered this error?

cc -lm -lrt -lgomp eflomal.o -o eflomal
eflomal.o: In function `align._omp_fn.2':
/home/user/work/tool/align/eflomal/eflomal.c:867: undefined reference to `omp_get_num_threads'
/home/user/work/tool/align/eflomal/eflomal.c:867: undefined reference to `omp_get_thread_num'
/home/user/work/tool/align/eflomal/eflomal.c:870: undefined reference to `GOMP_critical_start'
/home/user/work/tool/align/eflomal/eflomal.c:870: undefined reference to `GOMP_critical_end'
eflomal.o: In function `align._omp_fn.1':
/home/user/work/tool/align/eflomal/eflomal.c:848: undefined reference to `omp_get_num_threads'
/home/user/work/tool/align/eflomal/eflomal.c:848: undefined reference to `omp_get_thread_num'
/home/user/work/tool/align/eflomal/eflomal.c:851: undefined reference to `GOMP_critical_start'
/home/user/work/tool/align/eflomal/eflomal.c:851: undefined reference to `GOMP_critical_end'
eflomal.o: In function `main._omp_fn.0':
/home/user/work/tool/align/eflomal/eflomal.c:1007: undefined reference to `omp_get_num_threads'
/home/user/work/tool/align/eflomal/eflomal.c:1007: undefined reference to `omp_get_thread_num'
eflomal.o: In function `align':
/home/user/work/tool/align/eflomal/eflomal.c:848: undefined reference to `GOMP_parallel'
/home/user/work/tool/align/eflomal/eflomal.c:867: undefined reference to `GOMP_parallel'
/home/user/work/tool/align/eflomal/eflomal.c:867: undefined reference to `GOMP_parallel'
/home/user/work/tool/align/eflomal/eflomal.c:848: undefined reference to `GOMP_parallel'
eflomal.o: In function `main':
/home/user/work/tool/align/eflomal/eflomal.c:955: undefined reference to `omp_set_nested'
/home/user/work/tool/align/eflomal/eflomal.c:1007: undefined reference to `GOMP_parallel'
collect2: error: ld returned 1 exit status
: recipe for target 'eflomal' failed
make: *** [eflomal] Error 1
