
=================================
PyHyphen - hyphenation for Python
=================================

(c) 2008-2014 Dr. Leo

Contact: [email protected]

Project home: http://pyhyphen.googlecode.com

Mailing list: http://groups.google.com/group/pyhyphen


.. contents::

Change log
======================

New in Version 2.0.5:

* remove pre-compiled win32 C extension for Python 2.6, add one for Python 3.4
* avoid unicode error in config.py while installing on some Windows systems

New in Version 2.0.4:

* Update C library to v2.8.6
 
New in Version 2.0.2:

* minor bugfixes and refactorings


New in Version 2.0.1:

* updated URL for LibreOffice's dictionaries
* no longer attempt to hyphenate uppercased words such as 'LONDON'. This
  feature had to be dropped to work around a likely bug in the C extension which,
  under Python 3.3, caused
  the hyphenator to return words starting with a capital letter as lowercase.




New in Version 2.0:

The hyphen.dictools module has been completely rewritten. This was required
by the switch from OpenOffice to LibreOffice, which no longer supports the
old formats for dictionaries and metadata. These changes made it impossible to release a stable v1.0.
The new dictionary management is more
flexible and powerful. There is now a registry for locally installed hyphenation dictionaries. Each dictionary
can have its own file path. It is thus possible to add persistent metadata on pre-existing hyphenation
dictionaries, e.g. from a LibreOffice installation.
Each dictionary, and hence each Hyphenator, can now be
associated with multiple locales such as 'en_US' and 'en_NZ'. These changes entail some backwards-incompatible API changes.
Further changes are:

* Hyphenator.info is of a container type for 'url', 'locales' and 'filepath' of the dictionary.
* the Hyphenator.language attribute deprecated in v1.0 has been removed
* download and install dictionaries from LibreOffice's git repository by default
* dictools.install('xx_YY') will install all dictionaries found for the 'xx' language and associate them with all relevant locales
  as described in the dictionaries.xcu file in LibreOffice's git repository.
* upgraded the `C library libhyphen <http://sourceforge.net/projects/hunspell/files/Hyphen/>`_
  to v2.8.3
* use lib2to3 instead of separate code bases
* dropped support for Python 2.4 and 2.5
* support Python 3.3


New in version 1.0

* Upgraded the `C library libhyphen <http://sourceforge.net/projects/hunspell/files/Hyphen/>`_
  to v2.7 which brings significant improvements, most notably correct treatment of
  already hyphenated words such as 'Python-powered'
* use a CSV file from the OpenOffice website with meta information
  on dictionaries for installation of dictionaries and
  instantiation of hyphenators. Apps can access the metadata
  on all downloadable dicts through the new module-level attribute hyphen.dict_info or for each hyphenator
  through the 'info' attribute.
* Hyphenator objects have an 'info' attribute which is
  a Python dictionary with meta information on
  the hyphenation dictionary. The 'language' attribute
  is deprecated. *Note:* These new features add
  complexity to the installation process as the metadata and dictionary files
  are downloaded at install time. These features have to be tested
  in various environments before declaring the package stable.
* Streamlined the installation process
* The en_US hyphenation dictionary
  has been removed from the package. Instead, the dictionaries for en_US and the local language are automatically
  downloaded at install time.
* restructured the package and merged 2.x and 3.x setup files
* switch from svn to hg
* added win32 binary of the C extension module for Python 3.2, currently no binaries for Python 2.4 and 2.5


New in version 0.10

* added win32 binary for Python 2.7
* renamed the 'hyphenator' class to the more conventional 'Hyphenator'. 'hyphenator' is deprecated.


1. Overview
================

PyHyphen is a pythonic interface to the hyphenation C library used in software such as LibreOffice and the Mozilla suite.
It comes with tools to download, install and uninstall hyphenation dictionaries from LibreOffice's Git repository.
PyHyphen consists of the package 'hyphen' and the module 'textwrap2'. 
The source distribution supports Python 2.6 or higher, including Python 3.3. If you depend on Python 2.4 or 2.5, use PyHyphen-1.0b1
instead. In this case you may have to download hyphenation dictionaries manually.

1.1 Content of the hyphen package
------------------------------------------

The 'hyphen' package contains the following:

    - the class hyphen.Hyphenator: each instance of it
      can hyphenate and wrap words using a dictionary compatible with the hyphenation feature of
      LibreOffice and Mozilla.
    - the module dictools contains useful functions such as for downloading and
      installing dictionaries from a configurable repository. After installation of PyHyphen, the
      LibreOffice repository is used by default.
    - hyphen.dict_info: a dict object with metadata on all hyphenation dictionaries installed locally. In previous
      versions, dict_info contained metadata on all downloadable dictionaries. This feature
      is no longer supported as LibreOffice's Git repository
      does not provide such a list anymore. Instead, use
      hyphen.config.languages, which is an incomplete set of
      language codes of hyphenation dictionaries available from LibreOffice's Git repository. These codes
      can be passed to hyphen.dictools.install() to download and install
      the respective dictionary and update the local registry.
    - hyphen.config is a configuration file initialized at install time with default values
      for paths of dictionaries and the registry file, as well as the default URL of
      the repository for
      downloadable dictionaries. Initial values for the local paths are set to
      the package root, the URL is set to the LibreOffice
      repository for dictionaries.
    - hyphen.DictInfo: dict-like container type for meta data on dictionaries. It has the following attributes:
      'locales': a list of locales for which the dictionary is suitable;
      'url': the URL from which the dictionary was downloaded, or None; 'filepath': the
      local path including the file name of the dictionary.
    - hyphen.hnj is the C extension module that does all the ground work. It
      contains the high quality
      `C library libhyphen <http://sourceforge.net/projects/hunspell/files/Hyphen/>`_.
      It supports hyphenation with replacements as well as compound words.
      Note that hyphenation dictionaries are invisible to the
      Python programmer. But each hyphenator object has an attribute 'info' which is a
      DictInfo object containing meta data on the hyphenation dictionary of this Hyphenator instance.
      The 'language' attribute, which contained a single locale for which the dictionary is suitable,
      was deprecated in v1.0 and removed in v2.0. Use my_hyphenator.info.locales instead to access
      a list of locales for which the dictionary is suitable.
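
As a rough illustration of the registry idea described above, the following sketch maps every locale of a dictionary to one shared metadata record. The names ``DictInfoSketch`` and ``build_registry`` are hypothetical and not part of PyHyphen; only the three field names and the en_GB sample values are taken from this README.

```python
from collections import namedtuple

# Hypothetical stand-in for hyphen.DictInfo with the three documented fields.
DictInfoSketch = namedtuple('DictInfoSketch', ['locales', 'url', 'filepath'])


def build_registry(infos):
    """Map each locale to the record of the dictionary that serves it."""
    registry = {}
    for info in infos:
        for loc in info.locales:
            registry[loc] = info
    return registry


en_gb = DictInfoSketch(
    locales=['en_GB', 'en_ZA', 'en_AU', 'en_NZ'],  # shortened sample list
    url='http://cgit.freedesktop.org/libreoffice/dictionaries/plain/dictionaries/en/hyph_en_GB.dic',
    filepath='hyph_en_GB.dic',  # illustrative relative path
)
registry = build_registry([en_gb])
print(sorted(registry))
```

Because several locales point to the same record, a lookup for 'en_NZ' and one for 'en_GB' yield the same dictionary file in this sketch.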


1.2 The module 'textwrap2'
------------------------------

This module is an enhanced though backwards-compatible version of the module
'textwrap' from the Python standard library. It adds
hyphenation functionality to 'textwrap'. To this end, a new keyword parameter
'use_hyphenator' has been added to the __init__ method of the TextWrapper class. It
defaults to None and can be set to any hyphenator object. Note that until version 0.7
this keyword parameter was named 'use_hyphens', so older code may need to be changed.
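
To make the idea of hyphenation-aware wrapping concrete, here is a minimal sketch that runs without PyHyphen. It is not the textwrap2 implementation: ``StubHyphenator`` and ``fill_hyphenated`` are hypothetical names, and the stub only mimics a ``wrap(word, width)`` method that returns the two parts of a split word.

```python
class StubHyphenator:
    """Hypothetical stand-in offering wrap(word, width) for known words."""

    def __init__(self, pairs_by_word):
        self._pairs = pairs_by_word  # word -> list of [head, tail] splits

    def wrap(self, word, width):
        best = []
        for head, tail in self._pairs.get(word, []):
            # Keep the longest head that still fits together with its hyphen.
            if len(head) + 1 <= width and (not best or len(head) + 1 > len(best[0])):
                best = [head + '-', tail]
        return best


def fill_hyphenated(text, width, use_hyphenator=None):
    """Greedy line filling that splits a word when it does not fit."""
    lines, current = [], ''
    for word in text.split():
        while word:
            candidate = word if not current else current + ' ' + word
            if len(candidate) <= width:
                current, word = candidate, ''
                continue
            # Space left on the current line for the first part of the word.
            room = width - len(current) - (1 if current else 0)
            parts = use_hyphenator.wrap(word, room) if use_hyphenator and room > 1 else []
            if parts:
                lines.append((current + ' ' if current else '') + parts[0])
                current, word = '', parts[1]
            elif current:
                lines.append(current)
                current = ''
            else:
                # Unsplittable word longer than the width: give it its own line.
                lines.append(word)
                word = ''
    if current:
        lines.append(current)
    return '\n'.join(lines)


stub = StubHyphenator({'beautiful': [['beau', 'tiful'], ['beauti', 'ful']]})
print(fill_hyphenated('a beautiful day', 8, use_hyphenator=stub))
```

With ``use_hyphenator=None`` the function falls back to plain word wrapping, which mirrors the documented default of the keyword parameter.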


2. Code examples
======================


::

        >>> from hyphen import Hyphenator, dict_info
        >>> from hyphen.dictools import install, is_installed

        # Download and install some dictionaries in the default directory using the default
        # repository, usually the LibreOffice website
        >>> for lang in ['de_DE', 'en_US']:
        ...     if not is_installed(lang): install(lang)
            
        # Show locales of installed dictionaries
        >>> dict_info.keys()
        ['de_CH', 'en_CA', 'en_PH', 'de', 'de_DE', 'en_TT', 'en_NA', 'en_MW',
        'en_ZA', 'en_AU', 'en_NZ', 'en_JM', 'en_BS', 'en_US', 'de_AT',
        'en_IE', 'en_ZW', 'en_GH', 'en_IN', 'en_BZ', 'en_GB']

        >>> print(dict_info['en_GB'])
        Hyphenation dictionary:
        Locales: ['en_GB', 'en_ZA', 'en_NA', 'en_ZW', 'en_AU', 'en_CA', 'en_IE', 'en_IN'
        , 'en_BZ', 'en_BS', 'en_GH', 'en_JM', 'en_MW', 'en_NZ', 'en_TT']
        filepath: c:\python27\lib\site-packages\hyphen/hyph_en_GB.dic
        URL: http://cgit.freedesktop.org/libreoffice/dictionaries/plain/dictionaries/en/
        hyph_en_GB.dic


        # Create some hyphenators
        h_de = Hyphenator('de_DE')
        h_en = Hyphenator('en_US')

        # Now hyphenate some words
        # Note: the following examples are written in Python 3.x syntax.
        # If you use Python 2.x, you must add the 'u' prefixes as Hyphenator methods expect unicode strings.

        h_en.pairs('beautiful')
        [['beau', 'tiful'], ['beauti', 'ful']]

        h_en.wrap('beautiful', 6)
        ['beau-', 'tiful']

        h_en.wrap('beautiful', 7)
        ['beauti-', 'ful']
        
        h_en.syllables('beautiful')
        ['beau', 'ti', 'ful']
        
        # h_en.info is a DictInfo object (see section 1.1); its attributes are
        # 'locales', 'url' and 'filepath'
        h_en.info.locales
        

        from textwrap2 import fill
        print(fill('very long text...', width=40, use_hyphenator=h_en))
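
The outputs of ``syllables``, ``pairs`` and ``wrap`` shown above are related in a simple way for this word. The following sketch reproduces them in pure Python under stated assumptions: the function names are hypothetical, the real work is done by the C library, and dictionaries that use hyphenation with replacements will not follow this simple joining rule. Returning an empty list when nothing fits is also an assumption.

```python
def pairs_from_syllables(syllables):
    """All two-part splits of a word at its syllable boundaries."""
    return [[''.join(syllables[:i]), ''.join(syllables[i:])]
            for i in range(1, len(syllables))]


def wrap_from_pairs(pairs, width, hyphen='-'):
    """Pick the split whose first part, including the hyphen, is longest within width."""
    fitting = [p for p in pairs if len(p[0]) + len(hyphen) <= width]
    if not fitting:
        return []  # assumption: no split fits
    head, tail = max(fitting, key=lambda p: len(p[0]))
    return [head + hyphen, tail]


pairs = pairs_from_syllables(['beau', 'ti', 'ful'])
print(pairs)
print(wrap_from_pairs(pairs, 6))
print(wrap_from_pairs(pairs, 7))
```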



3. Installation
================================

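PyHyphen works with Python 2.6 or higher, including Python 3.x.
The package includes pre-compiled win32 binaries of the hnj module. According to the change log, version 2.0.5 drops the binary for Python 2.6 and adds one for Python 3.4; earlier 2.0.x releases ship binaries for Python 2.6, 2.7, 3.2 and 3.3.
On other platforms you will need a build environment such as gcc and make.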

PyHyphen is pip-installable. In most scenarios the easiest way to install PyHyphen is to type from the shell prompt::

    $ pip install pyhyphen

Manual download and installation will be your preferred option if you want to compile the C library
from source on Windows rather than using the pre-compiled binary, or if you do not want to download dictionaries upon install.

The setup script first checks the Python version, creates a 'hyphen' subdir, and copies
the required files from the 2.x and src subdirs. If needed, lib2to3 will
be used.

Second, setup.py searches in ./bin for a pre-compiled binary
of hnj for your platform. If there is a binary that looks ok, this version is installed. Otherwise,
hnj is compiled from source. On Windows you will need MSVC, mingw
or whatever fits your Python distribution.
If the distribution comes with a binary of 'hnj'
that fits your platform and Python version, you can still force a compilation from
source by entering::

    $ python setup.py install --force_build_ext

Under Linux you may need root privileges.

After compiling and installing the hyphen package, config.py is adjusted as follows:

- the local default path for hyphenation dictionaries is set to the package directory
- the base URL from which
  dictionaries are downloaded is set to LibreOffice's Git repository

Thereafter the setup script imports the hyphen package to install a default
set of dictionaries, unless the command line contains 'no_dictionaries' after the 'install' command.
The dictionaries installed by default are those for English and the locale, if different.
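
The selection rule for the default dictionaries can be sketched as follows. This is not the code from setup.py: ``default_dictionary_languages`` is a hypothetical helper, 'en_US' is assumed to stand for "English", and the guard against a missing locale is an addition that avoids concatenating ``None`` with a string.

```python
def default_dictionary_languages(local_lang):
    """Return English plus the local language, if one is known and different."""
    languages = ['en_US']
    # local_lang may be None when the locale cannot be determined,
    # e.g. the first item returned by locale.getlocale().
    if local_lang and local_lang != 'en_US':
        languages.append(local_lang)
    return languages


print(default_dictionary_languages('de_DE'))
print(default_dictionary_languages(None))
```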


4. Contributing and reporting bugs
=====================================

Contributions, comments, bug reports, criticism and praise can be sent to the author.

Browse the Mercurial repository and submit
bug reports at http://pyhyphen.googlecode.com.


pyhyphen's Issues

can't install dictionaries

What steps will reproduce the problem?
1. Install PyHyphen with Python 2.7
2. from hyphen.dictools import install
3. install('en_US')

What is the expected output? What do you see instead?

I hope to install the dictionary, but I have the following error message

<ipython-input-3-3f9c4d992f7e> in <module>()
----> 1 install('en_US')

/usr/local/lib/python2.7/dist-packages/PyHyphen-2.0.2-py2.7-linux-i686.egg/hyphen/dictools.pyc in install(language, directory, repos, use_description)
     68 
     69         try:
---> 70             descr_file = urlopen(descr_url)
     71         except URLError:
     72             # OK. So try with the country code.

/usr/lib/python2.7/urllib2.pyc in urlopen(url, data, timeout)
    125     if _opener is None:
    126         _opener = build_opener()
--> 127     return _opener.open(url, data, timeout)
    128 
    129 def install_opener(opener):

/usr/lib/python2.7/urllib2.pyc in open(self, fullurl, data, timeout)
    391 
    392         req.timeout = timeout
--> 393         protocol = req.get_type()
    394 
    395         # pre-process request

/usr/lib/python2.7/urllib2.pyc in get_type(self)
    253             self.type, self.__r_type = splittype(self.__original)
    254             if self.type is None:
--> 255                 raise ValueError, "unknown url type: %s" % self.__original
    256         return self.type
    257 

ValueError: unknown url type: $repoen_US/dictionaries.xcu



What version of the product are you using? On what operating system?

Latest PyHyphen from PyPI (2.0.2)

Please provide any additional information below.


Original issue reported on code.google.com by [email protected] on 9 Jan 2013 at 11:58

Typo in documentation

From the example in the README.txt:

from textwrap2 import fill
print fill('very long text...', width = 40, use_hyphens = h_en)

has to be changed to:

from textwrap2 import fill
print fill('very long text...', width = 40, use_hyphenator = h_en)

Regards,

Alessandro

Original issue reported on code.google.com by [email protected] on 23 Aug 2011 at 3:22

hyphen_config.py assumes setup.py installed into sys.path

I'm trying to create a Debian package of this project.  With version
0.8, it fails to install.  I believe the problem is that "setup.py
install" invokes hyphen_config.py -- which assumes hyphen is in the
default Python sys.path.  This is not the case when preparing
packages.  Below is a partial transcript demonstrating the problem.

   dh_auto_install
running install
running build
running build_py
running build_ext
running install_lib
creating /home/twb/VCS/PyHyphen-0.8/debian/python-pyhyphen/usr
creating /home/twb/VCS/PyHyphen-0.8/debian/python-pyhyphen/usr/lib
creating /home/twb/VCS/PyHyphen-0.8/debian/python-pyhyphen/usr/lib/python2.5
creating /home/twb/VCS/PyHyphen-0.8/debian/python-pyhyphen/usr/lib/python2.5/site-packages
creating /home/twb/VCS/PyHyphen-0.8/debian/python-pyhyphen/usr/lib/python2.5/site-packages/hyphen
copying build/lib.linux-i686-2.5/hyphen/hyph_en_US.dic -> /home/twb/VCS/PyHyphen-0.8/debian/python-pyhyphen/usr/lib/python2.5/site-packages/hyphen
copying build/lib.linux-i686-2.5/hyphen/__init__.py -> /home/twb/VCS/PyHyphen-0.8/debian/python-pyhyphen/usr/lib/python2.5/site-packages/hyphen
copying build/lib.linux-i686-2.5/hyphen/dictools.py -> /home/twb/VCS/PyHyphen-0.8/debian/python-pyhyphen/usr/lib/python2.5/site-packages/hyphen
copying build/lib.linux-i686-2.5/hyphen/config.py -> /home/twb/VCS/PyHyphen-0.8/debian/python-pyhyphen/usr/lib/python2.5/site-packages/hyphen
copying build/lib.linux-i686-2.5/hyphen/hnj.so -> /home/twb/VCS/PyHyphen-0.8/debian/python-pyhyphen/usr/lib/python2.5/site-packages/hyphen
copying build/lib.linux-i686-2.5/textwrap2.py -> /home/twb/VCS/PyHyphen-0.8/debian/python-pyhyphen/usr/lib/python2.5/site-packages
running install_egg_info
Writing /home/twb/VCS/PyHyphen-0.8/debian/python-pyhyphen/usr/lib/python2.5/site-packages/PyHyphen-0.8.egg-info
Configuring...
Traceback (most recent call last):
  File "setup.py", line 118, in <module>
    if 'install' in sys.argv: execfile('hyphen_config.py')
  File "hyphen_config.py", line 22, in <module>
    cfg()
  File "hyphen_config.py", line 7, in cfg
    import hyphen
ImportError: No module named hyphen
dh_auto_install: command returned error code 256
make: *** [binary] Error 1
dpkg-buildpackage: failure: fakeroot debian/rules binary gave error exit status 2
debuild: fatal error at line 1319:
dpkg-buildpackage -rfakeroot -D -us -uc failed

Original issue reported on code.google.com by [email protected] on 2 Oct 2008 at 6:53

UnicodeEncodeError: 'latin-1' codec can't encode character u'\u03b4' in position 0: ordinal not in range(256)

From this transcript, you can see that rst2pdf crashes when trying to
make a PDF containing the Greek letter "delta" if and only if pyhyphen
is installed.

$ rst2pdf Requirements.txt 
Traceback (most recent call last):
  File "/usr/bin/rst2pdf", line 8, in <module>
    load_entry_point('rst2pdf==0.9dev-r412', 'console_scripts', 'rst2pdf')()
  File "/var/lib/python-support/python2.5/rst2pdf/createpdf.py", line 1229, in main
    compressed=options.compressed)
  File "/var/lib/python-support/python2.5/rst2pdf/createpdf.py", line 1001, in createPdf
    pdfdoc.build(elements)
  File "/var/lib/python-support/python2.5/reportlab/platypus/doctemplate.py", line 756, in build
    self.handle_flowable(flowables)
  File "/var/lib/python-support/python2.5/reportlab/platypus/doctemplate.py", line 649, in handle_flowable
    if frame.add(f, canv, trySplit=self.allowSplitting):
  File "/var/lib/python-support/python2.5/reportlab/platypus/frames.py", line 159, in _add
    w, h = flowable.wrap(aW, h)
  File "/var/lib/python-support/python2.5/reportlab/platypus/flowables.py", line 520, in wrap
    W,H = _listWrapOn(self._content,aW,self.canv,dims=dims)
  File "/var/lib/python-support/python2.5/reportlab/platypus/flowables.py", line 464, in _listWrapOn
    w,h = f.wrapOn(canv,availWidth,0xfffffff)
  File "/var/lib/python-support/python2.5/reportlab/platypus/flowables.py", line 113, in wrapOn
    w, h = self.wrap(aW,aH)
  File "/var/lib/python-support/python2.5/reportlab/platypus/paragraph.py", line 800, in wrap
    blPara = self.breakLines([first_line_width, later_widths])
  File "/var/lib/python-support/python2.5/wordaxe/rl/paragraph.py", line 808, in breakLines
    hyphWord = hyphenator.hyphenate(uniwordstr)
  File "/var/lib/python-support/python2.5/wordaxe/hyphen.py", line 315, in hyphenate
    hword = self.i_hyphenate(aWord)
  File "/var/lib/python-support/python2.5/wordaxe/plugins/PyHyphenHyphenator.py", line 105, in i_hyphenate
    return ExplicitHyphenator.i_hyphenate_derived(self, aWord)
  File "/var/lib/python-support/python2.5/wordaxe/ExplicitHyphenator.py", line 136, in i_hyphenate_derived
    hword = self.stripper.apply_stripped(word, self.hyph)
  File "/var/lib/python-support/python2.5/wordaxe/BaseHyphenator.py", line 60, in apply_stripped
    result = func(base, *args, **kwargs)
  File "/var/lib/python-support/python2.5/wordaxe/plugins/PyHyphenHyphenator.py", line 98, in hyph
    hword = HyphenatedWord(aWord, hyphenations=self.zerlegeWort(aWord))
  File "/var/lib/python-support/python2.5/wordaxe/plugins/PyHyphenHyphenator.py", line 62, in zerlegeWort
    for left, right in self.hnj.pairs(zusgWort):
  File "/var/lib/python-support/python2.5/hyphen/__init__.py", line 199, in pairs
    return self.__hyphenate__.apply(word, mode)
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u03b4' in position 0: ordinal not in range(256)
$ dpkg -l rst2pdf python-{reportlab,hyphen,wordaxe}
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Cfg-files/Unpacked/Failed-cfg/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Hold/Reinst-required/X=both-problems (Status,Err: uppercase=bad)
||/ Name                        Version                     Description
+++-===========================-===========================-======================================================================
ii  python-hyphen               0.8-1                       Python wrapper for libhnj (libhyphen)
ii  python-reportlab            2.2-2                       ReportLab library to create PDF documents using Python
ii  python-wordaxe              0.2.6-1                     germanic (and basic) hyphenation algorithms
ii  rst2pdf                     0.9+svn412-1                ReportLab-based reStructuredText to PDF renderer
$ locale
LANG=en_AU.utf8
LC_CTYPE="en_AU.utf8"
LC_NUMERIC="en_AU.utf8"
LC_TIME="en_AU.utf8"
LC_COLLATE=C
LC_MONETARY="en_AU.utf8"
LC_MESSAGES="en_AU.utf8"
LC_PAPER="en_AU.utf8"
LC_NAME="en_AU.utf8"
LC_ADDRESS="en_AU.utf8"
LC_TELEPHONE="en_AU.utf8"
LC_MEASUREMENT="en_AU.utf8"
LC_IDENTIFICATION="en_AU.utf8"
LC_ALL=
$ sudo aptitude purge python-hyphen
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Reading extended state information       
Initializing package states... Done
The following packages will be REMOVED:
  python-hyphen{p} 
0 packages upgraded, 0 newly installed, 1 to remove and 0 not upgraded.
Need to get 0B of archives. After unpacking 193kB will be freed.
Do you want to continue? [Y/n/?] 
Writing extended state information... Done
(Reading database ... 37977 files and directories currently installed.)
Removing python-hyphen ...
Processing triggers for python-support ...
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Reading extended state information      
Initializing package states... Done

Current status: 1024 new [-1].
$ rst2pdf Requirements.txt 
[WARNING] createpdf.py L115 Can't load PyHyphen hyphenator for language en_US, trying PyHnj hyphenator
WARNING:rst2pdf:Can't load PyHyphen hyphenator for language en_US, trying PyHnj hyphenator
[WARNING] flowables.py L293 BoundByWidth too wide to fit in frame: <BoundByWidth at 0x97b6d8c> size= maxWidth=238.015748031x
WARNING:rst2pdf:BoundByWidth too wide to fit in frame: <BoundByWidth at 0x97b6d8c> size= maxWidth=238.015748031x
[WARNING] flowables.py L293 BoundByWidth too wide to fit in frame: <BoundByWidth at 0x97bb30c> size= maxWidth=238.015748031x
WARNING:rst2pdf:BoundByWidth too wide to fit in frame: <BoundByWidth at 0x97bb30c> size= maxWidth=238.015748031x
[WARNING] flowables.py L293 BoundByWidth too wide to fit in frame: <BoundByWidth at 0x97bb46c> size= maxWidth=238.015748031x
WARNING:rst2pdf:BoundByWidth too wide to fit in frame: <BoundByWidth at 0x97bb46c> size= maxWidth=238.015748031x
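
The error in the traceback above can be reproduced without PyHyphen: the Greek letter delta has no representation in Latin-1. A possible guard, sketched here under the assumption that the dictionary encoding is known to the caller, is to test whether a word is encodable before handing it to the hyphenator. The helper name ``encodable`` is hypothetical.

```python
def encodable(word, encoding):
    """Return True if the word can be represented in the given encoding."""
    try:
        word.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Words that fail the check could be left unhyphenated instead of raising.
print(encodable('\u03b4', 'latin-1'))
print(encodable('beautiful', 'latin-1'))
```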

Original issue reported on code.google.com by [email protected] on 8 Oct 2008 at 12:56

Dictionary Repo Has Disappeared

http://ftp.services.openoffice.org/pub/OpenOffice.org/contrib/dictionaries/ now 
times out. Importantly, this means that `pip install PyHyphen` currently fails.

I've located the en_US hunspell dicts on sourceforge, here, but I didn't look 
for other languages.

http://sourceforge.net/projects/hunspell/files/OldFiles/

Original issue reported on code.google.com by david.schoonover on 18 Dec 2011 at 2:17

hyphen/__init__.py has wrong encoding

The file hyphen/__init__.py in release 0.8 has an ISO 8859 encoding,
but contains the declaration -*- utf-8 -*-.  Currently this only
affects docstrings, but it should still be fixed.

    $ file hyphen/__init__.py
    hyphen/__init__.py: ISO-8859 English text
    $ head -1 hyphen/__init__.py
    # -*- coding: utf-8 -*-
    $ iconv --from iso-8859-1 --to utf-8 hyphen/__init__.py | diff -u hyphen/__init__.py -
    --- hyphen/__init__.py  2008-09-23 23:46:53.000000000 +1000
    +++ -   2008-10-02 19:07:38.358454496 +1000
    @@ -89,8 +89,8 @@

     # Now hyphenate some words

    -print h_de.inserted(u'Sch?nheitsk?nigin')
    -'Sch?n=heits=k?=ni=gin'
    +print h_de.inserted(u'Schönheitskönigin')
    +'Schön=heits=kö=ni=gin'

     print h_en.pairs('beautiful')
     [[u'beau', u'tiful'], [u'beauti', u'ful']]
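
A Python counterpart of the ``iconv`` check above might look like this. It is only a sketch: ``recode_if_needed`` is a hypothetical helper, and ISO 8859-1 as the actual encoding is an assumption carried over from the ``file`` output.

```python
def recode_if_needed(raw, declared='utf-8', fallback='iso-8859-1'):
    """Return bytes valid in the declared encoding, recoding from the fallback if not."""
    try:
        raw.decode(declared)
        return raw  # already consistent with the declaration
    except UnicodeDecodeError:
        return raw.decode(fallback).encode(declared)


latin1_bytes = 'Schönheitskönigin'.encode('iso-8859-1')
fixed = recode_if_needed(latin1_bytes)
print(fixed.decode('utf-8'))
```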

Original issue reported on code.google.com by [email protected] on 2 Oct 2008 at 9:09

Charset LookupError for some dictionaries

What steps will reproduce the problem?
>>> from hyphen import dictools, hyphenator
>>> dictools.install("sv_SE")
>>> hyphenator("sv_SE").pairs(u"begravningsentreprenör")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "hyphen/__init__.py", line 175, in pairs
    return self.__hyphenate__.apply(word, mode)
LookupError: unknown encoding: charset ISO8859-1


What is the expected output? What do you see instead?
>>> from hyphen import dictools, hyphenator
>>> dictools.install("sv_SE")
>>> hyphenator("sv_SE").pairs(u"begravningsentreprenör")
[[u'be', u'gravningsentrepren\xf6r'], [u'begrav', u'ningsentrepren\xf6r'], 
[u'begravnings', u'entrepren\xf6r'], [u'begravningsent', u'repren\xf6r'], 
[u'begravningsentre', u'pren\xf6r']]


What version of the product are you using? On what operating system?
0.9.3

Please provide any additional information below.

Issue is caused by some dictionaries having "charset" before the charset on the 
first line of the file.

I have fixed it by making a small change to hyphen.c see attached diff.
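
The actual fix was made in hyphen.c and is not shown here. As an illustration of the idea only, a Python sketch that tolerates an optional leading "charset" token on the first line could look like this; ``dictionary_encoding`` is a hypothetical name.

```python
import codecs


def dictionary_encoding(first_line):
    """Extract the encoding name from the first line of a hyphenation dictionary."""
    tokens = first_line.strip().split()
    # Some dictionaries write "charset ISO8859-1" instead of just "ISO8859-1".
    if tokens and tokens[0].lower() == 'charset':
        tokens = tokens[1:]
    if len(tokens) != 1:
        raise ValueError('cannot determine encoding from %r' % first_line)
    codecs.lookup(tokens[0])  # raises LookupError for unknown names
    return tokens[0]


print(dictionary_encoding('charset ISO8859-1\n'))
print(dictionary_encoding('ISO8859-1\n'))
```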

Original issue reported on code.google.com by [email protected] on 22 Jun 2010 at 4:39

Installation via pip fails

What steps will reproduce the problem?
1. pip install pyhyphen

I'm using MacOS X 10.7 (64bit) with a python 2.6 virtualenv, in case this 
matters.

The installation fails because of whitespace in `setup.py`:


$ pip install pyhyphen
Downloading/unpacking pyhyphen
  Running setup.py egg_info for package pyhyphen
    Traceback (most recent call last):
      File "<string>", line 14, in <module>
      File "/Users/diederik/Sites/virtualenvs/allerhande-ah/build/pyhyphen/setup.py", line 143

    SyntaxError: invalid syntax
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):

  File "<string>", line 14, in <module>

  File "/Users/diederik/Sites/virtualenvs/allerhande-ah/build/pyhyphen/setup.py", line 143



SyntaxError: invalid syntax

----------------------------------------
Command python setup.py egg_info failed with error code 1
Storing complete log in /Users/diederik/.pip/pip.log


When I fix the whitespace, and run the install again, I get the following error:


Copying PyHyphen.egg-info to /Users/diederik/Sites/virtualenvs/allerhande-ah/lib/python2.6/site-packages/PyHyphen-1.0beta1-py2.6.egg-info

running install_scripts

writing list of installed files to '/var/folders/24/5y_gshzd7vg9f5mv_hpxf1y40000gn/T/pip-7V2UaF-record/install-record.txt'

Adjusting /.../hyphen/config.py... Done.

Installing dictionary info... Done.

Installing dictionaries... en_US Traceback (most recent call last):

  File "<string>", line 1, in <module>

  File "/Users/diederik/Sites/virtualenvs/allerhande-ah/build/pyhyphen/setup.py", line 123, in <module>

    install('en_US')

  File "/Users/diederik/Sites/virtualenvs/allerhande-ah/lib/python2.6/site-packages/hyphen/dictools.py", line 48, in install

    s = urllib2.urlopen(url).read()

  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 126, in urlopen

    return _opener.open(url, data, timeout)

  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 391, in open

    response = self._open(req, data)

  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 409, in _open

    '_open', req)

  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 369, in _call_chain

    result = func(*args)

  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 1181, in http_open

    return self.do_open(httplib.HTTPConnection, req)

  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 1156, in do_open

    raise URLError(err)

urllib2.URLError: <urlopen error [Errno 8] nodename nor servname provided, or not known>

Original issue reported on code.google.com by [email protected] on 27 Mar 2012 at 9:40

Request to add

What steps will reproduce the problem?
1.
2.
3.

What is the expected output? What do you see instead?


Please use labels and text to provide additional information.


Original issue reported on code.google.com by [email protected] on 2 Jun 2008 at 1:17

Need to pass filename to Hyphenator constructor if dict_info is missing

If dict_info is not available, Hyphenator construction fails unless I pass the
actual dictionary filename as the "language" parameter. This is undocumented and
surprising. Also, dictools functions don't behave this way.

What version of the product are you using? On what operating system?
PyHyphen 1.0beta1

Original issue reported on code.google.com by [email protected] on 24 Sep 2011 at 7:25

Setup fails if locale can't be read

Setup fails if locale can't be read:

       [...]
       Adjusting /.../hyphen/config.py... Done.
       Installing dictionaries... en_US Traceback (most recent call last):      
         File "<string>", line 1, in <module> 
         File "/tmp/pip-30Fjx3-build/setup.py", line 144, in <module>
           sys.stdout.write(local_lang + ' ')
       TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'

This could be fixed by moving Line 144 into the try clause (Line 145) or making 
dictionary installation non-default:
(Line 135: if '--install_dictionaries' in sys.argv:)

Thanks!

Original issue reported on code.google.com by [email protected] on 11 Oct 2012 at 1:34

Request: Use the list of available hyphenation dictionaries

What steps will reproduce the problem?
1. dictools.install("de_AT")

What is the expected output? What do you see instead?

At this point I would like for it to examine the file available at
http://ftp.osuosl.org/pub/openoffice/contrib/dictionaries/hyphavail.lst
which would tell it to download and use the de_DE hyphenation dictionary
instead.

"de,AT,hyph_de_DE,German (Austria),hyph_de_DE.zip"

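A lookup of this kind could be sketched as follows, using only the sample line quoted above. The column names in ``FIELDS`` and the function ``parse_hyphavail`` are assumptions made for readability; the real file may contain other columns or header lines.

```python
import csv

# Assumed meaning of the five comma-separated columns.
FIELDS = ['language_code', 'country_code', 'name', 'long_descr', 'file_name']


def parse_hyphavail(lines):
    """Map locale codes such as 'de_AT' to the dictionary entry that serves them."""
    table = {}
    for row in csv.reader(lines):
        if len(row) != len(FIELDS):
            continue  # skip lines that do not match the assumed layout
        entry = dict(zip(FIELDS, row))
        table[entry['language_code'] + '_' + entry['country_code']] = entry
    return table


available = parse_hyphavail(['de,AT,hyph_de_DE,German (Austria),hyph_de_DE.zip'])
print(available['de_AT']['file_name'])
```

With such a table, a request for 'de_AT' would resolve to the file hyph_de_DE.zip.
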
Original issue reported on code.google.com by [email protected] on 4 Mar 2010 at 10:09

Errors show up while installing dictionaries on Python 3.4

Errors occurred while installing dictionaries. System: Ubuntu, virtual environment, Python 3.4.3, PyHyphen-2.0.5.

Tried this:
for lang in ['de_DE', 'en_US']:
    if not is_installed(lang): install(lang)

Errors detail:

lib/python3.4/site-packages/hyphen/dictools.py", line 71, in install
    descr_file = urlopen(descr_url)
  File "/usr/lib/python3.4/urllib/request.py", line 161, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.4/urllib/request.py", line 448, in open
    req = Request(fullurl, data)
  File "/usr/lib/python3.4/urllib/request.py", line 266, in __init__
    self.full_url = url
  File "/usr/lib/python3.4/urllib/request.py", line 292, in full_url
    self._parse()
  File "/usr/lib/python3.4/urllib/request.py", line 321, in _parse
    raise ValueError("unknown url type: %r" % self.full_url)
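
The same ValueError can be reproduced with the standard library alone. The sketch below assumes that the URL still contains an unreplaced placeholder, as in the value ``$repoen_US/dictionaries.xcu`` reported in the earlier issue about installing dictionaries; the helper ``has_usable_scheme`` is hypothetical and merely shows one way to detect the problem before opening the URL.

```python
from urllib.parse import urlparse
from urllib.request import Request


def has_usable_scheme(url):
    """Return True if the URL starts with a scheme that urlopen can handle."""
    return urlparse(url).scheme in ('http', 'https', 'ftp', 'file')


bad_url = '$repoen_US/dictionaries.xcu'  # value taken from the reported error
try:
    Request(bad_url)  # no network access: the URL is rejected while parsing
    error = None
except ValueError as exc:
    error = str(exc)

print(error)
print(has_usable_scheme(bad_url))
```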
