jaraco / inflect

Correctly generate plurals, ordinals, indefinite articles; convert numbers to words

Home Page: https://pypi.org/project/inflect

License: MIT License

Python 100.00%

inflect's People

ultranurd brendancol vinilios benthor yavarhusain baojie julosaure shaunpatterson pombredanne dashea hugovk ouseful-backup pydsigner aerickson sydanny andrewgnagy elinaldosoft igormq rodolphopivetta rooterkyberian steelheaven smil3yzzz nils-werner martianmartian sbfmd zqma2 afcarl gfetterman ag1le david-drinn tyrm ratna04priya takdavid ms77grz cp-singh casoetan ravikiransm chris-jones wallyritchie edcolvin hramezani sbraz priyansh2 dragofireup misterchalm22 bliep manikant92 dgilman chandanchainani theauk picobyte kingkonggrunt skylerberg jceg rogervaas mapleccc ajitkumar15 mcprice30 n00bie-to-github maxxx79 remusao silasary jaykay9999 zakari1231 lemarmol jvandermey george-gca alvistack anndaming candale tzachyrm interestsfantastic 4812571 khuyentran1401 esoj74 kimgerdes bmwasaru amanpreet692 davidmerwin arpitjain799 sugarplumchum73 mayhemheroes zmievsa paul2048 jennifer-richards abravalheri cindy777lee777 jbimat jensonk6 jhatler brunoscaglione nasirsikderprotul whenisjakob openculinary ogurets kloczek mertuygr

inflect's Issues

A/An exceptions

A/An should be assigned based on phonetics and not orthographic representation.
There should be either a short list of known exceptions or a phonetic transcription facility (with its own exceptions).
The former would be easier.

An herb
A herpes

http://owl.english.purdue.edu/owl/resource/591/01/
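
The "list of known exceptions" route could look roughly like the sketch below. It is not inflect's implementation: indefinite_article and both exception sets are hypothetical names, and the entries are only illustrative (treating "herb" as taking "an" follows the pronunciation assumed in this issue).

```python
# Hypothetical exception tables: the article follows the sound, not the spelling.
AN_EXCEPTIONS = {"herb", "hour", "honest", "heir"}      # silent "h"
A_EXCEPTIONS = {"unicorn", "university", "european"}    # leading "y" sound

VOWELS = ("a", "e", "i", "o", "u")


def indefinite_article(word):
    """Return "a" or "an" for word, consulting the exception tables first."""
    lower = word.lower()
    if lower in AN_EXCEPTIONS:
        return "an"
    if lower in A_EXCEPTIONS:
        return "a"
    # Fall back to the orthographic rule for everything else.
    return "an" if lower.startswith(VOWELS) else "a"


for noun in ("herb", "herpes", "apple", "unicorn"):
    print(indefinite_article(noun), noun)
```

A lookup table like this stays small, but every new exception has to be added by hand, which is the trade-off against a phonetic approach.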

In group*sub, one and zero are padded with a space but other numbers aren't

In group1sub, group1bsub, group3sub, allegedly one and zero are padded with a space but other numbers aren't. Is that true? What's the proper behavior?

Wrong plural for Company

inflect.engine().plural("Company") --> Companys
It needs to be changed to Companies.

Cactus plural is wrong

In [2]: e = inflect.engine()

In [3]: e.plural_noun("cactus")
Out[3]: 'cactuses'

Specific Phrase Returns a TypeError

The phrase "case of diapers" raises a TypeError in the singular_noun() method. Below is a code example with the error.

>>> import inflect
>>> p = inflect.engine()
>>> p.singular_noun("case of diapers")
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/var/www/.../site-packages/inflect.py", line 1755, in singular_noun
    sing = self._sinoun(word, count=count, gender=gender)
  File "/var/www/.../site-packages/inflect.py", line 2381, in _sinoun
    lowersplit[numword:] )
TypeError: sequence item 0: expected string, bool found
>>> p.singular_noun("STuff and Things")
'STuff and Thing'

I included another similar phrase to show that other phrases are working correctly.

UnicodeDecodeError on Python 3 with ASCII locale

vagrant@vagrant-ubuntu-trusty-64:~$ LANG=C python3 -m pip install --user inflect
Downloading/unpacking inflect
  Downloading inflect-0.2.4.tar.gz (91kB): 91kB downloaded
  Running setup.py (path:/tmp/pip_build_vagrant/inflect/setup.py) egg_info for package inflect
    Traceback (most recent call last):
      File "<string>", line 17, in <module>
      File "/tmp/pip_build_vagrant/inflect/setup.py", line 9, in <module>
        readme = open(readme_path).read()
      File "/usr/lib/python3.4/encodings/ascii.py", line 26, in decode
        return codecs.ascii_decode(input, self.errors)[0]
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 47692: ordinal not in range(128)
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
  File "<string>", line 17, in <module>
  File "/tmp/pip_build_vagrant/inflect/setup.py", line 9, in <module>
    readme = open(readme_path).read()
  File "/usr/lib/python3.4/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 47692: ordinal not in range(128)

----------------------------------------
Cleaning up...
Command python setup.py egg_info failed with error code 1 in /tmp/pip_build_vagrant/inflect
Storing debug log for failure in /home/vagrant/.pip/pip.log
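
The traceback shows setup.py reading the README with the locale's default encoding, which is ASCII under LANG=C. A minimal sketch of the usual remedy follows, assuming the README is UTF-8 encoded (the 0xc3 byte is consistent with that but does not prove it). read_text is a hypothetical helper name, and the sample file is created only for the demonstration.

```python
import io
import os
import tempfile


def read_text(path):
    """Read a text file as UTF-8 regardless of the active locale."""
    # Passing the encoding explicitly avoids the locale-dependent default.
    with io.open(path, encoding="utf-8") as handle:
        return handle.read()


# Demonstrate with a throwaway file that contains a non-ASCII character.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "README.rst")
    with io.open(sample_path, "w", encoding="utf-8") as handle:
        handle.write(u"\u00e5ngstrom")
    sample = read_text(sample_path)

print(len(sample))
```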

Shit program

This program is garbage. It's the dumbest fucking piece of trash code out there. It thinks words like bus are plural, and turns words like "used" into "useds". A fucking monkey can come up with better code than this fucking piece of trash.

Implement Jazzband guidelines for project inflect.py

This issue tracks the implementation of the Jazzband guidelines for the project inflect.py

It was initiated by @jaraco who was automatically assigned in addition to the Jazzband roadies.

See the TODO list below for the generally required tasks, but feel free to update it in case the project requires it.

Feel free to ping a Jazzband roadie if you have any questions.

TODOs

Fix all links in the docs (and README file etc) from old to new repo
Add the Jazzband badge to the README file
Add the Jazzband contributing guideline to the CONTRIBUTING.md file
Check if continuous testing works (e.g. Travis-CI, CircleCI, AppVeyor, etc)
Check if test coverage services work (e.g. Coveralls, Codecov, etc)
Add jazzband account to PyPI project as maintainer role (URL: https://pypi.python.org/pypi?:action=role_form&package_name=<PROJECTNAME>)
Add jazzband-bot as maintainer to the Read the Docs project (URL: https://readthedocs.org/dashboard/<PROJECTNAME>/users/)
Fix project URL in GitHub project description
Review project if other services are used and port them to Jazzband

Project details

Description	Correctly generate plurals, ordinals, indefinite articles; convert numbers to words
Homepage	http://pypi.python.org/pypi/inflect
Stargazers	0
Open issues	0
Forks	0
Default branch	master
Is a fork	True
Has Wiki	True
Has Pages	False

Conjugate (and make infinitive) verbs

Already does this for nouns. Could add for verbs.

Generate the singular of a verb from the plural:
e.g.
siverb('walk') -> 'walks'
(They walk -> it walks)

TODO: dresses', dresses's -> dresses, dresses

In _pl_check_plurals_adj, once the endings are chopped off, the two words are the same, so the check returns False. Need to fix this.

query of death

On version 2.1.0
This string:
"lens with a lens ()."
yields an exception.

> /lib/python3.6/site-packages/inflect.py in plural(self, text, count)
>    2239             self._pl_special_adjective(word, count)
>    2240             or self._pl_special_verb(word, count)
> -> 2241             or self._plnoun(word, count),
>    2242         )
>    2243         return "{}{}{}".format(pre, plural, post)
> 
> /lib/python3.6/site-packages/inflect.py in postprocess(self, orig, inflected)
>    2209                 continue
>    2210             if word.capitalize() == word:
> -> 2211                 result[index] = result[index].capitalize()
>    2212             if word == word.upper():
>    2213                 result[index] = result[index].upper()

IndexError: list index out of range

inconsistency in handling spaces

>>> p.number_to_words('11 11')
'one thousand, one hundred and eleven'
>>> p.ordinal(11)
'11th'
>>> p.ordinal('11 ')
'11 th'
>>> p.ordinal('11  ')
'11  th'
>>> p.ordinal('11 11')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ssara29/others/Abin/src/inflect/inflect.py", line 2797, in ordinal
    n = int(num)
ValueError: invalid literal for int() with base 10: '11 11'

While number_to_words is removing all intervening spaces, ordinal ain't doing the same.
Why the inconsistency? Why isn't number_to_words complaining about the space?
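
One way to make the two entry points agree would be to route their input through a single validator. The sketch below is an assumption about what consistent behavior might mean (tolerate surrounding whitespace, reject interior whitespace); parse_whole_number is a hypothetical helper, not part of inflect.

```python
def parse_whole_number(text):
    """Convert text to int, allowing surrounding but not interior whitespace."""
    cleaned = str(text).strip()
    if any(ch.isspace() for ch in cleaned):
        raise ValueError("whitespace inside number: %r" % (text,))
    return int(cleaned)


print(parse_whole_number("11 "))   # surrounding whitespace is dropped
try:
    parse_whole_number("11 11")
except ValueError as exc:
    print("rejected:", exc)
```

The opposite policy (silently removing interior spaces, as number_to_words appears to do above) would be equally consistent; the point is only that both functions share one rule.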

License problem

I want to use this package in a corporate setting and the AGPL license is a problem. Is there an MIT/BSD alternative to this package?

plural of Embassy is embassies

>>> plural.inflect("plural_noun('Embassy')")
'Embassys'

Applying plural to an already plural word yields an incorrect result

Running p.plural("women") results in "womens"

Checking whether word is plural or singular

Does inflect have the feature of checking whether a word is plural or singular?

Get rid of nose

I think it would be nice to get rid of all the nose references since it seems to be quite dead now.

It can probably be easily removed by replacing eq_ with assertEqual and so on.
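
A minimal sketch of two migration routes, assuming the tests only rely on eq_: either rewrite each check as assertEqual inside a unittest.TestCase, or keep the call sites and supply a local stand-in for nose.tools.eq_. The test case shown is a placeholder, not one of inflect's tests.

```python
import unittest


def eq_(actual, expected, msg=None):
    """Local stand-in for nose.tools.eq_, so existing call sites keep working."""
    assert actual == expected, msg or "%r != %r" % (actual, expected)


class PlaceholderTest(unittest.TestCase):
    def test_with_unittest(self):
        # Before: eq_("cat" + "s", "cats")
        self.assertEqual("cat" + "s", "cats")

    def test_with_shim(self):
        eq_("cat" + "s", "cats")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(PlaceholderTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```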

support for Python3

Does the inflect 0.2.4 Python package (which works with Python 2) also work with Python 3?

Capitalizing plural acronyms

If you feed an all-caps word to plural(), the post-processor will assume that you want the plural form in all-caps as well. This is not the correct behavior for all-caps acronyms like "MRI" or "GMO".
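
One possible post-processing rule, sketched under the assumption that the plural merely appends a suffix to an all-caps input: keep the original letters and add the suffix in lower case. restore_case is a hypothetical helper, not inflect's post-processor, and the rule would also turn a shouted "BOX" into "BOXes", so it is a trade-off rather than a complete fix.

```python
def restore_case(original, inflected):
    """Re-apply the casing of original to its lower-case inflected form."""
    lower = original.lower()
    if original.isupper():
        if inflected.lower().startswith(lower):
            # All-caps input: keep it intact and append the suffix in lower case.
            return original + inflected[len(lower):].lower()
        # The stem itself changed, so fall back to upper-casing everything.
        return inflected.upper()
    if original[:1].isupper():
        return inflected[:1].upper() + inflected[1:]
    return inflected


print(restore_case("MRI", "mris"))
print(restore_case("Entry", "entries"))
```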

anything ending with ss comes out with one s with singular_noun

p = inflect.engine()
p.singular_noun('dress')
'dres'
p.singular_noun('address')
'addres'

time difference inflected as before/after

I'd like to see a feature like the following:

>>> p.date_diff(datetime.timedelta(minutes=-30))
'00:30:00 before'
>>> p.date_diff(5)
'00:00:05 after'
>>> p.date_diff(0)
'at'

I would expect one to use the result to produce nice output for a time difference like:

Your flight departed 00:00:05 after its scheduled departure.

It would be nice if it accepted a time formatter such that it could alternately emit:

>>> p.date_diff(datetime.timedelta(minutes=-30), fmt=nice_time)
'30 minutes before'
>>> p.date_diff(5, fmt=nice_time)
'five seconds after'
>>> p.date_diff(0, fmt=nice_time)
'at'

Perhaps that inflection is too trivial for a library like inflect. Or perhaps there are nuances I haven't yet considered.
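
A standalone sketch of the proposed behavior, written as a plain function rather than an engine method. date_diff and fmt come from the proposal above; clock is a hypothetical default formatter added here because str() on a timedelta does not zero-pad the hours.

```python
import datetime


def clock(delta):
    """Format a non-negative timedelta as zero-padded HH:MM:SS."""
    total = int(delta.total_seconds())
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return "%02d:%02d:%02d" % (hours, minutes, seconds)


def date_diff(delta, fmt=clock):
    """Describe delta (a timedelta or a number of seconds) as before/after/at."""
    if not isinstance(delta, datetime.timedelta):
        delta = datetime.timedelta(seconds=delta)
    zero = datetime.timedelta(0)
    if delta == zero:
        return "at"
    direction = "before" if delta < zero else "after"
    # abs() strips the sign so the formatter only ever sees a magnitude.
    return "%s %s" % (fmt(abs(delta)), direction)


print(date_diff(datetime.timedelta(minutes=-30)))
print(date_diff(5))
print(date_diff(0))
```

Any callable that accepts a timedelta can be passed as fmt, which is where a "30 minutes" style formatter would plug in.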

Plural of a plural returns a singular

This:

p.plural('horses') => 'horse'

Should return False like the converse:

p.singular_noun('horse') => False

plural_noun downcases phrases of 3 or more words

$ python3 -c 'import inflect; print(inflect.engine().plural_noun("can of Coke"))'
cans of coke

I believe the guilty bit is around here:

https://github.com/pwdyson/inflect.py/blob/73ac260/inflect.py#L1891

The code should be carving out of the original input instead of the downcased version lowerword.
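
A sketch of the suggested approach: match on a lower-cased copy, but build the result from the original words. This is not the code at the linked line; pluralize_head, PREPOSITIONS, and the pluralize callable are hypothetical, and the stand-in pluralizer below just appends "s".

```python
PREPOSITIONS = {"of", "per", "in", "at", "to"}  # illustrative, not exhaustive


def pluralize_head(phrase, pluralize):
    """Pluralize the word before the first preposition, keeping original case."""
    words = phrase.split(" ")
    for index, word in enumerate(words):
        # Compare in lower case, but never write the lower-cased text back.
        if index > 0 and word.lower() in PREPOSITIONS:
            head = pluralize(words[index - 1])
            return " ".join(words[: index - 1] + [head] + words[index:])
    return pluralize(phrase)


def add_s(word):
    """Stand-in pluralizer for the demonstration only."""
    return word + "s"


print(pluralize_head("can of Coke", add_s))
```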

Reversed Present participles

Eg:
running -> runs
killed -> kills

How can I do this?

Singular words ending in s having their s removed by singular_noun

I think this is a variation on #46. A brief code sample:
for i in ['pancreas', 'mitosis', 'sepsis', 'assess', 'access', 'status', 'photolysis', 'actress']:
    print(p.singular_noun(i))

Kudos by the way for getting "status" correct - all of the other libraries I've worked with turn that into "statu", but the rest of these have 1 trailing s removed. Particularly amusing to me is assess - the removal of the trailing s yields a legitimate plural of a different root word :)

Anyway - the best idea I've had for how to address this kind of problem, and I think something I can implement locally, is to check the word after singularization via a dictionary lookup, to see if it's a word (and preferably a singular word). So "access" becoming "acces" would fail, and thus would be returned to "access". I think that logic would work for many of the other common examples (such as "dress" becoming "dres") over in #46.

For the rather large collection of words from the science realm that are singular and end in a single s, I'm guessing that most dictionaries aren't going to have them, so this lookup idea will fail. But it's at least something to try.

For my own use case, presuming that the input is plural to the singular_noun function is a very strong assumption, and may make the function unusable for my purposes. I've got millions of words that I'm "stemming" by finding their singular form and storing them that way. A variant of the function that takes a word and returns the singular form, whether it starts out as plural or singular, would be far more useful. Returning False for words that aren't themselves plural probably enables me to develop my own function around singular_noun (it all depends on how consistently I get a correct reading on the plural state of the noun).
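
The dictionary-lookup idea described above could be sketched as follows. safe_singular, strip_s, and VOCABULARY are hypothetical stand-ins; in practice the singularize argument might be an engine's singular_noun method (which, per this thread, returns False for words it treats as already singular) and the vocabulary a real word list.

```python
def safe_singular(word, singularize, vocabulary):
    """Singularize word, but keep it unchanged if the result is not a known word."""
    candidate = singularize(word)
    # singularize may return False for input it considers already singular.
    if candidate is False or candidate.lower() not in vocabulary:
        return word
    return candidate


def strip_s(word):
    """Stand-in singularizer: drops one trailing "s", else returns False."""
    return word[:-1] if word.endswith("s") else False


VOCABULARY = {"access", "dress", "status", "cat"}  # placeholder word list

for item in ("cats", "access", "dress", "cat"):
    print(item, "->", safe_singular(item, strip_s, VOCABULARY))
```

As noted above, a case like "assess" would still slip through a real dictionary, because the truncated form is itself a valid word.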

Inconsistencies while using inflect method

The inflect method's behavior is inconsistent with that of the respective subroutines in the following cases:

>>> p.number_to_words((10, 20))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/vamc/Code/inflect.py/inflect.py", line 2991, in number_to_words
    num = '%s' % num
TypeError: not all arguments converted during string formatting

>>> p.inflect("number_to_words((10, 20))")
'one thousand and twenty'

>>> p.singular_noun("bat")
False

>>> p.inflect("singular_noun(bat)")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/vamc/Code/inflect.py/inflect.py", line 1497, in inflect
    self.sinounmo, section)
  File "/home/vamc/.virtualenvs/test/lib/python3.5/re.py", line 193, in subn
    return _compile(pattern, flags).subn(repl, string, count)
TypeError: sequence item 0: expected str instance, bool found

>>> p.present_participle("bats", 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: present_participle() takes 2 positional arguments but 3 were given

>>> p.inflect("present_participle(bats, 1)")
'bats, 1ing'

Inflect should probably raise the same errors as the subroutines.

Plural of shoe comes out as sho

>>> import inflect
>>> inflect.__version__
'0.2.5'
>>> p = inflect.engine()
>>> p.plural("shoe")
'sho'

Incorrect singular noun for "address" word

inflect 0.2.5 converts "address" to singular noun incorrectly:

import inflect
engine = inflect.engine()
engine.singular_noun("address")

outputs addres instead of address.

getlist should be wantlist in documentation

Your documentation has wantlist in one place and getlist in another. wantlist works and getlist does not.

I think that

words = p.number_to_words(1234, getlist=True)

should be

words = p.number_to_words(1234, wantlist=True)

inflect method inserts extra chars with tuples and sets

The code at the bottom should produce:

eggs eggs
eggs eggs
eggs eggs

... but instead produces:

eggs eggs
eggs eggs)
eggs eggs)

import inflect
p = inflect.engine()

x = ['a', 'b', 'c']
y = ('a', 'b', 'c')
z = set(['a', 'b', 'c'])

print p.plural('egg', x), p.inflect('plural(egg,{0})'.format(x))
print p.plural('egg', y), p.inflect('plural(egg,{0})'.format(y))
print p.plural('egg', z), p.inflect('plural(egg,{0})'.format(z))

Plural issues?

Some are wrong, I think:

>>> p.plural('corpus')  # corpora
'corpuses'
>>> p.plural('means')  # means
'mean'
>>> p.plural('vita')   # vitae
'vitas'
>>> p.plural('backhoe')
'backhoes'
>>> p.plural('hoe')  # hoes
'ho'
>>> p.plural('ho')   # hos
'h'

These are technically correct, but unusual (and not a mistake like the octopus plural):

>>> p.plural('radius')  # radii, radiuses
'radiuses'
>>> p.plural('curriculum')  # curricula, curriculums
'curriculums'
>>> p.plural('medium')  # mediums, media
'mediums'
>>> p.plural('appendix')  # appendixes, appendices
'appendixes'
# also index

Not sure how it should work for uncountables:

>>> p.plural('cattle')  # cattle
'cattles'

Maintainer?

Hi @agronholm, @pwdyson, @benthor, I was wondering if there's a current maintainer for the library.

plural() does not handle unicode strings in python 2.7

When calling

>>> inflect.engine().plural('ångstrom')
'\xc3\xa5ngstroms'
>>> from __future__ import unicode_literals
>>> inflect.engine().plural('ångstrom')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/niels/.local/share/virtualenvs/test2-MMy0r1Qw/lib/python2.7/site-packages/inflect.py", line 1602, in plural
    return "{}{}{}".format(pre, plural, post)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe5' in position 0: ordinal not in range(128)
>>>

The same lines work perfectly fine in python 3

Undocumented `False` for `singular_noun` if word is already singular

>>> word = 'woman'
>>> word = p.plural_noun(word)
>>> word
'women'
>>> word = p.plural_noun(word)
>>> word
'womens'

Singularizing 'police' results in 'polouse'

AFAIK, there is no singular form of the word "police", so it should return False in this case.
I guess there is a rule somewhere that replaces "ice" with "ouse", as in the "mice" and "mouse" case.

Source: https://ell.stackexchange.com/questions/4995/when-to-treat-police-as-a-singular-noun-and-a-plural-noun

Inflect 3.0.1 installs the tests directory in site-packages

I upgraded to 3.0.1 and immediately noticed I could no longer import my local tests directory. It seems inflect 3.0.1 installs its tests directory into site-packages.

>>> import tests
>>> sys.modules['tests']
<module 'tests' from '<python_install_dir>/lib/python3.7/site-packages/tests/__init__.py'>

ls <python_install_dir>/lib/python3.7/site-packages/tests/__init__.py
__init__.py  inflections.txt  test_classical_all.py      test_classical_herd.py   test_classical_person.py  test_compounds.py    test_join.py      test_pl_si.py  test_unicode.py
__pycache__  test_an.py       test_classical_ancient.py  test_classical_names.py  test_classical_zero.py    test_inflections.py  test_numwords.py  test_pwd.py    words.txt

When I roll back to 2.0.1 this tests directory disappears. I suppose this is unintentional behaviour?

Releases missing on PyPI?

The most recent release on PyPI is 1.0.1, but the repo shows tags for 1.0.2, 2.0.0, 2.0.1, and 2.1.0.

Helper for number & noun together

A common use case for this package looks to be something like this:

# Existing:
print(f"{n} {p.plural('violation', n)} found")
-> "0 violations found"
-> "1 violation found"
-> "2 violations found"

It would be handy to have a method that outputs the quantified noun for you, to minimize repeating the quantity (e.g. if you're calculating it, you wouldn't have to assign to an intermediate variable):

# Proposed:
print(f"{p.quantify(n, 'violation')} found")
-> "0 violations found"
-> "1 violation found"
-> "2 violations found"

I'm coming to this package new, apologies if there's already a combined method like this.
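
A minimal sketch of the proposed helper as a free function. quantify is the name proposed above; the pluralize parameter and naive_plural are hypothetical stand-ins for a count-aware pluralizer such as the p.plural(word, count) call shown in the existing example.

```python
def quantify(count, noun, pluralize):
    """Return "<count> <noun>" with the noun inflected to agree with count."""
    return "%s %s" % (count, pluralize(noun, count))


def naive_plural(word, count):
    """Stand-in for a count-aware pluralizer; appends "s" unless count is 1."""
    return word if count == 1 else word + "s"


for n in (0, 1, 2):
    print(quantify(n, "violation", naive_plural), "found")
```

Because count is passed once, a computed quantity needs no intermediate variable at the call site.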

Allow choice of gender for sinoun

In making the singular from the plural, sinoun always gives the neuter answer:

>>> p = inflect.engine()
>>> p.sinoun('they')
'it'

Could add a gender setting to return either the masculine or feminine form

e.g.

>>> p.gender('female')
>>> p.sinoun('they')
'she'

test

this is a test of github's issue tracking

case should not affect plurals

Entry becomes Entrys but entry becomes entries.

Mentioned in #4.

Singular noun of specimen incorrectly converted to speciman

An additional exception to the -men --> -man rule

Units aren't correctly pluralized

degree celsius becomes degree celsiuses but should become degrees celsius
metre per second becomes metre per seconds but should become metres per second

>>> import inflect
>>> p = inflect.engine()
>>> p.plural('degree celsius')
u'degree celsiuses'
>>> p.plural('degree fahrenheit')
u'degree fahrenheits'
>>> p.plural('metre per second')
u'metre per seconds'
>>> p.plural('kilogram a year')
u'kilogram a years'
>>> p.plural('dollar per minute')
u'dollar per minutes'

Difficulties Installing on Windows + Anaconda

There appears to be no win32 version available for Anaconda, which makes it difficult to install in individual environments.

typo: enconium -> encomium

The pl_sb_C_um_a_list includes the word "enconium" which is a misspelling of "encomium"

Add number to date

Could a function number_to_date be added?

For example, would it be possible for 1974

to return

Nineteen seventy four

2004 would be

Two thousand and four
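
A self-contained sketch of the requested conversion for four-digit years. year_to_words and its helper are hypothetical and do not call inflect; the "oh five" and "hundred" branches are assumptions about years the request does not cover, and the output is lower case.

```python
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def two_digits(n):
    """Spell out an integer in the range 0..99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else "%s %s" % (TENS[tens], ONES[ones])


def year_to_words(year):
    """Spell out a year between 1000 and 9999 the way it is usually read aloud."""
    if not 1000 <= year <= 9999:
        raise ValueError("only four-digit years are supported")
    century, rest = divmod(year, 100)
    if century % 10 == 0:
        # 2000..2099 and similar: "two thousand and four".
        head = "%s thousand" % ONES[century // 10]
        return head if rest == 0 else "%s and %s" % (head, two_digits(rest))
    if rest == 0:
        return "%s hundred" % two_digits(century)
    if rest < 10:
        return "%s oh %s" % (two_digits(century), ONES[rest])
    return "%s %s" % (two_digits(century), two_digits(rest))


print(year_to_words(1974))
print(year_to_words(2004))
```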

`chocolate chip cookies` become `chocolate chip cooky`

I'm attempting to get the singular form of some food-related words and phrases. I tried a couple popular phrases and came across this bug:

import inflect
p = inflect.engine()
p.singular_noun('cookies')
# 'cookie'

p.singular_noun('chocolate chip cookies')
# 'chocolate chip cooky'

I believe I'm using the library correctly, so I would expect to see chocolate chip cookie instead of chocolate chip cooky.

.inflectrc file does not work

From a code comment: Can't just execute methods from another file like this

    for rcfile in (pathjoin(dirname(__file__), '.inflectrc'),
                  expanduser(pathjoin(('~'), '.inflectrc'))):
       if isfile(rcfile):
           try:
               execfile(rcfile)
           except:
               print3("\nBad .inflectrc file (%s):\n" % rcfile)
               raise BadRcFileError

Cannot install with old setuptools

Windows 7
Python 2.7.8

C:\stufftodelete>pip install -e git+https://github.com/pwdyson/inflect.py#egg=inflect
Obtaining inflect from git+https://github.com/pwdyson/inflect.py#egg=inflect
  Cloning https://github.com/pwdyson/inflect.py to c:\stufftodelete\src\inflect
  Running setup.py (path:C:\stufftodelete\src\inflect\setup.py) egg_info for package inflect
    usage: -c [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
       or: -c --help [cmd1 cmd2 ...]
       or: -c --help-commands
       or: -c cmd --help

    error: invalid command 'egg_info'
    Complete output from command python setup.py egg_info:
    usage: -c [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
   or: -c --help [cmd1 cmd2 ...]
   or: -c --help-commands
   or: -c cmd --help

error: invalid command 'egg_info'

----------------------------------------
Cleaning up...
Command python setup.py egg_info failed with error code 1 in C:\stufftodelete\src\inflect

But then:

C:\stufftodelete>pip install --upgrade setuptools
Downloading/unpacking setuptools from https://pypi.python.org/packages/3.4/s/setuptools/setuptools-7.0-py2.py3-none-any.whl#md5=918e7e5ea108507e1ffbbdfccc3496b1
Installing collected packages: setuptools
  Found existing installation: setuptools 0.6c11
    Uninstalling setuptools:
      Successfully uninstalled setuptools
Successfully installed setuptools
Cleaning up...

And so:

C:\stufftodelete>pip install -e git+https://github.com/pwdyson/inflect.py#egg=inflect
Obtaining inflect from git+https://github.com/pwdyson/inflect.py#egg=inflect
  Updating c:\stufftodelete\src\inflect clone
  Running setup.py (path:C:\stufftodelete\src\inflect\setup.py) egg_info for package inflect

  Installing extra requirements: 'egg'
Installing collected packages: inflect
  Running setup.py develop for inflect

    Creating c:\python27\lib\site-packages\inflect.egg-link (link to .)
    Adding inflect 0.2.5pre1 to easy-install.pth file

    Installed c:\stufftodelete\src\inflect
Successfully installed inflect
Cleaning up...

Can the dependency on a certain setuptools version be declared in the setup, or must it already be in place? If it cannot be declared, can it be mentioned in the README installation instructions?

PyPI release is stale

0.2.4 is released on PyPI with known, fixed bugs. Can you prep a 0.2.5 release? Or add me as a maintainer (committer here and maintainer on PyPI) and I'll cut a release.