jaraco / inflect
Correctly generate plurals, ordinals, indefinite articles; convert numbers to words
Home Page: https://pypi.org/project/inflect
License: MIT License
A/An should be assigned based on phonetics, not orthographic representation.
There should be either a short list of known exceptions or a phonetic transcription facility (with its own exceptions).
The former would be easier.
An herb
A herpes
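The exception-list approach could be sketched as follows. This is a minimal illustration, not inflect's implementation: the function name and both word lists are hypothetical, the lists are deliberately tiny, and American pronunciation is assumed (hence "an herb").

```python
# Hypothetical helper; not part of inflect's API.
# Words whose first sound disagrees with their first letter
# (illustrative only, American pronunciation assumed).
VOWEL_SOUND_EXCEPTIONS = {"herb", "hour", "honest", "heir"}
CONSONANT_SOUND_EXCEPTIONS = {"unicorn", "university", "one", "european"}


def indefinite_article(word):
    """Return 'a' or 'an' for word, consulting the exception lists first."""
    key = word.lower()
    if key in VOWEL_SOUND_EXCEPTIONS:
        return "an"
    if key in CONSONANT_SOUND_EXCEPTIONS:
        return "a"
    # Fall back to the spelling rule for everything else.
    return "an" if key.startswith(tuple("aeiou")) else "a"
```

With these lists, "herb" gets "an" and "herpes" falls through to the spelling rule and gets "a".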
In group1sub, group1bsub, group3sub, allegedly one and zero are padded with a space but other numbers aren't. Is that true? What's the proper behavior?
inflect.engine().plural("Company") --> Companys
This should be changed to Companies.
In [2]: e = inflect.engine()
In [3]: e.plural_noun("cactus")
Out[3]: 'cactuses'
The phrase "case of diapers" raises a TypeError in the singular_noun() method. Below is a code example with the error.
>>> import inflect
>>> p = inflect.engine()
>>> p.singular_noun("case of diapers")
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "/var/www/.../site-packages/inflect.py", line 1755, in singular_noun
sing = self._sinoun(word, count=count, gender=gender)
File "/var/www/.../site-packages/inflect.py", line 2381, in _sinoun
lowersplit[numword:] )
TypeError: sequence item 0: expected string, bool found
>>> p.singular_noun("STuff and Things")
'STuff and Thing'
I included another similar phrase to show that other phrases are working correctly.
vagrant@vagrant-ubuntu-trusty-64:~$ LANG=C python3 -m pip install --user inflect
Downloading/unpacking inflect
Downloading inflect-0.2.4.tar.gz (91kB): 91kB downloaded
Running setup.py (path:/tmp/pip_build_vagrant/inflect/setup.py) egg_info for package inflect
Traceback (most recent call last):
File "<string>", line 17, in <module>
File "/tmp/pip_build_vagrant/inflect/setup.py", line 9, in <module>
readme = open(readme_path).read()
File "/usr/lib/python3.4/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 47692: ordinal not in range(128)
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 17, in <module>
File "/tmp/pip_build_vagrant/inflect/setup.py", line 9, in <module>
readme = open(readme_path).read()
File "/usr/lib/python3.4/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 47692: ordinal not in range(128)
----------------------------------------
Cleaning up...
Command python setup.py egg_info failed with error code 1 in /tmp/pip_build_vagrant/inflect
Storing debug log for failure in /home/vagrant/.pip/pip.log
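The traceback above shows setup.py reading the README with the locale's default codec (ASCII under LANG=C). One common way around this, sketched here with a hypothetical helper name, is to open the file with an explicit encoding. It assumes the README is UTF-8, which the 0xc3 byte suggests but does not prove.

```python
import io


def read_text(path):
    """Read a text file as UTF-8 regardless of the process locale."""
    # io.open accepts `encoding` on both Python 2 and Python 3.
    with io.open(path, encoding="utf-8") as handle:
        return handle.read()
```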
This program is garbage. It's the dumbest fucking piece of trash code out there. It thinks words like bus are plural, and turns words like "used" into "useds". A fucking monkey can come up with better code than this fucking piece of trash.
This issue tracks the implementation of the Jazzband guidelines for the project inflect.py
It was initiated by @jaraco who was automatically assigned in addition to the Jazzband roadies.
See the TODO list below for the generally required tasks, but feel free to update it in case the project requires it.
Feel free to ping a Jazzband roadie if you have any question.
README file
CONTRIBUTING.md file
jazzband account to PyPI project as maintainer role (URL: https://pypi.python.org/pypi?:action=role_form&package_name=<PROJECTNAME>)
jazzband-bot as maintainer to the Read the Docs project (URL: https://readthedocs.org/dashboard/<PROJECTNAME>/users/)
Description | Correctly generate plurals, ordinals, indefinite articles; convert numbers to words |
Homepage | http://pypi.python.org/pypi/inflect |
Stargazers | 0 |
Open issues | 0 |
Forks | 0 |
Default branch | master |
Is a fork | True |
Has Wiki | True |
Has Pages | False |
Already does this for nouns. Could add for verbs.
Generate the singular of a verb from the plural:
e.g.
siverb('walk') -> 'walks'
(They walk -> it walks)
In _pl_check_plurals_adj, when chop off then they return False because they are the same. Need to fix this.
On version 2.1.0
This string:
"lens with a lens ()."
yields an exception.
> /lib/python3.6/site-packages/inflect.py in plural(self, text, count)
> 2239 self._pl_special_adjective(word, count)
> 2240 or self._pl_special_verb(word, count)
> -> 2241 or self._plnoun(word, count),
> 2242 )
> 2243 return "{}{}{}".format(pre, plural, post)
>
> /lib/python3.6/site-packages/inflect.py in postprocess(self, orig, inflected)
> 2209 continue
> 2210 if word.capitalize() == word:
> -> 2211 result[index] = result[index].capitalize()
> 2212 if word == word.upper():
> 2213 result[index] = result[index].upper()
IndexError: list index out of range
>>> p.number_to_words('11 11')
'one thousand, one hundred and eleven'
>>> p.ordinal(11)
'11th'
>>> p.ordinal('11 ')
'11 th'
>>> p.ordinal('11 ')
'11 th'
>>> p.ordinal('11 11')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ssara29/others/Abin/src/inflect/inflect.py", line 2797, in ordinal
n = int(num)
ValueError: invalid literal for int() with base 10: '11 11'
While number_to_words is removing all intervening spaces, ordinal ain't doing the same.
Why the inconsistency? Why isn't number_to_words complaining about the space?
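One way to make the whitespace handling explicit is sketched below. It is a standalone illustration, not inflect's ordinal: it strips surrounding whitespace and rejects whitespace inside the digits. Which of the two policies (strip everything or reject) the library should adopt is not settled in this report; the sketch simply picks the stricter one.

```python
def ordinal(num):
    """Return num with its English ordinal suffix, e.g. 11 -> '11th'.

    Surrounding whitespace is stripped; whitespace inside the digits
    raises ValueError because int() rejects it.
    """
    n = int(str(num).strip())
    if 10 <= abs(n) % 100 <= 20:
        suffix = "th"  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(abs(n) % 10, "th")
    return "{}{}".format(n, suffix)
```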
I want to use this package in a corporate setting and the AGPL license is a problem. Is there an MIT/BSD alternative to this package?
>>> plural.inflect("plural_noun('Embassy')")
'Embassys'
Running p.plural("women") results in "womens"
Does inflect have the feature of checking whether a word is plural or singular?
I think it would be nice to get rid of all the nose references since it seems to be quite dead now.
It can probably be easily removed by replacing eq_ with assertEqual and so on.
Does the inflect 0.2.4 Python package (which works with Python 2) also work with Python 3?
If you feed an all-caps word to plural(), the post-processor will assume that you want the plural form in all-caps as well. This is not the correct behavior for all-caps acronyms like "MRI" or "GMO".
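A possible workaround on the caller's side is sketched below. The wrapper name and the ACRONYMS set are hypothetical; since an acronym cannot be told apart from a shouted word by case alone, the sketch assumes the caller supplies a list of known acronyms. The pluralize argument stands in for whatever pluralizer is in use, such as inflect.engine().plural.

```python
# Hypothetical list of acronyms that should keep a lowercase plural suffix.
ACRONYMS = {"MRI", "GMO"}


def plural_keep_acronym(word, pluralize):
    """Pluralize word, appending a lowercase 's' to known acronyms."""
    if word in ACRONYMS:
        return word + "s"
    # Anything else goes through the normal pluralizer unchanged.
    return pluralize(word)
```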
p = inflect.engine()
p.singular_noun('dress')
'dres'
p.singular_noun('address')
'addres'
I'd like to see a feature like the following:
>>> p.date_diff(datetime.timedelta(minutes=-30))
'00:30:00 before'
>>> p.date_diff(5)
'00:00:05 after'
>>> p.date_diff(0)
'at'
I would expect one to use the result to produce nice output for a time difference like:
Your flight departed 00:00:05 after its scheduled departure.
It would be nice if it accepted a time formatter such that it could alternately emit:
>>> p.date_diff(datetime.timedelta(minutes=-30), fmt=nice_time)
'30 minutes before'
>>> p.date_diff(5, fmt=nice_time)
'five seconds after'
>>> p.date_diff(0, fmt=nice_time)
'at'
Perhaps that inflection is too trivial for a library like inflect. Or perhaps there are nuances I haven't yet considered.
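A rough standalone sketch of the proposed behavior follows. date_diff is not an existing inflect method; the function below, its fmt parameter, and the clock_format helper are all written for illustration under the semantics described above (negative means before, positive means after, zero means at).

```python
import datetime


def clock_format(delta):
    """Format a non-negative timedelta as HH:MM:SS."""
    total = int(delta.total_seconds())
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return "{:02d}:{:02d}:{:02d}".format(hours, minutes, seconds)


def date_diff(diff, fmt=clock_format):
    """Describe diff (a timedelta or a number of seconds) relative to a reference time."""
    if not isinstance(diff, datetime.timedelta):
        diff = datetime.timedelta(seconds=diff)
    if not diff:
        return "at"
    direction = "before" if diff < datetime.timedelta(0) else "after"
    # fmt always receives the absolute difference.
    return "{} {}".format(fmt(abs(diff)), direction)
```

A nice_time formatter as suggested above could be passed as fmt; it would receive the absolute timedelta and return a phrase such as "30 minutes".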
This:
p.plural('horses') => 'horse'
Should return False like the converse:
p.singular_noun('horse') => False
$ python3 -c 'import inflect; print(inflect.engine().plural_noun("can of Coke"))'
cans of coke
I believe the guilty bit is around here:
https://github.com/pwdyson/inflect.py/blob/73ac260/inflect.py#L1891
The code should be carving out of the original input instead of the downcased version lowerword.
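The idea of slicing the original input rather than a lowercased copy can be sketched like this. It is an illustration only, not the code at the linked line: the function name is hypothetical, only the " of " connector is handled, and pluralize stands in for the real pluralizer.

```python
import re


def plural_head_noun(phrase, pluralize):
    """Pluralize the part before ' of ' and leave the rest of phrase untouched."""
    # Match case-insensitively against the original text so that the
    # slice indices always refer to the original string.
    match = re.search(r"\s+of\s+", phrase, flags=re.IGNORECASE)
    if match is None:
        return pluralize(phrase)
    head, rest = phrase[:match.start()], phrase[match.start():]
    return pluralize(head) + rest
```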
Eg:
running -> runs
killed -> kills
How can I do this?
I think this is a variation on #46. A brief code sample:
for i in ['pancreas', 'mitosis', 'sepsis', 'assess', 'access', 'status', 'photolysis', 'actress']:
print(p.singular_noun(i))
Kudos by the way for getting "status" correct - all of the other libraries I've worked with turn that into "statu", but the rest of these have 1 trailing s removed. Particularly amusing to me is assess - the removal of the trailing s yields a legitimate plural of a different root word :)
Anyway - the best idea I've had for how to address this kind of problem, and I think something I can implement locally, is to check the word after singularization via a dictionary lookup, to see if it's a word (and preferably a singular word). So "access" becoming "acces" would fail, and thus would be returned to "access". I think that logic would work for many of the other common examples (such as "dress" becoming "dres") over in #46.
For the rather large collection of words from the science realm that are singular and end in a single s, I'm guessing that most dictionaries aren't going to have them, so this lookup idea will fail. But it's at least something to try.
For my own use case, presuming that the input is plural to the singular_noun function is a very strong assumption, and may make the function unusable for my purposes. I've got millions of words that I'm "stemming" by finding their singular form and storing them that way. A variant of the function that takes a word and returns the singular form, whether it starts out as plural or singular, would be far more useful. Returning False for words that aren't themselves plural probably enables me to develop my own function around singular_noun (it all depends on how consistently I get a correct reading on the plural state of the noun).
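The dictionary-lookup idea described above might look roughly like this. Everything here is a sketch: the function name is hypothetical, vocabulary is any set of known lowercase words supplied by the caller, and singularize is assumed to behave like singular_noun in returning False when it thinks the word is not plural.

```python
def checked_singular(word, singularize, vocabulary):
    """Return the singular of word, or word itself if the candidate looks wrong.

    The candidate is accepted only when it appears in vocabulary, so
    'access' -> 'acces' is rejected and 'access' is returned unchanged.
    """
    candidate = singularize(word)
    if candidate and candidate.lower() in vocabulary:
        return candidate
    # Either singularize() returned False or it produced a non-word.
    return word
```

As noted above, this fails safe but not correct for scientific terms missing from the vocabulary: a true plural whose singular is not in the set would also be returned unchanged.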
The inflect method's behavior is inconsistent with that of the respective subroutines in the following cases:
>>> p.number_to_words((10, 20))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/vamc/Code/inflect.py/inflect.py", line 2991, in number_to_words
num = '%s' % num
TypeError: not all arguments converted during string formatting
>>> p.inflect("number_to_words((10, 20))")
'one thousand and twenty'
>>> p.singular_noun("bat")
False
>>> p.inflect("singular_noun(bat)")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/vamc/Code/inflect.py/inflect.py", line 1497, in inflect
self.sinounmo, section)
File "/home/vamc/.virtualenvs/test/lib/python3.5/re.py", line 193, in subn
return _compile(pattern, flags).subn(repl, string, count)
TypeError: sequence item 0: expected str instance, bool found
>>> p.present_participle("bats", 1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: present_participle() takes 2 positional arguments but 3 were given
>>> p.inflect("present_participle(bats, 1)")
'bats, 1ing'
Inflect should probably raise the same errors as the subroutines.
>>> import inflect
>>> inflect.__version__
'0.2.5'
>>> p = inflect.engine()
>>> p.plural("shoe")
'sho'
inflect 0.2.5 converts "address" to singular noun incorrectly:
import inflect
engine = inflect.engine()
engine.singular_noun("address")
outputs addres instead of address.
Your documentation has wantlist in one place and getlist in another. wantlist works and getlist does not.
I think that
words = p.number_to_words(1234, getlist=True)
should be
words = p.number_to_words(1234, wantlist=True)
The code at the bottom should produce:
eggs eggs
eggs eggs
eggs eggs
... but instead produces:
eggs eggs
eggs eggs)
eggs eggs)
import inflect
p = inflect.engine()
x = ['a', 'b', 'c']
y = ('a', 'b', 'c')
z = set(['a', 'b', 'c'])
print p.plural('egg', x), p.inflect('plural(egg,{0})'.format(x))
print p.plural('egg', y), p.inflect('plural(egg,{0})'.format(y))
print p.plural('egg', z), p.inflect('plural(egg,{0})'.format(z))
Some are wrong, I think:
>>> p.plural('corpus') # corpora
'corpuses'
>>> p.plural('means') # means
'mean'
>>> p.plural('vita') # vitae
'vitas'
>>> p.plural('backhoe')
'backhoes'
>>> p.plural('hoe') # hoes
'ho'
>>> p.plural('ho') # hos
'h'
These are technically correct, but unusual (and not a mistake like the octopus plural):
>>> p.plural('radius') # radii, radiuses
'radiuses'
>>> p.plural('curriculum') # curricula, curriculums
'curriculums'
>>> p.plural('medium') # mediums, media
'mediums'
>>> p.plural('appendix') # appendixes, appendices
'appendixes'
# also index
Not sure how it should work for uncountables:
>>> p.plural('cattle') # cattle
'cattles'
Hi @agronholm, @pwdyson, @benthor, I was wondering if there's a current maintainer for the library.
When calling
>>> inflect.engine().plural('ångstrom')
'\xc3\xa5ngstroms'
>>> from __future__ import unicode_literals
>>> inflect.engine().plural('ångstrom')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/niels/.local/share/virtualenvs/test2-MMy0r1Qw/lib/python2.7/site-packages/inflect.py", line 1602, in plural
return "{}{}{}".format(pre, plural, post)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe5' in position 0: ordinal not in range(128)
>>>
The same lines work perfectly fine in python 3
>>> word = 'woman'
>>> word = p.plural_noun(word)
>>> word
'women'
>>> word = p.plural_noun(word)
>>> word
'womens'
AFAIK, there is no singular form of the word "police", so it should return False in this case.
I guess there is a loop somewhere that replaces "ice" with "ouse". Like in "mice" and "mouse" case.
I upgraded to 3.0.1 and immediately noticed I could no longer import my local tests directory. It seems inflect 3.0.1 installs its tests directory into site-packages.
>>> import tests
>>> sys.modules['tests']
<module 'tests' from '<python_install_dir>/lib/python3.7/site-packages/tests/__init__.py'>
ls <python_install_dir>/lib/python3.7/site-packages/tests/__init__.py
__init__.py inflections.txt test_classical_all.py test_classical_herd.py test_classical_person.py test_compounds.py test_join.py test_pl_si.py test_unicode.py
__pycache__ test_an.py test_classical_ancient.py test_classical_names.py test_classical_zero.py test_inflections.py test_numwords.py test_pwd.py words.txt
When I rollback to 2.0.1 this tests directory disappears. I suppose this is unintentional behaviour?
The most recent release on PyPI is 1.0.1, but the repo shows tags for 1.0.2, 2.0.0, 2.0.1, and 2.1.0.
A common use case for this package looks to be something like this:
# Existing:
print(f"{n} {p.plural('violation', n)} found")
-> "0 violations found"
-> "1 violation found"
-> "2 violations found"
It would be handy to have a method that outputs the quantified noun for you, to minimize repeating the quantity (e.g. if you're calculating it, you wouldn't have to assign to an intermediate variable):
# Proposed:
print(f"{p.quantify(n, 'violation')} found")
-> "0 violations found"
-> "1 violation found"
-> "2 violations found"
I'm coming to this package new, apologies if there's already a combined method like this.
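Until something like this exists in the library, a small wrapper can do the job. The sketch below is not an inflect method: quantify is the hypothetical name from the proposal above, and pluralize stands in for a pluralizer such as p.plural called with a single argument.

```python
def quantify(count, noun, pluralize):
    """Return the count followed by the noun in the matching number."""
    # Only a count of exactly 1 keeps the singular; 0 takes the plural.
    word = noun if count == 1 else pluralize(noun)
    return "{} {}".format(count, word)
```

The count expression is then written once, for example print(quantify(len(results), 'violation', p.plural) + " found"), where results is whatever collection is being counted.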
In making the singular from the plural, sinoun always gives the neuter answer:
>>> p = inflect.engine()
>>> p.sinoun('they')
'it'
Could add a gender setting to return either the masculine or feminine
e.g.
>>> p.gender('female')
>>> p.sinoun('they')
'she'
this is a test of github's issue tracking
Entry becomes Entrys but entry becomes entries.
Mentioned in #4.
An additional exception to the -men --> -man rule
degree celsius becomes degree celsiuses but should become degrees celsius
metre per second becomes metre per seconds but should become metres per second
>>> import inflect
>>> p = inflect.engine()
>>> p.plural('degree celsius')
u'degree celsiuses'
>>> p.plural('degree fahrenheit')
u'degree fahrenheits'
>>> p.plural('metre per second')
u'metre per seconds'
>>> p.plural('kilogram a year')
u'kilogram a years'
>>> p.plural('dollar per minute')
u'dollar per minutes'
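For unit phrases like the ones above, one simple heuristic is to pluralize only the first word. The sketch below is an assumption-laden illustration, not a fix for inflect: the function name is hypothetical, pluralize stands in for the real pluralizer, and the head noun is assumed to be the first word.

```python
def plural_unit(phrase, pluralize):
    """Pluralize only the first word of a unit phrase such as 'metre per second'."""
    # partition() leaves sep and tail empty for single-word input.
    head, sep, tail = phrase.partition(" ")
    return pluralize(head) + sep + tail
```

The first-word assumption breaks for phrases such as "square metre", so a real fix would need to recognize connectors like "per" and "a" rather than rely on position.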
There appears to be no win32 version available for Anaconda, which makes it difficult to install in individual environments.
The pl_sb_C_um_a_list includes the word "enconium" which is a misspelling of "encomium"
Could a function number_to_date be added?
I.e., would it be possible for 1974
to return
Nineteen seventy four
2004 would be
Two thousand and four
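A standalone sketch of how such a conversion might work for four-digit years is given below. number_to_date is only a proposed name; the function here is called year_to_words, and its reading rules (for example "oh five" for 1905, and "two thousand and ..." for every year from 2000 to 2099) are assumptions chosen to reproduce the two examples above.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def two_digits(n):
    """Spell out 0..99, separating tens and ones with a space."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else "{} {}".format(TENS[tens], ONES[ones])


def year_to_words(year):
    """Spell out a four-digit year (1000..9999) as it is commonly read aloud."""
    if not 1000 <= year <= 9999:
        raise ValueError("only four-digit years are handled in this sketch")
    century, rest = divmod(year, 100)
    if century % 10 == 0:
        # Years such as 1000..1099 and 2000..2099: 'two thousand and four'.
        words = "{} thousand".format(ONES[century // 10])
        if rest:
            words += " and {}".format(two_digits(rest))
    elif rest == 0:
        words = "{} hundred".format(two_digits(century))
    elif rest < 10:
        words = "{} oh {}".format(two_digits(century), ONES[rest])
    else:
        words = "{} {}".format(two_digits(century), two_digits(rest))
    return words.capitalize()
```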
I'm attempting to get the singular form of some food-related words and phrases. I tried a couple popular phrases and came across this bug:
import inflect
p = inflect.engine()
p.singular_noun('cookies')
# 'cookie'
p.singular_noun('chocolate chip cookies')
# 'chocolate chip cooky'
I believe I'm using the library correctly, so I would expect to see chocolate chip cookie instead of chocolate chip cooky.
From a code comment: Can't just execute methods from another file like this
for rcfile in (pathjoin(dirname(__file__), '.inflectrc'),
expanduser(pathjoin(('~'), '.inflectrc'))):
if isfile(rcfile):
try:
execfile(rcfile)
except:
print3("\nBad .inflectrc file (%s):\n" % rcfile)
raise BadRcFileError
Windows 7
Python 2.7.8
C:\stufftodelete>pip install -e git+https://github.com/pwdyson/inflect.py#egg=inflect
Obtaining inflect from git+https://github.com/pwdyson/inflect.py#egg=inflect
Cloning https://github.com/pwdyson/inflect.py to c:\stufftodelete\src\inflect
Running setup.py (path:C:\stufftodelete\src\inflect\setup.py) egg_info for package inflect
usage: -c [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
or: -c --help [cmd1 cmd2 ...]
or: -c --help-commands
or: -c cmd --help
error: invalid command 'egg_info'
Complete output from command python setup.py egg_info:
usage: -c [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
or: -c --help [cmd1 cmd2 ...]
or: -c --help-commands
or: -c cmd --help
error: invalid command 'egg_info'
----------------------------------------
Cleaning up...
Command python setup.py egg_info failed with error code 1 in C:\stufftodelete\src\inflect
But then:
C:\stufftodelete>pip install --upgrade setuptools
Downloading/unpacking setuptools from https://pypi.python.org/packages/3.4/s/setuptools/setuptools-7.0-py2.py3-none-any.whl#md5=918e7e5ea108507e1ffbbdfccc3496b1
Installing collected packages: setuptools
Found existing installation: setuptools 0.6c11
Uninstalling setuptools:
Successfully uninstalled setuptools
Successfully installed setuptools
Cleaning up...
And so:
C:\stufftodelete>pip install -e git+https://github.com/pwdyson/inflect.py#egg=inflect
Obtaining inflect from git+https://github.com/pwdyson/inflect.py#egg=inflect
Updating c:\stufftodelete\src\inflect clone
Running setup.py (path:C:\stufftodelete\src\inflect\setup.py) egg_info for package inflect
Installing extra requirements: 'egg'
Installing collected packages: inflect
Running setup.py develop for inflect
Creating c:\python27\lib\site-packages\inflect.egg-link (link to .)
Adding inflect 0.2.5pre1 to easy-install.pth file
Installed c:\stufftodelete\src\inflect
Successfully installed inflect
Cleaning up...
Can the dependency on a certain setuptools version be included in the setup, or must it already be in place? If not, can it be mentioned in the README installation instructions?
0.2.4 is released in PyPI with known, fixed bugs. Can you prep an 0.2.5 release? Or add me as a maintainer (committer here and maintainer on PyPI) and I'll cut a release.