
contracts's People

Contributors

asmodehn, deadpixi, inirudebwoy, ronnix, zac-hd


contracts's Issues

Feature Request: Ability to provide detailed failure message for @require and @ensure

Hi deadpixi,

First of all, just want to tell you that this library has been immensely helpful - small, simple, and works great. Awesome way to introduce great ideas into projects!

However, I found one thing that I feel could improve the user experience and could lead to more widespread usage of this library within the Python data community.

In short, a generic message saying that a particular contract failed is sometimes not enough; there can still be a non-trivial amount of work left to figure out the root of the problem. A mechanism to parameterise the message upon failure could therefore make the contracts even more useful.

Here's a concrete example. Say I'm writing data-transformation functions on DataFrames using Pandas (a very common task in the Python data world). I want to codify the requirement that certain columns - say, a and c - need to be present in the input DataFrame.

I'm using Python 3.6.5 and Pandas 0.23.4, but this should work on Python 3.6+ and any non-ancient Pandas.

The current way of implementing this:

import pandas as pd
from dpcontracts import require

# create our simple DataFrame with two columns - `a` and `b`.
df = pd.DataFrame([dict(a=1, b=2)])

def cols_present_simple(required_cols: set, cols):
    """Check if all columns in `required_cols` are present in `cols`."""
    return all(col in cols for col in required_cols)

@require('Certain columns need to be present', lambda a: cols_present_simple({'a', 'c'}, a.df.columns))
def func_simple(df):
    # do stuff
    pass

func_simple(df)

This errors out with PreconditionError: Certain columns need to be present, which is great since we know the contract failed before the function was run. But now it's up to us to figure out which columns are not present. In this case it's trivial, but in the real world it's normal to see DataFrames with hundreds of columns, and requirements covering tens of columns.

Here is an approach inspired by the engarde library, which does something similar but gives a clear, actionable message about what exactly has broken the contract - so the user doesn't have to do any extra work.

def cols_present_with_msg(required_cols: set, cols: set):
    """Check if all columns in `required_cols` are present in `cols`; if not, raise with a detailed message."""
    try:
        assert required_cols.issubset(cols)
        return True
    except AssertionError as e:
        from dpcontracts import PreconditionError
        missing_cols = required_cols - cols
        e.args = [f"These columns are missing: {missing_cols}"]
        raise PreconditionError from e

@require('Certain columns need to be present', lambda a: cols_present_with_msg({'a', 'c'}, set(a.df.columns)))
def func_with_msg(df):
    # do stuff
    pass

func_with_msg(df)

This fails with AssertionError: These columns are missing: {'c'}, so it's obvious what needs fixing.
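One possible shape for such a feature, purely as a hypothetical sketch (this is not the current dpcontracts API): let @require accept a callable that builds the failure message from the same args namespace the predicate sees, so the message can include the offending values.

def missing_cols_message(args):
    # Hypothetical message builder: it sees the same args namespace as the
    # predicate, so it can report exactly which columns are missing.
    missing = {'a', 'c'} - set(args.df.columns)
    return f"These columns are missing: {missing}"

@require(missing_cols_message,
         lambda args: {'a', 'c'}.issubset(set(args.df.columns)))
def func_hypothetical(df):
    pass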

I should say that this idea came from using this library in a data-heavy context, so I would completely understand if this is out of scope for this library. The code was purely to illustrate the idea, and is most definitely not indicative of how it could be implemented.

Is this project still alive? Alternatives?

I came here because of a PyCon talk. It appears that work on this project has stopped. Could you provide a short statement on why it's not actively maintained? Did you find a better alternative? I would greatly appreciate it if you found the time to point me, and other future visitors, to the solution you are using now. Thank you in advance!

Async decorators

How about adding decorators for async procedures?

Something like:

import dpcontracts
from functools import wraps

def require_async(description, predicate):
    def asyncdec(func):
        @wraps(func)
        @dpcontracts.require(description, predicate)
        async def wrapper(*args, **kwargs):
            return await func(*args, **kwargs)
        return wrapper
    return asyncdec

Or maybe better:

import dpcontracts
from functools import wraps

def require_async(description, predicate):
    requirement = dpcontracts.require(description, predicate)
    def asyncdec(func):
        funcreq = requirement(func)
        @wraps(func)
        async def wrapper(*args, **kwargs):
            # funcreq runs the precondition check against the real arguments,
            # then returns the (still unawaited) coroutine from func.
            return await funcreq(*args, **kwargs)
        return wrapper
    return asyncdec

It could then be used like this:

    import trio

    @require_async("n is an int", lambda args: isinstance(args.n, int))
    async def return_delay(n: int) -> int:
        await trio.sleep(2)
        return n

    res = trio.run(return_delay, "bob")

    print(res)
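With either version, running the example above should trip the "n is an int" precondition as soon as return_delay is called with "bob" instead of an int, before the two-second sleep ever runs (assuming dpcontracts raises its usual precondition failure here).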

Handling generators

I found an issue with this library.

If you pass a generator to a function whose @require predicate loops through the generator to do some verification, the function then receives the already-exhausted generator.

Here's a minimal repro:

(default) Roys-MacBook-Pro:web royzheng$ cat o.py
from dpcontracts import require

@require('mylist should have elements greater than 0', lambda args: all(a > 0 for a in args.mylist))
def test(mylist):
    print(list(mylist))

test([1,2,3])
test(x for x in range(1, 4))
(default) Roys-MacBook-Pro:web royzheng$ python o.py
[1, 2, 3]
[]

Smells like somewhere in the code we need to check inspect.isgenerator on each argument and, if it is a generator, split it with itertools.tee, run the check on one copy, and pass the other copy to the function instead of the original generator.
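A rough, hypothetical sketch of that idea (the helper below is invented, not dpcontracts' actual code):

import inspect
import itertools

def _preserve_generators(args):
    """Return (args_for_check, args_for_call), tee-ing any generator arguments
    so the predicate can consume one copy while the function gets a fresh one."""
    for_check, for_call = [], []
    for value in args:
        if inspect.isgenerator(value):
            check_copy, call_copy = itertools.tee(value)
            for_check.append(check_copy)
            for_call.append(call_copy)
        else:
            for_check.append(value)
            for_call.append(value)
    return tuple(for_check), tuple(for_call)

# Inside the decorator, the predicate would then run against the first tuple
# and the wrapped function would be called with the second.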

@require & @ensure lambda isinstance not (apparently) working

My verbose hypothesis feedback is clean and I'm not seeing any sign of dpcontracts.[any]Error. I expected that either requiring x to be a float while feeding in ints or requiring result to be a list would give me a dpcontracts error. Can you tell me what is wrong?

I'm running WinPy 3.6 from PyCharm.

test_calc.py

import calc
from hypothesis import given, assume
from hypothesis import strategies as st
...
@given(st.integers(), st.integers())
def test_add_type(x, y):
    pass
...

calc.py

from dpcontracts import require, ensure
@require("add_type x", lambda args: isinstance(args.x, float))
@require("add_type y", lambda args: isinstance(args.y, int))
@ensure("add_type return", lambda args, result: isinstance(result, list))
def add_type(x, y):
    z = float(x + y)
    return z

Feature Request: Support for sequencing?

TL;DR: Possible feature request.

Thank you for writing this module. For the first time, I've finally hit on a DbC library that covers 99% of my use-cases for the tool. It's easy to use and works really well! But, I have a question concerning that other 1%. I don't often need it, but when I do, I need it badly.

One of the nicer features of Eiffel's DbC support is the ability to reference "previous" versions of object fields. For instance, given:

class Counter(object):
    def __init__(self):
        self.ctr = 0
    def foo(self):
        xyz = arbitrary_function(self.ctr)
        self.ctr = self.ctr + 1
        return xyz

It'd be nice to be able to express the following (note: this is just pseudocode):

@ensure("...", lambda args, res: args.self.ctr == 1 + PREVIOUS(args.self.ctr))

Obviously, Python doesn't provide that infrastructure out of the box, but if I'm given a chance to capture the old value of a member before the ensure assertion runs, I should be able to fake it by referencing that cached value in the assertion.

(Hopefully this makes sense.)
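A minimal sketch of that caching idea, assuming a placeholder arbitrary_function and an invented _old_ctr attribute purely for illustration:

from dpcontracts import ensure

def arbitrary_function(n):  # stand-in for the real computation
    return n * 2

class Counter(object):
    def __init__(self):
        self.ctr = 0

    @ensure("ctr is incremented by one",
            lambda args, res: args.self.ctr == args.self._old_ctr + 1)
    def foo(self):
        self._old_ctr = self.ctr            # snapshot taken before mutation
        xyz = arbitrary_function(self.ctr)
        self.ctr = self.ctr + 1
        return xyz

c = Counter()
c.foo()   # the postcondition compares the new ctr against the cached old value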

Alternatively, I guess I can always split foo into two functions:

def foo(self):
    return self.internal_foo()[0]

@ensure("...", lambda args, res: res[1] + 1 == args.self.ctr)
def internal_foo(self):
    ctr = self.ctr
    xyz = arbitrary_function(ctr)
    self.ctr = ctr + 1
    return (xyz, ctr)

However, this runs the risk of being verbose (two methods defined instead of one) and onerous (if many members need to be checked for sequential relationships).

Is there some other best-practice work-around for this particular use-case that I'm missing? Thanks!

Use subtypes of AssertionError to support better handling of violations

If contract violations raised distinct subtypes of AssertionError, it would be possible to programmatically handle various kinds of contract violation. I imagine something like:

class PreconditionError(AssertionError):
    """An AssertionError raised due to violation of a precondition."""

class PostconditionError(AssertionError):
    """An AssertionError raised due to violation of a postcondition."""

# Then, to replace `if condition: assert foo, description` in each place:
if precondition and not predicate(rargs):  # and similar for postconditions etc.
    raise PreconditionError(description)

This could be used by a fuzzer - e.g. HypothesisWorks/hypothesis#1474 - to automatically discover valid inputs, by distinguishing between invalid input and other errors.
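For illustration, a small self-contained example of how a caller or a fuzzer could use such subclasses to tell invalid inputs apart from genuine failures; contracted and the candidate values are made up, and the error is raised by hand here only to stand in for a decorated function:

class PreconditionError(AssertionError):
    """An AssertionError raised due to violation of a precondition."""

def contracted(x):
    """Stand-in for a function decorated with @require."""
    if x <= 0:
        raise PreconditionError("x must be positive")
    return x * 2

for candidate in (-1, 0, 3):
    try:
        print(contracted(candidate))
    except PreconditionError:
        pass        # invalid input: a fuzzer would simply try another candidate
    except AssertionError:
        raise       # any other assertion failure is a genuine bug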

Poor performance due to namedtuple construction in build_call

Is this performance to be expected? There seems to be a lot of time spent in building namedtuples...

$ python --version
Python 3.6.6 :: Anaconda, Inc.

$ cat test_dp.py
import sys
from dpcontracts import require

if sys.argv[1] == "dp":
    @require("something about x", lambda args: args.x is not None)
    def hot_method(x):
        pass

else:
    def hot_method(x):
        pass

for i in range(100000):
    hot_method(i)

$ python test_dp.py nodp
python test_dp.py nodp  0.05s user 0.00s system 98% cpu 0.053 total
$ python test_dp.py dp
python test_dp.py dp  29.19s user 0.02s system 99% cpu 29.284 total

# turned down to 10,000 iters...
$ python -m cProfile -stottime test_dp.py dp
         775121 function calls (765029 primitive calls) in 3.144 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  10019/1    2.109    0.000    3.283    3.283 {built-in method builtins.exec}
    10028    0.194    0.000    0.261    0.000 {built-in method builtins.__build_class__}
    10010    0.129    0.000    2.756    0.000 __init__.py:357(namedtuple)
    10000    0.119    0.000    3.221    0.000 dpcontracts.py:437(build_call)
    30110    0.085    0.000    0.085    0.000 {method 'format' of 'str' objects}
    10001    0.078    0.000    0.182    0.000 inspect.py:2095(_signature_from_function)
    10001    0.055    0.000    0.312    0.000 inspect.py:1082(getfullargspec)
    10001    0.040    0.000    0.233    0.000 inspect.py:2176(_signature_from_callable)
    10001    0.039    0.000    0.051    0.000 inspect.py:2725(__init__)
    10000    0.032    0.000    3.260    0.000 dpcontracts.py:502(inner)
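One possible mitigation, sketched here under the assumption that the per-call cost comes from rebuilding the argument namedtuple class inside build_call (the helper below is invented, not dpcontracts' code): build the namedtuple class once per decorated function and reuse it on every call.

from collections import namedtuple
from functools import lru_cache
import inspect

@lru_cache(maxsize=None)
def _args_class_for(func):
    """Build (and cache) one namedtuple class per function for its arguments."""
    spec = inspect.getfullargspec(func)
    fields = list(spec.args) + list(spec.kwonlyargs)
    if spec.varargs:
        fields.append(spec.varargs)
    if spec.varkw:
        fields.append(spec.varkw)
    return namedtuple("Args", fields)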

Duplicated documentation

I just noticed that the documentation is duplicated in dpcontracts.py and in README.rst.
Is that intentional? It might be tricky to keep them both in sync...

Merge design-by-contract libraries?

Hi,
I'm trying to push for a standardized contracts library in Python. So far there has been some discussion on the python-ideas mailing list; please see this thread, which was later forked into this thread and this thread.

We implemented our own design-by-contract (DbC) library, icontract, since we were missing some features both in your library and in the other ones, and we wanted to move forward fast for our own code base.

However, during the discussion on python-ideas, I realized how harmful this decision actually was. DbC has a hard time getting adopted within the Python community, and having multiple libraries will just aggravate matters: contracts really start to shine once you have even a small ecosystem around them (automatic test generation, documentation and IDE plugins). I don't see this ecosystem emerging with multiple libraries, since each component of the ecosystem would need to support some or all of the libraries.

Finally, I see DbC as an excellent form of documentation that almost every library would hopefully adopt. Having multiple DbC libraries makes it hard to integrate two third-party libraries when each of them depends on a different DbC approach. This is most apparent with inheritance: a class inheriting from a class in a module that depends on one DbC library would need to use different contracts than a class inheriting from a class in a module that depends on another DbC library. This is confusing, and I doubt anybody would like such a mess in their code base.

With all that said, do you see it as possible that we merge icontract with dpcontracts somehow? We could 1) make a new library, 2) merge the features that dpcontracts is missing from icontract (or the other way around), or 3) contact the developers working on the Python standard library and start developing a standard module from scratch, if that's OK with them.

Here is a short list of features that we missed in dpcontracts:

  • Informative messages. Writing contracts was tedious since most contract messages in our code base were duplicates of what the contract already stated (lambda args: args.x > 0, "x positive"). We rarely had to add a description to a contract, and having the library figure out the message was extremely helpful in reducing the pain.

  • Values of the arguments in the message. We found it important that the violation message includes the values of the arguments supplied to the function. That involved quite some work (since we had to parse the condition function and re-trace its AST manually on contract breach), but was invaluable in production because contract violations were either tedious or impossible to reproduce without this information. Please see the examples in this section of the icontract readme. This feature also speeds up development quite a bit, since we don't have to turn on the debugger and the message often makes it pretty obvious where the bug lies.

  • Inheritance of contracts. This was important not only for the object-oriented parts of the system. It also allowed us to put contracts on a group of static methods (written in an abstract class) which serve as an interface, and then inherit and implement them in different components. This actually made it easy to have a functional (as in functional programming) interface with predefined contracts.

  • A Sphinx extension to include contracts in the automatically generated documentation.

  • A linter that statically checks that the arguments of the contracts coincide with those of the function.

All these extra features are already implemented in icontract and we already use them in our code base and in production without problems. If you want to test them, simply fire up a virtual environment and pip install icontract. The features "informative messages" and "values of the arguments in the message" only run on contract breach, and hence bring no computational overhead to the normal running of the system.

I'm looking forward to your response.

Pass function arguments to the lambda

Currently, all contract functions magically get the arguments as well as the variable __result__. Not only is this strange, it also causes most linters to error out. Is it possible to change this behaviour so that the function arguments are passed in as arguments to the lambda instead? Similarly, for class invariants, __instance__ should be passed in.
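To make the request concrete, a small before/after sketch (the "proposed" form is hypothetical and the function f is made up):

from dpcontracts import require

# Current style: the predicate receives a single magic namespace of arguments.
@require("x must be positive", lambda args: args.x > 0)
def f(x):
    return x * 2

# Proposed style (hypothetical, not the current API): the predicate receives
# the function's own parameters by name, which plays better with linters.
# @require("x must be positive", lambda x: x > 0)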

PyPI management

Is this package on PyPI? If not, can we put it up as dpcontracts?
