gapipy's Introduction

G API Python Client

A client for the G Adventures REST API (https://developers.gadventures.com)

Quick Start

>>> from gapipy import Client
>>> api = Client(application_key='MY_SECRET_KEY')

>>> # Get a resource by id
>>> tour_dossier = api.tour_dossiers.get(24309)
>>> tour_dossier.product_line
u'AHEH'
>>> tour_dossier.departures.count()
134
>>> tour_dossier.name
u'Essential India'
>>> itinerary = tour_dossier.structured_itineraries[0]
>>> {day.day: day.summary for day in itinerary.days[:3]}
{1: u'Arrive at any time. Arrival transfer included through the G Adventures-supported Women on Wheels project.',
2: u'Take a morning walk through the city with a young adult from the G Adventures-supported New Delhi Streetkids Project. Later, visit Old Delhi, explore the spice markets, and visit Jama Masjid and Connaught Place.',
3: u"Arrive in Jaipur and explore this gorgeous 'pink city'."}

>>> # Create a new resource
>>> booking = api.bookings.create({'currency': 'CAD', 'external_id': 'abc'})

>>> # Modify an existing resource
>>> booking.external_id = 'def'
>>> booking.save()

Since 2.25.0 (2020-01-02)

>>> # Since 2.25.0, reference stubs that fail to fetch will raise a
>>> # subclass of requests.HTTPError (see https://github.com/gadventures/gapipy/pull/119).
>>> # The same behaviour is available on Query.get by passing a falsy
>>> # value for the httperrors_mapped_to_none kwarg.
>>>
>>> dep = api.departures.get('404_404', httperrors_mapped_to_none=None)
... # omitted stacktrace
HTTPError: 404 Client Error: {"http_status_code":404,"message":"Not found.","errors":[],"time":"2020-01-02T19:46:07Z","error_id":"gapi_asdf1234"} for url: https://rest.gadventures.com/departures/404_404

>>> dep = api.departures.get('404404')
>>> dep.start_address.country
<Country: BR (stub)>

>>> # Let's have GAPI return a 404 error here for the country stub fetch
>>> # when we attempt to retrieve the continent attribute

>>> dep.start_address.country.continent  # reference/stub forces a fetch

>>> # pre 2.25.0 behaviour
... # omitted stacktrace
AttributeError: 'Country' has no field 'continent' available

>>> # post 2.25.0 behaviour
... # omitted stacktrace
HTTPError: 404 Client Error: {"http_status_code":404,"message":"Not found.","errors":[],"time":"2020-01-02T19:46:07Z","error_id":"gapi_qwer5678"} for url: https://rest.gadventures.com/countries/BR

Resources

Resource objects are instantiated from Python dictionaries created from JSON data. The fields are parsed and converted to Python objects as specified in the resource class.

A nested resource is only instantiated when its corresponding attribute is accessed on the parent resource. Such a resource may be returned as a stub; accessing an attribute that is not present on the stub will internally call .fetch() on the resource to populate it.

A field pointing to the URL for a collection of child resources will hold a Query object for that resource. As with nested resources, it is only instantiated when first accessed.
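
For example, reusing the api client and objects from the Quick Start (reprs are illustrative):

>>> dep = api.departures.get('404404')
>>> dep.start_address.country        # nested resource, returned as a stub
<Country: BR (stub)>
>>> dep.start_address.country.name   # not on the stub, so .fetch() fires
u'Brazil'

>>> tour_dossier = api.tour_dossiers.get(24309)
>>> tour_dossier.departures          # URL field held as a Query object
<gapipy.query.Query object at 0x...>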

Queries

A Query for a resource can be used to fetch resources of that type (either a single instance or an iterator over them, possibly filtered according to some conditions). Queries are roughly analogous to Django's QuerySets.

An API client instance has a query object for each available resource, accessible via an attribute named after the resource.

Methods on Query objects

All queries support the get, create and options methods. The other methods are only supported for queries whose resources are listable.

options()
Get the options for a single resource
get(resource_id, [headers={}])
Get a single resource; optionally passing in a dictionary of header values.
create(data)
Create an instance of the query resource using the given data.
all([limit=n])

Generator over all resources in the current query. If limit is a positive integer n, then only the first n results will be returned.

  • A TypeError will be raised if limit is neither None nor an int
  • A ValueError will be raised if limit <= 0

filter(field1=value1, [field2=value2, ...])

filter(**{"nested.field": "value"})
Filter resources on the provided fields and values. Calls to filter can be chained. The method returns a clone of the Query object, so the result must be stored (or chained further) to retain the stacked filters.
count()
Return the number of resources in the current query (by reading the count field on the response returned by requesting the list of resources in the current query).
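
Putting these together (a sketch; the filter values and counts are illustrative):

>>> query = api.departures.filter(**{'tour_dossier.id': '24309'})
>>> query = query.filter(start_date__gte='2020-01-01')  # chaining returns a clone
>>> query.count()
42
>>> for departure in query.all(limit=5):  # stop after the first 5 results
...     print(departure.id)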

Caching

gapipy can be configured to use a cache to avoid having to send HTTP requests for resources it has already seen. Cache invalidation is not automatically handled: it is recommended to listen to G API webhooks to purge resources that are outdated.

By default, gapipy will use the cached data to instantiate a resource, but a fresh copy can be fetched from the API by passing cached=False to Query.get. This has the side-effect of recaching the resource with the latest data, which makes this a convenient way to refresh cached data.

Caching can be configured through the cache_backend and cache_options settings. cache_backend should be a string of the fully qualified path to a cache backend, i.e. a subclass of gapipy.cache.BaseCache. A handful of cache backends are available out of the box:

gapipy.cache.SimpleCache
A simple in-memory cache for single-process environments. It is not thread safe.
gapipy.cache.RedisCache
A key-value cache store using Redis as a backend.
gapipy.cache.NullCache (Default)
A cache that doesn't cache.
gapipy.cache.DjangoCache (requires Django)
A cache which uses Django's cache settings for configuration. Requires there be a gapi entry in settings.CACHES.

Since the cache backend is defined by a python module path, you are free to use a cache backend that is defined outside of this project.
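
For example, a client configured to cache in Redis might look like this (a sketch; the host and port keys in cache_options are assumptions about what RedisCache accepts):

>>> from gapipy import Client
>>> api = Client(
...     application_key='MY_SECRET_KEY',
...     cache_backend='gapipy.cache.RedisCache',
...     cache_options={'host': 'localhost', 'port': 6379, 'default_timeout': 3600},
... )
>>> api.tour_dossiers.get(24309)                # fetched from the API, then cached
>>> api.tour_dossiers.get(24309)                # served from the cache
>>> api.tour_dossiers.get(24309, cached=False)  # forces a fresh fetch and recaches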

Connection Pooling

We use the requests library, and you can take advantage of its connection pooling by passing a connection_pool_options dict to your client.

The keys of interest inside the connection_pool_options dict are as follows (a short example follows the list):

  • Set enable to True to enable pooling. Defaults to False.
  • Use number to set the number of connection pools to cache. Defaults to 10.
  • Use maxsize to set the max number of connections in each pool. Defaults to 10.
  • Set block to True if the connection pool should block and wait for a connection to be released when it has reached maxsize. If False and the pool is already at maxsize a new connection will be created without blocking, but it will not be saved once it is used. Defaults to False.
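
For example (a sketch using the keys above):

>>> from gapipy import Client
>>> api = Client(
...     application_key='MY_SECRET_KEY',
...     connection_pool_options={
...         'enable': True,   # turn pooling on
...         'number': 10,     # connection pools to cache
...         'maxsize': 10,    # max connections per pool
...         'block': False,   # don't block when a pool is exhausted
...     },
... )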

Dependencies

The only dependency needed to use the client is requests.

Testing

Running tests is pretty simple. We use nose as the test runner. You can install all requirements for testing with the following:

$ pip install -r requirements-testing.txt

Once installed, run unit tests with:

$ nosetests -A integration!=1

Otherwise, you'll want to include a GAPI Application Key so the integration tests can successfully hit the API:

$ export GAPI_APPLICATION_KEY=MY_SECRET_KEY; nosetests

In addition to running the test suite against your local Python interpreter, you can run tests using Tox. Tox allows the test suite to be run against multiple environments, or in this case, multiple versions of Python. Install and run the tox command from any place in the gapipy source tree. You'll want to export your G API application key as well:

$ export GAPI_APPLICATION_KEY=MY_SECRET_KEY
$ pip install tox
$ tox

Tox will attempt to run against all environments defined in the tox.ini. It is recommended to use a tool like pyenv to ensure you have multiple versions of Python available on your machine for Tox to use.

Fields

  • _model_fields represent dictionary fields.

Note

  • Given _model_fields = [('address', Address)], and
  • Address subclasses BaseModel

the address value in the following JSON is instantiated as an Address model:
{
   "address": {
      "street": "19 Charlotte St",
      "city": "Toronto",
      "state": {
         "id": "CA-ON",
         "href": "https://rest.gadventures.com/states/CA-ON",
         "name": "Ontario"
      },
      "country": {
         "id": "CA",
         "href": "https://rest.gadventures.com/countries/CA",
         "name": "Canada"
      },
      "postal_zip": "M5V 2H5"
   }
}
  • _model_collection_fields represent a list of dictionary fields.

Note

  • Given _model_collection_fields = [('emails', AgencyEmail)], and
  • AgencyEmail subclasses BaseModel

each entry in the emails list of the following JSON is instantiated as an AgencyEmail model:
{
   "emails": [
      {
         "type": "ALLOCATIONS_RELEASE",
         "address": "[email protected]"
      },
      {
         "type": "ALLOCATIONS_RELEASE",
         "address": "[email protected]"
      }
   ]
}
  • _resource_fields refer to another Resource.
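
A sketch of how a resource class might declare all three field types (the class name, field pairings, and the absence of imports are illustrative assumptions, not copied from the gapipy source):

# Illustrative only -- names and pairings are assumptions.
class Agency(Resource):
    # One nested dict, instantiated as an Address (a BaseModel subclass)
    _model_fields = [('address', Address)]
    # A list of nested dicts, each instantiated as an AgencyEmail
    _model_collection_fields = [('emails', AgencyEmail)]
    # A reference to another Resource, fetched lazily via its stub
    _resource_fields = [('country', Country)]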

gapipy's People

Contributors

abrahamvarricatt, annakovale, bartek, fridman, jamie-ga, jonprindiville, jweatherby, kenlalonde, lawrence-hoo, marz619, mverteuil, nagyman, pbhandari, pintor, plorry, rafikdraoui, rbagrov, siamalekpour, silent1mezzo, tlam, wmak


gapipy's Issues

Setup pollutes site-packages

Unsure if this is a real problem, but uninstalling gapipy leaves behind a bunch of test files. Because setup.py uses find_packages and tests is a package, that directory is copied into site-packages as well.

One idea is to change find_packages() to find_packages(exclude=['tests', 'tests.*']), or to move tests into the gapipy directory so it doesn't get so messy.
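
A minimal sketch of the first suggestion (setup arguments other than packages are elided):

# setup.py
from setuptools import setup, find_packages

setup(
    name='gapipy',
    packages=find_packages(exclude=['tests', 'tests.*']),
    # ... remaining arguments unchanged
)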

$ pip uninstall gapipy

Uninstalling gapipy-2.24.0:
  Would remove:
    /Users/craign/.virtualenvs/gadv/lib/python2.7/site-packages/gapipy-2.24.0.dist-info/*
    /Users/craign/.virtualenvs/gadv/lib/python2.7/site-packages/gapipy/*
    /Users/craign/.virtualenvs/gadv/lib/python2.7/site-packages/tests/*
  Would not remove (might be manually added):
    /Users/craign/.virtualenvs/gadv/lib/python2.7/site-packages/tests/base.py
    /Users/craign/.virtualenvs/gadv/lib/python2.7/site-packages/tests/settings.py
    /Users/craign/.virtualenvs/gadv/lib/python2.7/site-packages/tests/test_forms.py
    /Users/craign/.virtualenvs/gadv/lib/python2.7/site-packages/tests/test_models.py
    /Users/craign/.virtualenvs/gadv/lib/python2.7/site-packages/tests/test_tasks.py
    /Users/craign/.virtualenvs/gadv/lib/python2.7/site-packages/tests/test_utils/__init__.py
    /Users/craign/.virtualenvs/gadv/lib/python2.7/site-packages/tests/test_utils/urls.py
    /Users/craign/.virtualenvs/gadv/lib/python2.7/site-packages/tests/test_views.py
    /Users/craign/.virtualenvs/gadv/lib/python2.7/site-packages/tests/urls.py
    /Users/craign/.virtualenvs/gadv/lib/python2.7/site-packages/tests/utils.py
Proceed (y/n)? y
  Successfully uninstalled gapipy-2.24.0

Rate limiting

I had a look through the source code, and it doesn't appear that rate limiting is applied anywhere. As per the documentation, headers are returned specifying the number of calls left for the hour. If the number of calls remaining is zero, the code could refuse to make another call until the hour is up.

The documentation doesn't seem to specify how to find out when the limit resets. My guess is that it's a "clock hour", i.e. it resets every hour, on the hour. However, I don't know and don't want to risk getting blocked to test it. If it is a clock hour, then it would be trivial to implement a "raise an exception if we are asked to make too many calls this hour" method. If it's not, the API should be updated with a header (e.g. X-RateLimit-Reset) that specifies the number of seconds until the rate limit resets.
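
A sketch of the simplest guard, assuming the remaining-call count comes back in a header named something like X-RateLimit-Remaining (the real header name would need confirming against the G API docs):

import requests

class RateLimitExceeded(Exception):
    pass

_calls_remaining = None  # updated from each response we see

def checked_get(url, **kwargs):
    global _calls_remaining
    if _calls_remaining == 0:
        raise RateLimitExceeded('No G API calls left this hour')
    response = requests.get(url, **kwargs)
    # Hypothetical header name -- confirm against the G API documentation.
    header = response.headers.get('X-RateLimit-Remaining')
    if header is not None:
        _calls_remaining = int(header)
    return response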

Thoughts? I'm happy to help develop/test this bit of code, but would need to know how the solution would look.

[offtopic] Transfer of django-fsm-admin ownership

Hi,

The gadventures/django-fsm-admin repository seems to lack maintenance; I propose myself as a new maintainer.

Could you give me the right to push to your repository, or just transfer it to my account on GitHub?

Thank you!

PS: We still use it in some legacy projects, and I am willing to keep it maintained.

PPS: If you are not sure about me -- for example, I maintain (or help maintain):

https://github.com/shanx/django-maintenancemode

https://github.com/adamcharnock/django-tz-detect

I have also adopted a few Django projects and continue to maintain them; the biggest example:

https://github.com/bashu/django-easy-maps

PPPS: I've tried to reach you via contact form on https://www.gadventures.com/

Filtering not applied on related object list views?

Here's a bit of odd behaviour that I've encountered. I suspect it's a bug, but I'll have to do a bit more digging to be sure.

In the following examples, g is a gapipy client with the only options passed in being an application key (the g.com key, in this case).


Reproducing

First off, if we fetch a tour_dossier and then access the related departures listing, it can tell us what the count is (from the count attr on the JSON responses):

In [3]: g.tour_dossiers.get(24821).departures.count()
Out[3]: 1247

If we apply some filter, the count changes appropriately:

In [4]: g.tour_dossiers.get(24821).departures.filter(start_date__gte='2019-09-20').count()
Out[4]: 672

If -- instead of calling .count on it -- we iterate the result of the filter call, we seem to get the full unfiltered set of departures:

In [5]: len([d for d in g.tour_dossiers.get(24821).departures.filter(start_date__gte='2019-09-20')])
Out[5]: 1247

😢 What happened to my filters? I would expect to see 672 items here, the same as when we used .count. This is the source of my confusion.


Contrast with...

... what happens when we query the departures list view directly, rather than reaching through a tour_dossier:

Filter for all the departures associated with that tour_dossier and we get the same 1247 value:

In [6]: g.departures.filter(**{'tour_dossier.id': '24821'}).count()
Out[6]: 1247

If we add on our date filter and use .count, again we get 672:

In [7]: g.departures.filter(**{'tour_dossier.id': '24821', 'start_date__gte': '2019-09-20'}).count()
Out[7]: 672

And here we get the expected behaviour: iterating the filter expression yields the same number of items as .count reported, so the filtering persists in this case:

In [8]: len([d for d in g.departures.filter(**{'tour_dossier.id': '24821', 'start_date__gte': '2019-09-20'})])
Out[8]: 672

Client instances can share config because `default_config` contains mutable values

I have never (knowingly) been tripped up by it, but I noticed this while reviewing #137...

gapipy.client.default_config is basically used as a default argument to client initialization, and some of its contents are (mutable) dicts. Because of that, it's possible to accidentally "share" data there -- you might get some surprises if you instantiate multiple clients in the same process with different configs.


To demonstrate...

>>> from gapipy import Client

>>> # Let's make a client and enable connection pooling
>>> c1 = Client(connection_pool_options={"enable": True})
>>> c1.connection_pool_options["enable"]
True

>>> # Ok, let's make another WITHOUT connection pooling
>>> c2 = Client(connection_pool_options={"enable": False})
>>> c2.connection_pool_options["enable"]
False

>>> # Cool, now let's see what that first client is up to, it was True to begin with...
>>> c1.connection_pool_options["enable"]
False

>>> c1.connection_pool_options is c2.connection_pool_options
True

I suspect the crux of the issue is probably twofold:

  • new client instances get direct references to stuff in default_config and some of those things are mutable
  • Client.__init__ explicitly mutates the default connection_pool_options dict when some extra connection pool configs have been passed

I bet you could have a similar issue with the global http headers config, except that since Client.__init__ doesn't mutate that one, you'd need a series of events like:

  • instantiate a client
  • instantiate another client
  • mutate one of their client.global_http_headers and the other client is affected because they both shared the same default value

(Pretty sure I had a hand in both of those connection-pool-options and global-http-headers things 🤦 oops)


I haven't taken a run at writing tests for it yet, but I wonder if the fix is simply:

  • update get_config to copy.deepcopy (or similar) the value it gets from default_config before returning it, and
  • use get_config (instead of direct dict access) when getting "connection_pool_options" out of default_config
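
A rough sketch of that get_config change (default_config lives in gapipy.client per the above; the actual signature may differ):

import copy

from gapipy.client import default_config

def get_config(config, name):
    # Prefer an explicitly-passed value; otherwise deep-copy the default
    # so that no two Client instances share the same mutable object.
    if name in config:
        return config[name]
    return copy.deepcopy(default_config[name])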

We could also take other approaches like:

  • Client.__init__ should deepcopy the default dict before yanking stuff out, or
  • the default dict should be returned from a function instead of existing at module level, or
  • something else entirely 🤷

Add config parameter for request timeout

We use requests internally to make calls out to the G API. By default, requests has no timeout defined, and it's up to the backend to decide when it'll give up. Currently we have a backend timeout of 60 seconds, primarily due to resources like the bundler.

The requests documentation strongly encourages setting a reasonable timeout and using it in every production call [1]:

You can tell Requests to stop waiting for a response after a given number of seconds with the timeout parameter. Nearly all production code should use this parameter in nearly all requests. Failure to do so can cause your program to hang indefinitely:

This should be added as a configurable parameter, with a reasonable default (20 or 30 seconds seems more than enough).
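
For reference, the requests-level change is small; a sketch of what a configurable timeout might look like (the parameter name and default are hypothetical):

import requests

DEFAULT_TIMEOUT = 30  # seconds; hypothetical default, per the suggestion above

def request(method, url, timeout=None, **kwargs):
    # Always pass an explicit timeout so a hung connection can't
    # stall the client indefinitely.
    return requests.request(method, url, timeout=timeout or DEFAULT_TIMEOUT, **kwargs)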

  1. http://docs.python-requests.org/en/master/user/quickstart/#timeouts

`resource.save(partial=True)` behaviour when no changes have been made

(I'm marking this as question since it's not an issue as much as it is something I'm curious to discuss...)

Today, gapipy provides the ability to pass partial=True when calling a resource-instance's save method -- when you do that it will issue a PATCH request and send only the attributes that seem to have been modified to GAPI rather than the entire resource.

The question I'm wondering about is: what behaviour makes most sense in the case that no attributes have been modified?

--

Current behaviour:

  • gapipy compares the initial data dict to the current data dict and computes an empty dict of changes
  • gapipy makes a PATCH request with an empty data-dict
  • GAPI sees an empty PATCH payload and responds with an HTTP 400 and the message You must provide fields to PATCH

I'm wondering if that's more or less surprising than possible alternate behaviours like...

Silently make it a noop:

  • gapipy compares the initial data dict to the current data dict and computes an empty dict of changes
  • gapipy decides that no PATCH request is necessary and returns early

Loudly make it a noop:

  • gapipy compares the initial data dict to the current data dict and computes an empty dict of changes
  • gapipy decides that no PATCH request is necessary and raises some sort of explicit exception like CantPatchNothingYouDingusError or something
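
For concreteness, the silent-noop option might look like this (the method body and helper names are assumptions about gapipy internals, not the actual implementation):

def save(self, partial=False):
    changes = self._compute_changed_fields()  # hypothetical helper
    if partial and not changes:
        return self  # nothing changed: skip the PATCH entirely
    return self._do_save(changes, partial=partial)  # hypothetical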

--

Any thoughts on current-behaviour versus those alternate behaviours (versus other alternate behaviours)?

--

For context: I'm thinking about this because g.com does some haphazard updating of profile fields based on our User objects and it's not uncommon for us to bump into that HTTP 400.

At the moment we handle that 400 the same as any other 400, but it's not really an error in some sense: nothing went wrong except that we asked to make no updates and GAPI told us that that's a weird request.

I'm trying to decide if g.com should behave differently by detecting and swallowing that particular 400, or by noticing when we're about to issue an empty patch and avoiding it, or something else. Maybe it makes sense to push those concerns up into gapipy, or maybe it does not. 🤔

`get`-ing a resource that was cached in `RedisCache` refreshes that key's TTL

Steps to reproduce:

  • instantiate a gapipy using gapipy.cache.RedisCache as cache backend and some default TTL, e.g. 'cache_options': { 'default_timeout': 60*60 }
  • fetch a resource so that it is inserted into Redis, e.g. gapipy_instance.itineraries.get(1198, 946, cached=False)
  • check the TTL of that key (e.g. redis-cli ttl '*itineraries:1198:946*') -- it will be the default timeout you set minus the few seconds that have passed since calling get
  • wait a few seconds
  • fetch the same resource again but tell gapipy to use its cache, e.g. gapipy_instance.itineraries.get(1198, 946, cached=True)
  • check the TTL of that key again -- it will again be the default timeout you set minus the few seconds that have passed since your most recent call to get

Expected behaviour:

  • subsequent gets that are served from the cache should not disturb the TTL of the key; in the above example, the second TTL check should show a lower value than the first

Effect:

  • a resource cached in Redis that is fetched from cache more than once-per-TTL will never fall out of the cache
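
The suspected mechanism, sketched against a plain redis-py client (RedisCache's internals may differ):

import redis

r = redis.Redis()

def get_refreshing_ttl(key, timeout):
    # Reported behaviour: every cache hit re-sets the TTL.
    value = r.get(key)
    if value is not None:
        r.expire(key, timeout)  # this resets the key's TTL
    return value

def get_preserving_ttl(key):
    # Expected behaviour: a plain GET never changes a key's TTL.
    return r.get(key)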

Sticky `filter` arguments

I'm seeing some confusing behaviour... let's talk about whether it's a bug or not!

Repro steps:

  1. Hop into a shell and instantiate yourself a gapipy client, I'll call it g in this example...

  2. Search for departures having a certain tour dossier associated with them... found some! Then search for departures having a certain itinerary associated with them... none found.

    >>> len([x for x in g.departures.filter(**{'tour_dossier.id': '22758'}).all()])
    280

    >>> len([x for x in g.departures.filter(**{'structured_itineraries.variation_id': 4364}).all()])
    0
  3. Ok, kill that shell. Start a fresh shell and we'll swap the order... search for the itinerary association first and it returns results! Search for the tour_dossier association second and it returns no results. The opposite of last time.
    >>> len([x for x in g.departures.filter(**{'structured_itineraries.variation_id': 4364}).all()])
    91

    >>> len([x for x in g.departures.filter(**{'tour_dossier.id': '22758'}).all()])
    0
  4. 🤔

Expected result:

Order of operations shouldn't matter. We should see 280 departures for tour_dossier.id=22758 and 91 for structured_itineraries.variation_id=4364.

Investigation:

Looking at the Query class, it seems as though we clear self._filters after making some requests, but not all of them.

Thoughts:

In my opinion Query.all should reset self._filters after it calls APIRequestor.list and makes its request to the API -- adopting the same behaviour as the other two Query code paths which hit the API.

On the other hand, there is a test case that highlights the fact that we can "chain" filters in this way.

This seems like an odd pair of behaviours to have (again, IMO):

  • if I .count() my query the filters are reset
  • if I .all() or list(...) my query the filters are not reset

NB: my contrived example could avoid this behaviour and obtain correct counts by using Query.count rather than iterating over the list... but it's easy to imagine analogous situations where, e.g., whole resources and not just counts are required.
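
The proposed fix, sketched (the Query internals here are paraphrased from this issue, not copied from the source):

class Query(object):
    def all(self, limit=None):
        # Hypothetical sketch: snapshot the filters into the request
        # params, then reset them -- matching get()/count() behaviour.
        params = dict(self._filters)
        self._filters = {}
        for data in self._requestor.list(params):  # hypothetical APIRequestor call
            yield self.resource_class(data)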

upgrade future from 0.16.0 to 0.17.1 ?

I am attempting to install gapipy as part of another project. This comes up when running pip install -r requirements.txt there:

gapipy 2.19.3 has requirement future==0.16.0, but you'll have future 0.17.1 which is incompatible.

If it isn't too much trouble, can we consider upgrading future to version 0.17.1? (Not an urgent priority.)
