gsa / catalog.data.gov

Development environment for catalog.data.gov

Home Page: https://catalog.data.gov

Dockerfile 0.47% Python 5.68% Shell 1.88% Makefile 0.57% PLpgSQL 0.07% JavaScript 2.72% HTML 6.30% CSS 82.31%

catalog.data.gov's Introduction

catalog.data.gov

This is a local development harness for catalog.data.gov. For details on the system architecture, please see our data.gov systems list!

Usage

The only deployable artifact associated with this repository is the requirements.txt file. See GitHub Actions for the full configuration of live environments.

The live environment differs from the development environment in a number of ways. Changes made in this repo that work correctly in the development environment may require additional steps to make sure the application is deployable to the live environment:

  • If you need to add or change a dependency, you should make that change in the ckan/requirements.in, run make update-dependencies and commit the changed files. (See the section below on requirements for details.) Good news: no other changes are required!

  • If you need to add or change configuration that lives in the application ini file (such as a plugin), you will also need to update the configuration file template at ckan/setup/ckan.ini.

  • If you find you need to modify the ckan/Dockerfile to add OS packages or install software, other changes may need to be made to the cloud.gov buildpack. Please bring these situations to the team's attention.

Development

Requirements

We assume your environment is already set up with these tools.

Getting started

Build and start the docker containers.

$ make build up

Open your web browser to localhost:5000 (or ckan:5000 if you add ckan to your hosts file).
You can log into your instance with user admin, password password.
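
The containers can take a little while to come up after make build up. The following is a minimal sketch, assuming the default localhost:5000 address above and CKAN's standard status_show API action, that polls until the site answers. The function names and retry defaults are illustrative and not part of this repository.

```python
import json
import time
import urllib.error
import urllib.request


def ckan_is_up(base_url="http://localhost:5000", fetch=urllib.request.urlopen):
    """Return True if CKAN's status_show API action reports success."""
    try:
        with fetch(f"{base_url}/api/action/status_show", timeout=5) as resp:
            return json.load(resp).get("success") is True
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON body: not ready yet.
        return False


def wait_for_ckan(attempts=30, delay=2.0, check=ckan_is_up):
    """Poll until the check passes or the attempts run out."""
    for _ in range(attempts):
        if check():
            return True
        time.sleep(delay)
    return False
```

Calling wait_for_ckan() before make test or before opening the browser avoids hitting the site while it is still starting. The fetch and check parameters exist only so the logic can be exercised without a running instance.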

Run the integration tests.

$ make test

Stop and remove the containers and volumes associated with this setup.

$ make clean

See .env to override settings. Some settings may require a re-build (make clean build).

Note: the Solr configuration has a locking mechanism that allows only one Solr instance to access its data at a time. There are two ways to recover when Solr is stuck in this state: make clear-solr-volume destroys all of the Solr data and starts from scratch, while make unlock-solr-volume unlocks the data so that another Solr instance can access it. BE CAREFUL when running the make unlock-solr-volume command! If two Solr instances are talking to the same volume, the data may become corrupted and would need to be destroyed anyway.

Test extensions

To test extensions locally, you can run the following (TODO: update this for pytest):

docker-compose exec ckan bash
nosetests --ckan --with-pylons=src/ckan/test-catalog-next.ini src/ckanext-datagovtheme/ckanext/datagovtheme/
nosetests --ckan --with-pylons=src/ckan/test-catalog-next.ini src/ckanext-datagovtheme/ckanext/datajson/
nosetests --ckan --with-pylons=src/ckan/test-catalog-next.ini src/ckanext-datagovtheme/ckanext/geodatagov/

Run Cypress Tests

To run the UI and end-to-end (e2e) user tests, run

$ make test

Run Cypress tests interactively

To run cypress tests locally, cypress needs to be installed first. Run npm install cypress.

At this point, you will need to manually change the .env file to have CKAN_SITE_URL=http://localhost:5000. This is to cover for a docker bug upstream: docker/compose#7423
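
The manual .env change can also be scripted. This is a small sketch under the assumption that .env holds plain KEY=VALUE lines; set_env_value is a hypothetical helper, and the sample input value is made up for illustration.

```python
def set_env_value(env_text, key, value):
    """Return env_text with KEY=value set, replacing an existing line or appending one."""
    lines = env_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        # Match only real assignments, not comments that mention the key.
        if line.split("=", 1)[0].strip() == key:
            lines[i] = f"{key}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


# Sample content only; the real .env has more settings.
sample = "CKAN_SITE_URL=http://ckan:5000\nOTHER=1\n"
updated = set_env_value(sample, "CKAN_SITE_URL", "http://localhost:5000")
```

To apply it to the real file, read .env, pass its text through set_env_value, and write the result back. Remember to revert the value when you are done with the interactive Cypress session.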

Then, you can run make cypress. For WSL or complex installation, please see a data.gov team member or follow the steps laid out here.

Deploying to cloud.gov

Copy vars.yml.template to vars.yml, and customize the values in that file. Then, assuming you're logged in for the Cloud Foundry CLI:

Create the database used by CKAN itself. You have to wait a bit for the datastore DB to be available (see the cloud.gov instructions on how to know when it's up).

$ cf create-service aws-rds small-psql ${app_name}-db -c '{"version": "11"}'

Create the Redis service for cache

$ cf create-service aws-elasticache-redis redis-dev ${app_name}-redis

Create the SOLR service for data search

$ cf create-service solr-cloud base ${app_name}-solr -c solr/service-config-${space}.json -b ssb-solr-gsa-datagov-${space}

Create the secrets service to store secret environment variables. See Secrets below.

You should now be able to visit https://[ROUTE], where [ROUTE] is the route reported by cf app ${app_name}.
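
The service names above all derive from ${app_name} and ${space}. As a rough sketch, the three create-service commands can be assembled in Python for review before running them. The function names and the sample values "catalog" and "development" are assumptions for illustration; nothing here calls cf itself.

```python
import shlex


def create_service_commands(app_name, space):
    """Build the three `cf create-service` commands listed above as argument lists."""
    return [
        ["cf", "create-service", "aws-rds", "small-psql", f"{app_name}-db",
         "-c", '{"version": "11"}'],
        ["cf", "create-service", "aws-elasticache-redis", "redis-dev",
         f"{app_name}-redis"],
        ["cf", "create-service", "solr-cloud", "base", f"{app_name}-solr",
         "-c", f"solr/service-config-{space}.json",
         "-b", f"ssb-solr-gsa-datagov-{space}"],
    ]


def render(commands):
    """Quote each argument list into a copy-pasteable shell line."""
    return [shlex.join(cmd) for cmd in commands]


# Hypothetical app name and space, for illustration only.
for line in render(create_service_commands("catalog", "development")):
    print(line)
```

Keeping the commands as argument lists means they could also be passed to subprocess.run once you are logged in with the Cloud Foundry CLI, without any shell quoting concerns.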

Secrets

Tips on managing secrets. When creating the service for the first time, use create-user-provided-service instead of update-user-provided-service.

$ cf update-user-provided-service ${app_name}-secrets -p 'CKAN___BEAKER__SESSION__SECRET, SAML2_PRIVATE_KEY'
| Name | Description | Where to find |
| --- | --- | --- |
| CKAN___BEAKER__SESSION__SECRET | Session secret for encrypting CKAN sessions. | pwgen -s 32 1 |
| SAML2_PRIVATE_KEY | Base64 encoded SAML2 key matching the certificate configured for Login.gov | Google Drive |
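
If pwgen is not installed, the values in the table can be prepared with the Python standard library. This is a sketch only: generate_session_secret approximates pwgen -s 32 1 (random letters and digits), and the helper names are hypothetical. The cf -p flag also accepts an inline JSON object, which is what secrets_payload produces.

```python
import base64
import json
import secrets
import string


def generate_session_secret(length=32):
    """Random letters-and-digits string, comparable in spirit to `pwgen -s 32 1`."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def encode_private_key(pem_bytes):
    """Base64-encode key material so it fits in a single environment value."""
    return base64.b64encode(pem_bytes).decode("ascii")


def secrets_payload(session_secret, saml2_private_key_b64):
    """JSON object keyed by the two variable names from the table above."""
    return json.dumps({
        "CKAN___BEAKER__SESSION__SECRET": session_secret,
        "SAML2_PRIVATE_KEY": saml2_private_key_b64,
    })
```

Avoid printing or committing the resulting payload; pass it straight to the cf command or store it in a file that is excluded from version control.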

Login.gov integration

We use Login.gov as our SAML2 Identity Provider (IdP). Production apps use the production Login.gov instance while other apps use the Login.gov identity sandbox.

Each year in March, Login.gov rotates their credentials. See our wiki for details.

Our Service Provider (SP) certificate and key are provided through an environment variable and the user-provided service.

The Login.gov IdP metadata is stored in a file under ckan/setup/.

On Docker CKAN 2.9 images

This repository extends the Open Knowledge Foundation ckan-dev:2.9 Docker image. The ckan-base:2.9 image, if needed for some reason, is available on Docker Hub under the aforementioned tag, as referenced in OKF's docker-ckan repository.

Public docker image

If the build passes its tests, a Docker image is published to Docker Hub: https://hub.docker.com/r/datagov/catalog-next.
Extensions use this image for their own tests.

Note on requirements

The source of truth for package dependencies is managed with pip and kept in ckan/requirements.txt. The base OKFN Docker image we use, though, doesn't install all the dependencies we need. We have modified our ckan image (ckan/Dockerfile) to install the frozen requirements from ckan/requirements.txt at image build time, which helps ensure all developers work with the same set of requirements.

The Makefile target update-dependencies will use pip to generate a new requirements.txt and update ckan/requirements.txt.

$ make update-dependencies

To support cloud.gov installation via normal python buildpack, there is a symbolic link requirements.txt that references ckan/requirements.txt.

Adding new extensions in requirements

If you try to add an extension and it doesn't work, try running chown user:user -R . in the catalog.data.gov repo folder. If you ran docker as superuser and then run it as a regular user, the regular user won't be able to create the folder for the new extension.

Procedure for updating a dependency

  1. Add/change the dependency in ckan/requirements.in
  2. Run make update-dependencies build clean test
  3. Make sure to commit ckan/requirements.txt and ckan/requirements.in to make the change permanent.
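
Before committing, it can help to confirm that the frozen file actually changed. A minimal sketch, assuming requirements.txt contains plain name==version pins: it compares two versions of the file and reports the pins that differ. VCS-style requirements (git URLs) are ignored by this parser, and the function names are hypothetical.

```python
import re


def parse_pins(requirements_text):
    """Map lowercase package name -> pinned version for every `name==version` line."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        match = re.match(r"^([A-Za-z0-9._-]+)==([^\s;]+)", line)
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins


def changed_pins(before_text, after_text):
    """Return {name: (old, new)} for pins that were added, removed, or changed."""
    before, after = parse_pins(before_text), parse_pins(after_text)
    return {
        name: (before.get(name), after.get(name))
        for name in sorted(set(before) | set(after))
        if before.get(name) != after.get(name)
    }
```

An empty result after make update-dependencies suggests the change in ckan/requirements.in was not picked up, which is worth investigating before running the rest of the procedure.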

Create an extension

You can use the ckan template in much the same way as a source install, only executing the command inside the CKAN container and setting the mounted src/ folder as output:

$ docker-compose exec ckan /bin/bash -c \
"ckan generate extension"

The new extension will be created in the src/ folder. You might need to change the owner of its folder to have the appropriate permissions.

Running the debugger (pdb / ipdb)

To run a container and be able to add a breakpoint with pdb or ipdb, run the ckan-dev container with the --service-ports option:

docker-compose run --service-ports ckan

This will start a new container and display its standard output in your terminal. If you add a breakpoint to a source file in the src folder (import pdb; pdb.set_trace()), you will be able to inspect it in this terminal the next time the code is executed. If you are testing a harvest process (gather/fetch/run), try disabling the command that starts it in the background in ckan/docker-entrypoint.d/10-setup-harvest.sh. Then run the relevant command manually (make harvest fetch-queue) after startup.
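
Breakpoints left in source files are easy to forget. One possible pattern, shown here as a sketch, guards the breakpoint behind an environment variable so the code runs normally unless you opt in. CATALOG_PDB and both function names are hypothetical and do not exist in this repository.

```python
import os
import pdb
import sys


def maybe_breakpoint(env_var="CATALOG_PDB"):
    """Stop in the caller's frame only when the (hypothetical) variable is "1"."""
    if os.environ.get(env_var) == "1":
        # Same mechanism pdb.set_trace() uses, pointed at the calling frame.
        pdb.Pdb().set_trace(sys._getframe(1))
        return True
    return False


def gather_titles(records):
    """Stand-in for code under src/ that you want to inspect."""
    maybe_breakpoint()
    return [record["title"] for record in records]
```

With the variable unset, gather_titles behaves like ordinary code. Setting it to 1 in the environment of the container started with --service-ports would drop into the debugger at the call site.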

SAML2

To enable the ckanext-saml2 extension, add saml2auth to the CKAN__PLUGINS list in the .env file, then open your web browser to https://localhost:8443/dataset.
You can log into your instance with your login.gov user.
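
Settings such as CKAN__PLUGINS follow the ckanext-envvars naming convention as I understand it: uppercase, dots replaced by two underscores, and a CKAN___ prefix for keys that do not start with ckan or ckanext. The sketch below illustrates that mapping and a helper for appending a plugin to the space-separated list. All function names, and the sample plugin values in the usage note, are illustrative.

```python
def env_var_to_config_key(name):
    """CKAN__PLUGINS -> ckan.plugins; CKAN___BEAKER__SESSION__SECRET -> beaker.session.secret."""
    prefix = "CKAN___"
    if name.startswith(prefix):
        name = name[len(prefix):]
    return name.lower().replace("__", ".")


def config_key_to_env_var(key):
    """Inverse mapping, adding the CKAN___ prefix for non-ckan keys."""
    name = key.upper().replace(".", "__")
    if not (name.startswith("CKAN__") or name.startswith("CKANEXT__")):
        name = "CKAN___" + name
    return name


def add_plugin(plugins_value, plugin):
    """Append a plugin to a space-separated plugin list unless it is already present."""
    plugins = plugins_value.split()
    if plugin not in plugins:
        plugins.append(plugin)
    return " ".join(plugins)
```

For example, add_plugin("envvars harvest", "saml2auth") yields "envvars harvest saml2auth". Note that a single underscore is left alone by this mapping, so CKAN___BEAKER__SESSION_SECRET would map to beaker.session_secret rather than beaker.session.secret.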

CI

Continuous Integration via GitHub Actions.

Continuous Deployment via GitHub Actions.

Put site into maintenance mode

To block access to the catalog apps (catalog-web, catalog-admin), set the environment variables (CATALOG_WEB_MODE, CATALOG_ADMIN_MODE) in the catalog-proxy app. Use 'MAINTENANCE' for scheduled downtime, 'DOWN' for unscheduled downtime. Any other value will resume normal operation.
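
The rule above can be summarized in a few lines. This sketch only illustrates the described behavior; it is not the proxy's actual implementation, and it assumes the values are compared exactly (case-sensitive).

```python
import os


def resolve_mode(value):
    """Interpret a CATALOG_*_MODE value as described above."""
    if value == "MAINTENANCE":
        return "scheduled downtime"
    if value == "DOWN":
        return "unscheduled downtime"
    return "normal operation"  # any other value, including unset


def catalog_modes(environ=None):
    """Report the effective mode for both catalog apps."""
    environ = os.environ if environ is None else environ
    return {
        "catalog-web": resolve_mode(environ.get("CATALOG_WEB_MODE")),
        "catalog-admin": resolve_mode(environ.get("CATALOG_ADMIN_MODE")),
    }
```

On Cloud Foundry, variables like these would typically be set with cf set-env on the catalog-proxy app and picked up after a restart; treat the exact procedure as an assumption and confirm it with the team.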

catalog.data.gov's People

Contributors

adborden, amercader, avdata99, btylerburton, chris-macdermaid, codeshtuff, datagov-bot, dependabot[bot], dlennox24, fuhuxia, gerbyzation, jbrown-xentity, jin-sun-tts, jtorreggiani, mogul, nickumia-reisys, pdelboca, robert-bryson, roll, rshewitt, snyk-bot, starsinmypockets, thejuliekramer, woodt

catalog.data.gov's Issues

BUG, fail to show dataset from WAF source

Error

Viewing datasets of type "dataset" is not supported 
('Template spatial/snippets/dataset_map_sidebar.html cannot be found'). 

How to reproduce

Add a WAF source from the URL http://meta.geo.census.gov/data/existing/decennial/GEO/CPMB/boundary/2016gz/kml_aiannh_500/
Add config:

{"validator_profiles": ["iso19139ngdc"], "private_datasets": false}

(you can also import WAF sources from the import script)
This source was last harvested in 2016; we should validate that it is still expected to work.

Run the harvest process, you will get 2 datasets. You'll see the error trying to open any of these datasets.

#related to #307


Same error for New Mexico Resource Geographic Information System (NM RGIS) source
URL: http://rgismetadata.unm.edu/19115/collections/
Config

{"validator_profiles": ["iso19139ngdc"], "private_datasets": false}

We can get the dataset list, but we can't open any of the datasets.

Discovery: Integration tests for catalog-next

Analyze options to update the integration tests for this repo

| Technology | Pros | Cons |
| --- | --- | --- |
| Bats | It's already working | Not easy to create new tests |
| Cypress | It's what upstream CKAN uses now; we can reuse some preexisting CKAN tests | Requires migrating the current tests |
| Regular CKAN tests | We currently use these for extensions | Requires migrating the current tests |
| Selenium | A more general tool that could be used with Python | Requires migrating the current tests |

update-dependencies is not working

I updated the datagovtheme extension today, and after running make update-dependencies I didn't see the new version in the requirements file.

Logs:

make update-dependencies 
docker-compose run --rm -T ckan freeze-requirements.sh 1000 1000
Starting catalogdatagov_redis_1 ... done
Starting catalogdatagov_db_1    ... done
Running poetry lock ...
Creating virtualenv ckan-catalog-C0o-n6lY-py2.7 in /root/.cache/pypoetry/virtualenvs
Using virtualenv: /root/.cache/pypoetry/virtualenvs/ckan-catalog-C0o-n6lY-py2.7
Updating dependencies
Resolving dependencies...
   1: fact: ckan-catalog is 0.1.0
   1: derived: ckan-catalog
   1: fact: ckan-catalog depends on ckanext-datagovcatalog (0.0.1 git)
   1: fact: ckan-catalog depends on repoze.who (2.3)
   1: fact: ckan-catalog depends on psycopg2 (2.7.3.2)
   1: fact: ckan-catalog depends on Flask (0.12.4)
   1: fact: ckan-catalog depends on OWSLib (0.8.6)
   1: fact: ckan-catalog depends on cffi (1.12.3)
   1: fact: ckan-catalog depends on pyutilib.component.core (4.6.4)
   1: fact: ckan-catalog depends on fanstatic (0.12)
   1: fact: ckan-catalog depends on messytables (>=0.15.2)
   1: fact: ckan-catalog depends on gevent (*)
   1: fact: ckan-catalog depends on ckanext-report (0.1 git)
   1: fact: ckan-catalog depends on GeoAlchemy (>=0.6)
   1: fact: ckan-catalog depends on newrelic (*)
   1: fact: ckan-catalog depends on pyparsing (>=2.1.10)
   1: fact: ckan-catalog depends on pysolr (3.6.0)
   1: fact: ckan-catalog depends on xlrd (>=1.0.0)
   1: fact: ckan-catalog depends on ckan (2.8.6 git)
   1: fact: ckan-catalog depends on sqlparse (0.2.2)
   1: fact: ckan-catalog depends on lxml (>=2.3)
   1: fact: ckan-catalog depends on rq (0.6.0)
   1: fact: ckan-catalog depends on ckanext-googleanalyticsbasic (0.1 git)
   1: fact: ckan-catalog depends on passlib (1.7.3)
   1: fact: ckan-catalog depends on greenlet (0.4.12)
   1: fact: ckan-catalog depends on polib (1.0.7)
   1: fact: ckan-catalog depends on sqlalchemy-migrate (0.10.0)
   1: fact: ckan-catalog depends on WebHelpers (1.3)
   1: fact: ckan-catalog depends on redis (2.10.6)
   1: fact: ckan-catalog depends on google_compute_engine (2.8.13)
   1: fact: ckan-catalog depends on click (6.7)
   1: fact: ckan-catalog depends on celery (3.1.25)
   1: fact: ckan-catalog depends on ckanext-qa (2.0 git)
   1: fact: ckan-catalog depends on gunicorn (*)
   1: fact: ckan-catalog depends on WebTest (1.4.3)
   1: fact: ckan-catalog depends on simplejson (3.10.0)
   1: fact: ckan-catalog depends on kombu (3.0.37)
   1: fact: ckan-catalog depends on werkzeug (~0.15.3)
   1: fact: ckan-catalog depends on Babel (2.3.4)
   1: fact: ckan-catalog depends on python-magic (0.4.15)
   1: fact: ckan-catalog depends on Jinja2 (~2.10.1)
   1: fact: ckan-catalog depends on ckanext-geodatagov (0.1 git)
   1: fact: ckan-catalog depends on ckanext-harvest (1.3.2 git)
   1: fact: ckan-catalog depends on Paste (1.7.5.1)
   1: fact: ckan-catalog depends on SQLAlchemy (1.1.11)
   1: fact: ckan-catalog depends on Genshi (0.7.3)
   1: fact: ckan-catalog depends on cryptography (2.7)
   1: fact: ckan-catalog depends on rfc3987 (*)
   1: fact: ckan-catalog depends on urllib3 (1.25.9)
   1: fact: ckan-catalog depends on ckanext-archiver (2.0.0 git)
   1: fact: ckan-catalog depends on ply (3.4)
   1: fact: ckan-catalog depends on ckanext-spatial (0.2 git)
   1: fact: ckan-catalog depends on lepl (*)
   1: fact: ckan-catalog depends on pytz (2016.7)
   1: fact: ckan-catalog depends on tzlocal (1.3)
   1: fact: ckan-catalog depends on PasteScript (2.0.2)
   1: fact: ckan-catalog depends on bleach (~3.1.1)
   1: fact: ckan-catalog depends on python-dateutil (<2.0.0,>=1.5.0)
   1: fact: ckan-catalog depends on repoze.who-friendlyform (1.0.8)
   1: fact: ckan-catalog depends on Shapely (>=1.2.13)
   1: fact: ckan-catalog depends on Routes (1.13)
   1: fact: ckan-catalog depends on argparse (*)
   1: fact: ckan-catalog depends on pyyaml (*)
   1: fact: ckan-catalog depends on ckanext-envvars (*)
   1: fact: ckan-catalog depends on pika (>=1.1.0)
   1: fact: ckan-catalog depends on boto (*)
   1: fact: ckan-catalog depends on Flask-Babel (0.11.2)
   1: fact: ckan-catalog depends on pyOpenSSL (18.0.0)
   1: fact: ckan-catalog depends on WebOb (1.0.8)
   1: fact: ckan-catalog depends on Pylons (0.9.7)
   1: fact: ckan-catalog depends on ckantoolkit (0.0.3)
   1: fact: ckan-catalog depends on GeoAlchemy2 (0.5.0)
   1: fact: ckan-catalog depends on zope.interface (4.3.2)
   1: fact: ckan-catalog depends on progressbar (2.3)
   1: fact: ckan-catalog depends on ofs (0.4.2)
   1: fact: ckan-catalog depends on Pairtree (0.7.1-T)
   1: fact: ckan-catalog depends on unicodecsv (>=0.9)
   1: fact: ckan-catalog depends on ckanext-datagovtheme (0.1 git)
   1: fact: ckan-catalog depends on ckanext-datajson (0.1 git)
   1: fact: ckan-catalog depends on requests (~2.20.0)
   1: fact: ckan-catalog depends on Markdown (~3.1)
   1: fact: ckan-catalog depends on vdm (0.14)
   1: fact: ckan-catalog depends on jsonschema (2.4.0)
   1: selecting ckan-catalog (0.1.0)
   1: derived: jsonschema (2.4.0)
   1: derived: vdm (0.14)
   1: derived: Markdown (~3.1)
   1: derived: requests (~2.20.0)
   1: derived: ckanext-datajson (0.1 git)
   1: derived: ckanext-datagovtheme (0.1 git)
   1: derived: unicodecsv (>=0.9)
   1: derived: Pairtree (0.7.1-T)
   1: derived: ofs (0.4.2)
   1: derived: progressbar (2.3)
   1: derived: zope.interface (4.3.2)
   1: derived: GeoAlchemy2 (0.5.0)
   1: derived: ckantoolkit (0.0.3)
   1: derived: Pylons (0.9.7)
   1: derived: WebOb (1.0.8)
   1: derived: pyOpenSSL (18.0.0)
   1: derived: Flask-Babel (0.11.2)
   1: derived: boto (*)
   1: derived: pika (>=1.1.0)
   1: derived: ckanext-envvars (*)
   1: derived: pyyaml (*)
   1: derived: argparse (*)
   1: derived: Routes (1.13)
   1: derived: Shapely (>=1.2.13)
   1: derived: repoze.who-friendlyform (1.0.8)
   1: derived: python-dateutil (<2.0.0,>=1.5.0)
   1: derived: bleach (~3.1.1)
   1: derived: PasteScript (2.0.2)
   1: derived: tzlocal (1.3)
   1: derived: pytz (2016.7)
   1: derived: lepl (*)
   1: derived: ckanext-spatial (0.2 git)
   1: derived: ply (3.4)
   1: derived: ckanext-archiver (2.0.0 git)
   1: derived: urllib3 (1.25.9)
   1: derived: rfc3987 (*)
   1: derived: cryptography (2.7)
   1: derived: Genshi (0.7.3)
   1: derived: SQLAlchemy (1.1.11)
   1: derived: Paste (1.7.5.1)
   1: derived: ckanext-harvest (1.3.2 git)
   1: derived: ckanext-geodatagov (0.1 git)
   1: derived: Jinja2 (~2.10.1)
   1: derived: python-magic (0.4.15)
   1: derived: Babel (2.3.4)
   1: derived: werkzeug (~0.15.3)
   1: derived: kombu (3.0.37)
   1: derived: simplejson (3.10.0)
   1: derived: WebTest (1.4.3)
   1: derived: gunicorn (*)
   1: derived: ckanext-qa (2.0 git)
   1: derived: celery (3.1.25)
   1: derived: click (6.7)
   1: derived: google_compute_engine (2.8.13)
   1: derived: redis (2.10.6)
   1: derived: WebHelpers (1.3)
   1: derived: sqlalchemy-migrate (0.10.0)
   1: derived: polib (1.0.7)
   1: derived: greenlet (0.4.12)
   1: derived: passlib (1.7.3)
   1: derived: ckanext-googleanalyticsbasic (0.1 git)
   1: derived: rq (0.6.0)
   1: derived: lxml (>=2.3)
   1: derived: sqlparse (0.2.2)
   1: derived: ckan (2.8.6 git)
   1: derived: xlrd (>=1.0.0)
   1: derived: pysolr (3.6.0)
   1: derived: pyparsing (>=2.1.10)
   1: derived: newrelic (*)
   1: derived: GeoAlchemy (>=0.6)
   1: derived: ckanext-report (0.1 git)
   1: derived: gevent (*)
   1: derived: messytables (>=0.15.2)
   1: derived: fanstatic (0.12)
   1: derived: pyutilib.component.core (4.6.4)
   1: derived: cffi (1.12.3)
   1: derived: OWSLib (0.8.6)
   1: derived: Flask (0.12.4)
   1: derived: psycopg2 (2.7.3.2)
   1: derived: repoze.who (2.3)
   1: derived: ckanext-datagovcatalog (0.0.1 git)
PyPI: 1 packages found for jsonschema 2.4.0
PyPI: No release information found for vdm-0.8, skipping
PyPI: 1 packages found for vdm 0.14
PyPI: No release information found for markdown-1.6, skipping
PyPI: 2 packages found for markdown >=3.1,<3.2
PyPI: No release information found for requests-0.12.01, skipping
PyPI: No release information found for requests-2.15.0, skipping
PyPI: No release information found for requests-0.0.1, skipping
PyPI: 2 packages found for requests >=2.20.0,<2.21.0
PyPI: No release information found for unicodecsv-0.9.0, skipping
PyPI: 9 packages found for unicodecsv >=0.9
PyPI: 1 packages found for pairtree 0.7.1-T
PyPI: 1 packages found for ofs 0.4.2
PyPI: No release information found for progressbar-2.3-dev, skipping
PyPI: 1 packages found for progressbar 2.3
PyPI: No release information found for zope.interface-3.0.0b1, skipping
PyPI: 1 packages found for zope.interface 4.3.2
PyPI: 1 packages found for geoalchemy2 0.5.0
PyPI: 1 packages found for ckantoolkit 0.0.3
PyPI: 1 packages found for pylons 0.9.7
PyPI: 1 packages found for webob 1.0.8
PyPI: No release information found for pyopenssl-0.11, skipping
PyPI: 1 packages found for pyopenssl 18.0.0
PyPI: 1 packages found for flask-babel 0.11.2
PyPI: 81 packages found for boto *
PyPI: 1 packages found for pika >=1.1.0
PyPI: 1 packages found for ckanext-envvars *
PyPI: No release information found for pyyaml-3.09, skipping
PyPI: No release information found for pyyaml-3.08, skipping
PyPI: No release information found for pyyaml-3.03, skipping
PyPI: No release information found for pyyaml-3.02, skipping
PyPI: No release information found for pyyaml-3.01, skipping
PyPI: No release information found for pyyaml-3.07, skipping
PyPI: No release information found for pyyaml-3.06, skipping
PyPI: No release information found for pyyaml-3.05, skipping
PyPI: No release information found for pyyaml-3.04, skipping
PyPI: 10 packages found for pyyaml *
PyPI: No release information found for argparse-0.0.1, skipping
PyPI: No release information found for argparse-1.2, skipping
PyPI: 17 packages found for argparse *
PyPI: No release information found for routes-2.4.0, skipping
PyPI: 1 packages found for routes 1.13
PyPI: 45 packages found for shapely >=1.2.13
PyPI: 1 packages found for repoze.who-friendlyform 1.0.8
PyPI: No release information found for python-dateutil-1.0, skipping
PyPI: No release information found for python-dateutil-1.1, skipping
PyPI: No release information found for python-dateutil-1.2, skipping
PyPI: No release information found for python-dateutil-0.1, skipping
PyPI: No release information found for python-dateutil-0.3, skipping
PyPI: No release information found for python-dateutil-0.5, skipping
PyPI: No release information found for python-dateutil-0.4, skipping
PyPI: No release information found for python-dateutil-2.0, skipping
PyPI: 1 packages found for python-dateutil >=1.5.0,<2.0.0
PyPI: No release information found for bleach-0.1, skipping
PyPI: No release information found for bleach-0.2, skipping
PyPI: No release information found for bleach-0.1.1, skipping
PyPI: No release information found for bleach-0.1.2, skipping
PyPI: 5 packages found for bleach >=3.1.1,<3.2.0
PyPI: 1 packages found for pastescript 2.0.2
PyPI: 1 packages found for tzlocal 1.3
PyPI: 1 packages found for pytz 2016.7
PyPI: 44 packages found for lepl *
PyPI: No release information found for ply-1.6, skipping
PyPI: No release information found for ply-1.8, skipping
PyPI: No release information found for ply-3.3, skipping
PyPI: No release information found for ply-3.1, skipping
PyPI: No release information found for ply-2.2, skipping
PyPI: No release information found for ply-2.1, skipping
PyPI: No release information found for ply-2.0, skipping
PyPI: No release information found for ply-2.5, skipping
PyPI: No release information found for ply-2.4, skipping
PyPI: 1 packages found for ply 3.4
PyPI: No release information found for urllib3-0.3, skipping
PyPI: No release information found for urllib3-0.3.1, skipping
PyPI: No release information found for urllib3-0.4.1, skipping
PyPI: No release information found for urllib3-0.4.0, skipping
PyPI: 1 packages found for urllib3 1.25.9
PyPI: 11 packages found for rfc3987 *
PyPI: 1 packages found for cryptography 2.7
PyPI: No release information found for genshi-0.5, skipping
PyPI: No release information found for genshi-0.3.2, skipping
PyPI: No release information found for genshi-0.3.3, skipping
PyPI: No release information found for genshi-0.3, skipping
PyPI: No release information found for genshi-0.3.1, skipping
PyPI: No release information found for genshi-0.3.6, skipping
PyPI: No release information found for genshi-0.4, skipping
PyPI: No release information found for genshi-0.3.4, skipping
PyPI: No release information found for genshi-0.3.5, skipping
PyPI: No release information found for genshi-0.4.1, skipping
PyPI: No release information found for genshi-0.5.1, skipping
PyPI: No release information found for genshi-0.4.3, skipping
PyPI: No release information found for genshi-0.4.2, skipping
PyPI: No release information found for genshi-0.4.4, skipping
PyPI: 1 packages found for genshi 0.7.3
PyPI: 1 packages found for sqlalchemy 1.1.11
PyPI: 1 packages found for paste 1.7.5.1
PyPI: 3 packages found for jinja2 >=2.10.1,<2.11.0
PyPI: No release information found for python-magic-0.4.1, skipping
PyPI: 1 packages found for python-magic 0.4.15
PyPI: No release information found for babel-0.9, skipping
PyPI: No release information found for babel-0.8, skipping
PyPI: No release information found for babel-0.9.1, skipping
PyPI: No release information found for babel-0.9.2, skipping
PyPI: No release information found for babel-0.9.3, skipping
PyPI: No release information found for babel-0.9.4, skipping
PyPI: No release information found for babel-0.9.5, skipping
PyPI: No release information found for babel-0.8.1, skipping
PyPI: 1 packages found for babel 2.3.4
PyPI: No release information found for werkzeug-0.10.3, skipping
PyPI: 4 packages found for werkzeug >=0.15.3,<0.16.0
PyPI: No release information found for kombu-3.0.17-20140602, skipping
PyPI: 1 packages found for kombu 3.0.37
PyPI: No release information found for simplejson-2.1.0rc3, skipping
PyPI: 1 packages found for simplejson 3.10.0
PyPI: 1 packages found for webtest 1.4.3
PyPI: No release information found for gunicorn-20.0.1, skipping
PyPI: 79 packages found for gunicorn *
PyPI: 1 packages found for celery 3.1.25
PyPI: 1 packages found for click 6.7
PyPI: 1 packages found for google-compute-engine 2.8.13
PyPI: 1 packages found for redis 2.10.6
PyPI: 1 packages found for webhelpers 1.3
PyPI: No release information found for sqlalchemy-migrate-0.1, skipping
PyPI: No release information found for sqlalchemy-migrate-0.2.2, skipping
PyPI: No release information found for sqlalchemy-migrate-0.2.1, skipping
PyPI: No release information found for sqlalchemy-migrate-0.2.0, skipping
PyPI: No release information found for sqlalchemy-migrate-0.4.0, skipping
PyPI: 1 packages found for sqlalchemy-migrate 0.10.0
PyPI: No release information found for polib-0.5.4, skipping
PyPI: No release information found for polib-0.4.2, skipping
PyPI: No release information found for polib-0.5.0, skipping
PyPI: No release information found for polib-0.4.1, skipping
PyPI: No release information found for polib-0.4.0, skipping
PyPI: 1 packages found for polib 1.0.7
PyPI: 1 packages found for greenlet 0.4.12
PyPI: 1 packages found for passlib 1.7.3
PyPI: 1 packages found for rq 0.6.0
PyPI: No release information found for lxml-2.3alpha2, skipping
PyPI: No release information found for lxml-2.3alpha1, skipping
PyPI: No release information found for lxml-1.3.1, skipping
PyPI: 67 packages found for lxml >=2.3
PyPI: 1 packages found for sqlparse 0.2.2
PyPI: 3 packages found for xlrd >=1.0.0
PyPI: 1 packages found for pysolr 3.6.0
PyPI: No release information found for pyparsing-1.1.2, skipping
PyPI: No release information found for pyparsing-1.2, skipping
PyPI: No release information found for pyparsing-1.3.3, skipping
PyPI: 14 packages found for pyparsing >=2.1.10
PyPI: No release information found for newrelic-0.5.52.109, skipping
PyPI: No release information found for newrelic-1.10.1.36, skipping
PyPI: No release information found for newrelic-0.5.48.104, skipping
PyPI: No release information found for newrelic-2.16.0.12, skipping
PyPI: No release information found for newrelic-1.9.0.21, skipping
PyPI: No release information found for newrelic-1.0.5.156, skipping
PyPI: No release information found for newrelic-1.10.2.38, skipping
PyPI: No release information found for newrelic-0.5.47.103, skipping
PyPI: No release information found for newrelic-0.5.58.122, skipping
PyPI: No release information found for newrelic-1.6.0.13, skipping
PyPI: No release information found for newrelic-1.11.0.55, skipping
PyPI: No release information found for newrelic-1.0.3.138, skipping
PyPI: No release information found for newrelic-1.1.0.192, skipping
PyPI: No release information found for newrelic-1.7.0.31, skipping
PyPI: No release information found for newrelic-1.10.0.28, skipping
PyPI: No release information found for newrelic-0.5.49.105, skipping
PyPI: No release information found for newrelic-1.4.0.137, skipping
PyPI: No release information found for newrelic-1.2.1.265, skipping
PyPI: No release information found for newrelic-1.3.0.289, skipping
PyPI: No release information found for newrelic-1.8.0.13, skipping
PyPI: No release information found for newrelic-1.2.0.246, skipping
PyPI: No release information found for newrelic-1.13.0.30, skipping
PyPI: No release information found for newrelic-1.12.0.56, skipping
PyPI: No release information found for newrelic-1.0.2.130, skipping
PyPI: No release information found for newrelic-0.5.50.107, skipping
PyPI: No release information found for newrelic-1.5.0.103, skipping
PyPI: No release information found for newrelic-2.86.3.69, skipping
PyPI: 112 packages found for newrelic *
PyPI: 4 packages found for geoalchemy >=0.6
PyPI: 46 packages found for gevent *
PyPI: 1 packages found for messytables >=0.15.2
PyPI: 1 packages found for fanstatic 0.12
PyPI: 1 packages found for pyutilib.component.core 4.6.4
PyPI: 1 packages found for cffi 1.12.3
PyPI: No release information found for owslib-0.10.2, skipping
PyPI: 1 packages found for owslib 0.8.6
PyPI: 1 packages found for flask 0.12.4
PyPI: No release information found for psycopg2-2.0.7, skipping
PyPI: No release information found for psycopg2-2.0.6, skipping
PyPI: No release information found for psycopg2-2.0.4, skipping
PyPI: No release information found for psycopg2-2.0.3, skipping
PyPI: No release information found for psycopg2-2.0.2, skipping
PyPI: No release information found for psycopg2-2.0.5.1, skipping
PyPI: No release information found for psycopg2-2.0.8, skipping
PyPI: 1 packages found for psycopg2 2.7.3.2
PyPI: 1 packages found for repoze.who 2.3
PyPI: Getting info for jsonschema (2.4.0) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading wheel: jsonschema-2.4.0-py2.py3-none-any.whl
   1: selecting jsonschema (2.4.0)
PyPI: Getting info for vdm (0.14) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: vdm-0.14.tar.gz
   1: selecting vdm (0.14)
   1: selecting ckanext-datajson (0.1 2de3535)
   1: selecting ckanext-datagovtheme (0.1 04d32b7)
PyPI: Getting info for pairtree (0.7.1-T) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: Pairtree-0.7.1-T.tar.gz
   1: selecting pairtree (0.7.1-T)
PyPI: Getting info for ofs (0.4.2) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: ofs-0.4.2.tar.gz
   1: fact: ofs (0.4.2) depends on argparse (*)
   1: selecting ofs (0.4.2)
PyPI: Getting info for progressbar (2.3) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: progressbar-2.3.tar.gz
   1: selecting progressbar (2.3)
PyPI: Getting info for zope.interface (4.3.2) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: zope.interface-4.3.2.tar.gz
   1: selecting zope.interface (4.3.2)
PyPI: Getting info for geoalchemy2 (0.5.0) from PyPI
   1: fact: geoalchemy2 (0.5.0) depends on SQLAlchemy (>=0.8)
   1: selecting geoalchemy2 (0.5.0)
PyPI: Getting info for ckantoolkit (0.0.3) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: ckantoolkit-0.0.3.tar.gz
   1: selecting ckantoolkit (0.0.3)
PyPI: Getting info for pylons (0.9.7) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: Pylons-0.9.7.tar.gz
   1: fact: pylons (0.9.7) depends on Routes (>=1.10.3)
   1: fact: pylons (0.9.7) depends on WebHelpers (>=0.6.4)
   1: fact: pylons (0.9.7) depends on Beaker (>=1.2.2)
   1: fact: pylons (0.9.7) depends on Paste (>=1.7.2)
   1: fact: pylons (0.9.7) depends on PasteDeploy (>=1.3.3)
   1: fact: pylons (0.9.7) depends on PasteScript (>=1.7.3)
   1: fact: pylons (0.9.7) depends on FormEncode (>=1.2.1)
   1: fact: pylons (0.9.7) depends on simplejson (>=2.0.8)
   1: fact: pylons (0.9.7) depends on decorator (>=2.3.2)
   1: fact: pylons (0.9.7) depends on nose (>=0.10.4)
   1: fact: pylons (0.9.7) depends on Mako (>=0.2.4)
   1: fact: pylons (0.9.7) depends on WebOb (>=0.9.6.1)
   1: fact: pylons (0.9.7) depends on WebError (>=0.10.1)
   1: fact: pylons (0.9.7) depends on WebTest (>=1.1)
   1: fact: pylons (0.9.7) depends on Tempita (>=0.2)
   1: selecting pylons (0.9.7)
   1: derived: Tempita (>=0.2)
   1: derived: WebError (>=0.10.1)
   1: derived: Mako (>=0.2.4)
   1: derived: nose (>=0.10.4)
   1: derived: decorator (>=2.3.2)
   1: derived: FormEncode (>=1.2.1)
   1: derived: PasteDeploy (>=1.3.3)
   1: derived: Beaker (>=1.2.2)
PyPI: 5 packages found for tempita >=0.2
PyPI: No release information found for weberror-0.8dev-20071109, skipping
PyPI: 7 packages found for weberror >=0.10.1
PyPI: 43 packages found for mako >=0.2.4
PyPI: No release information found for nose-1.3.5, skipping
PyPI: 13 packages found for nose >=0.10.4
PyPI: No release information found for decorator-4.0.8, skipping
PyPI: No release information found for decorator-3.4.1, skipping
PyPI: 25 packages found for decorator >=2.3.2
PyPI: No release information found for formencode-1.2.4dev, skipping
PyPI: 8 packages found for formencode >=1.2.1
PyPI: No release information found for pastedeploy-0.9.7dev-r5510, skipping
PyPI: 9 packages found for pastedeploy >=1.3.3
PyPI: No release information found for beaker-1.6.5, skipping
PyPI: 27 packages found for beaker >=1.2.2
PyPI: Getting info for webob (1.0.8) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: WebOb-1.0.8.zip
   1: selecting webob (1.0.8)
PyPI: Getting info for pyopenssl (18.0.0) from PyPI
   1: fact: pyopenssl (18.0.0) depends on cryptography (>=2.2.1)
   1: fact: pyopenssl (18.0.0) depends on six (>=1.5.2)
   1: selecting pyopenssl (18.0.0)
   1: derived: six (>=1.5.2)
PyPI: 15 packages found for six >=1.5.2
PyPI: Getting info for flask-babel (0.11.2) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading wheel: Flask_Babel-0.11.2-py2.py3-none-any.whl
   1: fact: flask-babel (0.11.2) depends on Flask (*)
   1: fact: flask-babel (0.11.2) depends on Babel (>=2.3)
   1: fact: flask-babel (0.11.2) depends on Jinja2 (>=2.5)
   1: selecting flask-babel (0.11.2)
PyPI: Getting info for pika (1.1.0) from PyPI
   1: selecting pika (1.1.0)
PyPI: Getting info for ckanext-envvars (0.0.1) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: ckanext-envvars-0.0.1.tar.gz
   1: selecting ckanext-envvars (0.0.1)
PyPI: Getting info for routes (1.13) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: Routes-1.13.tar.gz
   1: fact: routes (1.13) depends on repoze.lru (>=0.3)
   1: selecting routes (1.13)
   1: derived: repoze.lru (>=0.3)
PyPI: 5 packages found for repoze.lru >=0.3
PyPI: Getting info for repoze.who-friendlyform (1.0.8) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: repoze.who-friendlyform-1.0.8.tar.gz
   1: fact: repoze.who-friendlyform (1.0.8) depends on repoze.who (>=1.0)
   1: fact: repoze.who-friendlyform (1.0.8) depends on zope.interface (*)
   1: fact: repoze.who-friendlyform (1.0.8) depends on WebOb (>=0.9.7)
   1: selecting repoze.who-friendlyform (1.0.8)
PyPI: Getting info for python-dateutil (1.5) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: python-dateutil-1.5.tar.gz
   1: selecting python-dateutil (1.5)
PyPI: Getting info for pastescript (2.0.2) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading wheel: PasteScript-2.0.2-py2.py3-none-any.whl
   1: fact: pastescript (2.0.2) depends on Paste (>=1.3)
   1: fact: pastescript (2.0.2) depends on PasteDeploy (*)
   1: fact: pastescript (2.0.2) depends on six (*)
   1: selecting pastescript (2.0.2)
PyPI: Getting info for tzlocal (1.3) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: tzlocal-1.3.tar.gz
   1: fact: tzlocal (1.3) depends on pytz (*)
   1: selecting tzlocal (1.3)
PyPI: Getting info for pytz (2016.7) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading wheel: pytz-2016.7-py2.py3-none-any.whl
   1: selecting pytz (2016.7)
   1: selecting ckanext-spatial (0.2 4ac25f1)
PyPI: Getting info for ply (3.4) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: ply-3.4.tar.gz
   1: selecting ply (3.4)
   1: fact: ckanext-archiver (2.0.0) depends on SQLAlchemy (>=0.6.6)
   1: fact: ckanext-archiver (2.0.0) depends on requests (>=1.1.0)
   1: fact: ckanext-archiver (2.0.0) depends on progressbar (*)
   1: fact: ckanext-archiver (2.0.0) depends on ckanext-report (*)
   1: selecting ckanext-archiver (2.0.0 4cb10ac)
PyPI: Getting info for urllib3 (1.25.9) from PyPI
   1: selecting urllib3 (1.25.9)
PyPI: Getting info for cryptography (2.7) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: cryptography-2.7.tar.gz
   1: fact: cryptography (2.7) depends on asn1crypto (>=0.21.0)
   1: fact: cryptography (2.7) depends on six (>=1.4.1)
   1: fact: cryptography (2.7) depends on cffi (>=1.8,<1.11.3 || >1.11.3)
   1: fact: cryptography (2.7) depends on enum34 (*)
   1: fact: cryptography (2.7) depends on ipaddress (*)
   1: selecting cryptography (2.7)
   1: derived: ipaddress (*)
   1: derived: enum34 (*)
   1: derived: asn1crypto (>=0.21.0)
PyPI: 22 packages found for ipaddress *
PyPI: 30 packages found for enum34 *
PyPI: 11 packages found for asn1crypto >=0.21.0
PyPI: Getting info for genshi (0.7.3) from PyPI
   1: selecting genshi (0.7.3)
PyPI: Getting info for sqlalchemy (1.1.11) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: SQLAlchemy-1.1.11.tar.gz
   1: selecting sqlalchemy (1.1.11)
PyPI: Getting info for paste (1.7.5.1) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: Paste-1.7.5.1.tar.gz
   1: selecting paste (1.7.5.1)
   1: selecting ckanext-harvest (1.3.2 8cde93c)
   1: selecting ckanext-geodatagov (0.1 9df6669)
PyPI: Getting info for python-magic (0.4.15) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading wheel: python_magic-0.4.15-py2.py3-none-any.whl
   1: selecting python-magic (0.4.15)
PyPI: Getting info for babel (2.3.4) from PyPI
   1: fact: babel (2.3.4) depends on pytz (>=0a)
   1: selecting babel (2.3.4)
PyPI: Getting info for kombu (3.0.37) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading wheel: kombu-3.0.37-py2.py3-none-any.whl
   1: fact: kombu (3.0.37) depends on anyjson (>=0.3.3)
   1: fact: kombu (3.0.37) depends on amqp (>=1.4.9,<2.0)
   1: selecting kombu (3.0.37)
   1: derived: amqp (>=1.4.9,<2.0)
   1: derived: anyjson (>=0.3.3)
PyPI: No release information found for amqp-0.0.1, skipping
PyPI: 1 packages found for amqp >=1.4.9,<2.0
PyPI: 1 packages found for anyjson >=0.3.3
PyPI: Getting info for simplejson (3.10.0) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: simplejson-3.10.0.tar.gz
   1: selecting simplejson (3.10.0)
PyPI: Getting info for webtest (1.4.3) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: WebTest-1.4.3.zip
   1: fact: webtest (1.4.3) depends on WebOb (*)
   1: selecting webtest (1.4.3)
   1: fact: ckanext-qa (2.0) depends on ckanext-archiver (>=2.0)
   1: fact: ckanext-qa (2.0) depends on ckanext-report (*)
   1: fact: ckanext-qa (2.0) depends on SQLAlchemy (>=0.6.6)
   1: fact: ckanext-qa (2.0) depends on requests (*)
   1: fact: ckanext-qa (2.0) depends on xlrd (>=0.8.0)
   1: fact: ckanext-qa (2.0) depends on messytables (>=0.8)
   1: fact: ckanext-qa (2.0) depends on python-magic (>=0.4)
   1: fact: ckanext-qa (2.0) depends on progressbar (*)
   1: fact: ckanext-qa (2.0) depends on six (>=1.9)
   1: selecting ckanext-qa (2.0 d7d384c)
   1: derived: six (>=1.9)
PyPI: Getting info for celery (3.1.25) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading wheel: celery-3.1.25-py2.py3-none-any.whl
   1: fact: celery (3.1.25) depends on billiard (>=3.3.0.23,<3.4)
   1: fact: celery (3.1.25) depends on kombu (>=3.0.37,<3.1)
   1: fact: celery (3.1.25) depends on pytz (>0.0-dev)
   1: selecting celery (3.1.25)
   1: derived: billiard (>=3.3.0.23,<3.4)
PyPI: 1 packages found for billiard >=3.3.0.23,<3.4
PyPI: Getting info for click (6.7) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading wheel: click-6.7-py2.py3-none-any.whl
   1: selecting click (6.7)
PyPI: Getting info for google-compute-engine (2.8.13) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: google-compute-engine-2.8.13.tar.gz
   1: fact: google-compute-engine (2.8.13) depends on boto (*)
   1: fact: google-compute-engine (2.8.13) depends on distro (*)
   1: selecting google-compute-engine (2.8.13)
   1: derived: distro (*)
PyPI: 12 packages found for distro *
PyPI: Getting info for redis (2.10.6) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading wheel: redis-2.10.6-py2.py3-none-any.whl
   1: selecting redis (2.10.6)
PyPI: Getting info for webhelpers (1.3) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: WebHelpers-1.3.tar.gz
   1: fact: webhelpers (1.3) depends on MarkupSafe (>=0.9.2)
   1: selecting webhelpers (1.3)
   1: derived: MarkupSafe (>=0.9.2)
PyPI: 18 packages found for markupsafe >=0.9.2
PyPI: Getting info for sqlalchemy-migrate (0.10.0) from PyPI
   1: fact: sqlalchemy-migrate (0.10.0) depends on pbr (>=1.3,<2.0)
   1: fact: sqlalchemy-migrate (0.10.0) depends on SQLAlchemy (>=0.7.8,<0.9.5 || >0.9.5)
   1: fact: sqlalchemy-migrate (0.10.0) depends on decorator (*)
   1: fact: sqlalchemy-migrate (0.10.0) depends on six (>=1.7.0)
   1: fact: sqlalchemy-migrate (0.10.0) depends on sqlparse (*)
   1: fact: sqlalchemy-migrate (0.10.0) depends on Tempita (>=0.4)
   1: selecting sqlalchemy-migrate (0.10.0)
   1: derived: Tempita (>=0.4)
   1: derived: pbr (>=1.3,<2.0)
PyPI: 10 packages found for pbr >=1.3,<2.0
PyPI: Getting info for polib (1.0.7) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading wheel: polib-1.0.7-py2.py3-none-any.whl
   1: selecting polib (1.0.7)
PyPI: Getting info for greenlet (0.4.12) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: greenlet-0.4.12.tar.gz
   1: selecting greenlet (0.4.12)
PyPI: Getting info for passlib (1.7.3) from PyPI
   1: selecting passlib (1.7.3)
   1: selecting ckanext-googleanalyticsbasic (0.1 54647da)
PyPI: Getting info for rq (0.6.0) from PyPI
   1: fact: rq (0.6.0) depends on redis (>=2.7.0)
   1: fact: rq (0.6.0) depends on click (>=3.0)
   1: selecting rq (0.6.0)
PyPI: Getting info for sqlparse (0.2.2) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading wheel: sqlparse-0.2.2-py2.py3-none-any.whl
   1: selecting sqlparse (0.2.2)
   1: selecting ckan (2.8.6 eca78d5)
PyPI: Getting info for pysolr (3.6.0) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading wheel: pysolr-3.6.0-py2.py3-none-any.whl
   1: fact: pysolr (3.6.0) depends on requests (>=2.9.1)
   1: selecting pysolr (3.6.0)
   1: selecting ckanext-report (0.1 b67875b)
PyPI: Getting info for messytables (0.15.2) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: messytables-0.15.2.tar.gz
   1: fact: messytables (0.15.2) depends on xlrd (>=0.8.0)
   1: fact: messytables (0.15.2) depends on python-magic (>=0.4.12)
   1: fact: messytables (0.15.2) depends on chardet (>=2.3.0)
   1: fact: messytables (0.15.2) depends on python-dateutil (>=1.5.0)
   1: fact: messytables (0.15.2) depends on lxml (>=3.2)
   1: fact: messytables (0.15.2) depends on requests (*)
   1: fact: messytables (0.15.2) depends on html5lib (*)
   1: fact: messytables (0.15.2) depends on json-table-schema (>=0.2,<=0.2.1)
   1: selecting messytables (0.15.2)
   1: derived: json-table-schema (>=0.2,<=0.2.1)
   1: derived: html5lib (*)
   1: derived: lxml (>=3.2)
   1: derived: chardet (>=2.3.0)
PyPI: 2 packages found for json-table-schema >=0.2,<=0.2.1
PyPI: 18 packages found for html5lib *
PyPI: 6 packages found for chardet >=2.3.0
PyPI: Getting info for fanstatic (0.12) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: fanstatic-0.12.tar.gz
   1: fact: fanstatic (0.12) depends on Paste (*)
   1: fact: fanstatic (0.12) depends on WebOb (*)
   1: selecting fanstatic (0.12)
PyPI: Getting info for pyutilib.component.core (4.6.4) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading wheel: pyutilib.component.core-4.6.4-py2-none-any.whl
   1: fact: pyutilib.component.core (4.6.4) depends on six (*)
   1: selecting pyutilib.component.core (4.6.4)
PyPI: Getting info for cffi (1.12.3) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: cffi-1.12.3.tar.gz
   1: fact: cffi (1.12.3) depends on pycparser (*)
   1: selecting cffi (1.12.3)
   1: derived: pycparser (*)
PyPI: 20 packages found for pycparser *
PyPI: Getting info for owslib (0.8.6) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: OWSLib-0.8.6.tar.gz
   1: fact: owslib (0.8.6) depends on python-dateutil (>=1.5)
   1: fact: owslib (0.8.6) depends on pytz (*)
   1: selecting owslib (0.8.6)
PyPI: Getting info for flask (0.12.4) from PyPI
   1: fact: flask (0.12.4) depends on Werkzeug (>=0.7)
   1: fact: flask (0.12.4) depends on Jinja2 (>=2.4)
   1: fact: flask (0.12.4) depends on itsdangerous (>=0.21)
   1: fact: flask (0.12.4) depends on click (>=2.0)
   1: selecting flask (0.12.4)
   1: derived: itsdangerous (>=0.21)
PyPI: 5 packages found for itsdangerous >=0.21
PyPI: Getting info for psycopg2 (2.7.3.2) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: psycopg2-2.7.3.2.tar.gz
   1: selecting psycopg2 (2.7.3.2)
PyPI: Getting info for repoze.who (2.3) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading wheel: repoze.who-2.3-py2-none-any.whl
PyPI: Downloading wheel: repoze.who-2.3-py3-none-any.whl
   1: fact: repoze.who (2.3) depends on WebOb (*)
   1: fact: repoze.who (2.3) depends on zope.interface (*)
   1: selecting repoze.who (2.3)
   1: selecting ckanext-datagovcatalog (0.0.1 2c724d8)
PyPI: Getting info for amqp (1.4.9) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading wheel: amqp-1.4.9-py2.py3-none-any.whl
   1: selecting amqp (1.4.9)
PyPI: Getting info for anyjson (0.3.3) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: anyjson-0.3.3.tar.gz
   1: selecting anyjson (0.3.3)
PyPI: Getting info for billiard (3.3.0.23) from PyPI
PyPI: No dependencies found, downloading archives
PyPI: Downloading sdist: billiard-3.3.0.23.tar.gz
   1: selecting billiard (3.3.0.23)
PyPI: Getting info for markdown (3.1.1) from PyPI
   1: selecting markdown (3.1.1)
PyPI: Getting info for requests (2.20.1) from PyPI
   1: fact: requests (2.20.1) depends on chardet (>=3.0.2,<3.1.0)
   1: fact: requests (2.20.1) depends on idna (>=2.5,<2.8)
   1: fact: requests (2.20.1) depends on urllib3 (>=1.21.1,<1.25)
   1: fact: requests (2.20.1) depends on certifi (>=2017.4.17)
   1: derived: not requests (2.20.1)
PyPI: Getting info for requests (2.20.0) from PyPI
   1: fact: requests (2.20.0) depends on chardet (>=3.0.2,<3.1.0)
   1: fact: requests (2.20.0) depends on idna (>=2.5,<2.8)
   1: fact: requests (2.20.0) depends on urllib3 (>=1.21.1,<1.25)
   1: fact: requests (2.20.0) depends on certifi (>=2017.4.17)
   1: derived: not requests (2.20.0)
   1: fact: no versions of requests match >2.20.0,<2.20.1 || >2.20.1,<2.21.0
   1: conflict: no versions of requests match >2.20.0,<2.20.1 || >2.20.1,<2.21.0
   1: ! requests (>2.20.0,<2.20.1 || >2.20.1,<2.21.0) is partially satisfied by not requests (2.20.0)
   1: ! which is caused by "requests (2.20.0) depends on urllib3 (>=1.21.1,<1.25)"
   1: ! thus: requests (>=2.20.0,<2.20.1 || >2.20.1,<2.21.0) requires urllib3 (>=1.21.1,<1.25)
   1: fact: requests (>=2.20.0,<2.20.1 || >2.20.1,<2.21.0) requires urllib3 (>=1.21.1,<1.25)
   1: derived: not requests (>=2.20.0,<2.20.1 || >2.20.1,<2.21.0)
   1: derived: certifi (>=2017.4.17)
   1: conflict: requests (2.20.1) depends on urllib3 (>=1.21.1,<1.25)
   1: ! requests (2.20.1) is partially satisfied by not requests (>=2.20.0,<2.20.1 || >2.20.1,<2.21.0)
   1: ! which is caused by "requests (>=2.20.0,<2.20.1 || >2.20.1,<2.21.0) requires urllib3 (>=1.21.1,<1.25)"
   1: ! thus: requests (>=2.20.0,<2.21.0) requires urllib3 (>=1.21.1,<1.25)
   1: ! not urllib3 (>=1.21.1,<1.25) is satisfied by urllib3 (1.25.9)
   1: ! which is caused by "ckan-catalog depends on urllib3 (1.25.9)"
   1: ! thus: requests is forbidden
   1: ! requests (>=2.20.0,<2.21.0) is satisfied by requests (~2.20.0)
   1: ! which is caused by "ckan-catalog depends on requests (~2.20.0)"
   1: ! thus: version solving failed
   1: Version solving took 366.890 seconds.
   1: Tried 1 solutions.

[SolverProblemError]
Because no versions of requests match >2.20.0,<2.20.1 || >2.20.1,<2.21.0
 and requests (2.20.0) depends on urllib3 (>=1.21.1,<1.25), requests (>=2.20.0,<2.20.1 || >2.20.1,<2.21.0) requires urllib3 (>=1.21.1,<1.25).
And because requests (2.20.1) depends on urllib3 (>=1.21.1,<1.25), requests (>=2.20.0,<2.21.0) requires urllib3 (>=1.21.1,<1.25).
So, because ckan-catalog depends on both urllib3 (1.25.9) and requests (~2.20.0), version solving failed.

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/clikit/console_application.py", line 131, in run
    status_code = command.handle(parsed_args, io)
  File "/usr/lib/python2.7/site-packages/clikit/api/command/command.py", line 120, in handle
    status_code = self._do_handle(args, io)
  File "/usr/lib/python2.7/site-packages/clikit/api/command/command.py", line 171, in _do_handle
    return getattr(handler, handler_method)(args, io, self)
  File "/usr/lib/python2.7/site-packages/cleo/commands/command.py", line 92, in wrap_handle
    return self.handle()
  File "/usr/lib/python2.7/site-packages/poetry/console/commands/lock.py", line 28, in handle
    return installer.run()
  File "/usr/lib/python2.7/site-packages/poetry/installation/installer.py", line 74, in run
    self._do_install(local_repo)
  File "/usr/lib/python2.7/site-packages/poetry/installation/installer.py", line 161, in _do_install
    ops = solver.solve(use_latest=self._whitelist)
  File "/usr/lib/python2.7/site-packages/poetry/puzzle/solver.py", line 36, in solve
    packages, depths = self._solve(use_latest=use_latest)
  File "/usr/lib/python2.7/site-packages/poetry/puzzle/solver.py", line 190, in _solve
    raise SolverProblemError(e)

Running poetry export ...
Warning: The lock file is not up to date with the latest changes in pyproject.toml. You may be getting outdated dependencies. Run update to update them.
argparse 1.4.0 Python command-line parsing library
babel 2.3.4 Internationalization utilities
`-- pytz >=0a
bleach 3.1.5 An easy safelist-based HTML-sanitizing tool.
|-- packaging *
|   |-- pyparsing >=2.0.2 
|   `-- six * 
|-- six >=1.9.0
`-- webencodings *
boto 2.49.0 Amazon Web Services Library
celery 3.1.25 Distributed Task Queue
|-- billiard >=3.3.0.23,<3.4
|-- kombu >=3.0.37,<3.1
|   |-- amqp >=1.4.9,<2.0 
|   `-- anyjson >=0.3.3 
`-- pytz >0.0-dev
cffi 1.12.3 Foreign Function Interface for Python calling C code.
`-- pycparser *
ckan 2.8.6 CKAN Software
ckanext-archiver 2.0.0 Archives resources in CKAN (CKAN Extension)
|-- ckanext-report *
|-- progressbar *
|-- requests >=1.1.0
|   |-- certifi >=2017.4.17 
|   |-- chardet >=3.0.2,<3.1.0 
|   |-- idna >=2.5,<2.8 
|   `-- urllib3 >=1.21.1,<1.25 
`-- sqlalchemy >=0.6.6
ckanext-datagovcatalog 0.0.1 Catalog customizations
ckanext-datagovtheme 0.1 Datagov Theme
ckanext-datajson 0.1 CKAN extension to generate /data.json
ckanext-envvars 0.0.1 CKAN configuration settings available from env vars
ckanext-geodatagov 0.1
ckanext-googleanalyticsbasic 0.1 Basic extension to add google analytics tracking code in page header
ckanext-harvest 1.3.2 Harvesting interface plugin for CKAN
ckanext-qa 2.0 Quality Assurance plugin for CKAN
|-- ckanext-archiver >=2.0
|   |-- ckanext-report * 
|   |-- progressbar * 
|   |-- requests >=1.1.0 
|   |   |-- certifi >=2017.4.17 
|   |   |-- chardet >=3.0.2,<3.1.0 
|   |   |-- idna >=2.5,<2.8 
|   |   `-- urllib3 >=1.21.1,<1.25 
|   `-- sqlalchemy >=0.6.6 
|-- ckanext-report *
|-- messytables >=0.8
|   |-- chardet >=2.3.0 
|   |-- html5lib * 
|   |   |-- six >=1.9 
|   |   `-- webencodings * 
|   |-- json-table-schema >=0.2,<=0.2.1 
|   |-- lxml >=3.2 
|   |-- python-dateutil >=1.5.0 
|   |-- python-magic >=0.4.12 
|   |-- requests * 
|   |   |-- certifi >=2017.4.17 
|   |   |-- chardet >=3.0.2,<3.1.0 (circular dependency aborted here)
|   |   |-- idna >=2.5,<2.8 
|   |   `-- urllib3 >=1.21.1,<1.25 
|   `-- xlrd >=0.8.0 
|-- progressbar *
|-- python-magic >=0.4
|-- requests *
|   |-- certifi >=2017.4.17 
|   |-- chardet >=3.0.2,<3.1.0 
|   |-- idna >=2.5,<2.8 
|   `-- urllib3 >=1.21.1,<1.25 
|-- six >=1.9
|-- sqlalchemy >=0.6.6
`-- xlrd >=0.8.0
ckanext-report 0.1 Framework for defining reports in CKAN
ckanext-spatial 0.2 Geo-related plugins for CKAN
ckantoolkit 0.0.3 UNKNOWN
click 6.7 A simple wrapper around optparse for powerful command line utilities.
cryptography 2.7 cryptography is a package which provides cryptographic recipes and primitives to Python developers.
|-- asn1crypto >=0.21.0
|-- cffi >=1.8,<1.11.3 || >1.11.3
|   `-- pycparser * 
|-- enum34 *
|-- ipaddress *
`-- six >=1.4.1
fanstatic 0.12 Flexible static resources for web applications.
|-- paste *
`-- webob *
flask 0.12.4 A microframework based on Werkzeug, Jinja2 and good intentions
|-- click >=2.0
|-- itsdangerous >=0.21
|-- jinja2 >=2.4
|   `-- markupsafe >=0.23 
`-- werkzeug >=0.7
flask-babel 0.11.2 Adds i18n/l10n support to Flask applications
|-- babel >=2.3
|   `-- pytz >=0a 
|-- flask *
|   |-- click >=2.0 
|   |-- itsdangerous >=0.21 
|   |-- jinja2 >=2.4 
|   |   `-- markupsafe >=0.23 
|   `-- werkzeug >=0.7 
`-- jinja2 >=2.5
    `-- markupsafe >=0.23 
genshi 0.7.3 A toolkit for generation of output for the web
geoalchemy 0.7.2 Using SQLAlchemy with Spatial Databases
`-- sqlalchemy >=0.6.1
geoalchemy2 0.5.0 Using SQLAlchemy with Spatial Databases
`-- sqlalchemy >=0.8
gevent 1.2.2 Coroutine-based network library
`-- greenlet >=0.4.10
google-compute-engine 2.8.13 Google Compute Engine
|-- boto *
|-- distro *
`-- setuptools *
greenlet 0.4.12 Lightweight in-process concurrent programming
gunicorn 19.10.0 WSGI HTTP Server for UNIX
jinja2 2.10.3 A very fast and expressive template engine.
`-- markupsafe >=0.23
jsonschema 2.4.0 An implementation of JSON Schema validation for Python
kombu 3.0.37 Messaging library for Python
|-- amqp >=1.4.9,<2.0
`-- anyjson >=0.3.3
lepl 5.1.3 A Parser Library for Python 2.6+/3+: Recursive Descent; Full Backtracking
lxml 4.6.1 Powerful and Pythonic XML processing library combining libxml2/libxslt with the ElementTree API.
markdown 3.1.1 Python implementation of Markdown.
`-- setuptools >=36
messytables 0.15.2 Parse messy tabular data in various formats
|-- chardet >=2.3.0
|-- html5lib *
|   |-- six >=1.9 
|   `-- webencodings * 
|-- json-table-schema >=0.2,<=0.2.1
|-- lxml >=3.2
|-- python-dateutil >=1.5.0
|-- python-magic >=0.4.12
|-- requests *
|   |-- certifi >=2017.4.17 
|   |-- chardet >=3.0.2,<3.1.0 
|   |-- idna >=2.5,<2.8 
|   `-- urllib3 >=1.21.1,<1.25 
`-- xlrd >=0.8.0
newrelic 5.20.1.150 New Relic Python Agent
ofs 0.4.2 OFS - provides plugin-orientated low-level blobstore.
`-- argparse *
owslib 0.8.6 OGC Web Service utility library
|-- python-dateutil >=1.5
`-- pytz *
pairtree 0.7.1-T Pairtree FS implementation.
passlib 1.6.5 comprehensive password hashing framework supporting over 30 schemes
paste 1.7.5.1 Tools for using a Web Server Gateway Interface stack
pastescript 2.0.2 A pluggable command-line frontend, including commands to setup package file layouts
|-- paste >=1.3
|-- pastedeploy *
|   `-- setuptools * 
`-- six *
pika 1.1.0 Pika Python AMQP Client Library
ply 3.4 Python Lex & Yacc
polib 1.0.7 A library to manipulate gettext files (po and mo files).
progressbar 2.3 Text progress bar library for Python.
psycopg2 2.7.3.2 psycopg2 - Python-PostgreSQL Database Adapter
pylons 0.9.7 Pylons Web Framework
|-- beaker >=1.2.2
|   `-- funcsigs * 
|-- decorator >=2.3.2
|-- formencode >=1.2.1
|   `-- six * 
|-- mako >=0.2.4
|   `-- markupsafe >=0.9.2 
|-- nose >=0.10.4
|-- paste >=1.7.2
|-- pastedeploy >=1.3.3
|   `-- setuptools * 
|-- pastescript >=1.7.3
|   |-- paste >=1.3 
|   |-- pastedeploy * 
|   |   `-- setuptools * 
|   `-- six * 
|-- routes >=1.10.3
|   `-- repoze.lru >=0.3 
|-- simplejson >=2.0.8
|-- tempita >=0.2
|-- weberror >=0.10.1
|   |-- paste >=1.7.1 
|   |-- pygments * 
|   |-- tempita * 
|   `-- webob * 
|-- webhelpers >=0.6.4
|   `-- markupsafe >=0.9.2 
|-- webob >=0.9.6.1
`-- webtest >=1.1
    `-- webob * 
pyopenssl 18.0.0 Python wrapper module around the OpenSSL library
|-- cryptography >=2.2.1
|   |-- asn1crypto >=0.21.0 
|   |-- cffi >=1.8,<1.11.3 || >1.11.3 
|   |   `-- pycparser * 
|   |-- enum34 * 
|   |-- ipaddress * 
|   `-- six >=1.4.1 
`-- six >=1.5.2
pyparsing 2.4.7 Python parsing module
pysolr 3.6.0 Lightweight python wrapper for Apache Solr.
`-- requests >=2.9.1
    |-- certifi >=2017.4.17 
    |-- chardet >=3.0.2,<3.1.0 
    |-- idna >=2.5,<2.8 
    `-- urllib3 >=1.21.1,<1.25 
python-dateutil 1.5 Extensions to the standard python 2.3+ datetime module
python-magic 0.4.15 File type identification using libmagic
pytz 2016.7 World timezone definitions, modern and historical
pyutilib.component.core 4.6.4 The PyUtilib Component Architecture.
`-- six *
pyyaml 5.3.1 YAML parser and emitter for Python
redis 2.10.6 Python client for Redis key-value store
repoze.who 2.3 repoze.who is an identification and authentication framework for WSGI.
|-- setuptools *
|-- webob *
`-- zope.interface *
    `-- setuptools * 
repoze.who-friendlyform 1.0.8 Collection of repoze.who friendly form plugins
|-- repoze.who >=1.0
|   |-- setuptools * 
|   |-- webob * 
|   `-- zope.interface * 
|       `-- setuptools * (circular dependency aborted here)
|-- webob >=0.9.7
`-- zope.interface *
    `-- setuptools * 
requests 2.20.1 Python HTTP for Humans.
|-- certifi >=2017.4.17
|-- chardet >=3.0.2,<3.1.0
|-- idna >=2.5,<2.8
`-- urllib3 >=1.21.1,<1.25
rfc3987 1.3.8 Parsing and validation of URIs (RFC 3986) and IRIs (RFC 3987)
routes 1.13 Routing Recognition and Generation Tools
`-- repoze.lru >=0.3
rq 0.6.0 RQ is a simple, lightweight, library for creating background jobs, and processing them.
|-- click >=3.0
`-- redis >=2.7.0
shapely 1.7.1 Geometric objects, predicates, and operations
simplejson 3.10.0 Simple, fast, extensible JSON encoder/decoder for Python
sqlalchemy 1.1.11 Database Abstraction Library
sqlalchemy-migrate 0.10.0 Database schema migration for SQLAlchemy
|-- decorator *
|-- pbr >=1.3,<2.0
|-- six >=1.7.0
|-- sqlalchemy >=0.7.8,<0.9.5 || >0.9.5
|-- sqlparse *
`-- tempita >=0.4
sqlparse 0.2.2 Non-validating SQL parser
tzlocal 1.3 tzinfo object for the local timezone
`-- pytz *
unicodecsv 0.14.1 Python2's stdlib csv module is nice, but it doesn't support unicode. This module is a drop-in replacement which *does*.
urllib3 1.24.3 HTTP library with thread-safe connection pooling, file post, and more.
vdm 0.14 A versioned domain model framework.
webhelpers 1.3 Web Helpers
`-- markupsafe >=0.9.2
webob 1.0.8 WSGI request and response object
webtest 1.4.3 Helper to test WSGI applications
`-- webob *
werkzeug 0.15.6 The comprehensive WSGI web application library.
xlrd 1.2.0 Library for developers to extract data from Microsoft Excel (tm) spreadsheet files
zope.interface 4.3.2 Interfaces for Python
`-- setuptools *
cp requirements/requirements.txt ckan/requirements.txt
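
The solver failure in the log above reduces to one incompatibility: ckan-catalog pins urllib3 1.25.9, while every requests 2.20.x release requires urllib3 (>=1.21.1,<1.25). The sketch below reproduces that check with plain tuple comparison. It is only an illustration; the helper names are made up here, and it does not implement full PEP 440 version semantics.

```python
def parse_version(text):
    """Turn a dotted version string such as '1.25.9' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def in_range(version, lower_inclusive, upper_exclusive):
    """Check lower_inclusive <= version < upper_exclusive on parsed tuples."""
    return (parse_version(lower_inclusive)
            <= parse_version(version)
            < parse_version(upper_exclusive))


# Constraint taken from the log: requests 2.20.x depends on urllib3 (>=1.21.1,<1.25).
pinned_ok = in_range("1.25.9", "1.21.1", "1.25")  # version pinned by ckan-catalog
locked_ok = in_range("1.24.3", "1.21.1", "1.25")  # version listed in the export tree
print(pinned_ok, locked_ok)
```

The pinned version falls outside the range, while the 1.24.3 shown in the export tree falls inside it. That is consistent with the warning that the lock file is out of date.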

Template error at /dataset

When I try to open the /dataset page, I get this error:

IOError: [Errno 2] No such file or directory: '/var/tmp/ckan/logos/bureau.csv'

 Module /srv/app/src/ckanext-datagovtheme/ckanext/datagovtheme/templates/snippets/facet_list.html:70 in top-level template code
   {% set label = h.get_bureau_info(label)['title'] if h.get_bureau_info(label) else 'Code ' + label %}
Module ckanext.datagovtheme.helpers:493 in get_bureau_info
   file_obj = open(filename)
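
The template line in the traceback already falls back to 'Code ' + label when get_bureau_info returns a falsy value, so one possible mitigation is for the file read to return None when the CSV is missing instead of raising. A minimal sketch, assuming Python 3 and a hypothetical helper name (read_csv_rows_or_none is not part of ckanext-datagovtheme):

```python
import csv
import errno


def read_csv_rows_or_none(filename):
    """Return all rows of a CSV file, or None when the file does not exist."""
    try:
        with open(filename, newline="") as file_obj:
            return list(csv.reader(file_obj))
    except IOError as error:  # on Python 3, IOError is an alias of OSError
        if error.errno == errno.ENOENT:
            return None
        raise  # any other I/O problem is still reported
```

This only hides the symptom. The logos directory would still need to be populated for bureau titles to show up.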


Acceptance criteria

  • After harvesting one source, I can see datasets on the /dataset page

Bug: Spatial field validation fails while we harvest datajson

DCAT-US harvest sources fail with the following error while they are being harvested (report here).

ckan.logic.ValidationError: 
{
  'spatial': [
     u'Error decoding JSON object: Extra data: line 1 column 7 - line 1 column 23 (char 6 - 22)'
  ]
}
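
An "Extra data" message from the JSON decoder means it parsed one complete value and then found more text after it. One plausible cause, assuming the spatial value arrives as a comma-separated bounding box rather than as GeoJSON, is sketched below. The coordinate values are invented for illustration.

```python
import json

# Hypothetical spatial value: a comma-separated bounding box, not a JSON document.
spatial_value = "-81.23, 25.10, -80.05, 26.40"

try:
    json.loads(spatial_value)
    error_message = None
except ValueError as error:  # json.JSONDecodeError subclasses ValueError on Python 3
    error_message = str(error)

print(error_message)
```

The decoder accepts the first number as a complete JSON value and rejects everything from the first comma onward.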

This issue stems from #312

How to reproduce it

This source is harvested in production:

image

Task

  • Determine the source of the problem
  • Send a PR to fix it
  • Create an updated report on the state of the harvesting processes after solving this problem
  • Add a test case for a DCAT-US source with spatial field

Code

This error happens in the spatial extension, here.
The GSA fork includes the same code and should also fail.
The GeoDataGov extension avoids the error by rolling up the extras before package_create (here), so the spatial extension never finds the broken spatial extra.

if action_name in self.ROLLUP_SAVE_ACTIONS:
    extras_rollup = {}
    new_extras = []
    for extra in data_dict.get('extras', []):
        if extra['key'] in self.EXTRAS_ROLLUP_KEY_IGNORE:
            new_extras.append(extra)
        else:
            extras_rollup[extra['key']] = extra['value']
    if extras_rollup:
        new_extras.append({'key': 'extras_rollup',
                           'value': json.dumps(extras_rollup)})
    data_dict['extras'] = new_extras

This feature avoids the spatial validation for the spatial extra.
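
The excerpt above depends on self and action_name from its surrounding class. The same roll-up logic can be sketched as a standalone function to show its effect on a package dict. The ignore list and the sample package below are hypothetical; the real constants live in the GSA fork and may differ.

```python
import json

# Hypothetical values: the real EXTRAS_ROLLUP_KEY_IGNORE may contain other keys.
EXTRAS_ROLLUP_KEY_IGNORE = ["harvest_object_id", "harvest_source_id"]


def rollup_extras(data_dict, ignore_keys=EXTRAS_ROLLUP_KEY_IGNORE):
    """Return a copy of data_dict whose non-ignored extras are packed into
    one JSON-encoded 'extras_rollup' extra."""
    extras_rollup = {}
    new_extras = []
    for extra in data_dict.get("extras", []):
        if extra["key"] in ignore_keys:
            new_extras.append(extra)
        else:
            extras_rollup[extra["key"]] = extra["value"]
    if extras_rollup:
        new_extras.append({"key": "extras_rollup",
                           "value": json.dumps(extras_rollup)})
    result = dict(data_dict)
    result["extras"] = new_extras
    return result


package = {
    "name": "example-dataset",
    "extras": [
        {"key": "harvest_object_id", "value": "abc"},
        {"key": "spatial", "value": "-81.23, 25.10, -80.05, 26.40"},
    ],
}
rolled = rollup_extras(package)
keys = [extra["key"] for extra in rolled["extras"]]
print(keys)
```

After the roll-up there is no top-level spatial extra left, which would explain why the spatial extension has nothing to validate.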

In the GSA fork of CKAN 2.3, every extension can override the before_action and after_action functions. This is not available in CKAN 2.8.

The GeoDataGov extension uses this function to roll up extras.

Notes

  • The ideal fix would be a hook that runs before package_update and package_create, but the IPackageController class does not seem to include before_create or before_update hooks.
  • We should open an issue to allow spatial as a string, or handle this in a different way. Skipping validation is not a real fix here.
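
One way to handle a string-valued spatial field without skipping validation is to convert it to GeoJSON before it reaches the spatial extension. The sketch below assumes the string is a bounding box in the order minimum longitude, minimum latitude, maximum longitude, maximum latitude. The function name is hypothetical and this is not how any of the extensions currently do it.

```python
import json


def bbox_string_to_geojson(spatial):
    """Convert 'min_lon, min_lat, max_lon, max_lat' into a GeoJSON Polygon string.

    Returns None when the input is not four comma-separated numbers.
    """
    parts = [part.strip() for part in spatial.split(",")]
    if len(parts) != 4:
        return None
    try:
        min_lon, min_lat, max_lon, max_lat = [float(part) for part in parts]
    except ValueError:
        return None
    # Exterior ring in counterclockwise order, closed by repeating the first point.
    ring = [
        [min_lon, min_lat],
        [max_lon, min_lat],
        [max_lon, max_lat],
        [min_lon, max_lat],
        [min_lon, min_lat],
    ]
    return json.dumps({"type": "Polygon", "coordinates": [ring]})


polygon = bbox_string_to_geojson("-81.23, 25.10, -80.05, 26.40")
not_a_bbox = bbox_string_to_geojson("United States")
```

Place names and other free-text values return None here, so the caller still has to decide what to do with them.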

Error importing WAF harvest sources

A source with config '{"validator_profiles": ["fgdc_minimal"]}' fails with: Unknown validation profile(s): fgdc_minimal

Harvest source failing: fema-r10

How to reproduce

Run the importer for this source:

python import_harvest_sources.py --names=fema-r10 --destination_api_key=b396d3d9-1f24-4b72-92c1-78f981009a92

Full importer error

****** creating fema-r10: 7 of 289 sources
remote_ckan.lib - Get organization data https://catalog.data.gov/api/3/action/organization_show {'id': 'fema-gov'}
remote_ckan.lib - organization fema-gov saved at /checks
remote_ckan.lib - New extra found at org [email protected]
[email protected]
[email protected]

remote_ckan.lib - New extra found at org organization_type=Federal Government
remote_ckan.lib - Creating organization fema-gov
remote_ckan.lib - Get organization data https://catalog-next.sandbox.datagov.us/api/3/action/organization_show {'id': 'fema-gov'}
remote_ckan.lib - organization fema-gov saved at /checks
remote_ckan.lib - Request OK https://catalog-next.sandbox.datagov.us/api/3/action/organization_update
remote_ckan.lib - Config found in extras: {'value': '{"validator_profiles": ["fgdc_minimal"], "private_datasets": false}', 'key': 'config'}
remote_ckan.lib - Creating harvest source FEMA-R10
remote_ckan.lib - ERROR status: 409
	 content:b'{"help": "https://catalog-next.sandbox.datagov.us/api/3/action/help_show?name=harvest_source_create", "success": false, "error": {"__type": "Validation Error", "config": ["Error parsing the configuration options: Unknown validation profile(s): fgdc_minimal"]}}'
	 sent: {'name': 'fema-r10', 'owner_org': 'fema-gov', 'title': 'FEMA-R10', 'url': 'https://hazards.fema.gov/filedownload/metadata/R10/', 'notes': '', 'source_type': 'waf', 'frequency': 'WEEKLY', 'config': '{"validator_profiles": ["fgdc_minimal"], "private_datasets": false}', 'extras': [], 'id': 'fd47c02e-3ef2-42fb-b4e5-40560b136a08'}

Test DCAT-US harvest from sources using collections

One difference between the upstream harvest extension and the GSA fork is that the fork changes the order in which datasets are harvested.

The DCAT-US harvester adds datajson_collection to the source config. The harvest extension uses it to split the harvest process into two passes: parent datasets are harvested first, then their children.

We already have a test case for DCAT-US with collections but this test does not cover this feature.
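
The parents-first ordering can be sketched as follows. This is not the fork's code: it assumes that a child dataset points to its parent through the DCAT-US isPartOf field, and the function names are hypothetical.

```python
def split_parents_and_children(datasets):
    """Return (parents, others, children) from a list of DCAT-US dataset dicts."""
    # Identifiers that at least one other dataset references through isPartOf.
    parent_ids = {d['isPartOf'] for d in datasets if d.get('isPartOf')}
    parents = [d for d in datasets if d['identifier'] in parent_ids]
    children = [d for d in datasets
                if d.get('isPartOf') and d['identifier'] not in parent_ids]
    others = [d for d in datasets
              if d['identifier'] not in parent_ids and not d.get('isPartOf')]
    return parents, others, children


def harvest_order(datasets):
    """Order datasets so that every parent comes before its children."""
    parents, others, children = split_parents_and_children(datasets)
    return parents + others + children


# Unordered input: a child appears before its parent.
unordered = [
    {'identifier': 'c1', 'isPartOf': 'p'},
    {'identifier': 'p'},
    {'identifier': 's'},
    {'identifier': 'c2', 'isPartOf': 'p'},
]
ordered_ids = [d['identifier'] for d in harvest_order(unordered)]
```

An unordered source like the one above is the case the new test needs to cover: the child c1 is listed before its parent p.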

Tasks

  • Add test case for DCAT-US collection with multiple unordered datasets.
  • Fix it if required

Related to #40 and Multi#301

[BUG] Unable to create waf-collection harvest source

Notice

This issue is blocking Sandblox-PR#1792.

Error

We can't create a waf-collection harvest source in the new harvest source form.

This is because we are not using the custom form to create this type of source.
The classic catalog uses datagov_harvest instead of harvest in the plugin list. This plugin is defined in the geodatagov extension, and its class overrides the form used to create harvest sources.

If we switch the plugin from harvest to datagov_harvest, we still get an error.

How to reproduce

  1. Replace the harvest extension with datagov_harvest in the plugins configuration (production.ini)
  2. Open the new harvest form https://catalog-next.sandbox.datagov.us/harvest/new
  3. Fill in the details for a waf-collection harvester
  4. Click Save
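
Step 1 amounts to swapping one name in the whitespace-separated plugin list. A minimal sketch, assuming a hypothetical helper swap_plugin and an illustrative plugin list (the real list in production.ini is not shown in this issue):

```python
def swap_plugin(plugins_value, old, new):
    """Replace one plugin name in a whitespace-separated ckan.plugins value."""
    # Split on whitespace so that only the exact name is replaced; names that
    # merely contain the old name (e.g. ckan_harvester) are left alone.
    return ' '.join(new if name == old else name
                    for name in plugins_value.split())


# Hypothetical plugin list, for illustration only.
before = 'harvest ckan_harvester datajson_harvest'
after = swap_plugin(before, 'harvest', 'datagov_harvest')
```

Matching whole names matters here because a plain text replace of "harvest" would also rewrite other plugin names that contain it.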

Tasks

  • Decide whether we want to use datagov_harvest and fix the error, or move this new form (and the other features) to a different extension while keeping the harvest plugin in use. (discussion here)
  • Write issues regarding this problem

Notes

This new plugin was added for the Z3950 harvester 7 years ago.

The order of plugins is different:
Screen Shot 2020-06-24 at 10 56 51 AM

Logs

Using datagov_harvest we see this error log:

Module ckan.logic.action.create:170 in package_create         

 
check_data_dict	None
context	{'__auth_audit': [], '__auth_user_obj_checked': True, 'auth_user_obj': <User id=6fb43a58-a883-47cb-8e19-8503d7f32064 name=admin password=$pbkdf2-sha512$25000$WgvhHOOc897bO6dUau09xw$waMJ6mmbgBynQpo8aep6ijGRm8EqijCjlQ5aeMPqA5XfjxDKE.QSXO8czsiBHUnEisH8oxC8QutjPWgItmFsig fullname=None [email protected] apikey=f1bb38d5-8841-48cb-a414-cda128f811ab created=2020-06-23 15:39:44.994457 reset_key=None about=None activity_streams_email_notifications=False sysadmin=True state=active>, 'message': '', 'model': <module 'ckan.model' from '/srv/app/src/ckan/ckan/model/__init__.pyc'>, 'save': True, 'session': <sqlalchemy.orm.scoping.scoped_session object at 0x7f292908ef50>, 'user': u'admin'}

data_dict	{'collection_metadata_url': u'https://meta.geo.census.gov/data/existing/decennial/GEO/GPMB/TIGERline/TIGER2018/SeriesInfo/SeriesCollection_tl_2018_concity.shp.iso.xml', 'database': u'', 'extra_search_criteria': u'', 'extras': [], 'frequency': u'MANUAL', 'name': u'gpb-5', 'notes': u'', 'owner_org': u'1c4b46eb-afe8-4d4c-a974-822f9d7df76a', 'port': u'', 'private_datasets': u'False', 'save': u'Save', 'source_type': u'waf-collection', 'title': u'GPB 5b', 'type': 'harvest', 'url': u'https://meta.geo.census.gov/data/existing/decennial/GEO/GPMB/TIGERline/TIGER2018/concity/', 'validator_profiles': u'', 'validator_schema': u''}

model	<module 'ckan.model' from '/srv/app/src/ckan/ckan/model/__init__.pyc'>
package_plugin	<Plugin DataGovHarvest 'datagov_harvest'>

schema	{'__extras': [<function harvest_source_extra_validator at 0x7f29261ceed8>], 'config': [<function ignore_missing at 0x7f29263740c8>, <function harvest_source_config_validator at 0x7f29261cede8>, <function convert_to_extras at 0x7f29262bcde8>], 'extras': {'__extras': [<function ignore at 0x7f292636cf50>], 'deleted': [<function ignore_missing at 0x7f29263740c8>], 'id': [<function ignore at 0x7f292636cf50>], 'key': [<function not_empty at 0x7f292636cd70>, <function extra_key_not_in_root_schema at 0x7f29274fe578>, <function unicode_safe at 0x7f29263742a8>], 'revision_timestamp': [<function ignore at 0x7f292636cf50>], 'state': [<function ignore at 0x7f292636cf50>], 'value': [<function not_missing at 0x7f292636ccf8>]}, 'frequency': [<function ignore_missing at 0x7f29263740c8>, <type 'unicode'>, <function harvest_source_frequency_exists at 0x7f29261d60c8>, <function convert_to_extras at 0x7f29262bcde8>], 'name': [<function not_empty at 0x7f292636cd70>, <type 'unicode'>, <function name_validator at 0x7f29274fd500>, <function package_name_validator at 0x7f29274fd578>], 'notes': [<function ignore_missing at 0x7f29263740c8>, <type 'unicode'>], 'organization': [<function ignore_missing at 0x7f29263740c8>], 'owner_org': [<function owner_org_validator at 0x7f29274eeb90>, <type 'unicode'>], 'private': [<function ignore_missing at 0x7f29263740c8>, <function boolean_validator at 0x7f29274eede8>, <function datasets_with_no_organization_cannot_be_private at 0x7f29274fe230>], 'save': [<function ignore at 0x7f292636cf50>], 'source_type': [<function not_empty at 0x7f292636cd70>, <type 'unicode'>, <function harvest_source_type_exists at 0x7f29261ced70>, <function convert_to_extras at 0x7f29262bcde8>], 'state': [<function ignore_missing at 0x7f29263740c8>], 'title': [<function callable at 0x7f292127e7d0>, <type 'unicode'>], 'type': [<function dataset_type_exists at 0x7f29261d6140>, <type 'unicode'>], 'url': [<function not_empty at 0x7f292636cd70>, <type 'unicode'>, <function harvest_source_url_validator at 0x7f29261cecf8>]}

user	u'admin'



Module ckan.lib.navl.dictization_functions:452 in unflatten         

 
convert_to_list	[]
current_pos	[{'key': 'frequency', 'value': u'MANUAL'}]
data	{
    ('__extras',): {
        'collection_metadata_url': u'https://meta.geo.census.gov/data/existing/decennial/GEO/GPMB/TIGERline/TIGER2018/SeriesInfo/SeriesCollection_tl_2018_concity.shp.iso.xml', 
        'database': u'', 
        'extra_search_criteria': u'', 
        'extras': [], 
        'port': u'', 
        'private_datasets': u'False', 
        'validator_profiles': u'', 
        'validator_schema': u''
}, 
    ('collection_metadata_url',): u'https://meta.geo.census.gov/data/existing/decennial/GEO/GPMB/TIGERline/TIGER2018/SeriesInfo/SeriesCollection_tl_2018_concity.shp.iso.xml', 
    ('config',): '{
        "collection_metadata_url": "https://meta.geo.census.gov/data/existing/decennial/GEO/GPMB/TIGERline/TIGER2018/SeriesInfo/SeriesCollection_tl_2018_concity.shp.iso.xml", 
        "private_datasets": false
    }', 
    ('extras',): [
        {'key': 'frequency', 'value': u'MANUAL'}
        ], 
    ('extras', 0, 'key'): 'frequency', ('extras', 0, 'value'): u'MANUAL', 
    ('extras', 1, 'key'): 'source_type', ('extras', 1, 'value'): u'waf-collection', 
    ('frequency',): u'MANUAL', 
    ('name',): u'gpb-5', 
    ('notes',): u'', 
    ('owner_org',): u'1c4b46eb-afe8-4d4c-a974-822f9d7df76a', 
    ('private_datasets',): False, 
    ('source_type',): u'waf-collection', 
    ('title',): u'GPB 5b', 
    ('type',): u'harvest', 
    ('url',): u'https://meta.geo.census.gov/data/existing/decennial/GEO/GPMB/TIGERline/TIGER2018/concity/'
}

flattend_key	('extras', 1, 'key')
key	1

unflattened	{'__extras': {'collection_metadata_url': u'https://meta.geo.census.gov/data/existing/decennial/GEO/GPMB/TIGERline/TIGER2018/SeriesInfo/SeriesCollection_tl_2018_concity.shp.iso.xml', 'database': u'', 'extra_search_criteria': u'', 'extras': [], 'port': u'', 'private_datasets': u'False', 'validator_profiles': u'', 'validator_schema': u''}, 'collection_metadata_url': u'https://meta.geo.census.gov/data/existing/decennial/GEO/GPMB/TIGERline/TIGER2018/SeriesInfo/SeriesCollection_tl_2018_concity.shp.iso.xml', 'config': '{"collection_metadata_url": "https://meta.geo.census.gov/data/existing/decennial/GEO/GPMB/TIGERline/TIGER2018/SeriesInfo/SeriesCollection_tl_2018_concity.shp.iso.xml", "private_datasets": false}', 'extras': [{'key': 'frequency', 'value': u'MANUAL'}], 'frequency': u'MANUAL', 'name': u'gpb-5', 'notes': u'', 'owner_org': u'1c4b46eb-afe8-4d4c-a974-822f9d7df76a', 'private_datasets': False, 'source_type': u'waf-collection', 'title': u'GPB 5b', 'type': u'harvest', 'url': u'https://meta.geo.census.gov/data/existing/decennial/GEO/GPMB/TIGERline/TIGER2018/concity/'}
>>  current_pos = current_pos[key]
IndexError: list index out of range
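
One reading of the trace above: the data contains both a ready-made ('extras',) list with a single item and flattened keys for a second item, ('extras', 1, ...), so the lookup of index 1 fails. The sketch below reproduces that situation with a deliberately simplified unflatten. It is not CKAN's implementation, and the function name is hypothetical.

```python
def unflatten_sketch(flat):
    """Rebuild a nested structure from tuple keys (simplified illustration)."""
    result = {}
    # Process shorter keys first so containers exist before their members.
    for key in sorted(flat, key=len):
        position = result
        for part in key[:-1]:
            # With a one-item list, part == 1 raises IndexError here.
            position = position[part]
        position[key[-1]] = flat[key]
    return result


# Same shape as the 'data' value in the log: one item in the list,
# but flattened keys for two items.
flat = {
    ('extras',): [{'key': 'frequency', 'value': 'MANUAL'}],
    ('extras', 0, 'key'): 'frequency',
    ('extras', 0, 'value'): 'MANUAL',
    ('extras', 1, 'key'): 'source_type',
    ('extras', 1, 'value'): 'waf-collection',
}

try:
    unflatten_sketch(flat)
    error = None
except IndexError as exc:
    error = str(exc)
```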

Bug: Harvest source importer misses some metadata

Bug report

Expected Behavior

Actual Behavior

Details/Tasks

Acceptance Criteria

After we import harvest sources, we need to validate that we have all the information we expect.

  • Check that we have the same config as the original source (sometimes the config comes as extras).
  • Check that we have the same extras as the original source (if we detect config inside extras, we move it to config and remove it from extras).
  • Check that the organization is the same and has the same image included.
  • In the old/classic catalog app (with CKAN 2.3), when we create a harvest source we need to pick a federal or non-federal source. This info does not seem to be captured during import. Is this private data? No, this info is saved as validator_schema in extra['config']. If validator_schema is empty, the source is federal.
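
The config and extras checks above could be sketched like this. The helper names (normalize_source, is_federal, compare_sources) are hypothetical and the source dicts are reduced to the two fields being compared. The importer's real data model may differ.

```python
import json


def normalize_source(source):
    """Return (config, extras) with any 'config' extra merged into config."""
    raw = source.get('config') or '{}'
    config = json.loads(raw) if isinstance(raw, str) else dict(raw)
    extras = {}
    for extra in source.get('extras', []):
        if extra['key'] == 'config':
            # Config sometimes arrives as an extra: move it into config.
            config.update(json.loads(extra['value']))
        else:
            extras[extra['key']] = extra['value']
    return config, extras


def is_federal(config):
    # Per the checklist above: an empty validator_schema means federal.
    return not config.get('validator_schema')


def compare_sources(original, imported):
    """Return the names of the fields that differ after normalization."""
    orig_config, orig_extras = normalize_source(original)
    new_config, new_extras = normalize_source(imported)
    problems = []
    if orig_config != new_config:
        problems.append('config')
    if orig_extras != new_extras:
        problems.append('extras')
    return problems


original = {'config': '', 'extras': [
    {'key': 'config',
     'value': '{"validator_profiles": ["fgdc_minimal"], "private_datasets": false}'}]}
imported = {'config': '{"validator_profiles": ["fgdc_minimal"], "private_datasets": false}',
            'extras': []}
differences = compare_sources(original, imported)
```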

In the final report about harvest sources in the new catalog, we found some errors that make us think the import of these harvest sources failed.

Some sources failed with validation errors:

| source | url | config | added | updated | nulls | errors |
|---|---|---|---|---|---|---|
| epa-sciencehub | https://pasteur.epa.gov/metadata.json | {"private_datasets": "False"} | 1665 | 0 | -1 | Title: EPA True NO2 ground site measurements. 1 Error(s) Found. ### ERROR 1: 'distribution':[{'accessURL': 'http://www-air.larc.nasa.gov/cgi-bin/ArcView/disco ... |
| mcc-data-json | https://data.mcc.gov/data.json | {"private_datasets": "False"} | 199 | 0 | -1 | Title: Ghana - Agriculture - Land Tenure; 1 Error(s) Found. ### ERROR 1: 'distribution':[{'downloadURL': 'http://microdata.worldbank.org/index.php/cata ... |
| healthdata-gov | https://healthdata.gov/data.json | {"private_datasets": "False"} | 410 | 0 | 1302 | Title: Commercial Medical Insurance 1 Error(s) Found. ### ERROR 1: 'references':['https://wwwdev.cdc.gov/visionhealth/vehss/data/claims/marketscan.html; https://marketscan.truvenhealth.com/marketscanportal/'] is not valid under any of the given schemas ... |
| energystar | http://data.energystar.gov/data.json | {"private_datasets": "False"} | 54 | 0 | -1 | Title: ENERGY STAR Certified Ceiling Fans Light Kits; 1 Error(s) Found. ### ERROR 1: 'keyword' is a required proper ... |

BUG, fail to show spatial resource

After the fix for #43, we are not able to open the resources of a spatial dataset.
Related to #307
Duplicated from #1690

How to reproduce
Add a spatial dataset (e.g. use a harvest source as in #43)

Full error log

HelperError: Helper 'archiver_is_resource_broken_line' has not been defined.
Module ckan.controllers.package:1147 in resource_read
>>  return render(template, extra_vars=vars)
Module ckan.lib.base:125 in render
>>  return cached_template(template_name, renderer)
Module pylons.templating:249 in cached_template
>>  return render_func()
Module ckan.lib.base:162 in render_template
>>  return render_jinja2(template_name, globs)
Module ckan.lib.base:94 in render_jinja2
>>  return template.render(**extra_vars)
Module jinja2.environment:989 in render
>>  return self.environment.handle_exception(exc_info, True)
Module jinja2.environment:754 in handle_exception
>>  reraise(exc_type, exc_value, tb)
Module /srv/app/src/ckan/ckanext/datastore/templates/package/resource_read.html:1 in top-level template code
>>  {% ckan_extends %}
Module /usr/lib/python2.7/site-packages/ckanext/datagovtheme/templates/package/resource_read.html:46 in top-level template code
>>  {% set pkg = c.pkg_dict %}
Module /srv/app/src/ckan/ckan/templates/package/resource_read.html:3 in top-level template code
>>  {% set res = c.resource %}
Module /usr/lib/python2.7/site-packages/ckanext/datagovtheme/templates/package/base.html:3 in top-level template code
>>  {% set pkg = c.pkg_dict %}
Module /usr/lib/python2.7/site-packages/ckanext/datagovtheme/templates/page.html:1 in top-level template code
>>  {% ckan_extends %}
Module /srv/app/src/ckan/ckan/templates/page.html:1 in top-level template code
>>  {% extends "base.html" %}
Module /usr/lib/python2.7/site-packages/ckanext/harvest/plugin/../templates/base.html:1 in top-level template code
>>  {% ckan_extends %}
Module /usr/lib/python2.7/site-packages/ckanext/datagovtheme/templates/base.html:1 in top-level template code
>>  {% ckan_extends %}
Module /srv/app/src/ckan/ckan/templates/base.html:101 in top-level template code
>>  {%- block page %}{% endblock -%}
Module /srv/app/src/ckan/ckan/templates/page.html:19 in block "page"
>>  {%- block content %}
Module /srv/app/src/ckan/ckan/templates/page.html:22 in block "content"
>>  {% block main_content %}
Module /srv/app/src/ckan/ckan/templates/page.html:53 in block "main_content"
>>  {% block pre_primary %}
Module /srv/app/src/ckan/ckan/templates/package/resource_read.html:22 in block "pre_primary"
>>  {% block resource %}
Module /srv/app/src/ckan/ckan/templates/package/resource_read.html:24 in block "resource"
>>  {% block resource_inner %}
Module /srv/app/src/ckan/ckan/templates/package/resource_read.html:27 in block "resource_inner"
>>  {% block resource_actions %}
Module /srv/app/src/ckan/ckan/templates/package/resource_read.html:29 in block "resource_actions"
>>  {% block resource_actions_inner %}
Module /srv/app/src/ckan/ckanext/datastore/templates/package/resource_read.html:4 in block "resource_actions_inner"
>>  {{ super() }}
Module /usr/lib/python2.7/site-packages/ckanext/datagovtheme/templates/package/resource_read.html:25 in block "resource_actions_inner"
>>  {{ h.archiver_is_resource_broken_line(c.resource) }}
Module jinja2.environment:412 in getattr
>>  return obj[attribute]
Module ckan.lib.helpers:87 in __getitem__
>>  key=key
HelperError: Helper 'archiver_is_resource_broken_line' has not been defined.
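
The trace shows the theme template calling a helper that no loaded plugin registers. One possible stopgap, sketched here and not a decision recorded in this issue, is to register a stand-in helper under the same name. The stub's return value is an assumption. In a CKAN plugin, the dict below would be returned from the get_helpers() method of a class implementing ITemplateHelpers.

```python
def archiver_is_resource_broken_line(resource):
    """Hypothetical stand-in: report nothing about broken links."""
    # Assumption: an empty string renders as nothing in the template.
    return ''


def get_helpers():
    """Map helper names to callables, as ITemplateHelpers.get_helpers() does."""
    return {
        'archiver_is_resource_broken_line': archiver_is_resource_broken_line,
    }


# The template call h.archiver_is_resource_broken_line(c.resource) would then
# resolve to the stub instead of raising HelperError.
rendered = get_helpers()['archiver_is_resource_broken_line']({'id': 'example'})
```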

Test Collections from WAF-Collection Harvest Source

Acceptance criteria

  • When I run the test suite
  • Tests testing collections with WAF harvester pass

Tasks/Details

  • When I search for a parent dataset (need live example of collection from WAF-collection type harvest source)

  • Then I see it in search results

  • When I search for a child dataset (need live example of collection from WAF-collection type harvest source)

  • Then I do not see the child dataset in the search results, but I do see the parent dataset in the results (need live example of collection from WAF-collection type harvest source)

  • WHEN I click on a collection

    • THEN I can see all datasets that belong to that collection
    • AND I can search within a collection for datasets that belong to that collection
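
The expected search behaviour in these criteria can be expressed as two small filters that a test could assert against. This is a sketch under an assumption: child datasets are taken to carry a collection_package_id field holding the parent's id. The function names and the sample data are hypothetical.

```python
def visible_in_search(datasets):
    """Keep parents and standalone datasets; hide collection children."""
    return [d for d in datasets if not d.get('collection_package_id')]


def datasets_in_collection(datasets, parent_id):
    """Return the children that belong to the given parent dataset."""
    return [d for d in datasets
            if d.get('collection_package_id') == parent_id]


# Hypothetical catalog contents: one parent with two children.
catalog = [
    {'id': 'parent-1', 'title': 'Series'},
    {'id': 'child-1', 'title': 'Part A', 'collection_package_id': 'parent-1'},
    {'id': 'child-2', 'title': 'Part B', 'collection_package_id': 'parent-1'},
]
search_ids = [d['id'] for d in visible_in_search(catalog)]
collection_ids = [d['id'] for d in datasets_in_collection(catalog, 'parent-1')]
```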

Task-list

  • Find a live example of collection from WAF-collection type harvest source
  • Write test for WAF-collection type harvest source
  • Tests are green

BUG: Collections Children Datasets Not Harvesting

How to reproduce

  1. Harvest a collection of datasets from a harvest source such as http://www.opm.gov/data.json. Open a collection like this one and try to view the datasets within it: there are none.

Catalog: https://admin-catalog.data.gov/dataset/annual-performance-reports-aprs

Sandbox: https://catalog-next.sandbox.datagov.us/dataset/annual-performance-reports-aprs

Expected behavior

When you click Search datasets within this collection on a collection page, you should see the child datasets, like this one

Actual behavior

When you click Search datasets within this collection on a collection page, you see a page with 0 datasets, like this one

BUG: Error Importing Harvest Source u-s-forest-service-geospatial-data-discovery

How to reproduce

When using the import script

$ pipenv run python import_harvest_sources.py --source_type=datajson --limit=3 --destination_url=https://catalog-next.sandbox.datagov.us --destination_api_key=$CKAN_API_KEY

https://catalog.data.gov/api/3/action/harvest_source_show?id=11cd07d8-f5a7-4207-9852-6160e09ff240 returns a 500 error. The harvest source name is u-s-forest-service-geospatial-data-discovery.

Expected behavior

Harvest source should be imported

Actual behavior

https://catalog.data.gov/api/3/action/harvest_source_show?id=11cd07d8-f5a7-4207-9852-6160e09ff240 returns a 500 error. The harvest source name is u-s-forest-service-geospatial-data-discovery.

Recover hidden code at Theme

The DataGovTheme depends on ckanext-archiver and ckanext-qa and uses their helpers:

Because the new catalog moved these extensions to their upstream versions, these functions are missing and the theme is not able to show resources inside datasets.

We hid these calls in a PR.

This issue is to decide whether we want these features and, if so, to recover them.
Related issues:

Should be part of the analysis at:

Test Collections from CSW Harvest Source

User Story

As a data.gov developer I want to test that the existing collections being harvested from a CSW harvest source are being harvested correctly

Acceptance Criteria

  • When I search for a parent dataset (need live example of collection from CSW harvest source)

  • Then I see it in search results

  • When I search for a child dataset (need live example of collection from CSW harvest source)

  • Then I do not see the child dataset in the search results, but I do see the parent dataset in the results (need live example of collection from CSW harvest source)

  • WHEN I click on a collection

    • THEN I can see all datasets that belong to that collection
    • AND I can search within a collection for datasets that belong to that collection

Task-list

  • Find a live example of collection from CSW harvest source
  • Write test for CSW harvest source
  • Tests are green

Error importing WAF harvest sources

Related to #307

Error importing a WAF source:
Error parsing the configuration options: Unknown validation profile(s): fgdc_minimal

To reproduce

python import_harvest_sources.py     \
   --origin_url=https://catalog.data.gov     \
   --destination_url=http://ckan:5000     \
   --destination_api_key=xxxxx     \
   --source_type=waf     \
   --limit=2

Error log.

Creating organization Federal Geographic Data Committee
Request OK http://ckan:5000/api/3/action/organization_create
Config found in extras: {'value': '{"validator_profiles": ["fgdc_minimal"], "private_datasets": false}', 'key': 'config'}
Creating harvest source FGDC WAF (Hosted by DOI for Geoplatform.gov) 
	http://data.doi.gov/WAF/FGDC 
	{"validator_profiles": ["fgdc_minimal"], "private_datasets": false}
ERROR status: 409
	 content:b'{"help": "http://ckan:5000/api/3/action/help_show?name=harvest_source_create", "success": false, "error": {"__type": "Validation Error", "config": ["Error parsing the configuration options: Unknown validation profile(s): fgdc_minimal"]}}'

Tasks

  • Detect where this validator profile should be added and allow these harvest sources to be imported.
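
The 409 response suggests that the destination checks validator_profiles against the profiles it knows. The sketch below imitates that kind of check in simplified form. It is not the harvester's actual code, and the function name and the list of known profiles are hypothetical.

```python
import json


def check_validator_profiles(config, known_profiles):
    """Raise ValueError if config names a profile that is not registered."""
    profiles = config.get('validator_profiles', [])
    if not isinstance(profiles, list):
        raise ValueError('validator_profiles must be a list')
    unknown = sorted(set(profiles) - set(known_profiles))
    if unknown:
        raise ValueError(
            'Unknown validation profile(s): %s' % ','.join(unknown))


# Config string taken from the error log above.
config = json.loads(
    '{"validator_profiles": ["fgdc_minimal"], "private_datasets": false}')
known = ['iso19139', 'fgdc']  # hypothetical list without fgdc_minimal

try:
    check_validator_profiles(config, known)
    message = None
except ValueError as exc:
    message = str(exc)

# Once the profile is registered, the same config passes the check.
check_validator_profiles(config, known + ['fgdc_minimal'])
```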

Allow tests of any extension to run here

Document the steps to run the tests for any extension.
Related to #301

We need a way to fully test extensions inside this new catalog app.
For instance, to test our new catalog extension we run:

docker-compose exec ckan bash -c \
    "cd src_extensions/ckanext-catalogdatagov && nosetests -v -s --with-pylons=test.ini"

We found errors when trying to run these tests for ckanext-datajson and ckanext-harvest.
In both cases, we mount a volume with the full source and run the tests.
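
To generalise the docker-compose command shown above to any extension, the command line can be built from the extension name. This is a sketch: it assumes every extension is mounted under src_extensions/<name>, as ckanext-catalogdatagov is, and the function name and the optional test_target parameter are hypothetical.

```python
def build_test_command(extension, test_target=''):
    """Build the docker-compose argument list to run one extension's tests."""
    inner = ('cd src_extensions/{} && '
             'nosetests -v -s --with-pylons=test.ini').format(extension)
    if test_target:
        # Optionally narrow the run to one package, module or test class.
        inner += ' ' + test_target
    return ['docker-compose', 'exec', 'ckan', 'bash', '-c', inner]


command = build_test_command('ckanext-catalogdatagov')
```

The returned list can be passed to subprocess.run as is. Keeping the inner command as a single string mirrors the bash -c form used above.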

For ckanext-datajson

nosetests --ckan --with-pylons=test.ini ckanext/datajson/tests

The reset_db function freezes the tests.

Fails at "test_datajsonharvester.py", line 42, in setup: reset_db()
ERROR: ProgrammingError: (psycopg2.ProgrammingError) 
    relation "package_tag_revision" does not exist 
    LINE 1: delete from "package_tag_revision"

For ckanext-harvest

Testing notification/email in the upstream harvest extension inside this environment fails when we try to use:

nosetests --ckan --with-pylons=test.ini ckanext.harvest.tests.nose.test_action:TestHarvestErrorMail

Run saml2 locally

Allow optional execution of the saml2 extension locally

  • Add a login.gov configuration for localhost
  • Allow running the catalog locally without saml2
  • Allow running the catalog locally with saml2
  • Add basic tests for the saml2 extension

Test Collections from WAF Harvest Source

User Story

As a data.gov developer I want to test that the existing collections being harvested from a WAF harvest source are being harvested correctly

Acceptance Criteria

  • When I search for a parent dataset (need live example of collection from WAF harvest source)

  • Then I see it in search results

  • When I search for a child dataset (need live example of collection from WAF harvest source)

  • Then I do not see the child dataset in the search results, but I do see the parent dataset in the results (need live example of collection from WAF harvest source)

  • WHEN I click on a collection

    • THEN I can see all datasets that belong to that collection
    • AND I can search within a collection for datasets that belong to that collection

Task-list

  • Find a live example of collection from WAF harvest source
  • Write test for WAF harvest source
  • Tests are green

Discovery: Upgrade to >= Flask 1.0 [CVE-2019-1010083, low severity]

CVE-2019-1010083
low severity
Vulnerable versions: < 1.0.0
Patched version: 1.0.0
The Pallets Project Flask before 1.0 is affected by: unexpected memory usage. The impact is: denial of service. The attack vector is: crafted encoded JSON data. The fixed version is: 1. NOTE: this may overlap CVE-2018-1000656.

Note that CKAN 2.8.x does not appear to be compatible with Flask >= 1.0.0, although CKAN 2.9 is.
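
As an aid for the risk assessment, the vulnerable range from the advisory (< 1.0.0) can be checked against a pinned version string. A minimal sketch, assuming purely numeric version strings. The function names are hypothetical and the versions in the example are illustrative, not taken from this repository's requirements.

```python
def parse_version(version):
    """Parse a purely numeric version such as '0.12.4' into a 3-tuple."""
    parts = [int(p) for p in version.split('.')[:3]]
    # Pad short versions ('1.0') so tuple comparison behaves as expected.
    return tuple(parts + [0] * (3 - len(parts)))


def is_affected(flask_version):
    # Vulnerable versions per the advisory above: < 1.0.0
    return parse_version(flask_version) < (1, 0, 0)


# Illustrative versions only.
results = {v: is_affected(v) for v in ('0.12.4', '1.0', '1.1.2')}
```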

Acceptance Criteria

  • Determine whether risk must be remediated or should be accepted
  • Create ticket for remediation if required
