
vmaas's Introduction


VMaaS

Vulnerability Metadata as a Service

What Is This Thing?

VMaaS is intended to be a microservice that has access to data connecting RPMs, repositories, errata, and CVEs, and can answer the question "What security changes do I have to apply to the following set of RPMs?"

The goal is to have a common set of data that can be updated from multiple sources and accessed from an arbitrary number of web-service instances. To that end, database contains the docker-definitions for getting the data store up and running, webapp is the service that uses the data to answer a variety of vulnerability-related questions, and reposcan is an example of a plugin whose job is to fill the datastore with vulnerability information.

What ISN'T This Thing?

VMaaS is NOT intended to be an inventory-management system. It doesn't 'remember' system profiles or containers, or manage inventory workflow in any way. An inventory-management system could use VMaaS as one source of 'health' information for the entities being managed.

Architecture

Quick Command Guide

Local deployment (development)

All-in-one command magic

docker-compose up      # Build images and start containers
docker-compose down    # Stop and remove containers (built images will persist)
docker-compose down -v # Stop and remove containers and database data volume (built images will persist)

Build images

docker-compose build

Managing containers

All at once

docker-compose start
docker-compose stop

Single service

docker-compose start vmaas_database
docker-compose stop vmaas_database

Run tests

You can run all tests from scratch right after cloning the repo using the command:

docker-compose -f docker-compose.test.yml up --build --abort-on-container-exit

Developing / Debugging

You can build and start your container in "developer mode". You can tune metrics using Prometheus and Grafana dev containers; see doc/metrics.md.

Copy database from live OpenShift instance (requires valid credentials)

oc project vmaas-stage
# Dump database
oc exec -c vmaas-reposcan-service $(oc get pod -l pod=vmaas-reposcan-service --no-headers -o custom-columns=:metadata.name) -- bash -c 'PGPASSWORD=vmaas_writer_pwd pg_dump -h $(python3 -c "import app_common_python as a;print(a.LoadedConfig.database.hostname)") -U vmaas_writer vmaas | gzip > /data/pgdump.sql.gz'
# Download database dump
oc port-forward $(oc get pod -l pod=vmaas-reposcan-service --no-headers -o custom-columns=:metadata.name) 10000:10000
curl http://localhost:10000/pgdump.sql.gz > /tmp/pgdump.sql.gz
# Populate local database
docker-compose up -d
docker-compose exec vmaas_database psql -U vmaas_admin postgres -c "drop database vmaas"
docker-compose exec vmaas_database psql -U vmaas_admin postgres -c "create database vmaas"
cat /tmp/pgdump.sql.gz | gzip -d | docker-compose exec -T vmaas_database psql -U vmaas_admin vmaas
# Generate new webapp sqlite dump
./scripts/turnpike-mock curl -X PUT http://localhost:8081/api/v1/export/dump

vmaas's People

Contributors

blayson, bsquizz, casey-williams-rh, eherget, genalt, ggainey, jdobes, jiridostal, lphiri, matysek, michaelmraka, michalslomczynski, mischulee, mkholjuraev, mtclinton, psav, psegedy, radohanculak, semtexzv, shannon-donahue, tkasparek, tlestach, vkrizan, vmaas-bot

vmaas's Issues

updates: KeyError in pkg_id2repo_id[upd_pkg_id]

With the following packages:

vim-enhanced-2:7.4.160-2.el7.x86_64
vim-common-2:7.4.160-2.el7.x86_64
vim-minimal-2:7.4.160-1.el7.x86_64

I am hitting this error:

vmaas-webapp | ERROR:tornado.application:Uncaught exception GET /api/v1/updates/vim-common-2:7.4.160-2.el7.x86_64 (172.18.0.1)
vmaas-webapp | HTTPServerRequest(protocol='http', host='localhost:8080', method='GET', uri='/api/v1/updates/vim-common-2:7.4.160-2.el7.x86_64', version='HTTP/1.1', remote_ip='172.18.0.1', headers={'Host': 'localhost:8080', 'Accept': '*/*', 'User-Agent': 'curl/7.55.1'})
vmaas-webapp | Traceback (most recent call last):
vmaas-webapp |   File "/usr/lib64/python2.7/site-packages/tornado/web.py", line 1412, in _execute
vmaas-webapp |     result = method(*self.path_args, **self.path_kwargs)
vmaas-webapp |   File "/app/app.py", line 19, in get
vmaas-webapp |     response = self.process_string(name)
vmaas-webapp |   File "/app/app.py", line 44, in process_string
vmaas-webapp |     return self.application.updatesapi.process_list({'package_list': [data]})
vmaas-webapp |   File "/app/updates.py", line 269, in process_list
vmaas-webapp |     for r_id in pkg_id2repo_id[upd_pkg_id]:
vmaas-webapp | KeyError: 88458
vmaas-webapp | ERROR:tornado.access:500 GET /api/v1/updates/vim-common-2:7.4.160-2.el7.x86_64 (172.18.0.1) 54.48ms

It seems to be looking for a Fedora package instead of a RHEL one:

vmaas=> select * from evr where id in (select evr_id from package where id = 88458);
  id   | epoch | version  | release |                                   evr                                   
-------+-------+----------+---------+-------------------------------------------------------------------------
 16212 | 2     | 8.0.1553 | 1.fc27  | (2,"{""(8,)"",""(0,)"",""(1553,)""}","{""(1,)"",""(0,fc)"",""(27,)""}")
(1 row)

even though there is a vim-common package for RHEL:

vmaas=> select * from package inner join  evr on package.evr_id = evr.id where evr_id in (select evr_id from package where name = 'vim-common') and name = 'vim-common';

  id   |    name    | evr_id | arch_id |                             checksum                             | checksum_type_id |  id   | epoch | version  |  release  |                                           evr                                           
-------+------------+--------+---------+------------------------------------------------------------------+------------------+-------+-------+----------+-----------+-----------------------------------------------------------------------------------------
   559 | vim-common |   6507 |       1 | 93bed1cf7e0724e10f614f2e839e44ba596fc406                         |                1 |  6507 | 2     | 7.4.629  | 5.el6     | (2,"{""(7,)"",""(4,)"",""(629,)""}","{""(5,)"",""(0,el)"",""(6,)""}")
  2911 | vim-common |   2180 |       1 | 923e8a2e85e4612b5c8f7c35c08482002dcd9cff                         |                1 |  2180 | 2     | 7.2.411  | 1.8.el6   | (2,"{""(7,)"",""(2,)"",""(411,)""}","{""(1,)"",""(8,)"",""(0,el)"",""(6,)""}")
  2957 | vim-common |   1764 |       1 | 8cf91f0f612ebdb601637e6a46999fa9dadded6e                         |                1 |  1764 | 2     | 7.2.411  | 1.6.el6   | (2,"{""(7,)"",""(2,)"",""(411,)""}","{""(1,)"",""(6,)"",""(0,el)"",""(6,)""}")
  7021 | vim-common |   5708 |       1 | 58f87e449d39124a44c2b9f14e78eae482a59a8a                         |                1 |  5708 | 2     | 7.4.629  | 5.el6_8.1 | (2,"{""(7,)"",""(4,)"",""(629,)""}","{""(5,)"",""(0,el)"",""(6,)"",""(8,)"",""(1,)""}")
 14201 | vim-common |   5861 |       1 | b29afff72319f5255fb2b53e1a7718b5ac7724c8                         |                1 |  5861 | 2     | 7.2.411  | 1.4.el6   | (2,"{""(7,)"",""(2,)"",""(411,)""}","{""(1,)"",""(4,)"",""(0,el)"",""(6,)""}")
 40757 | vim-common |  12231 |       1 | 84704e35d510631dedd1ee497001c3b1b5113665                         |                1 | 12231 | 2     | 7.4.160  | 1.el7     | (2,"{""(7,)"",""(4,)"",""(160,)""}","{""(1,)"",""(0,el)"",""(7,)""}")
 49171 | vim-common |  10568 |       1 | 9dbb79a05a1eaefec5c587a5fe73d8d489d95d62                         |                1 | 10568 | 2     | 7.4.160  | 2.el7     | (2,"{""(7,)"",""(4,)"",""(160,)""}","{""(2,)"",""(0,el)"",""(7,)""}")
 49808 | vim-common |  12073 |       1 | ac2ac4703400e04e8cb81ba7d65077553d61b198                         |                1 | 12073 | 2     | 7.4.160  | 1.el7_3.1 | (2,"{""(7,)"",""(4,)"",""(160,)""}","{""(1,)"",""(0,el)"",""(7,)"",""(3,)"",""(1,)""}")
 75884 | vim-common |  15560 |       1 | 5054b42407705286e1b529e0baa5f303d723499abd77c9c891b1b1bdc717def7 |                2 | 15560 | 2     | 8.0.1553 | 1.fc26    | (2,"{""(8,)"",""(0,)"",""(1553,)""}","{""(1,)"",""(0,fc)"",""(26,)""}")
 88458 | vim-common |  16212 |       1 | 2441c4ce9568019e9f149f76db848b7f67613f48d582a960d83e9bd99067cae7 |                2 | 16212 | 2     | 8.0.1553 | 1.fc27    | (2,"{""(8,)"",""(0,)"",""(1553,)""}","{""(1,)"",""(0,fc)"",""(27,)""}")
 89278 | vim-common |  17838 |       1 | c5da666beabf2f24dcdf8c3247e005dcd42a0bf5538c0bea9801d5a97c6fabd6 |                2 | 17838 | 2     | 8.0.1573 | 1.fc27    | (2,"{""(8,)"",""(0,)"",""(1573,)""}","{""(1,)"",""(0,fc)"",""(27,)""}")
(11 rows)

Setup proper linking between containers

The reposcan README says to use the --link option to connect to the DB container. This option is deprecated and should be replaced with a better mechanism.

api: improve input packages name parsing

Currently we expect the 'epoch' information to be the first field in the rpm name: epoch:-name-version-release.arch.

We need to add the possibility to pass an rpm name in the format name-:epoch-version-release.arch
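
A minimal sketch of a parser that accepts both layouts, assuming the two forms are `epoch:name-version-release.arch` and `name-epoch:version-release.arch`. The `Nevra` tuple and `parse_rpm_name` function are hypothetical names, not part of the existing webapp code, and defaulting a missing epoch to "0" is an assumption based on how RPM treats an absent epoch.

```python
import re
from typing import NamedTuple


class Nevra(NamedTuple):
    name: str
    epoch: str
    version: str
    release: str
    arch: str


# Accepts an optional epoch either before the name or before the version.
_NEVRA_RE = re.compile(
    r"^(?:(?P<e1>\d+):)?"       # optional leading epoch, e.g. "2:"
    r"(?P<name>.+)-"            # name (may itself contain hyphens)
    r"(?:(?P<e2>\d+):)?"        # optional epoch in front of the version
    r"(?P<version>[^-:]+)-"
    r"(?P<release>[^-:]+)\."
    r"(?P<arch>[^.-]+)$"
)


def parse_rpm_name(rpm_name: str) -> Nevra:
    """Split an rpm name into its parts, whichever position the epoch is in."""
    match = _NEVRA_RE.match(rpm_name)
    if match is None:
        raise ValueError("not a valid rpm name: %r" % rpm_name)
    # Assumption: a missing epoch is treated as "0".
    epoch = match.group("e1") or match.group("e2") or "0"
    return Nevra(match.group("name"), epoch, match.group("version"),
                 match.group("release"), match.group("arch"))
```

With this sketch, `vim-minimal-2:7.2.411-1.8.el6.x86_64` and `2:vim-minimal-7.2.411-1.8.el6.x86_64` parse to the same tuple.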

reposcan: improve logging

Currently, a lot of messages are printed to stdout. A sync of ~25000 repositories produced a ~80MB log. Many of these messages are only useful for debugging and should be omitted by default.

Consider using standard libs instead of current cli/logger.py.

Include the date and time in log messages, because it's hard to debug without them.
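
One way this could look with the standard `logging` module is sketched below. The `get_logger` helper and the "reposcan" logger name are hypothetical; the point is only that the default level hides debug messages and every line carries a timestamp.

```python
import logging
import sys


def get_logger(name="reposcan", level=logging.INFO, stream=None):
    """Return a logger that timestamps messages and hides DEBUG by default."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:  # avoid adding a second handler on repeated calls
        handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
        # %(asctime)s puts date and time in front of every message.
        handler.setFormatter(logging.Formatter(
            "%(asctime)s %(levelname)s %(name)s: %(message)s"))
        logger.addHandler(handler)
    return logger
```

Passing `level=logging.DEBUG` would bring the verbose messages back when debugging.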

reposcan: updateinfo referencing packages from other repos

Errata in updateinfo can also reference related packages from other repositories (e.g. SRPMS). The current code can only assign packages from the current repository to errata. In updateinfo, packages assigned to errata are referenced by NEVRA. The code can't simply look for packages in all repositories in the DB because of the slowdown, the memory usage, and the fact that there can be multiple packages with the same NEVRA.

reposcan: add missing tests

  • there are missing unit tests for already implemented classes
    • investigate how to unit test classes for download and database import
  • create tests of complete application
  • setup Travis CI to run run_tests.sh on push/opened pull request
  • setup pylint

updates: Missing updates when using rpm name w/o epochnum

I have the same set of rpms with and without epochnum:

➜  VMaaS wc -l rpms-64
626 rpms-64
➜  VMaaS wc -l rpms-64-wo-epoch
626 rpms-64-wo-epoch

But I get a different number of updates for these two sets:

➜  VMaaS curl -d @rpms-64 'http://127.0.0.1:8080/api/v1/updates/' | python -m json.tool > updates_for_el64
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  420k    0  393k  100 27498   393k  27498  0:00:01  0:00:01 --:--:--  304k
➜  VMaaS wc -l updates_for_el64
14505 updates_for_el64


➜  VMaaS curl -d @rpms-64-wo-epoch 'http://127.0.0.1:8080/api/v1/updates/' | python -m json.tool > updates_for_el64-wo-epoch
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  345k    0  319k  100 26234   319k  26234  0:00:01  0:00:01 --:--:--  273k
➜  VMaaS wc -l updates_for_el64-wo-epoch
11922 updates_for_el64-wo-epoch

For example, there is a difference between vim-minimal-2:7.2.411-1.8.el6.x86_64 and vim-minimal-7.2.411-1.8.el6.x86_64:

➜  VMaaS curl 'http://127.0.0.1:8080/api/v1/updates/vim-minimal-2:7.2.411-1.8.el6.x86_64' | python -m json.tool
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   622    0   622    0     0    622      0 --:--:-- --:--:-- --:--:-- 17771
{
    "update_list": {
        "vim-minimal-2:7.2.411-1.8.el6.x86_64": [
            {
                "erratum": "RHSA-2016:2972",
                "package": "vim-minimal-2:7.4.629-5.el6_8.1.x86_64",
                "repository": "rhel-6-server-rpms__6_DOT_9__x86_64"
            },
            {
                "erratum": "RHSA-2016:2972",
                "package": "vim-minimal-2:7.4.629-5.el6_8.1.x86_64",
                "repository": "rhel-6-workstation-rpms__6Workstation__x86_64"
            },
            {
                "erratum": "RHSA-2016:2972",
                "package": "vim-minimal-2:7.4.629-5.el6_8.1.x86_64",
                "repository": "rhel-6-workstation-rpms__6_DOT_9__x86_64"
            },
            {
                "erratum": "RHSA-2016:2972",
                "package": "vim-minimal-2:7.4.629-5.el6_8.1.x86_64",
                "repository": "rhel-6-server-rpms__6Server__x86_64"
            }
        ]
    }
}


➜  VMaaS curl 'http://127.0.0.1:8080/api/v1/updates/vim-minimal-7.2.411-1.8.el6.x86_64' | python -m json.tool
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    59    0    59    0     0     59      0 --:--:-- --:--:-- --:--:-- 59000
{
    "update_list": {
        "vim-minimal-7.2.411-1.8.el6.x86_64": []
    }
}

/updates returns duplicate data

For a given NEVRA with an update, the same update-nevra/erratum/repo appears twice.

As an example - for kernel-2.6.32-696.el6.x86_64, RHSA-2017:2795 updated it to kernel-2.6.32-696.10.3.el6.x86_64 in two repos, rhel-6-server-rpms and rhel-6-workstation-rpms - but that combination shows up twice:

"kernel-2.6.32-696.el6.x86_64": [                                                    
            {                                                                                
                "erratum": "RHSA-2017:2795",                                                 
                "package": "kernel-2.6.32-696.10.3.el6.x86_64",                              
                "repository": "rhel-6-server-rpms"                                           
            },                                                                               
            {                                                                                
                "erratum": "RHSA-2017:2795",                                                 
                "package": "kernel-2.6.32-696.10.3.el6.x86_64",                              
                "repository": "rhel-6-workstation-rpms"                                      
            },                                                                               
            {                                                                                
                "erratum": "RHSA-2017:2795",                                                 
                "package": "kernel-2.6.32-696.10.3.el6.x86_64",                              
                "repository": "rhel-6-server-rpms"                                           
            },                                                                               
            {                                                                                
                "erratum": "RHSA-2017:2795",                                                 
                "package": "kernel-2.6.32-696.10.3.el6.x86_64",                              
                "repository": "rhel-6-workstation-rpms"                                      
            },                                        
...
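
A minimal sketch of dropping the repeated records before they are returned, assuming each entry is a dict with the "package", "erratum" and "repository" keys shown above. The `dedupe_updates` helper is hypothetical, and it only masks the symptom; the duplicate rows may originate in the query itself.

```python
def dedupe_updates(updates):
    """Drop repeated (package, erratum, repository) records, keeping first-seen order."""
    seen = set()
    unique = []
    for record in updates:
        key = (record["package"], record["erratum"], record["repository"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique
```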

non ascii characters cause 400 response

Non-ASCII characters in the input break the request: vmaas returns 400 instead of a meaningful response. For
curl http://webapp-vmaas-stable.1b13.insights.openshiftapps.com/api/v1/updates/baŠ-4.3.42-5.fc23.x86_64 I'd expect {"update_list": {"baŠ-4.3.42-5.fc23.x86_64": []}} but instead I am getting:

400 Bad request

Your browser sent an invalid request.
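
A possible client-side workaround is to percent-encode the package name before putting it in the URL. The sketch below assumes the 400 is caused by raw non-ASCII bytes in the request path, which has not been verified here. The `updates_url` helper is hypothetical, and the package name is only an illustration.

```python
from urllib.parse import quote


def updates_url(base_url, package):
    """Build an /api/v1/updates/ URL with the package name percent-encoded as UTF-8."""
    # ":" is kept literal so an epoch such as "2:" stays readable in the path.
    return base_url.rstrip("/") + "/api/v1/updates/" + quote(package, safe=":")


# "\u0160" is a non-ASCII letter used purely as an example.
print(updates_url("http://localhost:8080", "ba\u0160-4.3.42-5.fc23.x86_64"))
```

This does not remove the need for the server to return a meaningful response when it does receive such input.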

reposcan: deduplicate code

Similar code for table of keys (e.g. severity) handling is used both in reposcan and cve scan. We should merge implementations into a single one used in all places.

reposcan: sha vs sha1

Some old repositories use "sha" instead of "sha1". There needs to be a workaround that handles both "sha" and "sha1" as the same type. (Note: Spacewalk also has a workaround for this issue.)

vmaas=# select * from checksum_type;
 id |  name
----+--------
 13 | sha1
 14 | sha
 15 | sha256
(3 rows)
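
The workaround could be as small as normalizing the type name before it is looked up or inserted. This is a sketch only; `CHECKSUM_ALIASES` and `normalize_checksum_type` are hypothetical names, and where the call would go in reposcan is not decided here.

```python
# Map legacy checksum type names onto the canonical one.
CHECKSUM_ALIASES = {"sha": "sha1"}


def normalize_checksum_type(name):
    """Return the canonical checksum type name, treating "sha" as "sha1"."""
    key = name.strip().lower()
    return CHECKSUM_ALIASES.get(key, key)
```

With the names normalized on import, the checksum_type table would no longer get a separate "sha" row.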

updates: ProgrammingError: syntax error at or near ")"

I've found 2 packages so far that result in 500 Internal Server Error

Red_Hat_Enterprise_Linux-Release_Notes-7-en-US-0:7-2.el7.noarch
crontabs-1.11-6.20121102git.el7.noarch

# curl http://localhost:8080/api/v1/updates/crontabs-1.11-6.20121102git.el7.noarch               
<html><title>500: Internal Server Error</title><body>500: Internal Server Error</body></html>%
# curl http://localhost:8080/api/v1/updates/Red_Hat_Enterprise_Linux-Release_Notes-7-en-US-0:7-2.el7.noarch
<html><title>500: Internal Server Error</title><body>500: Internal Server Error</body></html>%    

Server side error:

vmaas-webapp | ERROR:tornado.application:Uncaught exception GET /api/v1/updates/Red_Hat_Enterprise_Linux-Release_Notes-7-en-US-0:7-2.el7.noarch (172.18.0.1)
vmaas-webapp | HTTPServerRequest(protocol='http', host='localhost:8080', method='GET', uri='/api/v1/updates/Red_Hat_Enterprise_Linux-Release_Notes-7-en-US-0:7-2.el7.noarch', version='HTTP/1.1', remote_ip='172.18.0.1', headers={'Host': 'localhost:8080', 'Accept': '*/*', 'User-Agent': 'curl/7.55.1'})
vmaas-webapp | Traceback (most recent call last):
vmaas-webapp |   File "/usr/lib64/python2.7/site-packages/tornado/web.py", line 1412, in _execute
vmaas-webapp |     result = method(*self.path_args, **self.path_kwargs)
vmaas-webapp |   File "/app/app.py", line 31, in get
vmaas-webapp |     response = self.process_string(name)
vmaas-webapp |   File "/app/app.py", line 54, in process_string
vmaas-webapp |     return self.application.updatesapi.process_list({'package_list': [data]})
vmaas-webapp |   File "/app/updates.py", line 209, in process_list
vmaas-webapp |     self.cursor.execute("select pkg_id, repo_id from pkg_repo where pkg_id in %s;", [tuple(update_pkg_ids)])
vmaas-webapp | ProgrammingError: syntax error at or near ")"
vmaas-webapp | LINE 1: select pkg_id, repo_id from pkg_repo where pkg_id in ();
vmaas-webapp |                                                               ^
vmaas-webapp | 
vmaas-webapp | ERROR:tornado.access:500 GET /api/v1/updates/Red_Hat_Enterprise_Linux-Release_Notes-7-en-US-0:7-2.el7.noarch (172.18.0.1) 61.20ms

This also blocks the transaction, as mentioned in #67.
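
The traceback shows the query being built with an empty tuple, which renders as `in ()`. A minimal sketch of a guard is below; the `fetch_pkg_repos` helper is a hypothetical wrapper around the statement from the traceback, and `cursor` is assumed to be a psycopg2-style cursor.

```python
def fetch_pkg_repos(cursor, update_pkg_ids):
    """Return (pkg_id, repo_id) rows, skipping the query when there are no ids."""
    if not update_pkg_ids:
        # An empty tuple would produce "in ()", which is a SQL syntax error.
        return []
    cursor.execute(
        "select pkg_id, repo_id from pkg_repo where pkg_id in %s;",
        [tuple(update_pkg_ids)],
    )
    return cursor.fetchall()
```

Skipping the query also avoids leaving the connection in a failed transaction for this case.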

Errata API missing data for fields

There are four fields that are defined in the VMaaS API details draft document for which I could not find existing data in the database. This issue is to add the fields to the database and update the Errata API implementation to include these fields in the response.

  • type
  • summary
  • bugzilla list
  • reference list

reposcan: KeyError: 'http://pulp_host/.../6/6Server/x86_64/os/'

vmaas_reposcan | Traceback (most recent call last):
vmaas_reposcan |   File "./reposcan.py", line 40, in <module>
vmaas_reposcan |     repository_controller.store()
vmaas_reposcan |   File "/vmaas-reposcan/repodata/repository_controller.py", line 117, in store
vmaas_reposcan |     self._read_repomds(failed)
vmaas_reposcan |   File "/vmaas-reposcan/repodata/repository_controller.py", line 52, in _read_repomds
vmaas_reposcan |     db_revision = self.db_repositories[repository.repo_url]["revision"]
vmaas_reposcan | KeyError: 'http://pulp_host/.../6/6Server/x86_64/os/'

reposcan: support removal of data

  • add support for removal of previously synced repository
  • add support for clearing DB at once (maybe, it can be done by removing DB container as well)
  • add automatic removal of redundant rows (packages/evr/errata that are not part of any repositories anymore)

Updating CVEs is failing

I observed the following behavior:

  1. Create clean database
  2. sync CVEs data: curl -X GET http://127.0.0.1:8081/api/v1/sync/cve
  3. sync repos: curl -d @/tmp/repolist_product -X POST http://127.0.0.1:8081/api/v1/sync/repo
    => PASS

but

  1. Create clean database
  2. sync repos: curl -d @/tmp/repolist_product -X POST http://127.0.0.1:8081/api/v1/sync/repo
  3. sync CVEs data: curl -X GET http://127.0.0.1:8081/api/v1/sync/cve
    => FAIL
vmaas-reposcan | Syncing CVEs : 6745
vmaas-reposcan | Syncing 6745 CVEs.
vmaas-reposcan | Severities missing in DB: 1
vmaas-reposcan | CVEs to import: 6743
vmaas-reposcan | CVEs to update: 2
vmaas-reposcan | Traceback (most recent call last):
vmaas-reposcan |   File "/vmaas-reposcan/reposcan.py", line 66, in cve_sync_task
vmaas-reposcan |     controller.store()
vmaas-reposcan |   File "/vmaas-reposcan/nistcve/cve_controller.py", line 131, in store
vmaas-reposcan |     self.cverepo_store.store(repo)
vmaas-reposcan |   File "/vmaas-reposcan/database/cverepo_store.py", line 56, in store
vmaas-reposcan |     self.cve_store.store(repo)
vmaas-reposcan |   File "/vmaas-reposcan/database/cve_store.py", line 178, in store
vmaas-reposcan |     self._populate_cves(repo)
vmaas-reposcan |   File "/vmaas-reposcan/database/cve_store.py", line 167, in _populate_cves
vmaas-reposcan |     list(to_update), page_size=len(to_update))
vmaas-reposcan |   File "/usr/lib64/python3.6/site-packages/psycopg2/extras.py", line 1256, in execute_values
vmaas-reposcan |     cur.execute(b''.join(parts))
vmaas-reposcan | psycopg2.ProgrammingError: column "severity_id" is of type integer but expression is of type text
vmaas-reposcan | LINE 3: ...                                    severity_id = v.severity...
vmaas-reposcan |                                                              ^
vmaas-reposcan | HINT:  You will need to rewrite or cast the expression.

No longer get results from API call

I've recently rebased my fork off of RedHatInsights/vmaas and now I no longer get results when I hit the API. I've checked the database manually and at least 2 packages (all that I checked) submitted have later version packages in the db. The response from the API call simply lists the same packages I sent in.

I reset my git repo to various commits and found that this commit 6be821afad80d81ed1b2ea8c70c279f506fd3a58 (api: add method to process input packages list and make result in format:...) is the first commit where I get no results.

In debugging this issue, it seems the database image is somehow being pre-populated with data as the first time reposcan runs on a freshly built image/container, it skips downloading from all 16 repos. I'm not sure if this is related, but my concern is the pre-populated database could be missing values in newly added fields, etc.

my pkgfile.json contains:

{
    "package_list": [
        "qt-settings-19-22.4.el7.noarch",
        "perl-Pod-Checker-1.50-1.el7.noarch",
        "bash-4.2.46-20.el7_2.x86_64",
        "postgresql-9.2.10-1.el7.x86_64",
        "xorg-x11-proto-devel-7.7-13.el7.noarch"
    ]
}

The command to make the API call and view the response is:
curl -X POST -d @pkgfile.json -H "Content-Type: application/json" http://localhost:8080/api/v1/json | python -m json.tool | less

the API response from current RedHatInsights/master is:

{
    "bugzilla_list": {},
    "cve_list": {},
    "errata_list": {},
    "nevra_list": {},
    "package_list": {
        "bash-4.2.46-20.el7_2.x86_64": [],
        "perl-Pod-Checker-1.50-1.el7.noarch": [],
        "postgresql-9.2.10-1.el7.x86_64": [],
        "qt-settings-19-22.4.el7.noarch": [],
        "xorg-x11-proto-devel-7.7-13.el7.noarch": []
    },
    "repo_list": {}
}

if I reset my git repo to the commit before the one mentioned above, the API response is:

{
    "bugzilla_list": {},
    "cve_list": {},
    "errata_list": {},
    "nevra_list": {},
    "package_list": {
        "bash-4.2.46-20.el7_2.x86_64": [
            [
                "bash-4.2.46-28.el7",
                "RHSA-2017:1931",
                "http://pulp-read.dist.prod.ext.phx2.redhat.com/content/dist/rhel/server/7/7Server/x86_64/os/"
            ],
            [
                "bash-4.2.46-28.el7",
                "RHSA-2017:1931",
                "http://pulp-read.dist.prod.ext.phx2.redhat.com/content/dist/rhel/workstation/7/7Workstation/x86_64/os/"
            ]
        ],
        "perl-Pod-Checker-1.50-1.el7.noarch": [],
        "postgresql-9.2.10-1.el7.x86_64": [],
        "qt-settings-19-22.4.el7.noarch": [],
        "xorg-x11-proto-devel-7.7-13.el7.noarch": [
            [
                "xorg-x11-proto-devel-7.7-20.el7",
                "RHSA-2017:1865",
                "http://pulp-read.dist.prod.ext.phx2.redhat.com/content/dist/rhel/server/7/7Server/x86_64/os/"
            ],
            [
                "xorg-x11-proto-devel-7.7-20.el7",
                "RHSA-2017:1865",
                "http://pulp-read.dist.prod.ext.phx2.redhat.com/content/dist/rhel/workstation/7/7Workstation/x86_64/os/"
            ]
        ]
    },
    "repo_list": {}
}

return package description

Add the package description to the API. This can likely be pulled in by the reposcan from primary.xml.

reposcan: wait until db is ready

When started via docker-compose, reposcan is ready sooner than the database, which leads to:

vmaas_reposcan | Traceback (most recent call last):
vmaas_reposcan |   File "./reposcan.py", line 32, in <module>
vmaas_reposcan |     repository_controller = RepositoryController()
vmaas_reposcan |   File "/vmaas-reposcan/repodata/repository_controller.py", line 23, in __init__
vmaas_reposcan |     self.repo_store = RepositoryStore()
vmaas_reposcan |   File "/vmaas-reposcan/database/repository_store.py", line 11, in __init__
vmaas_reposcan |     self.conn = DatabaseHandler.get_connection()
vmaas_reposcan |   File "/vmaas-reposcan/database/database_handler.py", line 16, in get_connection
vmaas_reposcan |     database=cls.db_name, user=cls.db_user, password=cls.db_pass, host=cls.db_host, port=cls.db_port)
vmaas_reposcan |   File "/usr/lib64/python3.6/site-packages/psycopg2/__init__.py", line 130, in connect
vmaas_reposcan |     conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
vmaas_reposcan | psycopg2.OperationalError: could not connect to server: Connection refused
vmaas_reposcan | 	Is the server running on host "database" (172.18.0.2) and accepting
vmaas_reposcan | 	TCP/IP connections on port 5432?
vmaas_reposcan | 

reposcan should test db availability, wait and retry (e.g. 5x).
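
A rough sketch of the wait-and-retry idea, independent of any particular driver. `connect_with_retry` is a hypothetical helper; the 5 attempts come from the suggestion above, while the 2-second delay is an arbitrary assumption.

```python
import time


def connect_with_retry(connect, attempts=5, delay=2.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call connect() until it succeeds, waiting between failed attempts."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except retry_on as err:
            last_error = err
            if attempt < attempts:
                sleep(delay)  # give the database time to come up
    raise last_error  # every attempt failed; surface the last error
```

In reposcan this might be called as `connect_with_retry(DatabaseHandler.get_connection, retry_on=(psycopg2.OperationalError,))`, so that only connection failures are retried.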

reposcan refresh results in lots of tracebacks

Updated to latest, and repo/cve scan has started throwing the following tracebacks even when they succeed:

===
...
CVE to CWE mapping to import: 0
Syncing CVEs finished.
CVE sync task finished: OK.
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/urllib3/connection.py", line 141, in _new_conn
(self.host, self.port), self.timeout, **extra_kw)
File "/usr/lib/python3.6/site-packages/urllib3/util/connection.py", line 83, in create_connection
raise err
File "/usr/lib/python3.6/site-packages/urllib3/util/connection.py", line 73, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 601, in urlopen
chunked=chunked)
File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 357, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/lib64/python3.6/http/client.py", line 1239, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/lib64/python3.6/http/client.py", line 1285, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/lib64/python3.6/http/client.py", line 1234, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib64/python3.6/http/client.py", line 1026, in _send_output
self.send(msg)
File "/usr/lib64/python3.6/http/client.py", line 964, in send
self.connect()
File "/usr/lib/python3.6/site-packages/urllib3/connection.py", line 166, in connect
conn = self._new_conn()
File "/usr/lib/python3.6/site-packages/urllib3/connection.py", line 150, in _new_conn
self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f484e5daeb8>: Failed to establish a new connection: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/requests/adapters.py", line 440, in send
timeout=timeout
File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 639, in urlopen
_stacktrace=sys.exc_info()[2])
File "/usr/lib/python3.6/site-packages/urllib3/util/retry.py", line 388, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='webapp', port=8079): Max retries exceeded with url: /api/internal/refresh (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f484e5daeb8>: Failed to establish a new connection: [Errno 111] Connection refused',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/vmaas-reposcan/reposcan.py", line 139, in _notify_webapp
response = requests.get(refresh_url)
File "/usr/lib/python3.6/site-packages/requests/api.py", line 72, in get
return request('get', url, params=params, **kwargs)
File "/usr/lib/python3.6/site-packages/requests/api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 508, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 618, in send
r = adapter.send(request, **kwargs)
File "/usr/lib/python3.6/site-packages/requests/adapters.py", line 508, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='webapp', port=8079): Max retries exceeded with url: /api/internal/refresh (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f484e5daeb8>: Failed to establish a new connection: [Errno 111] Connection refused',))

Unable to connect to http://webapp:8079/api/internal/refresh.

Add Sphinx documentation

Even though there is already a wiki for this project, I feel that we should use Sphinx-generated documentation to cover how to install it, how to use it (info already on the README page), and the endpoints (examples of what to pass and what gets returned).

I'd also add some type of rule (i.e. linting) to make sure that docs are checked before PRs get merged to ensure that we keep them up to date.

aggregate products that an erratum is applicable to

Currently, for each update we get a list of repositories. From these we can use /repos to infer the product name. However, this limits our knowledge of affected products to only those from which we collected a package list.

It would be great if vmaas could aggregate the products for an erratum so that a general list of products a given erratum is applicable to can be requested. (e.g. on errata/erratum_name)

For example:

affected_products: ['Red Hat Enterprise Linux Server 7 x86_64', 'Red Hat Enterprise Linux Server - AUS 7.4 x86_64', ...]
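
A minimal sketch of the aggregation being requested, assuming VMaaS can already map each repository label to a product name. The function name and both input structures are hypothetical and not taken from the VMaaS code.

```python
def aggregate_products(erratum_repos, repo_to_product):
    """Return the de-duplicated product names for one erratum.

    erratum_repos: iterable of repository labels the erratum appears in.
    repo_to_product: mapping of repository label -> product name.
    Both inputs are hypothetical stand-ins for the real data store.
    """
    products = []
    seen = set()
    for label in erratum_repos:
        product = repo_to_product.get(label)
        # Skip repositories with no known product and products already listed.
        if product is not None and product not in seen:
            seen.add(product)
            products.append(product)
    return products
```

The result keeps first-seen order, so it could be returned directly as the affected_products list.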

Records in "update_list" contain null values for "repository"

{
	"update_list": {
		"bash-0:4.2.46-20.el7_2.x86_64": [
			{
				"erratum": "RHSA-2017:1931",
				"repository": null,
				"package": "bash-4.2.46-28.el7.x86_64"
			},
			{
				"erratum": "RHSA-2017:1931",
				"repository": null,
				"package": "bash-4.2.46-28.el7.x86_64"
			},

The value of "repository" should be the name of the repository where the update is available.

date and datetime information should be standardized

Right now VMaaS displays date and datetime information exactly as it is stored in the source of origin. This means that these fields may be displayed in UTC or may contain timezone information.

I feel that we should standardize how these fields are stored and displayed so as not to cause confusion for our users and/or the systems requesting information.
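
One possible convention, sketched below, is to emit every value as ISO 8601 in UTC. The helper name is hypothetical, and treating naive datetimes as already being in UTC is an assumption; the issue does not say which convention should be adopted.

```python
from datetime import datetime, timedelta, timezone


def to_utc_iso(value):
    """Format a datetime as an ISO 8601 string in UTC.

    Assumption: a naive datetime (no tzinfo) is already expressed in UTC.
    """
    if value.tzinfo is None:
        value = value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc).isoformat()


# A naive value and a value carrying a +02:00 offset end up in the same format.
naive = datetime(2012, 7, 31, 10, 45)
aware = datetime(2012, 7, 31, 12, 45, tzinfo=timezone(timedelta(hours=2)))
print(to_utc_iso(naive))
print(to_utc_iso(aware))
```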

Errata References - how to process and what to include?

  1. updateinfo contains references of 4 types. The "self" type seems to be a URL pointing to info about the erratum. However, I noticed many contain URLs pointing to rhn.redhat.com. So I left in place the code in webapp that sets an erratum's url field to "https://access.redhat.com/errata/%s" % erratum_name. I am not storing the "self" type references found in updateinfo.

  2. I added a table (errata_refs) that contains updateinfo references of types "other" and "bugzilla". I was not able to move "cve" types into this table because they need to have the foreign key constraint to the cve table (since CWE's also map to CVE's).

  3. Several of the "other" type references did not have id's. The ones that did not have id's looked to be url references to other docs of some sort. The ones that did have id's had id's with values like "classification", "ref_0", "ref_2", and these matched the errata api example in the errata API details document. So in reposcan, only "other" type references that have non-null id's are stored in the errata_refs table (in addition to all "bugzilla" types), and the name is constructed by appending "-" + erratum_name to the value found in the reference id field. For example "classification-RHSA-2017:1931" or "ref_0-RHSA-2017:1931".
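
The selection and naming rule from the points above can be sketched as follows. The shape of a parsed reference (a dict with "type" and "id" keys) and the function name are hypothetical. How "bugzilla" references are named is not stated above, so keeping their own id is an assumption.

```python
def select_refs(erratum_name, references):
    """Pick the updateinfo references to store in errata_refs and name them.

    references: list of dicts with "type" and "id" keys (hypothetical shape).
    Returns a list of (type, name) tuples.
    """
    stored = []
    for ref in references:
        ref_type, ref_id = ref.get("type"), ref.get("id")
        if ref_type == "bugzilla":
            # Assumption: bugzilla references keep their own id as the name.
            stored.append((ref_type, ref_id))
        elif ref_type == "other" and ref_id is not None:
            # "other" references are named "<id>-<erratum_name>".
            stored.append((ref_type, "%s-%s" % (ref_id, erratum_name)))
        # "self" and "cve" references are not stored in errata_refs.
    return stored
```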

reposcan: review memory consumption

While syncing 25000 repositories in a single run I noticed that memory consumption slowly grows. This could be caused by keeping some parsed repodata values in memory after they have been imported and are no longer needed. It should be checked.

api: format of JSON response of 'updates' API

Currently the output has this format:

"bash-4.2.46-20.el7_2.x86_64":[  
         [  
            "bash-4.2.46-21.el7_3.x86_64",
            "RHBA-2016:2858",
            "rhel-7-server-rpms"
         ]
]

Maybe it would be better to have a dictionary instead of list:

"bash-4.2.46-20.el7_2.x86_64":[  
        { 
          "package": "bash-4.2.46-21.el7_3.x86_64",
          "erratum": "RHBA-2016:2858",
          "repository": "rhel-7-server-rpms"
        }
]

This could be useful because order doesn't matter in this case.
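
A short sketch of the conversion, assuming the current positional order is always package, erratum, repository as in the example above. The function name is hypothetical.

```python
def triples_to_dicts(update_list):
    """Convert [package, erratum, repository] lists into keyed dictionaries."""
    keys = ("package", "erratum", "repository")
    return {
        nevra: [dict(zip(keys, triple)) for triple in triples]
        for nevra, triples in update_list.items()
    }


current = {
    "bash-4.2.46-20.el7_2.x86_64": [
        ["bash-4.2.46-21.el7_3.x86_64", "RHBA-2016:2858", "rhel-7-server-rpms"],
    ]
}
proposed = triples_to_dicts(current)
```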

reposcan: support configuration of values

There are (and will be) some hard-coded values in the code, such as the number of download threads, that should be configurable through a config file or in some container-friendly way.
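
Environment variables are one container-friendly option. The sketch below reads an integer setting with a fallback; the variable name REPOSCAN_DOWNLOAD_THREADS and the default of 5 are placeholders, not values from the project.

```python
import os

DEFAULT_DOWNLOAD_THREADS = 5  # placeholder default, not taken from reposcan


def get_int_setting(name, default, environ=os.environ):
    """Read an integer setting from the environment, falling back to default."""
    raw = environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        return int(raw)
    except ValueError:
        raise ValueError("%s must be an integer, got %r" % (name, raw))


# Hypothetical variable name; unset means the default is used.
threads = get_int_setting("REPOSCAN_DOWNLOAD_THREADS", DEFAULT_DOWNLOAD_THREADS)
```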

Repo labels

Fix the mismatch between the repo labels seen on the client and those in Pulp.
Repo vs. content set.

Attempt to scan repos fails with KeyError: 'products'

The request POST http://127.0.0.1:8081/api/v1/sync/repo fails with

{
	"success": false,
	"msg": "Incorrect JSON format."
}

and

vmaas-reposcan | Traceback (most recent call last):
vmaas-reposcan |   File "/vmaas-reposcan/reposcan.py", line 230, in post
vmaas-reposcan |     products, repos = self._parse_input_list()
vmaas-reposcan |   File "/vmaas-reposcan/reposcan.py", line 210, in _parse_input_list
vmaas-reposcan |     for product_name, product in repo_group["products"].items():
vmaas-reposcan | KeyError: 'products'
vmaas-reposcan | 
vmaas-reposcan | WARNING:tornado.access:400 POST /api/v1/sync/repo (172.18.0.1) 0.93ms

The body I used for the request worked the last time I used it. I can strip off the certificate/key data and paste it here if needed.
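
The traceback shows _parse_input_list indexing repo_group["products"] directly. One way to make the 400 response more helpful is to check for the expected keys first and name the missing one in "msg". This is only a sketch: the helper is hypothetical, and the assumption that the body is a list of objects is inferred from the traceback rather than confirmed.

```python
def find_missing_keys(repo_groups, required=("products",)):
    """Return a list of human-readable problems found in the request body.

    repo_groups: the decoded JSON body, assumed to be a list of objects.
    """
    problems = []
    for index, group in enumerate(repo_groups):
        if not isinstance(group, dict):
            problems.append("item %d is not an object" % index)
            continue
        for key in required:
            if key not in group:
                problems.append("item %d is missing key '%s'" % (index, key))
    return problems
```

An empty result means the body has the expected top-level shape; otherwise the problems can be joined into the error message instead of the generic "Incorrect JSON format."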

Wrong Content-Type in repo sync and cve sync response

For both /api/v1/sync/repo and /api/v1/sync/cve the Content-Type of the response is text/html; charset=UTF-8, whereas it should be application/json; charset=UTF-8.

The response data is JSON:

{
	"success": true,
	"msg": "Repo sync task started."
}

It's important to set the Content-Type correctly so automated tools can handle the data contained in the response.
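
A framework-neutral sketch of what the fix amounts to: build the JSON body and the matching header together so they cannot drift apart. The helper name is hypothetical. In a Tornado handler the header pair would be passed to RequestHandler.set_header before writing the body.

```python
import json

JSON_CONTENT_TYPE = "application/json; charset=UTF-8"


def json_response(payload):
    """Return (headers, body) for a JSON reply."""
    body = json.dumps(payload).encode("utf-8")
    return {"Content-Type": JSON_CONTENT_TYPE}, body


headers, body = json_response({"success": True, "msg": "Repo sync task started."})
```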

How to better find security errata

E.g. this is clearly an erratum fixing some CVEs (according to associated BZ names) - https://access.redhat.com/errata/RHSA-2018:0512

But it has an empty CVEs field, so we effectively ignore this erratum as a possible security update.

I think we shouldn't return just errata having any CVE associated. We should return any security errata (even without CVEs associated like in this example) + errata having any CVE associated (to include also regular bugfix errata fixing CVEs).
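
The proposed rule can be written as a single predicate. This is a sketch only: the erratum is modelled as a dict with hypothetical "type" and "cve_list" keys, and "security" as the type value is an assumption about how the type is stored.

```python
def is_security_relevant(erratum):
    """True for security errata and for any erratum with CVEs attached.

    erratum: dict with hypothetical "type" and "cve_list" keys.
    """
    return erratum.get("type") == "security" or bool(erratum.get("cve_list"))


errata = [
    {"name": "security-no-cves", "type": "security", "cve_list": []},
    {"name": "bugfix-with-cve", "type": "bugfix", "cve_list": ["CVE-2016-2178"]},
    {"name": "plain-bugfix", "type": "bugfix", "cve_list": []},
]
relevant = [e["name"] for e in errata if is_security_relevant(e)]
```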

RFE: Allow filtering of erratum type for the api/v1/updates endpoint

One should be able to request a specific erratum type when POSTing to the api/v1/updates endpoint. The attribute could be called erratum_type and it should take a list of strings.

If the list is empty or not provided, then only security-related errata should be returned.
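
A sketch of the requested behaviour, including the default. The errata are modelled as dicts with a hypothetical "type" key, and the type strings used here ("security", "bugfix") are assumptions about the stored values.

```python
DEFAULT_ERRATUM_TYPES = ("security",)


def filter_by_erratum_type(errata, erratum_type=None):
    """Keep errata whose type is requested; default to security only."""
    # An empty or missing list falls back to the security-only default.
    wanted = set(erratum_type) if erratum_type else set(DEFAULT_ERRATUM_TYPES)
    return [e for e in errata if e.get("type") in wanted]


sample = [
    {"name": "A", "type": "security"},
    {"name": "B", "type": "bugfix"},
]
```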

reposcan: Python 3 vs Python 2

Reposcan code is Py3-compatible only. This is the reason why its container is based on Fedora. Py3 versions of the required libraries are not available in CentOS/RHEL 7 (installing them from pip is not good enough for future production use). There are some options for solving this:

  1. Make the code both Py2 and Py3 compatible, so it can run natively in Py2 (CentOS 7 container) and natively in Py3 (Fedora container). This doubles the required effort for testing etc.
  2. Create custom (COPR?) RPM repository with missing Py3 libraries built. These libraries need to be built in SCL fashion.

webapp: wait until db is ready

When started via docker-compose, webapp is ready sooner than the database, which leads to

vmaas_webapp | Traceback (most recent call last):
vmaas_webapp |   File "/app/app.py", line 19, in <module>
vmaas_webapp |     os.getenv('POSTGRES_PORT', errata.DEFAULT_DB_PORT))
vmaas_webapp |   File "/app/errata.py", line 76, in init_db
vmaas_webapp |     connection = psycopg2.connect(database=db_name, user=db_user, password=db_pass, host=db_host, port=db_port)
vmaas_webapp |   File "/usr/lib64/python2.7/site-packages/psycopg2/__init__.py", line 164, in connect
vmaas_webapp |     conn = _connect(dsn, connection_factory=connection_factory, async=async)
vmaas_webapp | psycopg2.OperationalError: could not connect to server: Connection refused
vmaas_webapp | 	Is the server running on host "database" (172.18.0.2) and accepting
vmaas_webapp | 	TCP/IP connections on port 5432?
vmaas_webapp | 
vmaas_webapp exited with code 1

It should test db availability, wait, and retry (e.g. 5x).
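
A minimal retry loop along those lines. The helper is hypothetical and takes the connect call as a parameter so it does not depend on any particular driver; the 2 second delay is a placeholder.

```python
import time


def connect_with_retry(connect, attempts=5, delay=2.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call connect() until it succeeds or the attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except retry_on as err:
            last_error = err
            if attempt < attempts:
                sleep(delay)  # wait before the next try
    # Every attempt failed: surface the last error to the caller.
    raise last_error
```

In webapp this could wrap the existing call, for example connect_with_retry(lambda: psycopg2.connect(...), retry_on=(psycopg2.OperationalError,)), keeping the arguments that init_db already passes.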

Fix pylint issues in webapp

Currently I've (temporarily!) disabled pylint warnings in webapp:
************* Module errata
R: 10, 4: Too many arguments (9/5) (too-many-arguments)
R: 97, 4: Too many local variables (18/15) (too-many-locals)
************* Module updates
R: 73, 4: Too many local variables (57/15) (too-many-locals)
R:271, 8: Too many nested blocks (8/5) (too-many-nested-blocks)
R: 73, 4: Too many branches (49/12) (too-many-branches)
R: 73, 4: Too many statements (134/50) (too-many-statements)

These need to be resolved and the '#pylint disable' comments removed.

cvss score missing from /cve endpoint

Need the cvss score returned along with the other cve data.

{
    "CVE-2016-2178": {
        "impact": null,
        "public_date": "None",
        "synopsis": "CVE-2016-2178",
        "description": null,
        "modified_date": "None",
        "redhat_url": null,
        "secondary_url": null,
        "cwe_list": []
    }
}

[Feature Request] Return list of all known repos

Repository names are needed for constructing requests like:

{
    "package_list": [
        ...
    ],
    "repository_list": [
      "rhel-6-workstation-rpms",
      "rhel-7-server-rpms"
    ]
}

There should be a way to list all repositories known to VMaaS. Currently it's possible to query repos by name (label):
GET /api/v1/repos/rhel-7-server-rpms

When no repository name is specified I expected the response to be a list of all known repositories:
GET /api/v1/repos/
This is not the case, and I don't find the current response meaningful:

{
	"repository_list": {
		"": null
	}
}
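
The expected behaviour can be sketched as a lookup that falls back to the full list when no label is given. The function and the in-memory mapping are hypothetical stand-ins for the real handler and database query.

```python
def lookup_repos(known_repos, label=None):
    """Return details for one label, or for every known repo when label is empty.

    known_repos: mapping of repository label -> details (hypothetical shape).
    """
    if not label:
        return {"repository_list": dict(known_repos)}
    return {"repository_list": {label: known_repos.get(label)}}
```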

reposcan: wrong info about `epoch` in database

I noticed one package that has been parsed incorrectly by reposcan.

repository: server/7/7Server/x86_64/os/Packages/
package: texlive-soul-svn15878.2.4-32.el7.noarch.rpm

should be parsed as:
epoch: 0
name: texlive-soul
version: svn15878.2.4
release: 32.el7
arch: noarch

BUT in database I see:
  id  | epoch |   version    | release | evr
------+-------+--------------+---------+--------------------------------------------------------------------------------------
 1340 |     2 | svn15878.2.4 | 32.el7  | (2,"{""(0,svn)"",""(15878,)"",""(2,)"",""(4,)""}","{""(32,)"",""(0,el)"",""(7,)""}")
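
For reference, the split expected in this report can be reproduced from the filename alone with the sketch below. The function is hypothetical and is not reposcan's parser. RPM filenames normally omit the epoch, so the default of 0 is an assumption; the epoch recorded in the repository metadata may differ and is worth comparing against before treating the database value as wrong.

```python
def parse_rpm_filename(filename):
    """Split an RPM filename into (name, epoch, version, release, arch)."""
    base = filename[:-4] if filename.endswith(".rpm") else filename
    base, arch = base.rsplit(".", 1)
    base, release = base.rsplit("-", 1)
    name, version = base.rsplit("-", 1)
    epoch = "0"  # assumption: no explicit epoch in the name means 0
    if ":" in version:  # e.g. "bash-0:4.2.46-20.el7_2.x86_64"
        epoch, version = version.split(":", 1)
    return name, epoch, version, release, arch


parsed = parse_rpm_filename("texlive-soul-svn15878.2.4-32.el7.noarch.rpm")
```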

datetime is not JSON serializable

When calling CVE api, 500 Internal Server Error is returned, with traceback.
TypeError: datetime.datetime(2012, 7, 31, 10, 45) is not JSON serializable.

This is caused by ujson removal.
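
With the standard library json module, one common fix is a default hook that converts date and datetime values to ISO 8601 strings. The helper names below are hypothetical, and the output format is an assumption since the API does not define one yet.

```python
import json
from datetime import date, datetime


def json_default(value):
    """Fallback used by json.dumps for types it cannot serialize itself."""
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    raise TypeError("%r is not JSON serializable" % (value,))


def dumps(payload):
    return json.dumps(payload, default=json_default)


encoded = dumps({"public_date": datetime(2012, 7, 31, 10, 45)})
```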

reposcan: inconsistent severity format

The metadata of some repositories apparently contains a different severity string format, which generates the following mess in the DB:

vmaas=# select * from severity;
 id |     name
----+--------------
 40 | Low
 41 | None
 42 | Critical
 43 | Important
 44 | Moderate
 45 | None        +
    |
 46 | Important   +
    |
 47 | Critical    +
    |
 48 | Moderate    +
    |
 49 | Low         +
    |
(10 rows)
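
In psql output a trailing "+" marks a line break inside the value, so the duplicate rows look like the same names with a newline appended. A sketch of normalizing the string before it is stored follows; the helper is hypothetical, and trimming whitespace is assumed to be the only difference between the variants.

```python
def normalize_severity(raw):
    """Strip surrounding whitespace (including newlines) from a severity string."""
    if raw is None:
        return None
    cleaned = raw.strip()
    return cleaned or None


raw_values = ["Low", "Low\n", "Important", "Important\n", "Critical\n"]
unique = sorted({normalize_severity(value) for value in raw_values})
```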
