
A free and open vulnerabilities database and the packages they impact. And the tools to aggregate and correlate these vulnerabilities. Sponsored by NLnet https://nlnet.nl/project/vulnerabilitydatabase/ for https://www.aboutcode.org/ Chat at https://gitter.im/aboutcode-org/vulnerablecode Docs at https://vulnerablecode.readthedocs.org/

Home Page: https://public.vulnerablecode.io

License: Apache License 2.0

Python 52.72% CSS 5.29% HTML 41.22% Dockerfile 0.04% Nix 0.27% Shell 0.13% Makefile 0.26% JavaScript 0.07%
cve security security-tools vulnerability-detection vulnerability vulnerability-databases vulnerability-database vulnerability-scanners vulnerability-identification vulndb

vulnerablecode's Introduction

VulnerableCode

Build Status Code License Data License Python 3.8+ stability-wip Gitter chat

VulnerableCode is a free and open database of open source software package vulnerabilities, because vulnerability data and tools for open source software should themselves be free and open source.

We are trying to change the status quo in the following areas:

  • Vulnerability databases have been traditionally proprietary even though they are mostly about free and open source software.
  • Vulnerability databases also often contain a lot of lesser value data which means a lot of false positive signals that require extensive expert reviews.
  • Vulnerability databases are also mostly about vulnerabilities first and software packages second, making it difficult to find if and when a vulnerability applies to a piece of code. VulnerableCode focuses on software packages first, with a Package URL as the key and natural identifier for each package; this makes it easier to find a package and determine whether it is vulnerable.

Package URLs themselves were first designed in ScanCode and VulnerableCode and are now a de facto standard for vulnerability management and package references.

See https://github.com/package-url/purl-spec
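
To make the idea of a Package URL as a package key concrete, here is a minimal sketch that assembles a simplified purl string of the form pkg:type/namespace/name@version. The helper name build_purl is hypothetical, and the sketch deliberately ignores qualifiers, subpaths and the type-specific normalization rules in the specification; a real project would more likely rely on a dedicated purl library.

```python
from urllib.parse import quote


def build_purl(type_, name, version=None, namespace=None):
    """Build a simplified Package URL string (no qualifiers or subpath)."""
    parts = [type_.lower()]
    if namespace:
        # Percent-encode each namespace segment separately.
        parts.extend(quote(segment, safe="") for segment in namespace.split("/"))
    parts.append(quote(name, safe=""))
    purl = "pkg:" + "/".join(parts)
    if version:
        purl += "@" + quote(version, safe="")
    return purl


print(build_purl("deb", "mimetex", version="1.50-1.1", namespace="debian"))
# prints: pkg:deb/debian/mimetex@1.50-1.1
```

Because the string carries the package type, namespace, name and version together, it can serve as a single lookup key for "is this package vulnerable?" queries.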

The VulnerableCode project is a FOSS community resource to help improve the security of the open source software ecosystem and its users at large.

VulnerableCode consists of a database and the tools to collect, refine and keep the database current.

Warning

VulnerableCode is under active development and is not yet fully usable.

Read more about VulnerableCode https://vulnerablecode.readthedocs.org/

VulnerableCode is financially supported by NLnet, nexB, Google (through the GSoC) and the active contributions of several volunteers.

The VulnerableCode tech stack is Python, Django, PostgreSQL, nginx and Docker, plus several libraries.

Getting started

Run with Docker

First install docker and docker-compose, then run:

git clone https://github.com/nexB/vulnerablecode.git && cd vulnerablecode
make envfile
docker-compose build
docker-compose up -d
docker-compose run vulnerablecode ./manage.py import --list

Then run an importer for nginx advisories (which is small):

docker-compose exec vulnerablecode ./manage.py import vulnerabilities.importers.nginx.NginxImporter
docker-compose exec vulnerablecode ./manage.py improve --all

At this point, the VulnerableCode app and API should be up and running with some data at http://localhost

Populate VulnerableCode database

VulnerableCode data collection works in two steps: importing data from multiple sources and then refining and improving how package and software vulnerabilities are related.

To run all importers and improvers use this:

./manage.py import --all
./manage.py improve --all

Local development installation

On a Debian system, use this:

sudo apt-get install  python3-venv python3-dev postgresql libpq-dev build-essential
git clone https://github.com/nexB/vulnerablecode.git && cd vulnerablecode
make dev envfile postgres
make test
source venv/bin/activate
./manage.py import vulnerabilities.importers.nginx.NginxImporter
./manage.py improve --all
make run

At this point, the VulnerableCode app and API are up at http://127.0.0.1:8001/

Interface

VulnerableCode comes with a minimal web UI:

vulnerablecode-ui.png

And a JSON API and its minimal web documentation:

vulnerablecode-json-api.png

vulnerablecode-api-doc.png

License

Copyright (c) nexB Inc. and others. All rights reserved.

VulnerableCode is a trademark of nexB Inc.

SPDX-License-Identifier: Apache-2.0 AND CC-BY-SA-4.0

VulnerableCode software is licensed under the Apache License version 2.0.

VulnerableCode data is licensed collectively under CC-BY-SA-4.0.

See https://www.apache.org/licenses/LICENSE-2.0 for the license text.

See https://creativecommons.org/licenses/by-sa/4.0/legalcode for the license text.

See https://github.com/nexB/vulnerablecode for support or download.

See https://aboutcode.org for more information about nexB OSS projects.

Acknowledgements

This project was funded through the NGI0 PET Fund, a fund established by NLnet with financial support from the European Commission's Next Generation Internet programme, under the aegis of DG Communications Networks, Content and Technology under grant agreement No 825310.

https://nlnet.nl/project/VulnerableCode/

This project was funded through the NGI0 Discovery Fund, a fund established by NLnet with financial support from the European Commission's Next Generation Internet programme, under the aegis of DG Communications Networks, Content and Technology under grant agreement No 825322.

https://nlnet.nl/project/vulnerabilitydatabase/

vulnerablecode's People

Contributors

amitgupta7580, armijnhemel, ayansinhamahapatra, aydinnyunus, dependabot[bot], elanzini, eslamhiko, haikoschol, hritik14, johnmhoran, kartiksibal, keshav-space, lohani2280, mjherzog, omkarph, pombredanne, pushpit07, rolfschr, savish28, sbs2001, shravankshenoy, sify21, singh1114, tardyp, tdruez, tg1999, tushar912, udaykor, yilmi, ziadhany


vulnerablecode's Issues

Track licenses for each data pointer and record

We need to decide what we want to do wrt. licenses for data.
See https://cve.mitre.org/about/termsofuse.html for instance for the CVE/NVD.
There are a few ways to think about this:

  1. we are storing only pointers, so there are no license issues to track as we are not storing third-party data
  2. we are storing only pointers and caching existing data, so we should handle this in a way similar to what search engines do
  3. we are storing data, so we should track licenses either per record or per source

Each of these cases may have an impact on the resulting data licenses, which should be as open as possible (ideally CC0-1.0).

Data is not stored for all Debian "releases"

Currently, for Debian vulnerability data, we do not store the affected/resolved versions for each of the various distro "releases".
Example:
Following is the JSON snippet taken from debian security tracker:

{"mimetex": {

"CVE-2009-2458": {
    "scope": "remote", 
    "debianbug": 537254, 
    "description": "Multiple stack-based buffer overflows in mimetex.cgi in mimeTeX", 
    "releases": 
        {"stretch": 
            {"status": "resolved", 
             "repositories": {"stretch": "1.74-1"}, 
             "urgency": "medium", 
             "fixed_version": "1.50-1.1"}, 
        "jessie": 
            {"status": "resolved", 
            "repositories": {"jessie": "1.74-1"}, 
            "urgency": "medium", 
            "fixed_version": "1.50-1.1"}, 
        "buster": 
            {"status": "resolved", 
            "repositories": {"buster": "1.74-1"}, 
            "urgency": "medium", 
            "fixed_version": "1.50-1.1"}, 
        "wheezy": 
            {"status": "resolved", 
            "repositories": {"wheezy": "1.73-2"}, 
            "urgency": "medium", 
            "fixed_version": "1.50-1.1"}, 
        "sid": 
            {"status": "resolved", 
            "repositories": {"sid": "1.74-1"}, 
            "urgency": "medium", 
            "fixed_version": "1.50-1.1"}}}, 

"CVE-2009-2459": 
    {"scope": "un-remote", 
    "debianbug": 537254, 
    "description": "Multiple unspecified vulnerabilities in mimeTeX.", 
    "releases": 
    {"stretch": 
        {"status": "resolved", 
        "repositories": {"stretch": "1.74-1"}, 
        "urgency": "medium", 
        "fixed_version": "1.50-1.1"}, 
    "jessie": 
        {"status": "not-resolved", 
        "repositories": {"jessie": "1.74-1"}, 
        "urgency": "medium", 
        "fixed_version": "1.50-1.1"}, 
    "buster": 
        {"status": "resolved", 
        "repositories": {"buster": "1.74-1"}, 
        "urgency": "medium", 
        "fixed_version": "1.50-1.1"}, 
    "wheezy": 
        {"status": "resolved", 
        "repositories": {"wheezy": "1.73-2"},
        "urgency": "medium", 
        "fixed_version": "1.50-1.1"}, 
    "sid": 
        {"status": "resolved", 
        "repositories": {"sid": "1.74-1"}, 
        "urgency": "medium", 
        "fixed_version": "1.50-1.1"}}}}}

Following is the final set of created records:

[
    {
        'fixed_version': '1.50-1.1',
        'package_name': 'mimetex',
        'status': 'resolved',
        'urgency': 'medium',
        'vulnerability_id': 'CVE-2009-2458',
        'description': 'Multiple stack-based buffer overflows in mimetex.cgi in mimeTeX'
    },
    {
        'fixed_version': '1.50-1.1',
        'package_name': 'mimetex',
        'status': 'not-resolved',
        'urgency': 'medium',
        'vulnerability_id': 'CVE-2009-2459',
        'description': 'Multiple unspecified vulnerabilities in mimeTeX.'
    }
]

As we can see, only one record is created per CVE, so the affected/resolved versions for each of the distro "releases" are lost.
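
One possible way to keep the per-release data is to emit one record per (package, CVE, release) instead of one per CVE. The sketch below assumes the input has the same shape as the tracker snippet above; the function name flatten_debian_records and the extra output keys release and repository_version are hypothetical, not part of the current code.

```python
def flatten_debian_records(tracker_data):
    """Yield one record per (package, CVE, release) from tracker-style JSON."""
    for package_name, cves in tracker_data.items():
        for cve_id, cve in cves.items():
            for release, info in cve.get("releases", {}).items():
                yield {
                    "package_name": package_name,
                    "vulnerability_id": cve_id,
                    "release": release,
                    "status": info.get("status"),
                    "urgency": info.get("urgency"),
                    "fixed_version": info.get("fixed_version"),
                    # Version currently shipped in that release's repository.
                    "repository_version": info.get("repositories", {}).get(release),
                    "description": cve.get("description"),
                }


# Trimmed sample taken from the snippet above (two releases only).
sample = {
    "mimetex": {
        "CVE-2009-2459": {
            "description": "Multiple unspecified vulnerabilities in mimeTeX.",
            "releases": {
                "jessie": {
                    "status": "not-resolved",
                    "repositories": {"jessie": "1.74-1"},
                    "urgency": "medium",
                    "fixed_version": "1.50-1.1",
                },
                "wheezy": {
                    "status": "resolved",
                    "repositories": {"wheezy": "1.73-2"},
                    "urgency": "medium",
                    "fixed_version": "1.50-1.1",
                },
            },
        }
    }
}

records = list(flatten_debian_records(sample))
for record in records:
    print(record["release"], record["status"], record["repository_version"])
```

With this shape, the differing status of jessie and wheezy for the same CVE is preserved rather than collapsed into a single value.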

Essential modifications required in Models.

27684353-f70208d0-5cc9-11e7-9f65-158b8c57f57a

This is our present model.

Some things to consider:

  1. CVSS should be in Vulnerability Reference and not in Vulnerability, since CVSS scores are assigned to particular CVE IDs.
  2. There should be a distinction between a vulnerable package version and a fixed version. At the moment, we store only the fixed version.

Add pylint.

Add pylint to CI, to check basic things in the code.

Evolve logics to deal with duplicates in data.

Example of: GET /vulncode_app/data/prototypejs

{
    "name": "prototypejs",
    "version": [
        "0",
        "1.6.0.2-1"
    ],
    "vulnerabilities": [
        {
            "summary": "Unspecified vulnerability in Prototype JavaScript framework (prototypejs) before 1.6.0.2 allows attackers to make \"cross-site ajax requests\" via unknown vectors.",
            "reference_id": "CVE-2008-7220"
        },
        {
            "summary": "Unspecified vulnerability in Prototype JavaScript framework (prototypejs) before 1.6.0.2 allows attackers to make \"cross-site ajax requests\" via unknown vectors.",
            "reference_id": "CVE-2008-7220"
        },
        {
            "summary": "Unspecified vulnerability in Prototype JavaScript framework (prototypejs) before 1.6.0.2 allows attackers to make \"cross-site ajax requests\" via unknown vectors.",
            "reference_id": "CVE-2008-7220"
        },
        {
            "summary": "The Prototype (prototypejs) framework before 1.5.1 RC3 exchanges data using JavaScript Object Notation (JSON) without an associated protection scheme, which allows remote attackers to obtain the data via a web page that retrieves the data through a URL in the SRC attribute of a SCRIPT element and captures the data using other JavaScript code, aka \"JavaScript Hijacking.\"",
            "reference_id": "CVE-2007-2383"
        }
    ]
}
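
One simple way to handle the repeated entries in this response is to keep only the first occurrence of each reference_id. This is a minimal sketch under that assumption; the function name dedupe_vulnerabilities is hypothetical, and enforcing uniqueness in the database itself may well be the better long-term fix.

```python
def dedupe_vulnerabilities(vulnerabilities):
    """Return the entries with duplicate reference_id values removed, keeping order."""
    seen = set()
    unique = []
    for vuln in vulnerabilities:
        key = vuln["reference_id"]
        if key not in seen:
            seen.add(key)
            unique.append(vuln)
    return unique


# Summaries shortened for readability; the IDs match the response above.
vulnerabilities = [
    {"summary": "cross-site ajax requests", "reference_id": "CVE-2008-7220"},
    {"summary": "cross-site ajax requests", "reference_id": "CVE-2008-7220"},
    {"summary": "cross-site ajax requests", "reference_id": "CVE-2008-7220"},
    {"summary": "JavaScript Hijacking", "reference_id": "CVE-2007-2383"},
]

unique = dedupe_vulnerabilities(vulnerabilities)
print([v["reference_id"] for v in unique])
# prints: ['CVE-2008-7220', 'CVE-2007-2383']
```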

Package.version field needs proper definition

The JSON data from the Debian security tracker does not say which version is vulnerable, only which version is resolved/fixed. The JSON data from the Arch Linux security tracker, however, contains both the fixed version and the vulnerable version. This creates ambiguity when loading the data into the Package.version field. For Debian we simply store the fixed_version value from the JSON data in Package.version, but which value should be stored there for Arch Linux: the affected_version or the fixed_version?

arch linux security tracker data - https://security.archlinux.org/json
debian security tracker data - https://security-tracker.debian.org/tracker/data/json
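
One way to remove the ambiguity would be to store the affected and fixed versions in separate fields, leaving a field empty when a source does not provide it. The sketch below is only an illustration: the VersionInfo class and both helper functions are hypothetical, and the Arch Linux key names "affected" and "fixed" are assumptions that should be checked against the actual feed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VersionInfo:
    """Keeps affected and fixed versions apart instead of one 'version' field."""
    affected_version: Optional[str] = None
    fixed_version: Optional[str] = None


def from_debian(release_info):
    # The Debian tracker only reports the fixed version.
    return VersionInfo(fixed_version=release_info.get("fixed_version"))


def from_arch(advisory):
    # Key names "affected" and "fixed" are assumed here, not verified.
    return VersionInfo(
        affected_version=advisory.get("affected"),
        fixed_version=advisory.get("fixed"),
    )


print(from_debian({"status": "resolved", "fixed_version": "1.50-1.1"}))
```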

Ability to query the public CVE-search instance and return vulnerabilities

Given a package identifier input, return if there is a known vulnerability for it.

Package identifiers:

  • name
  • name+version

Behind the scenes:

  1. Query the CVE-search API and try to find a match for the package
  2. Return results

Step 1 would later be replaced by a query against the local database, which the scrapers would populate with aggregated and correlated vulnerability data. For now, let's not store anything and simply fetch the data on demand.
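
Since the lookup is expected to move from a remote query to a local one, it may help to keep the lookup behind a small interface so the caller does not change. The sketch below shows that design with an in-memory stand-in; every name in it is hypothetical, the rows are illustrative only, and no actual CVE-search call is shown because its API is not described here.

```python
from typing import Callable, Iterable, List, Optional

# A lookup takes (name, version) and returns matching vulnerability records.
Lookup = Callable[[str, Optional[str]], Iterable[dict]]

# Illustrative rows only; a real backend would query a remote API or a database.
_ROWS = [
    {"name": "prototypejs", "version": "1.6.0.2-1", "reference_id": "CVE-2008-7220"},
    {"name": "prototypejs", "version": "1.6.0.2-1", "reference_id": "CVE-2007-2383"},
]


def in_memory_lookup(name: str, version: Optional[str]) -> Iterable[dict]:
    """Stand-in backend: exact match on name, and on version when one is given."""
    for row in _ROWS:
        if row["name"] == name and (version is None or row["version"] == version):
            yield row


def find_vulnerabilities(lookup: Lookup, name: str, version: Optional[str] = None) -> List[dict]:
    """Return known vulnerabilities for a package identifier via the given backend."""
    return list(lookup(name, version))


results = find_vulnerabilities(in_memory_lookup, "prototypejs")
print([r["reference_id"] for r in results])
```

Swapping the remote query for a local one then only means passing a different lookup function.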

Django - SQL Injection possibility

Q: Is Django for this project? Decoupling Django might simplify the needs as well as make it agnostic (IMHO).
Django - SQL injection possibility in key and index lookups for JSONField/HStoreField

Collect Ubuntu vulnerability data

There are a couple ways to get Ubuntu vulnerabilities data from the main https://people.canonical.com/~ubuntu-security page:

  1. An XML oval feed at https://people.canonical.com/~ubuntu-security/oval/
  2. HTML pages that can be scraped at https://people.canonical.com/~ubuntu-security/cve/ ... for instance at https://people.canonical.com/~ubuntu-security/cve/main.html
  3. possibly other sources such as
     • a bzr repo
     • RSS/Atom feeds for higher-level security notices covering several vulnerabilities at once

Some notes:

About oval

The oval feed is likely a preferred option if it is comprehensive and contains data we need. This needs to be verified. For oval there is likely parsing code that could be reused in via4cve ... see this or this for redhat that may be using oval.

About the bzr repo

There may be also interesting stuff in the bzr repo:

Improve vulnerability severity or scoring storage design

We presently have only CVSS as a vulnerability severity indicator. Some datasets instead classify vulnerabilities with a textual indicator such as High or Low.

We can add:

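# CharField is used because Django does not enforce max_length on a TextField at the database level.
vulnerability_score = models.CharField(max_length=50, help_text="Severity of the vulnerability")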

This lets us store both textual and other non-CVSS severity indicators that a dataset provides.
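
As a rough illustration of the idea, independent of Django, the value can be kept as text alongside a label naming the scoring system, so that "7.5" and "High" fit in the same field. The Severity class and the scoring-system labels below are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Severity:
    """A severity stored as text, tagged with the system it comes from."""
    scoring_system: str  # hypothetical labels, e.g. "cvss" or "generic_textual"
    value: str           # kept as text so numeric and textual values both fit

    def as_number(self) -> Optional[float]:
        """Return the value as a float when it is numeric, else None."""
        try:
            return float(self.value)
        except ValueError:
            return None


print(Severity("cvss", "7.5").as_number())
print(Severity("generic_textual", "High").as_number())
```

Keeping the scoring system next to the value avoids guessing later whether a stored string is a CVSS score or a vendor-specific label.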
