
nexb / vulnerablecode


A free and open vulnerabilities database and the packages they impact. And the tools to aggregate and correlate these vulnerabilities. Sponsored by NLnet https://nlnet.nl/project/vulnerabilitydatabase/ for https://www.aboutcode.org/ Chat at https://gitter.im/aboutcode-org/vulnerablecode Docs at https://vulnerablecode.readthedocs.org/

Home Page: https://public.vulnerablecode.io

License: Apache License 2.0

Python 52.77% CSS 5.25% HTML 41.22% Dockerfile 0.04% Nix 0.27% Shell 0.13% Makefile 0.26% JavaScript 0.07%
cve security security-tools vulnerability-detection vulnerability vulnerability-databases vulnerability-database vulnerability-scanners vulnerability-identification vulndb ossindex snyk nvd cpe cvss osv package-url purl

vulnerablecode's Issues

Improve vulnerability severity or scoring storage design

At present, CVSS is our only vulnerability severity indicator. Some datasets instead classify severity with a textual indicator such as High or Low.

We can add:

vulnerability_score = models.TextField(max_length=50, help_text="Severity of the vulnerability")

This ensures that we can save both textual and non-CVSS severity indicators, whichever a dataset provides.
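A minimal sketch of the idea in plain Python rather than as a Django model. The `Severity` class, its `scoring_system` label, and the `as_number` helper are hypothetical names used only for illustration. Keeping the value as text lets CVSS numbers and textual ratings share one field:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Severity:
    """Hypothetical severity record; the names are illustrative only."""

    scoring_system: str  # e.g. "cvss" or a source-specific label (assumed)
    value: str           # raw value as published, e.g. "7.5" or "High"

    def as_number(self) -> Optional[float]:
        """Return the value as a float when it is numeric, else None."""
        try:
            return float(self.value)
        except ValueError:
            return None


# Illustrative values, not taken from any dataset.
cvss = Severity(scoring_system="cvss", value="7.5")
textual = Severity(scoring_system="textual", value="High")
```

Whether the scoring system should be a separate field or part of the value is a design choice this sketch does not settle.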

Evolve logic to deal with duplicates in data.

Example of: GET /vulncode_app/data/prototypejs

{
    "name": "prototypejs",
    "version": [
        "0",
        "1.6.0.2-1"
    ],
    "vulnerabilities": [
        {
            "summary": "Unspecified vulnerability in Prototype JavaScript framework (prototypejs) before 1.6.0.2 allows attackers to make \"cross-site ajax requests\" via unknown vectors.",
            "reference_id": "CVE-2008-7220"
        },
        {
            "summary": "Unspecified vulnerability in Prototype JavaScript framework (prototypejs) before 1.6.0.2 allows attackers to make \"cross-site ajax requests\" via unknown vectors.",
            "reference_id": "CVE-2008-7220"
        },
        {
            "summary": "Unspecified vulnerability in Prototype JavaScript framework (prototypejs) before 1.6.0.2 allows attackers to make \"cross-site ajax requests\" via unknown vectors.",
            "reference_id": "CVE-2008-7220"
        },
        {
            "summary": "The Prototype (prototypejs) framework before 1.5.1 RC3 exchanges data using JavaScript Object Notation (JSON) without an associated protection scheme, which allows remote attackers to obtain the data via a web page that retrieves the data through a URL in the SRC attribute of a SCRIPT element and captures the data using other JavaScript code, aka \"JavaScript Hijacking.\"",
            "reference_id": "CVE-2007-2383"
        }
    ]
}
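One possible way to collapse such duplicates, sketched under the assumption that `reference_id` alone identifies a vulnerability. The function name `dedupe_vulnerabilities` is hypothetical, and the summaries are abbreviated for brevity:

```python
def dedupe_vulnerabilities(vulnerabilities):
    """Drop repeated entries, keeping the first occurrence of each reference_id."""
    seen = set()
    unique = []
    for vuln in vulnerabilities:
        key = vuln["reference_id"]  # assumption: the ID alone defines identity
        if key not in seen:
            seen.add(key)
            unique.append(vuln)
    return unique


# Same shape as the response above, with shortened summaries.
vulnerabilities = [
    {"summary": "cross-site ajax requests", "reference_id": "CVE-2008-7220"},
    {"summary": "cross-site ajax requests", "reference_id": "CVE-2008-7220"},
    {"summary": "cross-site ajax requests", "reference_id": "CVE-2008-7220"},
    {"summary": "JavaScript Hijacking", "reference_id": "CVE-2007-2383"},
]
unique = dedupe_vulnerabilities(vulnerabilities)
```

This only cleans up the response. Preventing duplicate rows at import time would need a separate check or a uniqueness constraint in the database.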

Track licenses for each data pointer and record

We need to decide what we want to do with respect to licenses for data.
See https://cve.mitre.org/about/termsofuse.html, for instance, for the CVE/NVD.
There are a few ways to think about this:

  1. we are storing only pointers, so there are no license issues to track as we are not storing third-party data
  2. we are storing only pointers and caching existing data, so we should handle this in a way similar to what search engines do
  3. we are storing data, so we should track licenses either per record or per source

Each of these cases may have an impact on the resulting data license, which should be as open as possible (ideally CC0-1.0)

Ability to query the public CVE-search instance and return vulnerabilities

Given a package identifier as input, return whether there is a known vulnerability for it.

Package identifiers:

  • name
  • name+version

Behind the scenes:

  1. Query the CVE-search API and try to find a match for the package
  2. Return results

Step 1 would later be replaced by a query to the local db, where the aggregated and correlated vulnerability data would be populated from the scrapers. For now, let's not store anything and simply get the data on demand.
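A rough sketch of how the lookup could be kept independent of where the data comes from, so that the on-demand remote query can later be swapped for a local db query. All names here (`Backend`, `find_vulnerabilities`, `fake_backend`) and the record fields are hypothetical, and the sample records are invented placeholders rather than real CVE-search output:

```python
from typing import Callable, Dict, List, Optional

# A backend takes a package name and returns raw vulnerability records.
# It could wrap a remote API call today and a local db query later.
Backend = Callable[[str], List[Dict[str, str]]]


def find_vulnerabilities(
    name: str, version: Optional[str], backend: Backend
) -> List[Dict[str, str]]:
    """Return known vulnerabilities for a name, or for a name+version."""
    records = backend(name)
    if version is None:
        return records
    return [r for r in records if r.get("version") == version]


def fake_backend(name: str) -> List[Dict[str, str]]:
    """Stand-in for the remote query; returns invented sample records."""
    data = {
        "examplepkg": [
            {"reference_id": "EXAMPLE-0001", "version": "1.0"},
            {"reference_id": "EXAMPLE-0002", "version": "2.0"},
        ]
    }
    return data.get(name, [])


by_name = find_vulnerabilities("examplepkg", None, fake_backend)
by_name_and_version = find_vulnerabilities("examplepkg", "1.0", fake_backend)
```

Exact string comparison of versions is a simplification. Real matching would need version range handling.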

Essential modifications required in Models.

[Image: diagram of the present model]

This is our present model.

Some things to consider:

  1. CVSS should be in Vulnerability Reference and not in Vulnerability, since CVSS scores are assigned to particular CVE IDs
  2. There should be some distinction between a vulnerable package version and a fixed version. At the moment, we store just the fixed version.
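Both points can be sketched with plain dataclasses instead of Django models. This is only an illustration of the shape being proposed: the class names, the `is_fixed` flag, and the sample values are all assumptions, not the project's actual schema:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VulnerabilityReference:
    reference_id: str             # e.g. a CVE ID
    cvss: Optional[float] = None  # score attached to this specific ID


@dataclass
class AffectedPackage:
    name: str
    version: str
    is_fixed: bool  # True: version contains the fix; False: version is vulnerable


@dataclass
class Vulnerability:
    summary: str
    references: List[VulnerabilityReference] = field(default_factory=list)
    packages: List[AffectedPackage] = field(default_factory=list)

    def versions(self, fixed: bool) -> List[str]:
        """Return the fixed versions or the vulnerable versions."""
        return [p.version for p in self.packages if p.is_fixed == fixed]


# Placeholder values for illustration only.
vuln = Vulnerability(
    summary="Placeholder summary",
    references=[VulnerabilityReference("EXAMPLE-0001", cvss=5.0)],
    packages=[
        AffectedPackage("examplepkg", "1.0", is_fixed=False),
        AffectedPackage("examplepkg", "1.1", is_fixed=True),
    ],
)
```

A boolean flag is the simplest way to separate the two kinds of version. Two separate relations would work as well.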

Data is not stored for all Debian "releases"

Currently, for Debian vulnerability data, we do not store all the affected/resolved versions for each of the various distro "releases".
Example:
The following JSON snippet is taken from the Debian security tracker:

{
    "mimetex": {
        "CVE-2009-2458": {
            "scope": "remote",
            "debianbug": 537254,
            "description": "Multiple stack-based buffer overflows in mimetex.cgi in mimeTeX",
            "releases": {
                "stretch": {
                    "status": "resolved",
                    "repositories": {"stretch": "1.74-1"},
                    "urgency": "medium",
                    "fixed_version": "1.50-1.1"
                },
                "jessie": {
                    "status": "resolved",
                    "repositories": {"jessie": "1.74-1"},
                    "urgency": "medium",
                    "fixed_version": "1.50-1.1"
                },
                "buster": {
                    "status": "resolved",
                    "repositories": {"buster": "1.74-1"},
                    "urgency": "medium",
                    "fixed_version": "1.50-1.1"
                },
                "wheezy": {
                    "status": "resolved",
                    "repositories": {"wheezy": "1.73-2"},
                    "urgency": "medium",
                    "fixed_version": "1.50-1.1"
                },
                "sid": {
                    "status": "resolved",
                    "repositories": {"sid": "1.74-1"},
                    "urgency": "medium",
                    "fixed_version": "1.50-1.1"
                }
            }
        },
        "CVE-2009-2459": {
            "scope": "un-remote",
            "debianbug": 537254,
            "description": "Multiple unspecified vulnerabilities in mimeTeX.",
            "releases": {
                "stretch": {
                    "status": "resolved",
                    "repositories": {"stretch": "1.74-1"},
                    "urgency": "medium",
                    "fixed_version": "1.50-1.1"
                },
                "jessie": {
                    "status": "not-resolved",
                    "repositories": {"jessie": "1.74-1"},
                    "urgency": "medium",
                    "fixed_version": "1.50-1.1"
                },
                "buster": {
                    "status": "resolved",
                    "repositories": {"buster": "1.74-1"},
                    "urgency": "medium",
                    "fixed_version": "1.50-1.1"
                },
                "wheezy": {
                    "status": "resolved",
                    "repositories": {"wheezy": "1.73-2"},
                    "urgency": "medium",
                    "fixed_version": "1.50-1.1"
                },
                "sid": {
                    "status": "resolved",
                    "repositories": {"sid": "1.74-1"},
                    "urgency": "medium",
                    "fixed_version": "1.50-1.1"
                }
            }
        }
    }
}

The following is the final set of created records:

[
    {
        'fixed_version': '1.50-1.1',
        'package_name': 'mimetex',
        'status': 'resolved',
        'urgency': 'medium',
        'vulnerability_id': 'CVE-2009-2458',
        'description': 'Multiple stack-based buffer overflows in mimetex.cgi in mimeTeX'
    },
    {
        'fixed_version': '1.50-1.1',
        'package_name': 'mimetex',
        'status': 'not-resolved',
        'urgency': 'medium',
        'vulnerability_id': 'CVE-2009-2459',
        'description': 'Multiple unspecified vulnerabilities in mimeTeX.'
    }
]

As we can clearly see, the affected/resolved versions for each of the various distro "releases" are not all stored.
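A possible fix is to emit one record per release instead of one per CVE. The sketch below assumes the tracker JSON has the shape shown in the snippet. The function name `records_per_release` and the extra `release` and `repository_version` keys are hypothetical additions:

```python
def records_per_release(tracker_data):
    """Flatten tracker JSON into one record per (package, CVE, release)."""
    records = []
    for package_name, vulnerabilities in tracker_data.items():
        for vulnerability_id, details in vulnerabilities.items():
            for release, info in details.get("releases", {}).items():
                records.append({
                    "package_name": package_name,
                    "vulnerability_id": vulnerability_id,
                    "release": release,
                    "status": info.get("status"),
                    "urgency": info.get("urgency"),
                    "fixed_version": info.get("fixed_version"),
                    # Version currently shipped in this release's repository.
                    "repository_version": info.get("repositories", {}).get(release),
                    "description": details.get("description"),
                })
    return records


# Two of the releases from the snippet, reduced for brevity.
sample = {
    "mimetex": {
        "CVE-2009-2459": {
            "description": "Multiple unspecified vulnerabilities in mimeTeX.",
            "releases": {
                "jessie": {
                    "status": "not-resolved",
                    "repositories": {"jessie": "1.74-1"},
                    "urgency": "medium",
                    "fixed_version": "1.50-1.1",
                },
                "wheezy": {
                    "status": "resolved",
                    "repositories": {"wheezy": "1.73-2"},
                    "urgency": "medium",
                    "fixed_version": "1.50-1.1",
                },
            },
        }
    }
}
records = records_per_release(sample)
```

With this shape, the differing `status` values for jessie and wheezy are both preserved instead of being collapsed into a single record.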

Package.version field needs proper definition

Currently, the JSON data from the Debian security tracker does not say which version is vulnerable, only which version is resolved/fixed. The JSON data from the Arch Linux security tracker, however, contains both the fixed version and the vulnerable version. This creates ambiguity when dumping the data to the Package.version field. For Debian we simply dump the fixed_version value from the JSON data to the Package.version field, but which value should be dumped there for Arch Linux: the affected_version or the fixed_version value?

Arch Linux security tracker data - https://security.archlinux.org/json
Debian security tracker data - https://security-tracker.debian.org/tracker/data/json

Add pylint.

Add pylint to CI, to check basic things in the code.

Collect Ubuntu vulnerability data

There are a couple of ways to get Ubuntu vulnerability data from the main https://people.canonical.com/~ubuntu-security page:

  1. An XML oval feed at https://people.canonical.com/~ubuntu-security/oval/
  2. HTML pages that can be scraped at https://people.canonical.com/~ubuntu-security/cve/ ... for instance at https://people.canonical.com/~ubuntu-security/cve/main.html
  3. possibly other sources such as
  • a bzr repo
  • RSS/Atom feeds for higher-level security notices covering several vulns at once

Some notes:

About oval

The oval feed is likely the preferred option if it is comprehensive and contains the data we need. This needs to be verified. For oval there is likely parsing code that could be reused in via4cve ... see this or this for Red Hat, which may be using oval.

About the bzr repo

There may also be interesting stuff in the bzr repo:

Django - SQL Injection possibility

Q: Is Django for this project? Decoupling Django might simplify the needs as well as make it agnostic (IMHO).
Django - SQL Injection possibility in key and index lookups for JSONField/HStoreField
