usnistgov / oscal-deep-diff

Open Security Controls Assessment Language (OSCAL) Deep Differencing Tool

License: Other

TypeScript 99.97% JavaScript 0.03%
diff compliance json nist security yaml oscal

oscal-deep-diff's People

Contributors

david-waltermire, dependabot[bot], nikitawootten-nist

oscal-deep-diff's Issues

OSCAL-deep-diff next steps for control and statement-level comparisons

User Story:

As an OSCAL user, I need tooling to help review changes between revisions of large OSCAL documents.

Goals:

  • Add option to explicitly prevent matching on certain fields (similar to ignore option, but for lists).
  • Add option to flatten a list of subtrees into one large list for comparison (for example, matching all controls irrespective of their group)
    • Note that this requires special consideration for objects that can be defined in multiple places (controls can be defined in a group or as a sub-control of another control), maybe using a glob pattern for this?
  • Rename output object's fields so that it is clear and consistent (always use left and right instead of old and new to refer to the documents)
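
The flattening goal above could look roughly like the following sketch. It assumes a heavily simplified catalog shape; the `Control`, `Group`, and `FlatControl` types and the `flattenControls` function are hypothetical names used for illustration and are not part of the tool's API.

```typescript
// Simplified, hypothetical shapes; real OSCAL catalogs carry many more fields.
interface Control {
  id: string;
  controls?: Control[];
}

interface Group {
  id: string;
  groups?: Group[];
  controls?: Control[];
}

interface FlatControl {
  pointer: string; // JSON pointer to the control in the source document
  control: Control;
}

// Collect every control into one list, whether it sits in a group,
// in a nested group, or under another control.
function flattenControls(groups: Group[], base = "/catalog/groups"): FlatControl[] {
  const out: FlatControl[] = [];

  const visitControls = (controls: Control[] | undefined, prefix: string): void => {
    (controls ?? []).forEach((control, i) => {
      const pointer = `${prefix}/controls/${i}`;
      out.push({ pointer, control });
      visitControls(control.controls, pointer);
    });
  };

  const visitGroups = (items: Group[] | undefined, prefix: string): void => {
    (items ?? []).forEach((group, i) => {
      const pointer = `${prefix}/${i}`;
      visitControls(group.controls, pointer);
      visitGroups(group.groups, `${pointer}/groups`);
    });
  };

  visitGroups(groups, base);
  return out;
}
```

Keeping the original pointer next to each control means the flattened list can still report where a control came from after matching.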

No longer in scope of this issue:

  • Clearer output for when objects are moved from one subtree to another.
  • Develop special constraints for OSCAL control and statement level comparisons
    • Identify when a control/statement has changed
    • Identify when significant text was added (this may involve alternate string similarity options)
    • Identify where review is needed (this may require developing additional tooling)

Note some of these issues will be broken into smaller tickets, see usnistgov/OSCAL#988 for more details

Dependencies:

None.

Acceptance Criteria

  • readme documentation affected by the changes in this issue has been updated.
  • A Pull Request (PR) is submitted that fully addresses the goals of this User Story. This issue is referenced in the PR.
  • The CI-CD build process runs without any reported errors on the PR. This can be confirmed by reviewing that all checks have passed in the PR.

Documentation of application usage

Currently oscal-deep-diff has no documentation for using the CLI or for developing comparison configuration objects. The application needs:

  • A stable CLI with a document describing usage
  • Stable configuration objects for defining how objects and sub-objects are compared, as well as documented examples of use
  • Well defined output structure with a description of what the output means.

Switch show content on and off

User Story:

As an OSCAL tool developer, I want to allow the tool to be able to only show JSON pointers in order to optimize the output object for machine-readable applications.

Goals:

  • The output comparison document is very large, containing the full sub-objects of array items. This is for human readability.
  • In a machine-readable context the content is irrelevant; only the JSON pointers are needed.
  • For machine readability and small document sizes (optimized for network-bound workloads such as web applications), an option can be added that would cause the output object to only contain JSON pointers.
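
As a rough illustration of the pointer-only option, the sketch below strips content from a change record and keeps only the pointers. The `Change` shape and its field names are assumptions made for this example, not the tool's actual output format.

```typescript
// Hypothetical change record; the field names are assumptions for illustration.
interface Change {
  leftPointer?: string;
  rightPointer?: string;
  leftElement?: unknown;  // full content, useful for human readers
  rightElement?: unknown; // full content, useful for human readers
  subChanges?: Change[];
}

// Return a copy of the change tree that keeps pointers and drops content.
function toPointersOnly(change: Change): Change {
  const slim: Change = {};
  if (change.leftPointer !== undefined) slim.leftPointer = change.leftPointer;
  if (change.rightPointer !== undefined) slim.rightPointer = change.rightPointer;
  if (change.subChanges !== undefined) {
    slim.subChanges = change.subChanges.map((sub) => toPointersOnly(sub));
  }
  return slim;
}
```

A consumer that already holds both documents can resolve the pointers itself, so nothing is lost by dropping the embedded content.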

Dependencies:

None.

Acceptance Criteria

  • readme documentation affected by the changes in this issue has been updated.
  • A Pull Request (PR) is submitted that fully addresses the goals of this User Story. This issue is referenced in the PR.
  • The CI-CD build process runs without any reported errors on the PR. This can be confirmed by reviewing that all checks have passed in the PR.
  • Relevant options have command-line arguments added.

CLI `--config` flag override logic simple bug

The CLI checks whether the --config flag exists along with a manually passed-in config. It should throw an error only if both are defined, but it currently always throws an error when the --config flag is defined.
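
The intended check might look like the sketch below. The `CliOptions` shape and the `checkConfigSources` function are hypothetical; only the "error when both are present" condition is taken from the description above.

```typescript
// Hypothetical option names; only the exclusivity check is the point here.
interface CliOptions {
  config?: string;       // path passed via --config
  inlineConfig?: object; // configuration passed in manually
}

function checkConfigSources(options: CliOptions): void {
  // Error only when BOTH sources are present, not whenever --config is set.
  if (options.config !== undefined && options.inlineConfig !== undefined) {
    throw new Error("Specify either --config or a manual configuration, not both");
  }
}
```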

Production release readiness & publication in MIDAS

User Story:

As a NIST OSCAL team member, in order to ensure this project is well-designed, implemented, tested, and documented, I want an itemized list of NIST publication requirements for publishing research software, with the status and supporting evidence for individual items, per and other relevant guidelines.

NOTE: this issue is derived from usnistgov/metaschema-node#16

Goals:

The following NIST requirements must be met:

  • NIST S 1801.03 (supplemented by this checklist)
    • How do you expect your code/software to be used (choose 1):
      • Code is informational (e.g. part of the supplemental information in a narrative publication) and not intended for re-use
      • Code itself is intended for re-use (e.g. in a specific scientific area) or the public is being invited to contribute to it
    • Developing and Testing:
      • A testing plan was developed, followed, and documented. The testing plan is available at a specified link. (see usnistgov/OSCAL#44)
      • Continuous testing was conducted during updates and new builds.
      • Code includes appropriate IT security and privacy controls. (DNA)
    • Documenting:
      • Documentation is available as appropriate as: (choose 1)
        • Integrated with the source code
        • On separate web pages (e.g. nist.gov, pages.nist.gov)
        • In a separate publication
        • Other
      • Documentation includes, as appropriate:
        • A readme
        • Function-level documentation
        • Information about how a binary was produced
        • System requirements and prerequisites (e.g., OS version, memory, dependencies): Available in package.json
        • Installation instructions
        • User instructions/guides
        • API Specification
        • A changelog file (included in GitHub releases)
        • Specification of maturity level (i.e. is the software still being developed, are you expecting feedback on performance and usability, is the project completed)
        • A communication to users of your intent to provide (or not provide) support
    • License and disclaimers:
      • NIST license and disclaimers
      • External collaborators who were part of this project have been credited
      • Third-party software licenses permit modification and/or redistribution
        • Appropriate licensing is included
        • Files modified by NIST contain notice that modifications are released to the public domain as appropriate
  • FAIR Principles (supplemented by this checklist)
    • Findable (Will be satisfied by MIDAS entry)
      • (Meta)data are assigned a globally unique and persistent identifier
      • Data are described with rich metadata (defined by R1 below)
      • Metadata clearly and explicitly include the identifier of the data they describe
      • (Meta)data are registered or indexed in a searchable resource
    • Accessible (satisfied by NPM)
      • (Meta)data are retrievable by their identifier using a standardised communications protocol
      • Metadata are accessible, even when the data are no longer available
    • Interoperable (satisfied by NPM)
      • (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
      • (Meta)data use vocabularies that follow FAIR principles
      • (Meta)data include qualified references to other (meta)data
    • Reusable (satisfied by NPM)
      • (Meta)data are richly described with a plurality of accurate and relevant attributes
        • (Meta)data are released with a clear and accessible data usage license
        • (Meta)data are associated with detailed provenance
        • (Meta)data meet domain-relevant community standards

Manual constraint generation tool/Comparison callbacks

User Story:

As an OSCAL-deep-diff user, I need to be able to easily generate constraint sets for reproducible comparisons.

Goals:

{A clear and concise description of what you want to happen. This should be outcome focused. Include concise description of any alternative solutions or features you've considered. Feel free to include screenshots or examples about the feature request here.}

Dependencies:

  • The comparison library must allow callbacks that can modify comparison behavior including:
    • A callback when the tool does not find a constraint for an array
      • This can be used to extend matching behavior
      • This can also be used within web-apps
      • Allow for users to return/inject new constraints?
    • A callback when the tool finds out-of-tree matches
    • A callback when the tool cannot match or compare elements with the given constraint
      • Possibly add the ability to restart the comparison with new changes?
  • Leveraging the new callbacks, the existing CLI can be made interactive in order to assist the user in generating constraint sets for a comparison that can be re-used.

Acceptance Criteria

  • readme documentation affected by the changes in this issue has been updated.
  • A Pull Request (PR) is submitted that fully addresses the goals of this User Story. This issue is referenced in the PR.
  • The CI-CD build process runs without any reported errors on the PR. This can be confirmed by reviewing that all checks have passed in the PR.

Ignore property constraint

Some workflows may want certain object properties (such as id or uuid) explicitly ignored from object comparisons. Other workflows may want entire sub-objects to not be compared (such as ignoring the back-matter object in a catalog)
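
One way to picture an ignore constraint is as a pre-processing step that removes the ignored properties before the two documents are compared. The sketch below takes that approach; the `withoutIgnored` helper is a hypothetical name, and matching by bare property name (rather than by path) is a simplifying assumption.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Return a deep copy of `value` without any object property named in `ignored`.
// Assumption: properties are ignored by name at every depth, not by path.
function withoutIgnored(value: Json, ignored: ReadonlySet<string>): Json {
  if (Array.isArray(value)) {
    return value.map((item) => withoutIgnored(item, ignored));
  }
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const [key, child] of Object.entries(value)) {
      if (!ignored.has(key)) out[key] = withoutIgnored(child, ignored);
    }
    return out;
  }
  return value; // primitives are returned unchanged
}
```

Ignoring a key such as `back-matter` drops the whole sub-object, which covers the second workflow described above.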

Add an option to omit empty array match sub-changes and empty array changes

User Story:

As an OSCAL deep diff user analyzing comparison results by hand, I would like a way to omit unchanged array and array item records from the JSON comparison output for all or specific paths (e.g. omit unchanged control parameters, but keep unchanged control parameters) to make it easier to read.

Goals:

  • A configuration option that omits empty array_changed from an object
  • A configuration option that omits empty subChanges array items from an object

Dependencies:

N/A

Acceptance Criteria

  • readme documentation affected by the changes in this issue has been updated.
  • A Pull Request (PR) is submitted that fully addresses the goals of this User Story. This issue is referenced in the PR.
  • The CI-CD build process runs without any reported errors on the PR. This can be confirmed by reviewing that all checks have passed in the PR.

Human readability mode for CLI output

User Story:

As a CLI user, I need to be able to easily interpret the output of the comparison.

Goals:

When no extra options are provided, the CLI should produce an output that is easily human readable. Some inspiration output formats include:

  • JSON Diff's Structural Output: Image Example
  • colordiff's ability to break a large file into changes with clear separation: Image Example
  • ICDiff's multi-column diff output: Image Example

The main limitation is that the format chosen must not have any dependencies that would break non-CLI usage (e.g. fs).

Dependencies:

None.

Acceptance Criteria

  • readme documentation affected by the changes in this issue has been updated.
  • A Pull Request (PR) is submitted that fully addresses the goals of this User Story. This issue is referenced in the PR.
  • The CI-CD build process runs without any reported errors on the PR. This can be confirmed by reviewing that all checks have passed in the PR.

Property interrogation in path selection operations

User Story:

As an OSCAL deep diff user, I would like to be able to change comparison behavior based on a property of an object that is being compared, such as:

  • Selectively ignoring properties of a specific type
  • Changing the string similarity method based on a property of a sub-object

This would require the "path condition" system to be reworked to allow for syntax that selects a property of a path, e.g. controls/#/props[name="ignoreme" and value="true"]. This may mean moving towards an XPath-esque selection syntax.
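
To make the proposed syntax concrete, here is a minimal sketch of a parser for selectors of the form shown in the example. The grammar it accepts (a path followed by an optional bracketed list of `key="value"` clauses joined by `and`) is an assumption drawn from that single example, and `parseSelector` and `satisfies` are hypothetical names.

```typescript
interface Selector {
  path: string;                       // e.g. "controls/#/props"
  conditions: Record<string, string>; // e.g. { name: "ignoreme", value: "true" }
}

// Parse `path[key="value" and key2="value2"]`; the predicate is optional.
// Limitation of this sketch: values containing " and " are not supported.
function parseSelector(selector: string): Selector {
  const match = /^([^\[\]]+)(?:\[(.*)\])?$/.exec(selector);
  if (match === null) throw new Error(`Invalid selector: ${selector}`);
  const conditions: Record<string, string> = {};
  const predicate = match[2];
  if (predicate !== undefined) {
    for (const clause of predicate.split(/\s+and\s+/)) {
      const parts = /^\s*([\w-]+)\s*=\s*"([^"]*)"\s*$/.exec(clause);
      if (parts === null) throw new Error(`Invalid condition: ${clause}`);
      conditions[parts[1]] = parts[2];
    }
  }
  return { path: match[1], conditions };
}

// True when every condition matches the corresponding property of `item`.
function satisfies(item: Record<string, unknown>, selector: Selector): boolean {
  return Object.entries(selector.conditions).every(
    ([key, expected]) => item[key] !== undefined && String(item[key]) === expected,
  );
}
```

For example, `parseSelector('controls/#/props[name="ignoreme" and value="true"]')` yields the path `controls/#/props` with two conditions, and `satisfies` can then be applied to each candidate `props` entry.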

Goals:

  • Formalize a selection syntax and provide an example in a comment to this issue
  • Make the adjustments to the selection syntax and affected comparison features

Dependencies:

N/A

Acceptance Criteria

  • readme documentation affected by the changes in this issue has been updated.
  • A Pull Request (PR) is submitted that fully addresses the goals of this User Story. This issue is referenced in the PR.
  • The CI-CD build process runs without any reported errors on the PR. This can be confirmed by reviewing that all checks have passed in the PR.

Control-level-comparison Intermediate Document

User Story:

As an OSCAL catalog consumer, I need to be able to compare OSCAL documents on the control level (compare how each control changes from revision to revision). This output comparison document should be expressive enough to be rendered to a human-readable format (such as a web page, PDF, or spreadsheet).

Goals:

The implementation of this feature should take the form of a separate stage of the comparison that takes the base diff option and transforms it into a document with the following information:

  • How a control is mapped between revisions, including the title and/or id of the control for easy reference
  • A note stating if the control is unchanged, changed, moved, added, or withdrawn
  • For controls that have been changed:
    • A simple score (high, medium, low or numeric?) noting how much of a control has changed, and if the change is administrative or not
    • A detailed list of control changes, including if some sub-object (or sub-control) has been moved from another control/sub-object
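
The status note described in the list above could be derived along the lines of this sketch. The `ControlPair` shape, the `classify` function, and the rule that a control counts as "moved" when its parent array differs are all assumptions for illustration; "withdrawn" is treated here simply as "present only in the left document".

```typescript
type ControlStatus = "unchanged" | "changed" | "moved" | "added" | "withdrawn";

// Hypothetical record for one control mapped between two revisions.
interface ControlPair {
  leftPointer?: string;      // absent when the control exists only on the right
  rightPointer?: string;     // absent when the control exists only on the left
  changedPointers: string[]; // pointers of differences found inside the control
}

// Parent container of a JSON pointer, e.g. "/a/b/3" -> "/a/b".
function parentOf(pointer: string): string {
  return pointer.slice(0, pointer.lastIndexOf("/"));
}

function classify(pair: ControlPair): ControlStatus {
  if (pair.leftPointer === undefined) return "added";
  if (pair.rightPointer === undefined) return "withdrawn";
  // In this sketch a control that both moved and changed is reported as "changed".
  if (pair.changedPointers.length > 0) return "changed";
  if (parentOf(pair.leftPointer) !== parentOf(pair.rightPointer)) return "moved";
  return "unchanged";
}
```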

Dependencies:

{Describe any previous issues or related work that must be completed to start or complete this issue.}

Acceptance Criteria

  • readme documentation affected by the changes in this issue has been updated.
  • A Pull Request (PR) is submitted that fully addresses the goals of this User Story. This issue is referenced in the PR.
  • The CI-CD build process runs without any reported errors on the PR. This can be confirmed by reviewing that all checks have passed in the PR.

Create a `TESTING.md` document

User Story:

As an OSCAL team member, I need documentation describing our compliance with NIST's software publication checklist (see more in https://github.com/usnistgov/oscal-base/issues/1).

Goals:

  • A description of the testing methodology for this project, and how tests can be run
    • A summary of what is and is not tested
    • An explanation of how to run and add new tests
    • A metric defining the desired code coverage
  • Conformity to SA-11 and (some) enhancements
    • SA-11(1): GitHub's static code analysis action
    • SA-11(2): Dependabot?
    • SA-11(4): CODEOWNERS
    • SA-11(7)
    • ?

Dependencies:

This issue was formulated as a result of https://github.com/usnistgov/oscal-base/issues/1

Acceptance Criteria

  • readme documentation affected by the changes in this issue has been updated.
  • A Pull Request (PR) is submitted that fully addresses the goals of this User Story. This issue is referenced in the PR.
  • The CI-CD build process runs without any reported errors on the PR. This can be confirmed by reviewing that all checks have passed in the PR.

Note

Implement constraint and configuration object parsing

Currently the ability to specify the way two subobjects are compared (a constraint) has been implemented, but the following needs to be implemented:

  • A defined structure for constraints and configuration objects
  • The ability to load constraints from a file and verify that they are valid
  • Tests surrounding constraint operation
    - [ ] Output object should complain if constraints are not in place for an object.

Optimize recursive match through memoization

The current implementation of the match system uses a recursive function to maximize the scoring of matched sub-objects. This scoring system has one big flaw: large documents take an exorbitant amount of time to compute (comparing two NIST-SP-800-53 revisions could take upwards of several minutes on my machine). The recursive functions essentially recompute each sub-object match many times.

This problem has optimal substructure with overlapping subproblems, so it is best solved by "caching" (memoizing) sub-object matches, trading extra memory for reduced computation time.
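
The caching idea can be sketched as follows. This is not the tool's matcher: the scoring rule (count equal leaves, take the best right-hand partner for each left-hand array item) is a deliberately simple stand-in, and `MemoizedScorer` is a hypothetical name. The point is only that each pair of locations is scored once and then served from the cache.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function isObject(value: Json): value is { [key: string]: Json } {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Memoized scorer: each (left pointer, right pointer) pair is scored once.
class MemoizedScorer {
  private cache = new Map<string, number>();
  computed = 0; // number of scores actually computed (cache misses)

  score(left: Json, right: Json, lp = "", rp = ""): number {
    // Simplification: pointers are not escaped, so keys containing "|" could collide.
    const key = `${lp}|${rp}`;
    const hit = this.cache.get(key);
    if (hit !== undefined) return hit;
    this.computed++;
    const result = this.compute(left, right, lp, rp);
    this.cache.set(key, result);
    return result;
  }

  private compute(left: Json, right: Json, lp: string, rp: string): number {
    if (Array.isArray(left) && Array.isArray(right)) {
      // For each left item, take the best-scoring right item.
      let total = 0;
      for (let i = 0; i < left.length; i++) {
        let best = 0;
        for (let j = 0; j < right.length; j++) {
          best = Math.max(best, this.score(left[i], right[j], `${lp}/${i}`, `${rp}/${j}`));
        }
        total += best;
      }
      return total;
    }
    if (isObject(left) && isObject(right)) {
      let total = 0;
      for (const key of Object.keys(left)) {
        if (key in right) {
          total += this.score(left[key], right[key], `${lp}/${key}`, `${rp}/${key}`);
        }
      }
      return total;
    }
    return left === right ? 1 : 0; // leaf comparison
  }
}
```

Scoring the same pair of locations a second time returns the cached value without increasing `computed`, which is where the time saving comes from when a recursive matcher revisits the same sub-objects.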

Set up CI

A simple CI pipeline (through CircleCI) needs to be set up to run the mocha tests in this repository. All tests are located in the src/ folder along with the source code, and are differentiated with a .spec suffix.

Transition from CircleCI to GitHub actions

User Story:

As an OSCAL developer, I need a consistent CI/CD process.

Goals:

GitHub Actions replaces the CircleCI testing and deployment tasks.

Dependencies:

None

Acceptance Criteria

  • readme documentation affected by the changes in this issue has been updated.
  • A Pull Request (PR) is submitted that fully addresses the goals of this User Story. This issue is referenced in the PR.
  • The CI-CD build process runs without any reported errors on the PR. This can be confirmed by reviewing that all checks have passed in the PR. (meta)

Match by sub-property constraint

For example, in an OSCAL catalog, back-matter resources could be matched by title, but it may be more accurate to match them by the citation text, which is a sub-object's property. Being able to match by a sub-object (possibly using a JSON pointer) would increase matching power.
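
A minimal sketch of matching by a sub-object's property, assuming the constraint names the property with a JSON pointer (RFC 6901) relative to each array item. `resolvePointer` and `matchByPointer` are hypothetical helpers, and exact equality stands in for whatever similarity measure a real constraint would use.

```typescript
// Resolve an RFC 6901 JSON pointer (e.g. "/citation/text") against a value.
function resolvePointer(root: unknown, pointer: string): unknown {
  if (pointer === "") return root;
  if (!pointer.startsWith("/")) throw new Error(`Invalid JSON pointer: ${pointer}`);
  let current: unknown = root;
  for (const raw of pointer.slice(1).split("/")) {
    // Unescape "~1" to "/" first, then "~0" to "~", as the RFC requires.
    const token = raw.replace(/~1/g, "/").replace(/~0/g, "~");
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[token];
  }
  return current;
}

// Pair left and right items whose values at `pointer` are strictly equal.
function matchByPointer<T>(left: T[], right: T[], pointer: string): Array<[T, T]> {
  const byKey = new Map<unknown, T>();
  for (const item of right) byKey.set(resolvePointer(item, pointer), item);
  const pairs: Array<[T, T]> = [];
  for (const item of left) {
    const key = resolvePointer(item, pointer);
    const partner = key === undefined ? undefined : byKey.get(key);
    if (partner !== undefined) pairs.push([item, partner]);
  }
  return pairs;
}
```

With a pointer such as `/citation/text`, two back-matter resources are paired even when their titles differ between revisions.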

Load documents from URL

The CLI should be able to load documents from a URL, in addition to comparing on-disk and in-memory documents. This would also make writing tests with live OSCAL documents a possibility.

Deeper comparisons with the tool

User Story:

As an OSCAL user, I need to be able to detect deeper changes in OSCAL documents with configuration that allows for better reproducibility.

More Details

Currently, the OSCAL-deep-diff tool can, given two documents, compare objects (nested maps) directly including their sub-objects, as well as optimally matching arrays of objects together by their properties, so that the matched array pairs have the minimum number of differences between them. This covers many scenarios that can apply to OSCAL documents, such as minor edits to catalogs, or groups that have been renamed.

The tool does not handle one scenario that can happen a lot during major revisions: sub-objects moved from one item of an array to another. For example, within a catalog, a control could move from one group to another between revisions, or parts from multiple controls could be consolidated into one control (or dispersed into multiple controls).

Goals:

This is an Epic issue, which means it is an issue that links to multiple sub-issues.

  • Create YAML definitions for configuration parameters (as the number of configuration options & usage scenarios increases, relying on collections of command line flags makes for an increasingly frustrating experience)
  • Add models for JSON object primitives, with helper functions for resolving pointers and matching
  • Remove hanging references to old/new documents (as well as "added" and "removed") and replace with the terms left/right
  • Keep track of unmatched items of each array, and match them together

No longer in scope of this issue:

  • Add the option to specify other string-similarity algorithms
  • Add the option to prompt the user for how a given array should be matched if no constraint has been provided or learned
  • Add the option to save all "learned" constraints in order to replicate the same comparison with another set of documents

Note some of these issues will be broken into smaller tickets, see usnistgov/OSCAL#988 for more details.

Dependencies:

No active issues are dependencies.

Acceptance Criteria

For all sub-tasks:

  • readme documentation affected by the changes in this issue has been updated.
  • A Pull Request (PR) is submitted that fully addresses the goals of this User Story. This issue is referenced in the PR.
  • The CI-CD build process runs without any reported errors on the PR. This can be confirmed by reviewing that all checks have passed in the PR.

CircleCI config does not publish on tag

Describe the bug

When a new tag is created, CircleCI should run the pipeline that would automatically publish the package. This does not happen.

How do we replicate the issue?

  1. Tag a release
  2. Observe that the workflow does not run

In `outputConfigs` object, allow parent properties to be specified and allow empty

User Story:

As a user of the tool, I would like to be able to produce Excel reports of objects whose parents do not contain the same properties (e.g. comparing groups at the root of a catalog).

Currently, if identifiers are specified in an outputConfigs item, the records are required to have these properties, and their parents must also contain them. This behavior limits the application of Excel outputs to cases like an OSCAL catalog, which shares a title and id with its parent, group.

Goals:

  • Add parentIdentifiers option (if omitted, default to no parent identifiers?)
  • Allow identifiers and parentIdentifiers to have missing records (e.g. an OSCAL part optionally has an id)

Dependencies:

N/A

Acceptance Criteria

  • readme documentation affected by the changes in this issue has been updated.
  • A Pull Request (PR) is submitted that fully addresses the goals of this User Story. This issue is referenced in the PR.
  • The CI-CD build process runs without any reported errors on the PR. This can be confirmed by reviewing that all checks have passed in the PR.

Intermediate Output Document does not collect results from parent changes

Describe the bug

When the base comparison collects results from the comparison document, it finds all changes that match the specified pattern (for example, controls) and merges them into one list. This does not account for parent changes, such as a group being added or deleted.

Expected behavior

The parent ArrayChanged's leftOnly and rightOnly fields should be traversed when assembling the intermediate output document of a base comparison.

Other Comments

There is also a related problem that I do not have details on yet: some conditions cause the Excel output document to produce bad output (no matched controls, no changes) that causes the table formatter and conditional formatting rules to break.

Control-level-comparison Output Document

User Story:

As an OSCAL catalog consumer, I need an easy-to-reference and human-readable document that compares how controls in a catalog have changed from one revision to the next (see https://csrc.nist.gov/CSRC/media/Publications/sp/800-53/rev-5/final/documents/sp800-53r4-to-r5-comparison-workbook.xlsx for an example of a human-generated comparison of control changes).

Goals:

The implementation of this feature should take the form of a final stage to the OSCAL-deep-diff comparison that takes the output document generated in #27 and creates a human-readable document (spreadsheet, HTML document, or otherwise).

Dependencies:

This issue depends on #27

Acceptance Criteria

  • readme documentation affected by the changes in this issue has been updated.
  • A Pull Request (PR) is submitted that fully addresses the goals of this User Story. This issue is referenced in the PR.
  • The CI-CD build process runs without any reported errors on the PR. This can be confirmed by reviewing that all checks have passed in the PR.

Usage, documentation, and examples

User Story:

As an OSCAL author and user, I need proper documentation of the deep-diff's capabilities.

Goals:

  • All new features are documented in READMEs or by using GitHub's wiki system
  • The documentation is replicated on the OSCAL tools page
  • Documentation includes real world example outputs
  • Deep-diff YAML configuration has a generated schema with instructions in tools such as VS Code

Dependencies:

Acceptance Criteria

  • readme documentation affected by the changes in this issue has been updated.
  • A Pull Request (PR) is submitted that fully addresses the goals of this User Story. This issue is referenced in the PR.
  • The CI-CD build process runs without any reported errors on the PR. This can be confirmed by reviewing that all checks have passed in the PR.

Out of tree comparison

When comparing documents, sometimes objects move from one part of the document to another. This can be in the form of a promotion (for example, an OSCAL enhancement becoming a full control), demotion (for example, a control being demoted into an enhancement), or a move (for example, a control moving families).

Note that the previous implementation of the matching system did have the capability to perform these "out of tree" matches.

Goals:

  • Unmatched controls from an array comparison are collected
  • At every level, unmatched controls are matched by their parent control's pointer and compared
  • Out of tree matches are consumed properly by the control level comparison documents
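
The collection and pairing steps above might be sketched as follows. Note the assumption: this sketch pairs leftovers by a shared `id`, which is a stand-in chosen for illustration and not the matching rule described in the goals. `Unmatched`, `OutOfTreeMatch`, and `matchOutOfTree` are hypothetical names.

```typescript
// An item left unmatched by an ordinary array comparison.
interface Unmatched {
  pointer: string; // where the item sits in its own document
  id: string;      // hypothetical identifying property
}

interface OutOfTreeMatch {
  leftPointer: string;
  rightPointer: string;
}

// Pair left-only and right-only leftovers that share an id, even though
// they were collected from different arrays (different parents).
function matchOutOfTree(leftOnly: Unmatched[], rightOnly: Unmatched[]): OutOfTreeMatch[] {
  const rightById = new Map<string, Unmatched>();
  for (const item of rightOnly) rightById.set(item.id, item);

  const matches: OutOfTreeMatch[] = [];
  for (const item of leftOnly) {
    const partner = rightById.get(item.id);
    if (partner !== undefined) {
      matches.push({ leftPointer: item.pointer, rightPointer: partner.pointer });
      rightById.delete(item.id); // each right item is used at most once
    }
  }
  return matches;
}
```

A control that moved between families shows up as one left-only and one right-only record with different parent pointers, and is reported here as a single out-of-tree match.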

Dependencies:

Acceptance Criteria

  • readme documentation affected by the changes in this issue has been updated.
  • A Pull Request (PR) is submitted that fully addresses the goals of this User Story. This issue is referenced in the PR.
  • The CI-CD build process runs without any reported errors on the PR. This can be confirmed by reviewing that all checks have passed in the PR.
