spdx / spdx-spec

The SPDX specification in MarkDown and HTML formats.

Home Page: https://spdx.github.io/spdx-spec/

License: Other

Python 100.00%
spdx specification software-package-data-exchange licenses linux-foundation

spdx-spec's Introduction

The System Package Data Exchange (SPDX®) Specification

The System Package Data Exchange (SPDX®) specification is an open standard capable of representing systems with software components as SBOMs (Software Bill of Materials) and other AI, data and security references supporting a range of risk management use cases.

The SPDX standard helps facilitate compliance with free and open source software licenses by standardizing the way license information is shared across the software supply chain. SPDX reduces redundant work by providing a common format for companies and communities to share important data about software licenses and copyrights, thereby streamlining and improving compliance.

This repository holds the version of the specification under active development as:

  • MarkDown (master branch)
  • HTML (gh-pages branch, built on every commit to master and development/ branches)

For the official releases of the specification or additional information, see also the SPDX website.

Specification Structure

The specification consists of a model which is generated from the spdx-3-model repository and additional information in the docs directory.

The examples directory contains examples of various SPDX serializations for the current version of the spec.

Building the specification

Prerequisites

You need to have MkDocs installed on your machine. If you don't have it installed yet, please follow these installation instructions.

Building HTML

# Execute built-in dev-server that lets you preview the specification
$ mkdocs serve

# Building static HTML site
$ mkdocs build

spdx-spec's People

Contributors

aevaonline, fu7mu4, goneall, henkbirkholz, hfukuchi, iamwillbar, ivanayov, jayman2000, jlovejoy, jmudge, jonasob, kestewart, m1kit, marklodato, nishakm, noriokobota, paulmillar, rexjaeschke, rnjudge, salicodes, seabass-labrax, silverhook, skaet, swinslow, tardyp, tschmidtb51, tsteenbe, vargenau, wking, zvr

spdx-spec's Issues

add PackageAlternateName as optional field with 0-many

Thomas: Extend SPDX-Package with optional “PackageNameAliases” field to record aliases to PackageName. Use case - support renamed packages or the same package that have different names on various distribution platforms. Example MySQL <-> MariaDB
Idea is being able to record alternate names for same package.
This would be optional. Want to be able to semantically detect this.
Yev: possibly change the cardinality? Prevents the document creator on what the “official name” is.
Gary: Might be valuable to retain an official name? Can see it both way. Use case, package originator.
Yev: Gets to declared/concluded dichotomy, are we sure we want to go there?
Gary: Not sure, can think of some cases either way.
Yev: Maybe introduce AlternateName (for package, file, license, snippet, etc.) and leave the specified package name as the one chosen by the SPDX document author, as the authoritative voice.
Gary: Apply at element level makes sense.
Yev: Not sure if files should have alternate name though. File encoding could be problematic. We’ll need to apply to each with semantic, so not sure we got to there.
Yev: Prefers AlternateName over “..Alias” concept. Cardinality 0 to many.
Thomas: Is good with AlternateName terminology. Only question is should we prefix it? Ie. PackageAlternateName in tag:value. And then AlternateName in RDF.
Yev & Thomas in agreement.
Yev: If one or more alternate names are provided, is PackageName still needed?
Gary, Kate: Yes.
Alexios: Only for Packages right now? Yev: Yes, lets limit it to this right now.
Can think of case that it would be good to have license name alternates…
Yev: Possibly, but may want to consult further with legal. Snippet names avoid. File names - no.
Conclusion: ok to add PackageAlternateName as optional field with 0-many.

Clarify case sensitivity of Short Form licenses - for list and tools.

This has been moved from bugzilla: https://bugs.linuxfoundation.org/show_bug.cgi?id=1327

Kate Stewart 2015-11-19 19:00:15 UTC
Jilayne wrote:
in http://lists.spdx.org/pipermail/spdx-tech/2015-November/002905.html

  • in http://wiki.spdx.org/view/Technical_Team/Minutes/2014-09-16#Case_sensitivity_for_license_information - the tech team discussed this on 16 Sept 2014, note saying “License ID’s case sensitive”

  • and then the legal team discussed it - http://wiki.spdx.org/view/Legal_Team/Minutes/2014-09-18 - and concluded:
    • Mark raised issue of whether SPDX License List short identifiers and (new) license expression operators should be case sensitive with the Tech Team and discussed further here: decided that for purposes of spec, in terms of a legitimate value, both could be case insensitive (but best practice would be to display with precise capitalization). Mark to go back to tech team with this decision.

So… looks like maybe we didn’t really capture this elsewhere? In any case, I don’t see a reason to have them be case sensitive in terms of matching (for tools), but have them display with the upper/lower case as they are shown in the SPDX License List - it’s easier for humans to read/spot :)

Kate Stewart 2015-11-19 19:01:50 UTC
I'll add it to the 2.1 version of the spec. Also consider adding this as an addendum/errata for 2.0.

Kate Stewart 2015-12-22 18:13:49 UTC
Discussed on 12/22 - no concerns, going forward with documenting.

Bill Schineller 2016-05-10 17:53:56 UTC
didn't jump out at me where / if we made edit yet to SPDX 2.2
todo

Kate Stewart 2016-05-17 17:01:29 UTC
Have proposed edit to 6.1, and Appendix I. Lets review.

Kate Stewart 2016-05-17 17:14:40 UTC
In discussion, some concern about other tools and matching in future.

Circling back this discussion to include Mark Gisi.

Bill Schineller 2016-05-17 17:15:33 UTC
fwiw:
from http://lists.w3.org/Archives/Public/www-rdf-interest/2003Aug/0002.html
RDF is case-sensitive. From the last call Concepts working draft:

 Two RDF URI references are equal if and only if they compare as
 equal, character by character, as Unicode strings.

 -- http://www.w3.org/TR/rdf-concepts/#section-Graph-URIref

An upper-case 'A' and a lower-case 'a' are different Unicode characters.

Bill Schineller 2016-05-24 17:13:32 UTC
Kate / Jilayne agreed to leave the Spec language as-is for 2.1

as-is means 'it IS case-sensitive'

leaving ticket open with Version 'unspecified' in case we want to revisit in the future.

We were reluctant to make case-insensitive now for 2.1 without understanding the impacts case might have on URIs (website, other tools, RDF graphs, ...)
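
The two behaviours discussed in this thread can be sketched side by side. This is only an illustration, not tooling from the SPDX project: `CANONICAL_IDS` is a hypothetical four-entry excerpt of the license list, and both function names are made up for the example. The first function does the case-insensitive matching Jilayne suggested for tools while still returning the canonical capitalization for display; the second is the case-sensitive comparison that was kept for 2.1.

```python
# Hypothetical excerpt of the SPDX License List; a real tool would load the full list.
CANONICAL_IDS = ["MIT", "Apache-2.0", "GPL-2.0", "BSD-3-Clause"]

# Map a case-folded key to the canonical spelling shown on the license list.
_BY_FOLDED = {license_id.casefold(): license_id for license_id in CANONICAL_IDS}


def canonical_license_id(candidate):
    """Match case-insensitively and return the canonical spelling, or None."""
    return _BY_FOLDED.get(candidate.strip().casefold())


def is_exact_match(candidate):
    """Case-sensitive check, i.e. the behaviour left as-is for SPDX 2.1."""
    return candidate in CANONICAL_IDS
```

Note that the case-insensitive variant only works if no two identifiers on the list differ by case alone, which is one of the impacts the thread wanted to understand before changing the spec.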

Permit comments to be added to SPDX documents

Comments in SPDX documents (depends on the file format).
In RDFa/XML - there is a specific term defined.
In tag:value - # at start of line - needs to be added to the document. Entire-line comments only.
No mid-line comments in SPDX documents.

Comments take the form of '#', as the first non-blank character, and continue to the end of line (marked by characters U+000D or U+000A) or end of file if there is no end of line after the comment marker. Comments are treated as white space.
⇒ Alexios recommends aligning with Turtle
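
A minimal sketch of the proposed comment rule for tag:value, assuming comments are whole-line only and that the function name `strip_comments` is just a placeholder. It treats U+000D and U+000A as line ends, drops every line whose first non-blank character is '#', and leaves a '#' that appears later in a line untouched.

```python
import re

# Lines end at U+000D or U+000A, as in the proposed wording.
_LINE_BREAK = re.compile(r"\r\n|\r|\n")


def strip_comments(document):
    """Drop whole-line comments: lines whose first non-blank character is '#'."""
    kept = []
    for line in _LINE_BREAK.split(document):
        if line.lstrip().startswith("#"):
            continue  # the comment is treated as white space
        kept.append(line)
    return "\n".join(kept)
```

The sketch deliberately ignores multi-line text values: inside such a value a leading '#' may be real content rather than a comment, so a complete parser would have to track that state.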

What about SPDX-License-Identifier: ( ) in source code.

"#" ok if disallowed character? Need to check specification.

Clarify use of newlines for license expressions used in source files

In Appendix V, it is not clear if multiple lines are to be used for compound set of licenses. Suggest changing the following statement:

The SPDX License Identifier syntax may consist of a single license (represented by a short identifier from the SPDX license list) or a compound set of licenses (represented by joining together multiple licenses using the license expression syntax).

to:

The SPDX License Identifier syntax may consist of a single license (represented by a short identifier from the SPDX license list) or a compound set of licenses (represented by joining together multiple licenses using the license expression syntax which is enclosed in parentheses and may span multiple lines).

Add optional AttributionText field to File & Package level.

Agreement from those on the call, yes it should be at File & Package level.
This will be an optional field.
Useful for generating notice files, etc.

Consider does it make sense for snippets?

Adding “Attributions” to package field to store required attributions (per Oliver)
Alexios: property named FileAttributionText which will hold the text that has to be reproduced.
It can be considered as a combination of information found in properties like FileCopyrightText, LicenseInfoInFile, FileContributor, but it might not simply be the sum of these values.

The relevant part in the spec could be something like:

4.xx File Attribution Text
4.xx.1 Purpose: This field provides a place for the SPDX data creator to record all attributions found in the file that are required to be communicated. These typically include copyright statement(s), license text, and a disclaimer.
4.xx.2 Intent: The intent is to provide the recipient of the SPDX file with all the legally required attributions in the file, therefore complying with the license obligations.
4.xx.3 Cardinality: Optional, one.
4.xx.4 Data Format: free form text that can (and usually will) span multiple lines
4.xx.5 Tag: "FileAttributionText:"
In Tag:value format, the multiple lines are delimited by "<text>" and "</text>".
Example:

FileAttributionText: <text>
#   Copyright (C) 2004 Free Software Foundation, Inc.
#   Written by Scott James Remnant, 2004
#
# This file is free software; the Free Software Foundation gives
# unlimited permission to copy and/or distribute it, with or without
# modifications, as long as this notice is preserved.
</text>
4.xx.6 RDF: property fileAttributionText in class spdx:File
Example:
    	<File rdf:about="...">
           	<fileAttributionText>
#   Copyright (C) 2004 Free Software Foundation, Inc.
#   Written by Scott James Remnant, 2004
#
# This file is free software; the Free Software Foundation gives
# unlimited permission to copy and/or distribute it, with or without
# modifications, as long as this notice is preserved.
           	</fileAttributionText>
    	</File>

Normative children and informative parent information

We currently have some awkward wording around cardinality (#40). I think the difficulty comes from defining cardinality alongside the child element, when it's really a parent property. I'd rather see each parent clearly define their allowed children, with potential backlinks from children to possible parents. For an example of this in another spec, see HTML's html and p elements, which have a normative “Content model” and a convenience “Contexts in which this element can be used”. Once we shift those around, we could have the root element clearly declare that it could contain zero or more package entries, zero or more file entries, etc.

This would be a rather large change, and there are a number of open PRs already in flight. I'm happy to put together a PR for this, but it would be good to get at least preliminary agreement on the approach first, and ideally have fewer in-flight PRs going on in parallel ;).

Change cardinality of selected fields from mandatory to optional.

SPDX 2.1 has 34 mandatory tags. Propose to reduce the number of mandatory fields to minimal fields needed for exchange to reduce friction to participate.

Believe the following fields could be made optional:
§2.4 DocumentName
§2.5 DocumentNamespace - Most of the time these URLs are totally artificial as producers do not maintain SPDX as linked data
§3.9 PackageVerificationCode - Verification should be optional; license scanners ignore different types of files, such as .git dirs, and as such two scanners can produce different PackageVerificationCodes for the same package
§3.15 PackageLicenseDeclared - PackageLicenseConcluded and PackageLicenseInfoFromFiles provide the same information
§5.3 SnippetByteRange - Making this optional reduces friction to SPDX participation. Maintainer can easily manually specify SnippetLineRange but SnippetByteRange requires tooling

Gary: consider going to profiles? Simpler - same field names, make optional or not. Some redundancy in documentation.

SPDX-Document
2.1.5 SPDXVersion
2.2.5 DataLicense
2.3.5 SPDXID
2.4.5 DocumentName
2.5.5 DocumentNamespace
2.8.5 Creator
2.9.5 Created

SPDX-Package
3.1.5 PackageName
3.2.5 SPDXID
3.7.5 PackageDownloadLocation
3.9.6 PackageVerificationCode
3.13.5 PackageLicenseConcluded
3.14.5 PackageLicenseInfoFromFiles
3.15.5 PackageLicenseDeclared
3.17.5 PackageCopyrightText

SPDX-File
4.1.5 FileName
4.2.5 SPDXID
4.4.6 FileChecksum
4.5.5 LicenseConcluded
4.6.5 LicenseInfoInFile
4.8.5 FileCopyrightText

SPDX-Snippet
5.1.5 SnippetSPDXID
5.2.5 SnippetFromFileSPDXID
5.3.5 SnippetByteRange
5.5.5 SnippetLicenseConcluded
5.8.5 SnippetCopyrightText

Non-SPDX License Identifier
6.1.5 LicenseID
6.2.5 ExtractedText
6.3.5 LicenseName

SPDX-Annotation
8.1.5 Annotator
8.2.5 AnnotationDate
8.3.5 AnnotationType
8.5.5 AnnotationComment

Generate the license list chapter

@tsteenbe Following up on the tech call on 10 July, I updated the license generation tools to create a markdown page for the license list: https://github.com/spdx/license-list-data/blob/master/licenses.md

To make this work for the license-list-data repository, the links and the text are different from the license list chapter itself. You are welcome to use the licenses.md to help generate the chapter text, but it may be easier to just generate the page using a node.js script using a well structured JSON file as input.

I would recommend using the JSON table of contents page at https://github.com/spdx/license-list-data/blob/master/json/licenses.json and https://github.com/spdx/license-list-data/blob/master/json/exceptions.json. The structure and tag names for these pages are stable. If you would like to get the license list for a specific version, checkout the tag by the version name.

Conditional License Expressions e.g. (IF SOURCE THEN MIT ELSE CC-BY-3.0+)

Transferred from https://bugs.linuxfoundation.org/show_bug.cgi?id=1360

Bill Schineller 2016-05-24 17:51:10 UTC
From David Wheeler on mailing list http://lists.spdx.org/pipermail/spdx-tech/2016-May/003091.html

In the Linux Foundation CII "best practices" badge effort I'm noticing an interesting problem. Some projects have different license situations for their source code and documentation, but there's no simple way to express that using SPDX License expressions. Examples of projects where the license isn't easily expressed with SPDX expressions are:
https://bestpractices.coreinfrastructure.org/projects/1
https://bestpractices.coreinfrastructure.org/projects/137

I propose adding a new construct:
"(IF <condition> THEN <license-expression> [ELSE <license-expression>])" to License expressions.
For starters, <condition> can be:
DOCUMENTATION = True if & only if (iff) documentation
SOURCE = True if & only if (iff) source code

So "Source code under MIT, everything else under CC-BY-3.0 or later" becomes this license expression:
"(IF SOURCE THEN MIT ELSE CC-BY-3.0+)".

If there's no "else" and the condition is false, it'd be interpreted as the empty set of rights ("no rights"), so these would mean the same thing:
"MIT OR (IF DOCUMENTATION THEN CC-BY-3.0+)"
"(IF DOCUMENTATION THEN (MIT OR CC-BY-3.0+) ELSE MIT)"

I imagine Condition could be beefed up to allow AND/OR/NOT, file matching, jurisdiction matching, and comparisons with the current date (for timed releases in the future). But that's for a later discussion.

--- David A. Wheeler
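
The equivalence claimed above for a missing ELSE can be checked with a small model. This is only a sketch under stated assumptions: a license expression is modelled as a function from a context (the set of condition names that are true) to the set of licenses a recipient may choose from, OR is a union of choices, and "no rights" is the empty set. The helper names `lic`, `either` and `conditional` are invented for the illustration, and AND is not modelled.

```python
def lic(name):
    """A single license: always offers exactly that one choice."""
    return lambda context: frozenset({name})


def either(left, right):
    """OR: the recipient may choose from either side."""
    return lambda context: left(context) | right(context)


def conditional(condition, then, otherwise=None):
    """IF condition THEN then [ELSE otherwise]; a missing ELSE grants no rights."""
    def evaluate(context):
        if condition in context:
            return then(context)
        return otherwise(context) if otherwise is not None else frozenset()
    return evaluate


# "MIT OR (IF DOCUMENTATION THEN CC-BY-3.0+)"
first = either(lic("MIT"), conditional("DOCUMENTATION", lic("CC-BY-3.0+")))

# "(IF DOCUMENTATION THEN (MIT OR CC-BY-3.0+) ELSE MIT)"
second = conditional(
    "DOCUMENTATION", either(lic("MIT"), lic("CC-BY-3.0+")), lic("MIT")
)
```

Under this model both expressions offer MIT alone for non-documentation files and a choice of MIT or CC-BY-3.0+ for documentation, which matches the reading given in the proposal.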

Add keyword section that can apply to any element (optional, 0-many)

Thomas: New tag PackageTag (optional, cardinality - multiple) enables users to add custom (user-defined?) tags to a package for grouping. Useful for automatic assessment of license results

Example:

PackageName: jUnit
# Similar to Maven dependency scopes e.g. compile, runtime, etc.
PackageTag: scope:test
# Defines type of SW such as build_tool, test_framework, sw_library, utility
PackageTag: type:build_tool
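
To show how such tags might be consumed for automatic assessment, here is a rough sketch that groups them by key. PackageTag is only a proposal in this thread, and the "key:value" shape inside the tag is inferred from the example above; the function name `package_tags` is hypothetical.

```python
from collections import defaultdict


def package_tags(tag_value_text):
    """Group proposed 'PackageTag: key:value' lines by key."""
    tags = defaultdict(list)
    for line in tag_value_text.splitlines():
        stripped = line.strip()
        if not stripped.startswith("PackageTag:"):
            continue  # skips comments and all other tags
        entry = stripped[len("PackageTag:"):].strip()
        key, _, value = entry.partition(":")
        tags[key].append(value)
    return dict(tags)


EXAMPLE = """\
PackageName: jUnit
# Similar to Maven dependency scopes e.g. compile, runtime, etc.
PackageTag: scope:test
# Defines type of SW such as build_tool, test_framework, sw_library, utility
PackageTag: type:build_tool
"""
```

A consumer could then filter packages with, for instance, `package_tags(text).get("scope") == ["test"]` before deciding which license findings matter for a release.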

Kate: Currently using Package Comment - but problematic with filtering with other comments that are in the comments. Overloaded, so problematic.
Kate: Could it be a Package Type - like https://spdx.org/spdx-specification-21-web-version#h.7vzbl5vywpa7 ? Yev not sure worth going in this direction.
Gary: Could Annotation be used? Extend annotation types? https://spdx.org/spdx-specification-21-web-version#h.wlc7jg3vsu43
Kate: Use case of package compiled to binary file - would we want to tag a binary or jar file? Thomas, Gary - yes can see it used but not as common as packages
Yev: when documenting a supply chain - what should be tagged, what should be in relationship?
Gary: can see it being useful. Maybe annotations isn’t best approach. But having it applicable to all elements may make sense? Property of element.
Yev: Images, containers, etc.
Alex: Licenses - …
New High level property appropriate to any element?
Yev: We need to be careful with tagging licenses…
Alex: Whatever you put in tags should be interpreted as a statement by the author of the SPDX document.
Freeform tagging vs. specific categories - user assertion. Specify tags are declaration of documentation author, vs. Custom tags not declarative. Create adhoc tags, and no meaning as far as SPDX concerns. Spec says it can exist.
Yev: That makes it useful for insourcing
Thomas: Also could be used when customer relationship.
Tending towards: May apply to any element, optional, signal author may want to convey. Enumeration 0 or more.
Yev: Field in Document Scope to describe the meanings of tags? Also could be done in Comments. Explanation of used tags included in creator comment - as best practice.
Conclusion: Keyword Section - may apply to any element, it's optional, with cardinality 0 to many, a signal that authors may want to convey. Scheduled for 3.0

Expand §8.3.4 Annotation Type with new values (LICENSE, PATENT, COPYRIGHT, EXPORT, TRADEMARK) and change cardinality

Thomas: Expand §8.3.4 Annotation Type with new values “LICENSE” | “PATENT”. Enables annotator to more precisely indicate type of annotation
Discussed use case brought up by customers - will save research. OpenCV is case cited. Be able to annotate that a patent has expired, etc.
Use LICENSE when it's not 100% clear, so may want to provide information about equivalence or not with another. Zlib 1.0.6 and another close to it.
Different lawyers handle different roles, want to give lawyers comments that apply to the appropriate reviewers.
Thomas: Copyright holder may be out of business, so may want to have a “COPYRIGHT” as well.
Discussion of who adds the and roles between Alexios & Thomas.
Desire to permit multiple TYPES to be used with ANNOTATION

  1. Add in new values LICENSE, PATENT, COPYRIGHT, EXPORT, TRADEMARK as valid types.
  2. Permit cardinality from one to many.
    Gary ok with cardinality change, sees as useful for automation.
    Looking at this as a 2.2 feature.

Simplify license expression grammar in Appendix IV

This was https://bugs.linuxfoundation.org/show_bug.cgi?id=1334

Kate Stewart 2015-12-08 16:57:25 UTC
From David Wheeler on December 4, 2015

The current Appendix IV is also overly complex and confusing:

  • There's no need to have "compound-expression" as separate from "license-expression". The "license-expression" is defined to be either simple or compound, but a simple-expression is also a legal compound-expression, so the whole indirection is unnecessary and confusing.

  • In simple-expression, the "+" should just optionally follow license-id; that's how anyone would parse it, and it's easier to explain too.

So I suggest replacing simple-expression, compound-expression (to be removed), and license-expression with this simpler spec:

simple-expression = license-id ["+"] / license-ref

license-expression =   simple-expression [ "WITH" license-exception-id ] /

  license-expression "AND" license-expression /

  license-expression "OR" license-expression /

  "(" license-expression ")"

You could change simple-expression to be:

simple-expression = license-id ["+"][ "WITH" license-exception-id ] / license-ref

and omit the ["WITH...] in the following line, but I like the idea of allowing a license-ref with a standard exception. Besides, that's currently allowed, no reason to remove this functionality.

Both this and the original description are silent about left-to-right or right-to-left; I don't think it matters, but if someone wants things to be parsed identically, perhaps that should be mentioned.

I can imagine adding suffixes like "!" (I'm sure it's only this particular version of the license) or "?" (I'm not sure that it's limited to this particular version of the license), in addition to "+".

However, that's a separate discussion.

Also: is there any reason to FORBID the "+" suffix after a license-ref or license-exception-id?

In particular, someone might use a license-ref while waiting for a license to be added to the SPDX license list or exception list.

One way would be to change my proposed grammar above to:

simple-expression = license-id / license-ref

license-expression =   simple-expression ["+"] [ "WITH" license-exception-id ["+"] ] /
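
As a rough check that the first simplified grammar proposed in this comment is easy to parse, here is a recursive-descent sketch. Several things are assumptions of the sketch and not part of the proposal: AND is given higher precedence than OR, operators associate left to right (the proposal itself is silent on this), identifiers are matched by shape only and not looked up in any list, and the tuple-based tree and all function names are invented for illustration.

```python
import re

_TOKEN = re.compile(r"\s*(\(|\)|[A-Za-z0-9.\-:]+\+?)")
KEYWORDS = {"AND", "OR", "WITH"}


def tokenize(expression):
    """Split an expression into parentheses and identifier/keyword tokens."""
    tokens, position = [], 0
    text = expression.rstrip()
    while position < len(text):
        match = _TOKEN.match(text, position)
        if match is None:
            raise ValueError(f"unexpected character at offset {position}")
        tokens.append(match.group(1))
        position = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.index = 0

    def peek(self):
        return self.tokens[self.index] if self.index < len(self.tokens) else None

    def take(self):
        token = self.peek()
        if token is None:
            raise ValueError("unexpected end of expression")
        self.index += 1
        return token

    def parse_or(self):
        node = self.parse_and()
        while self.peek() == "OR":
            self.take()
            node = ("OR", node, self.parse_and())
        return node

    def parse_and(self):
        node = self.parse_term()
        while self.peek() == "AND":
            self.take()
            node = ("AND", node, self.parse_term())
        return node

    def parse_term(self):
        token = self.take()
        if token == "(":
            node = self.parse_or()
            if self.take() != ")":
                raise ValueError("expected ')'")
            return node
        if token == ")" or token in KEYWORDS:
            raise ValueError(f"unexpected token {token!r}")
        if "LicenseRef-" in token and token.endswith("+"):
            raise ValueError("'+' may only follow a license-id")
        node = token  # simple-expression: license-id ["+"] or license-ref
        if self.peek() == "WITH":
            self.take()
            exception = self.take()
            if exception in KEYWORDS or exception in ("(", ")") or exception.endswith("+"):
                raise ValueError(f"bad exception id {exception!r}")
            node = ("WITH", node, exception)
        return node


def parse(expression):
    """Parse a license expression into a nested tuple tree."""
    parser = Parser(tokenize(expression))
    tree = parser.parse_or()
    if parser.peek() is not None:
        raise ValueError(f"unexpected token {parser.peek()!r}")
    return tree
```

With this shape, "+" is handled as an optional suffix on the license-id token and WITH attaches to any simple-expression, including a license-ref, which is the combination the comment wanted to keep.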

Kate Stewart 2015-12-15 18:34:44 UTC
From discussion on 20151215 - Mark wants to confirm that the revised version still works in his encoding. Other than that, simpler is better, so once it's proven out, we'll look at changing this in the 2.1 spec.

Bill Schineller 2016-05-17 17:35:11 UTC
Mark?

Introduce CopyrightText at all levels and relax cardinality

Thomas: Instead of PackageCopyrightText and CopyrightText, introduce new PackageCopyrightHolder and CopyrightText. Both optional, and more than one entry can exist per SPDX-File or SPDX-Package. Better indicates individual rights holders and makes parsing of this data easier.

Kate: https://spdx.org/spdx-specification-21-web-version#h.2grqrue - relax one to many?
Yev: Copyright holder implies present tense, could have been reassigned. Declared vs concluded may be required, not sure we want to go there.
Thomas: Changing cardinality would help.
Yev: We’ll need to apply this to Files & Snippets, as well.
Gary, Kate, Yev, Alexios, Thomas - all +1 on relaxing cardinality. 2.2, permit cardinality to go from one to many. Shelve copyrightholder until more compelling use case.
Tags: PackageCopyrightText, FileCopyrightText, SnippetCopyrightText; RDF: property spdx:copyrightText

PACKAGE-MANAGER naming inconsistent

Yev Bronshteyn 2016-05-30:
The "PACKAGE-MANAGER" category is inconsistent with other names, where we use underscore instead of hyphen (such as "DISTRIBUTION_ARTIFACT" or "DATAFILE_OF" in relationship).

The categories are not demonstrated in the RDF examples. To demonstrate them, we would need to, ideally, represent them with URIs, e.g.

<category rdf:resource="http://spdx.org/rdf/terms#referenceCategory_package_manager" />

This also means categories need to be added to the ontology.

Lastly, upon further reading, I would recommend separating the "target" property in RDF into two: "type" and "locator", which are terms we already define separately. Unlike the tag format, which aims to be readable, the core tenet of RDF is to be resolvable. This way, type can be represented in RDF by a URI that can resolve to provide more information about the target. We can define the vocabulary of that as part of the ontology work for SPDX 2.1 - it needn't be in the spec.

So an example of a full external reference to a standard repository might be:

<spdx:Package rdf:about="http://yevster.com/packages/foobar">
	<spdx:externalRef>
		<spdx:ExternalRef>
			<spdx:referenceCategory rdf:resource="http://spdx.org/rdf/terms#referenceCategory_package_manager" />
			<spdx:referenceType rdf:resource="http://spdx.org/rdf/refeferences/maven-central" />
			<spdx:referenceLocator>org.apache.commons:commons-lang:3.2.1</spdx:referenceLocator>
		</spdx:ExternalRef>
	</spdx:externalRef>
</spdx:Package>

Yev Bronshteyn 2016-05-31 04:55:57 UTC
It should be pointed out that the approach described for external reference types above is the same one that we use for all other "listed values", including license IDs, relationship types, etc. Anything that's listed in the spec (the body or appendix) is identified by a URI in RDF format. I submit that this should also be the case for reference types.

Note: This was https://bugs.linuxfoundation.org/show_bug.cgi?id=1361

Expand Possible Relationships (section 7.1) between elements to include more use cases.

Spec 7.1 needs to be expanded - which useful relationships are missing, and what are their definitions?

  • Yev would like to see inverse of EXPANDED_FROM_ARCHIVE, Archive of.
  • Thomas would like to add following relationships to better define context of license finding:
    • EXAMPLE_OF - Source code included with OSS package for example purposes.
    • TEST_TOOL_OF - To distinguish test frameworks such as jUnit from source dependencies.
    • TOOL_OF or DEV_TOOL_OF - To indicate tooling included in OSS package for development or utility purposes which are not used to build a package. Think git commit hooks, FTP upload scripts, etc.
  • Clarifying prerequisite description in existing spec (2.2?) is also needed for target system
  • Others to comment… provide input.

Clarify the use of newlines in license expressions in Appendix IV

The current license expression syntax states that whitespace must be used between elements of a compound expression. However, it does not explicitly state if a newline, CR, or LF is included in the definition of whitespace.

Suggest adding a definition of white space which include new line.
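
If white space were defined to include newlines, a tool could normalize an expression before parsing it. A minimal sketch, assuming the hypothetical helper name `normalize_expression`:

```python
def normalize_expression(expression):
    """Collapse any run of white space, including CR and LF, to a single space."""
    return " ".join(expression.split())
```

For example, `normalize_expression("(MIT OR\r\n    Apache-2.0)\n")` gives `"(MIT OR Apache-2.0)"`. Python's `str.split()` with no argument also splits on other Unicode white space characters, so it is broader than space, tab, CR and LF; a spec definition would need to say exactly which characters count.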

formally capture External Identifiers (e.g. Maven GAV, NIST CPE) by which a Package is known in SPDX

This has been transferred from: https://bugs.linuxfoundation.org/show_bug.cgi?id=1295

Bill Schineller 2015-06-23 16:08:02 UTC
Capture External Identifiers (e.g. Maven GAV, NIST CPE) by which a Package is known in SPDX doc.

So that SPDX data can be easily correlated with data that other repositories, package management, build systems have about the package.
Each of these external systems has their own format for a specific version of a 'package' (what SPDX calls a package, other systems might call an 'artifact' or Vendor-Product-Version...)

  1. Maven
    Format: :[:]
    Example: activemq:activemq-transport-http:1.3

  2. CPE (Common Platform Enumeration) see https://cpe.mitre.org/specification/
    Format: cpe:/a:::[:][: | packed field]
    Example: cpe:/a:acegisecurity:acegi-security:1.0.3

  3. Rubygems
    Format: [/]
    Example: ActionTimer/0.0.2

  4. npmjs
    Format: [/]
    Example: rethinkdbdash/1.16.3

  5. NuGet
    Format: [/]
    Example: AForge.Controls/2.2.3

Bill Schineller 2015-06-23 16:17:23 UTC

  6. PyPI
    Format: [/]
    Example: medialog.iconpicker/0.2.3

Bill Schineller 2015-06-23 17:47:14 UTC
Per conversation on tech concall, we should be clear that we only want to accept External Identifiers that point to a specific, discrete version of software / set of files, i.e. no wildcards, no 'this version or greater' semantics.

  • the 'namespace', i.e. what system the identifier is unique within, is critical to this
  • where to find the repository online is important
  • requirements for a 'repository' (repository of information, not necessarily repository of bits) to be legitimate: the identifier must be unique within that repository
  • should be able to get a hardcopy of the software? (nah, NIST CPE is just a list...)
  • is there a way to factor out the list of repositories from the spec? maybe a list of 'repositories of information' that we might maintain on spdx.org?
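
The "specific, discrete version" requirement could be checked mechanically. The sketch below is only an illustration: the separators are inferred from the examples earlier in this thread, the rule that a discrete version starts with a digit and contains no wildcard or range characters is an assumption, and `has_discrete_version` is a made-up name. CPE is left out because its format has more fields.

```python
import re

# Assumption: a discrete version starts with a digit and contains only letters,
# digits, '.', '-' and '_' (so no '*', '+', '>=', '^', '~', ',' or brackets).
_DISCRETE_VERSION = re.compile(r"[0-9][A-Za-z0-9._-]*")

# Separator between name and version, inferred from the examples in this thread.
_SEPARATORS = {"maven": ":", "rubygems": "/", "npmjs": "/", "nuget": "/", "pypi": "/"}


def has_discrete_version(repository, identifier):
    """True when the identifier ends in one explicit version without wildcards."""
    separator = _SEPARATORS[repository]
    name, found, version = identifier.rpartition(separator)
    if not found or not name:
        return False  # no version component at all
    return _DISCRETE_VERSION.fullmatch(version) is not None
```

Real ecosystems allow versions that this rule would reject (a leading "v", for instance), so a production check would need per-repository rules.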

Bill Schineller 2015-07-14 13:59:40 UTC
Draft spec proposal at https://docs.google.com/document/d/1j6LWnkh5GbMV9Xo5_zJ0wTNLROEIa4o1OU279YueI90/edit?usp=sharing

Kate Stewart 2015-12-22 18:49:01 UTC
This is still a work in progress for tech team.

Bill Schineller 2016-05-10 17:41:36 UTC
See Section 3.21 and the new Appendix VI (6) of the SPDX 2.1 near-final draft. Note: the Appendix has a finite list of some External repositories, e.g. NIST Common Platform Enumeration (CPE) and Maven GAV. SPDX 2.1 chose not to try to implement custom-defined External repos not in the list. Also a relatively coarse-grained list of Categories.

Bill Schineller 2016-05-24 17:54:25 UTC
reassigning to Kate, to pull the proposed Appendix VI into the SPDX version 2.1 spec.

Bill Schineller 2016-06-28 17:29:29 UTC
Appendix VI: External Repository Identifiers was pulled into SPDX 2.1 https://docs.google.com/document/d/112x3s3g1Qg2tj8bjvIPsqIBlWUp3Sob37cvAx2eiS6U/edit#heading=h.hb0u4akk190q One pending issue is how to have a single type for 'debian' but be able to differentiate different distro versions jessie, wheezy, ...

Remove + from valid idstring values and consistently link to appendix IV

The 2.1 spec is not particularly DRY on idstring values. There are a number of local definitions that match up with 1*(ALPHA / DIGIT / "-" / "."), a definition that includes + (perhaps from before it was a License Expression operator?) and a “defined in Appendix” (without specifying which appendix). I think we should extend our use of ABNF to include more than just appendix IV. We'd define idstring (or just id?) in the first place we needed it (here?), and then later sections would link that earlier definition and consume its ABNF rule.

I'm happy to work up a PR for this if it sounds useful.
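
For reference, the ABNF rule 1*(ALPHA / DIGIT / "-" / ".") quoted above translates directly into a regular expression. The sketch below uses the hypothetical name `is_idstring` and shows that "+" falls outside this definition.

```python
import re

# 1*(ALPHA / DIGIT / "-" / "."): one or more ASCII letters, digits, hyphens or periods.
IDSTRING = re.compile(r"[A-Za-z0-9.-]+")


def is_idstring(value):
    """True when the whole value matches the idstring rule."""
    return IDSTRING.fullmatch(value) is not None
```

So `is_idstring("GPL-2.0")` holds while `is_idstring("GPL-2.0+")` does not, which is the behaviour the issue title asks for.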

Deprecating SHA1

This is likely a condition for projects going for CII badging so good thing to do, given public compromises noted. Other notes from earlier discussion on google doc

  • PVC to use something
  • Uday interested in putting this proposal, Brad, Yev +1
  • SPDX 3.0 - is going to be needed. General ok from all on call.

Need to be able to express Dependencies on a range of versions.

Thomas: Sees a need for syntax to capture a package's dependencies where those dependencies are specified with a version range, e.g. resulting in non-deterministic builds. SPDX currently only offers to specify dependencies using fixed one-to-one relationships, e.g. Package A depends on Package B v1.1. In reality, Package A specifies that it relies on Package B v1.0 or newer.

Having this in the spec provides package maintainers a technology-agnostic way to specify their dependencies closer to reality. It provides consumers of these packages with an indicator that including the package may result in a license mix that can change with every build. It may also be useful to handle the difference between the declared (by maintainer) and resolved (by package manager) dependencies.

Example - SPDX specifies a dependency on angular 4.1.1, whose package.json specifies that it depends on core-js 2.4.1 or newer

Note: approach is not figured out yet, but general agreement that this is a problem and we should look into solving it for the next release.
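
To make the declared-versus-resolved distinction concrete, here is a small Python sketch. It is not a proposed SPDX syntax. It assumes purely numeric dotted versions with the same number of components, and the function names are hypothetical; real ecosystems have richer version rules.

```python
def parse_version(text: str) -> tuple:
    """Split a dotted numeric version such as "2.4.1" into (2, 4, 1)."""
    return tuple(int(part) for part in text.split("."))


def satisfies_minimum(resolved: str, minimum: str) -> bool:
    """Return True if `resolved` is `minimum` or newer."""
    return parse_version(resolved) >= parse_version(minimum)


# Declared by the maintainer: core-js 2.4.1 or newer.
declared_minimum = "2.4.1"

# Hypothetical versions a package manager might resolve on different builds.
for resolved in ["2.4.1", "2.5.0", "2.3.9"]:
    print(resolved, satisfies_minimum(resolved, declared_minimum))
```

A fixed one-to-one relationship can record only one of the resolved versions, while the declared range accepts several.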

Google redirects in many links on SPDX 2.1 specification HTML version

Hi Kate, Gary recommended I open this bug here. Please let me know if this would be better handled elsewhere. Here's the issue:


I noticed a minor issue in the HTML version of SPDX 2.1 specification. All of the HTML links in that section go through the Google redirect service, prepending the SPDX URL with a Google URL (e.g. https://www.google.com/url?q=http://spdx.org/spdx-license-list/matching-guidelines&sa=D&ust=1473291615549000&usg=AFQjCNGAF8fFt6wIxj4Sj1XSOS0LdR2a5A). I'm guessing that this may be due to copy-and-paste from a gdoc draft?

For the PDF version, I didn't do a thorough check, but did look at Appendix II, and this issue does not seem to affect that version. However, I did notice some links to Google Docs in Appendix IV there (the 2nd and 3rd links, to "Appendix I.1" and "Appendix I.2") which are probably meant to be internal links rather than to an s.sfusd.edu Google Doc (this was present in both the PDF and HTML version of the spec).


Best,
Brad
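
For anyone cleaning up the HTML, the wrapped links can be unwrapped mechanically. The following Python sketch assumes the redirect always has the form https://www.google.com/url?q=<target>&... as in the example above; the function name is hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def unwrap_google_redirect(url: str) -> str:
    """Return the target of a google.com/url?q=... link, else the URL unchanged."""
    parts = urlparse(url)
    if parts.netloc in ("www.google.com", "google.com") and parts.path == "/url":
        targets = parse_qs(parts.query).get("q")
        if targets:
            return targets[0]
    return url


wrapped = (
    "https://www.google.com/url?q=http://spdx.org/spdx-license-list/"
    "matching-guidelines&sa=D&ust=1473291615549000"
    "&usg=AFQjCNGAF8fFt6wIxj4Sj1XSOS0LdR2a5A"
)
print(unwrap_google_redirect(wrapped))
```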

Need to be able to describe relationships between SPDX license-list files (new element?)

Problem: how do we capture sets of related licenses, esp. Translations

  • Proposed solution: treat every license as separate file, then describe relationship w/ new element?
  • Any way to describe relationships among license groups such as official/unofficial translations, ported/unported, etc.?
  • For EU Public License in German, that might look something like this:
   <relatedLicenses>
      <relatedLicense relationshipType="official-translation" targetLicenseIdentifier="EUPL-1.1">EUPL-1.1</relatedLicense>
   </relatedLicenses>
   ...
   Note: Up to 24 for EUPL, etc.
  • Could this include license stacks (like newlib)? License stacks used as licenses? How to differentiate from license stacks used as informal package-license manifests?

Make the matching template formats of license part of the spec - add matching guidelines annotation to SPDX licenses and to NON-SPDX licenses.

Thomas: Make the matching template formats of license part of the spec - both SPDX listed licenses and NON-SPDX listed licenses. Would like to add matching guidelines annotation to SPDX licenses and to NON-SPDX licenses. Also add templating for copyright holders and dates.
XML specification of license texts. Has templating. Matching guidelines.
Want to add cross references to license that are on the SPDX license list.
Concern: a schema to store information about licenses is OK, but matching templates could become problematic and may be difficult to apply consistently. The old templating language in the specification is only available on listed licenses. Make other properties to listed licenses. The XML language is being used by the legal team to line up with the guidelines, but may not be standardized enough. Non-standardized input format, move to output format.
This is possibly 3 different proposals:
  • Add additional properties to the OTHER LICENSE INFORMATION file to bring it up to the same level as SPDX listed licenses.
  • Add additional fields for listed licenses, so information present in the XML can be made visible as the start of output representations (for instance bullets, copyright); we don’t want them using the input format.
  • Add the OTHER LICENSE INFORMATION fields that are not in the SPDX license list model to the SPDX license lists (i.e. comment).

Some harmonization here is going to be needed. We probably want to include license exceptions in remodeling discussion. This is probably a 3.0 feature.

Add “NOASSERTION” to the license expression syntax

Like #49, but for NOASSERTION instead of NONE. The semantics would be:

NOASSERTION means:
(i) the SPDX License Expression author has attempted to but cannot reach a reasonable objective determination;
(ii) the SPDX License Expression author has made no attempt to determine this field; or
(iii) the SPDX License Expression author has intentionally provided no information (no meaning should be implied by doing so).

That matches our existing usage except for PackageLicenseInfoFromFiles and similar, where we currently drop (i). I don't think those consumers would suffer from the additional case, because I don't see an actionable distinction between those cases. When would you care about the distinction between “tried but gave up”, “did not try”, and “won't tell you”? If folks did care about those distinctions (which I think unlikely), we'd want to be using different tokens for each case.

Other divergent NOASSERTION consumers are:

  • SnippetLicenseConcluded, which adds an additional case:

    the SPDX document creator is uncomfortable concluding a license, despite some license information being available;

    I don't think we need to bother with that one at all, since I can't think of a case where I could distinguish between it and the “cannot reach a reasonable objective determination” case, even for license expressions I write myself. But I haven't looked up the background motivation for this case, perhaps it is useful. If so, I don't see the harm in including it for all consumers.

  • LicenseName, which is completely unrelated to license expressions.

Allow Relationship Types to be Predicates

Allow relationship types to be predicates

(e.g. http://mynamespace#mypackage spdx:contains http://mynamespace#myfile).
Verbose due to the presence of optional comment field
Details from discussion from Yev:

Package → Relationship
Relationship → File

Package:contains → File

contains: http://myname:myFile

<.... id="myPackage">
<spdx:contains id="http://myname:myFile" />
</rdf:Description>

http://myNamespace#myPackage spdx:contains http://myNamespace#myFile

Thomas notes they are using relationship comments to customize relationships, so he would not like to see this ability removed. He likes the proposal, but does not want to see “other” removed.

Yev: both should be valid (the short version as well as the original). Yev to provide an example. The old forms do not go away; this just adds a concise way of expressing relationships.
Thomas agrees that tag-value will become clearer as a result of having this additional syntax.
Open question - can annotation describe a relationship? Based on model, not able to. So not a solution at this point.

Add size of file as optional field

Size of File (optional) - express as number of bytes similar to mechanism used for snippet. Will be useful for heuristics working with snippets and licensing.

Add SHA1GIT as optional file checksum

Transferred from bugzilla: https://bugs.linuxfoundation.org/show_bug.cgi?id=1356

Kate Stewart 2016-05-19 14:06:22 UTC
see: http://lists.spdx.org/pipermail/spdx-tech/2016-May/003101.html

and from farther down the thread.

I see how making the SHA1 algorithm non-mandatory would be a breaking change, and that we'd like to avoid that. But maybe we could at least allow SHA1GIT as an additional algorithm and add it to the spec.

WRT the use-case you're asking for: It's all about performance. In our case scanners actually do scan Git checkouts most of the time, as dependencies (be it build time or run time) are usually included as Git submodules. When scanning these files, it does not make much sense to force the scanner to calculate the SHA1 on each file (in order to create valid SPDX) if the SHA1GIT is already known. However, I have to admit that getting the blob SHA1 for a given file name is a rather slow operation in Git, and for single small files (which is not uncommon for source code files) it might actually be faster to calculate the SHA1 instead of looking up the known SHA1GIT.

Finally, there's also the "reverse" use-case: Suppose you have an SPDX file with a bunch of File Checksums given, and you'd like to know which are the candidate Git commits these files can originate from. If only the SHA1s are given, you'd have to iterate over all eligible commits in your Git repository, check out the files, and calculate the SHA1 on them to see whether there's a match. With the SHA1GIT on the other hand, you could directly search Git's object database to find the trees / commits that contain the given blobs.

I agree it probably is an edge-case, but maybe still enough reason to at least allow SHA1GIT as a File Checksum algorithm.

Regards, Sebastian
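
For readers unfamiliar with the difference, the Python sketch below contrasts the two digests. It assumes SHA1GIT means Git's blob object ID, which is the SHA-1 of a "blob <size>\0" header followed by the file content; the function names are hypothetical.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Plain SHA-1 over the file content, as SPDX 2.1 requires for files."""
    return hashlib.sha1(data).hexdigest()


def git_blob_sha1_hex(data: bytes) -> str:
    """SHA-1 as Git computes it for a blob: header "blob <size>\\0" + content."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


content = b""  # an empty file keeps the example easy to check by hand
print(sha1_hex(content))           # da39a3ee5e6b4b0d3255bfef95601890afd80709
print(git_blob_sha1_hex(content))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because the header is part of the hashed input, neither digest can be derived from the other without the file content, which is why a scanner that already knows the blob ID would still have to re-read every file.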

Bill Schineller 2016-05-24 17:19:48 UTC
Decided not to change Spec version 2.1 with respect to mandatory SHA1.

Also at present not adding sha1git as a checksum type in Spec version 2.1

Changing the whole story around checksums is the type of thing we would consider discussing for an SPDX version 3.0.

(For now we hope to encourage consistent re-usable SPDX documents by sticking with our current approach of uniquely identifying each file by a SHA1)

Add appendix with SPDX best practices

There are lots of best practices documents created by the SPDX tech/outreach teams, but none of them are easy to find. I propose adding an appendix to the spec that links to the various best practices resources.

Want to include only package files that are exceptions to package license

Yev Bronshteyn 2016-04-26 17:12:03 UTC
Currently, either all the files in a package must be specified or, via the filesAnalyzed attribute, none.

However, there's a use case for specifying only those files that are exceptions to package-level licensing or, perhaps, other metadata. From the email conversation at http://lists.spdx.org/pipermail/spdx-tech/2016-April/003068.html:

I don't see the value of including the filesAnalyzed tag in my use case. I'm not doing "analysis", I am telling you what the answer is. Others can later do analysis, using that and other data, if they want to. Since this is human-created, I'm trying to minimize the number of lines.

Bill Schineller 2016-05-17 17:57:21 UTC
Won't come to closure on this for 2.1 version of the Spec, so setting bugzilla Version to 'unspecified'

Make checksums optional in next spec

Having checksums mandatory is very impractical: you often end up with SPDX docs whose checksums do not match, bit for bit, what you redistribute, making them useless in practice.
Having checksums is not a bad idea, but making them mandatory is a very bad one IMHO, and it makes creating a valid SPDX document difficult without a good reason.

Make parens optional in license expressions

There are no good reasons for ( parens ) to be mandatory for most compound expressions.
The only cases I can fathom would be:

  1. multiline expressions, and IMHO these should be banned
  2. rare cases where the operator precedence is not enough
  3. to make expressions easier to read

Therefore, they should be optional and best left to the cases where they are needed only. This would make things much simpler.
Tool-wise, any decent boolean expression handler does not care much about unneeded parens. If there are any that do, they should be fixed.
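
To illustrate that operator precedence alone is enough in most cases, here is a minimal recursive-descent sketch in Python. It is not an existing SPDX tool: the function names are hypothetical, it assumes the precedence order WITH, then AND, then OR, treats a trailing + as part of the identifier, and silently skips characters it does not recognize.

```python
import re

# Tokens are parentheses or runs of identifier characters (operators included).
TOKEN_RE = re.compile(r"\(|\)|[A-Za-z0-9.+-]+")


def parse(expression):
    """Parse a license expression into nested (operator, left, right) tuples."""
    tokens = TOKEN_RE.findall(expression)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        pos += 1
        return token

    def parse_or():  # lowest precedence
        node = parse_and()
        while peek() == "OR":
            take()
            node = ("OR", node, parse_and())
        return node

    def parse_and():
        node = parse_with()
        while peek() == "AND":
            take()
            node = ("AND", node, parse_with())
        return node

    def parse_with():  # highest precedence of the three
        node = parse_atom()
        if peek() == "WITH":
            take()
            node = ("WITH", node, take())
        return node

    def parse_atom():
        token = take()
        if token == "(":
            node = parse_or()
            if take() != ")":
                raise ValueError("expected ')'")
            return node
        if token in (")", "AND", "OR", "WITH"):
            raise ValueError("unexpected token: " + token)
        return token

    tree = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return tree


print(parse("LGPL-2.1 OR BSD-3-Clause AND MIT"))
print(parse("(LGPL-2.1 OR BSD-3-Clause) AND MIT"))
```

The first call groups the AND before the OR with no parentheses at all. The second shows the one case where parentheses change the result.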

Add missing SPDX file

It would be nice if the SPDX specification described itself with an SPDX file as CC-BY-3.0 AND MIT.

Put each sentence on its own line in the Markdown source?

I help out with the Open Container Initiative, which does this. There's no effect on the rendered content, and putting each sentence on its own line helps with git blame-based workflows. For example, links like this can target a single sentence. And it's easier to see the last commit that touched the content with sentence granularity (instead of paragraph granularity):

$ git blame chapters/appendix-IV-SPDX-license-expressions.md | grep ' 90)'
f902b619 (Thomas Steenbergen 2017-05-02 22:46:48 +0200  90) Sometimes a set of license terms apply except under special circumstances. In this case, use the binary "WITH" operator to construct a new license expression to represent the special exception situation. A valid \<license-expression> is where the left operand is a \<simple-expression> value and the right operand is a \<license-exception-id> that represents the special exception terms.

Update SPDX License List in Appendix 1

The following table contains the full names and short identifiers for the SPDX License List, v2.5 which was released July 2016. For the full and most up-to-date version of the SPDX License List as well as other related information, please see http://spdx.org/licenses/

Should we upgrade to v2.6 in the next fix release?

Unclear or broken links

Several links within the 2.1 specification are broken, or it is unclear what they refer to.

Examples

2.9 .. This field is distinct from the fields in section 7, which involves the addition of information during a subsequent review.

I guess this link is incorrect: it refers to the Relationship section. How does this relate to the Created attribute?

5.5 If the Concluded License is not the same as the License Information in File, a written explanation should be provided in the Comments on License field (section X.5). With respect to NOASSERTION, a written explanation in the Comments on License field (section X.7) is preferred.

X.5? X.7? I think both of these should reference 5.7.

In Appendix I: SPDX License List Master Files -> http://git.spdx.org/?p=license-list.git%3Ba=summary returns a 404; it should point to https://github.com/spdx/license-list

PackageSupplier: Use <> instead of parens for optional email?

The 2.1 spec has:

"Person:" person name and optional "("email")"

and:

PackageSupplier: Person: Jane Doe ([email protected])

However, it would seem more usable if we lean on RFC 2822 and use <> to set off the address. Borrowing from RFC 2822's rules for display-name and angle-addr, the ABNF for PackageSupplier: would be:

package-supplier = "PackageSupplier:" 1*(person / organization)
person = "Person:" name-optional-addr
organization = "Organization:" name-optional-addr
name-optional-addr = display-name [angle-addr]
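
As a rough illustration of how a consumer might read the proposed form, here is a Python sketch. It is a simplification and not a full RFC 2822 parser: it handles a single person or organization per line, treats the display-name as any text without angle brackets, and uses a made-up address. All names are hypothetical.

```python
import re

# Simplified reading of the proposed rule:
#   "PackageSupplier:" ("Person:" / "Organization:") display-name [angle-addr]
SUPPLIER_RE = re.compile(
    r"PackageSupplier:\s*"
    r"(?P<kind>Person|Organization):\s*"
    r"(?P<name>[^<>]+?)\s*"
    r"(?:<(?P<addr>[^<>\s]+)>)?\s*"
)


def parse_supplier(line: str):
    """Return (kind, display name, address or None) for one supplier line."""
    match = SUPPLIER_RE.fullmatch(line)
    if match is None:
        raise ValueError("not a PackageSupplier line: " + line)
    return match.group("kind"), match.group("name"), match.group("addr")


# The address below is a placeholder invented for this example.
print(parse_supplier("PackageSupplier: Person: Jane Doe <jane@example.com>"))
print(parse_supplier("PackageSupplier: Organization: Example Org"))
```

Angle brackets cannot appear in the display-name here, so the optional address is unambiguous, whereas parentheses can legitimately occur inside a name.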

Make Creator/Created optional

These are currently required, but for SPDX that is maintained in version control for a project (e.g. here), the information may be interesting, but doesn't seem critical (and it can be extracted from version control for the very curious).

This is a small part of #29.

Add appendix describing SPDX Listed License fields

There are several properties used in the SPDX Listed Licenses which are not documented in the specification.

They are currently documented in the RDFa terms used section of the Accessing SPDX Licenses document. There are also references to these fields in the License XML Elements and Attributes document.

Missing elements include:

  • isOsiApproved
  • isFsfFree (recently added)
  • standardLicenseHeader
  • example (used in Exceptions)

Propose we add another appendix Listed License Information which details out all the fields including those in common with extracted license text (e.g. licenseId, etc.).

Use headers with anchors for h3 and beyond

We're currently using # and ## to mark h1 and h2 headers (e.g. here and here). But once we get down to h3 and beyond, we start using emphasis ** instead of headers ### (e.g. here). We also stop providing anchors. That means that, while we can link to h2 headers (like this), there's no way to link to the h3+ sections. If folks are ok with it, I'd like to file a PR that converted our h3+ headers to use ###, etc. and gave them all anchors like we have for h2. Thoughts?
