
force11-scwg's Introduction

⚠️ This group is no longer active. If you are interested in implementing software citation, please join https://www.force11.org/group/software-citation-implementation-working-group ⚠️

FORCE11 Software Citation Working Group

Mission Statement (WIP)

The software citation working group will leverage the perspectives of a variety of existing initiatives working on software citation to produce a consolidated set of citation principles, with the goal of encouraging broad adoption of a consistent software citation policy across disciplines and venues. The working group will review existing efforts and make a set of recommendations. These recommendations will be put forward for endorsement by the organizations represented in this group and by others that play an important role in the community.

The group will produce a set of principles, illustrated with working examples, and a plan for dissemination and distribution. This group will not produce detailed specifications for implementation, although it may review and discuss possible technical solutions.

See the Joint Declaration of Data Citation Principles as an example of a similar deliverable.

The final output of the group was Smith AM, Katz DS, Niemeyer KE, FORCE11 Software Citation Working Group. (2016) Software citation principles. PeerJ Computer Science 2:e86 https://doi.org/10.7717/peerj-cs.86.

Co-chairs: Arfon Smith, Daniel S. Katz, and Kyle Niemeyer

Timeline

Phase 1 (June - July 2015)

Kick-off meeting (telecon) with the following goals:

  • Establish interest/backgrounds of working group participants.
  • Review mission statement, timeline and goals
  • Seek out additional participants (if we're missing key individuals)

Phase 2 (July - September 2015)

  • Gather materials documenting existing practices in member disciplines
  • Gather materials from workshops and other reports
  • Review materials, identifying overlaps and differences

Phase 3 (September 2015 - January 2016)

  • Drafting of Software Citation Principles (possibly in person at WSSSPE, Boulder, CO - 28/29 September)
  • Seek community feedback on draft
  • Iterate

Phase 4 (January - March 2016)

  • Complete the proposed final draft of the Software Citation Principles.
  • Seek out community endorsements for draft principles.

Phase 5 (April 2016)

  • Presentation of formal recommendations at FORCE2016

Communication plan

  • Monthly telecons
  • GitHub for documentation/iterating on content
  • Google groups for general discussion
  • FORCE11 email list for announcements

Members

If you are interested in joining the group, please:

  1. Add yourself to the list below through a pull request
  2. Join the group on FORCE11 to be added to the group mailing list and group folder
Name Affiliation Role
Alberto Accomazzi (@aaccomazzi) Harvard-Smithsonian CfA Participant
Alice Allen (@owlice) Astrophysics Source Code Library Participant
Micah Altman (@maltman) Program on Information Science, MIT Participant
Jay Jay Billings (@jayjaybillings) Oak Ridge National Laboratory Participant
Carl Boettiger (@cboettig) UC Berkeley Participant
Jed Brown (@jedbrown) CU Boulder Participant
Sou-Cheng Choi (@sctchoi) NORC at the University of Chicago and Illinois Institute of Technology Participant
Neil Chue Hong (@npch) Software Sustainability Institute Participant
Tom Crick (@tomcrick) Cardiff Metropolitan University Participant
Mercè Crosas (@mcrosas) IQSS, Harvard University Participant
Scott Edmunds (@ScottBGI) GigaScience, BGI Hong Kong Participant
Christopher Erdmann (@libcce) Harvard-Smithsonian CfA Participant
Martin Fenner (@mfenner) DataCite Participant
Darel Finkbeiner (@darelf) OSTI Participant
Ian Gent (@turingfan) University of St Andrews, recomputation.org Participant
Carole Goble (@carolegoble) The University of Manchester, Software Sustainability Institute Participant
Paul Groth (@pgroth) Elsevier Labs Participant
Melissa Haendel (@mellybelly) OHSU Participant
Stephanie Hagstrom (@sthagstrom) FORCE11 Participant
Robert Hanisch (@rjhanisch) NIST/ODI Participant
Edwin Henneken (@ehenneken) Harvard-Smithsonian CfA Participant
Ivan Herman (@iherman) W3C Participant
Konrad Hinsen (@khinsen) CNRS Participant
James Howison (@jameshowison) UTexas Participant
Michael Hucka (@mhucka) Caltech Participant
Lorraine Hwang (@ljhwang) UC Davis Participant
Thomas Ingraham (@tingraham) F1000Research Participant
Matthew B. Jones (@mbjones) NCEAS, UC Santa Barbara Participant
Catherine Jones (@cm-j0nes) Science and Technology Facilities Council Participant
Daniel S. Katz (@danielskatz) University of Illinois Co-chair
Alexander Konovalov (@alex-konovalov) University of St Andrews Participant
John Kratz (@JEK-III) California Digital Library Participant
Jennifer Lin (@jenniferlin15) Public Library of Science Participant
Frank Löffler (@knarrff) Louisiana State University Participant
Brian Matthews (@brianmatthews42) Science and Technology Facilities Council Participant
Abigail Cabunoc Mayes (@acabunoc) Mozilla Science Lab Participant
Daniel Mietchen (@Daniel-Mietchen) NIH Participant
Bill Mills (@BillMills) Mozilla Science Lab Participant
Evan Misshula (@EMisshula) CUNY Graduate Center Participant
August Muench (@augustfly) American Astronomical Society Participant
Fiona Murphy (@DrFionalm) Independent Researcher Participant
Lars Holm Nielsen (@lnielsen) CERN Participant
Kyle Niemeyer (@kyleniemeyer) Oregon State University Co-chair
Robert Peters (@rcpeters) ORCID.org Participant
Tom Pollard (@tompollard) MIT Participant
Karthik Ram (@_inundata) University of California, Berkeley Participant
Fernando Rios (@zoidy) Johns Hopkins University Participant
Ashley Sands (@ashleysa) UCLA Information Studies Participant
Soren Scott (@roomthily) Independent Researcher Participant
Frank J. Seinstra (@fjseins) Netherlands eScience Center Participant
Arfon Smith (@arfon) GitHub Co-chair
Kaitlin Thaney (@kaythaney) Mozilla Science Lab Participant
Ilian Todorov (@iliant) STFC Participant
Matt Turk (@MatthewTurk) University of Illinois Participant
Miguel de Val-Borro (@migueldvb) Princeton University Participant
Daan Van Hauwermeiren (@DaanVanHauwermeiren) Ghent University Participant
Stijn Van Hoey (@StijnVanHoey) Ghent University Participant
Belinda Weaver (@weaverbel) The University of Queensland Participant
Nic Weber (@nniiicc) University of Washington iSchool Participant
Marijane White (@marijane) OHSU Participant
Qian Zhang (@paopao74cn) University of Illinois Participant

(this list is in alphabetic order by surname; please keep it that way when making additions)


force11-scwg's Issues

what does an identifier point to?

Discuss the issue that an identifier can point to a specific version, which is mostly what we were thinking of here. There are also other valid use cases, such as identifiers that point to a collection of versions of the software, or identifiers that point to the latest version.

One of the drivers behind a collection is to be able to follow and obtain credit for a total software package.

figure 2 change

Move the stakeholder column to the far right, and make it less strict, recognizing that there are multiple stakeholder groups for many use cases.

Extend discussion of software metadata needs

The current section "Existing efforts around metadata standards" discusses several software metadata efforts, but doesn't clearly articulate that a consolidation effort is needed. Projects like CodeMeta are attempting to crosswalk software metadata specifications for interoperability, and this section should highlight the need for this while clarifying that it is out of scope for these citation principles. I am willing to draft a pull request with a short edit to this section.
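To make the crosswalking point concrete, here is a minimal sketch of the kind of descriptive metadata a CodeMeta record (codemeta.json) carries. This is my own illustration rather than anything from the draft: the field names follow CodeMeta/schema.org conventions, but every value is a placeholder.

```python
import json

# A minimal, hypothetical CodeMeta-style record; all values are placeholders.
# Because CodeMeta reuses schema.org terms, records like this can be
# crosswalked to other software metadata formats by mapping the keys.
codemeta_record = {
    "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
    "@type": "SoftwareSourceCode",
    "name": "example-package",                    # placeholder software name
    "version": "1.2.0",                           # version being described
    "author": [{"@type": "Person",
                "givenName": "Ada",
                "familyName": "Example"}],
    "identifier": "https://doi.org/10.0000/placeholder",  # placeholder DOI
    "codeRepository": "https://github.com/example/example-package",
    "license": "https://spdx.org/licenses/MIT",
}

print(json.dumps(codemeta_record, indent=2))
```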

Use case text

Strongly agree that the text in the Table 2 caption should be moved to Section 3.

section 6 is empty

We don't have any examples of how the principles would be applied to the use cases, which we had planned to include.

Please delete respecting software authors' requests.

This may be controversial, and indeed I know it is from recent Twitter discussions. But I feel it's important - or at least, even if it's too controversial for agreement, it's important to have the discussion.

There is text in the draft which says

"In addition, if the software authors ask that a paper should be cited, that should be respected."

I would like this sentence simply to be deleted.

I profoundly disagree with this point. I feel that software authors' opinion of whether their work should be cited is almost (but not completely) irrelevant.

Not only that, I think it's extremely important that the community understands that it is not up to software authors to demand citation. They can certainly request it, that is fine. But citation is part of the scientific process, not a request that an author can insist on. This is NOT trying to demean the importance of software citation; it's actually equating it to all other forms of citation. I cite papers because it's the right thing to do, not because somebody asked me to. Nobody ever cites the papers that authors want to have cited: if they did, then all papers ever written would cite all other papers ever written.

The only area where I think the author's wish should be viewed as relevant is when it's a 50:50 call on whether to cite a piece of software or not. That is, if I should cite the software, I should cite it irrespective of the author's wish, and - critically - if it's wrong to cite it, then I shouldn't cite it irrespective of the author's wish. If it's an edge case, then yes, it's reasonable to cite it if the author wants me to.

As an analogy, consider the very common case where a review comes back asking for 4 citations to somebody you strongly suspect to be the author of the review. This is often seen at best as an embarrassment or at worst as a form of mild scientific misconduct. In my view, software authors insisting on citation (as opposed to requesting it) is similar.

There is a long-running dispute in this area in the case of GNU Parallel, for example. The author insists on software citation for any paper that uses it, and explicitly asks people not to use it if they are not prepared to cite it. But there is no nuance: i.e. the author is not encouraging me to do so if it is right, but requiring me to do so if I use the software in a scientific paper.

Access to software: free vs commercial

The section talks about software that is “free” as well as “commercial” software. I am not sure whether this is about free as in freedom (or just gratis, i.e. freely available), since it is contrasted with commercial software, which is an unrelated distinction in general; see http://www.gnu.org/philosophy/words-to-avoid.html#Commercial

I suppose that “free” should be replaced by “gratis” and “commercial” be replaced by “non-free” in that section.

infographic for principles?

Should we create a graphic of some type that makes this more appealing to a wider audience?

Perhaps, in addition, create a few (3?) slides that people can use to talk about this.

"Software citations should permit ... access to the software itself"

Under the "Access" header, the data declaration states that:

"Data citations should facilitate access to the data themselves"

Under the same header, the software declaration states:

"Software citations should permit and facilitate access to the software itself "

The addition of "permit" suggests that software citations should also grant the user with permission to access the software. Is this intentional?

It doesn't seem like a good idea to make access a requirement for discovery, so "permit" might not be helpful in this sentence.

Related identifiers

While we often need to cite a specific version of software, e.g. a release, we also need a way to cite the software in general and to link multiple releases together. For this reason, we need more than one persistent identifier for each piece of software: a) a general one, and b) a specific one for each release.

This issue is similar to what we see with versions for data. One use case would be to link all versions of a piece of software with a Zenodo DOI together, and then also associate the stars and forks of the code repository.
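As a hedged illustration of this two-identifier pattern (my own sketch, not text from the draft), metadata for a single release could point back to a concept-level identifier using a DataCite-style related identifier; the DOIs below are placeholders.

```python
# Hypothetical metadata for one release, linking it to a concept-level
# identifier for "the software in general"; all DOIs are placeholders.
release_record = {
    "identifier": "10.0000/example.software.v1-2-0",  # this specific release
    "version": "1.2.0",
    "relatedIdentifiers": [
        {
            "relatedIdentifier": "10.0000/example.software",  # concept-level DOI
            "relatedIdentifierType": "DOI",
            "relationType": "IsVersionOf",  # DataCite-style relation type
        }
    ],
}

# The concept-level identifier can then resolve to the latest release (and list
# all releases), while each release identifier stays fixed for citation.
```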

Van de Sompel et al, 5 attributes important for software metadata records?

Not sure if this made it into the earlier list of relevant materials, but I think this article gives a good introduction to research objects:

Van de Sompel, H., Payette, S., Erickson, J., Lagoze, C., & Warner, S. (2004). Rethinking scholarly communication: Building the system that scholars deserve. D-Lib Magazine, 10. Retrieved from http://www.dlib.org/dlib/september04/vandesompel/09vandesompel.html

They mention the following processes for multiple kinds of scholarly communication (data, code, papers, etc.): registration, certification, awareness, archiving, and rewarding.

Do we want to ensure those 5 attributes are relayed within the metadata?

Confusion about Excel

It's a small point, but I find the text about Excel in 5.1 confusing. That is, in the same paragraph we are told to cite Excel, not to cite Excel, and that the two statements are consistent. I realise this is a parody of what is said, but some clarification might be helpful, even if it just means changing the example from Excel to something else in the storing-and-plotting-data example.

use case: data repository wants to link data, software, and papers in provenance trace

Domain and institutional data repositories have both data and software artifacts, and want to link these together in a provenance trace that can be cited. Sometimes the software is a separately identified artifact, but at other times software is included inside data packages, and the researcher wants to cite the combined product. See an example of a mixed data and software package (containing R code) here: https://knb.ecoinformatics.org/#view/doi:10.5063/F1Z899CZ

Granularity of the citation

One of the key issues with any citation, whether of a document, an individual, or software, is the specificity of what is being cited. In the case of publications, there is almost zero specificity most of the time.

It's very easy to cite an entire package even though only one function was used. Part of this problem is being addressed in the Python world by the duecredit project (https://github.com/duecredit/duecredit); a hedged usage sketch follows.
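For context, here is a sketch of the duecredit pattern referenced above: a library annotates the functions it wants credited, and citations accumulate only for code paths that are actually executed. The module name, DOI, and description are placeholders, and the exact API should be checked against the duecredit documentation.

```python
# Hypothetical module using duecredit-style annotations (placeholder values).
from duecredit import due, Doi

@due.dcite(Doi("10.0000/placeholder.doi"),        # placeholder DOI
           description="Core fitting routine",
           path="example_package.fitting")
def fit_model(data):
    """Callers that actually execute this function accumulate the citation."""
    return sum(data) / len(data)  # stand-in for the real computation
```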

Any citation should have the ability to specify more than just the obvious, but even the obvious would be a good starting point.

The citation/URL should therefore allow for greater specificity within a code base. In general, though, from a research perspective a provenance record of the workflow would be significantly more useful than a citation.

Principles should emphasize the need for a better way than citations to manage academic credit

The current wording risks leaving the take-home as "All our issues with citation, credit, and reproducibility of software can be adequately addressed within the current model of academic citation practices." We dismiss problems where the desire for credit conflicts with the desire to track provenance by saying "that's a problem for academic citation in general, so to the extent that citation still fulfils both roles for papers, it can do so for data as well."

I fear this misses the orders of magnitude difference between how these problems manifest in software vs how they are dealt with in papers. The quirks of citation practices which have been manageable in papers are exacerbated to a degree in which they may no longer be manageable.

For instance: citations to both software and papers can suffer from the 'wrapper problem' -- citing a review paper acknowledges the provenance of ideas but fails to allocate credit (citation count) to the originators. Likewise, citing a software client library acknowledges the provenance through its dependency on the server software system, but fails to transfer credit to it. The difference is one of scale -- a closely knit research community can self-police a glaring omission of credit if an author cites a textbook in place of a citation classic of the field. A reviewer is far less likely to be familiar with the original sources and underlying dependencies when they encounter a citation to a software wrapper around an existing algorithm or software system.

Both software and papers share a tension in citing for provenance and citing for credit, but software has this issue in spades. Provenance means fine-grained citation to particular version, credit means accumulating those citations against a single object. Thin wrappers around fundamental dependencies are commonplace. Authorship concepts are both more diverse and less governed by well-understood norms.

While we strive to offer practical guidelines that acknowledge the current incentive system of academic citation, a more modern system of assigning credit is sorely needed. It is not that academic software needs a separate system from academic papers, but that it underscores the need to overhaul the system of credit for both.


As discussed in the workshop, I'm working on a pull request along these lines, but comments & references welcome here.

Comment from Catherine Jones in chat

I have to go soon, so I wanted to make a general comment about the principles, section 2. The term "science/scientific" is used a lot through this section. Here in the UK this term has a meaning that restricts it to the physical sciences and excludes the social sciences, arts & humanities. The term "research" tends to be used in these circumstances. I believe that these principles apply to all domains, so maybe the wording should be reconsidered. I appreciate this may be a UK-only cultural issue. I was on a data policy committee where this was a very touchy issue.

Is the question of when to cite an entirely community-specific decision?

The current document appears to declare the question of what to cite/when to cite completely out of scope:

"The software citation principles do not define what software should be cited, but rather, how software should be cited."

The result appears to be to defer analysis to the differing scholarly communities. The declaration of "importance" suggests that software should be cited more often, but again seems to imply that practices will be community specific.

Given the focus of the document on reproducibility, might it be possible to specify some necessary conditions (not sufficient) for citation in the principles, e.g:

"When a software is used directly in the process of establishing a published claim, that software should be cited. "

Additional material and community actions

sciencecodemanifesto.org sets out principles including citation.

Sect. 4.1: there have been numerous workshops on reproducibility that have included software and data citation; the latest in the UK was the Alan Turing Institute Symposium on Reproducibility last week.

Sect. 4.2: the NIH report was also put out for public comment.

Sect. 4.3: other efforts on metadata standards include:

Sect. 5.6: we should mention RRID, which is a FORCE11 activity.

Additional considerations for contribution representation

Consider more advanced options to represent the relationship between people and various software contributions, e.g. in section 4.3.

  1. Models such as those evolving in openRIF (formerly VIVO-ISF) https://github.com/openrif, which represent contribution types and roles towards any contribution type, and are not dependent on authoring of journals (though this is not the only source of such models)
  2. Transitivity of contribution in packages that rely on one another and across versions. I quite like how this has been captured for various versions of data in the HCLS dataset description - you can have a summary level representation that includes all contributions to date, or a version level distribution that has only contributions to that version.
  3. Include reference to how software should be represented in CVs and biosketches to aid evaluation and review.

change figure 2

In Figure 2, change the "Requirements" label to "Basic Requirements".

discuss RRIDs?

@CaroleGoble wrote in #111
Sect. 5.6: we should mention RRID, which is a FORCE11 activity.

I'm separating this out, so the main part of that issue, on Related Work, can be assigned to Arfon

Reference lists

In my personal view, one of the shortcomings of the Joint Data Citation Principles is that they don't specifically mention that citations should go into reference lists. I am glad to see that the software citation principles mention reference lists. There might be a better place in the text for this, e.g. in an item 7 on interoperability, again similar to the Joint Data Citation Principles.

Additional references for section 4.3

  1. I think that, at least for historical reasons, we should mention DOAP. Although, afaik, Edd Dumbill does not pursue the project any more, it was, for a long time, almost the only game in town when it came to a more formal set of terms used in computer science (mainly in open source projects).
  2. More recent is the set of terms defined by schema.org, namely https://schema.org/SoftwareApplication. Given the importance schema.org has in the search space, this metadata set may be of great importance in practice (see the sketch below).
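For illustration only (a minimal sketch using schema.org terms; all values are placeholders), a SoftwareApplication description embedded as JSON-LD might look like this:

```python
import json

# Minimal, hypothetical schema.org/SoftwareApplication description
# (JSON-LD as commonly embedded in web pages); all values are placeholders.
software_application = {
    "@context": "https://schema.org",
    "@type": "SoftwareApplication",
    "name": "Example Analysis Tool",
    "softwareVersion": "1.2.0",
    "author": {"@type": "Person", "name": "Ada Example"},
    "url": "https://example.org/example-analysis-tool",
    "license": "https://spdx.org/licenses/MIT",
}

print(json.dumps(software_application, indent=2))
```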

Add prereq "use case" to narrative prior to use case table

Summary of discussion a little before noon on Sunday of the workshop:
Ensure there is an introduction to the "zero" use case in the narrative introduction to the use cases table. This prerequisite is that a "creator" has generated a piece of software that has metadata. Then the rest of the use cases follow, but we need to clarify that the software has been generated.

Citation styles

Citations in text follow the citation style being used. Two practical recommendations (which might already be work for the implementation group) are: a) include version information, and b) include a label to indicate that it is software, e.g. [Software].
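As a purely illustrative sketch (placeholder values, not a prescribed reference format), these two recommendations could be applied to a reference-list entry like so:

```python
# Hypothetical metadata; the formatted string shows where version information
# and the [Software] label would appear in a reference-list entry.
meta = {
    "author": "Example, A.",
    "year": 2016,
    "title": "Example Analysis Tool",
    "version": "1.2.0",
    "doi": "10.0000/placeholder",
}

reference = (f"{meta['author']} ({meta['year']}). {meta['title']} "
             f"(version {meta['version']}) [Software]. "
             f"https://doi.org/{meta['doi']}")
print(reference)
```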

Deletion of some text about how to cite.

I just opened and closed an issue because I may have misunderstood the text. So I am reopening this one but would still wish this text to be deleted:

"In addition, if the software authors ask that a paper should be cited, that should be respected."

It's not clear to me if the point being made is what paper to cite once a decision is made to cite some software.

I feel that sentence is still superfluous and could be deleted, but I feel much less strongly about this. The preceding and succeeding sentences surely cover every case? I.e. they clearly explain that if I should cite it, I should cite it, so I'm not sure what the "in addition" point is.

On the other hand this sentence could be read as implying that if I use software and the authors want me to cite it, then I should cite it. This is the point I was addressing earlier, to which I strongly objected. And which I would stand by if it was interpreted that way. In that case I would much more strongly urge deletion of that sentence, as per my now closed issue.

Recommended vs. required/minimal metadata for use cases

Based on comments from the 5 April call and some in the Use Cases Google Doc, there is some interest in differentiating, for each use case, between metadata that we see as required/minimal and metadata that we recommend.

My suggestion is that we use an open circle (LaTeX: \textopenbullet) for the "optional" recommended metadata.

There are already some suggestions:

  • @mfenner and @owlice suggested adding description/abstract/readme as recommended
  • @ljhwang suggested that license may become recommended (rather than required) for most

Since this may involve some discussion, perhaps rather than issuing PRs people can make additional suggestions or comments here.

Document which use cases are in scope and whether principles sufficient

The use case section appears to refer to use cases that are not within (or not entirely within) the scope of the recommendation. For example, "show how funded software has been used" seems to relate to citing a series of software, not a specific version. It is not clear that citing a series is in scope.

Recommend indicating for each use case (a) whether the use case is in the scope of the recommendations and (b) if in scope, whether the principles are necessary vs. sufficient for citation with respect to the use case.

Discussion items

Aspects that we should make some reference to in the discussion, even if it's just to rule them out of scope:

What should we say about "Software Papers"

I think a key unresolved question is how to address the practice of "software papers".

If a piece of software has a "software paper", should that be:

  1. cited on its own (superseding the software citation itself),
  2. cited in addition to the software citation itself,
  3. not cited; only cite the software itself (discourage software papers).

I'm not really sure, but I think my vote is for 2, although I acknowledge that this then creates two citations, exacerbating the "too many references" issue.

number of articles in table 1

Table 1 refers to 286 publications, but the 2nd paragraph of the section "Motivation" refers to the same source but regarding a random sample of 90 articles. Should it be 286 too?

Persistence of identifier vs. persistence of software

The persistence principle outlined in (4) is a key element in making software citable. Where software has become part of the record of science, not only should the identifier and metadata of the software be persistent; it should also be the goal to keep a persistent copy of the source code, where applicable. This links with the accessibility principle (5).

There are still many open questions about how to resolve package dependencies in the long term; therefore, I would not make persistent access to code a hard requirement, but would add something more specific about preserving the record of science.

new use case for funder?

As a funder, I want to measure the impact of the researchers I fund. This is a bit different from measuring the software itself; it might require a new line in the table, with the requirement "authors".

Line 440 re: inaccessible versions

For fast reference:

As stated in the Persistence principle (\ref{principle:persistence}), we recognize that the commercial software version may no longer be available, but it still should be cited along with information about how it was accessed.

Should this be limited to commercial software only? I can think of a few 'open' hydro models that don't maintain older versions.

Change of wording of Software Importance

(Sorry to be doing this late, I realise it would have been better to contribute earlier.)

I suggest a change of wording for Importance in section 1.

The current wording is "Software should be cited whenever and wherever a research product (such as a paper or derived software) relies upon it, specifically, as part of the standard reference list for that research product."

I feel the current text is too dogmatic about what software should be cited, and indeed contradicts the preamble text, which says "For example, in this section we do not define what software should be cited, but how it should be cited."

To emphasise the point that software should be treated the same as anything else, I would suggest a revision to something like:

Software should be cited on the same basis as any other research product such as a paper or book. That is, authors should see citing the appropriate set of software products as being as important as citing the appropriate set of papers. Software citations should not be separated and should be part of the standard reference list for that research product

Two typos, lines 301/401

Line 301: change "noteable" to "notable".

Line 401 to:

Understanding these chains of knowledge and credit have been part of the history of science field for some time, though more recent work is suggesting more nuanced evaluation of the credit chains~\cite{casrai-credit, transitive_credit_json-ld}.

(insert 'the' so it reads 'part of the history of science')
