Metanorma-standoc

Gem for serialising the Metanorma Standoc model.

Functionality

This gem processes Metanorma AsciiDoc input into the Metanorma document model. This gem provides underlying generic functionality; behaviour specific to each standards class is refined in the gem specific to that standards class (e.g. Metanorma ISO).

The following outputs are generated:

  • Metanorma semantic XML

  • Metanorma presentation XML

These Metanorma XML representations are processed downstream by the isodoc gem to generate other end deliverables, depending on each standards class.

The following input formats are supported:

  • Metanorma AsciiDoc

See the Metanorma website for more information.

Note
AsciiMathML is used for mathematical formatting. The gem uses the Ruby AsciiMath parser, which is syntactically stricter than the common MathJax processor; if you do not get the expected results, try bracketing terms in your AsciiMathML expressions.

Installation

See the Metanorma website for instructions.

In the terminal:

$ gem install metanorma-standoc
$ gem install metanorma-cli

metanorma-cli is the command-line interface for the Metanorma suite (it provides the metanorma executable).

Documentation

See the Metanorma website for details.

metanorma-standoc's People

Contributors

opoudjis, camobap, ronaldtse, w00lf, andrew2net, petertky, intelligent2013, hassanakbar, webdev778, ribose-jeffreylau, github-actions[bot], abunashir, alexeymorozov, manuelfuenmayor, zoras

metanorma-standoc's Issues

Getting latexmlmath to work in travis

I can't get it to work: the Travis test involving latexmlmath for metanorma-standoc is failing. I have made Travis do an apt-get of latexml, but that does not seem to have any impact. I am trying to test whether latexmlmath is installed at all, but I can't get anything meaningful to output to stderr or stdout!

@CAMOBAP795 ... You seem to know what you're doing with testing latexml; could you help here?

Add option to generate smart quotes

Smart quotes are not currently being generated in the XML, HTML, DOC output, though that is the default of native asciidoctor. That is presumably because the metanorma asciidoctor conversion code is preempting the operation of some Asciidoctor substitutions, though "--" is being converted to em-dash fine.

Do we want the option of smart quotes? And which default value do we want, on or off?

Bibliographic date for "revision" is outside of Bibdata structure

In Metanorma XML we have the revision date outside of the <bibdata> structure:

<version>
  <edition>1</edition>
  <revision-date>2018-09-10T00:00:00Z</revision-date>
</version>

This makes it impossible to know the revision date (and other dates) of a document from the Relaton file; we need the Metanorma file to do that.

Should this information be available within <bibdata>?

Stray whitespaces all over XML in new version

These stray whitespaces should not exist. While they don't affect the eventual presentation output, they pollute the canonical format, and trigger lots of highlighting by text editors (and GitHub).

The key point is that they did not exist in a very recent version; this issue is a regression.

Document attributes:
[screenshot: Screen Shot 2019-03-16 at 7 50 16 PM]

Version (one of the examples):
[screenshot: Screen Shot 2019-03-16 at 7 50 02 PM]

Actual text:
[screenshot: Screen Shot 2019-03-16 at 7 50 31 PM]

link remains harmful

This is something of a Heisenbug, hard to reproduce, but:

https://www.iso.org/directives[www.iso.org/directives] on the first iteration of processing links is being parsed properly into <link target="https://www.iso.org/directives"> www.iso.org/directives</link>. However for reasons I cannot reproduce or understand, on the second iteration of Nokogiri processing, when the XML is serialised, it is ending up as <link target="https://www.iso.org/directives"/>www.iso.org/directives.

There is something, likely in Nokogiri, which is getting stuck on link and mis-serialising it. I have had previous such problems with the same element name (metanorma/metanorma-iso#187). I will have to put in a workaround of processing links in Nokogiri as a different element name (using elink makes the problem go away), and then convert it back to link on cleanup.
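
The rename-and-restore workaround described above can be sketched as follows. This is only a text-level stand-in under stated assumptions: the real fix would rename elements inside the Nokogiri document, whereas this sketch uses plain string substitution so that it runs without any gem. The helper name with_links_renamed is hypothetical.

```ruby
# Hypothetical helper: rename <link> to <elink> for the duration of a
# processing step, then restore the original element name on cleanup.
def with_links_renamed(xml)
  # Match "<link" or "</link" only when followed by whitespace, "/" or ">",
  # so that longer element names starting with "link" are left alone.
  renamed = xml.gsub(%r{<(/?)link(?=[\s/>])}, '<\1elink')
  processed = yield renamed
  processed.gsub(%r{<(/?)elink(?=[\s/>])}, '<\1link')
end

xml = '<p><link target="https://www.iso.org/directives">www.iso.org/directives</link></p>'
seen = nil
restored = with_links_renamed(xml) { |x| seen = x; x }
```

Inside the block the document only contains elink elements; the caller gets link elements back.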

Read document-class from Document preamble

We should define a "document preamble" in which we can encode metadata of the Metanorma document.

Currently, we require the following command to build a CSD document:

metanorma -t csd -x html csd-my-document.adoc

The problem happens when there are multiple document classes in the same directory; a Makefile is unable to differentiate the different documents, for example, an ISO document vs a CSD document.

The solution is to keep the -t and -x commands inside the document itself (as part of its "metanorma directives"), so that an iso-document.adoc and csd-document.adoc can create their own type of documents.

For example, this document is built using the ISO class processor.

= My ISO document
:mn-document-class: iso
:mn-output-extensions: html,xml,pdf

.Foreword
...
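
A minimal sketch of how a build wrapper might read these directives and assemble the command line. The attribute names :mn-document-class: and :mn-output-extensions: and the -t and -x options come from the proposal above; the helper names and the rule that the header ends at the first blank line are assumptions made for illustration.

```ruby
# Hypothetical helper: collect the Metanorma directives from the document header.
def metanorma_directives(adoc)
  header = adoc.split(/\n\s*\n/, 2).first.to_s   # assume header ends at first blank line
  attrs = header.scan(/^:([\w-]+):[ \t]*(.*)$/).to_h
  {
    type: attrs["mn-document-class"],
    extensions: attrs.fetch("mn-output-extensions", "").split(",").map(&:strip),
  }
end

# Hypothetical helper: build the argument list, or nil if no class is declared.
def metanorma_command(filename, adoc)
  d = metanorma_directives(adoc)
  return nil if d[:type].nil? || d[:type].empty?

  cmd = ["metanorma", "-t", d[:type]]
  cmd += ["-x", d[:extensions].join(",")] unless d[:extensions].empty?
  cmd << filename
end

doc = "= My ISO document\n:mn-document-class: iso\n:mn-output-extensions: html,xml,pdf\n\n.Foreword\n...\n"
command = metanorma_command("iso-document.adoc", doc)
```

With this, a Makefile rule could call the same wrapper for every .adoc file in a directory, regardless of its document class.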

Deal with PlantUML filenames

Vasil's PlantUML diagrams specify the filename of the PlantUML, using a feature added in 2017 as described in http://forum.plantuml.net/5483/please-specify-filename-%40startuml-extension-automatically?show=5483#q5483

True to their OpenSource provenance, PlantUML have not documented this behaviour. The current macro assumes that it is enforcing the filename for PlantUML; it isn't if the filename is defined within the PlantUML, so it is not finding the generated graphics.

Modify macro behaviour to use the specified PlantUML filename if defined.
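
One way the modified behaviour could look, as a sketch. It assumes the filename is given as plain text after @startuml on the same line, which is how the diagrams in question appear to use the feature; the function name and the fallback argument are hypothetical.

```ruby
# Hypothetical helper: choose the basename of the generated graphic.
# Uses the name on the @startuml line if present, else the macro's own name.
def plantuml_basename(source, fallback)
  m = source.match(/^@startuml[ \t]+(\S.*?)[ \t]*$/)
  name = m && m[1]
  return fallback if name.nil? || name.empty?

  # Drop surrounding quotes and any extension the author may have typed.
  File.basename(name.delete('"'), ".*")
end

named   = plantuml_basename("@startuml sequence1\nA -> B\n@enduml\n", "_generated")
unnamed = plantuml_basename("@startuml\nA -> B\n@enduml\n", "_generated")
```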

Crash when `title-intro-en` is empty

When :title-intro-en: is set to empty in the document header, it crashes:

bundler: failed to load command: metanorma (/Users/myself/.rbenv/versions/2.4.3/lib/ruby/gems/2.4.0/bin/metanorma)
NoMethodError: undefined method `source' for nil:NilClass
  /Users/myself/.rbenv/versions/2.4.3/lib/ruby/gems/2.4.0/gems/metanorma-standoc-1.0.9/lib/asciidoctor/standoc/front.rb:150:in `asciidoc_sub'
  /Users/myself/.rbenv/versions/2.4.3/lib/ruby/gems/2.4.0/gems/metanorma-iso-1.0.9/lib/asciidoctor/iso/front.rb:116:in `block in title_intro'

Make validation optional

For huge documents, such as NIST SP 800-53r4, validation potentially adds many minutes to execution time. @ronaldtse: Should we make validation of documents optional?

Promote keyword, pre to standoc, isodoc

Most metanorma gems define them identically; they were left out of the base stack because ISO does not use them. It is fine for ISO alone not to implement them.

Regression of processing characters within `< >`

I have a document that contains text like this: <星>. These were rendered correctly (with asciidoctor 1.5.7.1, metanorma-iso 1.0.6) before.

When I did a bundle update (asciidoctor 1.5.8, metanorma-iso 1.0.11), all of this text disappeared.

This does not seem to be an asciidoctor problem because there was still no output when I manually required the older asciidoctor version (1.5.7.1).

This text <星> is rendered as completely missing in the XML.

Resume ordered list numbering?

Currently Metanorma does not support resumption of numbering between two discontinuous ordered lists. (As it happens, neither does Asciidoctor!) But it is likely a requirement for ISO standards; the big list in ISO-8601-2 Clause 11.2, for example, now presupposes it.

Resuming numbering of ordered lists is easy in Word. (Much too easy, in fact, as it is one of the many routine breakages in Word.) Resuming numbering of ordered lists in HTML requires either CSS counters (https://stackoverflow.com/questions/4615500/how-to-start-a-new-list-continuing-the-numbering-from-the-previous-list), or liberal use of the start attribute (we generate the HTML, so we can keep track of the numbering.)

Metanorma can therefore implement resumption of ordered list numbering. Should it?

If it does, a new attribute will need to be added to ordered lists in the Metanorma model ("resume"). Isodoc will need to count the number of entries in the preceding list, and use the start attribute. If Word does not pay attention to the start attribute (and I suspect it won't, given how lists have been done there), the Word output will need to use the undocumented list numbering feature of list styles (which is currently used in html2doc, not isodoc, so code will have to be refactored).

@ronaldtse Is this necessary functionality? Given how fragile Word is about list numbering, I'm not encouraging this.
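
The counting step for the HTML start attribute can be sketched as below. The list representation (a hash with :items and an optional :resume flag) is a hypothetical model, not the Metanorma one.

```ruby
# Hypothetical helper: compute the start value for each ordered list.
# A list flagged :resume continues from the end of the preceding list.
def assign_list_starts(lists)
  next_start = 1
  lists.map do |list|
    start = list[:resume] ? next_start : 1
    next_start = start + list[:items].size
    list.merge(start: start)
  end
end

lists = [
  { items: %w[a b c] },
  { items: %w[d e], resume: true },
  { items: %w[f] },
]
starts = assign_list_starts(lists).map { |l| l[:start] }
```

Here the second list resumes at 4, and the third list, which is not flagged, restarts at 1.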

Use "=" for title if “:title:” document attribute is omitted

Originally we used a separate :title-xxx: because ISO documents have a multi-part title.

In document classes that don't have a multi-part title, like CSD, OGC or RFC, the normal Asciidoctor = title suffices.

This issue is to allow reading the :title: attribute from the Asciidoc = line for document classes that don't need multi-part titles.

Beware that we will still need to support multi-lingual titles. This should work:

= English title
:title-fr: titre francais

As should this:

= Blah
:title: English title
:title-fr: titre francais
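
A sketch of the fallback rule that both examples rely on. It assumes an unsuffixed :title: means the English title, and that the = line is used only when no English title attribute is given; the function name is hypothetical.

```ruby
# Hypothetical helper: collect titles by language from the document header.
def document_titles(adoc)
  header = adoc.split(/\n\s*\n/, 2).first.to_s
  # :title: is treated as English (an assumption); :title-xx: as language xx.
  titles = header.scan(/^:title(?:-([a-z]{2}))?:[ \t]*(.*)$/)
                 .to_h { |lang, text| [lang || "en", text.strip] }
  heading = header[/^=[ \t]+(.+)$/, 1]
  titles["en"] ||= heading.strip if heading
  titles
end

first  = document_titles("= English title\n:title-fr: titre francais\n")
second = document_titles("= Blah\n:title: English title\n:title-fr: titre francais\n")
```

Both inputs yield the same English and French titles; in the second one the = line is ignored because :title: is present.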

Consider whether we support multiple languages of text within the same document

Some standardization documents need to be presented in two languages. A prime example is ISO Guide 73, presented side by side in two columns in English and French.

ITU documents can be in 6 official languages, and for various reasons it is useful to have them side by side allowing comparison.

There are also translations of Chinese standards into English (you can find such official submissions online, e.g. by SAC to ISO), and there are Chinese standards in English and German (under the official bilateral standards recognition programs China-US and China-Germany).

We should support a single ADoc for multiple languages as well.

This is a generic task for StanDoc.

rename source to uri

Relaton and Metanorma have diverged in how they name URIs of documents: Relaton uses uri element, Metanorma uses source. Change metanorma-standoc, isodoc, and its inheritors to use uri.

Permit assets to be excluded from numbering

Rare, but some assets may occasionally need to be excluded from the numbering sequence for that asset. For example, in NIST SP 800-53r4, the table included in the example under 2.2 is not meant to be labelled and numbered as a table of the document.
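
As a sketch of the intended behaviour, numbering could skip any asset carrying an opt-out flag. The :unnumbered key is a hypothetical name for that flag.

```ruby
# Hypothetical helper: number assets in sequence, skipping flagged ones.
def number_assets(assets)
  n = 0
  assets.map do |asset|
    asset[:unnumbered] ? asset.merge(number: nil) : asset.merge(number: (n += 1))
  end
end

tables = [{ id: "t1" }, { id: "t2", unnumbered: true }, { id: "t3" }]
numbers = number_assets(tables).map { |t| t[:number] }
```

The excluded table gets no number, and the table after it continues the sequence without a gap.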

Positioning of examples

In ISO 8601-1 3.2.1, there are examples mid-clause and in the list within the clause. The mid-clause examples are authored to appear first, but appear at the end.

Machine-readable requirements

There are two approaches to machine-readable requirements:

  1. declarative
  2. in-prose (aka textual content tagging).

A declarative requirement is a data structure, where you specify:

  • subject: who is the requirement addressed to?
  • obligation: is this a "permission", "recommendation", or "requirement"?
      • MUST/SHALL, SHOULD, or CAN?
  • qualitative or quantitative?
      • if quantitative, what are the measurement targets?
  • object: what is the object of the requirement?
  • verification steps: what steps do we take to verify the requirement has been met?

In text, this may look like:

[requirement,type=recommendation,subject="The user",obligation=must,object="be logged in"]

or

[requirement, type=recommendation]
* The user must be logged in

In the ideal world all requirements are stated this way in the XML and also in rendered output (e.g. in a table form).

However, in the real-world many requirements are embedded in content text (the narrative). We don’t have the authority to re-format all of that text into the above format (and the myriad authors probably won’t want to do so either).

So the other approach is "tagging", similar to the general approach of TEI. For example, "The user must be logged in" can be represented in text as:

[subject]#The user# [obligation]#must# [object]#be logged in#.

The difference between these two approaches is both in the input syntax and the output rendering. Their representation in XML however, is identical.
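
To illustrate that both input syntaxes can land in one structure, here is a minimal sketch that parses each form into the same record. The Requirement struct and both parser names are hypothetical, and the regular expressions cover only the two example lines shown above.

```ruby
# Hypothetical common model for a requirement.
Requirement = Struct.new(:type, :subject, :obligation, :object, keyword_init: true)

# Declarative form: key=value pairs in a block attribute list.
def parse_declarative(line)
  attrs = line.scan(/(\w+)=(?:"([^"]*)"|([^,\]]+))/)
              .to_h { |key, quoted, bare| [key.to_sym, quoted || bare] }
  Requirement.new(**attrs.slice(:type, :subject, :obligation, :object))
end

# In-prose form: [role]#text# spans inside running text.
def parse_tagged(text, type: nil)
  attrs = text.scan(/\[(\w+)\]#([^#]*)#/).to_h { |key, value| [key.to_sym, value] }
  Requirement.new(type: type, **attrs.slice(:subject, :obligation, :object))
end

declarative = parse_declarative(
  '[requirement,type=recommendation,subject="The user",obligation=must,object="be logged in"]'
)
tagged = parse_tagged(
  "[subject]#The user# [obligation]#must# [object]#be logged in#.",
  type: "recommendation"
)
```

The two results compare equal, which is the property the XML representation is meant to have.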

NOTE: In IETF XML, a tag is used to mark obligation keywords, such as MUST NOT.

Hierarchical requirements, multiple requirements should be allowed.

It seems that we can model a requirement in AsciiDoc as a Section, a List or a Paragraph?

Allow admonition titles

Admonition titles are already permitted in UNECE. Admonitions have titles in NIST 800 53r4, and the nature of the admonition as tip/caution/etc is already handled as an attribute, not a title in Metanorma XML. Titles should be permitted in Metanorma admonitions in general, and the generic title TIP/CAUTION/etc used only when a title is not already supplied.
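
The proposed rule is small enough to state as code. This is a sketch only; the function name is hypothetical and the generic label is simply the admonition type in upper case.

```ruby
# Hypothetical helper: prefer the author's title, else the generic label.
def admonition_title(type, title = nil)
  (title.nil? || title.strip.empty?) ? type.to_s.upcase : title
end

generic  = admonition_title(:tip)
supplied = admonition_title(:caution, "Handling of keys")
```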

Allow case variation for "where", "key" match

Standoc expects definition lists after a formula to be introduced by where on its own paragraph. Allow the text to be in any case, and to be followed by punctuation.

Same goes for the use of "key" to introduce definition lists for figures.
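
A sketch of the relaxed match, assuming the introducing paragraph contains nothing but the keyword, optional trailing punctuation, and whitespace. The function name is hypothetical.

```ruby
# Hypothetical helper: does this paragraph introduce a definition list?
# Case-insensitive, and tolerant of trailing punctuation such as ":" or ",".
def introduces_definitions?(paragraph, keyword)
  /\A\s*#{Regexp.escape(keyword)}\s*[[:punct:]]*\s*\z/i.match?(paragraph)
end

checks = [
  introduces_definitions?("where", "where"),
  introduces_definitions?("Where:", "where"),
  introduces_definitions?("KEY", "key"),
  introduces_definitions?("where x is the length", "where"),
]
```

The first three paragraphs match; the last does not, because it carries text beyond the keyword.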

Extracted source code with filenames

@opoudjis given that we can extract source code now (#42), it might make sense to have the input document also provide an output filename for extraction.

Currently the extracted source code has no file extension. We probably should allow the user to enter a filename for each piece of source code.

In fact, this should be the approach for MRR as well.

Numbering of footnotes in section titles

The footnote processing code inline_footnote() assumes that instance variable @fn_number has already been initialised in the document call to init, and increments it. But it turns out that section titles are being processed right out of Asciidoctor::Inline.convert, before document is called. So @fn_number is not yet initialised, and the code crashes.
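
One defensive option is to initialise the counter lazily in both places. The class below is a hypothetical stand-in for the converter; only the names @fn_number, init and inline_footnote come from the description above, and the real methods take arguments that are omitted here.

```ruby
# Hypothetical stand-in for the converter's footnote numbering.
class FootnoteNumbering
  def init
    @fn_number ||= 0   # keep any count accumulated before init ran
  end

  def inline_footnote
    @fn_number ||= 0   # section titles may reach here before init
    @fn_number += 1
  end
end

numbering = FootnoteNumbering.new
in_title = numbering.inline_footnote   # called before init, as for section titles
numbering.init
in_body = numbering.inline_footnote
```

Using ||= in init matters: a plain assignment there would reset the count and give the first body footnote the same number as the one in the title.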

`<sourcecode>` block must indicate type of language

In CSD, this input:

[source,java]
----
public class Fibonacci {
    public static long fibonacci(int n) {
        if (n <= 1) return n;
        else return fibonacci(n-1) + fibonacci(n-2);
    }

    public static void main(String[] args) {
        int n = Integer.parseInt(args[0]);
        for (int i = 1; i <= n; i++)
            StdOut.println(i + ": " + fibonacci(i));
    }
}
----

Was rendered as this XML:

<sourcecode id="_efd291ff-701e-4c5d-b922-98018fce71a0">public class Fibonacci { 
    public static long fibonacci(int n) { 
        if (n &lt;= 1) return n; 
        else return fibonacci(n-1) + fibonacci(n-2); 
    } 
 
    public static void main(String[] args) { 
        int n = Integer.parseInt(args[0]); 
        for (int i = 1; i &lt;= n; i++) 
            StdOut.println(i + ": " + fibonacci(i)); 
    } 
}</sourcecode>

In the HTML, we should provide rendering for it too. Word rendering may be a separate task.

URGENT: rfc-editor.org is down and all documents that use ietfbib are not compilable

From Thomas (and I also encountered this):

it stopped working as www.rfc-editor.org:443 is not available anymore and the domain went into an domain selling phase ☹

make clean all
rm -f csd-vpoll.xml csd-vpoll.pdf csd-vpoll.html
if [ "x" == "1x" ]; then bundle; fi
FILENAME=csd-vpoll.adoc; \
       echo "Compiling via docker..."; docker run -v "$(pwd)":/metanorma/ ribose/metanorma "metanorma -t csd -x xml,pdf,html $FILENAME"
Compiling via docker...
asciidoctor: WARNING: gem 'concurrent-ruby' is not installed. This gem is recommended when registering custom converters.
[metanorma] detecting backends:
[metanorma] processor "rfc2" registered
[metanorma] processor "rfc3" registered
[metanorma] processor "standoc" registered
[metanorma] processor "iso" registered
/usr/lib/ruby/2.5.0/net/http.rb:939:in `rescue in block in connect': Failed to open TCP connection to www.rfc-editor.org:443 (Connection refused - connect(2) for "www.rfc-editor.org" port 443) (Errno::ECONNREFUSED)

Point is, bib fetch failure should not cause the document to not compile.
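
A sketch of the requested behaviour: wrap the fetch so that a network failure produces a warning and a nil result instead of aborting compilation. The helper name, the list of rescued error classes, and the block-based interface are assumptions; the actual fetch call is left to the caller.

```ruby
require "socket"   # defines SocketError
require "timeout"  # defines Timeout::Error

# Errors treated as "the bibliography source is unreachable" (an assumed list).
NETWORK_ERRORS = [SocketError, SystemCallError, Timeout::Error, IOError].freeze

# Hypothetical helper: run the fetch block, returning nil on network failure
# so the caller can keep the citation unresolved and carry on compiling.
def fetch_or_skip(code)
  yield code
rescue *NETWORK_ERRORS => e
  warn "[bib] could not fetch #{code}: #{e.class}: #{e.message}"
  nil
end

failed = fetch_or_skip("RFC 5545") { raise Errno::ECONNREFUSED }
found  = fetch_or_skip("rfc 5545") { |c| c.upcase }
```

Errno::ECONNREFUSED, the error in the log above, is a SystemCallError, so it is caught; errors outside the list still propagate.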
