where does a segment end? (ecmarkdown, closed)

tc39 commented on May 25, 2024
where does a segment end?

from ecmarkdown.

Comments (19)

domenic commented on May 25, 2024

Part of #26, certainly.

bterlson commented on May 25, 2024

Segment will be renamed to Fragment. Partial EBNF, if it helps...

Document = Paragraph , [ { "\n\n" , Paragraph  } ] ;

Paragraph = OrderedList | UnorderedList | NonList ;

OrderedList = OrderedListItem , [ { "\n" , OrderedListItem | UnorderedListItem } ] ;

UnorderedList = UnorderedListItem , [ { "\n" , OrderedListItem | UnorderedListItem } ] ;

OrderedListItem = { white space } , { digit } , "." , space , Fragment ;

UnorderedListItem = { white space } , "*" , space , Fragment ;

NonList = Fragment ;

Fragment = { StarFormat | UnderscoreFormat | TildeFormat
         | TickFormat | PipeFormat | Text } ;
         // could consider renaming to output element

StarFormat = "*" , Text , "*" ;
// With some complex restrictions on surrounding context to match gmd semantics

// and so on for each format type except TickFormat, which has no context
// restrictions per gmd semantics.
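
A minimal Python sketch of how the top-level productions above might be read. It is not the ecmarkdown parser: the names `split_paragraphs` and `classify_paragraph` and both regular expressions are assumptions made for illustration. It also requires at least one digit in an ordered item, which is stricter than the `{ digit }` in the EBNF.

```python
import re

# Assumed reading of the EBNF: "white space" is tab or space,
# "space" is exactly one literal space after the list marker.
ORDERED_ITEM = re.compile(r"^[ \t]*[0-9]+\. ")
UNORDERED_ITEM = re.compile(r"^[ \t]*\* ")


def split_paragraphs(document: str) -> list[str]:
    # Document: paragraphs separated by two line breaks.
    return document.split("\n\n")


def classify_paragraph(paragraph: str) -> str:
    # Paragraph = OrderedList | UnorderedList | NonList,
    # decided here by looking only at the first line.
    first_line = paragraph.split("\n", 1)[0]
    if ORDERED_ITEM.match(first_line):
        return "OrderedList"
    if UNORDERED_ITEM.match(first_line):
        return "UnorderedList"
    return "NonList"


if __name__ == "__main__":
    doc = "1. foo\n2. bar\n\n* baz\n\nThis is a plain paragraph\nthat I split into two lines."
    for paragraph in split_paragraphs(doc):
        print(classify_paragraph(paragraph), repr(paragraph))
```

Because the unordered marker must be followed by a space, a paragraph that begins with a star format such as `*NaN*` is classified as a NonList in this sketch.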

jmdyck commented on May 25, 2024

> Part of #26, certainly.

Ah, sorry, didn't notice #26, or I might have raised this there.

> EBNF

Is there a difference between "white space" and "space"?

Does a multi-line block of code fit into that grammar?

bterlson commented on May 25, 2024

White space is tab or space. Space is just a space.

Text matches everything other than two linebreaks so as long as parts of the code don't match any formatting rules it would work. But if you're asking about code fences, those are not supported. Should they be?

jmdyck commented on May 25, 2024

> Text matches everything other than two linebreaks so as long as parts of the code don't match any formatting rules it would work.

Backing up a bit, is Text one or more 'literal' characters with zero or more FooFragments embedded? So, e.g., the following Document is 3 Paragraphs, each of which is a NonList, which is a Fragment, which is a Text?:

This is a plain paragraph.

This is a plain paragraph
that I split into two lines.

The value of `Number.NaN` is *NaN*.

So consider the example:

Here is some code: `
    if (true) {
        debugger;
    }`
The effect is implementation-defined.

I think you're saying that:
(a) this constitutes a Text (and thus a Paragraph), and
(b) the embedded code-sample is a TickFragment, and so will be converted to a <code> element,
so it "works" to that extent. But you're also saying that if I write

In the following code: `
    if (true) {
        _v_ = 3;
    }`
the variable `_v_` is assigned the value 3.

then both occurrences of _v_ will be converted to <var>v</var>, which might not be what I intended. (I'm not sure what you mean by "code fences", but I'm guessing it's something that would prevent this recursive conversion within a code-sample.)

What I'm wondering about is the indentation and line-breaks within (and around) the code-sample. Are they preserved in the content of the <code> element? (In which case, how they appear when rendered depends on the <code> element's 'white-space' styling?)

bterlson commented on May 25, 2024

> I think you're saying that:
> (a) this constitutes a Text (and thus a Paragraph), and
> (b) the embedded code-sample is a TickFragment, and so will be converted to a <code> element,

Yes.

> But you're also saying that if I write
>
> In the following code: `
>     if (true) {
>         _v_ = 3;
>     }`
> the variable `_v_` is assigned the value 3.
>
> then both occurrences of _v_ will be converted to <var>v</var>, which might not be what I intended.

That is what I said, but I recall now that we disallow nesting of formats, for lack of a reason to allow it as MD does.

> What I'm wondering about is the indentation and line-breaks within (and around) the code-sample. Are they preserved in the content of the <code> element?

Yes, white space (tabs + space) and single line breaks (\n) are part of the code element. Double line breaks start a new paragraph (leaving the tick unclosed which is instead parsed as Text). But when rendered as HTML the line breaks won't show up in the output (need a pre tag for that).
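
The behaviour described above can be sketched roughly as follows, in Python. This is an illustration under assumptions, not the actual parser: `parse_ticks` and `parse_document` are hypothetical names, only tick formats are handled, and every other format is ignored.

```python
def parse_ticks(paragraph: str) -> list[tuple[str, str]]:
    # Split one paragraph into ("text", ...) and ("code", ...) pieces.
    pieces = []
    pos = 0
    while pos < len(paragraph):
        start = paragraph.find("`", pos)
        end = paragraph.find("`", start + 1) if start != -1 else -1
        if start == -1 or end == -1:
            # No tick left, or an unclosed tick: the rest is plain Text.
            pieces.append(("text", paragraph[pos:]))
            break
        if start > pos:
            pieces.append(("text", paragraph[pos:start]))
        # Tabs, spaces and single line breaks are kept verbatim in the code piece.
        pieces.append(("code", paragraph[start + 1:end]))
        pos = end + 1
    return pieces


def parse_document(document: str) -> list[list[tuple[str, str]]]:
    # Paragraphs are split first, so a tick can never span a double line break.
    return [parse_ticks(paragraph) for paragraph in document.split("\n\n")]
```

With this sketch, the indented code sample from earlier in the thread ends up as a single "code" piece with its indentation and line breaks intact, while a tick that is opened in one paragraph and closed in the next is left as literal text in both.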

Code fences are for blocks of code where all whitespace is respected. Github Flavored Markdown uses paragraphs starting and ending with ```. This could emit <pre> elements which emu currently syntax highlights.

bterlson commented on May 25, 2024

Updated the EBNF to reflect non-nesting.

jmdyck commented on May 25, 2024

> ... I recall now that we disallow nesting of formats for lack of a reason to allow them as in MD.
> ... Updated the EBNF to reflect non-nesting.

I see the change to the EBNF, but it's not clear that this disallows nesting. Recall my question:

> is Text one or more 'literal' characters with zero or more FooFragments embedded?

It seems like it would have to be, otherwise how can you parse

The value of `Number.NaN` is *NaN*.

as a Paragraph (specifically, a NonList, i.e., a Fragment)? But if that were the definition of Text, then StarFragment (etc) would still allow nested FooFragments.

If your intent is that Text is literal characters only, then you need to change the EBNF some more. E.g., you could say:

Fragment = { StarFragment | ... | Text } ;

but that violates expectations of how things are named. Instead, I'd suggest something like:

Fragment = { Span } ;
Span = StarSpan | UnderscoreSpan | TildeSpan | StringSpan | TickSpan | PipeSpan | Text ;

Or, if you like the current Fragment production, you can keep it, but you'd need to introduce

Something = { Fragment }

and then change current uses of Fragment to Something. (Personally, for Something, I'd be inclined to pick Paragraph, and change current occurrences of Paragraph to Block. After all, a list isn't really a paragraph, but it is a block, in the HTML layout sense.)
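
To make the non-nesting reading concrete, here is a small Python sketch in which a Fragment is a flat sequence of spans and the inside of every span is literal characters only. The function name `parse_fragment`, the group names, and the regular expression are assumptions for illustration. The context restrictions mentioned with the EBNF are not modelled, and only the five delimiters from the Fragment production are covered.

```python
import re

# Hypothetical span kinds keyed by delimiter. The contents of a span are
# literal characters only, so formats cannot nest.
SPAN = re.compile(
    r"\*(?P<star>[^*]+)\*"
    r"|_(?P<underscore>[^_]+)_"
    r"|~(?P<tilde>[^~]+)~"
    r"|`(?P<tick>[^`]+)`"
    r"|\|(?P<pipe>[^|]+)\|"
)


def parse_fragment(source: str) -> list[tuple[str, str]]:
    # Fragment = { Span }, where Span is one of the formats above or Text.
    spans = []
    pos = 0
    for match in SPAN.finditer(source):
        if match.start() > pos:
            spans.append(("text", source[pos:match.start()]))
        kind = match.lastgroup
        spans.append((kind, match.group(kind)))
        pos = match.end()
    if pos < len(source):
        spans.append(("text", source[pos:]))
    return spans
```

Under these assumptions, the sentence about `Number.NaN` parses as text, tick, text, star, text, and an underscore pair inside a tick span stays literal instead of becoming a nested format.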

jmdyck commented on May 25, 2024

In the EBNF, it appears that an OrderedList cannot contain an UnorderedList (as in, e.g., B.2.3.2.1 / step 5.d) or vice versa (as in, e.g., 14.5.1 / group 1 / bullet 1). Is this intentional?

(I note that there aren't any tests for UnorderedLists at all, let alone for combinations of ordered and unordered lists, so I can't infer intent that way.)

bterlson commented on May 25, 2024

> In the EBNF, it appears that an OrderedList cannot contain an UnorderedList (as in, e.g., B.2.3.2.1 / step 5.d) or vice versa (as in, e.g., 14.5.1 / group 1 / bullet 1). Is this intentional?

It was, but you raise a good point! I'll fix it... although I think those examples are kinda gross :)

> (I note that there aren't any tests for UnorderedLists at all, let alone for combinations of ordered and unordered lists, so I can't infer intent that way.)

UnorderedList is not yet implemented! Current EMD only supports numbered lists. Bulleted lists are on my "list" for today, although I'm finding the refactorings required to be more difficult than expected.

Re: your valid points about the current EBNF, I think Fragment = { StarFragment | ... | Text } ; was my intent which seems fine. I'm not sure I follow why the naming is bad - because some fragments repeat (the top-level one) while the inner fragments don't? I'm ok with this because if we decided to allow nesting (a simple one-line change in the parser) then the naming would make sense. Consider it a "static semantics" in the ES sense that you can't nest :-P

Having much fun discussing this EBNF :-D I made it on a whim; glad I did.

bterlson commented on May 25, 2024

Updated EBNF in an attempt to say that an UnorderedList starts with an UnorderedListItem and then has any number of Unordered or Ordered list items. Found that GMD is pretty lax, e.g., check out the following:

1. foo
* bar
* baz
  1. foo
  2. bar
  3. baz
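
A rough Python sketch of how a mixed list like the one above could be read line by line under the updated EBNF. The `ListItem` type, the `parse_list_items` function, and the regular expression are hypothetical. Indentation is recorded as a raw character count (a tab counts as one), and no attempt is made to build the nested list structure.

```python
import re
from typing import NamedTuple

# Leading white space, then either digits plus "." or "*", then one space.
ITEM = re.compile(r"^(?P<indent>[ \t]*)(?:(?P<number>[0-9]+)\.|\*) (?P<text>.*)$")


class ListItem(NamedTuple):
    indent: int    # number of leading white-space characters
    ordered: bool  # True for "1." style items, False for "*" items
    text: str


def parse_list_items(paragraph: str) -> list[ListItem]:
    items = []
    for line in paragraph.split("\n"):
        match = ITEM.match(line)
        if match is None:
            raise ValueError(f"not a list item: {line!r}")
        items.append(
            ListItem(
                indent=len(match.group("indent")),
                ordered=match.group("number") is not None,
                text=match.group("text"),
            )
        )
    return items
```

Applied to the six-line example, this yields one ordered item and two unordered items at indent 0, followed by three ordered items at indent 2. A real implementation would still have to decide how indentation maps to nesting.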

jmdyck commented on May 25, 2024

> although I think those examples are kinda gross :)

Here are all the examples I can find in the ES spec. Maybe some are less gross.

OL-containing-UL:

  • 11.9.1 / item 1
  • 22.1.3.24 / alg 3 / step 1
  • B.2.3.2.1 / step 5.d
  • B.3.3 / items 1,2,3

UL-containing-OL:

  • 6.1.7.3 / [[DefineOwnProperty]] / bullet 2
  • 14.5.1 / group 1 / bullet 1

jmdyck commented on May 25, 2024

> I think Fragment = { StarFragment | ... | Text } ; was my intent which seems fine. I'm not sure I follow why the naming is bad

(a) Putting an alternation (A | B | ...) within a repetition {...} means that the grammar provides no word for (the concept represented by) the alternation. E.g., If you say "A Fragment is a sequence of one or more ...", then there's no concise way to end the sentence. You could make up a word, but it isn't supplied by the grammar.

(b) If you do want to make up a word, the one that's suggested by the nonterminal names is already taken. I.e., based on the names, StarFragment and TickFragment etc sound like they're kinds of Fragment, but that name is already in use.

It would be as if ECMAScript said

Statement = { BlockStatement | VariableStatement | ... }

instead of

Statement = BlockStatement | VariableStatement | ...

bterlson commented on May 25, 2024

I see what you're saying I think, but isn't it enough to say that "A Fragment is a sequence of one or more format fragments and Text fragments"?

jmdyck commented on May 25, 2024

If you say that, then you're using "fragment" to mean something distinct from Fragment, which can lead to confusion.

bterlson commented on May 25, 2024

I see. Rename "format fragment"/TickFragment to simply "format"/TickFormat?

jmdyck commented on May 25, 2024

Yup, that'd work. (I suggested Span rather than Format, but it's about the same.)

bterlson commented on May 25, 2024

Format already matches the RD parser function's name so I'm a little biased to avoid refactoring :)

bterlson commented on May 25, 2024

29386d5
