syntax-tree / mdast-util-gfm-autolink-literal

mdast extension to parse and serialize GFM autolink literals

Home Page: https://unifiedjs.com

License: MIT License

JavaScript 100.00%
unist mdast mdast-util gfm autolink url

mdast-util-gfm-autolink-literal's Introduction

mdast-util-gfm-autolink-literal

mdast extensions to parse and serialize GFM autolink literals.

What is this?

This package contains two extensions that add support for GFM autolink literals syntax in markdown to mdast. These extensions plug into mdast-util-from-markdown (to support parsing GFM autolinks in markdown into a syntax tree) and mdast-util-to-markdown (to support serializing GFM autolinks in syntax trees to markdown).

GitHub employs different algorithms to autolink: one at parse time and one at transform time (similar to how @mentions are done at transform time). This difference can be observed because character references and escapes are handled differently. But also because issues/PRs/comments omit (perhaps by accident?) the second algorithm for www., http://, and https:// links (but not for email links).

As the corresponding micromark extension micromark-extension-gfm-autolink-literal is a syntax extension, it can only perform the first algorithm. The tree extension gfmAutolinkLiteralFromMarkdown from this package can perform the second algorithm, and as they are combined, both are done.
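
To make the parse-time versus transform-time distinction more concrete, here is a minimal sketch of what a transform-time pass could look like: it walks an already parsed tree and splits text nodes around `www.` hostnames. The helper names `splitWww` and `linkifyTree` and the regular expression are assumptions made for illustration. This is not the implementation used by this package, which handles more cases (protocols, emails, trailing punctuation).

```javascript
// Hypothetical, simplified pattern: `www.` followed by dot-separated labels.
const wwwPattern = /www\.[a-z0-9-]+(?:\.[a-z0-9-]+)*/gi

// Split one text value into a list of text and link nodes.
function splitWww(value) {
  const nodes = []
  let last = 0
  for (const match of value.matchAll(wwwPattern)) {
    if (match.index > last) {
      nodes.push({type: 'text', value: value.slice(last, match.index)})
    }
    nodes.push({
      type: 'link',
      title: null,
      url: 'http://' + match[0],
      children: [{type: 'text', value: match[0]}]
    })
    last = match.index + match[0].length
  }
  if (last < value.length) {
    nodes.push({type: 'text', value: value.slice(last)})
  }
  return nodes
}

// Walk the tree and replace text nodes; existing links are left alone.
function linkifyTree(node) {
  if (!node.children || node.type === 'link') return node
  node.children = node.children.flatMap((child) =>
    child.type === 'text' ? splitWww(child.value) : [linkifyTree(child)]
  )
  return node
}

const sketchTree = {
  type: 'root',
  children: [
    {
      type: 'paragraph',
      children: [{type: 'text', value: 'See www.example.com, ok'}]
    }
  ]
}

console.log(JSON.stringify(linkifyTree(sketchTree), null, 2))
```

Because such a pass only sees text nodes of the finished tree, it runs after character references and escapes have been resolved, which is one way the two algorithms can end up disagreeing.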

When to use this

You can use these extensions when you are working with mdast-util-from-markdown and mdast-util-to-markdown already.

When working with mdast-util-from-markdown, you must combine this package with micromark-extension-gfm-autolink-literal.

When you don’t need a syntax tree, you can use micromark directly with micromark-extension-gfm-autolink-literal.

When you are working with syntax trees and want all of GFM, use mdast-util-gfm instead.

All these packages are used in remark-gfm, which focuses on making it easier to transform content by abstracting these internals away.

This utility does not handle how markdown is turned to HTML. That’s done by mdast-util-to-hast.

Install

This package is ESM only. In Node.js (version 16+), install with npm:

npm install mdast-util-gfm-autolink-literal

In Deno with esm.sh:

import {gfmAutolinkLiteralFromMarkdown, gfmAutolinkLiteralToMarkdown} from 'https://esm.sh/mdast-util-gfm-autolink-literal@2'

In browsers with esm.sh:

<script type="module">
  import {gfmAutolinkLiteralFromMarkdown, gfmAutolinkLiteralToMarkdown} from 'https://esm.sh/mdast-util-gfm-autolink-literal@2?bundle'
</script>

Use

Say our document example.md contains:

www.example.com, https://example.com, and [email protected].

…and our module example.js looks as follows:

import fs from 'node:fs/promises'
import {gfmAutolinkLiteral} from 'micromark-extension-gfm-autolink-literal'
import {fromMarkdown} from 'mdast-util-from-markdown'
import {
  gfmAutolinkLiteralFromMarkdown,
  gfmAutolinkLiteralToMarkdown
} from 'mdast-util-gfm-autolink-literal'
import {toMarkdown} from 'mdast-util-to-markdown'

const doc = await fs.readFile('example.md')

const tree = fromMarkdown(doc, {
  extensions: [gfmAutolinkLiteral()],
  mdastExtensions: [gfmAutolinkLiteralFromMarkdown()]
})

console.log(tree)

const out = toMarkdown(tree, {extensions: [gfmAutolinkLiteralToMarkdown()]})

console.log(out)

…now running node example.js yields (positional info removed for brevity):

{
  type: 'root',
  children: [
    {
      type: 'paragraph',
      children: [
        {
          type: 'link',
          title: null,
          url: 'http://www.example.com',
          children: [{type: 'text', value: 'www.example.com'}]
        },
        {type: 'text', value: ', '},
        {
          type: 'link',
          title: null,
          url: 'https://example.com',
          children: [{type: 'text', value: 'https://example.com'}]
        },
        {type: 'text', value: ', and '},
        {
          type: 'link',
          title: null,
          url: 'mailto:[email protected]',
          children: [{type: 'text', value: '[email protected]'}]
        },
        {type: 'text', value: '.'}
      ]
    }
  ]
}
[www.example.com](http://www.example.com), <https://example.com>, and <[email protected]>.
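
Note how the serialized line above writes the `www.` link as `[text](url)` but the other two with angle brackets. A plausible reading, sketched below under assumptions, is that a link can only use the `<...>` form when its text matches its URL (optionally minus a `mailto:` prefix); for `www.example.com` the URL gained an `http://` prefix, so the text no longer matches. The helper name `canBeAngleAutolink` and the address `user@example.org` are hypothetical, and the actual rules in mdast-util-to-markdown are more detailed.

```javascript
// Simplified, assumed rule: a link can be written as `<...>` when its only
// child is a text node whose value equals the URL, or the URL without a
// leading `mailto:`.
function canBeAngleAutolink(node) {
  if (node.title || node.children.length !== 1) return false
  const child = node.children[0]
  if (child.type !== 'text') return false
  return child.value === node.url || 'mailto:' + child.value === node.url
}

const wwwLink = {
  type: 'link',
  title: null,
  url: 'http://www.example.com',
  children: [{type: 'text', value: 'www.example.com'}]
}

const httpsLink = {
  type: 'link',
  title: null,
  url: 'https://example.com',
  children: [{type: 'text', value: 'https://example.com'}]
}

// Hypothetical address, used only for illustration.
const emailLink = {
  type: 'link',
  title: null,
  url: 'mailto:user@example.org',
  children: [{type: 'text', value: 'user@example.org'}]
}

console.log(canBeAngleAutolink(wwwLink)) // false
console.log(canBeAngleAutolink(httpsLink)) // true
console.log(canBeAngleAutolink(emailLink)) // true
```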

API

This package exports the identifiers gfmAutolinkLiteralFromMarkdown and gfmAutolinkLiteralToMarkdown. There is no default export.

gfmAutolinkLiteralFromMarkdown()

Create an extension for mdast-util-from-markdown to enable GFM autolink literals in markdown.

Returns

Extension for mdast-util-from-markdown to enable GFM autolink literals (FromMarkdownExtension).

gfmAutolinkLiteralToMarkdown()

Create an extension for mdast-util-to-markdown to enable GFM autolink literals in markdown.

Returns

Extension for mdast-util-to-markdown to enable GFM autolink literals (ToMarkdownExtension).

HTML

This utility does not handle how markdown is turned to HTML. That’s done by mdast-util-to-hast.

Syntax

See Syntax in micromark-extension-gfm-autolink-literal.

Syntax tree

There are no interfaces added to mdast by this utility, as it reuses the existing Link interface.
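
Since autolink literals end up as ordinary link nodes, code that consumes the tree does not need a special case for them. A minimal sketch, assuming a hypothetical helper `collectLinks` and a hand-written tree in the shape shown under Use (only two links are reproduced):

```javascript
// Hypothetical helper: gather the `url` of every link node, depth first.
function collectLinks(node, urls = []) {
  if (node.type === 'link') urls.push(node.url)
  for (const child of node.children || []) collectLinks(child, urls)
  return urls
}

const sampleTree = {
  type: 'root',
  children: [
    {
      type: 'paragraph',
      children: [
        {
          type: 'link',
          title: null,
          url: 'http://www.example.com',
          children: [{type: 'text', value: 'www.example.com'}]
        },
        {type: 'text', value: ', '},
        {
          type: 'link',
          title: null,
          url: 'https://example.com',
          children: [{type: 'text', value: 'https://example.com'}]
        }
      ]
    }
  ]
}

console.log(collectLinks(sampleTree))
// [ 'http://www.example.com', 'https://example.com' ]
```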

Types

This package is fully typed with TypeScript. It does not export additional types.

The Link type of the mdast nodes is exposed from @types/mdast.

Compatibility

Projects maintained by the unified collective are compatible with maintained versions of Node.js.

When we cut a new major release, we drop support for unmaintained versions of Node. This means we try to keep the current release line, mdast-util-gfm-autolink-literal@^2, compatible with Node.js 16.

This utility works with mdast-util-from-markdown version 2+ and mdast-util-to-markdown version 2+.

Contribute

See contributing.md in syntax-tree/.github for ways to get started. See support.md for ways to get help.

This project has a code of conduct. By interacting with this repository, organization, or community you agree to abide by its terms.

License

MIT © Titus Wormer

mdast-util-gfm-autolink-literal's Issues

Links detected in raw broken markdown block compared to GitHub.com

Affected packages and versions

[email protected]

Link to runnable example

No response

Steps to reproduce

Consider this broken markdown string (extracted from our users' content):

\* \*\*Memory:\*\* \[Vengeance® Series 16GB \\(2x8GB\\) DDR4 SODIMM 2400MHz CL16 Memory Kit CMSX16GX4M2A2400C16](https&#x3A;//www.corsair.com/us/en/Categories/Products/Memory/Laptop-and-Notebook-Memory/Vengeance%C2%AE-Series-16GB-%282x8GB%29-DDR4-SODIMM-2400MHz-CL16-Memory-Kit/p/CMSX16GX4M2A2400C16#tab-overview)  
\* \*\*Disk:\*\* M.2 NVMe \[Samsung 970 EVO 250GB \\(MZ-V7S250B/AM\\)](https&#x3A;//www.samsung.com/us/computing/memory-storage/solid-state-drives/ssd-970-evo-plus-nvme-m-2-250gb-mz-v7s250b-am/) and M.2 SSD \[Transcend 64GB SATA III 6 Gb/s MTS800](https&#x3A;//www.transcend-info.com/Embedded/Products/No-803)

Using the following JS code:

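import {unified} from 'unified';
import remarkParse from 'remark-parse';
import remarkGfm from 'remark-gfm';

// `input` is assumed to hold the markdown string shown above.
const processor = unified()
    .use(remarkParse)
    .use(remarkGfm)
    .use(() => {
        return (tree) => {
            // Log the syntax tree after the GFM transforms have run.
            console.log(JSON.stringify(tree, null, 2));
            return tree;
        };
    });

await processor.run(processor.parse(input));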

it results in the following tree:

{
    "type": "root",
    "children": [
    {
        "type": "paragraph",
        "children": [
        {
            "type": "text",
            "value": "* **Memory:** [Vengeance® Series 16GB \\(2x8GB\\) DDR4 SODIMM 2400MHz CL16 Memory Kit CMSX16GX4M2A2400C16]("
        },
        {
            "type": "link",
            "title": null,
            "url": "https://www.corsair.com/us/en/Categories/Products/Memory/Laptop-and-Notebook-Memory/Vengeance%C2%AE-Series-16GB-%282x8GB%29-DDR4-SODIMM-2400MHz-CL16-Memory-Kit/p/CMSX16GX4M2A2400C16#tab-overview",
            "children": [
            {
                "type": "text",
                "value": "https://www.corsair.com/us/en/Categories/Products/Memory/Laptop-and-Notebook-Memory/Vengeance%C2%AE-Series-16GB-%282x8GB%29-DDR4-SODIMM-2400MHz-CL16-Memory-Kit/p/CMSX16GX4M2A2400C16#tab-overview"
            }
            ]
        },
        {
            "type": "text",
            "value": ")"
        },
        {
            "type": "break",
            "position": {
            "start": {
                "line": 1,
                "column": 314,
                "offset": 313
            },
            "end": {
                "line": 2,
                "column": 1,
                "offset": 316
            }
            }
        },
        {
            "type": "text",
            "value": "* **Disk:** M.2 NVMe [Samsung 970 EVO 250GB \\(MZ-V7S250B/AM\\)]("
        },
        {
            "type": "link",
            "title": null,
            "url": "https://www.samsung.com/us/computing/memory-storage/solid-state-drives/ssd-970-evo-plus-nvme-m-2-250gb-mz-v7s250b-am/",
            "children": [
            {
                "type": "text",
                "value": "https://www.samsung.com/us/computing/memory-storage/solid-state-drives/ssd-970-evo-plus-nvme-m-2-250gb-mz-v7s250b-am/"
            }
            ]
        },
        {
            "type": "text",
            "value": ")"
        },
        {
            "type": "text",
            "value": " and M.2 SSD [Transcend 64GB SATA III 6 Gb/s MTS800](https://www.transcend-info.com/Embedded/Products/No-803)"
        }
        ],
        "position": {
        "start": {
            "line": 1,
            "column": 1,
            "offset": 0
        },
        "end": {
            "line": 2,
            "column": 310,
            "offset": 625
        }
        }
    }
    ],
    "position": {
    "start": {
        "line": 1,
        "column": 1,
        "offset": 0
    },
    "end": {
        "line": 3,
        "column": 1,
        "offset": 626
    }
    }
}

Expected behavior

Instead, it should be parsed the same way GitHub does: as a plain text node.

Actual behavior

Links are detected in the content. It results in a tree containing links.
It also causes a bigger issue: round-tripping is not stable. For input > parsed > markdown > parsed2 > markdown2, the two serialized results should be equal, but here markdown !== markdown2.
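
The round-trip property mentioned here can be checked mechanically. The sketch below assumes a hypothetical helper `checkRoundTrip` that receives the parser and serializer as plain functions; the toy `toyParse` and `toySerialize` stand-ins exist only so the example runs on its own.

```javascript
// Hypothetical helper: `parse` turns markdown into a tree, `serialize`
// turns a tree back into markdown. The pipeline is stable when serializing
// a second time yields the same markdown as the first time.
function checkRoundTrip(input, parse, serialize) {
  const markdown = serialize(parse(input))
  const markdown2 = serialize(parse(markdown))
  return {stable: markdown === markdown2, markdown, markdown2}
}

// Toy stand-ins (not the real parser or serializer).
const toyParse = (value) => ({type: 'text', value: value.trim()})
const toySerialize = (tree) => tree.value + '\n'

const result = checkRoundTrip('  hello  ', toyParse, toySerialize)
console.log(result.stable) // true
```

In a real project, `parse` and `serialize` would wrap whatever pipeline is in use (for example the unified processor from this report), which makes it possible to turn a failing input into a regression test.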

Runtime

Node v14

Package manager

yarn v2

OS

macOS

Build and bundle tools

esbuild

hostname inside link text with formatting is wrongly autolinked

Subject of the issue

Consider the following markdown:

[**www.richardianson.com**](https://richardianson.com/)

which renders correctly on GitHub:
www.richardianson.com

but with mdast-util-gfm-autolink-literal, it parses to:

root[1] (1:1-1:56, 0-55)
└─0 paragraph[2] (1:1-1:56, 0-55)
    ├─0 text "[**" (1:1-1:4, 0-3)
    └─1 link[1] (1:4-1:56, 3-55)
        │ title: null
        │ url: "http://www.richardianson.com**](https://richardianson.com/)"
        └─0 text "www.richardianson.com**](https://richardianson.com/)" (1:4-1:56, 3-55)

instead of:

root[1] (1:1-1:56, 0-55)
└─0 paragraph[1] (1:1-1:56, 0-55)
    └─0 link[1] (1:1-1:56, 0-55)
        │ title: null
        │ url: "https://richardianson.com/"
        └─0 strong[1] (1:2-1:27, 1-26)
            └─0 text "www.richardianson.com" (1:4-1:25, 3-24)

Workaround

Escape the dots in the hostname:

[**www\.richardianson\.com**](https://richardianson.com/)
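
If link text is generated programmatically, the same workaround can be applied in code. A small sketch, assuming a hypothetical helper `escapeDots` that is only ever given a bare hostname:

```javascript
// Hypothetical helper: backslash-escape every dot in a hostname so it is
// not picked up as a `www.` autolink literal when used as link text.
function escapeDots(hostname) {
  return hostname.replace(/\./g, '\\.')
}

const label = escapeDots('www.richardianson.com')
const markdownLink = `[**${label}**](https://richardianson.com/)`

console.log(markdownLink)
// [**www\.richardianson\.com**](https://richardianson.com/)
```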

Parse `www.` won't return positions

Affected packages and versions

[email protected]

Link to runnable example

No response

Steps to reproduce

Run the following script:

import { fromMarkdown } from 'mdast-util-from-markdown';
import { gfmAutolinkLiteral } from 'micromark-extension-gfm-autolink-literal';
import { gfmAutolinkLiteralFromMarkdown } from 'mdast-util-gfm-autolink-literal';

function log(text) {
  let tree = fromMarkdown(text, {
    extensions: [gfmAutolinkLiteral],
    mdastExtensions: [gfmAutolinkLiteralFromMarkdown],
  });
  let tokens = tree.children[0].children;
  console.log(`parsed result for "${text}":`)
  console.dir(tokens, { depth: null });
}

log('www.example.com.');
log('www.');

This script will output the following content:

parsed result for "www.example.com.":
[
  {
    type: 'link',
    title: null,
    url: 'http://www.example.com',
    children: [
      {
        type: 'text',
        value: 'www.example.com',
        position: {
          start: { line: 1, column: 1, offset: 0 },
          end: { line: 1, column: 16, offset: 15 }
        }
      }
    ],
    position: {
      start: { line: 1, column: 1, offset: 0 },
      end: { line: 1, column: 16, offset: 15 }
    }
  },
  {
    type: 'text',
    value: '.',
    position: {
      start: { line: 1, column: 16, offset: 15 },
      end: { line: 1, column: 17, offset: 16 }
    }
  }
]
parsed result for "www.":
[
  {
    type: 'link',
    title: null,
    url: 'http://www',
    children: [ { type: 'text', value: 'www' } ]
  },
  { type: 'text', value: '.' }
]

Expected behavior

Parsing links should return text and link tokens that each have a position property.

Actual behavior

Parsing www. returned a link token and text tokens without a position property.
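
A quick way to spot this kind of problem is to walk the parsed nodes and list those that lack positional info. The sketch below assumes a hypothetical helper `findMissingPositions` and uses the `www.` result from the output above as input.

```javascript
// Hypothetical helper: return a path for every node without `position`.
function findMissingPositions(nodes, prefix = '', missing = []) {
  nodes.forEach((node, index) => {
    const path = `${prefix}[${index}]`
    if (!node.position) missing.push(`${path} (${node.type})`)
    if (node.children) {
      findMissingPositions(node.children, `${path}.children`, missing)
    }
  })
  return missing
}

// The tokens reported for the input `www.`.
const tokens = [
  {
    type: 'link',
    title: null,
    url: 'http://www',
    children: [{type: 'text', value: 'www'}]
  },
  {type: 'text', value: '.'}
]

console.log(findMissingPositions(tokens))
// [ '[0] (link)', '[0].children[0] (text)', '[1] (text)' ]
```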

Affected runtime and version

[email protected]

Affected package manager and version

No response

Affected OS and version

No response

Build and bundle tools

No response

Suggestion: Support For Mastodon Addresses

Problem

I would like to ask whether Mastodon addresses such as @[email protected] could be supported like their email counterparts. I'm currently working on finding out what the standard protocol is for Mastodon-based activity (e.g. the Mastodon equivalent of mailto). https://chat.alexisart.me/@alexis/110014274893325451

Given that no such protocol exists at the moment, this feature request cannot be implemented unless we go the route of looking up the HTTP URL via WebFinger.
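
For reference, the WebFinger route would start from a lookup URL derived from the address. The sketch below only builds that URL, following the `/.well-known/webfinger?resource=acct:...` convention of RFC 7033; the helper name `webfingerUrl` and the address `@alice@example.social` are hypothetical, and no network request is made.

```javascript
// Hypothetical helper: build a WebFinger lookup URL for `@user@host`.
// Returns `null` when the input does not look like such an address.
function webfingerUrl(address) {
  const match = /^@?([^@\s]+)@([^@\s]+)$/.exec(address)
  if (!match) return null
  const [, user, host] = match
  const resource = encodeURIComponent(`acct:${user}@${host}`)
  return `https://${host}/.well-known/webfinger?resource=${resource}`
}

console.log(webfingerUrl('@alice@example.social'))
// https://example.social/.well-known/webfinger?resource=acct%3Aalice%40example.social
```

Resolving the profile URL would still require fetching this document and reading its links, which is asynchronous work that a purely syntactic extension cannot do while parsing.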

Solution

Detect a Mastodon address and transform it into a usable link

Alternatives

Create the link manually
