
ts-xml-parser's Introduction

ts-xml-parser

A better XML parser written in pure TypeScript that works well with both Node.js and Deno.

Import to your project

For Node.js

Install it first:

# note the package name: 'fsp-xml-parser'
npm install fsp-xml-parser
# or
yarn add fsp-xml-parser

Then import it:

// CommonJS
const { parse } = require('fsp-xml-parser')
// ES Module
// Node.js cannot run this line directly for now; you need a bundler (such as webpack or parcel).
// If you use TypeScript, this is the right way.
import { parse } from 'fsp-xml-parser'

For Deno

// remote import in Deno
import parse from "https://denopkg.com/FullStackPlayer/ts-xml-parser/mod.ts"
// latest update: now you can import from deno.land
import parse from "https://deno.land/x/ts_xml_parser/mod.ts"
// local import in Deno
import parse from "path/to/parser.ts"

Usage

Simple:

let xml = `
<?xml version="1.0" encoding="utf-8" ?> 
<tagA></tagA>
`
let parsed = parse(xml)
// parsed:
// {
//    "declaration": {
//        "attributes": {
//            "version": "1.0",
//            "encoding": "utf-8"
//        }
//    },
//    "root": {
//        "name": "tagA"
//    }
// }

Namespace:

let xml = `
<?xml version="1.0" encoding="utf-8" ?> 
<propfind xmlns="DAV:" xmlns:R="RES:">
    <R:allprop/>
</propfind>
`
let parsed = parse(xml, true)    // true: prefix each tag name with its namespace
// parsed:
// {
//     "declaration": {
//         "attributes": {
//             "version": "1.0",
//             "encoding": "utf-8"
//         }
//     },
//     "root": {
//         "name": "DAV:propfind",
//         "attributes": {
//             "xmlns": "DAV:",
//             "xmlns:R": "RES:"
//         },
//         "children": [
//             {
//                 "name": "RES:allprop"
//             }
//         ]
//     }
// }
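
To make the prefixing rule concrete, here is a minimal sketch of how a tag name could be resolved against the xmlns declarations in scope. The helpers `extendScope` and `qualifyName` are hypothetical names introduced for illustration; they are not part of fsp-xml-parser and may differ from how the library does it internally.

```typescript
// Hypothetical helpers, not part of fsp-xml-parser.
// A scope maps a prefix ("" for the default namespace) to its namespace value.
type NamespaceScope = Record<string, string>;

// Build the scope for an element from its attributes, on top of the parent scope.
function extendScope(
  parent: NamespaceScope,
  attributes: Record<string, string> = {},
): NamespaceScope {
  const scope: NamespaceScope = { ...parent };
  for (const [key, value] of Object.entries(attributes)) {
    if (key === "xmlns") scope[""] = value; // default namespace
    else if (key.startsWith("xmlns:")) scope[key.slice(6)] = value; // prefixed namespace
  }
  return scope;
}

// Replace the prefix (or the missing prefix) with the namespace value in scope.
// Tags whose prefix is not declared are returned unchanged.
function qualifyName(tag: string, scope: NamespaceScope): string {
  const colon = tag.indexOf(":");
  const prefix = colon === -1 ? "" : tag.slice(0, colon);
  const local = colon === -1 ? tag : tag.slice(colon + 1);
  return prefix in scope ? scope[prefix] + local : tag;
}

const davScope = extendScope({}, { xmlns: "DAV:", "xmlns:R": "RES:" });
console.log(qualifyName("propfind", davScope));  // DAV:propfind
console.log(qualifyName("R:allprop", davScope)); // RES:allprop
```

Child elements would inherit the parent's scope by passing it as the first argument to `extendScope`.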

Content:

let xml = `
<?xml version="1.0" encoding="utf-8" ?> 
<tagA>
    abc<![CDATA[123一二三]]>
</tagA>
`
let parsed = parse(xml)
// parsed:
// {
//     "declaration": {
//         "attributes": {
//             "version": "1.0",
//             "encoding": "utf-8"
//         }
//     },
//     "root": {
//         "name": "tagA",
//         "content": "abc<![CDATA[123一二三]]>"
//     }
// }

Mixed Content (a node owns text content and child nodes at the same time):

let xml = `
<?xml version="1.0" encoding="utf-8" ?>
<father>
    I have a son named John<fullname>Johnson</fullname>.
</father>
`
let parsed = parse(xml)
// parsed:
// {
//     "declaration": {
//         "attributes": {
//             "version": "1.0",
//             "encoding": "utf-8"
//         }
//     },
//     "root": {
//         "name": "father",
//         "children": [
//             {
//                 "name": "fullname",
//                 "content": "Johnson"
//             }
//         ],
//         "content": "I have a son named John."
//     }
// }

Deep Structure:

let xml = `
<?xml version="1.0" encoding="utf-8" ?>
<China>
    <Henan></Henan>
    <Shandong>
        <Jinan alias="Quancheng">
            <Lixia />
            <Tianqiao>
                There is a big train station<station type="train">Tianqiao Station</station>.
            </Tianqiao>
        </Jinan>
    </Shandong>
</China>
`
let parsed = parse(xml)
// parsed:
// {
//     "declaration": {
//         "attributes": {
//             "version": "1.0",
//             "encoding": "utf-8"
//         }
//     },
//     "root": {
//         "name": "China",
//         "children": [
//             {
//                 "name": "Henan"
//             },
//             {
//                 "name": "Shandong",
//                 "children": [
//                     {
//                         "name": "Jinan",
//                         "attributes": {
//                             "alias": "Quancheng"
//                         },
//                         "children": [
//                             {
//                                 "name": "Lixia"
//                             },
//                             {
//                                 "name": "Tianqiao",
//                                 "children": [
//                                     {
//                                         "name": "station",
//                                         "attributes": {
//                                             "type": "train"
//                                         },
//                                         "content": "Tianqiao Station"
//                                     }
//                                 ],
//                                 "content": "There is a big train station."
//                             }
//                         ]
//                     }
//                 ]
//             }
//         ]
//     }
// }
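
Once parsed, a deep tree like the one above is usually walked recursively. The sketch below assumes a node shape (`XmlNode`) inferred from the sample outputs in this README; the interface and the `findAll` helper are illustrative assumptions, not types or functions exported by the library.

```typescript
// Assumed node shape, inferred from the sample output above.
interface XmlNode {
  name: string;
  attributes?: Record<string, string>;
  children?: XmlNode[];
  content?: string;
}

// Depth-first search that collects every node with the given tag name.
function findAll(node: XmlNode, tagName: string): XmlNode[] {
  const matches: XmlNode[] = node.name === tagName ? [node] : [];
  for (const child of node.children ?? []) {
    matches.push(...findAll(child, tagName));
  }
  return matches;
}

// A trimmed copy of the "Deep Structure" result, written out by hand.
const china: XmlNode = {
  name: "China",
  children: [
    { name: "Henan" },
    {
      name: "Shandong",
      children: [
        {
          name: "Jinan",
          attributes: { alias: "Quancheng" },
          children: [
            { name: "Lixia" },
            {
              name: "Tianqiao",
              children: [
                {
                  name: "station",
                  attributes: { type: "train" },
                  content: "Tianqiao Station",
                },
              ],
              content: "There is a big train station.",
            },
          ],
        },
      ],
    },
  ],
};

const stations = findAll(china, "station");
console.log(stations.map((s) => s.content)); // [ "Tianqiao Station" ]
```

With the real parser, `china` would be replaced by `parsed.root` after checking that it is defined.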

ATTENTION

  • A single \ inside <![CDATA[]]> is treated as an escape character and dropped. If you need a literal backslash, write \\ instead.

  • <![CDATA[]]> sections cannot be nested inside a node's content. If you really need that, encode the inner <![CDATA[]]> first; the receiving side must then decode the content as well.
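
Both workarounds can be applied before the XML string is built. The following is a minimal sketch: `escapeBackslashes`, `encodeInner`, and `decodeInner` are hypothetical helper names, and percent-encoding via `encodeURIComponent` is only one possible encoding chosen for illustration; any scheme both sides agree on would do.

```typescript
// Hypothetical helpers, not part of fsp-xml-parser.

// Double every backslash so that a literal "\" survives parsing.
function escapeBackslashes(text: string): string {
  return text.replace(/\\/g, "\\\\");
}

// Percent-encode an inner CDATA section so that no "<![CDATA[" or "]]>"
// remains inside the outer one. (Assumption: percent-encoding is acceptable
// to the receiver.)
function encodeInner(text: string): string {
  return encodeURIComponent(text);
}

// The receiving side reverses the encoding.
function decodeInner(text: string): string {
  return decodeURIComponent(text);
}

const innerCdata = "<![CDATA[1 < 2]]>";
const encodedInner = encodeInner(innerCdata);
const windowsPath = escapeBackslashes("C:\\temp");
const safeXml = `<tagA><![CDATA[${windowsPath} ${encodedInner}]]></tagA>`;
console.log(safeXml);
```

The receiver would call `decodeInner` on the encoded part of the node content to get the original inner section back.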

Enjoy Yourself!


ts-xml-parser's Issues

XmlNode.content returns a wrong string if the node contains a child tag before or after any text

These tests fail:

Deno.test("Mixed Content, tag at the beginning", () => {
  let xml = `
    <?xml version="1.0" encoding="utf-8" ?>
    <father>
        <span>I</span> have a son named John<fullname>Johnson</fullname>.
    </father>
    `;
  let parsed = parse(xml);
  assertEquals(parsed.root?.children, [
    {
      name: "span",
      content: "I",
    },
    {
      name: "fullname",
      content: "Johnson",
    },
  ]);
  assertEquals(parsed.root?.content, "I have a son named John."); // produces "have a son named John."
});
Deno.test("Mixed Content, tag at the end", () => {
  let xml = `
    <?xml version="1.0" encoding="utf-8" ?>
    <father>
        I have a son named John<fullname>Johnson</fullname><span>.</span>
    </father>
    `;
  let parsed = parse(xml);
  assertEquals(parsed.root?.children, [
    {
      name: "fullname",
      content: "Johnson",
    },
    {
      name: "span",
      content: ".",
    },
  ]);
  assertEquals(parsed.root?.content, "I have a son named John."); // produces "I have a son named John"
});

Bug: wrong comment tokenizing

The approach this library uses is totally wrong.

// Remove leading/trailing whitespace and comments
xml = xml.trim()
xml = xml.replace(/<!--[\s\S]*?-->/g, "")

For a minimal test case like below,

<a attr="<!--not a comment-->"></a>

The expected result is

{ name: "a", attributes: { attr: "<!--not a comment-->" } }

However, the actual result is

{ name: "a", attributes: { attr: "" } }
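
One way to avoid this is to strip comments with a small scanner that tracks whether it is inside a tag or a quoted attribute value, instead of a global regex. The sketch below is an illustration of that idea only: `stripComments` is a hypothetical function, not the library's code and not a complete XML tokenizer (for example, it does not handle DOCTYPE internal subsets).

```typescript
// Hypothetical replacement for the regex-based comment removal.
// Removes <!-- ... --> only when it appears outside tags and CDATA sections.
function stripComments(xml: string): string {
  let out = "";
  let i = 0;
  let inTag = false;           // between "<" and ">" of a tag
  let quote: string | null = null; // active quote character inside a tag

  while (i < xml.length) {
    const ch = xml.charAt(i);
    if (quote !== null) {
      // Inside an attribute value: copy everything until the closing quote.
      if (ch === quote) quote = null;
      out += ch;
      i++;
    } else if (inTag) {
      if (ch === '"' || ch === "'") quote = ch;
      else if (ch === ">") inTag = false;
      out += ch;
      i++;
    } else if (xml.startsWith("<!--", i)) {
      // A real comment: skip it (an unterminated comment runs to the end).
      const end = xml.indexOf("-->", i + 4);
      i = end === -1 ? xml.length : end + 3;
    } else if (xml.startsWith("<![CDATA[", i)) {
      // CDATA is copied verbatim, including anything that looks like a comment.
      const end = xml.indexOf("]]>", i + 9);
      const stop = end === -1 ? xml.length : end + 3;
      out += xml.slice(i, stop);
      i = stop;
    } else {
      if (ch === "<") inTag = true;
      out += ch;
      i++;
    }
  }
  return out;
}

console.log(stripComments('<a attr="<!--not a comment-->"></a>'));
console.log(stripComments("<a><!-- real comment --><b/></a>"));
```

The first call leaves the attribute value untouched, while the second removes the comment between the tags.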

Fails with rare characters

Hi!

thanks for making this library. This one and https://deno.land/x/[email protected] are faster than the most common Deno xml parser. I found a bug though. Parsing

console.log(parse('<row a="^"/>'))

outputs

{ declaration: undefined, root: undefined }

So values containing a ^ (and who knows what else) are not parsed properly.

Another short note: the Deno import in your README is wrong; it should be import { parse }, not import parse.
