yaidom's Introduction

Yaidom

Yaidom is a uniform XML query API, written in Scala. Equally important, yaidom provides several specific-purpose DOM-like tree implementations adhering to this XML query API. What's more, other implementations can easily be added.

Yaidom initially stood for yet another (Scala) immutable DOM-like API. Over time, the yaidom XML query API grew more important than any single DOM-like tree implementation, whether immutable or mutable. Currently, the name yaidom reflects the ease with which new DOM-like tree implementations can be added, while having them conform to the yaidom query API.

Why do we need yet another Scala (tree-oriented) XML library? After all, there are alternatives such as the standard Scala XML library. Still, yaidom has several nice properties that set it apart:

  • the uniform XML query API, playing well with the Scala Collections API (and leveraging it internally)
  • multiple (existing or future) specific-purpose DOM-like tree implementations, conforming to this XML query API
  • among them, a nice immutable "default" DOM-like tree implementation
  • precise and first-class namespace support
  • precision, clarity and minimality in its genes, at the expense of some (but not much) verbosity and lack of XPath support
  • acceptance of some XML realities, such as the peculiarities of "XML equality" and whitespace in XML
  • and therefore no attempt to abstract JAXP away, but instead leveraging it for parsing and serialization
  • easy conversions between several element implementations
  • support for so-called yaidom dialects, using abstract query API traits for the backing elements and therefore supporting multiple XML backends
  • a scope mainly limited to basic namespace-aware XML processing, and therefore not offering any XSD and DTD support
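
The idea of one query API shared by several element implementations can be sketched roughly as follows. This is a minimal illustration, not yaidom's actual code: the names QueryApi, SimpleElem, DomElem, countByName and parseDom are hypothetical, and only the Scala and Java standard libraries are used.

```scala
import java.io.StringReader
import javax.xml.parsers.DocumentBuilderFactory
import org.xml.sax.InputSource

object QueryApiSketch {

  // Hypothetical query API: a backend supplies localName and findAllChildElems,
  // and inherits the remaining query methods.
  trait QueryApi[E <: QueryApi[E]] { self: E =>
    def localName: String
    def findAllChildElems: Vector[E]

    // This element followed by all its descendant elements, in document order.
    def findAllElemsOrSelf: Vector[E] =
      self +: findAllChildElems.flatMap(_.findAllElemsOrSelf)

    def filterElemsOrSelf(p: E => Boolean): Vector[E] =
      findAllElemsOrSelf.filter(p)
  }

  // Backend 1: an immutable case class tree.
  final case class SimpleElem(localName: String, children: Vector[SimpleElem] = Vector.empty)
      extends QueryApi[SimpleElem] {
    def findAllChildElems: Vector[SimpleElem] = children
  }

  // Backend 2: a thin wrapper around a (mutable) JAXP DOM element.
  final class DomElem(val wrapped: org.w3c.dom.Element) extends QueryApi[DomElem] {
    def localName: String = wrapped.getLocalName
    def findAllChildElems: Vector[DomElem] = {
      val nodes = wrapped.getChildNodes
      (0 until nodes.getLength).toVector
        .map(i => nodes.item(i))
        .collect { case e: org.w3c.dom.Element => new DomElem(e) }
    }
  }

  // Query code written once against the API, usable with either backend.
  def countByName[E <: QueryApi[E]](root: E, name: String): Int =
    root.filterElemsOrSelf(_.localName == name).size

  // Parses a string into the DOM-backed implementation.
  def parseDom(xml: String): DomElem = {
    val dbf = DocumentBuilderFactory.newInstance()
    dbf.setNamespaceAware(true) // needed for getLocalName to return a non-null value
    val doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)))
    new DomElem(doc.getDocumentElement)
  }
}
```

Adding a third backend in this sketch only requires implementing localName and findAllChildElems; countByName works unchanged.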

Usage

Yaidom versions can be found in the Maven central repository. Assuming version 1.13.0, yaidom can be added as a dependency as follows (in an SBT or Maven build):

SBT:

libraryDependencies += "eu.cdevreeze.yaidom" %%% "yaidom" % "1.13.0"

Maven2:

<dependency>
  <groupId>eu.cdevreeze.yaidom</groupId>
  <artifactId>yaidom_3</artifactId>
  <version>1.13.0</version>
</dependency>

Note that yaidom itself has a few dependencies, which will be transitive dependencies in projects that use yaidom. Yaidom has been cross-built for several Scala versions, leading to artifactIds referring to different Scala (binary) versions.

One transitive dependency is Saxon-HE (9.9). If Saxon-EE is used in combination with yaidom, the Saxon-HE dependency must be explicitly excluded!
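
In an SBT build, the exclusion could look roughly like this. It is a sketch that assumes the transitive Saxon-HE artifact is published under organization net.sf.saxon and name Saxon-HE; check the resolved dependency tree of the yaidom version in use before relying on it. The Saxon-EE dependency itself is not shown.

```scala
// build.sbt (sketch): depend on yaidom but keep its transitive Saxon-HE off the classpath
libraryDependencies += ("eu.cdevreeze.yaidom" %%% "yaidom" % "1.13.0")
  .exclude("net.sf.saxon", "Saxon-HE")
```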

Yaidom (1.8.X and later) requires Java version 1.8 or later!

yaidom's Issues

yaidom-0001 Elem.toTreeRepr inefficient

Method Elem.toTreeRepr (and therefore Elem.toString) takes far too much memory. In a REPL session, when parsing/creating a very large Document, the (implicit) call of method toString easily leads to an OutOfMemoryError.

Artifacts for Scala 2.11

Please publish artifacts built for Scala 2.11 final. We sorely lack an updated alternative to scala-xml :)

yaidom-0004 removeAllInterElementWhitespace doesn't work when some of the children are comments

When using the following XML in a Scala worksheet:

import java.io.ByteArrayInputStream

import eu.cdevreeze.yaidom.parse.DocumentParserUsingSax

val xml = """<?xml version="1.0" encoding="UTF-8"?>
<rootElem>
  <listing>
    <item />
    <item />
    <!-- some comment -->
  </listing>
</rootElem>
"""

val inputStream = new ByteArrayInputStream(xml.getBytes("UTF-8")) // match the encoding declared in the XML
val parser = DocumentParserUsingSax.newInstance()
val requestDoc = parser.parse(inputStream)
val documentElementWithoutWhitespace = requestDoc.documentElement.removeAllInterElementWhitespace

The content of documentElementWithoutWhitespace still contains whitespace-only text nodes around the item elements. The whitespace around listing is removed correctly.

As shown by this textual representation:

before whitespace removal:

documentElementWithWhitespace: eu.cdevreeze.yaidom.simple.Elem = elem(
  qname = QName("rootElem"),
  children = Vector(
    text("""
  """),
    elem(
      qname = QName("listing"),
      children = Vector(
        text("""
    """),
        elem(
          qname = QName("item")
        ),
        text("""
    """),
        elem(
          qname = QName("item")
        ),
        text("""
    """),
        comment(""" some comment """),
        text("""
  """)
      )
    ),
    text("""
""")
  )
)

after (incorrect) whitespace removal:

documentElementWithoutWhitespace: eu.cdevreeze.yaidom.simple.Elem = elem(
  qname = QName("rootElem"),
  children = Vector(
    elem(
      qname = QName("listing"),
      children = Vector(
        text("""
    """),
        elem(
          qname = QName("item")
        ),
        text("""
    """),
        elem(
          qname = QName("item")
        ),
        text("""
    """),
        comment(""" some comment """),
        text("""
  """)
      )
    )
  )
)

Running the same code without the comment gives the expected output:

import java.io.ByteArrayInputStream

import eu.cdevreeze.yaidom.parse.DocumentParserUsingSax

val xml = """<?xml version="1.0" encoding="UTF-8"?>
<rootElem>
  <listing>
    <item />
    <item />
  </listing>
</rootElem>
"""

val inputStream = new ByteArrayInputStream(xml.getBytes("UTF-8")) // match the encoding declared in the XML
val parser = DocumentParserUsingSax.newInstance()
val requestDoc = parser.parse(inputStream)
val documentElementWithoutWhitespace = requestDoc.documentElement.removeAllInterElementWhitespace

This results in what would be expected in the first example:

documentElementWithoutWhitespace: eu.cdevreeze.yaidom.simple.Elem = elem(
  qname = QName("rootElem"),
  children = Vector(
    elem(
      qname = QName("listing"),
      children = Vector(
        elem(
          qname = QName("item")
        ),
        elem(
          qname = QName("item")
        )
      )
    )
  )
)
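
One possible way to make the removal tolerate comments can be sketched as follows. This is not yaidom's implementation: the node model (Text, Comment, Elem) is a hypothetical stand-in, and the rule used (drop whitespace-only text when an element has at least one child element and no non-whitespace text) is an assumption about the intended behaviour.

```scala
object WhitespaceSketch {

  // Hypothetical minimal node model, not yaidom's own node classes.
  sealed trait Node
  final case class Text(value: String) extends Node
  final case class Comment(value: String) extends Node
  final case class Elem(name: String, children: Vector[Node] = Vector.empty) extends Node

  // Assumed rule: an element has "element content" if it has at least one child
  // element and every text child is whitespace-only. Comments do not count
  // against it. In that case the whitespace-only text children are dropped.
  def removeAllInterElementWhitespace(elem: Elem): Elem = {
    val hasChildElems = elem.children.exists(_.isInstanceOf[Elem])
    val allTextIsWhitespace = elem.children.forall {
      case Text(value) => value.trim.isEmpty
      case _           => true
    }
    val kept =
      if (hasChildElems && allTextIsWhitespace) elem.children.filterNot(_.isInstanceOf[Text])
      else elem.children

    // Recurse into child elements; comments and remaining text are kept as they are.
    elem.copy(children = kept.map {
      case e: Elem => removeAllInterElementWhitespace(e)
      case other   => other
    })
  }
}
```

Applied to a model of the first document above, this sketch leaves listing with the two item elements and the comment node, and no text nodes.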

yaidom-0005 Path.toCanonicalXPath broken for default namespace

See the following code:

import eu.cdevreeze.yaidom.core._
val scope = Scope.from("xs" -> "http://www.w3.org/2001/XMLSchema", "xlink" -> "http://www.w3.org/1999/xlink", "link" -> "http://www.xbrl.org/2003/linkbase")
require(scope.isInvertible)
val path = PathBuilder.from(QName("xs:annotation") -> 2, QName("xs:appinfo") -> 0, QName("link:linkbaseRef") -> 3).build(scope)
path.toCanonicalXPath(scope) // No exception thrown
val scope2 = Scope.from("" -> "http://www.w3.org/2001/XMLSchema", "xlink" -> "http://www.w3.org/1999/xlink", "link" -> "http://www.xbrl.org/2003/linkbase")
require(scope2.isInvertible)
path.toCanonicalXPath(scope2) // Boom! Expected at least one prefix for namespace URI ...
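
One plausible reading of the failure: in XPath 1.0 an unprefixed name test selects names in no namespace, so a namespaced element needs a non-empty prefix, and scope2 binds the XML Schema namespace only as the default namespace. The sketch below models that lookup with a plain Map. The names Scope, prefixesForNamespace and prefixFor are hypothetical stand-ins here, not yaidom's code, and the assumption that the conversion performs such a lookup is not confirmed by the report.

```scala
object PrefixLookupSketch {

  // Hypothetical stand-in for a scope: prefix -> namespace URI, "" being the default namespace.
  type Scope = Map[String, String]

  // Only non-empty prefixes are usable in an XPath 1.0 name test.
  def prefixesForNamespace(scope: Scope, namespaceUri: String): Set[String] =
    scope.collect { case (prefix, uri) if prefix.nonEmpty && uri == namespaceUri => prefix }.toSet

  // Fails when the namespace is bound only as the default namespace (or not at all).
  def prefixFor(scope: Scope, namespaceUri: String): String =
    prefixesForNamespace(scope, namespaceUri).headOption.getOrElse(
      sys.error(s"Expected at least one prefix for namespace URI $namespaceUri"))
}
```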

yaidom-0003 Bug in method Scope.includingNamespace

Method Scope.includingNamespace returns the Scope itself if the namespace is the "xml" namespace or if prefixesForNamespace(namespaceUri).contains(namespaceUri). The latter check is incorrect, because it looks for the namespace URI among the prefixes. It should be: !prefixesForNamespace(namespaceUri).isEmpty. It is likely that the bug is not encountered, but it is still a bug.
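
The difference between the two conditions can be illustrated with a small model. MiniScope and its methods are hypothetical stand-ins written for this example, not yaidom's Scope class; in particular, the behaviour of the else branch (adding a fresh prefix from getPrefix) is an assumption.

```scala
object IncludingNamespaceSketch {

  val XmlNamespace = "http://www.w3.org/XML/1998/namespace"

  // Hypothetical miniature of a scope: prefix -> namespace URI.
  final case class MiniScope(prefixNamespaceMap: Map[String, String]) {

    def prefixesForNamespace(namespaceUri: String): Set[String] =
      prefixNamespaceMap.collect { case (prefix, uri) if uri == namespaceUri => prefix }.toSet

    // Condition as described in the report: looks for the URI among the prefixes,
    // so it is practically never true.
    def includingNamespaceBuggy(namespaceUri: String, getPrefix: () => String): MiniScope =
      if (namespaceUri == XmlNamespace || prefixesForNamespace(namespaceUri).contains(namespaceUri)) this
      else MiniScope(prefixNamespaceMap + (getPrefix() -> namespaceUri))

    // Corrected condition: any existing prefix for the namespace suffices.
    def includingNamespace(namespaceUri: String, getPrefix: () => String): MiniScope =
      if (namespaceUri == XmlNamespace || prefixesForNamespace(namespaceUri).nonEmpty) this
      else MiniScope(prefixNamespaceMap + (getPrefix() -> namespaceUri))
  }
}
```

In this model the buggy variant adds a redundant second prefix for a namespace that is already in scope, while the corrected variant returns the scope unchanged.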

yaidom-0002 DocumentPrinterUsingSax potentially not emitting namespace declarations

This is a bug related to the use of DocumentPrinterUsingSax. Trait YaidomToSaxEventsConversions (used in DocumentPrinterUsingSax) does not treat namespace declarations as attributes. That is correct, and method startPrefixMapping should add namespace declarations. On the other hand, it used to be the case that the attributes passed to method startElement included namespace declarations as well. Though strictly incorrect, the old behaviour was more reliable in emitting namespace declarations.

This must be investigated further, and hopefully an elegant solution can be found. The old behaviour can be found in class DocumentPrinterUsingSax as it was before 2015-05-27.
