sigel's Introduction

Sigel

Sigel «ᛋ» is a Clojure interface to the XSLT and XPath bits of Saxon.

XSLT

Sigel lets you write XSLT, but with parentheses instead of angle brackets.

Examples

(require '[sigel.xslt.core :as xslt]
         '[sigel.xslt.elements :as xsl])

(def stylesheet-1
  "An XSLT stylesheet that converts <a/> to <b/>."
  (xsl/stylesheet {:version 3.0}
    (xsl/template {:match "a"} [:b])))

(def stylesheet-2
  "An XSLT stylesheet that converts <b/> to <c/>."
  (xsl/stylesheet {:version 3.0}
    (xsl/template {:match "b"} [:c])))

(def compiled-stylesheets
  [(xslt/compile-sexp stylesheet-1) (xslt/compile-sexp stylesheet-2)])

;; Transform the XML string "<a/>" with stylesheet-1 and then stylesheet-2.
(xslt/transform compiled-stylesheets "<a/>")
;;=> #object[net.sf.saxon.s9api.XdmNode 0x61acfa00 "<c/>"]

You can also write your transformation in EDN:

;; a.edn
[:xsl/stylesheet {:version 3.0}
 [:xsl/template {:match "a"} [:b]]]

;; in your Clojure code
(xslt/transform (xslt/compile-edn "/path/to/a.edn") "<a/>")
;;=> #object[net.sf.saxon.s9api.XdmNode 0xf2a49c4 "<b/>"]

You can also execute XSLT transformations written in plain old XML:

<!-- a-to-b.xsl -->
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="a">
    <b/>
  </xsl:template>
</xsl:stylesheet>

;; in your Clojure code
(xslt/transform (xslt/compile-xslt "a-to-b.xsl") "<a/>")
;;=> #object[net.sf.saxon.s9api.XdmNode 0x2bda7fdc "<b/>"]

XPath

Select things in an XML document with XPath.

Examples

(require '[sigel.xpath.core :as xpath])

;; Select nodes with XPath.
(seq (xpath/select "<a><b/><c/></a>" "a/b | a/c"))
;;=>
;;(#object[net.sf.saxon.s9api.XdmNode 0x3cadbb6f "<b/>"]
;; #object[net.sf.saxon.s9api.XdmNode 0x136b811a "<c/>"])

;; Get the result of evaluating an XPath expression against a node as a Java
;; object.
(xpath/value-of "<num>1</num>" "xs:int(num)")
;;=> 1

XML

Every function in this library that takes XML as input accepts any object that implements the XMLSource protocol.

Examples

(require '[clojure.java.io :as io])

;; java.lang.String
(xpath/select "<a><b/><c/></a>" "a/b")
;;=> #object[net.sf.saxon.s9api.XdmNode 0x772300a6 "<b/>"]

;; java.io.File
(xpath/select (io/as-file "/tmp/a.xml") "a/b")
;;=> #object[net.sf.saxon.s9api.XdmNode 0x5487f8c7 "<b/>"]

;; java.net.URL
(xpath/select (io/as-url "http://www.xmlfiles.com/examples/note.xml") "/note/to")
;;=> #object[net.sf.saxon.s9api.XdmNode 0x79f4a8cb "<to>Tove</to>"]

License

Copyright © 2019 Eero Helenius

Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.

Saxon is licensed under the Mozilla Public License.

sigel's Issues

Fix reflection warnings

Sigel relies on reflection quite a lot. Some reflection warnings seem tricky to get rid of, but it should be possible to remove at least most of them.
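
As a generic illustration of the kind of fix involved (not taken from Sigel's code), a type hint on the receiver lets the Clojure compiler resolve a Java method call at compile time instead of reflecting at run time. The function names below are hypothetical and use java.lang.String rather than a Saxon class so the sketch runs without dependencies:

```clojure
;; To see the warnings, evaluate (set! *warn-on-reflection* true) at the REPL
;; or at the top of the namespace before loading these definitions.

;; No type hint: the compiler cannot tell which class .length belongs to,
;; so the call is resolved reflectively on every invocation.
(defn length-reflective [s]
  (.length s))

;; Type hint on the argument: the call compiles to a direct String.length()
;; invocation and no reflection warning is emitted.
(defn length-hinted [^String s]
  (.length s))
```

Both functions return the same result; only the way the method call is resolved differs.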

CDATA

I use (xsl/output {:method "text"}) to produce a text file and I can't seem to find a way to produce the > char. I always get &gt; instead. Am I missing anything?

hiccup-style input

I frequently find myself utilizing hiccup-style data structures rather than strings/streams. Something like:

[:list
 [:token "defproject"]
 [:whitespace " "]...
 ]

In the past, I would convert it into a string and then perform my XSLT transformations. However, more recently, I crafted a straightforward wrapper around TinyBuilder.

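;; Package names below are assumed from the Saxon 9.x layout; adjust them to
;; your Saxon version.
(import '(net.sf.saxon.s9api Processor)
        '(net.sf.saxon.tree.tiny TinyBuilder)
        '(net.sf.saxon.om FingerprintedQName)
        '(net.sf.saxon.type Untyped)
        '(net.sf.saxon.expr.parser ExplicitLocation))

(defn xml-document
  "Builds a Saxon tree from a hiccup-style vector and returns its root node.
  Handles elements and text only."
  [tree]
  (let [processor     (Processor. false)
        configuration (.getUnderlyingConfiguration processor)
        builder       (TinyBuilder. (.makePipelineConfiguration configuration))
        walk          (fn walk [x]
                        (if (coll? x)
                          ;; A collection is an element: the first item is the tag, the rest are children.
                          (do
                            (.startElement builder (FingerprintedQName. "" "" (name (first x))) (Untyped/getInstance) ExplicitLocation/UNKNOWN_LOCATION 0)
                            (run! walk (rest x))
                            (.endElement builder))
                          ;; Anything else is treated as character data.
                          (.characters builder x ExplicitLocation/UNKNOWN_LOCATION 0)))]
    (.open builder)
    (.startDocument builder 0)
    (walk tree)
    (.close builder)
    (.getRootNode (.getTree builder))))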

At present, it does not have the capability to handle attributes.
Would you be interested in incorporating this code into your library once it is completed?
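
For comparison, the string-based approach mentioned above can be sketched with attribute support in plain Clojure. This is only an illustration: `escape-xml` and `hiccup->xml` are hypothetical helper names, not part of Sigel or Saxon, and the sketch assumes an optional attribute map directly after the tag and ignores namespaces.

```clojure
(require '[clojure.string :as str])

(defn escape-xml
  "Escapes the characters that are special in XML text and attribute values."
  [s]
  (-> (str s)
      (str/replace "&" "&amp;")
      (str/replace "<" "&lt;")
      (str/replace ">" "&gt;")
      (str/replace "\"" "&quot;")))

(defn hiccup->xml
  "Serializes a hiccup-style tree such as [:tag {:attr \"v\"} child ...]
  to an XML string. Hypothetical helper, not part of Sigel."
  [node]
  (if (vector? node)
    (let [[tag & more] node
          ;; An attribute map is optional and, if present, comes right after the tag.
          attrs    (when (map? (first more)) (first more))
          children (if attrs (rest more) more)
          attr-str (apply str (for [[k v] attrs]
                                (str " " (name k) "=\"" (escape-xml v) "\"")))]
      (str "<" (name tag) attr-str ">"
           (apply str (map hiccup->xml children))
           "</" (name tag) ">"))
    ;; Anything that is not a vector is treated as text.
    (escape-xml node)))

(hiccup->xml [:list [:token {:kind "symbol"} "defproject"] [:whitespace " "]])
;;=> "<list><token kind=\"symbol\">defproject</token><whitespace> </whitespace></list>"
```

Because the functions in this library accept a java.lang.String as XML input, the resulting string can be passed to them directly, at the cost of serializing and re-parsing the tree.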

XML External Entity Attack

Hi!

Thanks for writing sigel, we've been very happy with it.

Today we noticed that the default Saxon configuration might make us vulnerable to XXE attacks:

https://www.illucit.com/en/java/saxon-he-external-entity-processing-xxe

echo secret > /tmp/foo

/tmp/test.xml

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [  
  <!ELEMENT foo ANY >
  <!ENTITY xxe SYSTEM "file:///tmp/foo" >]>
 
  <foo>&xxe;</foo>

(-> "/tmp/test.xml" clojure.java.io/file slurp sigel.protocols/build)
=> #object[net.sf.saxon.s9api.XdmNode 0x66805298 "<foo>secret\n</foo>"]

We can fix this by using our own builder instead of sigel.protocols/build:

(let [configuration (net.sf.saxon.Configuration.)
      processor (doto (net.sf.saxon.s9api.Processor. configuration)
                  (.setConfigurationProperty
                    (str net.sf.saxon.lib.FeatureKeys/XML_PARSER_FEATURE
                         "http://xml.org/sax/features/external-general-entities")
                    false))
      builder (.newDocumentBuilder processor)
      xml-string (-> "/tmp/test.xml" clojure.java.io/file slurp)
      source (javax.xml.transform.stream.StreamSource. (java.io.StringReader. xml-string))]
  (.build builder source))

=> #object[net.sf.saxon.s9api.XdmNode 0x78a9c497 "<foo/>"]

Do you think it would be a good idea to disable external general entities by default in sigel.saxon/processor?

Something like:

(def ^Processor processor
  (doto (Processor. configuration)
    (.setConfigurationProperty
      (str FeatureKeys/XML_PARSER_FEATURE "http://xml.org/sax/features/external-general-entities") false)))

Execution error when transforming xml with an xml-model processing instruction

xslt/transform fails with

Execution error (IllegalArgumentException) at sigel.protocols/eval52799$fn$G (protocols.clj:12).
No implementation of method: :build of protocol: #'sigel.protocols/XMLSource found for class: net.sf.saxon.s9api.XdmValue

if I invoke it on the following xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="http://xmlp-test/schema/minimalSchema.rng"?>
<dtbook version="2005-3" xml:lang="de" xmlns="http://www.daisy.org/z3986/2005/dtbook/">
</dtbook>

The problem is the <?xml-model href="http://xmlp-test/schema/minimalSchema.rng"?> part which seems to be legal according to the standard.

I transform with the following identity xsl:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="#all" version="2.0">

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

The Clojure code to reproduce the problem is as follows:

(xslt/transform
 [(xslt/compile-xslt "identity.xsl")
  (xslt/compile-xslt "identity.xsl")]
 (io/file "bar.xml"))

Loading XSLT Stylesheets from Classpath

Hello again,

I'm wondering if there's any particular support to compile XSLT stylesheets from the classpath or a raw string?

(str (xslt/transform (xslt/compile-xslt (.getPath to-html-xslt)) content))

This works well when the file is a regular file in the filesystem but fails if the file is within a Jar.

It seems that it would be possible to turn a resource into an input stream and then into a StreamSource.

Am I missing something or is this just something you didn't yet need? Would be happy to provide a PR.
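
A minimal sketch of the resource-to-StreamSource idea, using only clojure.java.io and the JDK. `resource-source` is a hypothetical helper name, and whether Sigel's compile functions accept a javax.xml.transform.Source is an assumption that would need checking against its API:

```clojure
(require '[clojure.java.io :as io])
(import '(javax.xml.transform.stream StreamSource))

(defn resource-source
  "Returns a StreamSource for the named classpath resource, or nil if the
  resource does not exist. Works for resources inside a JAR as well as on
  the filesystem. The caller is responsible for closing the underlying stream.
  Hypothetical helper, not part of Sigel."
  [resource-name]
  (when-let [url (io/resource resource-name)]
    ;; Passing the URL as the system ID gives relative xsl:import and
    ;; xsl:include references a base URI to resolve against.
    (StreamSource. (io/input-stream url) (str url))))
```

With Saxon's s9api, a Source like this can be handed to XsltCompiler.compile, which takes a javax.xml.transform.Source.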

Default namespace gets prefixed

(xslt/transform
    (xslt/compile-sexp
      (xc/xslt3-identity {}
        (xsl/param {:name "target-uri"})
        (xsl/output {:omit-xml-declaration "yes"})
        (xsl/template {:match "/"}
          (xsl/result-document {:href "{$target-uri}/2a.xml" :method "xml"}
            [:container {:version "1.0"
                         :xmlns   "urn:oasis:names:tc:opendocument:xmlns:container"}]))))
    {:target-uri "file:/home/akond/workspace-clojure/my-project"}
    "<root/>")

shows

<container xmlns:l="urn:oasis:names:tc:opendocument:xmlns:container" version="1.0"/>

but it should be

<container xmlns="urn:oasis:names:tc:opendocument:xmlns:container" version="1.0"/>

Cyrillic letters in a call to match produce an error

As long as I use Latin letters, everything is fine, but when I put in, for instance,

(xsl/template {:match "para[matches(., 'Пам')]"} ...

I immediately get this error:

Caused by: org.xml.sax.SAXParseException: An invalid XML character (Unicode: 0x1f) was found in the value of attribute "match" and element is "xsl:template".

I wonder what the problem could be?
