Coder Social home page Coder Social logo

cpi_xsdparser's Introduction

XsdParser

This is a fork from

https://github.com/xmlet/XsdParser

ported to the Remobjects Elements Compiler

https://www.elementscompiler.com/elements/default.aspx

public class XsdAnnotation extends XsdAbstractElement {

    private String id;
    private List<XsdAppInfo> appInfoList = new ArrayList<>();
    private List<XsdDocumentation> documentations = new ArrayList<>();
    
    // (...)
}
The set of rules followed by this library can be consulted in the following URL: XSD Schema

Usage example

A simple example:

namespace Simple;
uses
  remobjects.Elements.RTL,
  proholz.xsdparser;
 
method LoadAndParseXsd;
begin
  var xsd := new XsdParser('filename');
  var elements : List<XsdElement> := xsd.getResultXsdElements;
  var schemas : List<XsdSchema>:= xsd.getResultXsdSchemas;
  var unresolved: List<UnsolvedReferenceItem> := xsd.getUnsolvedReferences;
  
end;
end.
After parsing the file like shown above it's possible to start to navigate in the resulting parsed elements. In the image below it is presented the class diagram that could be useful before trying to start navigating in the result. There are multiple abstract classes that allow to implement shared features and reduce duplicated code.

Navigation

Below a simple example is presented. After parsing the XSD snippet the parsed elements can be accessed with the respective java code.

<?xml version='1.0' encoding='utf-8' ?>
<xsd:schema xmlns='http://schemas.microsoft.com/intellisense/html-5' xmlns:xsd='http://www.w3.org/2001/XMLSchema'>
	
  <xsd:group name="flowContent">
    <xsd:all>
      <xsd:element name="elem1"/>
    </xsd:all>
  </xsd:group>
	
  <xs:element name="html">
    <xs:complexType>
      <xsd:choice>
        <xsd:group ref="flowContent"/>
      </xsd:choice>
      <xs:attribute name="manifest" type="xsd:anyURI" />
    </xs:complexType>
  </xs:element>
</xsd:schema>
The result could be consulted in the following way:

public class ParserApp {
    public static void main(String [] args) {
        //(...)
        
        XsdElement htmlElement = elementsStream.findFirst().get();
        
        XsdComplexType htmlComplexType = htmlElement.getXsdComplexType();
        XsdAttribute manifestAttribute = htmlComplexType.getXsdAttributes().findFirst().get();
        
        XsdChoice choiceElement = htmlComplexType.getChildAsChoice();
        
        XsdGroup flowContentGroup = choiceElement.getChildrenGroups().findFirst().get();
        
        XsdAll flowContentAll = flowContentGroup.getChildAsAll();
        
        XsdElement elem1 = flowContentAll.getChildrenElements().findFirst().get();
    }
}

Parsing Strategy

In order to minimize the number of passages in the file, which take more time to perform, this library chose to parse all the elements and then resolve the references present. To parse the XSD file we use the DOM library, which converts all the XSD elements into Node objects, from where we extract all the XSD information into our XSD respective classes.

Our parse process is also based on a tree approach, which means that when we invoke the XsdSchema parse function the whole document will be parsed, because each XsdAbstractElement class extracts its respective information, i.e. a XsdSchema instance extracts information from the received xsd:schema Node object, and also invokes the respective parse function for each children elements present in its current Node object.

Type Validations

This library was born with an objective in mind, it should strictly follow the XSD language rules. To guarantee that we used the Visitor pattern. We used this pattern to add a layer of control regarding different XSD types interactions. In the presented code snippet we can observe how this works:

class XsdComplexContentVisitor extends XsdAnnotatedElementsVisitor {

  private final XsdComplexContent owner;
  
  @Override
  public void visit(XsdRestriction element) {
    owner.setRestriction(ReferenceBase.createFromXsd(element));
  }

  @Override
  public void visit(XsdExtension element) {
    owner.setExtension(ReferenceBase.createFromXsd(element));
  }
}
In this example we can see that XsdComplexContentVisitor class only implements two methods, visit(XsdRestriction element) and visit(XsdExtension element). This means that the XsdComplexContentVisitor type only allows these two types, i.e. XsdRestriction and XsdExtension, to interact with XsdComplexContent, since these two types are the only types allowed as XsdComplexContent children elements.
The XSD syntax also especifies some other restrictions, namely regarding attribute possible values or types. For example the finalDefault attribute of the xsd:schema elements have their value restricted to six distinct values:

  • DEFAULT ("")
  • EXTENSION ("extension")
  • RESTRICTION ("restriction")
  • LIST("list")
  • UNION("union")
  • ALL ("#all")
To guarantee that this type of restrictions we use Java Enum classes. With this we can verify if the received value is a possible value for that respective attribute.
There are other validations, such as veryfing if a given attribute is a positiveInteger, a nonNegativeInteger, etc. If any of these validations fail an exception will be thrown with a message detailing the failed validation.

Rules Validation

Apart from the type validations the XSD syntax specifies some other rules. These rules are associated with a given XSD type and therefore are verified when an instance of that respective object is parsed. A simple example of such rule is the following rule:

"A xsd:element cannot have a ref attribute if its parent is a xsd:schema element."

This means that after creating the XsdElement instance and populating its fields we invoke a method to verify this rule. If the rule is violated then an exception is thrown with a message detailing the issue.

Reference solving

This is a big feature of this library. In XSD files the usage of the ref attribute is frequent, in order to avoid repetition of XML code. This generates two problems when handling the parsing which are detailed below. Either the referred element is missing or the element is present and an exchange should be performed. To help in this process we create a new layer with four classes:

UnsolvedElement - Wrapper class to each element that has a ref attribute.
ConcreteElement - Wrapper class to each element that is present in the file.
NamedConcreteElement - Wrapper class to each element that is present in the file and has a name attribute present.
ReferenceBase - A common interface between UnsolvedReference and ConcreteElement.

These classes simplify the reference solving process by serving as a classifier to the element that they wrap. Now we will shown a short example to explain how this works:

<?xml version='1.0' encoding='utf-8' ?>
<xsd:schema xmlns='http://schemas.microsoft.com/intellisense/html-5' xmlns:xsd='http://www.w3.org/2001/XMLSchema'>
	
    <xsd:group id="replacement" name="flowContent">         <!-- NamedConcreteType wrapping a XsdGroup -->
        (...)
    </xsd:group>
	
    <xsd:choice>                                            <!-- ConcreteElement wrapping a XsdChoice -->
        <xsd:group id="toBeReplaced" ref="flowContent"/>    <!-- UnsolvedReference wrapping a XsdGroup -->
    </xsd:choice>
</xsd:schema>
In this short example we have a XsdChoice element with an element XsdGroup with a reference attribute as a child. In this case the XsdChoice will be wrapped in a ConcreteElement object and its children XsdGroup will be wrapped in a UnsolvedReference object. When replacing the UnsolvedReference objects the XsdGroup with the ref attribute is going to be replaced by a copy of the already parsed XsdGroup with the name attribute, which is wrapped in a NamedConcreteElement object. This is achieved by accessing the parent of the element, i.e. the parent of the XsdGroup with the ref attribute, and replacing the XsdGroup with the ref attribute by the XsdGroup with the name attribute.

Resuming the approach:
  1. Obtain all the NamedConcreteElements objects.
  2. Obtain all the UnsolvedReference objects. Iterate them to perform a lookup search in the previously obtained NamedConcreteElements objects by comparing the UnsolvedReference ref with the NamedConcreteElements name attributes.
  3. If a match is found, replace the UnsolvedReference wrapped object with the NamedConcreteElements wrapped object.

Missing ref elements

Any remaining UnsolvedReference objects can be consulted after the file is parsed by using the method getUnsolvedReferences(). Those are the references that were not in the file and the user of the XsdParser library should resolve it by either adding the missing elements to the file or just acknowledging that there are elements missing.

Hierarchy support

This parser supports xsd:base tags, which allow the use of hierarchies in the XSD files. The extended xsd:element is referenced in the element containing the xsd:base tag so it's possible to navigate to the extended elements.

Important remarks

name attribute - XsdParser uses the name attribute as a tool in order to resolve references, therefore it should be unique in the file. Using multiple times the same name will generate unexpected behaviour.

Changelog

1.0

  • First release. Passes all suplied Tests

cpi_xsdparser's People

Contributors

proholz avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.