nori's Introduction

Nori

Really simple XML parsing ripped from Crack, which ripped it from Merb.

Nori supports pluggable parsers and ships with both REXML and Nokogiri implementations.
It has defaulted to Nokogiri since v2.0.0, but you can switch to REXML via:

Nori.new(:parser => :rexml)  # or :nokogiri

Make sure Nokogiri is in your LOAD_PATH when parsing XML, because Nori tries to load it when it's needed.

Examples

Nori.new.parse("<tag>This is the content</tag>")
# => {"tag"=>"This is the content"}

Nori.new.parse('<foo />')
#=> {"foo"=>nil}

Nori.new.parse('<foo bar />')
#=> {}

Nori.new.parse('<foo bar="baz"/>')
#=> {"foo"=>{"@bar"=>"baz"}}

Nori.new.parse('<foo bar="baz">Content</foo>')
#=> {"foo"=>"Content"}

Nori::StringWithAttributes

You can access a string node's attributes via attributes.

result = Nori.new.parse('<foo bar="baz">Content</foo>')
#=> {"foo"=>"Content"}

result["foo"].class
# => Nori::StringWithAttributes

result["foo"].attributes
# => {"bar"=>"baz"}

advanced_typecasting

Nori can automatically convert string values to TrueClass, FalseClass, Time, Date, and DateTime:

# "true" and "false" String values are converted to `TrueClass` and `FalseClass`.
Nori.new.parse("<value>true</value>")
# => {"value"=>true}

# String values matching xs:time, xs:date and xs:dateTime are converted to `Time`, `Date` and `DateTime` objects.
Nori.new.parse("<value>09:33:55.7Z</value>")
# => {"value"=>2022-09-29 09:33:55.7 UTC}

# disable with advanced_typecasting: false
Nori.new(advanced_typecasting: false).parse("<value>true</value>")
# => {"value"=>"true"}

strip_namespaces

Nori can strip the namespaces from your XML tags. This feature is disabled by default.

Nori.new.parse('<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"></soap:Envelope>')
# => {"soap:Envelope"=>{"@xmlns:soap"=>"http://schemas.xmlsoap.org/soap/envelope/"}}

Nori.new(:strip_namespaces => true).parse('<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"></soap:Envelope>')
# => {"Envelope"=>{"@xmlns:soap"=>"http://schemas.xmlsoap.org/soap/envelope/"}}

convert_tags_to

Nori lets you specify a custom formula to convert XML tags to Hash keys using convert_tags_to.

Nori.new.parse('<userResponse><accountStatus>active</accountStatus></userResponse>')
# => {"userResponse"=>{"accountStatus"=>"active"}}

parser = Nori.new(:convert_tags_to => lambda { |tag| Nori::StringUtils.snakecase(tag).to_sym })
parser.parse('<userResponse><accountStatus>active</accountStatus></userResponse>')
# => {:user_response=>{:account_status=>"active"}}

convert_dashes_to_underscores

By default, Nori will automatically convert dashes in tag names to underscores.

Nori.new.parse('<any-tag>foo bar</any-tag>')
# => {"any_tag"=>"foo bar"}

# disable with convert_dashes_to_underscores
parser = Nori.new(:convert_dashes_to_underscores => false)
parser.parse('<any-tag>foo bar</any-tag>')
# => {"any-tag"=>"foo bar"}

nori's People

Contributors

alethea, barberj, cloocher, dduugg, deadprogram, der-flo, edmz, gaaady, janahrens, jim, jnunemaker, jvnill, kajetanowicz, knizhegorodov, kornysietsma, lackac, mchu, moom, olleolleolle, pcai, pengwynn, purp, robuye, rubiii, sandro, tamalw, technicalpickles, thomasjachmann, timriley, tjarratt

nori's Issues

REXML parsing error

Using REXML as parser, &lt; inside CDATA is converted to <. Nokogiri does not have this issue.

irb(main):030:0> Nori.new(parser: :rexml).parse("<lixi_payload><![CDATA[<RealEstate Zoning=\"Residential (Development/Existing) &lt;=6 units/dwellings\"></RealEstate>]]></lixi_payload>")
=> {"lixi_payload"=>"<RealEstate Zoning=\"Residential (Development/Existing) <=6 units/dwellings\"></RealEstate>"}

irb(main):031:0> Nori.new(parser: :nokogiri).parse("<lixi_payload><![CDATA[<RealEstate Zoning=\"Residential (Development/Existing) &lt;=6 units/dwellings\"></RealEstate>]]></lixi_payload>")
=> {"lixi_payload"=>"<RealEstate Zoning=\"Residential (Development/Existing) &lt;=6 units/dwellings\"></RealEstate>"}

Make advanced type casting actually advanced

Right now it only supports converting true/false and dates and times. Is there any way you could support more types, for example the standard SOAP ones: http://www.pocketsoap.com/4s4c/docs/datatypes.html

    def advanced_typecasting(value)
      split = value.split
      return value if split.size > 1

      case split.first
        when "true"       then true
        when "false"      then false
        when XS_DATE_TIME then try_to_convert(value) {|x| DateTime.parse(x)}
        when XS_DATE      then try_to_convert(value) {|x| Date.parse(x)}
        when XS_TIME      then try_to_convert(value) {|x| Time.parse(x)}
        else                   value
      end
    end
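
One way the request above could be approached is sketched below. This is not Nori's code: extended_typecast, XS_INTEGER and XS_DECIMAL are hypothetical names, and the sketch only adds integers and decimals to the boolean handling shown above.

```ruby
# Hypothetical patterns for xs:integer-like and xs:decimal-like values.
XS_INTEGER = /\A[+-]?\d+\z/
XS_DECIMAL = /\A[+-]?\d+\.\d+\z/

# Sketch of a typecast step that also converts numbers.
# Anything that does not match a known pattern is returned unchanged.
def extended_typecast(value)
  case value
  when "true"     then true
  when "false"    then false
  when XS_INTEGER then Integer(value, 10)  # base 10, so "007" becomes 7
  when XS_DECIMAL then Float(value)
  else value
  end
end

p extended_typecast("42")
p extended_typecast("3.14")
p extended_typecast("hello")
```

Note that casting every digit-only string loses information for values such as zip codes with leading zeros, so a real implementation would probably need an opt-in switch.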

Strings as Nori::StringWithAttributes

Previously posted under #19.

Here's some code that runs a SOAP login:

require 'rubygems'
require 'bundler/setup'
require 'savon'
require 'yaml'

username = "something"
password = "somethingelse"

client = Savon.client(
  wsdl: "https://app2102.bws.birst.com/CommandWebService.asmx?WSDL",
  endpoint: "https://app2102.bws.birst.com/CommandWebService.asmx",
  convert_request_keys_to: :none,
  soap_version: 1,
  pretty_print_xml: true,
  filters: [:password],
  log_level: :error,
  log: true
)

response = client.call(:login,
message: {
  username: username,
  password: password
})

puts "*** Just the response ***"
puts response.hash

puts "*** Response as YAML ***"
puts "#{YAML.dump response.hash}"

and here are the results:

D, [2014-06-05T06:21:56.129336 #59638] DEBUG -- : HTTPI GET request to app2102.bws.birst.com (httpclient)
D, [2014-06-05T06:21:57.173315 #59638] DEBUG -- : HTTPI POST request to app2102.bws.birst.com (httpclient)
*** Just the response ***
{:envelope=>{:body=>{:login_response=>{:login_result=>"17c079819a0eab25X9efff46bf1c3843", :@xmlns=>"http://www.birst.com/"}}, :"@xmlns:soap"=>"http://schemas.xmlsoap.org/soap/envelope/", :"@xmlns:xsi"=>"http://www.w3.org/2001/XMLSchema-instance", :"@xmlns:xsd"=>"http://www.w3.org/2001/XMLSchema"}}
*** Response as YAML ***

---
:envelope:
  :body:
    :login_response:
      :login_result: !ruby/string:Nori::StringWithAttributes
        str: !binary |-
          NzdiMFc5ODE5YRllYWYyN3M5ZWZkZjX2YmYwYjM4NDM=
        attributes: {}
      :@xmlns: http://www.birst.com/
  :@xmlns:soap: http://schemas.xmlsoap.org/soap/envelope/
  :@xmlns:xsi: http://www.w3.org/2001/XMLSchema-instance
  :@xmlns:xsd: http://www.w3.org/2001/XMLSchema

The YAML dump isn't working as expected since the strings are actually Nori::StringWithAttributes.
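
A possible workaround, sketched under assumptions: convert every String subclass instance back to a plain String before dumping. AttributedString below is a stand-in for Nori::StringWithAttributes so that the example runs without Nori, and plain_strings is a hypothetical helper name.

```ruby
require "yaml"

# Stand-in for Nori::StringWithAttributes (a String subclass carrying attributes).
class AttributedString < String
  attr_accessor :attributes
end

# Recursively rebuild the structure, replacing any String (or subclass)
# with a plain String so YAML does not emit a !ruby/string tag.
def plain_strings(node)
  case node
  when Hash   then node.each_with_object({}) { |(k, v), out| out[k] = plain_strings(v) }
  when Array  then node.map { |item| plain_strings(item) }
  when String then String.new(node)
  else node
  end
end

response = { login_response: { login_result: AttributedString.new("17c079819a0eab25") } }
puts YAML.dump(plain_strings(response))
```

The attributes are lost in the conversion, which is fine when they are empty, as in the dump above.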

strip error

I have created a Nokogiri document 'doc'. Each time I attempt to parse it with Nori I get an error.

[63] pry(main)> parser = Nori.new
=> #<Nori:0x00000004f34b30
@options=
{:strip_namespaces=>false,
:delete_namespace_attributes=>false,
:convert_tags_to=>nil,
:convert_attributes_to=>nil,
:advanced_typecasting=>true,
:convert_dashes_to_underscores=>true,
:parser=>:nokogiri}>
[64] pry(main)> parser.parse(@doc)
NoMethodError: undefined method `strip' for #<Nokogiri::XML::Document:0x00000002103ae0>
from /var/lib/gems/1.9.1/gems/nori-2.4.0/lib/nori.rb:42:in `parse'
[65] pry(main)> parser.parse(doc)
NoMethodError: undefined method `strip' for #<Nokogiri::XML::Document:0x25decf4 name="document">
from /var/lib/gems/1.9.1/gems/nori-2.4.0/lib/nori.rb:42:in `parse'

and even with a tiny file called tiny.xml

<Races>
<Race RaceNumber="1"><NameRaceFull>Foo</NameRaceFull></Race>
<Race RaceNumber="2"><NameRaceFull>Goo</NameRaceFull></Race>
<Race RaceNumber="3"><NameRaceFull>Hoo</NameRaceFull></Race>
</Races>

[68] pry(main)> newdoc = Nokogiri::XML("tiny.xml")
=> #(Document:0x24ed034 { name = "document" })
[69] pry(main)> parser.parse(newdoc)
NoMethodError: undefined method `strip' for #<Nokogiri::XML::Document:0x24ed034 name="document">
from /var/lib/gems/1.9.1/gems/nori-2.4.0/lib/nori.rb:42:in `parse'
[70] pry(main)>

Am I doing something wrong?
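
Judging from the error message, parse calls strip on its argument, so it presumably expects an XML String rather than a Nokogiri::XML::Document. A minimal sketch of a guard follows; xml_source is a hypothetical helper, not part of Nori, and it only assumes that the document object responds to to_xml (as Nokogiri documents do).

```ruby
# Hypothetical helper: turn whatever the caller has into an XML String.
def xml_source(input)
  return input if input.is_a?(String)
  return input.to_xml if input.respond_to?(:to_xml)  # e.g. a Nokogiri document

  raise ArgumentError, "expected an XML String, got #{input.class}"
end
```

With that helper, parser.parse(xml_source(doc)) would hand the serialized document to Nori. Note also that Nokogiri::XML("tiny.xml") parses the literal string "tiny.xml" rather than the file, so File.read("tiny.xml") is probably what was intended there.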

Time and DateTime typecasting forcing UTC

See rubiii/savon#110 for history.

It seems Nori forces Time and DateTime objects to UTC when typecasting. This goes against ISO 8601, which clearly states that date and time representations that do not specify a time zone should be treated as local time.

XML Parsing fails with unescaped ampersand in content (not tag)

If I have XML like this:

<?xml version="1.0" encoding="UTF-8" ?>
<outer>
  <inner>
    <before>data before</before>
    <data>Some & More</data>
    <after>here is after</after>
  </inner>
</outer>

and try to parse it like this:

xml = File.read("bad.xml")
result = Nori.new.parse(xml)

I get this:

{
    "data" => "Some  More\n        here is after\n  \n"
}

Which is clearly wrong. If I change the & into &amp; it parses just fine:

<?xml version="1.0" encoding="UTF-8" ?>
<outer>
  <inner>
    <before>data before</before>
    <data>Some &amp; More</data>
    <after>here is after</after>
  </inner>
</outer>
{
    "outer" => {
        "inner" => {
            "before" => "data before",
              "data" => "Some & More",
             "after" => "here is after"
        }
    }
}

Why can't I use a raw & in the content? That seems to be a bug, right?

Empty items handling

This issue was first discussed here: savonrb/savon#137

Opening under Nori as it is more relevant.

I'd like to revisit this please. This causes an issue for me, as I can not pass an object returned by the server back to it:

Gyoku.xml(nori.parse(""))
=> "<xml xsi:nil="true"/>"

The server side accepts an empty item, but not a nil. Based on some discussions, I believe the empty item should be parsed as an empty string:

Current behavior:
nori.parse("") => {"xml"=>nil}

Expected behavior:
nori.parse("") => {"xml"=>""}

Raise on error?

Hi,

is there a way to raise an exception on parsing error?

Nori.new.parse(xml_content)

No matter what the content is, it never raises anything; it just returns an empty hash.

Thank you!

Doc said REXML for the default parser

Hi,

It's a small issue, but the README.md says REXML is the default parser, whereas the code shows that Nokogiri is the default. Please update it if you have time.

Cheers

v2.3.0

List of changes for the upcoming v2.3.0 release.

  • Changed Nori#find to ignore namespace prefixes in Hash keys it is searching through.
    Original issue: savonrb/savon#474.
  • Limited Nokogiri to < 1.6, because v1.6 dropped support for Ruby 1.8.

Add option to not prefix attrs with '@' in hash

I understand why the attributes are prefixed, they're in a different namespace from the elements in the XML, but they go into the same namespace in the resulting hash, so the prefix prevents conflicts.

However, when there are no conflicts, it's a leaky abstraction. I shouldn't have to know or care whether something was originally an XML attribute and thus prefix it with an '@' when I reference it in the hash. Would you accept a patch to turn off this behavior (but still have it on by default) and raise an exception if there is a namespace conflict?

Use schema

Suppose we have the following XML:

 <root>
     <a>
         <b />
      </a>
      <a>
         <b />
         <b />
      </a>
 </root>

Nori will create the following hash:

{"root"=>{"a"=>[{"b"=>nil}, {"b"=>[nil, nil]}]}}

If we have the following document schema:

 <xsd:complexType name="a">
    <xsd:sequence>
        <xsd:element name="b" minOccurs="0" maxOccurs="unbounded" type="xsd:string"/>
    </xsd:sequence>
</xsd:complexType>

element ['root']['a'][0]['b'] should return a single-valued array, but there is no way to specify this to Nori.
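
Until something like this is supported, a caller who knows from the schema that an element may repeat can normalize the value after parsing. A minimal sketch, where force_array is a hypothetical helper and the hash literal is the Nori output shown above:

```ruby
# Wrap a value in an array unless it already is one.
# Kernel#Array is avoided on purpose: Array(nil) returns [] and
# Array(hash) turns the hash into key/value pairs.
def force_array(value)
  value.is_a?(Array) ? value : [value]
end

parsed = { "root" => { "a" => [{ "b" => nil }, { "b" => [nil, nil] }] } }

first_b  = force_array(parsed["root"]["a"][0]["b"])
second_b = force_array(parsed["root"]["a"][1]["b"])
p first_b   # => [nil]
p second_b  # => [nil, nil]
```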

special characters problem on ruby 1.9.3-p392

The following code raises an exception on Ruby 1.9.3-p392 (and probably 2.0) with both the Nokogiri and REXML parsers.

Nori.new.parse('<root>&amp;#039;</root>')

# => NameError: uninitialized constant REXML::Text::Document

Why add Hash#to_params

Is there a particular reason why Nori defines Hash#to_params (and related methods)? This method doesn't seem to be used anywhere inside Nori or Savon, it's not in the README, and it doesn't seem to be related to XML parsing.

I'm asking in particular because I have a pull request into Rails which defines Hash#to_params (with a different meaning), since Rails already defines Hash#to_param with the same basic meaning as to_params has in this project.

Selective disabling of advanced_typecasting for certain attributes

Hi there,

I was wondering if it's possible to disable advanced_typecasting for certain attributes?

Right now, I'm consuming an API that gives coordinate data back in two potential formats; decimal, and D:M:S. The problem is, that when a field has a D:M:S geographic coordinate in it, it gets converted to a Time object.

It's a handy feature, and I'm using advanced_typecasting elsewhere, so I'm not 100% keen on switching it off entirely, and would just like to instruct Nori, by way of Savon, to not typecast a list of attributes I specify upfront.

Many thanks!
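
A workaround that needs no change in Nori could be to parse with advanced_typecasting turned off and cast selectively afterwards. The sketch below is only an illustration: typecast_except, cast_string and TIME_LIKE are hypothetical names, the time pattern is deliberately simplified, and the input hash stands in for a parsed response.

```ruby
require "time"

# Simplified, hypothetical stand-in for an xs:time pattern.
TIME_LIKE = /\A\d{2}:\d{2}:\d{2}\z/

# Cast a single string; fall back to the original on parse errors.
def cast_string(value)
  case value
  when "true"    then true
  when "false"   then false
  when TIME_LIKE then Time.parse(value)
  else value
  end
rescue ArgumentError
  value # looked like a time but was not one
end

# Walk the parsed structure and cast every string except those stored
# under one of the keys in skip_keys.
def typecast_except(node, skip_keys, key = nil)
  case node
  when Hash
    node.each_with_object({}) { |(k, v), out| out[k] = typecast_except(v, skip_keys, k) }
  when Array
    node.map { |item| typecast_except(item, skip_keys, key) }
  when String
    skip_keys.include?(key) ? node : cast_string(node)
  else
    node
  end
end

raw = { "position" => { "latitude" => "12:30:45", "recorded_at" => "09:33:55", "valid" => "true" } }
result = typecast_except(raw, ["latitude"])
```

Here "latitude" stays a String even though it looks like a time, while the other values are converted.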

Nori 2 doesn't parse whole XML but only the first root tag

xml = <<-XML
      <request>
        <entities href="#id1">
        </entities>
      </request>
      <entity id="id1">
        <foo><bar>1</bar></foo>
        <sub href="#id2" />
      </entity>
      <ololo id="id2">
        <foo>1</foo>
      </ololo>
XML

Nori.new.parse xml # {"request"=>{"entities"=>{"@href"=>"#id1"}}}

advanced_typecasting regex fails to parse time-like fields

Currently not correctly parsing X-ContentStamp.

This appears to be an issue with the XS_TIME regex

> xml = "<InternetMessageHeader HeaderName=\"X-ContentStamp\">25:12:3380570386</InternetMessageHeader>"
> Nori.parse( xml)
RangeError: integer 3380570386 too big to convert to `int'
from .rbenv/versions/2.1.3/lib/ruby/2.1.0/time.rb:264:in `local'
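
The RangeError above escapes because the conversion only seems to guard against some exception types. A hedged sketch of a more defensive wrapper follows; safe_convert is a hypothetical name modeled on the try_to_convert call visible in Nori's source, not the library's actual implementation.

```ruby
require "time"

# Run the conversion block and fall back to the original value
# if the input merely looked like a date or time.
def safe_convert(value)
  yield value
rescue ArgumentError, RangeError
  value
end

parsed = safe_convert("09:33:55") { |v| Time.parse(v) }
puts parsed.class
```

With this wrapper, a call such as safe_convert("25:12:3380570386") { |v| Time.parse(v) } should hand back the original string instead of raising.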

backtrace
=> [".rbenv/versions/2.1.3/lib/ruby/2.1.0/time.rb:264:in `local'",
 ".rbenv/versions/2.1.3/lib/ruby/2.1.0/time.rb:264:in `make_time'",
 ".rbenv/versions/2.1.3/lib/ruby/2.1.0/time.rb:331:in `parse'",
 ".rbenv/versions/2.1.3/gemsets/backupify-clean/gems/nori-1.1.5/lib/nori/xml_utility_node.rb:223:in `block in advanced_typecasting'",
 ".rbenv/versions/2.1.3/gemsets/backupify-clean/gems/nori-1.1.5/lib/nori/xml_utility_node.rb:259:in `call'",
 ".rbenv/versions/2.1.3/gemsets/backupify-clean/gems/nori-1.1.5/lib/nori/xml_utility_node.rb:259:in `try_to_convert'",
 ".rbenv/versions/2.1.3/gemsets/backupify-clean/gems/nori-1.1.5/lib/nori/xml_utility_node.rb:223:in `advanced_typecasting'",
 ".rbenv/versions/2.1.3/gemsets/backupify-clean/gems/nori-1.1.5/lib/nori/xml_utility_node.rb:140:in `to_hash'",
 ".rbenv/versions/2.1.3/gemsets/backupify-clean/gems/nori-1.1.5/lib/nori/parser/nokogiri.rb:44:in `parse'",
 ".rbenv/versions/2.1.3/gemsets/backupify-clean/gems/nori-1.1.5/lib/nori/parser.rb:31:in `parse'",
 ".rbenv/versions/2.1.3/gemsets/backupify-clean/gems/nori-1.1.5/lib/nori.rb:12:in `parse'",

Should unescaped XML raise an error?

I assume this is expected behavior since xml should escape ampersands, but I was surprised by the output:

irb(main):007:0> xml = Nori.new
=> #<Nori:0x007fa2ab5caaf8 @options={:strip_namespaces=>false, :convert_tags_to=>nil, :advanced_typecasting=>true, :parser=>:nokogiri}>

irb(main):008:0> xml.parse "<outer><test>Hello&Goodbye</test></outer>"
=> {"test"=>"Hello"}

irb(main):009:0> xml = Nori.new :advanced_typecasting => false
=> #<Nori:0x007fa2ac85ef98 @options={:strip_namespaces=>false, :convert_tags_to=>nil, :advanced_typecasting=>false, :parser=>:nokogiri}>

irb(main):010:0> xml.parse "<outer><test>Hello&Goodbye</test></outer>"
=> {"test"=>"Hello"}

irb(main):011:0> xml = Nori.new :parser => :nokogiri
=> #<Nori:0x007fa2ab8f66a0 @options={:strip_namespaces=>false, :convert_tags_to=>nil, :advanced_typecasting=>true, :parser=>:nokogiri}>

irb(main):012:0> xml.parse "<outer><test>Hello&Goodbye</test></outer>"
=> {"test"=>"Hello"}

If the ampersand is escaped it works:

irb(main):013:0> xml = Nori.new
=> #<Nori:0x007fa2abe82150 @options={:strip_namespaces=>false, :convert_tags_to=>nil, :advanced_typecasting=>true, :parser=>:nokogiri}>

irb(main):014:0> xml.parse "<outer><test>Hello&amp;Goodbye</test></outer>"
=> {"outer"=>{"test"=>"Hello&Goodbye"}}

I would have expected this to raise an error, rather than return a bad result. Would be interested to hear any comments. Thanks!

XML "Arrays"

Since there is no proper Array type in XML, a common pattern is to do this:

<sports>
  <sport></sport>
  <sport></sport>
</sports>

The issue is that Nori will translate it as follows:

{sports: { sport: [{}, {}] }}

It's not wrong, but it's not convenient to use and is easy to misunderstand when reading the code (the plural being a hash and the singular being an array). I think it would be a nice improvement to have an option to reduce these snippets to {sports: [{}, {}]}.
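
Without such an option, the reduction can be done on the parsed hash. The sketch below uses a simple heuristic that is an assumption of this example, not Nori behavior: a hash with exactly one key whose value is an array is replaced by that array. collapse_wrappers is a hypothetical helper name.

```ruby
# Recursively replace single-key wrapper hashes around arrays
# with the array itself.
def collapse_wrappers(node)
  case node
  when Hash
    if node.size == 1 && node.values.first.is_a?(Array)
      node.values.first.map { |item| collapse_wrappers(item) }
    else
      node.transform_values { |value| collapse_wrappers(value) }
    end
  when Array
    node.map { |item| collapse_wrappers(item) }
  else
    node
  end
end

parsed_sports    = { sports: { sport: [{}, {}] } }
collapsed_sports = collapse_wrappers(parsed_sports)
p collapsed_sports  # => {:sports=>[{}, {}]} (Ruby 3.4+ prints {sports: [{}, {}]})
```

A wrapper with a single child is parsed as a hash rather than an array, so this heuristic leaves it alone; handling that case would need knowledge of the schema.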

empty_tag_value not obeying xsi:nil ?

I'd like to get nil when xsi:nil="true", and a blank string when the tag is empty.

irb(main):025:0> parser = Nori.new(empty_tag_value: "")
=> #<Nori:0x00001c2965f690 @options={:strip_namespaces=>false, :delete_namespace_attributes=>false, :convert_tags_to=>nil, :convert_attributes_to=>nil, :empty_tag_value=>"", :advanced_typecasting=>true, :convert_dashes_to_underscores=>true, :parser=>:nokogiri}>
irb(main):026:0> parser.parse('<foo/>')
=> {"foo"=>""}
irb(main):027:0> parser.parse('<foo xsi:nil="true"/>')
=> {"foo"=>""}
irb(main):028:0>

Nori explodes when there is a trailing apostrophe

Consider the following SOAP answer from our payment backend:

<?xml version="1.0" encoding="ISO-8859-1"?><SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><SOAP-ENV:Body><ns1:authorizeResponse xmlns:ns1="http://schemas.xmlsoap.org/soap/envelope/"><ipaymentReturn><status xmlns="">SUCCESS</status><successDetails xmlns=""><retTransDate xmlns="">13.04.12</retTransDate><retTransTime xmlns="">18:21:59</retTransTime><retTrxNumber xmlns="">1-987654321</retTrxNumber><retAuthCode xmlns=""></retAuthCode><retStorageId xmlns="">12345678</retStorageId></successDetails><paymentMethod xmlns="">VisaCard</paymentMethod><trxRemoteIpCountry xmlns="">DE</trxRemoteIpCountry></ipaymentReturn></ns1:authorizeResponse></SOAP-ENV:Body></SOAP-ENV:Envelope>

Now if you add an apostrophe (') to this answer, Nori parses this as {:envelope=>2013-04-12 01:00:00 +0200}. A trailing apostrophe happens often, e.g. if you mock SOAP answers with '...lots of ugly xml', insert some variables, find out that it's best to encapsulate strings with %{...lots of ugly xml...}, and just forget to remove the ' at the end. And you spend hours debugging.

nil.blank? test failure

The test that asserts 'nil' is blank? is failing (1.9.2-p180). I refactored the test a bit so I could figure out which one was failing:

https://gist.github.com/976457

Failures:

  1. Object#blank? should return true for 'nil' objects
    Failure/Error: blanky.blank?.should be_true
    expected false to be true

    ./spec/nori/core_ext/object_spec.rb:8:in `block (4 levels) in <top (required)>'

It ignores attributes when a child is a text node.

Original conversation with @tjarratt started in #50

I'm working with a SOAP API that has XML text nodes with an attribute. I'm not sure how to pass the hash to Savon to render this node correctly. Using Nori's parse to try to reverse-engineer the hash from the XML, I saw that it didn't return a hash with the attribute visible.

This is apparently a desired feature, as I found a supporting spec (#L305):

305       it "should ignore attributes when a child is a text node" do
306         xml = "<root attr1='1'>Stuff</root>"
307         expect(parse(xml)).to eq({ "root" => "Stuff" })
308       end

I feel this is incorrect and the attribute should not be ignored. How else can we pass attributes for nodes where the child is text?

Strange error when passing malformed xml

data = "<?xml version=\"1.0\" encoding=\"utf-8\">\n<request>\n <opcode>0</opcode>\n</request>\n"
Nori.new.parse data
NoMethodError: undefined method 'add_node' for nil:NilClass

The question mark is omitted at the end of the XML header.
I expected something like "Malformed XML" or "Parsing error", but got a NoMethodError instead.

List of elements with repetition

I have a SOAP XML Array which has some tags repeated.

The resulting hash contains the repeated keys as an Array and the elements without repetition as a Hash.
How can I recover the order of elements in the original XML Array?

<nodes>
  <foo>
    <name>a</name>
  </foo>
  <bar>
    <name>b</name>
  </bar>
  <baz>
    <name>c</name>
  </baz>
  <foo>
    <name>d</name>
  </foo>
  <bar>
    <name>e</name>
  </bar>
</nodes>

is parsed as

{nodes: {
  foo: [{name: "a"}, {name: "d"}],
  bar: [{name: "b"}, {name: "e"}],
  baz: {name: "c"}
}}

In my XML Response, the order of elements has meaning and i'm not sure how to retain this information.

Floating response attributes signature

When parsing a document such as <foo attr="attr">...</foo> with to_hash, you always expect to get '...' as the value of the tag. However, if the value is empty (which should be treated as an empty string), Nori injects the attribute list as a child hash. Since you can get the attributes of any node with .attributes, there is no reason to do this.

strip_namespaces doesn't work for attributes

XML:

<wd:Position_Data wd:Effective_Date="2022-02-01">
  <wd:Position_ID>12345</wd:Position_ID>
</wd:Position_Data>

Ruby:

content = File.read('response.xml')
nori = Nori.new(strip_namespaces: true)
hash = nori.parse(content)
puts hash.inspect

Expected:

{"Position_Data"=>{"Position_ID"=>"12345", "@Effective_Date"=>"2022-02-01"}}

Actual:

{"Position_Data"=>{"Position_ID"=>"12345", "@wd:Effective_Date"=>"2022-02-01"}}
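
Until strip_namespaces covers attributes, the prefixes can be removed from the parsed hash. A minimal sketch, assuming String keys as in the output above; strip_attribute_prefixes is a hypothetical helper, and it deliberately leaves "@xmlns:..." declarations untouched.

```ruby
# Recursively rename "@prefix:name" attribute keys to "@name".
def strip_attribute_prefixes(node)
  case node
  when Hash
    node.each_with_object({}) do |(key, value), result|
      new_key = key.is_a?(String) ? key.sub(/\A@(?!xmlns:)[^:]+:/, "@") : key
      result[new_key] = strip_attribute_prefixes(value)
    end
  when Array
    node.map { |item| strip_attribute_prefixes(item) }
  else
    node
  end
end

actual  = { "Position_Data" => { "Position_ID" => "12345", "@wd:Effective_Date" => "2022-02-01" } }
cleaned = strip_attribute_prefixes(actual)
p cleaned
```

If two attributes differ only by prefix, the later one would overwrite the earlier one, so this is only safe when attribute names are unique per element.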

Nori dynamic require breaks in Jruby

Using MRI Ruby there's no issue to load a library or a module dynamically. However, in Jruby this is not thread safe per se and should be handled with care:

You should take care in the following situations: concurrent requires, or lazy requires that may happen at runtime in a parallel thread. If objects are in flight while classes are being modified, or constants/globals/class variables are set before their referrents are completely initialized, other threads could have problems.
from https://github.com/jruby/jruby/wiki/Concurrency-in-jruby#thread_safety

Here's a simple XML parsing code snippet that shows the problem in JRuby and runs fine in MRI.

xml = '<?xml version="1.0" encoding="UTF-8"?> <soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <soapenv:Body> <ns1:response xmlns:ns1="urn:api.example.de/soap11/RpcSession" soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"> <loginReturn xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/" xsi:type="soapenc:string">foobar</loginReturn> </ns1:response> </soapenv:Body> </soapenv:Envelope>'

100.times.map do
  Thread.new do
    nori_options = {
      :strip_namespaces=>true,
      :advanced_typecasting=>true,
      :parser=>:nokogiri
    }
    nori = Nori.new(nori_options)
    p nori.parse(xml)
  end
end.each(&:join)

However, adding the require "nori/parser/nokogiri" statement as seen in https://github.com/savonrb/nori/blob/master/lib/nori.rb#L53 fixes the problem for Jruby.

Inconsistent types when parsing empty tags

When parsing an empty tag, the result varies depending on the attributes the element has:

Nori.new.parse('<foo />')
#=> {"foo"=>nil}

Nori.new.parse('<foo bar />')
#=> {}

Nori.new.parse('<foo bar="baz"/>')
#=> {"foo"=>{"@bar"=>"baz"}}

Nori.new.parse('<foo bar="baz">Content</foo>')
#=> {"foo"=>"Content"}

It would be handy if there was an option to configure that the parsed response stays consistent (like for string values), even if some information may be lost (e.g. always returning nil, even if there were attributes)

Maybe something like:

Nori.new(ignore_empty_attributes: true).parse('<foo bar />')
#=> {"foo"=>nil}

Fingerprint and time type

Nori identifies a fingerprint (e6:53:ec:01:00:ce:b4:06:c5:3d:49:7c:e8:a0:b0:e7:1b:6c:33:6c) as a time type and tries to parse it with the Time.parse method. Because this isn't a time string, an error occurs:

"argument out of range"

Nori 1.1.1 includes breaking change scheduled for 2.0

The nori 1.1.1 point release includes the #xml_attributes change that the CHANGELOG indicates is scheduled for the 2.0 version. May I propose that 1.1.1 be yanked and that 1.1.2 be uploaded? All users of savon performing gem update or new installs will get this version that will not work with existing code using xml attributes.

nokogiri gem < 1.6 dependency needed?

I'm running a project that uses gems that require nokogiri >= 1.6.0, yet this gem has a really tight requirement of < 1.6 on nokogiri.

Any reasons for this? Can it be loosened?

Thanks!

Nokogiri Parser should strip xml before parsing

The Nokogiri parser checks whether the stripped XML is empty before parsing, but passes the unstripped XML to the parser.

This means that you have to strip it yourself before passing it to the parser, which makes the strip in the guard clause unnecessary.

A better and more robust solution is for the Nori parse method to strip the xml before passing to Nokogiri.

I'll throw together a quick pull request.

Nori can't parse formatted XML

For example:

This works

xml = '<AvailStatusMessage><StatusApplicationControl Start="2009-11-01" End="2009-11-14" Mon="0" Tue="0" Weds="0" Thur="0" Fri="1" Sat="1" Sun="1"></StatusApplicationControl><LengthsOfStay><LengthOfStay MinMaxMessageType="SetMinLOS" Time="2" TimeUnit="Day"></LengthOfStay></LengthsOfStay><UniqueID Type="16" ID="1"></UniqueID><RestrictionStatus Status="Open"></RestrictionStatus></AvailStatusMessage>'

Nori.new.parse xml
=> {"AvailStatusMessage"=>{"StatusApplicationControl"=>{"@Start"=>"2009-11-01", "@End"=>"2009-11-14", "@Mon"=>"0", "@Tue"=>"0", "@Weds"=>"0", "@Thur"=>"0", "@Fri"=>"1", "@Sat"=>"1", "@Sun"=>"1"}, "LengthsOfStay"=>{"LengthOfStay"=>{"@MinMaxMessageType"=>"SetMinLOS", "@Time"=>"2", "@TimeUnit"=>"Day"}}, "UniqueID"=>{"@Type"=>"16", "@ID"=>"1"}, "RestrictionStatus"=>{"@Status"=>"Open"}}}

And this doesn't work

xml = '<AvailStatusMessages HotelCode="HXYORZZ">
                
        <AvailStatusMessage>
                      
          <StatusApplicationControl Start="2009-11-01" End="2009-11-14" Mon="0" Tue="0" Weds="0" Thur="0" Fri="1" Sat="1" Sun="1">
                        
          </StatusApplicationControl>
                      
          <LengthsOfStay>
                            
            <LengthOfStay MinMaxMessageType="SetMinLOS" Time="2" TimeUnit="Day">
                              
            </LengthOfStay>
                        
          </LengthsOfStay>
                      
          <UniqueID Type="16" ID="1"></UniqueID>
                      
          <RestrictionStatus Status="Open"></RestrictionStatus>
                  
        </AvailStatusMessage>'

Nori.new.parse xml
=> {"AvailStatusMessages"=>"        \n        <AvailStatusMessage>            \n          <StatusApplicationControlstart=\"2009-11-01\" end=\"2009-11-14\" mon=\"0\" tue=\"0\" weds=\"0\" thur=\"0\" fri=\"1\" sat=\"1\" sun=\"1\">            \n          </StatusApplicationControl>            \n          <LengthsOfStay>                \n            <LengthOfStaymin_max_message_type=\"SetMinLOS\" time=\"2\" time_unit=\"Day\">                \n            </LengthOfStay>            \n          </LengthsOfStay>            \n          <UniqueIDtype=\"16\" id=\"1\"></UniqueID>            \n          <RestrictionStatusstatus=\"Open\"></RestrictionStatus>        \n        </AvailStatusMessage>"} 

2.0.2 Breaking Savon

I get the following with nori-2.0.2

NoMethodError: undefined method `key?' for #<Nori::StringWithAttributes:0x007f98b430e400>
.../bundle/ruby/1.9.1/gems/savon-2.0.2/lib/savon/response.rb:37:in `body'
.../bundle/ruby/1.9.1/gems/savon-2.0.2/lib/savon/response.rb:43:in `to_array'

Downgrading to nori-2.0.0 allows Savon to work properly again.

xs:date regex matches invalid dates

As part of a response the string "DS2001-19-1312654773" (which is actually a filesystem path) matches the regex XS_DATE. This causes Date.parse to raise an exception. Anchoring the regex at the start (^) fixes this problem for this particular string but I think it should probably be anchored at the end too ($).
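
The effect of anchoring at both ends can be sketched as follows. ANCHORED_XS_DATE is a simplified, hypothetical pattern (Nori's real XS_DATE may differ), and cast_date is a hypothetical helper; \A and \z are used instead of ^ and $ because the latter match at line boundaries in Ruby rather than at the ends of the string.

```ruby
require "date"

# Hypothetical xs:date-like pattern anchored at both ends of the string.
ANCHORED_XS_DATE = /\A-?\d{4}-\d{2}-\d{2}(?:Z|[+-]\d{2}:?\d{2})?\z/

# Convert only strings that match the anchored pattern, and keep the
# original value if Date.parse still rejects it.
def cast_date(value)
  return value unless ANCHORED_XS_DATE.match?(value)

  Date.parse(value)
rescue ArgumentError
  value
end

p cast_date("DS2001-19-1312654773")
p cast_date("2001-10-26")
```

The filesystem path no longer matches, so it is returned as a plain string, while a well-formed date is still converted.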

Nori converts String to Nori::StringWithAttributes

If I use Savon for a SOAP call, I get a Hash with the response, and that hash contains Strings.
All Strings in the hash are of type Nori::StringWithAttributes instead of String.
Only result[:something].class reveals that.

In my case I store the values in an object which I convert to YAML, and this is the result:

test_id: !ruby/string:Nori::StringWithAttributes
str: '9482202'
attributes: {}

And a YAML.load(from_yaml) returns test_id = nil

@rubiii: if you need more input call me or weidenfreak ;)

cu
Jedbeard

Do not replace '-' with '_' in tag names.

Hello.

I'm trying to use the Savon library, which uses Nori, and I have one problem. Nori changes '-' to '_' in tag names. For example, 'some-attribute' will be changed to 'some_attribute' (I want to use the response hash later as an argument for a SOAP request). There is no option to change this, and I would like to have one. I think the conversion should simply be removed, because there is already the :convert_tags_to option for converting tags.

Breaking change between 2.4.0 and 2.6.0

I haven't had chance to investigate what caused this, but after an upgrade from 2.4.0 to 2.6.0, newlines are included in parsed fields.

In our case we are using Savon to talk to an API returning XML in XML. When parsing the embedded XML with Nokogiri, the newlines are mapped to Nokogiri::XML::Text which resulted in obscure errors and much debugging.

> Nori::VERSION
=> "2.4.0"
> Nori.new.parse("<outer>\n&lt;embedded&gt;\n&lt;one&gt;&lt;/one&gt;\n&lt;two&gt;&lt;/two&gt;\n&lt;embedded&gt;\n</outer>")
=> {"outer"=>"<embedded><one></one><two></two><embedded>"}
> Nori::VERSION
=> "2.6.0"
> Nori.new.parse("<outer>\n&lt;embedded&gt;\n&lt;one&gt;&lt;/one&gt;\n&lt;two&gt;&lt;/two&gt;\n&lt;embedded&gt;\n</outer>")
=> {"outer"=>"<embedded>\n<one></one>\n<two></two>\n<embedded>\n"}

I think this change is correct, but please add a note about it somewhere as a breaking change :)

xsi:nil="true" nodes not nil when additional attributes present

A service I am interacting with returns nil elements along with the xsi type information, like so:

<node xsi:type="datetime" xsi:nil="true" />

Unfortunately, Nori currently converts this to:

{:'@xsi:type' => 'datetime' }

Instead of just nil

IMHO, this isn't the expected behaviour, since we can infer the type information from the WSDL, and don't need this lingering quasi-nil hash value around just for the type information.

Thoughts? I'm not sure if all attributes should be zapped when xsi:nil is encountered, or just xsi/xmln prefixed ones.

does not work with plain tags and whitespace

I have a tag set like this

irb(main):001:0> tagged = "<vb>Tell</vb> <prp>me</prp> <det>the</det> <jj>current</jj> <nn>temperature</nn>"

irb(main):002:0> parser = Nori.new.parse(tagged)
=> {"vb"=>"Tell"}

It does not parse the other tags.

If I remove the whitespace it still returns the same result. With the parser set to REXML:

irb(main):014:0> parser = Nori.new(:parser => :rexml)
=> #<Nori:0x81c956b8 @options={:strip_namespaces=>false, :delete_namespace_attributes=>false, :convert_tags_to=>nil, :convert_attributes_to=>nil, :empty_tag_value=>nil, :advanced_typecasting=>true, :convert_dashes_to_underscores=>true, :parser=>:rexml}>
irb(main):015:0> parser.parse('<vb>Tell</vb><prp>me</prp><det>the</det><jj>current</jj><nn>temperature</nn>')
=> {"vb"=>"Tell<prp>me</prp><det>the</det><jj>current</jj><nn>temperature</nn>"}

It is simply not returning the expected hash.
