raku-community-modules / xml
An Object-Oriented XML Library for Raku
License: Artistic License 2.0
Would you oppose breaking the classes/roles into separate files so they can be exported from the module separately? I'm asking because I'd like to use only XML::Document and the other ::Node objects in the NativeCall (Expat) version of this module that I'm currently working on.
When I iteratively append new nodes to a node set, the return value of append() is the stringified concatenation of all of the nodes that have been appended up to that iteration. For example, given
for @array-of-nodes -> $node {
    my $test = $nodeset.append($node);
    say "$test";
}
The value output for $test on the first iteration is the first $node stringified; on the next iteration, it is the string from the first iteration concatenated with the string from the second, etc.
zef won't automatically update it if you choose zef upgrade, and users have to do zef --force install XML to get it to install. Please bump the version. Thank you!
I am only using the from-file method, but this module allows me to get my dns data via the Namecheap.com API with no major problems. The only minor problem I had was that it was a bit of trial-and-error to see exactly how to extract the tidbits I needed.
Thanks for a very useful module!
It fails to parse a name with periods in it.
my $xml-part = "<name_contain.periods>some text</name_contain.periods>";
my $doc = XML::Grammar.parse($xml-part);
# returns (Any) in $doc
I tried changing the token in the grammar to:
token pident {
<!before \d> [ \d+ <.ident>* || <.ident>+ ]+ % <[-.]>
}
It worked when I defined the grammar in the script.
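A minimal stand-alone sketch of the idea, assuming only core Raku. The grammar name `NameSketch` and its tokens are hypothetical and are not the module's XML::Grammar; the character set is a simplified ASCII subset of XML names (no colons, no Unicode). It only illustrates a name rule where periods, hyphens, and digits are allowed after the first character:

```raku
# Hypothetical grammar for illustration only; not the module's XML::Grammar.
grammar NameSketch {
    token TOP        { ^ <name> $ }
    # A name is one start character followed by any number of name characters.
    token name       { <name-start> <name-char>* }
    # Simplified ASCII subset: a name may not start with a digit, '-' or '.'.
    token name-start { <[ A..Z a..z _ ]> }
    token name-char  { <[ A..Z a..z 0..9 _ \- . ]> }
}

say so NameSketch.parse('name_contain.periods');  # True
say so NameSketch.parse('item-1');                # True
say so NameSketch.parse('1item');                 # False
```

Treating the separators as ordinary name characters avoids having to special-case what may follow a hyphen or a period.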
When I do something like $xml-text ~= $xml-document.root; all text from XML::Text elements is processed in such a way that all whitespace is reduced to one space. For some elements like
Currently the library won't let you insert before/after a reference node that is not found in the parent element's node list, including an undefined one. I want to insert a node after a node found by some criterion, such as tag name; if none is found, insert before/after the first/last child; and if there are no nodes at all, insert it as the first child. The first two can easily be done with a one-liner like $parent.insertAfter($node,$parent.elements(:TAG<item>).tail // $parent.lastChild()), but to fulfill the last you have to check for child nodes and call a whole different method, like insert, to actually insert a node, complicating the code quite a bit.
XML can't parse documents where an element contains only whitespace. In t/parser.t, change $text to the following; note the <monkey> </monkey> element at the end:
my $text = '<test><title>The title</title><bullocks><item name="first"/><item name="second"/></bullocks><monkey> </monkey></test>';
Attempting to execute the tests will fail with could not parse XML. I've tried tweaking the textnode rule but I don't understand something about grammars well enough to make it work.
$ perl6 -v
This is rakudo version 2015.11-316-ga4ca12a built on MoarVM version 2015.11-22-g6e4b90f implementing Perl v6.b.
$ panda install XML
==> Fetching XML
==> Building XML
==> Testing XML
# Failed test 'set using Boolean.'
# at t/emitter.t line 30
# expected: 'standalone'
# got: 'True'
# Failed test 'element after set serialized properly'
# at t/emitter.t line 31
# expected: '<test><title alt="Alternate text">The title</title><bullocks standalone="standalone"><item name="first"/><item name="second"/></bullocks></test>'
# got: '<test><title alt="Alternate text">The title</title><bullocks standalone="True"><item name="first"/><item name="second"/></bullocks></test>'
# Looks like you failed 2 tests of 5
t/emitter.t ...........
Dubious, test returned 2 (wstat 512, 0x200)
Failed 2/5 subtests
t/example.t ........... ok
t/make.t .............. ok
t/namespaces.t ........ ok
t/parser.t ............ ok
t/preamble.t .......... ok
t/proxies.t ........... ok
t/query-methods.t ..... ok
t/query-positional.t .. ok
Test Summary Report
-------------------
t/emitter.t (Wstat: 512 Tests: 5 Failed: 2)
Failed tests: 4-5
Non-zero exit status: 2
Files=9, Tests=118, 7 wallclock secs ( 0.04 usr 0.01 sys + 6.58 cusr 0.34 csys = 6.97 CPU)
Result: FAIL
The spawned process exited unsuccessfully (exit code: 1)
in sub run-and-gather-output at /home/zoffix/.rakudobrew/moar-nom/install/share/perl6/site/sources/C27BE995DC18074CA8F64980F69FEB80BADF5619:86
in block at /home/zoffix/.rakudobrew/moar-nom/install/share/perl6/site/sources/A40B6CBA2E85D9DAA45064316EBEB9E42B0036E1:24
in sub indir at /home/zoffix/.rakudobrew/moar-nom/install/share/perl6/site/sources/C27BE995DC18074CA8F64980F69FEB80BADF5619:20
in method test at /home/zoffix/.rakudobrew/moar-nom/install/share/perl6/site/sources/A40B6CBA2E85D9DAA45064316EBEB9E42B0036E1:5
in method install at /home/zoffix/.rakudobrew/moar-nom/install/share/perl6/site/sources/1BC9777EC40C29C8331437E926CD4C13B983C026:141
in method resolve at /home/zoffix/.rakudobrew/moar-nom/install/share/perl6/site/sources/1BC9777EC40C29C8331437E926CD4C13B983C026:219
in sub MAIN at /home/zoffix/.rakudobrew/moar-nom/install/share/perl6/site/resources/9FF75FC978A3556E531F982825B3EDBBBA834D9E:18
in block <unit> at /home/zoffix/.rakudobrew/moar-nom/install/share/perl6/site/resources/9FF75FC978A3556E531F982825B3EDBBBA834D9E:145
Specifying an offset other than 1 is incompatible with the method's inner workings: it is implemented in terms of before with an offset one greater than the position to insert at, so the parameter is pretty much pointless.
I want to use XPath in a Perl6 Web::Scraper - is there anything in the works for building the XML tree from an HTML file?
Hi, first, thank you for the module.
I use it to dynamically update my Visual Studio project file and it works great. One thing I think could be improved: after saving the project file, the XML is all on one line. That is not a problem in itself, but in git, or if you look at the file with an editor, it is not very readable.
Is it possible to implement a pretty-printing parameter which adds new lines to the XML string when saving it?
Thanks
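A rough sketch of the kind of post-processing this asks for, assuming only core Raku and working on the serialized string rather than on the module's objects. The helper name `break-lines` is hypothetical. It only inserts a newline between adjacent tags; it does not indent, and it would also split a `><` sequence that happens to occur inside a CDATA section or comment:

```raku
# Hypothetical helper, not part of the XML module: put each tag that directly
# follows another tag on its own line. Text content is left untouched.
sub break-lines(Str $xml --> Str) {
    $xml.subst('><', ">\n<", :g)
}

say break-lines('<test><title>The title</title><item name="first"/></test>');
# <test>
# <title>The title</title>
# <item name="first"/>
# </test>
```

Because the newlines become whitespace-only text between elements, a reader that treats such whitespace as significant would see extra text nodes after reloading the file.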
I have an XML file with &amp; in it, and this is not decoded when I stringify the text node. The documentation doesn't mention XML entities at all, so I'm not sure whether to expect this or not.
<NAME>MARKS&amp;SPENCER</NAME>
say $doc.lookfor(:TAG('NAME') :SINGLE)[0]; # MARKS&amp;SPENCER
One would at least expect a facility to decode it, but none is provided either.
Ideally, the user should never see an XML entity; entities would be encoded/decoded transparently along the way.
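As a stopgap, decoding can be done on the stringified text. The following is a minimal sketch assuming only core Raku; the helper `decode-xml-entities` is hypothetical, is not provided by the module, and handles only the five predefined XML entities (no numeric character references, no DTD-defined entities):

```raku
# Hypothetical helper, not part of the XML module.
sub decode-xml-entities(Str $text --> Str) {
    # The five entities predefined by the XML specification.
    my %predefined = amp => '&', lt => '<', gt => '>', quot => '"', apos => "'";
    # Single pass over the string, so '&amp;lt;' correctly becomes '&lt;'.
    $text.subst(
        / '&' ( <[a..z]>+ ) ';' /,
        -> $/ { %predefined{~$0} // ~$/ },   # unknown entities are left as-is
        :g
    );
}

say decode-xml-entities('MARKS&amp;SPENCER');   # MARKS&SPENCER
```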
Raku uses IO::Path to represent filenames so it would make sense that functions accept an IO::Path wherever a filename is expected
[0] > my $svg = dir('.', test => *.ends-with('.svg'))[0];
"thumbnail.svg".IO
[1] > from-xml-file($svg)
Type check failed in binding to parameter '$file'; expected Str but got IO::Path (IO::Path.new("thumbn...)
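One way a routine can accept both forms is a coercion type in its signature. This is a sketch assuming only core Raku; `load-strict` is a hypothetical stand-in for any routine declared with `Str $file`, and `load-any` is a hypothetical wrapper, neither being the module's own code:

```raku
# Hypothetical stand-in for a routine that only accepts a Str file name.
sub load-strict(Str $file) { $file.IO.slurp }

# `IO()` is a coercion type: the caller may pass a Str or an IO::Path,
# and the body always receives an IO::Path.
sub load-any(IO() $file) { load-strict($file.Str) }

# Demonstrate with a throwaway file in the temporary directory.
my $path = $*TMPDIR.add("io-path-sketch-{$*PID}.xml");
$path.spurt('<svg/>');
say load-any($path);        # <svg/>
say load-any($path.Str);    # <svg/>
$path.unlink;
```

Until a signature like this is available, calling `.Str` on the IO::Path at the call site has the same effect.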
Hi!
I'm trying to parse a 52 MByte XML file and the performance is really bad.
I'm trying to follow the instructions and just doing:
my $XML = 'ec_inventory_en.xml';
sub MAIN(){
my $xml = from-xml-file($XML);
}
This code uses more than 5 GB of memory [1], only one core is used [2], and it takes more than 3m30s (in comparison, a Perl version takes around 15 seconds to parse the file).
[1] - Reported by cat /proc/$PID/smaps | grep -i pss | awk '{Total+=$2} END {print Total/1024" MB"}'
[2] -
Building/Installing XML.pm6 or other modules that run XML tests gives this kind of error:
Not enough positional parameters passed; got 0 but expected 1
in sub from-xml at lib/XML.pm6:1067
in block at t/example.t:10
lib/XML.pm6:1067 is this:
proto from-xml ($) is export {*}
Would be nice to have a remove namespace method. I can get by with
$el.attribs{"xmlns:xyz"}:delete;
Regards,
Marcel
If an XML input (file or string) has an XML prolog (<?xml.....), the version and encoding will have the quotes in them and on output will not parse correctly.
In addition, the code looked like it would not handle a prolog with single quotes,
and
the standalone parameter is not supported (at least it does not look like it is).
I fixed the first two items, BUT while I have 26 years of Perl coding, I am just starting on Perl 6.
Attached is (at least I am going to try to attach) a git patch file with two commits that:
OK, I can't attach a patch file. I will try to find an email address and send it to you.
I'm not sure at the moment where the parse failure is happening, but other validators confirm the file (from Unicode) is valid XML.
(rename file from ee.txt to ee.xml because GH doesn't like XML uploads for some reason).
I'll try to investigate further, but my guess is whatever is causing the problem is also causing problems with kn.xml, ks.xml, lo.xml, ml.xml, mr.xml, and yav.xml inside of the main/ directory for the CLDR repository.
It seems like the problem is in the module
The code:
#!/usr/bin/env perl6
use v6;
use XML;
my Str $log-file;
if %*ENV<MQ_LOG>:exists {
    $log-file = %*ENV<MQ_LOG>;
} else {
    $log-file = "$*HOME/.mq/log.xml";
}
unless $log-file.IO.e {
    spurt $log-file, make-xml('log', \('meta', :version<1>)).Str;
}
my XML::Document $log = from-xml-file($log-file);
sub MAIN {
    $log[3].append(make-xml('group',
        :actual-length<1>,
        :original-length<2>,
        :max<3>,
        :timestamp<4>,
        :score<5>,
        :level<6>
        )
    );
    say "Üks!";
    spurt $log-file, $log.Str;
    say "Kaks!";
}
The error:
Use of uninitialized value of type Any in string context.
Methods .^name, .perl, .gist, or .say can be used to stringify it to something meaningful.
in block at /home/ron/rakudo/install/share/perl6/site/sources/23A69E0485BA94AAA7B51C8E2892B44F68D5C5DF (XML::Element) line 774
Use of uninitialized value of type Any in string context.
Methods .^name, .perl, .gist, or .say can be used to stringify it to something meaningful.
in block at /home/ron/rakudo/install/share/perl6/site/sources/23A69E0485BA94AAA7B51C8E2892B44F68D5C5DF (XML::Element) line 774
For example:
my @elements = $mydocument.root.nodes;
for @elements -> $element {
    my $new-element = @new-elements.pop;
    $mydocument.root.replace($element, $new-element);
}
seems to result in only the first and last $new-element ending up in the document. Sorry, I have not golfed this down further. I see a similar result with the replaceChild() syntax.
While trying to reproduce some examples (from the test folder), I stumbled onto an indexing oddity.
Below, $xml.root[1], $xml.root[3], $xml.root[5], and $xml.root[7] return text.
Conversely, $xml.root[0], $xml.root[2], $xml.root[4], $xml.root[6], and $xml.root[8] return blank lines.
Is this canonical XML behavior? Thx.
~$ raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); $xml.say;' ~/exemel_text.xml
<?xml version="1.0"?><root>
<file>text1</file>
<file>text2</file>
<file>text3</file>
<file>text4</file>
</root>
~$ raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); $xml.root.say;' ~/exemel_text.xml
<root>
<file>text1</file>
<file>text2</file>
<file>text3</file>
<file>text4</file>
</root>
~$ raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); $xml.root[0].say;' ~/exemel_text.xml
~$ raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); $xml.root.[0].say;' ~/exemel_text.xml
~$ raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); $xml.root.[1].say;' ~/exemel_text.xml
<file>text1</file>
~$ raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); $xml.root.[2].say;' ~/exemel_text.xml
~$ raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); $xml.root.[3].say;' ~/exemel_text.xml
<file>text2</file>
~$ raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); $xml.root.[4].say;' ~/exemel_text.xml
~$ raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); $xml.root.[5].say;' ~/exemel_text.xml
<file>text3</file>
~$ raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); $xml.root.[6].say;' ~/exemel_text.xml
~$ raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); $xml.root.[7].say;' ~/exemel_text.xml
<file>text4</file>
~$ raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); $xml.root.[8].say;' ~/exemel_text.xml
~$ raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); $xml.root.[9].say;' ~/exemel_text.xml
(Any)
~$
Rakudo 2023.05 / MacOS;
XML:ver<0.3.3>:auth<zef:raku-community-modules>
This might be considered a rakudo bug rather than an exemel bug, but the current built-in .perl method does not appear to handle circular data well. Running .perl on an XML::Node never returns. In the meantime, it might be easy to create a version that outputs something like:
from-xml("...")
where the ... is replaced with the stringified representation of the XML::Node. It's not perfect, but at least it would work.
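The suggestion can be sketched on a toy class, assuming only core Raku. `SketchNode` is hypothetical and much simpler than XML::Node; it just has the same child-to-parent back-reference that makes the structure circular. The method is named `raku` here, the current name of what the report calls `.perl`:

```raku
# Hypothetical toy class with a circular parent/child link; not XML::Node.
class SketchNode {
    has Str        $.name;
    has SketchNode $.parent is rw;
    has SketchNode @.children;

    # Attach a child and set its back-reference to this node.
    method add(SketchNode $child) {
        $child.parent = self;
        @!children.push($child);
        self
    }

    # Serialize top-down only, so the parent link is never followed.
    method Str {
        @!children
            ?? "<$!name>" ~ @!children».Str.join ~ "</$!name>"
            !! "<$!name/>"
    }

    # Emit a constructor-like expression built from the string form
    # instead of walking the attributes.
    method raku { 'from-xml(' ~ self.Str.raku ~ ')' }
}

my $root = SketchNode.new(name => 'test').add(SketchNode.new(name => 'item'));
say $root.raku;   # from-xml("<test><item/></test>")
```

The output round-trips only the serialized content, which matches the "not perfect, but at least it would work" trade-off described above.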
related to #63?
$ git clone https://github.com/tony-o/perl6-web-scraper
$ cd perl6-web-scraper
$ raku -I. -MXML -e "from-xml('t/data/s05.html'.IO.slurp)"
seems to hang, taking up a single CPU but never (in my limited patience) returning.
Mac M2:
$ sw_vers
ProductName: macOS
ProductVersion: 14.1.2
BuildVersion: 23B92
This program:
use XML;
my $xml = from-xml-file('test.xml');
outputs an error when test.xml is like this (which passes the validation test here: http://www.w3schools.com/xml/xml_validator.asp):
<?xml version="1.0" encoding="UTF-8"?>
<test>
<greeting en="hello">world</greeting>
<for>
<item-1>Yes</item-1>
<item-1>No</item-1>
<item-1>Maybe</item-1>
<item-1>Who cares?</item-1>
</for>
</test>
but works fine if test.xml is:
<?xml version="1.0" encoding="UTF-8"?>
<test>
<greeting en="hello">world</greeting>
<for>
<item-dash>Yes</item-dash>
<item-dash>No</item-dash>
<item-dash>Maybe</item-dash>
<item-dash>Who cares?</item-dash>
</for>
</test>
Hi,
Translating a single quote in an attribute value into its XML entity does not help in some situations. An example is: <img onclick="alert('clicked')" />. In these cases the JavaScript is rendered unusable once the single quotes are replaced by entities.
Regards,
Marcel
Say we have a table row where some columns have a style attribute and some don't. We need those where the attribute doesn't contain a substring:
$tr.lookfor(:TAG<td>, :style{ ! (.defined && .contains("display:none")) });
The problem here is that the code sees :style and skips all nodes where it is missing.
I think the right approach for a code match would be to pass in Nil for every missing style attribute.
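The proposed behavior can be illustrated without the module, assuming only core Raku. In this sketch plain hashes stand in for the attribute sets of three cells, and `&wanted` is the same block as in the call above; none of this is the module's lookfor implementation. The matcher is called for every cell, and a missing attribute simply arrives as an undefined value:

```raku
# Plain hashes standing in for the attributes of three <td> cells (hypothetical data).
my @cells = %(style => 'display:none'), %(style => 'color:red'), %();

# The same test as in the lookfor call: true unless the style hides the cell.
my &wanted = { !(.defined && .contains('display:none')) };

# Call the matcher for every cell; a missing key yields an undefined value,
# so cells without a style attribute are kept rather than skipped.
my @kept = @cells.grep({ wanted(.<style>) });

say @kept.elems;   # 2
```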