akjoshi / nxparser

Automatically exported from code.google.com/p/nxparser

License: Other

Java 99.16% XSLT 0.84%

nxparser's Introduction

This README covers some of the licensing questions relating to the NxParser project.

All source code under the NxParser project is released under the 3-clause BSD licence.
When building the "lite" version of the NxParser, no dependencies are required and this
licence applies ("licence-lite.txt").

The "full" version of the NxParser build includes dependencies (found in the lib/ folder)
available under the licence found in "licence-full.txt". The third party licence is found
under "MPL-1.1.txt".

nxparser's Issues

Clarify license

The project source code contains three licenses: license-full, license-lite and 
MPL-1.1. Why is that so? What is the "real" license?

Original issue reported on code.google.com by [email protected] on 22 Dec 2011 at 9:34

Invalid language tags cause NPE

I have a .nt file where some of the language tags are upper-case, like this:

<http://www.sorbaioli.org/2009/02/23/wordpress-plugin-for-identica/> 
<http://creativecommons.org/ns#attributionName> "Graziano Sorbaioli"@EN .

I'm not exactly sure if this is according to spec or not, but in any case NxParser 
issues a warning when parsing this line:

WARNING: Something wrong with the literal-backing string. The parsing regex 
pattern didn't match. Check the string for correct N3 syntax. The malicious 
string is: "Graziano Sorbaioli"@EN

However, parseNodes() still returns Node objects, but when I try to call 
toString() I get an NPE:

java.lang.NullPointerException
    at org.semanticweb.yars.nx.util.NxUtil.unescape(NxUtil.java:178)
    at org.semanticweb.yars.nx.util.NxUtil.unescape(NxUtil.java:164)
    at org.semanticweb.yars.nx.Literal.toString(Literal.java:235)
    at edu.kit.aifb.ats.triples.test.NxParserTest.test(NxParserTest.java:15)

I'd expect NxParser to fail at parsing time with an exception (or to just 
ignore the line) and not sometime later with an NPE. getData() returns null, 
which is also not great, but would at least be manageable.

Here is the code to reproduce:

    @Test
    public void test() throws ParseException {
        Node[] nx = NxParser.parseNodes("<http://www.sorbaioli.org/2009/02/23/wordpress-plugin-for-identica/> <http://creativecommons.org/ns#attributionName> \"Graziano Sorbaioli\"@EN .");
        System.out.println(nx[2].toString());
    }

Cheers,
Günter

Original issue reported on code.google.com by [email protected] on 9 Aug 2012 at 12:14
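
Editorial sketch: the LANGTAG rule in the RDF 1.1 N-Triples grammar is '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*, so an upper-case tag such as @EN is well-formed there. The snippet below is a minimal illustration of validating a literal token up front and failing at parse time. It is not NxParser's code: the class LiteralCheck, the method languageOf, the regex, and the choice to lower-case the tag are all assumptions made for this example.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of NxParser: checks a literal token before use. */
final class LiteralCheck {
    // Quoted lexical form, then an optional language tag OR an optional datatype IRI.
    // The language tag part accepts both upper and lower case letters.
    private static final Pattern LITERAL = Pattern.compile(
        "\"((?:[^\"\\\\]|\\\\.)*)\"(?:@([a-zA-Z]+(?:-[a-zA-Z0-9]+)*)|\\^\\^<([^<>\\s]*)>)?");

    /** Returns the language tag in lower case, or null if the literal has none. */
    static String languageOf(String token) {
        Matcher m = LITERAL.matcher(token);
        if (!m.matches()) {
            // Fail here rather than returning a node that breaks later in toString().
            throw new IllegalArgumentException("Not a valid literal: " + token);
        }
        String lang = m.group(2);
        return lang == null ? null : lang.toLowerCase(Locale.ROOT);
    }
}
```

With this check, "Graziano Sorbaioli"@EN is accepted, while a malformed token is rejected immediately instead of surfacing later as an NPE.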

NxParser.parseNodesInternal() enters an infinite loop on URLs with multiple adjacent spaces

What steps will reproduce the problem?
1. NxParser.parseNodes("<  http://subject/> \"predicate\" \"object\" .");

What is the expected output? What do you see instead?
Expected a ParseException.

What version of the product are you using? On what operating system?
1.2.2

Please provide any additional information below.
I've attached a patch that fixes the issue plus a simple unit test. I hope I've 
fixed more problems than I've introduced.

Original issue reported on code.google.com by [email protected] on 17 Apr 2012 at 3:43

Attachments:

NxParserMultiSpaceFix.patch
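
Editorial sketch: the attached patch is not reproduced here. As an illustration of the expected behaviour, the snippet below scans a single IRI token with an index that only moves forward, so it terminates on any input, and it rejects whitespace inside the angle brackets. The class IriScan and the method endOfIri are hypothetical, and IllegalArgumentException stands in for the ParseException the reporter expected.

```java
/** Hypothetical scanner, not NxParser's own code. */
final class IriScan {
    /**
     * Reads an IRI token whose '<' is at index start and returns the index
     * just past the closing '>'.
     */
    static int endOfIri(String line, int start) {
        if (start >= line.length() || line.charAt(start) != '<') {
            throw new IllegalArgumentException("Expected '<' at index " + start);
        }
        // i increases on every iteration, so the loop cannot spin forever.
        for (int i = start + 1; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '>') {
                return i + 1;
            }
            if (c == '<' || c == '"' || Character.isWhitespace(c)) {
                throw new IllegalArgumentException(
                    "Illegal character in IRI at index " + i + ": " + line);
            }
        }
        throw new IllegalArgumentException("Unterminated IRI: " + line);
    }
}
```

Calling endOfIri on the line from step 1 throws at the first space instead of looping.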

Read large file with >2GB

What steps will reproduce the problem?
1. Download file 
http://downloads.dbpedia.org/3.8/en/interlanguage_links_en.ttl.bz2
2. Try to parse it
3. NxParser hangs and does not read anything

What is the expected output? What do you see instead?
Normal n-triple based parsing 

What version of the product are you using? On what operating system?
1.2.2

Original issue reported on code.google.com by [email protected] on 10 Jul 2013 at 12:44

Comment lines "# comment goes here" cause NxParser to go into an infinite loop

What steps will reproduce the problem?
1. Feed NxParser with a .nt file containing a "# comment goes here" line
2. Put the kettle on, feed the cats, come back to realise it's not processed 
anything

What is the expected output? What do you see instead?

I'd expect it to handle comment lines gracefully, and ignore them.

What version of the product are you using? On what operating system?

1.2.2

Please provide any additional information below.

While processing dumps from the dbpedia-extraction-framework I noticed that my 
import job (using NxParser) had stalled. I attached a debugger and spotted that it 
was stuck in a while loop in the function loadNext().

I pulled the latest source from SVN (r93) and added a simple workaround.

While I'm not sure that .nt files should have "# comment goes here" comment 
lines, it makes sense to handle them gracefully anyway.

Attached is my simple one-liner fix.

Original issue reported on code.google.com by [email protected] on 10 Jul 2012 at 8:24

Attachments:

nxparser-comment-line-fix.patch
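
Editorial sketch: the attached one-liner is not reproduced here. The N-Triples grammar does define a comment production that starts with '#', so skipping such lines is consistent with the spec. The snippet below shows one way to do that in a read loop that consumes a line on every pass. The class CommentSkipper and the method statementLines are hypothetical names, and the loop is not NxParser's loadNext().

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical line filter, not NxParser's own code. */
final class CommentSkipper {
    /** Returns the non-blank, non-comment lines of the input, in order. */
    static List<String> statementLines(String input) {
        List<String> out = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new StringReader(input))) {
            String line;
            // readLine() runs once per iteration, so every pass consumes input.
            while ((line = in.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                    continue; // blank line or whole-line comment
                }
                out.add(trimmed);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```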

Generated NodeIDs invalid

The nodeID of generated Bnodes (here in subject position) can be invalid 
according to the spec because of the star at its end.

E.g.:
_:httpx3Ax2Fx2Fbnbx2Edatax2Eblx2Eukx2Fdocx2Fresourcex2F006893251x2Erdfx3Fx5Fmetadatax3Dallx2Cviewsx2Cformatsx2Cex78ecutionx2Cbindingsxxtermx5F* <http://purl.org/linked-data/api/vocab#label> "*" <http://bnb.data.bl.uk/doc/resource/006893251.rdf?_metadata=all,views,formats,execution,bindings> .

The production rule in the N-Triples spec is:
[A-Za-z][A-Za-z0-9]*

see: http://www.w3.org/TR/rdf-testcases/#name

Original issue reported on code.google.com by [email protected] on 4 Dec 2011 at 11:53
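
Editorial sketch: judging only from the example above, the generated IDs appear to encode special characters as "x" followed by two hex digits (for instance x3A for ":" and x78 for "x" itself), and the trailing "*" slipped through unencoded. The snippet below illustrates an escaping routine of that shape plus a check against the quoted production. The class BNodeLabel and its methods are hypothetical, and the scheme is an inference from the example, not NxParser's actual algorithm.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

/** Hypothetical label helper, not NxParser's own code. */
final class BNodeLabel {
    // name ::= [A-Za-z][A-Za-z0-9]*  (the production quoted in the issue)
    private static final Pattern NAME = Pattern.compile("[A-Za-z][A-Za-z0-9]*");

    static boolean isValid(String label) {
        return NAME.matcher(label).matches();
    }

    /** Encodes every UTF-8 byte that is not a plain ASCII letter or digit as xHH. */
    static String escape(String raw) {
        if (raw.isEmpty()) {
            throw new IllegalArgumentException("Empty label");
        }
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (byte b : raw.getBytes(StandardCharsets.UTF_8)) {
            char c = (char) (b & 0xFF);
            boolean letter = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
            boolean digit = c >= '0' && c <= '9';
            // 'x' is the escape marker, so it is always encoded; a leading digit
            // is encoded too because the label must start with a letter.
            boolean plain = (letter && c != 'x') || (digit && !first);
            if (plain) {
                sb.append(c);
            } else {
                sb.append('x').append(String.format("%02X", b & 0xFF));
            }
            first = false;
        }
        return sb.toString();
    }
}
```

Under this scheme "term_*" becomes "termx5Fx2A", which satisfies the production, whereas "termx5F*" does not.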

hangs parsing

What steps will reproduce the problem?
1. Try to parse this URL: http://www.ingegneriarchitettura.unibo.it/it/bacheca/RSS

What is the expected output? What do you see instead?
it hangs :(

logcat: 
a lot of: 09-10 15:31:47.347: W/System.err(1108):   at org.semanticweb.yars.nx.parser.NxParser.loadNext(NxParser.java:113)

What version of the product are you using? On what operating system?
nxparser-1.2.3.jar   (not lite)
on android API 16

thank you anyway!
matteo

Original issue reported on code.google.com by [email protected] on 10 Sep 2013 at 3:33

RDF 1.1 unescaping code cannot unescape all that was escaped using old code

See NxUtil test cases.

Original issue reported on code.google.com by [email protected] on 23 Nov 2014 at 4:58
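
Editorial sketch: for reference, the RDF 1.1 N-Triples grammar allows the single-character escapes t, b, n, r, f, double quote, single quote and backslash, plus backslash-u with four hex digits and backslash-U with eight. The snippet below is a minimal unescaper for exactly that set. The class NtUnescape is hypothetical and is not the NxUtil code the issue refers to, so it does not claim to resolve the incompatibility described in the test cases.

```java
/** Hypothetical unescaper for the RDF 1.1 N-Triples escape set; not NxUtil. */
final class NtUnescape {
    static String unescape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i++);
            if (c != '\\') {
                out.append(c);
                continue;
            }
            if (i >= s.length()) {
                throw new IllegalArgumentException("Dangling backslash in: " + s);
            }
            char e = s.charAt(i++);
            switch (e) {
                case 't': out.append('\t'); break;
                case 'b': out.append('\b'); break;
                case 'n': out.append('\n'); break;
                case 'r': out.append('\r'); break;
                case 'f': out.append('\f'); break;
                case '"': out.append('"'); break;
                case '\'': out.append('\''); break;
                case '\\': out.append('\\'); break;
                case 'u': i = appendCodePoint(s, i, 4, out); break; // four hex digits
                case 'U': i = appendCodePoint(s, i, 8, out); break; // eight hex digits
                default:
                    throw new IllegalArgumentException("Unknown escape '" + e + "' in: " + s);
            }
        }
        return out.toString();
    }

    /** Parses a fixed number of hex digits and appends the resulting code point. */
    private static int appendCodePoint(String s, int from, int digits, StringBuilder out) {
        if (from + digits > s.length()) {
            throw new IllegalArgumentException("Truncated escape in: " + s);
        }
        int cp = 0;
        for (int k = from; k < from + digits; k++) {
            int d = Character.digit(s.charAt(k), 16);
            if (d < 0) {
                throw new IllegalArgumentException("Bad hex digit in: " + s);
            }
            cp = cp * 16 + d;
        }
        out.appendCodePoint(cp); // throws IllegalArgumentException if out of range
        return from + digits;
    }
}
```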

Incorporate changes to N-Triples spec introduced by RDF 1.1

In RDF 1.1, there are changes to the N-Triples spec. Consider them in the code.

Original issue reported on code.google.com by [email protected] on 15 Sep 2014 at 3:34

Do we need the htmlparser.jar in there?

Just wondering if the htmlparser.jar library is needed? Where did it come 
from? (I know commons-cli is used for parsing command line in the CLI.)

Original issue reported on code.google.com by [email protected] on 25 Jun 2012 at 6:37

Parser hangs with URIs that contain spaces

What steps will reproduce the problem?

    public void testParse() throws Exception {
        Node[] nx = new Node[] { new Resource("http://example.org/ b") };

        ByteArrayInputStream bais = new ByteArrayInputStream(Nodes.toN3(nx).getBytes());

        NxParser nxp = new NxParser(bais);
        while (nxp.hasNext()) {
            System.out.println(nxp.next());
        }
    }

What is the expected output? What do you see instead?

The parser should not hang.

Original issue reported on code.google.com by [email protected] on 2 Nov 2011 at 1:57

jar compiled with Java7!

Hi guys,

The jar seems to be compiled with Java 7.
Is that really needed?
Would it be possible to provide a Java 6 or Java 5 build?

I encounter a 
"java.lang.UnsupportedClassVersionError: org/semanticweb/yars/nx/Resource : 
Unsupported major.minor version 51.0"

exception.

Thanks!
Daniel

Original issue reported on code.google.com by [email protected] on 14 Sep 2011 at 4:03
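
Editorial sketch: "major.minor version 51.0" refers to the class file header, where major version 51 corresponds to Java 7, 50 to Java 6 and 49 to Java 5. The snippet below reads that value from the first eight bytes of a .class file so a jar's target level can be checked before deployment. The class ClassVersion and its method names are made up for this example.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper for inspecting a class file header. */
final class ClassVersion {
    /** Returns major_version from the first 8 bytes of a .class file. */
    static int majorVersion(byte[] classFileHeader) {
        ByteBuffer buf = ByteBuffer.wrap(classFileHeader); // big-endian by default
        if (classFileHeader.length < 8 || buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("Not a class file");
        }
        buf.getShort();                 // minor_version, unused here
        return buf.getShort() & 0xFFFF; // major_version as an unsigned value
    }

    /** Maps a major version to the Java release number (49 -> 5, 50 -> 6, 51 -> 7). */
    static int javaRelease(int major) {
        return major - 44;
    }
}
```

A build aimed at older runtimes would need to produce major version 50 or 49, for example by compiling with javac's -source and -target options set to the older release.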

TldManager should throw an exception rather than print log messages

What steps will reproduce the problem?

URIHandler.getPLD("http://localhost/")

What is the expected output? What do you see instead?

I would expect an exception (TldManagerException, say).

Instead I see log output, which I do not want to see.

Original issue reported on code.google.com by [email protected] on 12 Apr 2015 at 11:00
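
Editorial sketch: to illustrate the requested behaviour, the snippet below throws when no known suffix matches the host instead of logging. Everything in it is hypothetical: TldManagerException only borrows the name suggested in the issue (made unchecked here for brevity), PldLookup is not NxParser's URIHandler or TldManager, and the three-entry suffix set is a stand-in for a real suffix list.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical exception type, named after the suggestion in the issue. */
final class TldManagerException extends RuntimeException {
    TldManagerException(String message) {
        super(message);
    }
}

/** Hypothetical lookup, not NxParser's URIHandler. */
final class PldLookup {
    // Tiny stand-in for a real suffix list.
    private static final Set<String> SUFFIXES =
        new HashSet<>(Arrays.asList("org", "com", "co.uk"));

    /** Returns the pay-level domain, or throws if the host has no known suffix. */
    static String getPld(String uri) {
        String host = URI.create(uri).getHost();
        if (host == null) {
            throw new TldManagerException("No host in URI: " + uri);
        }
        String[] labels = host.toLowerCase(Locale.ROOT).split("\\.");
        // Try the longest candidate suffix first so "co.uk" wins over "uk".
        for (int i = 1; i < labels.length; i++) {
            String suffix = String.join(".", Arrays.copyOfRange(labels, i, labels.length));
            if (SUFFIXES.contains(suffix)) {
                return labels[i - 1] + "." + suffix;
            }
        }
        throw new TldManagerException("No known TLD for host: " + host);
    }
}
```

With this shape, a caller that passes "http://localhost/" gets an exception it can catch or ignore, and no log output is produced.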

Maven artefact

Hi guys,

Since I already posted from my wishlist, here is another wish:
How about providing a maven artefact/repo?

Thanks,
Daniel

Original issue reported on code.google.com by [email protected] on 14 Sep 2011 at 4:08

Warning in Japanese DBpedia File

What steps will reproduce the problem?
1. Parse "http://ja.dbpedia.org/resource/マンチェスター・シティFC" 
with RdfXmlParser
2. Gives warning.

What is the expected output? What do you see instead?

rapper says the file is ok:

$ rapper "http://ja.dbpedia.org/resource/マンチェスター・シティFC" -c
rapper: Parsing URI 
http://ja.dbpedia.org/resource/マンチェスター・シティFC with parser 
rdfxml
rapper: Parsing returned 996 triples
$

Instead I get (with 2.1-SNAPSHOT):

WARNING: class org.semanticweb.yars.nx.parser.ParseException: 
org.xml.sax.SAXParseException; lineNumber: 882; columnNumber: 19; Element type 
"prop-ja:使用チーム" must be followed by either attribute specifications, 
">" or "/>"., http://ja.dbpedia.org/data/マンチェスター・シティFC.xml
org.semanticweb.yars.nx.parser.ParseException: org.xml.sax.SAXParseException; 
lineNumber: 882; columnNumber: 19; Element type "prop-ja:使用チーム" must 
be followed by either attribute specifications, ">" or "/>".
    at org.semanticweb.yars.rdfxml.RdfXmlParser.parse(RdfXmlParser.java:138)

Original issue reported on code.google.com by [email protected] on 5 Dec 2014 at 12:58
