Comments (5)
One usability issue: the built-in N-Triples parser would skip invalid lines. Many RDF dumps contain invalid lines, and SERD simply stops when it hits one (currently without a clear error message; it would be useful to know which line is invalid). This means you have to correct the error and restart generation from scratch.
That's not a real reason to keep the built-in parser though, because its definition of "invalid" was too narrow: it would happily parse some invalid IRIs, resulting in an invalid HDT file. At least with SERD, we know that the generated HDT file will be valid.
from hdt-cpp.
One minor bump in the road of removing the built-in N-Triples parser/serializer is that there are actually call sites in the code base that are hardcoded to use them, instead of cleanly going through a factory function that would encapsulate the multiple parser/serializer implementations.
For example, header parsing in src/header/PlainHeader.cpp is hardcoded to use the built-in parser. I'm working through these now.
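For illustration, the kind of factory indirection meant here could look roughly like the sketch below. Every name in it (`TripleParser`, `SerdBackedParser`, `ParserRegistry`) is hypothetical and does not exist in hdt-cpp; the parser class is an empty stand-in that only reports its name. The point is just that call sites ask for a parser by format instead of naming a concrete class.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical interface; real parsers would expose a parse method here.
class TripleParser {
public:
    virtual ~TripleParser() = default;
    virtual std::string name() const = 0;
};

// Stand-in for a Serd-backed implementation (does no actual parsing).
class SerdBackedParser : public TripleParser {
public:
    std::string name() const override { return "serd"; }
};

// Maps a format name to a function that constructs a parser for it.
class ParserRegistry {
public:
    using Maker = std::function<std::unique_ptr<TripleParser>()>;

    void add(const std::string& format, Maker maker) {
        makers_[format] = std::move(maker);
    }

    // Throws std::invalid_argument if no parser is registered for the format.
    std::unique_ptr<TripleParser> create(const std::string& format) const {
        const auto it = makers_.find(format);
        if (it == makers_.end()) {
            throw std::invalid_argument("no parser registered for format: " + format);
        }
        return it->second();
    }

private:
    std::map<std::string, Maker> makers_;
};
```

With this shape, a call site such as the header parsing code would call `registry.create("ntriples")` and never mention a specific implementation, so swapping or removing a parser touches only the registration.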
Overall, this is a somewhat larger task than we might have thought, since the various reader/writer implementations are neither cleanly abstracted nor dispatched through a registry.
I think what I'll do is lift and adapt existing known-good code (that already abstracts over Serd and Raptor, and more) from our datagraph/librdf++ library (used in Dydra), and replace the I/O layer with that. That library also works on FILE handles, meaning that the transparent gzip (etc) compression code could be made to work with it more readily than with C++ iostreams.
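A small sketch of why FILE handles are convenient here: a reader written against `FILE*` does not care where the handle came from, so the same function works on a regular file from `fopen`, a temporary file from `tmpfile`, or (on POSIX systems) a pipe from `popen` running a decompressor. The `countLines` function below is a hypothetical stand-in for a parser entry point, not librdf++ or hdt-cpp code.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical reader: counts newline characters on any FILE handle.
// A real parser would consume the same handle in the same way.
std::size_t countLines(std::FILE* in) {
    std::size_t lines = 0;
    int c;
    while ((c = std::fgetc(in)) != EOF) {
        if (c == '\n') ++lines;
    }
    return lines;
}
```

The caller decides how the handle is opened; the reader stays unchanged whether the bytes come from disk or from a decompression pipe.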
Seems like a good idea, especially given the FILE handles thing.
Removed the built-in parser with bc7a258 and 09a1e6b because people kept hitting this issue (LinkedDataFragments/Client.js#26, LinkedDataFragments/Server.js#37, …).