cower's Issues
conversion fails without valueUrl, csvw:value, or titles
If the json only provides datatype and name for a column, and not valueUrl, csvw:value, or titles, conversion fails. This happens because nothing is placed in valueUrl_eval. Fix by using name instead of titles, or by falling back to name where titles is missing.
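The fallback could look roughly like the sketch below. The `columns` list and the `column_label()` helper are hypothetical and only illustrate the idea; they are not cower's actual internals.

```r
# Hypothetical column descriptions, as they might look after parsing the json.
columns <- list(
  list(name = "age", datatype = "integer"),                 # no titles given
  list(name = "sex", datatype = "string", titles = "Sex")   # titles given
)

# Use titles when present, otherwise fall back to name.
column_label <- function(col) {
  if (is.null(col$titles)) col$name else col$titles[[1]]
}

labels <- vapply(columns, column_label, character(1))
```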
handle csvw:parseOnEmpty as in COW
If csvw:parseOnEmpty = true, empty cells should be converted; if false, they should not. Current behaviour is probably that an NA in the data means the row is dropped at writing, due to the use of complete.cases.
no evaluation in propertyUrl
propertyUrls are only expanded, never evaluated: ..., "propertyUrl": "vocab:{variable}", ... does not work. The template would need to be evaluated before predicates are inserted, but only if {} is present.
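A minimal sketch of "evaluate only if {} is present", in base R. The `expand_template()` helper is hypothetical and assumes each placeholder names a column in the data; it does not show how cower builds its own evaluation strings.

```r
# Hypothetical helper: fill {column} placeholders in a template, row by row.
expand_template <- function(template, data) {
  out <- rep(template, nrow(data))
  # Only evaluate when the template actually contains a placeholder.
  if (!grepl("{", template, fixed = TRUE)) return(out)
  found <- regmatches(template, gregexpr("\\{[^}]+\\}", template, perl = TRUE))[[1]]
  for (p in unique(found)) {
    column <- substr(p, 2, nchar(p) - 1)  # strip the braces
    values <- as.character(data[[column]])
    out <- mapply(function(s, v) gsub(p, v, s, fixed = TRUE),
                  out, values, USE.NAMES = FALSE)
  }
  out
}

rows <- data.frame(variable = c("height", "weight"))
evaluated <- expand_template("vocab:{variable}", rows)
unchanged <- expand_template("vocab:fixed", rows)
```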
handle conditionals
The conditional in COW, {% if something %}x{% else %}y{% endif %}, is currently not supported. R's ifelse() function probably works as an alternative.
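For reference, this is what the ifelse() equivalent of such a conditional looks like in plain R. The column name and values are made up, and whether cower accepts this expression inside the json is untested here.

```r
# Illustrative data; the column name and values are invented.
sex_code <- c("m", "f", "m")

# Roughly: {% if sex_code == "m" %}male{% else %}female{% endif %}
sex_label <- ifelse(sex_code == "m", "male", "female")
```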
literal should create correct datatype uris directly
It should produce "string"^^<http://example.org/datatype> rather than "string"^^http://example.org/datatype, so no base = uriref() anymore. Check whether the datatype URIs are already put in <> form anywhere, though.
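A sketch of building the typed literal in one step. `typed_literal()` is a hypothetical name, not cower's existing function, and the escaping covers only backslashes and double quotes.

```r
# Hypothetical helper: wrap a value in quotes and append the datatype in <>.
typed_literal <- function(x, datatype) {
  x <- gsub("\\", "\\\\", x, fixed = TRUE)  # escape backslashes first
  x <- gsub('"', '\\"', x, fixed = TRUE)    # then escape double quotes
  paste0('"', x, '"^^<', datatype, '>')
}

lit <- typed_literal("string", "http://example.org/datatype")
```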
row numbering subject wrong in combination with NULL specification
Because .I is evaluated after subsetting, batch[!Rank %in% c(1, 2), uriref(.I, base = 'http://example.com/')] does not work: it starts counting from 1 within the subset. batch[, list(Rank, uriref(.I, base = "http://example.com/"))][!Rank %in% c(1, 2), ] numbers the rows before subsetting but doesn't update the data.table. Best alternative might be batch[, vrb := uriref(.I, base = 'http://example.com/')], followed by batch[Rank %in% c(1, 2), vrb := NA], that is, apply the null strings in a separate command.
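The two-step approach, sketched on a toy table. paste0() stands in for cower's uriref() here, and the data are invented for illustration.

```r
library(data.table)

batch <- data.table(Rank = c(1, 2, 3, 4))
base <- "http://example.com/"

# Step 1: number every row before any subsetting, so .I covers the full table.
# paste0() is a stand-in for uriref().
batch[, vrb := paste0(base, .I)]

# Step 2: blank out the rows that match the null specification.
batch[Rank %in% c(1, 2), vrb := NA_character_]
```

Rows 3 and 4 keep the subjects http://example.com/3 and http://example.com/4, while rows 1 and 2 are set to NA.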
make RFC3987 urirefs
uriref() should create IRIs rather than URIs, so no percent encoding. See https://github.com/CLARIAH/iribaker, https://github.com/dgerber/rfc3987. The R pattern to replace may look like this: "(?!(?:[a-zA-Z0-9._~-]|[\\xA0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF])|%[0-9a-fA-F][0-9a-fA-F]|[!$&'()*+,;=]|:|@|/)(.)" (for stringi::stri_replace_all_regex()), but check this. Consider using the useful bits of URL encoding as well.
error if total rows are an exact multiple of batch_size
When going through the csv file in chunks and the total rows are an exact multiple of the batch size, cower ends up having to read a csv file of zero remaining rows and an error is thrown. Because the conversion in that case is complete, the error message should not appear.
Possible fixes:
- tryCatch or another error handler
- count the rows of the csv file in advance (very inefficient in R because the entire file would nonetheless have to be read); a dependency on wc -l is somewhat undesirable, though there are already external dependencies (head, though only when max_size is specified; and maybe gzfile, though only when using compress = T).
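A third option is to treat "nothing left to read" as the normal end of the loop rather than as an error. The sketch below uses base R connections and a hypothetical `read_in_batches()` helper; it is not how cower reads its chunks, and it assumes no quoted field contains a line break.

```r
# Hypothetical helper: read a csv in batches and stop cleanly when no lines remain.
read_in_batches <- function(path, batch_size) {
  con <- file(path, open = "r")
  on.exit(close(con))
  header <- readLines(con, n = 1)
  n_batches <- 0L
  n_rows <- 0L
  repeat {
    lines <- readLines(con, n = batch_size)
    if (length(lines) == 0L) break  # zero remaining rows: conversion is complete
    chunk <- read.csv(text = c(header, lines), stringsAsFactors = FALSE)
    n_batches <- n_batches + 1L
    n_rows <- n_rows + nrow(chunk)
  }
  list(batches = n_batches, rows = n_rows)
}

# Four data rows with batch_size = 2: an exact multiple of the batch size.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(a = 1:4, b = letters[1:4]), tmp, row.names = FALSE)
res <- read_in_batches(tmp, batch_size = 2)
```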
complete metadata graph
https://github.com/CLARIAH/COW/blob/9f9e8cd3b4cfce7c702a97d1cd0534eaa7187fb3/cow/converter/csvw.py#L235 and https://github.com/CLARIAH/COW/blob/9f9e8cd3b4cfce7c702a97d1cd0534eaa7187fb3/cow/converter/csvw.py#L184 are currently not included. Only the https://github.com/CLARIAH/COW/blob/66cd539abebbe108d0b716d119e6e2e9c26b2fe8/cow/converter/util/__init__.py#L279 part was done, so only NanoPublication() is implemented (https://github.com/CLARIAH/COW/blob/83eaa0518c5885dd893e24a1139c2be2611545dd/cow/converter/csvw.py#L178) but not the rest.
csvw:value strings should not need double quotes
If you want "csvw:value": "my string" to result in triples with sub pred "my string", it currently needs to be provided as "csvw:value": "'my string'", because otherwise cower thinks it's a column name. Should be fixed in add_schema_evals.