
cow's Introduction

CSV on the Web (CoW)

CoW is a tool for converting .csv files into Linked Data. Specifically, CoW is an integrated CSV-to-RDF converter that uses the W3C standard CSVW for rich semantic table specifications and produces nanopublications as its output RDF model. CoW can convert any CSV file into an RDF dataset.

Documentation and support

For user documentation, see the basic introduction video and the GitHub wiki. Technical details are provided below. If you encounter an issue, please report it; pull requests are also welcome.

Quick Start Guide

There are two ways to run CoW: the quickest is via Docker, the more flexible via pip.

Docker Image

Several data science tools, including CoW, are available via a Docker image.

Install

First, install the Docker virtualisation engine on your computer. Instructions on how to accomplish this can be found on the official Docker website. Use the following command in the Docker terminal:

# docker pull wxwilcke/datalegend

Here, the #-symbol refers to the terminal of a user with administrative privileges on your machine and is not part of the command.

After the image has successfully been downloaded (or 'pulled'), the container can be run as follows:

# docker run --rm -p 3000:3000 -it wxwilcke/datalegend

The virtual system can now be accessed by opening http://localhost:3000/wetty in your preferred browser, and by logging in using username datalegend and password datalegend.

For detailed instructions on this Docker image, see DataLegend Playground. For instructions on how to use the tool, see usage below.

Command Line Interface (CLI)

The Command Line Interface (CLI) is the recommended way of installing CoW for most users.

Install

Check whether the latest version of Python is installed on your device. For Windows/macOS we recommend installing Python via the official distribution page.

The recommended method of installing CoW on your system is pip3:

pip3 install cow-csvw

You can upgrade your currently installed version with:

pip3 install cow-csvw --upgrade

Possible installation issues:

  • Permission issues: you can work around these by installing CoW in user space: pip3 install cow-csvw --user.
  • Cannot find command: make sure your binary user directory (typically something like /Users/user/Library/Python/3.7/bin on macOS or /home/user/.local/bin on Linux) is in your PATH (on macOS: /etc/paths).
  • Please report your unlisted issue.

Usage

Start the graphical interface by entering the following command:

cow_tool

Select a CSV file and click build to generate a file named myfile.csv-metadata.json (JSON schema file) with your mappings. Edit this file (optional) and then click convert to convert the CSV file to RDF. The output should be a myfile.csv.nq RDF file (nquads by default).

Command Line Interface

A straightforward CSV-to-RDF conversion is done by entering the following commands:

cow_tool_cli build myfile.csv

This will create a file named myfile.csv-metadata.json (JSON schema file). Next:

cow_tool_cli convert myfile.csv

This command will output a myfile.csv.nq RDF file (nquads by default).

You don't need to worry about the JSON file unless you want to change the metadata schema. To control the base URI namespace, the URIs used in predicates, virtual columns, and so on, edit the myfile.csv-metadata.json file and/or use CoW's command-line options. For instance, you can control the output RDF serialization (e.g. --format turtle). Have a look at the options below, the examples in the GitHub wiki, and the technical documentation.
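For orientation, here is a sketch of what a single column entry in myfile.csv-metadata.json can look like after hand-editing. The property names (name, titles, aboutUrl, propertyUrl, valueUrl) follow the CSVW vocabulary used elsewhere in this document, but the URIs and the Country column are placeholders, and the exact structure CoW generates may differ:

```json
{
  "name": "Country",
  "titles": ["Country"],
  "aboutUrl": "country/{Country}",
  "propertyUrl": "http://example.org/vocab/country",
  "valueUrl": "http://example.org/id/country/{Country}"
}
```

Here {Country} is expanded with each row's value for the Country column when the file is converted.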

Options

Check the --help for a complete list of options:

usage: cow_tool_cli [-h] [--dataset DATASET] [--delimiter DELIMITER]
                    [--quotechar QUOTECHAR] [--encoding ENCODING] [--processes PROCESSES]
                    [--chunksize CHUNKSIZE] [--base BASE]
                    [--format [{xml,n3,turtle,nt,pretty-xml,trix,trig,nquads}]]
                    [--gzip] [--version]
                    {convert,build} file [file ...]

Not nearly CSVW compliant schema builder and RDF converter

positional arguments:
  {convert,build}       Use the schema of the `file` specified to convert it
                        to RDF, or build a schema from scratch.
  file                  Path(s) of the file(s) that should be used for
                        building or converting. Must be a CSV file.

optional arguments:
  -h, --help            show this help message and exit
  --dataset DATASET     A short name (slug) for the name of the dataset (will
                        use input file name if not specified)
  --delimiter DELIMITER
                        The delimiter used in the CSV file(s)
  --quotechar QUOTECHAR
                        The character used as quotation character in the CSV
                        file(s)
  --encoding ENCODING   The character encoding used in the CSV file(s)

  --processes PROCESSES
                        The number of processes the converter should use
  --chunksize CHUNKSIZE
                        The number of rows processed at each time
  --base BASE           The base for URIs generated with the schema (only
                        relevant when `build`ing a schema)
  --gzip                Compress the output file using gzip
  --format [{xml,n3,turtle,nt,pretty-xml,trix,trig,nquads}], -f [{xml,n3,turtle,nt,pretty-xml,trix,trig,nquads}]
                        RDF serialization format
  --version             show program's version number and exit

Library

Once installed, CoW can be used as a library as follows:

from cow_csvw.csvw_tool import COW
import os

path = '/path/to/your/csv'   # directory containing the CSV file
filename = 'myfile.csv'

COW(mode='build', files=[os.path.join(path, filename)], dataset='My dataset', delimiter=';', quotechar='"')

COW(mode='convert', files=[os.path.join(path, filename)], dataset='My dataset', delimiter=';', quotechar='"', processes=4, chunksize=100, base='http://example.org/my-dataset', format='turtle', gzipped=False)

Further Information

Examples

The GitHub wiki provides more hands-on examples of converting CSVs into Linked Data.

Technical documentation

Technical documentation for CoW is maintained in this GitHub repository and published through Read the Docs at http://csvw-converter.readthedocs.io/en/latest/.

To build the documentation from source, change into the docs directory, and run make html. This should produce an HTML version of the documentation in the _build/html directory.

License

MIT License (see license.txt)

Acknowledgements

Authors: Albert Meroño-Peñuela, Roderick van der Weerdt, Rinke Hoekstra, Kathrin Dentler, Auke Rijpma, Richard Zijdeman, Melvin Roest, Xander Wilcke

Copyright: Vrije Universiteit Amsterdam, Utrecht University, International Institute of Social History

CoW is developed and maintained by the CLARIAH project and funded by NWO.

cow's People

Contributors

harshpundhir, wxwilcke


cow's Issues

specific path for .csv file

Allow specifying a path for the .csv file to be converted, as well as a URI, specifically one from Dataverse (e.g. datasets.socialhistory.org), and possibly a handle or DOI. (relates to #29)

CSVW converter does not properly link data to author

It's a tricky thing to associate the nanopublication produced by the converter with the original author information of the CSVW schema file.

In principle, any author information in the schema is author information of the CSVW schema. But since we cannot know beforehand the URI that is generated for the nanopublication/assertion pair by the converters, it is practical to just take that author information and use it for the nanopublication as well (since the CSVW conversion is a purely syntactic effort, we cannot really say there is any additional authorship involved: the authors of the resulting RDF are the same as the ones who devised the schema).

Still, currently the two are not connected. This is mainly because the CSVW schema files do not usually have a root resource with a URI (no @id tag in the root of the JSON).

Possible solutions:

  • replace the BNode with the generated URI of the nanopublication/assertion
  • owl:sameAs the BNode with the generated URI of the nanopublication/assertion

wrong warning?

Hi, I just ran 'build' on a file with the converter, but I don't think the warning I received is appropriate:

Building schema for /.../git/sdh-public-datasets/clio-infra/geocoder_cshapes.csv
Detected encoding: ascii (1.0 confidence)
Detected dialect: csv.dialect (delimiter: ',')
Delimiter is: ,
Found headers: [u'Webmapper code', u'Webmapper numeric code', u'ccode', u'code parent', u'country', u'country name', u'ctr', u'start year', u'end year']
WARNING: You have two or more column headers that are syntactically the same. Conversion might 
produce incorrect results because of conflated URIs or worse
Done
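The warning fires when two or more headers collide once they are normalized into URIs. A minimal sketch of such a check; the lowercase/underscore normalization below is an assumption for illustration, and CoW's actual URI-minting rule may differ:

```python
from collections import Counter

def duplicate_headers(headers):
    """Return header names that collide after simple normalization.

    Normalization here (strip, lowercase, spaces to underscores) is a
    stand-in for whatever rule the converter uses to mint URIs.
    """
    slugs = [h.strip().lower().replace(' ', '_') for h in headers]
    counts = Counter(slugs)
    return sorted({h for h, s in zip(headers, slugs) if counts[s] > 1})

print(duplicate_headers(['code parent', 'Code Parent', 'country']))
# → ['Code Parent', 'code parent']
```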

Converting with invalid AboutURL

During the workshop, a user asked what happens if you change the aboutUrl to a non-existing heading. The presenter decided to move on, but I wanted to try it out.
The provided example was: what if the user typed
"aboutUrl": "country/{Countries}"
when it should be
"aboutUrl": "country/{Country}"

The resulting NQ is:
<https://iisg.amsterdam/resource/country/_Countries_> <https://iisg.amsterdam/resource/Rank> "5"^^<http://www.w3.org/2001/XMLSchema#string> <http://iisg.amsterdam/imf_gdppc/assertion/de48f611/2018-02-23T11:04> .

Note the underscore before and after Countries. Is this a problem? And if so, can COW check if the Column exists in the CSV file?
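A check like the one requested could compare the {Placeholder} names in a URL template against the CSV headers before converting. A sketch of such a validation (a hypothetical helper, not part of CoW):

```python
import re

def undefined_template_columns(url_pattern, headers):
    """Return {Placeholder} names in a CSVW URL template with no matching header.

    Placeholder syntax `{Name}` follows the aboutUrl examples in this issue.
    """
    placeholders = re.findall(r'\{([^{}]+)\}', url_pattern)
    return [p for p in placeholders if p not in headers]

print(undefined_template_columns('country/{Countries}', ['Rank', 'Country', 'Int']))
# → ['Countries']
```

A non-empty result would let the converter warn (or abort) instead of silently emitting `_Countries_` URIs.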

COW Wiki Home Page Suggestion

Some terms and/or processes that we see as obvious, might need an extra step of explanation for beginners.
Visual
More space (or even a dividing line) between the two columns, to create a more recognisable separation of the two. If it looks like a table, it might be more understandable.

More verbose
In the end, the wiki doesn't really use the above-mentioned data other than the top row (buurt-a). However, it's easier to understand/distinguish adjustments if the variables used are more unique. Maybe use the real names of Amsterdam neighbourhoods? Again, it's not much of an issue now because the other rows aren't used, but one less dash in a triple wouldn't hurt.

COW Wiki Page 2 Suggestion

Clarify that the user can come up with a prefix themselves.

In addition to the provided prefixes, you can add your own. E.g. below we add the prefix "typos" referring to URI:

could be rewritten as

In addition to the provided prefixes, you can create and add your own. E.g. in the example below we want to refer to the URI: "https://prefixes.causelesstypos.com/". Let's call the prefix "typos" and add it to the list of prefixes.

inconsistent base URI application?

Having just tried the latest version of COW, with build and convert on the test file (imf_gdppc.csv) without any edits to the metadata JSON file, we (@morkor and me) get what seem to us to be inconsistent URIs across subject/predicate/graph:

<https://iisg.amsterdam/resource/6> <http://data.socialhistory.org/resource/Int> "72,524"^^<http://www.w3.org/2001/XMLSchema#string> <http://iisg.amsterdam/imf_gdppc/assertion/0f8040d8/2017-11-01T14:45> .

Note the difference between subject (iisg.amsterdam/resource/) and predicate (data.socialhistory.org). I think this happens here:

propertyUrl = "http://data.socialhistory.org/resource/{}".format(

But I don't feel confident to start changing the converter's internals. I'm also having trouble finding the base in BurstConverter, which would be the best thing to replace data.socialhistory.org with, IMO.

The other thing is that the graph predicate does not have the /resource/ bit. I can see how this would be on purpose, but I just wanted to make sure this is what we want. Either way, the fact that /resource/ is missing means that it is not based on the base URI in the metadata file but hardcoded somewhere else in the converter, which limits its usability for people who are not iisg.amsterdam.

RFC 3987 non compliance error

Hi,

It appears the iri-baker stopped baking IRIs. E.g. when running convert on the guerry.csv data, the error below is reported. I've tested this on multiple already-converted files and all of them complain about the IRIs.

administrators-MacBook-Pro-3:testcow RichardZ$ cow_tool convert guerry.csv
Converting guerry.csv to RDF
Initializing converter for guerry.csv
Processes: 4
Chunksize: 5000
Quotechar: '"'
Delimiter: ','
Encoding : 'ascii'
Taking encoding, quotechar and delimiter specifications into account...
Starting conversion
Opening CSV file for reading
Running in 4 processes
Process PoolWorker-1, nr 0, 5000 rows
Row: 0
The argument you provided does not comply with RFC 3987 and is not parseable as a IRI(there is no scheme or no net location part)
dept/1
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/cow_csvw/converter/csvw.py", line 364, in _burstConvert
    result = c.process(count, rows, chunksize)
  File "/usr/local/lib/python2.7/site-packages/cow_csvw/converter/csvw.py", line 429, in process
    s = self.expandURL(self.aboutURLSchema, row)
  File "/usr/local/lib/python2.7/site-packages/cow_csvw/converter/csvw.py", line 594, in expandURL
    raise Exception(u"Cannot convert `{}` to valid IRI".format(url))
Exception: Cannot convert `dept/1` to valid IRI
TypeError in multiprocessing... falling back to serial conversion
Opening CSV file for reading
Starting in a single process
Row: 0
The argument you provided does not comply with RFC 3987 and is not parseable as a IRI(there is no scheme or no net location part)
dept/1
Something went wrong, skipping guerry.csv.
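The complaint is that `dept/1` is a relative reference: it has no scheme or authority, so it cannot stand alone as an IRI. One way to see what a fix could look like is to resolve such values against a base URI before validation (the base below is a placeholder, not CoW's behaviour):

```python
from urllib.parse import urljoin, urlparse

def ensure_absolute(iri, base='https://example.org/resource/'):
    """Resolve a possibly-relative IRI against a base URI.

    `dept/1` alone has no scheme or net location, which is exactly what
    the RFC 3987 error above reports.
    """
    if urlparse(iri).scheme:
        return iri                 # already absolute, leave untouched
    return urljoin(base, iri)

print(ensure_absolute('dept/1'))
# → https://example.org/resource/dept/1
```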

Null coding doesn't work for date variables

Assigning null values to negative date values (gDay; gMonth; gYear) does not seem to work. Some have values of -2, -4 etc. but the converter does not change them into null values (for example in the BEVSTATP table of HSN). With or without "" doesn't make a difference.

is "lang": "en" properly providing language tags?

Hi, in the file occupation_link.csv-metadata.json, "lang" is set by a variable on line 186 and by a value on line 189. It doesn't appear that the language tags are actually showing up in the .nq files, though.

KeyError: u'None' with declaration of skos:definition

Hi, I'm trying to add a skos:definition to URI like so:

117 {
118 "virtual": true,
119 "aboutUrl": "http://data.socialhistory.org/resource/hisco/category",
120 "propertyUrl": "skos:definition",
121 "value": "The sub-divisions of occupational unit groups, coded to 5 digits, which provide the finest level of detail in the HISCO classification scheme.",
122 "language": "en"
123 },

However upon conversion I get the following error:
Traceback (most recent call last):
File "/Users/RichardZ/git/wp4-converters/src/converter/csvw.py", line 399, in process
value = row[unicode(c.csvw_name)]
KeyError: u'None'

Could you please look into this?

COW Wiki Page 4 Suggestion

Good joke about the weird dienstboden number, but to avoid confusion, turn

hopefully per 100 inhabitants or so, otherwhise the 1.5 is pretty messy).

into

“hopefully per 100 inhabitants or so, otherwise 1.5 of a maid would have been really messy/disturbing).”

to clarify that it would be messy to have one and a half maids. As it stands, it can be read as saying that having decimals is really messy.

addition to wiki

Suggestion by @albertmeronyo:
At the end of the home page, there could be a paragraph saying how to generate the default JSON schema file for that example CSV (with the corresponding command), and perhaps encouraging users to also execute the conversion after every change they make through the 1-x pages (so they can see the effect of what they just modified in the JSON schema file).

problem converting strings with special characters

In the conversion process a number of files throw errors, such as: Could not apply python string formatting to 'http://data.socialhistory.org/resource/hisco/occupationalTitle/bokhandelsdr{ng'
The special character also appears in the original .csv file, so it doesn't appear to be related to #15 . Other files have similar issues, specifically with the /, which is used to separate multiple occupations.

line options

Hi, to assess whether you're converting a file properly, usually you just take a sample of the file. It would be great if one could specify a line range to be read from the .csv file. This would avoid having to create sample files and, in case of errors, allows you to see more quickly whether they relate to the .csv file itself. I'm thinking something like:

python csvw-tool.py convert -l 5:100 myFile.csv line 5-100
python csvw-tool.py convert -l +100 myFile.csv first 100 lines
python csvw-tool.py convert -l -100 myFile.csv last 100 lines
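Until such an option exists, the sampling itself can be sketched in a few lines. The 1-indexed, inclusive range convention below mirrors the proposed -l 5:100 syntax and is an assumption:

```python
from itertools import islice

def sample_csv(src_lines, start, stop):
    """Yield the header plus data lines start..stop (1-indexed, inclusive).

    A stand-in for the proposed `-l start:stop` option; the header row is
    always kept so the schema still applies to the sample.
    """
    it = iter(src_lines)
    yield next(it)                        # header row
    yield from islice(it, start - 1, stop)

lines = ['h', 'r1', 'r2', 'r3', 'r4', 'r5']
print(list(sample_csv(lines, 2, 4)))
# → ['h', 'r2', 'r3', 'r4']
```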

specify valueUrl without propertyUrl

Currently if you want to specify a valueUrl, you have to provide a propertyUrl as well. If not, you get the following error:

Exception: Cannot convert `None` to valid IRI
The argument you provided does not comply with RFC 3987 and is not parseable as a IRI(there is no scheme or no net location part)

Example json for the test.csv file

   {
    "titles": [
     "Country"
    ], 
    "@id": "https://iisg.amsterdam/resource/test.csv/column/Country", 
    "name": "Country", 
    "dc:description": "Country",
    "valueUrl": "clioctr:{Country}"
   }, 

COW should either convert this or specify in the error message that it cannot make valueUrls unless propertyUrl is specified.

Cattle: 502 Bad Gateway on conversion.

Tried converting a csv file with Cattle, received a 502 Bad Gateway, reduced the file-size by taking out rows and converted without issue.

File that timed-out was 200kb with 3222 rows and 9 columns.
File that worked was 44kb with 749 rows and 9 columns.

CSVW converter to RDF should support conditional mappings

The CSVW-based converter should be able to deal with propertyUrl (and valueUrl) mappings that are conditional on the value of another cell.

For instance, the occhisco to hisco mapping has a column 'checked' that registers whether the mapping is exact or just close. Depending on its value, the mapping between occhisco and hisco should be skos:exactMatch or skos:closeMatch respectively.

This is bound to happen in other CSV files as well.
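In Python terms, the requested behaviour boils down to choosing the predicate per row. A sketch: the column name 'checked' comes from this issue, but the cell values 'exact'/'close' are assumed for illustration, and how such a rule would be declared in the JSON schema is the open design question:

```python
def mapping_predicate(row):
    """Pick the SKOS mapping predicate based on another cell's value.

    Assumes the 'checked' column holds 'exact' or 'close'; the real
    encoding in the source CSV may differ.
    """
    return 'skos:exactMatch' if row['checked'] == 'exact' else 'skos:closeMatch'

print(mapping_predicate({'checked': 'exact'}))   # → skos:exactMatch
print(mapping_predicate({'checked': 'close'}))   # → skos:closeMatch
```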

Unicode encode error

While building a database in COW, I received the following codec error. I understood that the CSV was set to UTF-8 and needed to be in ASCII. However, the error message might need some extra explanation for non-technical users.

  File "C:\Datalegend\Clone\COW\cow\converter\csvw.py", line 109, in build_schema
    "@id": iribaker.to_iri("{}/{}/column/{}".format(base, url, head)),
UnicodeEncodeError: 'ascii' codec can't encode character u'\ufeff' in position 0: ordinal not in range(128)
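The u'\ufeff' character is a UTF-8 byte order mark (BOM) glued to the first header. A sketch of the underlying problem and the usual remedy, reading with Python's utf-8-sig codec (the sample bytes are made up):

```python
import io

# CSV bytes that start with a UTF-8 BOM (EF BB BF), as saved by some editors.
raw = b'\xef\xbb\xbfRank;Country;Int\n1;NL;100\n'

# Decoding as plain 'utf-8' leaves '\ufeff' attached to the first header;
# the 'utf-8-sig' codec strips the BOM transparently.
with io.TextIOWrapper(io.BytesIO(raw), encoding='utf-8-sig') as f:
    header = f.readline().rstrip('\n').split(';')

print(header)
# → ['Rank', 'Country', 'Int']
```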

CSVW converter to RDF should support preprocessing

Some CSV files are not correct, e.g. because numeric string values are stored as integers (leading to a dropped '0' in some cell values).

This should not be part of a generic converter, but could be taken care of by a CSV-file specific preprocessing step. The converter can then either use a declarative description of the preprocessing (e.g. a JSON instruction) or simply call some external script to perform the fix.
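As a concrete example of such a preprocessing step, restoring leading zeros that were lost when numeric string codes were stored as integers (the 5-digit width below is only an illustration):

```python
def restore_leading_zeros(value, width=5):
    """Re-pad a numeric code that lost its leading zeros as an integer.

    The fixed width is file-specific knowledge, which is precisely why
    this belongs in a per-file preprocessing step, not in the converter.
    """
    return str(value).zfill(width)

print(restore_leading_zeros(123))  # → 00123
```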

Easier to read/understand error code(s)

While converting a .csv with a .nq already present, the following error is shown:
image

Already established in Issue #37 that this is caused by the .nq being present. However, this error message is not clear (enough): maybe reword it (and possibly others) for user convenience?

encoding issue in hsn_hisco converter

Hi, there's an encoding issue with the hsn_hisco.py converter. The script runs fine, but Virtuoso doesn't recognize the output as proper RDF.

To reproduce the error, run the script and run 'rapper -c -i turtle '. (rapper can be installed via brew.)

You'll find some trial code trying to fix the issue commented out in hsn_hisco.py.

Parallel converter tries to format empty rows (None) in URL patterns.

The multiprocessing mapper will provide chunks of fixed size to each process. When the number of rows and the number of chunks do not 'fit', empty rows will be sent to the converter.

The converter still tries to make sense of them, i.e. instantiate them in patterns, but this will fail.

jinja if-else issue

Hi, does anyone understand why:

"valueUrl": "{% if STATUS == -9 %} http://data.socialhistory.org/resource/hsn2013a/status/{STATUS} {% else %} http://data.socialhistory.org/resource/hisco/status/{STATUS} {% endif %}",

would result in:

<http://data.socialhistory.org/resource/hsn2013a/_http:/data.socialhistory.org/resource/hisco/status/-9_>

Parallel converter sometimes fails "NoneType is not callable"

For some files (e.g. the CEDAR_HISCO_se.csv file) the parallel converter throws a "NoneType is not callable" exception. This can be reproduced using the tests.TestConversion.test_parallel_csvw_conversion_CEDAR unit test, e.g.:

(clariah-converters) hoekstra@MB-033068:~/projects/clariah-converters/src$ python -m unittest tests.TestConversion.test_parallel_csvw_conversion_CEDAR
No handlers could be found for logger "rdflib.term"
Quotechar: '"'
Delimiter: ';'
Encoding : 'iso-8859-1'
Only taking encoding, quotechar and delimiter specifications into account...
Starting conversion
Opening CSV file for reading
Starting parsing process
Running with 2 processes
Traceback (most recent call last):
  File "converter/csvw.py", line 235, in _parallel
    for out in pool.imap(burstConvert_partial, enumerate(grouper(self._chunksize, reader))):
  File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 668, in next
    raise value
TypeError: 'NoneType' object is not callable

Looked into various potential reasons, but to no avail. Might be a bug in the Python 2.7 multiprocessing module.

COW Wiki Page 5 Suggestion

The dc:description of the buurten is already descriptive (and in English), so for clarity change the dc:description of the dienstboden column to
dc:description: "(Number of) Maids (per 100 inhabitants)",

and

Explain that ?s ?p ?o stand for subject, predicate, object.

permission issue during install

Hi, I run into a permission issue of some sort. Please find the output below:

pip install cow_csvw
Collecting cow_csvw
Downloading cow_csvw-0.11.tar.gz (6.7MB)
100% |████████████████████████████████| 6.7MB 216kB/s
Requirement already satisfied: alabaster==0.7.9 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: appdirs==1.4.3 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: argh==0.26.2 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: Babel==2.3.4 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: backports-abc==0.5 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: beautifulsoup4==4.4.1 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Collecting certifi==2016.9.26 (from cow_csvw)
Using cached certifi-2016.9.26-py2.py3-none-any.whl
Requirement already satisfied: chardet==2.3.0 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: docutils==0.12 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: html5lib==0.999999999 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: imagesize==0.7.1 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: iribaker==0.1.2 in /Users/RichardZ/git/wp4-converters/src/iribaker (from cow_csvw)
Requirement already satisfied: isodate==0.5.4 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: Jinja2==2.8 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: Js2Py==0.31 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: keepalive==0.5 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: language-tags==0.4.1 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: livereload==2.5.0 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: MarkupSafe==0.23 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: nose==1.3.7 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: nosetests-json-extended==0.1.0 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Collecting numpy==1.11.2 (from cow_csvw)
Using cached numpy-1.11.2-cp27-cp27m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl
Requirement already satisfied: packaging==16.8 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Collecting pandas==0.19.1 (from cow_csvw)
Using cached pandas-0.19.1-cp27-cp27m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl
Requirement already satisfied: pathtools==0.1.2 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: port-for==0.3.1 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Collecting Pygments==2.1.3 (from cow_csvw)
Using cached Pygments-2.1.3-py2.py3-none-any.whl
Collecting pyparsing==2.2.0 (from cow_csvw)
Using cached pyparsing-2.2.0-py2.py3-none-any.whl
Requirement already satisfied: python-dateutil==2.6.0 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: pytz==2015.7 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: PyYAML==3.11 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Collecting rdflib==4.2.2 (from cow_csvw)
Requirement already satisfied: rdflib-jsonld==0.4.0 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: regex==2015.6.24 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: requests==2.8.1 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: rfc3987==1.3.4 in /usr/local/lib/python2.7/site-packages/rfc3987-1.3.4-py2.7.egg (from cow_csvw)
Requirement already satisfied: simplejson==3.8.1 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: singledispatch==3.4.0.3 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: six==1.10.0 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: snowballstemmer==1.2.1 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: SPARQLWrapper==1.7.6 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: Sphinx==1.4.8 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: sphinx-autobuild==0.6.0 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Collecting tornado==4.4.2 (from cow_csvw)
Collecting travis==0.0.2 (from cow_csvw)
Using cached travis-0.0.2-py2.py3-none-any.whl
Requirement already satisfied: tzlocal==1.2 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: unicodecsv==0.14.1 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: uritemplate==0.6 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: watchdog==0.8.3 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: webencodings==0.5 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Requirement already satisfied: Werkzeug==0.11.2 in /usr/local/lib/python2.7/site-packages (from cow_csvw)
Collecting xlrd==1.0.0 (from cow_csvw)
Requirement already satisfied: setuptools>=18.5 in /usr/local/lib/python2.7/site-packages (from html5lib==0.999999999->cow_csvw)
Building wheels for collected packages: cow-csvw
Running setup.py bdist_wheel for cow-csvw ... error
Complete output from command /usr/local/opt/python/bin/python2.7 -u -c "import setuptools, tokenize;file='/private/var/folders/p0/jvvv0sd91kx77mgq3sn0_4xr0000gp/T/pip-build-VbiU1G/cow-csvw/setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" bdist_wheel -d /var/folders/p0/jvvv0sd91kx77mgq3sn0_4xr0000gp/T/tmpJWwRY9pip-wheel- --python-tag cp27:
running bdist_wheel
running build
running build_py
creating build
creating build/lib
creating build/lib/cow_csvw
copying src/init.py -> build/lib/cow_csvw
copying src/config.py -> build/lib/cow_csvw
copying src/cow_tool.py -> build/lib/cow_csvw
copying src/csv2qb.py -> build/lib/cow_csvw
copying src/csv2qber-schema.py -> build/lib/cow_csvw
copying src/csvw_tool.py -> build/lib/cow_csvw
copying src/../../../../../../../../../installer.failurerequests -> build/lib/cow_csvw/../../../../../../../../..
error: [Errno 13] Permission denied: 'build/lib/cow_csvw/../../../../../../../../../installer.failurerequests'


Failed building wheel for cow-csvw
Running setup.py clean for cow-csvw
Failed to build cow-csvw
Installing collected packages: certifi, numpy, pandas, Pygments, pyparsing, rdflib, tornado, travis, xlrd, cow-csvw
Found existing installation: certifi 2017.4.17
Uninstalling certifi-2017.4.17:
Successfully uninstalled certifi-2017.4.17
Found existing installation: numpy 1.9.2
Uninstalling numpy-1.9.2:
Successfully uninstalled numpy-1.9.2
Found existing installation: pandas 0.16.2
Uninstalling pandas-0.16.2:
Successfully uninstalled pandas-0.16.2
Found existing installation: Pygments 2.2.0
Uninstalling Pygments-2.2.0:
Successfully uninstalled Pygments-2.2.0
Found existing installation: pyparsing 2.1.8
Uninstalling pyparsing-2.1.8:
Successfully uninstalled pyparsing-2.1.8
Found existing installation: rdflib 4.2.1
Uninstalling rdflib-4.2.1:
Successfully uninstalled rdflib-4.2.1
Found existing installation: tornado 4.5.1
Uninstalling tornado-4.5.1:
Successfully uninstalled tornado-4.5.1
Found existing installation: xlrd 0.9.4
Uninstalling xlrd-0.9.4:
Successfully uninstalled xlrd-0.9.4
Running setup.py install for cow-csvw ... error
Complete output from command /usr/local/opt/python/bin/python2.7 -u -c "import setuptools, tokenize;file='/private/var/folders/p0/jvvv0sd91kx77mgq3sn0_4xr0000gp/T/pip-build-VbiU1G/cow-csvw/setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record /var/folders/p0/jvvv0sd91kx77mgq3sn0_4xr0000gp/T/pip-3WlA72-record/install-record.txt --single-version-externally-managed --compile:
running install
running build
running build_py
creating build
creating build/lib
creating build/lib/cow_csvw
copying src/init.py -> build/lib/cow_csvw
copying src/config.py -> build/lib/cow_csvw
copying src/cow_tool.py -> build/lib/cow_csvw
copying src/csv2qb.py -> build/lib/cow_csvw
copying src/csv2qber-schema.py -> build/lib/cow_csvw
copying src/csvw_tool.py -> build/lib/cow_csvw
copying src/../../../../../../../../../installer.failurerequests -> build/lib/cow_csvw/../../../../../../../../..
error: [Errno 13] Permission denied: 'build/lib/cow_csvw/../../../../../../../../../installer.failurerequests'

----------------------------------------

Command "/usr/local/opt/python/bin/python2.7 -u -c "import setuptools, tokenize;file='/private/var/folders/p0/jvvv0sd91kx77mgq3sn0_4xr0000gp/T/pip-build-VbiU1G/cow-csvw/setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record /var/folders/p0/jvvv0sd91kx77mgq3sn0_4xr0000gp/T/pip-3WlA72-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /private/var/folders/p0/jvvv0sd91kx77mgq3sn0_4xr0000gp/T/pip-build-VbiU1G/cow-csvw/

error message when offline

When offline and trying to 'convert' something, the following error is reported:

"URLError: <urlopen error [Errno 8] nodename nor servname provided, or not known>"

Is it possible to add "Do you have a working internet connection?" to that error message? I believe the error comes from outside CoW, so I couldn't directly add it myself.
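One way such a hint could be added is by re-wrapping the `URLError` before it propagates. The sketch below is hypothetical (the function names `add_connectivity_hint` and `fetch` are not part of CoW) and only illustrates the requested behaviour:

```python
from urllib.error import URLError
from urllib.request import urlopen

def add_connectivity_hint(exc: URLError) -> URLError:
    # Re-wrap the low-level error with a human-readable hint.
    return URLError(f"{exc.reason} -- do you have a working internet connection?")

def fetch(url: str) -> bytes:
    # Fetch a remote resource; on failure, re-raise with the hint attached.
    try:
        with urlopen(url) as response:
            return response.read()
    except URLError as exc:
        raise add_connectivity_hint(exc) from exc
```

The original error text is preserved in the wrapped exception, so nothing is lost for debugging.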

argument to provide path to *.json and *.nq file

Comment 1:

In the build stage, the *.json file containing the skeleton schema is created in the same directory as the .csv file. Is that logical or should the *.json file be created in the current working directory?

Comment 2:

In the convert stage, the *.json file is expected to be in the same directory as the .csv file, not in the current working directory. This also applies to the *.nq file. Again, is that logical (especially if there is no write permission for the directory in which the .csv file is placed)?

Question:

Irrespective of the above, would it be possible to add an argument specifying the directory in which the *.json file should be created in the build phase, as well as where the *.json file should be read from in the convert stage and the directory to which the *.nq file should be written?
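The requested behaviour could look something like the following sketch. The `--output-dir` flag, file names, and fallback logic are all hypothetical; this is not the actual `cow_tool` CLI, only an illustration of the idea:

```python
import argparse
from pathlib import Path

# Hypothetical --output-dir option: generated files go to the given
# directory, falling back to the input file's directory as today.
parser = argparse.ArgumentParser(prog="cow_tool")
parser.add_argument("mode", choices=["build", "convert"])
parser.add_argument("file", help="input .csv (build) or *-metadata.json (convert)")
parser.add_argument("--output-dir", type=Path, default=None,
                    help="directory for generated .json/.nq files "
                         "(defaults to the input file's directory)")

args = parser.parse_args(["build", "data/imf_gdppc.csv", "--output-dir", "out"])
out_dir = args.output_dir if args.output_dir is not None else Path(args.file).parent
schema_path = out_dir / (Path(args.file).name + "-metadata.json")
print(schema_path)
```

This would also sidestep the write-permission problem, since the output directory no longer has to be the directory containing the .csv file.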

convert stage stopped working

Hi, it appears I can no longer convert files after the build stage. Is there an error log somewhere that I could report on? The output won't do much good:

administrators-MacBook-Pro-3:cow_example RichardZ$ cow_tool build imf_gdppc.csv
Building schema for imf_gdppc.csv
Backed up prior version of schema to imf_gdppc.csv-metadata.json_2017-10-18T12:03:36
Detected encoding: ascii (1.0 confidence)
Detected dialect: csv.dialect (delimiter: ';')
Delimiter is: ;
Found headers: [u'Rank', u'Country', u'Int']
Done
administrators-MacBook-Pro-3:cow_example RichardZ$ cow_tool convert imf_gdppc.csv-metadata.json
Converting imf_gdppc.csv-metadata.json to RDF
Initializing converter for imf_gdppc.csv-metadata.json
Something went wrong, skipping imf_gdppc.csv-metadata.json.
administrators-MacBook-Pro-3:cow_example RichardZ$

@base doesn't seem to be parsed anymore

To sum up the issue, based on the discussion on datalegend's Slack:

the issue appears to be that the base URL defined in @base is not parsed as part of the aboutUrl when using CoW to convert;

  • manually overriding the baseUrl with "" in the prefix specification does not change this;
  • using a combination of a prefix with {_row} does not work either (e.g. hsn:{_row});
  • a quick fix is to write the baseUrl out in full: "aboutUrl": "http://example.org/resource/{_row}".
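In CSVW metadata terms, the workaround amounts to spelling the base URL out inside aboutUrl instead of relying on @base. A minimal sketch (the URLs and the column are made up for illustration):

```json
{
  "@context": ["http://www.w3.org/ns/csvw", {"@base": "http://example.org/resource/"}],
  "url": "imf_gdppc.csv",
  "tableSchema": {
    "aboutUrl": "http://example.org/resource/{_row}",
    "columns": [
      {"name": "Rank", "propertyUrl": "http://example.org/def/rank"}
    ]
  }
}
```

With the fix discussed above, the explicit "http://example.org/resource/" prefix in aboutUrl should no longer be needed once @base is honoured again.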
