
dcterms:subject (free-text) (gcis issue, CLOSED)

usgcrp commented on May 20, 2024
dcterms:subject (free-text)

from gcis.

Comments (15)

bduggan commented on May 20, 2024
  1. Yes, I believe it is possible to construct such a SPARQL query.

justgo129 commented on May 20, 2024

I'd like to reopen this but don't have the permissions. I'll provide the rationale once it is reopened (that would make it easier for me to follow).

bduggan commented on May 20, 2024

What is the rationale?

justgo129 commented on May 20, 2024

@zednis is it a problem for dcterms:subject to contain multiple objects?
Basically, this extends the discussion in #150, where a similar question came up about output file names (i.e. whether or not to split them up in the Turtle).

Once I gain some clarification on it, I am all right with closing.

zednis commented on May 20, 2024

Two things:

  1. We should avoid collapsing multiple values into single literals. The property is not intended to reference an embedded collection but a single value. The correct way to do this in RDF is to make multiple RDF statements reusing the property with different values.

  2. dcterms:subject is intended to be used as an object property and not with literal values.

from http://dublincore.org/documents/dcmi-terms/#terms-subject

This term is intended to be used with non-literal values as defined in the DCMI Abstract Model (http://dublincore.org/documents/abstract-model/). As of December 2007, the DCMI Usage Board is seeking a way to express this intention with a formal range declaration.

Most terms in dcterms are intended to be used with non-literal values. This is what separates the Dublin Core terms (dcterms) properties from the regular Dublin Core (dc) properties, which are intended to be used with literal values.

Guidance on usage is available at http://wiki.dublincore.org/index.php/User_Guide/Publishing_Metadata#dcterms:subject

The correct representation in RDF for dcterms:subject would be like this:

@prefix dcterms: <http://purl.org/dc/terms/> .

<http://data.globalchange.gov/image/ff6a7a8e-d886-4b30-acd7-a3538a787baf>
  dcterms:subject <http://data.globalchange.gov/term/Precipitation> ,
                  <http://data.globalchange.gov/term/observed> ,
                  <http://data.globalchange.gov/term/global> .

This is why dcterms:subject is usually used to reference an instance of a skos:Concept.

The correct representation in RDF for dc:subject would be like this:

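@prefix dc:  <http://purl.org/dc/elements/1.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://data.globalchange.gov/image/ff6a7a8e-d886-4b30-acd7-a3538a787baf>
  dc:subject "Precipitation"^^xsd:string ,
             "observed"^^xsd:string ,
             "global"^^xsd:string .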
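
As a rough illustration of the "one statement per value" rule, the following Python sketch emits N-Triples for either form. It is not taken from the GCIS code base: the function names are hypothetical, and the `/term/` base URI is only the one used in the example above, not an endpoint known to exist.

```python
from typing import List
from urllib.parse import quote

DCTERMS_SUBJECT = "http://purl.org/dc/terms/subject"
DC_SUBJECT = "http://purl.org/dc/elements/1.1/subject"
# Assumption: base URI borrowed from the example in this comment.
TERM_BASE = "http://data.globalchange.gov/term/"


def escape_literal(value: str) -> str:
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def subject_statements(resource: str, keywords: List[str], as_uri: bool) -> List[str]:
    """Return one N-Triples statement per keyword, never a collapsed literal."""
    statements = []
    for keyword in keywords:
        if as_uri:
            # dcterms:subject with a non-literal (URI) object
            predicate = DCTERMS_SUBJECT
            obj = "<" + TERM_BASE + quote(keyword, safe="") + ">"
        else:
            # dc:subject with a plain string literal
            predicate = DC_SUBJECT
            obj = '"' + escape_literal(keyword) + '"'
        statements.append(f"<{resource}> <{predicate}> {obj} .")
    return statements


image = "http://data.globalchange.gov/image/ff6a7a8e-d886-4b30-acd7-a3538a787baf"
for line in subject_statements(image, ["Precipitation", "observed", "global"], as_uri=True):
    print(line)
```

Passing `as_uri=False` produces the dc:subject variant with one literal per statement.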

zednis commented on May 20, 2024

@bduggan this issue should be reopened, and we should modify your usage of dcterms:subject because it is incorrect.

bduggan commented on May 20, 2024

I would propose removing dcterms:subject from the rdf. This falls into the larger category of controlled vocabularies which is work in progress. Again, this data was provided as a free form text field, and we do not want to make any assertions about splitting the contents and making URIs for various phrases there.

zednis commented on May 20, 2024

I think this information is too important to just drop. What if we have a student work on a post-process to split the free-text contents into separate text keywords? We can then use either dc:subject or dcterms:subject as mentioned above.
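
A minimal sketch of what such a post-process might look like, assuming the free-text field separates keywords with commas or semicolons (the thread does not confirm the delimiter, and `split_keywords` is a hypothetical name):

```python
import re
from typing import List


def split_keywords(free_text: str) -> List[str]:
    """Split a free-text field into trimmed, de-duplicated keywords.

    Assumption: keywords are separated by commas or semicolons.
    Duplicates are detected case-insensitively; the first spelling wins.
    """
    seen = set()
    keywords = []
    for part in re.split(r"[,;]", free_text):
        keyword = " ".join(part.split())  # collapse internal whitespace
        key = keyword.casefold()
        if keyword and key not in seen:
            seen.add(key)
            keywords.append(keyword)
    return keywords


print(split_keywords("Precipitation, observed, global"))
# ['Precipitation', 'observed', 'global']
```

This only separates the values; it makes no claim about what kind of thing each keyword is.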

bduggan commented on May 20, 2024

The lists in the attributes are not constrained, and there are no definitions. An example: "Precipitation, projections, seasonal, CMIP5, RCP2.6" -- a collection of keywords plus a model and a scenario. I would argue that treating these as dcterms:subject values is not a good idea -- better would be for the image to be associated with /model/cmip5 and /scenario/rcp2.6. Once we start properly managing controlled vocabularies we will want definitions for each of the terms as well as how they relate to other controlled vocabularies.
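
To make the distinction concrete, here is a hedged Python sketch that separates keywords naming an existing resource from genuinely free keywords. The lookup table is hypothetical and contains only the two paths mentioned above; a real version would presumably be driven by the database.

```python
from typing import Dict, List, Tuple

# Hypothetical lookup table: only the two paths named in this comment.
KNOWN_RESOURCES: Dict[str, str] = {
    "cmip5": "/model/cmip5",
    "rcp2.6": "/scenario/rcp2.6",
}


def classify_keywords(keywords: List[str]) -> Tuple[Dict[str, str], List[str]]:
    """Split keywords into (keyword -> resource path) and leftover keywords."""
    linked: Dict[str, str] = {}
    unclassified: List[str] = []
    for keyword in keywords:
        path = KNOWN_RESOURCES.get(keyword.strip().casefold())
        if path is not None:
            linked[keyword] = path
        else:
            unclassified.append(keyword)
    return linked, unclassified


linked, rest = classify_keywords(
    ["Precipitation", "projections", "seasonal", "CMIP5", "RCP2.6"]
)
print(linked)  # {'CMIP5': '/model/cmip5', 'RCP2.6': '/scenario/rcp2.6'}
print(rest)    # ['Precipitation', 'projections', 'seasonal']
```

The leftover keywords are the ones that would still need definitions in a controlled vocabulary.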

I think these lists are important to keep in mind and should help inform our discussions about representing (and curating) controlled vocabularies.

zednis commented on May 20, 2024

I don't want to just throw this data out.

I think it makes sense to use dc:subject in the meantime before we have controlled vocabularies fully in use - and it will allow us to support keywords that don't map to a term in a controlled vocabulary.

We should do some data science to post-process the free-text data we have into delimited keywords, but that's why we have students ;-)

bduggan commented on May 20, 2024

I'm not so worried about the effort of splitting these up; I'm more worried about making a category mistake with this data. Besides models and scenarios, there are also regions and even platforms/instruments: things for which we already have URIs.

Anyway, eventually, yes, we do want URIs for terms, perhaps under "/term" (as in your example). To support this, we need to decide what is returned from the /term endpoint, i.e. what are the attributes and relationships of a term? (Not in the lexicon sense but in the controlled vocabulary sense.)

Also, I wasn't suggesting throwing it out, just leaving it out of the RDF for now; it'll still be in the database and we can add it once we iron out our representation of controlled vocabularies.

aulenbac commented on May 20, 2024

@bduggan is right. Think engineered. Think lean and automated. And think of an operational GCIS ontology that is automatically generated on a weekly schedule using that lean automated approach. Manual work is not an option; we do not have students for new work like this nor should we assume we shall in the future.

Actually, I am okay with throwing this "data" out. Based on two years of GCIS work, Attributes is a spotty, usually blank, catch-all that few bother to fill out. When they do, the values are amazingly varied and not terribly useful. Perhaps we should remove the Attributes field completely.

justgo129 commented on May 20, 2024

I'd be all right with removing the "attributes" field completely given the lack of criteria for the process of assigning them. Let me confirm first that this is so and will update.

zednis commented on May 20, 2024

ok, I have been persuaded to leave this content out of the RDF for now. I support Brian's suggestion of revisiting this and eventually bringing it back with controlled vocabularies.

justgo129 commented on May 20, 2024

Works for me.

On Wed, Jun 24, 2015 at 2:26 PM, Stephan Zednik wrote:

ok, I have been persuaded to leave this content out of the RDF for now. I support Brian's suggestion of revisiting this and eventually bringing it back with controlled vocabularies.


