linkedart's People

Contributors

saminorling

linkedart's Issues

Acquisition date clean-up and modeling

Need to review Accession Dates (TitAccessionDate) in EMu for consistency in date formatting.

This field is the best option for assigning a date to the acquisition event (E8_Acquisition), though it is not always populated. Regardless of how the date is entered into EMu, it needs to be transformed into the date format appropriate for CRM.
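A minimal sketch of that transformation, assuming the review finds only a few input shapes (ISO dates, US-style slashed dates, and bare years). The format list and the function name `accession_timespan` are hypothetical and should be replaced with whatever the review of TitAccessionDate actually turns up. The output uses the Linked Art TimeSpan boundary properties.

```python
from datetime import date, datetime

# Hypothetical formats; replace with the ones actually found in TitAccessionDate.
FULL_DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def accession_timespan(raw):
    """Return a TimeSpan dict for an accession date string, or None if unusable."""
    text = (raw or "").strip()
    if not text:
        return None  # the field is not always populated

    first = last = None
    for fmt in FULL_DATE_FORMATS:
        try:
            first = last = datetime.strptime(text, fmt).date()
            break
        except ValueError:
            continue

    # A bare four-digit year covers the whole year.
    if first is None and len(text) == 4 and text.isdecimal() and int(text) > 0:
        first, last = date(int(text), 1, 1), date(int(text), 12, 31)

    if first is None:
        return None  # unrecognized format: leave for manual clean-up

    return {
        "type": "TimeSpan",
        "begin_of_the_begin": f"{first.isoformat()}T00:00:00Z",
        "end_of_the_end": f"{last.isoformat()}T23:59:59Z",
    }
```

Values that return None can be logged and sent back for clean-up rather than guessed at.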

Quotations in strings

Some fields contain quotation marks as catalogued in EMu. These need to be escaped or replaced to avoid invalid JSON output. This is not addressed in the current transformation, so it needs rework.
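One way to avoid the problem, sketched here on the assumption that the output can be built as Python dictionaries and serialized at the end instead of being assembled from string templates: a JSON serializer escapes embedded quotation marks on its own. The sample title is invented for illustration.

```python
import json


def to_json_text(record):
    """Serialize a record; json.dumps escapes quotes, backslashes and newlines."""
    return json.dumps(record, ensure_ascii=False, indent=2)


# Invented sample value containing quotation marks as they might appear in EMu.
title = 'Study for "The Bathers"'
encoded = to_json_text({"type": "Name", "content": title})

# Round-tripping shows the output is valid JSON and the quotes survive intact.
decoded = json.loads(encoded)
```

With this approach the catalogued quotation marks are preserved in the data rather than replaced.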

collection.imamuseum.org IDs/URLs

Dagwood IDs are different than EMu IRNs, and are not currently cataloged in EMu. This makes it difficult to automate creation of Homepage pattern.

Need to work with Registration to determine an appropriate place in EMu to document either Dagwood IDs or full collection.imamuseum.org URLs.

Thoughts:

  • Link Catalogue records (BibReference_tab) to Bibliography records (BibRecordType = Web Site) representing each object URL (WebIdentifier)
  • For objects with images, does it make sense to also represent image URLs in this same way? Another option for capturing these in EMu could be the Image Reference table in the References tab (RefImageType_tab + RefImageReference_tab) - would need to add an Image Reference Type to the thesaurus for this use case.

Creation Locations

Creation Locations are catalogued in a table in EMu with the following headers:

  • Country
  • Province/State/Territory
  • District/County/Shire
  • City/Town

Data is entered into these fields very inconsistently, particularly where the information doesn't fit cleanly into these categories. Significant clean-up needs to be undertaken, in collaboration with registrars. All values in these fields are controlled by a look-up list, so proliferation of bad values continues. Clean-up will need to include clearing look-up list values.

Not modeling this information for now, but hope to undertake in the future.

After/Following creators

How should After/Followers be handled? Currently ignoring creators with such qualifiers.
EDIT: See list of all qualifiers present in IMA data two comments below.

Related to this, we don't catalog Workshops of (or related terms) as individual Parties records, but use the qualifier "Workshop of" with the actual record for the artist (e.g. Titian). Don't have an easy way to assign a unique identifier to these groups in the absence of an individual record. These are also being ignored by the current mapping, due to the presence of the qualifier.

Overall Dimensions vs. Dimensions of Parts

EMu contains a table for dimensions with headers:

Type, Height, Width, Depth, Diameter, Unit (Length), Weight, Unit (Weight), Dimension Notes

The Type value can be used as an indicator for whether the dimensions on that line of the table can be attributed to a part of an object (one of few areas where our data lends itself to partitioning), or if the dimensions captured on that line refer to the object as a whole. Type is a relatively controlled list, with key terms. These values need to be evaluated for whether they indicate part or whole dimensions.

Portfolio and Series Titles - partition?

Portfolio Title and Series Title are captured in single text fields on object records. Current mapping to Linked Art reflects this, transforming values to P1_is_identified_by statements with E55_Type "series titles" and "titles". Side note: there is not a fitting AAT vocab term for "portfolio titles".

Would it make more sense to extrapolate parent record information from these title fields? i.e., create separate MMO records of type "portfolios (groups of works)" - http://vocab.getty.edu/aat/300179434 - and "series (object groupings)" - http://vocab.getty.edu/aat/300027349, with title information (and little else).

Issue: this may cause conflict in instances where portfolios/series are catalogued as blanket records with individual works as parts. To avoid creating duplicate MMOs with conflicting URIs, I could set a conditional transformation: only if an object with a series or portfolio title is NOT flagged as a part record, create the inferred MMO pattern in the JSON. (Note: since there aren't IRNs to associate with these inferred series/portfolios, the URIs would need to be something like data.discovernewfields.org/series/[lowercase-series-name].)
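A rough sketch of that conditional, under several assumptions: the `https://` scheme, the `portfolio` path segment, the slug rule, and the `is_part_record` flag are all hypothetical, and `HumanMadeObject` is the current Linked Art class name for what these notes call MMO.

```python
import re

AAT_SERIES = "http://vocab.getty.edu/aat/300027349"
AAT_PORTFOLIO = "http://vocab.getty.edu/aat/300179434"
OBJECT_TYPE = "HumanMadeObject"  # "ManMadeObject" in older Linked Art versions


def slugify(title):
    """Lowercase the title and collapse anything non-alphanumeric to hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def inferred_group(title, kind, is_part_record):
    """Build an inferred series/portfolio record, or None if it should be skipped.

    kind is "series" or "portfolio"; is_part_record is a hypothetical flag
    meaning the object is already catalogued as a part of a blanket record.
    """
    label = (title or "").strip()
    slug = slugify(label)
    if is_part_record or not slug:
        return None
    aat = AAT_SERIES if kind == "series" else AAT_PORTFOLIO
    return {
        "id": f"https://data.discovernewfields.org/{kind}/{slug}",
        "type": OBJECT_TYPE,
        "_label": label,
        "classified_as": [{"id": aat, "type": "Type"}],
        "identified_by": [{"type": "Name", "content": label}],
    }
```

Because the URI is derived from the title, spelling variants of the same series title would produce different URIs, so the title fields would need to be consistent first.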

Need to think through this more and also ask Editorial Board about how series/portfolio titles relate to partitioning.

Parse Production Events for multiple Creators

Activities cannot be carried_out_by multiple Actors. Instead, the overarching Production event is broken into multiple sub-Production events (parts_of), each with their own Actors. Need to rework in the transformation file.
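A sketch of the reworked pattern, assuming creators arrive as a list of dictionaries with `id` and `name` keys (a hypothetical input shape). The JSON key for sub-events is held in a constant because the exact property name depends on the Linked Art version being targeted; `part` is an assumption here.

```python
PART_KEY = "part"  # assumption: adjust to the property name the target model uses


def actor(creator):
    """Minimal actor reference built from a hypothetical creator dict."""
    return {"id": creator["id"], "type": "Actor", "_label": creator["name"]}


def production_event(creators):
    """Build a Production event, splitting it into sub-events per creator."""
    event = {"type": "Production"}
    if len(creators) == 1:
        # A single creator can sit directly on the overarching event.
        event["carried_out_by"] = [actor(creators[0])]
    elif len(creators) > 1:
        # Multiple creators: one sub-Production per creator, no actors on the parent.
        event[PART_KEY] = [
            {"type": "Production", "carried_out_by": [actor(c)]} for c in creators
        ]
    return event
```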

Mark Description - AAT Mapping

In EMu, we have a field titled "Mark Description" (backend name = CrePrimaryInscriptions). It seems to be used to capture notes about marks, inscriptions, and signatures on works.

There are individual AAT terms for each of those three items, but it would be difficult to determine in which way the field is being used at a given time. How should we type this Linguistic Object?

Thoughts: there may be some consistency in terminology used in the field that could be used as a flag while generating the JSON-LD. For example, the field sometimes starts with "Signed: " - could this indicate that the contents are ALWAYS of type signature?
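Before relying on a prefix such as "Signed: ", it may help to count which leading labels actually occur in the field. The sketch below tallies the text before the first colon; the sample values are invented, and the returned labels are plain strings rather than AAT URIs, which would still need to be looked up.

```python
from collections import Counter


def leading_label(text):
    """Return the lowercased text before the first colon, or "" if there is none."""
    head, sep, _ = (text or "").partition(":")
    return head.strip().lower() if sep else ""


def prefix_counts(values):
    """Tally leading labels across all Mark Description values."""
    return Counter(leading_label(v) for v in values)


def guess_mark_type(text):
    """Tentative typing rule: only the "Signed:" prefix is treated as a signature."""
    return "signature" if leading_label(text) == "signed" else "unknown"


# Invented sample values for illustration only.
samples = [
    "Signed: lower right, J. Smith",
    "signed: verso",
    "Stamped: maker's mark",
    "illegible inscription on base",
]
counts = prefix_counts(samples)
```

Records whose label comes back as "unknown" would still need a general type or manual review.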

IMA Locations Clean-up and Transformation

Review Locations Levels 1-3 and clean as needed. Order should be:

Level 1 = IMA
Level 2 = Name of gallery (if applicable - contains "galler" or "suite"); other location info.
Level 3 = Gallery code (if gallery), otherwise more specific info.

Level 2 (code) values to represent in transformation:

  • On Loan
  • see related parts
  • Art Study Room (S90)
  • Westerley (name of room)
  • Efroymson Family Entrance Pavilion (F02)
  • The Virginia B. Fairbanks Art & Nature Park
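The level ordering described above could be applied roughly as follows. This is a sketch only: the function name, the output keys, and the sample values in the usage note are hypothetical, and the gallery test is simply the "galler"/"suite" substring rule from the notes.

```python
GALLERY_KEYWORDS = ("galler", "suite")


def parse_location(level1, level2, level3):
    """Interpret Location Levels 1-3 using the gallery keyword rule."""
    name = (level2 or "").strip()
    detail = (level3 or "").strip()
    is_gallery = any(keyword in name.lower() for keyword in GALLERY_KEYWORDS)
    return {
        "institution": (level1 or "").strip(),
        "is_gallery": is_gallery,
        "name": name,
        # Level 3 is a gallery code only when Level 2 names a gallery.
        "gallery_code": detail if is_gallery else None,
        "detail": None if is_gallery else (detail or None),
    }
```

For example, `parse_location("IMA", "Example Gallery", "X01")` yields a gallery with code "X01", while `parse_location("IMA", "On Loan", "")` is treated as a non-gallery location with no further detail.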

Blank nodes

Blank nodes are allowable for attributes that are not themselves entities. This includes:

  • Timespans
  • Dimensions
  • Values/Monetary Amounts
  • etc.

Essentially, if an element would not need to be referenced by multiple sources, it does not need a dereference-able URI.

Need to update the transformation logic and tracking spreadsheet to no longer create URIs for:

  • Dimensions
  • Production Timespans
  • Check for other elements that don't need URIs.
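If the existing output already carries URIs on these nodes, one option is a post-processing pass that strips them. The sketch below assumes the JSON-LD is available as nested Python dictionaries and that nodes can be recognized by their `type` value; the set of types and the example.org URIs in the test data are illustrative.

```python
# Node types treated as blank nodes; extend after checking for other elements.
BLANK_NODE_TYPES = {"TimeSpan", "Dimension", "MonetaryAmount"}


def drop_blank_node_ids(node):
    """Return a copy of the structure with "id" removed from blank-node types."""
    if isinstance(node, list):
        return [drop_blank_node_ids(item) for item in node]
    if isinstance(node, dict):
        cleaned = {key: drop_blank_node_ids(value) for key, value in node.items()}
        if cleaned.get("type") in BLANK_NODE_TYPES:
            cleaned.pop("id", None)
        return cleaned
    return node
```

The function builds new dictionaries instead of editing in place, so the original record can be kept for comparison.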

Partitioning Medium and Support - data consistency

In EMu, Medium and Support are catalogued in two tables, which are output in XML as:

<table name="Medium">
  <tuple>
    <atom name="PhyMedium">fabric</atom>
  </tuple>
  <tuple>
    <atom name="PhyMedium">plastic</atom>
  </tuple>
</table>
<table name="Support">
  <tuple>
    <atom name="PhySupport">structural foam</atom>
  </tuple>
</table>

Unfortunately, data is inconsistent. For example:

<table name="Medium">
  <tuple>
    <atom name="PhyMedium">paper</atom>
  </tuple>
</table>
  • Paper should be listed as a Support, not Medium

or

<table name="Medium">
  <tuple>
    <atom name="PhyMedium">plastic</atom>
  </tuple>
  <tuple>
    <atom name="PhyMedium">paint</atom>
  </tuple>
</table>
<table name="Support">
  <tuple>
    <atom name="PhySupport">steel</atom>
  </tuple>
  <tuple>
    <atom name="PhySupport">plastic</atom>
  </tuple>
  <tuple>
    <atom name="PhySupport">paint</atom>
  </tuple>
</table>
  • plastic and paint are repeated under Support, also listed in Medium

How will incorrect data affect partitioning of Support and Medium?

If too complicated, can simply represent Medium(s) and Support(s) with made-of syntax. If we go this route though, will need to avoid duplication of made-of statements when values are repeated either within a single table or across both tables (e.g., XML example directly above).
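A sketch of the simplified route, de-duplicating values across both tables before emitting made-of statements. It assumes the two tables sit under a single root element (the `<record>` wrapper here is hypothetical) and treats values as duplicates when they match case-insensitively. Labels only; AAT identifiers for materials would need a separate lookup.

```python
import xml.etree.ElementTree as ET

MATERIAL_ATOMS = ("PhyMedium", "PhySupport")


def material_values(record_xml):
    """Collect unique Medium/Support values in first-seen order."""
    root = ET.fromstring(record_xml)
    seen, ordered = set(), []
    for atom in root.iter("atom"):
        if atom.get("name") not in MATERIAL_ATOMS:
            continue
        value = (atom.text or "").strip()
        if value and value.lower() not in seen:
            seen.add(value.lower())
            ordered.append(value)
    return ordered


def made_of(record_xml):
    """One Material statement per unique value."""
    return [{"type": "Material", "_label": v} for v in material_values(record_xml)]


# The repeated-values example from above, wrapped in a hypothetical root element.
SAMPLE = """<record>
<table name="Medium">
  <tuple><atom name="PhyMedium">plastic</atom></tuple>
  <tuple><atom name="PhyMedium">paint</atom></tuple>
</table>
<table name="Support">
  <tuple><atom name="PhySupport">steel</atom></tuple>
  <tuple><atom name="PhySupport">plastic</atom></tuple>
  <tuple><atom name="PhySupport">paint</atom></tuple>
</table>
</record>"""
```

For the sample, `material_values(SAMPLE)` gives plastic, paint and steel once each.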

Identifier Types

Since we publish non-accessioned items (e.g., long-term loans to the PC) online, we will need a Linked Art (LA) JSON-LD representation for them. Current transformation applies the "Accession Number" type to all values coming from TitObjectID - need to build in the logic for nuances when the item is NOT accessioned.

Related to this, whatever identifier type we go with for this, should probably also be applied to Previous Accession Number (the field name is a bit of a misnomer).

Adjust classification mapping logic

Per the standard list of vocabulary terms to be used in LA, there are established categories of artworks and other Human-Made_Objects that are recommended for use.

Current IMA mapping assigns classifications of "artwork" for all records + IMA thesaurus terms based on PhyMediaCategory_tab. This is incorrect. Only artworks should be considered artworks (e.g., design collection and similar artifacts should not receive this type).

Solution: use TitObjectType for major categories (e.g., "Visual Works: Paintings"). Where there is not a clear mapping to our Record Types, a broad AAT category won't be available, just the IMA thesaurus term.

Long Term Loans

Long term loans to the permanent collection ARE published online, but are NOT owned by IMA. Add in logic to the owner pattern to not make the statement when the Legal Status is not "Accessioned."
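The conditional could look something like this. The function name, the owner reference passed in, and the comparison against the literal string "Accessioned" are assumptions based on the wording above; the actual Legal Status values in EMu should be checked first.

```python
def owner_pattern(legal_status, owner):
    """Return the ownership statement, or an empty dict if the item is not accessioned."""
    if (legal_status or "").strip().lower() != "accessioned":
        return {}  # e.g. long term loans: published online, but not owned
    return {"current_owner": [owner]}


# Hypothetical owner reference; the real URI would come from the mapping.
IMA = {"id": "https://example.org/group/ima", "type": "Group", "_label": "IMA"}

record = {"type": "HumanMadeObject", "_label": "Example object"}
record.update(owner_pattern("Accessioned", IMA))
```

Returning an empty dictionary lets the caller merge the result unconditionally with `update`.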

Creator Types

Currently all creators (whether individuals, organizations, collaborations, or cultures) are represented as Type: Actor in the "carried_out_by" statements of object production events.

Should this be modified? For example, should cultures be typed as "Group"?

For EMu party records, I can pull the Party record type to identify Person vs. Organization. Is Collaboration a type?

Previous Accession Number(s)

Now that I have answers from Registration re: allowable values in TitPreviousAccessionNo, need to rework the transformation to address all possible values.

Notes:

  • Clean-up complete and ingested into EMu
  • Field contains non-IMA identifiers, but we will only represent IMA values in Linked Art data
  • Single text field, but when there are multiple previous accession numbers, they are delimited by " | "
  • Ignore "No TR Number"
  • Possible IMA identifier starters: TR (temp), U (Unknown), NON-ART (non-art), S (Study), E (Eiteljorg)
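The notes above can be sketched as a small parser. One assumption is added here and should be confirmed with Registration: a starter only counts when it is not immediately followed by another letter, so that a value like "Sotheby's" is not mistaken for a Study ("S") number. The sample string in the usage note is invented.

```python
import re

# IMA starters from the notes; the "not followed by a letter" rule is an assumption.
IMA_PATTERN = re.compile(r"^(TR|U|NON-ART|S|E)(?![A-Za-z])")
IGNORED_VALUES = {"No TR Number"}


def ima_previous_numbers(raw):
    """Split TitPreviousAccessionNo on "|" and keep only IMA-style identifiers."""
    results = []
    for part in (raw or "").split("|"):
        value = part.strip()
        if not value or value in IGNORED_VALUES:
            continue
        if IMA_PATTERN.match(value):
            results.append(value)
    return results
```

For an invented value such as `"TR1234/5 | 56.78 (Herron) | No TR Number | S2001.12"`, this keeps `TR1234/5` and `S2001.12` and drops the rest.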

Research:

  • Is there a more fitting AAT classification for retired identifiers than general "identification numbers"?
