imamuseum / linkedart
Transforming IMA objects, creators, and exhibitions data to the Linked Art data model.
Home Page: https://linked.art
Need to review Accession Dates (TitAccessionDate) in EMu for consistency in date formatting.
This field is the best option for assigning a date to the acquisition event (E8_Acquisition), though it is not always populated. Regardless of how the date is entered in EMu, it needs to be transformed into the date format expected in CRM.
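A minimal sketch of what that normalization could look like, assuming the target is an ISO 8601 style string. The list of input formats is hypothetical; the real list would come out of the review of TitAccessionDate values.

```python
from datetime import datetime

# Hypothetical input formats; replace with whatever the EMu review turns up.
CANDIDATE_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%d %B %Y", "%B %d, %Y", "%Y"]


def normalize_accession_date(raw):
    """Return an ISO 8601 date string, a bare year, or None if nothing matches."""
    if raw is None:
        return None
    text = raw.strip()
    if not text:
        return None
    for fmt in CANDIDATE_FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue
        # Year-only values keep their original precision instead of gaining a fake month/day.
        return text if fmt == "%Y" else parsed.date().isoformat()
    # Unrecognized values are left for manual review rather than guessed at.
    return None
```

Values that return None can be logged as a clean-up list for the EMu review.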
Some fields contain quotation marks as catalogued in EMu. These need to be escaped or replaced to avoid invalid JSON output. This is not addressed in the current transformation, so it needs rework.
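One possible approach, sketched here on the assumption that the output can be built with a JSON library rather than by string templating: let the serializer do the escaping. The field name and sample value are made up for illustration.

```python
import json


def to_json_record(fields):
    """Serialize a dict of field values; json.dumps escapes embedded quotes and backslashes."""
    return json.dumps(fields, ensure_ascii=False)


# Made-up value containing double quotes, as might be catalogued in EMu.
record = {"title": 'Study for "The Bathers"'}
serialized = to_json_record(record)
```

Reading `serialized` back with `json.loads` returns the original value with its quotation marks intact, so nothing has to be stripped from the catalogued text.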
Dagwood IDs are different from EMu IRNs and are not currently catalogued in EMu. This makes it difficult to automate creation of the Homepage pattern.
Need to work with Registration to determine an appropriate place in EMu to document either Dagwood IDs or full collection.imamuseum.org URLs.
Thoughts:
Per Linked Art Issue #296, "brief text" should be modeled as a metatype of the linguistic object's main type.
Creation Locations are catalogued in a table in EMu with the following headers:
Data is entered into these fields very inconsistently, particularly where the information doesn't fit cleanly into these categories. Significant clean-up needs to be undertaken, in collaboration with registrars. All values in these fields are controlled by a look-up list, so proliferation of bad values continues. Clean-up will need to include clearing look-up list values.
Not modeling this information for now, but hope to undertake in the future.
How should After/Followers be handled? Currently ignoring creators with such qualifiers.
EDIT: See list of all qualifiers present in IMA data two comments below.
Related to this, we don't catalog "Workshop of" (or related terms) as individual Parties records, but use the qualifier "Workshop of" with the actual record for the artist (e.g. Titian). We don't have an easy way to assign a unique identifier to these groups in the absence of an individual record. These are also being ignored by the current mapping, due to the presence of the qualifier.
EMu contains a table for dimensions with headers:
Type, Height, Width, Depth, Diameter, Unit (Length), Weight, Unit (Weight), Dimension Notes
The Type value can be used as an indicator for whether the dimensions on that line of the table can be attributed to a part of an object (one of few areas where our data lends itself to partitioning), or if the dimensions captured on that line refer to the object as a whole. Type is a relatively controlled list, with key terms. These values need to be evaluated for whether they indicate part or whole dimensions.
Portfolio Title and Series Title are captured in single text fields on object records. The current mapping to Linked Art reflects this, transforming values to P1_is_identified_by statements with E55_Type "series titles" and "titles". Side note: there is no fitting AAT vocabulary term for "portfolio titles".
Would it make more sense to extrapolate parent record information from these title fields? i.e., create separate MMO records of type "portfolios (groups of works)" - http://vocab.getty.edu/aat/300179434 - and "series (object groupings)" - http://vocab.getty.edu/aat/300027349, with title information (and little else).
One issue: this may cause conflicts in instances where portfolios/series are catalogued as blanket records with individual works as parts. Since there aren't IRNs to associate with these inferred series/portfolios, their URIs would need to be something like data.discovernewfields.org/series/[lowercase-series-name]. To avoid creating duplicate MMOs with conflicting URIs, I could set a conditional transformation that creates the inferred MMO pattern in the JSON only if an object with a series or portfolio title is NOT flagged as a part record.
Need to think through this more and also ask the Editorial Board about how series/portfolio titles relate to partitioning.
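A rough sketch of the conditional transformation described above. The record keys (`series_title`, `portfolio_title`, `is_part`), the `https://` scheme, the `/portfolio/` path segment, and the `HumanMadeObject` type label are all assumptions for illustration; only the AAT URIs and the `/series/[lowercase-series-name]` pattern come from the notes.

```python
import re

BASE_URI = "https://data.discovernewfields.org"  # scheme is an assumption

# AAT terms quoted in the notes above.
GROUP_TYPES = {
    "series": "http://vocab.getty.edu/aat/300027349",
    "portfolio": "http://vocab.getty.edu/aat/300179434",
}


def slugify(title):
    """Lowercase the title and collapse runs of non-alphanumerics into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def inferred_groups(record):
    """Return inferred series/portfolio objects, or [] when the record is a part record."""
    if record.get("is_part"):
        return []
    groups = []
    for kind, aat_uri in GROUP_TYPES.items():
        title = (record.get(f"{kind}_title") or "").strip()
        if not title:
            continue
        groups.append({
            "id": f"{BASE_URI}/{kind}/{slugify(title)}",
            "type": "HumanMadeObject",
            "_label": title,
            "classified_as": [{"id": aat_uri, "type": "Type"}],
        })
    return groups
```

Because the URI is derived from the title alone, two objects sharing a series title resolve to the same inferred record, but differently spelled titles for the same series would not.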
Activities cannot be carried_out_by multiple Actors. Instead, the overarching Production event is broken into multiple sub-Production events (parts_of), each with their own Actors. Need to rework in the transformation file.
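A sketch of how the reworked production pattern might be assembled. The input shape is hypothetical, and the `part` key name is an assumption about the JSON-LD context that should be checked against the Linked Art documentation. Keeping a single creator directly on the parent event is a design choice of this sketch, not something settled above.

```python
def build_production(creators):
    """Build a Production node from a list of {'id': ..., '_label': ...} dicts."""
    def actor(creator):
        return {"id": creator["id"], "type": "Actor", "_label": creator["_label"]}

    production = {"type": "Production"}
    if len(creators) == 1:
        # A single creator can sit directly on the production event.
        production["carried_out_by"] = [actor(creators[0])]
    elif len(creators) > 1:
        # Multiple creators: one sub-production per creator, each with its own actor.
        production["part"] = [
            {"type": "Production", "carried_out_by": [actor(c)]} for c in creators
        ]
    return production
```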
In EMu, we have a field titled "Mark Description" (backend name = CrePrimaryInscriptions). It seems to be used to capture notes about marks, inscriptions, and signatures on works.
There are individual AAT terms for each of those three items, but it would be difficult to determine in which way the field is being used at a given time. How should we type this Linguistic Object?
Thoughts: there may be some consistency in the terminology used in the field that could serve as a flag while generating the JSON-LD. For example, the field sometimes starts with "Signed: " - could this indicate that the contents are ALWAYS of type signature?
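Before relying on such a flag, it may help to tally which leading labels actually occur in the field. A minimal sketch, assuming labels are whatever precedes the first colon; only the "Signed: " prefix comes from the notes, and the "signature" return value is a placeholder for whichever AAT term is chosen.

```python
from collections import Counter


def leading_label(text):
    """Return the lowercased text before the first colon, or None if there is no colon."""
    head, sep, _ = text.partition(":")
    return head.strip().lower() if sep else None


def label_counts(values):
    """Count leading labels across all non-empty Mark Description values."""
    return Counter(leading_label(v) for v in values if v and v.strip())


def guess_mark_type(text):
    """Return 'signature' for values starting with 'Signed:', else None (undetermined)."""
    return "signature" if leading_label(text) == "signed" else None
```

Running `label_counts` over an export of CrePrimaryInscriptions would show how consistent the prefixes are and whether other labels are common enough to map as well.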
Review Locations Levels 1-3 and clean as needed. Order should be:
Level 1 = IMA
Level 2 = Name of gallery (if applicable - contains "galler" or "suite"); other location info.
Level 3 = Gallery code (if gallery), otherwise more specific info.
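The ordering above could be checked per record with something like the following sketch. The output keys are hypothetical; the "galler"/"suite" test is the one stated in the Level 2 rule.

```python
def describe_location(level1, level2, level3):
    """Interpret cleaned location levels as gallery name/code or as general detail."""
    name = (level2 or "").strip()
    detail = (level3 or "").strip()
    lowered = name.lower()
    is_gallery = "galler" in lowered or "suite" in lowered
    return {
        "institution": level1,
        "is_gallery": is_gallery,
        # For galleries, Level 3 is read as the gallery code.
        "gallery_name": name if is_gallery else None,
        "gallery_code": detail if is_gallery and detail else None,
        # Otherwise Levels 2 and 3 are kept as free-text location info.
        "other_info": [] if is_gallery else [part for part in (name, detail) if part],
    }
```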
Level 2 (code) values to represent in transformation:
Blank nodes are allowable for attributes that are not themselves entities. This includes:
Essentially, if an element would not need to be referenced by multiple sources, it does not need a dereference-able URI.
Need to update the transformation logic and tracking spreadsheet to no longer create URIs for:
In EMu, Medium and Support are catalogued in two tables, which are output in XML as:
<table name="Medium">
  <tuple>
    <atom name="PhyMedium">fabric</atom>
  </tuple>
  <tuple>
    <atom name="PhyMedium">plastic</atom>
  </tuple>
</table>
<table name="Support">
  <tuple>
    <atom name="PhySupport">structural foam</atom>
  </tuple>
</table>
Unfortunately, data is inconsistent. For example:
<table name="Medium">
  <tuple>
    <atom name="PhyMedium">paper</atom>
  </tuple>
</table>
or
<table name="Medium">
  <tuple>
    <atom name="PhyMedium">plastic</atom>
  </tuple>
  <tuple>
    <atom name="PhyMedium">paint</atom>
  </tuple>
</table>
<table name="Support">
  <tuple>
    <atom name="PhySupport">steel</atom>
  </tuple>
  <tuple>
    <atom name="PhySupport">plastic</atom>
  </tuple>
  <tuple>
    <atom name="PhySupport">paint</atom>
  </tuple>
</table>
How will incorrect data affect partitioning of Support and Medium?
If this is too complicated, we can simply represent Medium(s) and Support(s) with made-of syntax. If we go this route, though, we will need to avoid duplicating made-of statements when values are repeated either within a single table or across both tables (e.g., the XML example directly above).
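A minimal sketch of the simplified made-of route, using the standard library XML parser. It collects PhyMedium and PhySupport values in document order and drops repeats, comparing case-insensitively (the case-insensitive match is an assumption). The wrapping `<record>` element is added only so the two sibling tables parse as one document.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<table name="Medium">
  <tuple><atom name="PhyMedium">plastic</atom></tuple>
  <tuple><atom name="PhyMedium">paint</atom></tuple>
</table>
<table name="Support">
  <tuple><atom name="PhySupport">steel</atom></tuple>
  <tuple><atom name="PhySupport">plastic</atom></tuple>
  <tuple><atom name="PhySupport">paint</atom></tuple>
</table>
"""


def made_of_values(xml_fragment):
    """Return unique medium/support values, keeping first-seen order."""
    root = ET.fromstring(f"<record>{xml_fragment}</record>")
    seen = set()
    values = []
    for atom in root.iter("atom"):
        if atom.get("name") not in ("PhyMedium", "PhySupport"):
            continue
        value = (atom.text or "").strip()
        key = value.lower()
        if value and key not in seen:
            seen.add(key)
            values.append(value)
    return values
```

For the sample above, the repeated "plastic" and "paint" entries in the Support table are skipped, leaving one made-of statement per material.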
Since we publish non-accessioned items (e.g., long-term loans to the PC) online, we will need an LA JSON-LD representation for them. The current transformation applies the "Accession Number" type to all values coming from TitObjectID - need to build in logic for the nuances when the item is NOT accessioned.
Related to this, whatever identifier type we go with here should probably also be applied to Previous Accession Number (the field name is a bit of a misnomer).
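A sketch of where that branch could sit in the transformation. It assumes the Legal Status value "Accessioned" is the test for accessioned items; the fallback label is a placeholder, since the identifier type for non-accessioned items has not been decided.

```python
ACCESSION_LABEL = "Accession Number"
FALLBACK_LABEL = "Identifier (type to be decided)"  # placeholder, not a chosen term


def build_identifier(value, legal_status):
    """Build an Identifier node for TitObjectID, typed according to legal status."""
    label = ACCESSION_LABEL if legal_status == "Accessioned" else FALLBACK_LABEL
    return {
        "type": "Identifier",
        "content": value,
        "classified_as": [{"type": "Type", "_label": label}],
    }
```

The same function could be reused for Previous Accession Number values once the fallback type is settled.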
Per the standard list of vocabulary terms to be used in LA, there are established categories of artworks and other Human-Made Objects that are recommended for use.
Current IMA mapping assigns a classification of "artwork" to all records, plus IMA thesaurus terms based on PhyMediaCategory_tab. This is incorrect. Only artworks should be considered artworks (e.g., design collection and similar artifacts should not receive this type).
Solution: use TitObjectType for major categories (e.g., "Visual Works: Paintings"). Where there is not a clear mapping to our Record Types, a broad AAT category won't be available, just the IMA thesaurus term.
Long term loans to the permanent collection ARE published online, but are NOT owned by IMA. Add in logic to the owner pattern to not make the statement when the Legal Status is not "Accessioned."
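The added logic could be as small as the sketch below. The `current_owner` key and the `Group` type are assumptions about the owner pattern, and the owner URI in the usage note is a placeholder.

```python
def add_owner(obj, legal_status, owner_uri):
    """Attach the owner statement only when the object is accessioned."""
    if legal_status == "Accessioned":
        obj["current_owner"] = [{"id": owner_uri, "type": "Group"}]
    # For any other legal status (e.g. long-term loans) no ownership is asserted.
    return obj
```

For example, `add_owner({"type": "HumanMadeObject"}, "Long-term loan", "urn:example:ima")` returns the object unchanged.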
Example for object classifications:
object/1 -> classified_as -> Type -> classified_as -> "object type" URI
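The same chain written out as a nested structure, as a sketch only. All three URIs are passed in as arguments because the actual "object type" URI and the term URIs are not given here.

```python
def classify_object(term_uri, term_label, object_type_uri):
    """Build a classified_as list whose Type is itself classified as the 'object type'."""
    return [{
        "id": term_uri,
        "type": "Type",
        "_label": term_label,
        "classified_as": [
            {"id": object_type_uri, "type": "Type", "_label": "object type"}
        ],
    }]


# Placeholder URIs for illustration only.
example = {
    "id": "urn:example:object/1",
    "type": "HumanMadeObject",
    "classified_as": classify_object(
        "urn:example:term/paintings", "Paintings", "urn:example:object-type"
    ),
}
```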
Currently all creators (whether individuals, organizations, collaborations, or cultures) are represented as Type: Actor in the "carried_out_by" statements of object production events.
Should this be modified? For example, should cultures be typed as "Group"?
For EMu party records, I can pull the Party record type to identify Person vs. Organization. Is Collaboration a type?
Now that I have answers from Registration re: allowable values in TitPreviousAccessionNo, need to rework the transformation to address all possible values.
Notes:
Research:
Per the LA model, objects associated with a specific department in a museum are represented as aggregations within a set. How would the "member_of" status be represented from each object's JSON-LD? Based on the context file, "member_of" isn't available in CIDOC-CRM?