Background
The initial target users of MyData (https://github.com/wettenhj/mydata) have requested that their own users shouldn't have to interact with MyData at all if they don't want to. In other words, MyData will be used primarily by facility managers for adding new instrument PCs to MyTardis and for diagnosing failed uploads. General microscope users should therefore be able to simply save a folder (e.g. "Dataset 1") in their user folder (e.g. "jsmith") and leave it up to MyData to put the dataset in a sensible default experiment in MyTardis (which the user can later modify if they wish). The proposed method for defining a default experiment is to group datasets by:
(i) the instrument,
(ii) the user who collected the data (the researcher), and
(iii) the date on which the data was collected.
For example, if MyData found a "Dataset 1" folder with a creation date of "2014-10-11" within a "jsmith" folder on instrument "Test Microscope 1", it would query MyTardis to see whether a suitable default experiment record already exists for this dataset, i.e. an experiment record tagged with "Test Microscope 1", "jsmith" and "2014-10-11". If no such record existed, MyData would create it. It would initially create the record using a facility role account, e.g. MyTardis username="myfacility", and user "jsmith" would then be given full ownership access to the experiment record by creating an appropriate ObjectACL record.
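The find-or-create step described above could be sketched roughly as follows. This is only an illustration of the grouping logic, not MyData's actual code: DefaultExperimentKey, ExperimentRecord, InMemoryExperimentStore and get_or_create_default_experiment are all hypothetical names, the in-memory store stands in for MyTardis, and the "owners" list stands in for ObjectACL records.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, NamedTuple, Optional


class DefaultExperimentKey(NamedTuple):
    """The three proposed tags that identify a default experiment."""
    instrument: str
    researcher: str
    collected_on: date


@dataclass
class ExperimentRecord:
    """Hypothetical, simplified experiment record."""
    title: str
    created_by: str                                   # facility role account
    owners: List[str] = field(default_factory=list)   # stand-in for ObjectACLs


class InMemoryExperimentStore:
    """Hypothetical stand-in for MyTardis; a real client would call its API."""

    def __init__(self) -> None:
        self._records: Dict[DefaultExperimentKey, ExperimentRecord] = {}

    def find(self, key: DefaultExperimentKey) -> Optional[ExperimentRecord]:
        return self._records.get(key)

    def create(self, key: DefaultExperimentKey,
               facility_account: str) -> ExperimentRecord:
        title = "{} - {} - {}".format(
            key.instrument, key.researcher, key.collected_on.isoformat())
        record = ExperimentRecord(title=title, created_by=facility_account)
        self._records[key] = record
        return record


def get_or_create_default_experiment(
        store: InMemoryExperimentStore,
        key: DefaultExperimentKey,
        facility_account: str = "myfacility") -> ExperimentRecord:
    """Return the default experiment for this key, creating it if needed."""
    record = store.find(key)
    if record is None:
        # Created under the facility role account, then the researcher
        # is granted ownership (an ObjectACL record in the real system).
        record = store.create(key, facility_account)
        record.owners.append(key.researcher)
    return record
```

With this sketch, two datasets saved by "jsmith" on "Test Microscope 1" on 2014-10-11 would map to the same key and therefore land in the same default experiment, while a dataset collected the next day would get a new one.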
The question is how to implement these experiment "tags" (instrument, data-collector and date-of-collection) nicely in MyTardis.
Option 1. (already implemented in MyData's current MyTardis test instance)
Option 2.
- Make use of MyTardis's new Instrument model (accessible as an optional field in the dataset model), but try to avoid introducing any new schemas, parameters or changes to the experiment model.
- This doesn't look feasible, because for "default experiments", we really want the instrument to be a property of the experiment, not the dataset. We would also still need to find a way to record the date of data collection (NOT the same as the date of creation of a database record). There is already functionality in MyTardis's ObjectACLs which could be used to tag an experiment with the researcher who collected the data, but it may not be easy to filter experiments by ObjectACL in the TastyPie API when determining whether a default experiment already exists for a given instrument, data owner, and date of collection.
Option 3.
- Add new fields to MyTardis's Experiment model to allow "default experiments" of this form to be defined and queried easily.
- Having an instrument field in both the Experiment and Dataset models might go against database normalization principles, but it could certainly be useful here, and there would be no problem with just setting it to NULL for Experiments containing Datasets from multiple instruments.
- Adding a data-collection-date field to the Experiment model would be easy, but it would be good to bounce the idea off other MyTardis users to see whether it would cause confusion with the creation date of the database record, and whether some users would argue that the date of collection should go in the Dataset model instead of the Experiment model (which certainly wouldn't help with the objective here of defining "default experiments").
- Adding a field to the Experiment model for the user who collected the data would be easy, but there could be confusion with the ObjectACL records which indicate who currently has access to the data. For now, I would prefer having a new field in the Experiment model for this (and documenting the new fields together as a way of grouping datasets collected by the same user on the same instrument on the same date). But we could use ObjectACLs instead if we can work out an appropriate way to filter by ObjectACL when querying experiment records in the TastyPie API.
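To illustrate why Option 3 would make default experiments easy to query, here is a minimal sketch of how a client might build the lookup URL if the three fields existed and were exposed as API filters. The endpoint path and the filter names (instrument__name, data_collector__username, collection_date) are assumptions made for illustration; none of them is confirmed to exist in MyTardis.

```python
from datetime import date
from urllib.parse import urlencode


def default_experiment_query_url(base_url: str, instrument: str,
                                 researcher: str, collected_on: date) -> str:
    """Build a query URL for an existing default experiment.

    The path and filter names below are hypothetical; they assume the
    proposed Experiment fields are exposed as filters in the API.
    """
    params = {
        "format": "json",
        "instrument__name": instrument,            # hypothetical filter
        "data_collector__username": researcher,    # hypothetical filter
        "collection_date": collected_on.isoformat(),  # hypothetical filter
    }
    # urlencode escapes spaces and other reserved characters in the values.
    return "{}/api/v1/experiment/?{}".format(
        base_url.rstrip("/"), urlencode(params))
```

An empty result list from such a query would tell MyData that it needs to create the default experiment; a single match would be reused.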