We've talked about how to deal with inherited attributes in a few places, so I thought I'd start an issue here to bring it together.
There's a tension between our goals of (a) providing a specification for humans to read that is non-repetitive; and (b) providing a specification for interchange that's as simple to implement as possible. For (a) we would definitely like to have inherited attributes, because we don't want to write down the same parameters over and again. For (b) we would kinda prefer not to have inherited attributes, because it does make the specification a bit more complex, and we would have to be quite precise about what demes parsers are expected to do.
So, here's a proposal. Our object tree currently looks like this:
root
+ --- description
+ --- demes
+ --- id
+ --- description
+ --- epochs
+ --- initial_size
+ --- final_size
+ --- pulses
+ --- source
+ --- dest
+ --- time
+ --- migrations
+ --- source
+ --- dest
+ --- rate
I don't like the idea of having things like cloning_rate
defined as root attributes of the graph, because this implies that everything inherits this as a default (even pulses and migrations, say).
What if we added an explicit defaults
section to the various levels of the tree, and the defaults are then given as fully qualified references? So, something like
description: An example demography to play around with cloning attributes.
generation_time: 1
time_units: generations
defaults:
Epoch.cloning_rate: 0.05
Epoch.initial_size: 1000
demes:
- id: pop1
description: Population with epochs and changing cloning rates
ancestors: root
defaults:
Epoch.cloning_rate: 0.1 # Overrides the default from higher in the tree
Epoch.initial_size: 1e4
epochs:
- end_time: 500
- initial_size: 1e2
end_time: 100
- end_time: 0
cloning_rate: 0.5
- id: pop2
description: Population with epochs and changing cloning rates
ancestors: root
epochs:
- end_time: 500
- initial_size: 1e2
end_time: 0
cloning_rate: 1.0
So, the semantics can be defined precisely (if a value isn't given, go up the tree looking at the defaults, and if you don't find any, error out), and we're not cluttering up the namespace with things that aren't actually related to the object in question. The problem of outputting a compact/simplified (I'm on the fence about what the right terminology is here) is one of figuring out what can be put into the defaults to minimise repetition. This is a parsimony problem, which I'm sure we can solve.
The downside is that parsers have to be a bit more complicated, but I think this is something we can live with. So, to be clear, we get rid of the distinction between the low-level fully qualified graph that's used for interchange and settle on this single form which requires that implementations understand how default values work.
What do we think? I know I'm vacillating on what should and shouldn't go in the JSON schema definition (apologies!), but I feel like we're very close to something we can define precisely and start building on!