This happens when the 'raw' subset of the dataset has a label that the 'dev' subset does not.
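A quick way to check for this condition is to compare the label sets of the two subsets. The sketch below uses plain Python lists as stand-ins for the label columns. The function name `labels_missing_from` and the sample labels are made up for illustration and are not part of SupervisableDataset.

```python
def labels_missing_from(reference_labels, candidate_labels):
    """Return labels that occur in candidate_labels but not in reference_labels."""
    return sorted(set(candidate_labels) - set(reference_labels))


# Hypothetical label columns for the 'dev' and 'raw' subsets.
dev_labels = ["positive", "negative", "negative"]
raw_labels = ["positive", "neutral", "negative", "neutral"]

# Labels present in 'raw' but absent from 'dev'.
missing = labels_missing_from(dev_labels, raw_labels)
```

A non-empty `missing` list would indicate the mismatch described above.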
SupervisableDataset assumes that labels in the 'raw' subset are irrelevant but those in 'train' are relevant, which is all well and good.
However, it should be made clear how and when annotated data points in 'raw' get committed to 'train'. It should also be clear that there are times when one wants to deduplicate 'raw' against 'train' (i.e. to 'lock' those annotated points) and times when one doesn't (i.e. to keep those points open to modification).
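To make the two actions concrete, here is a minimal sketch using pandas dataframes as stand-ins for the 'raw' and 'train' subsets. Everything in it is an assumption for illustration: the function names `commit` and `dedup`, the `text` and `label` column names, and the `ABSTAIN` placeholder for unannotated points. It is not the existing SupervisableDataset API.

```python
import pandas as pd

ABSTAIN = "ABSTAIN"  # assumed placeholder meaning "not annotated yet"


def commit(raw, train, key="text", label="label"):
    """Copy annotated rows from raw into train; a re-committed key replaces the old row."""
    annotated = raw[raw[label] != ABSTAIN]
    merged = pd.concat([train, annotated], ignore_index=True)
    # keep="last" lets the newest annotation win when a key appears twice
    return merged.drop_duplicates(subset=key, keep="last").reset_index(drop=True)


def dedup(raw, train, key="text"):
    """Drop rows from raw whose key already exists in train, i.e. 'lock' them."""
    return raw[~raw[key].isin(train[key])].reset_index(drop=True)


raw = pd.DataFrame({"text": ["a", "b", "c"], "label": ["pos", ABSTAIN, "neg"]})
train = pd.DataFrame({"text": ["a"], "label": ["neg"]})

train = commit(raw, train)  # 'a' is re-labelled, 'c' is added
raw = dedup(raw, train)     # only the unannotated 'b' stays open
```

Calling `commit` without `dedup` keeps the annotated points in 'raw' open to modification, while calling `dedup` afterwards locks them.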
These 'commit' and 'deduplicate' actions should be accessible as app-level (cross-explorer) widgets.
We also need a button to push dataframe updates to explorer sources. This isn't automatic due to performance considerations. Perhaps add a scheduled pull/push from dataframes to sources?
update_population and retrain_model should read the 'train' set rather than the 'raw' set.