Spectrum Discover is an industry-leading catalog platform for unstructured data that applications and tools can plug into for data discovery services. Integrated with Watson Knowledge Catalog from Cloud Pak for Data, it forms an even more powerful solution for end-to-end cataloging of all types of data (structured, semi-structured, and unstructured). We call this use case the Unified Data Catalog (UDC).
This repository includes the instructions, dataset, manifest, and recording needed to recreate a demo that uses Spectrum Discover to ingest, index, tag, and extract insight from a military aircraft dataset. The output is a well-curated, easily accessible dataset ready for HPC-based preprocessing and deep learning/AI training.
The dataset is a collection of 11 military aircraft images from Kaggle. Here are the annotation values for an example aircraft image from the dataset:
- width: 1280
- height: 850
- type: B1
- xmin: 322
- ymin: 112
- xmax: 893
- ymax: 618
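
To make the annotation fields above concrete, here is a minimal Python sketch that loads the example values and derives the bounding-box size. It assumes the coordinates are pixel offsets from the top-left corner of the image, which this README does not state. The `AircraftAnnotation` class and its method names are hypothetical, introduced only for illustration; they are not part of Spectrum Discover or the Kaggle dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AircraftAnnotation:
    """Hypothetical container for one annotation record (illustration only)."""
    width: int          # image width in pixels
    height: int         # image height in pixels
    aircraft_type: str  # the "type" field, e.g. "B1"
    xmin: int
    ymin: int
    xmax: int
    ymax: int

    def box_size(self):
        """Return (box_width, box_height) in pixels."""
        return self.xmax - self.xmin, self.ymax - self.ymin

    def is_valid(self):
        """Check that the box is non-empty and lies inside the image."""
        return (0 <= self.xmin < self.xmax <= self.width
                and 0 <= self.ymin < self.ymax <= self.height)

    def box_area_fraction(self):
        """Fraction of the image area covered by the bounding box."""
        box_w, box_h = self.box_size()
        return (box_w * box_h) / (self.width * self.height)


# Values copied from the example annotation listed above.
example = AircraftAnnotation(
    width=1280, height=850, aircraft_type="B1",
    xmin=322, ymin=112, xmax=893, ymax=618,
)

print(example.box_size())   # (571, 506)
print(example.is_valid())   # True
```

A check like `is_valid()` is one way to catch malformed records before the dataset moves on to preprocessing or training.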