01 - Summit Data Management

Design Diagram

The Summit Data Management Composite Application supports the 01 - Summit Data Receipt and Ingest SA Process. The general flow of information is left to right across the diagram.

Files are transferred via Globus to the "inbox" bucket in the 01 - Object Store using the 08 - Transfer Manager and 03 - Identity Manager. At the conclusion of a transfer task (which can contain one or more files), a transfer success message is produced to the 06 - Interservice Bus for each successfully transferred file. At this point, the message-driven work is atomized down to a single file to leverage the competing consumer design pattern: successive operations can scale by spinning up multiple instances of the consuming/producing services. Each transfer success message is consumed by the Data Categorizer, which routes the work to the appropriate ingest service via another targeted message. From there, the messages and services involved branch off in accordance with the routing determined by the Data Categorizer before merging again at the end for cleanup.
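As a rough illustration of this message-driven fan-out, the sketch below uses the Python standard library in place of the real 06 - Interservice Bus. Several identical consumers compete for per-file transfer success messages on a shared queue, and a categorize() function stands in for the Data Categorizer's routing decision; the queue, message fields, and routing rules are all hypothetical.

```python
# Minimal competing-consumer sketch. The standard-library queue stands in
# for the 06 - Interservice Bus; message fields and routing rules are
# hypothetical placeholders, not the real categorization logic.
import queue
import threading

transfer_success_queue = queue.Queue()

def categorize(message):
    """Hypothetical stand-in for the Data Categorizer's routing decision."""
    key = message["object_key"]
    if key.endswith(".json") and "manifest" in key:
        return "transfer-manifest-ingest"
    if key.endswith(".fits"):
        return "science-data-ingest"
    return "categorization-failed"

def consumer(worker_id):
    # Each instance competes for messages from the shared queue; scaling a
    # step up is just a matter of starting more instances like this one.
    while True:
        message = transfer_success_queue.get()
        if message is None:  # sentinel used only to stop this demo
            break
        route = categorize(message)
        print(f"worker {worker_id}: {message['object_key']} -> {route}")
        transfer_success_queue.task_done()

# One message per successfully transferred file, as described above.
for key in ["obs_001.fits", "obs_002.fits", "transfer_manifest.json"]:
    transfer_success_queue.put({"bucket": "inbox", "object_key": key})

workers = [threading.Thread(target=consumer, args=(i,)) for i in range(3)]
for w in workers:
    w.start()
for _ in workers:
    transfer_success_queue.put(None)
for w in workers:
    w.join()
```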

Science Data Branch:

  • Files routed for science data ingestion are handled by the Science Data Ingester, which ingests data into the "raw" bucket in the 01 - Object Store and catalogs the ingest in the 07 - Metadata Store (02 - Object Inventory), as sketched after this list.
  • Successful science data ingests trigger cleanup of the "inbox" bucket by the Ingested Data Remover and also trigger the Actual Count Writer to record an instance of an Observing Program Execution ID as having been received in the 07 - Metadata Store (01 - Processing Support).
  • Unsuccessful ingests trigger the failure handling services: 
    • Summit Data Failure Mover for quarantining items from the "inbox" bucket to the "ingest-fail" bucket in the 01 - Object Store
    • Summit Ingest Failure Notifier for notifying Ops personnel of an issue
    • Summit Ingest Worker for resolution of the failure by Ops Support personnel
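
The following sketch walks through the science data branch under the simplifying assumption that in-memory dictionaries stand in for the 01 - Object Store and 07 - Metadata Store. The bucket names follow the text ("inbox", "raw", "ingest-fail"); the function names and record fields are hypothetical.

```python
# Illustrative science data branch: in-memory dicts stand in for the
# 01 - Object Store and 07 - Metadata Store; names and fields are hypothetical.
object_store = {
    "inbox": {"obs_001.fits": b"...science data..."},
    "raw": {},
    "ingest-fail": {},
}
object_inventory = []   # 07 - Metadata Store (02 - Object Inventory)
actual_counts = {}      # 07 - Metadata Store (01 - Processing Support)

def ingest_science_file(key, execution_id):
    try:
        # Science Data Ingester: copy into "raw" and catalog the ingest.
        object_store["raw"][key] = object_store["inbox"][key]
        object_inventory.append({"key": key, "bucket": "raw"})

        # Actual Count Writer: record an instance of the Observing Program
        # Execution ID as having been received.
        actual_counts[execution_id] = actual_counts.get(execution_id, 0) + 1

        # Ingested Data Remover: clean up the "inbox" bucket.
        del object_store["inbox"][key]
        return True
    except Exception as error:
        # Summit Data Failure Mover: quarantine the item in "ingest-fail";
        # the Summit Ingest Failure Notifier would alert Ops here.
        failed = object_store["inbox"].pop(key, None)
        if failed is not None:
            object_store["ingest-fail"][key] = failed
        print(f"ingest failed for {key}: {error}")
        return False

ingest_science_file("obs_001.fits", execution_id="OP-EXEC-0001")
```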

Transfer Manifest Branch:

  • Files routed for transfer manifest ingest are handled by the Transfer Manifest Ingester, which ingests planned counts, keyed by Observing Program Execution ID, into the 07 - Metadata Store (01 - Processing Support), as sketched after this list.
  • Unsuccessful ingests are handled the same as for science data.
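
A similar sketch of the transfer manifest branch, assuming the manifest is a simple JSON document mapping Observing Program Execution IDs to planned file counts; the real manifest format and metadata schema may differ.

```python
# Illustrative transfer manifest ingest; the manifest format is an assumption.
import json

manifest_body = '{"OP-EXEC-0001": 12, "OP-EXEC-0002": 8}'

planned_counts = {}     # 07 - Metadata Store (01 - Processing Support)

def ingest_transfer_manifest(body):
    # Transfer Manifest Ingester: record the planned count for each
    # Observing Program Execution ID named in the manifest.
    for execution_id, count in json.loads(body).items():
        planned_counts[execution_id] = count

ingest_transfer_manifest(manifest_body)

# Planned counts can later be compared against the Actual Count Writer's
# records to judge completeness of a transfer.
print(planned_counts)
```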

Categorization Failure Branch:

  • Unsuccessful categorizations are handled similarly to failed ingests, with the exception of the quarantine area being the "category-fail" bucket in the 01 - Object Store.
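
Since the failure branches differ only in the quarantine bucket, the shared handling can be sketched once with the bucket as a parameter; the notify_ops() helper is a hypothetical stand-in for the Summit Ingest Failure Notifier.

```python
# Shared failure handling, parameterized by quarantine bucket ("ingest-fail"
# or "category-fail"); notify_ops() is a hypothetical placeholder.
def notify_ops(message):
    print(f"OPS ALERT: {message}")

def quarantine(object_store, key, fail_bucket, reason):
    failed = object_store["inbox"].pop(key, None)
    if failed is not None:
        object_store[fail_bucket][key] = failed
    notify_ops(f"{key} quarantined to {fail_bucket}: {reason}")

store = {"inbox": {"unknown_file.bin": b"..."}, "ingest-fail": {}, "category-fail": {}}
quarantine(store, "unknown_file.bin", "category-fail", "no matching category")
```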

All of the services produce logs and telemetry, which are ingested into 02 - System Monitoring. Some produce additional Data Center-specific events that can be used for the development of Key Performance Indicators (KPIs), such as ingest success/failure events.
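
As an illustration only, an ingest success/failure event might be emitted as a structured log record that 02 - System Monitoring can aggregate into KPIs; the event and field names below are hypothetical.

```python
# Hypothetical Data Center-specific event emitted as a structured log record.
import json
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger("summit-data-management")

def emit_ingest_event(key, outcome, execution_id):
    logger.info(json.dumps({
        "event": "science-data-ingest",
        "outcome": outcome,          # "success" or "failure"
        "object_key": key,
        "execution_id": execution_id,
        "timestamp": time.time(),
    }))

emit_ingest_event("obs_001.fits", "success", "OP-EXEC-0001")
```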