04 - Automated Processing

Design Diagram



Key Concept: Automated Processing Scratch

The Automated Processing Scratch service (supported by DC Cloud File Storage) is a Gluster filesystem used to exchange data between the steps of a calibration processing DAG. The data of one DAG is kept separate from that of another by the Recipe Run ID that initiated its execution, which is unique to that DAG.

Logical Data Model 

The Recipe Run ID is used to segregate data between DAGs. 

Type: File System

Configured Path / Service Path: /scratch/(recipe run)/*
Description: Shared folder space for use by any processing node that is part of the same Recipe Run
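
As a minimal sketch of the layout above, the helpers below build the shared folder for a Recipe Run and check whether a path falls inside it. Only the `/scratch` root comes from the path listed above. The function names `scratch_dir` and `is_in_recipe_run`, and the use of an integer Recipe Run ID, are assumptions made for illustration and are not part of the service.

```python
from pathlib import PurePosixPath

SCRATCH_ROOT = PurePosixPath("/scratch")  # root taken from the path listed above


def scratch_dir(recipe_run_id: int) -> PurePosixPath:
    """Return the shared folder for one Recipe Run (hypothetical helper)."""
    return SCRATCH_ROOT / str(recipe_run_id)


def is_in_recipe_run(path: str, recipe_run_id: int) -> bool:
    """True if `path` lies inside the folder of the given Recipe Run."""
    # Purely lexical check; assumes an absolute path with no ".." components.
    return scratch_dir(recipe_run_id) in PurePosixPath(path).parents
```

Comparing whole path components, rather than string prefixes, keeps Recipe Run 42 from matching a folder such as `/scratch/421`.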

Key Concept: Automated Processing Scratch Inventory

Automated Processing Scratch Inventory is a tool for indexing the files in scratch for a particular Recipe Run. It is implemented as a key-value store in Redis that associates custom Tags with the sets of Paths they point to, so that sets of files can be retrieved by the union or intersection of one or more tags.
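
The tag lookup described above can be sketched with plain Python sets standing in for Redis. This is an illustration under assumptions, not the inventory's actual code: the class name `ScratchInventory`, its method names, and the tag names and paths are all hypothetical, and the source does not state which Redis data type or key naming is used.

```python
from collections import defaultdict


class ScratchInventory:
    """In-memory stand-in for the tag index of a single Recipe Run."""

    def __init__(self) -> None:
        # tag -> set of paths; in Redis each tag could be one set key (assumption)
        self._tags = defaultdict(set)

    def tag(self, path: str, *tags: str) -> None:
        """Associate a path with one or more tags."""
        for t in tags:
            self._tags[t].add(path)

    def any_of(self, *tags: str) -> set:
        """Paths carrying at least one of the tags (set union)."""
        return set().union(*(self._tags.get(t, set()) for t in tags))

    def all_of(self, *tags: str) -> set:
        """Paths carrying every one of the tags (set intersection)."""
        sets = [self._tags.get(t, set()) for t in tags]
        return set.intersection(*sets) if sets else set()


# Hypothetical tags and paths for Recipe Run 42
inventory = ScratchInventory()
inventory.tag("/scratch/42/dark_001.fits", "INPUT", "DARK")
inventory.tag("/scratch/42/gain_001.fits", "INPUT", "GAIN")
inventory.tag("/scratch/42/dark_mean.fits", "INTERMEDIATE", "DARK")

input_darks = inventory.all_of("INPUT", "DARK")
darks_or_gains = inventory.any_of("DARK", "GAIN")
```

If the tags were stored as Redis sets, the three operations would correspond to the Redis commands SADD, SINTER, and SUNION, which lets the server compute the result instead of the client.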



Key Concept: Airflow Architecture

The Airflow system allows workers to be scaled out while resource scheduling remains centralized.

Key Concept: DKIST Processing Libraries

Responsibility        | dkist-processing-core | dkist-processing-common | dkist-processing-<instrument>
Virtual Env Setup     | x                     |                         |
Task Dependencies     |                       |                         | x
DAG Definition        |                       |                         | x
DAG Format            | x                     |                         |
Management Tasks      |                       | x                       |
Unit Tests            | x                     | x                       | x
Instrument Cal. Tasks |                       |                         | x


Key Concept: Deployment 



Key Concept: Code/DAG Versions