04 - Automated Processing
Design Diagram
Key Concept: Automated Processing Scratch
The Automated Processing Scratch service (backed by DC Cloud File Storage) is a Gluster filesystem used to exchange data between the steps of a calibration processing DAG. Each DAG's data is kept separate from that of other DAGs by the Recipe Run ID that initiated its execution, which is unique to that DAG.
Logical Data Model
The Recipe Run ID is used to segregate data between DAGs.
Type: File System
Configured Path | Service Path | Description |
---|---|---|
/scratch | /(recipe run)/* | Shared folder space for use by any processing node that is part of the same Recipe Run |
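The layout in the table can be illustrated with a short sketch. It assumes that the service path is nested directly under the configured `/scratch` path and that the Recipe Run ID can be rendered as a single folder name; the helper `scratch_path` and the example file names are hypothetical and are not part of any DKIST library.

```python
from pathlib import PurePosixPath

# Configured path from the table above.
SCRATCH_BASE = PurePosixPath("/scratch")


def scratch_path(recipe_run_id, relative_path):
    """Return the location of a file inside one Recipe Run's scratch folder.

    Hypothetical helper: every path is rooted at /scratch/<recipe_run_id>,
    so two DAGs with different Recipe Run IDs never share a folder.
    """
    relative = PurePosixPath(relative_path)
    # Refuse anything that could leave the Recipe Run's own folder.
    if relative.is_absolute() or ".." in relative.parts:
        raise ValueError(f"path must stay inside the recipe run folder: {relative_path}")
    return SCRATCH_BASE / str(recipe_run_id) / relative


# The same relative name resolves to different folders for different runs.
path_a = scratch_path(42, "input/frame_001.fits")
path_b = scratch_path(43, "input/frame_001.fits")
```

Rejecting absolute paths and `..` components is a design choice made for this sketch, so that one task cannot accidentally read or write another Recipe Run's data.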
Key Concept: Automated Processing Scratch Inventory
Automated Processing Scratch Inventory is a tool that indexes the files in scratch for a particular Recipe Run. It is implemented as a key-value store in Redis that associates custom Tags with the sets of Paths they point to, so that sets of files can be retrieved by the union or intersection of one or more tags.
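The tag-to-paths model can be sketched without a running Redis server. The class below is an in-memory stand-in, not the actual inventory implementation: the class name, the method names, the `<recipe run id>:<tag>` key format, and the example tags are all assumptions made for illustration. The comments note the standard Redis set commands (`SADD`, `SINTER`, `SUNION`) that each operation would correspond to.

```python
from collections import defaultdict


class ScratchInventory:
    """In-memory stand-in for a tag inventory scoped to one Recipe Run (hypothetical)."""

    def __init__(self, recipe_run_id):
        self.recipe_run_id = recipe_run_id
        self._store = defaultdict(set)  # key -> set of paths

    def _key(self, tag):
        # Assumed key format; prefixing with the Recipe Run ID keeps runs apart.
        return f"{self.recipe_run_id}:{tag}"

    def tag(self, path, *tags):
        """Associate a path with each tag (Redis equivalent: SADD key path)."""
        for name in tags:
            self._store[self._key(name)].add(path)

    def find_all(self, *tags):
        """Paths carrying every given tag (Redis equivalent: SINTER)."""
        sets = [self._store.get(self._key(name), set()) for name in tags]
        return set.intersection(*sets) if sets else set()

    def find_any(self, *tags):
        """Paths carrying at least one given tag (Redis equivalent: SUNION)."""
        sets = [self._store.get(self._key(name), set()) for name in tags]
        return set().union(*sets)


inventory = ScratchInventory(recipe_run_id=42)
inventory.tag("/scratch/42/frame_001.fits", "INPUT", "DARK")
inventory.tag("/scratch/42/frame_002.fits", "INPUT", "GAIN")
inventory.tag("/scratch/42/dark_cal.fits", "OUTPUT", "DARK")

input_darks = inventory.find_all("INPUT", "DARK")    # intersection of two tags
dark_or_gain = inventory.find_any("DARK", "GAIN")    # union of two tags
```

Storing one set per tag makes both queries set operations on the server side in Redis, which is why a set-valued key-value store fits this access pattern.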
Key Concept: Airflow Architecture
The Airflow system allows workers to be scaled out while resource scheduling remains centralized.
Key Concept: DKIST Processing Libraries
Responsibility | dkist-processing-core | dkist-processing-common | dkist-processing-<instrument> |
---|---|---|---|
Virtual Env Setup | x | | |
Task Dependencies | x | | |
DAG Definition | x | | |
DAG Format | x | | |
Management Tasks | x | | |
Unit Tests | x | x | x |
Instrument Cal. Tasks | x |