How do I request raw DKIST data?

Summary

The DKIST Data Center Archive (DC) provides calibrated (a.k.a. Level 1, “L1”) data to the community for download via the User Portal. It does not generally provide access to raw (a.k.a. Level 0, “L0”) data. However, a user with a justified reason to obtain raw data files for certain instruments can submit a special request to the DC using a request form. Requests are reviewed on a case-by-case basis. It is assumed that the user has already downloaded the L1 data before requesting the raw data: you are required to specify the dataset IDs of the L1 data that correspond to the raw data you wish to download. Raw data is not searchable and cannot be filtered by instrument or similar parameters. If the request is approved, the user will receive ALL the raw data for at least one observing program as executed at the telescope.

Note that if you are a PI who has been contacted by the data center regarding issues calibrating data from your proposal, do not use this form. Please respond to the DC by replying to the email through which you were contacted.

Warning: Raw data requests are for experts only. The dataset sizes are massive and working with these data requires a large amount of storage and processing power.

Nearly all VBI data is speckle reconstructed, and it is the reconstructed data that is transferred to the data center. In such cases unreconstructed VBI data is not available. If you requested unreconstructed data in your proposal, that data will still have gain and dark corrections applied.

If you are going to need unreconstructed VBI data, you must request it in your proposal.

Submitting a raw data request does NOT guarantee that the request will be granted. Requests will be vetted on a case by case basis. Note that proprietary data can only be released to the PI and proposal team members.

Requests will be reviewed by a DKIST committee that will vet the request based on:

How does the request (if granted) benefit the solar community?
How does the request (if granted) benefit DKIST (e.g. donated improvement to calibration pipeline)?

Prerequisites to obtaining DKIST raw (L0) data

The user must have an active Globus account (see: How to create a globus account (for DKIST data))
The user must have an active endpoint for Globus data transfers (see: Globus Connect & Endpoints)
The user should already have downloaded and looked at the Level 1 datasets

Instructions

1. Visit the DKIST Data Portal Raw Data request form (link available soon)
2. Log into the Data Portal
3. Fill in the following mandatory fields on the request form; please be as descriptive as possible:
   a. Datasets Requested: Dataset ID(s), i.e. the value of the FITS header keyword “DSETID” in the L1 data
   b. Justification: Statement of the reason for this raw data request

4. If the request is approved by the DKIST management, then

   The user will receive a notification from the DC that the request was approved
   Data Center staff will contact the user for additional information on requested datasets
   Data Center staff will provide the raw data endpoint details for this Globus download as well as the allotted time the data will be made available to them
   The user must initiate the Globus data transfer within the allotted time - the requested data will then be transferred to the user endpoint
   If the user has requested notifications from Globus upon transfer completion, the user will receive a notification when the transfer completes (status: Success/Error)

5. If a raw data request is denied, the requester will be notified by email with the reason.
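
The “Datasets Requested” field asks for the DSETID values of L1 data you have already downloaded. As a minimal sketch, assuming the L1 files sit in a local directory and their names end in `_L1.fits`, the distinct dataset IDs could be collected with astropy as shown below. The function names and the directory path are illustrative and not part of any DKIST tool.

```python
from pathlib import Path


def read_dsetid(path):
    """Return the DSETID value from the first HDU that carries it, else None."""
    # astropy is imported lazily so the pure-Python helpers below
    # can be used without it (for example with a custom reader).
    from astropy.io import fits

    with fits.open(path) as hdul:
        for hdu in hdul:
            if "DSETID" in hdu.header:
                return str(hdu.header["DSETID"]).strip()
    return None


def unique_dataset_ids(values):
    """Drop missing entries and duplicates, keeping first-seen order."""
    seen = []
    for value in values:
        if value and value not in seen:
            seen.append(value)
    return seen


def dataset_ids_in(directory, reader=read_dsetid):
    """Collect the distinct DSETID values of all *_L1.fits files under a directory."""
    # Assumption: L1 file names end in "_L1.fits", as in the example file name
    # shown later in this article.
    paths = sorted(Path(directory).rglob("*_L1.fits"))
    return unique_dataset_ids(reader(p) for p in paths)


# Example (hypothetical path):
# print(dataset_ids_in("/data/dkist/level1"))
```

The resulting list can be pasted into the request form. Searching every HDU for the keyword is a precaution in case the primary header does not hold it.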

I got my raw data, but how are the files named and organized?

When the RAW data files are delivered to the data center, the only useful information in the raw data filename is the instrument name. The data center subsequently renames the files (e.g. 0056d294667a4bb5a552bfbda08a4dd3.fits) so that receiving the same filename twice from the summit does not cause problems when handling multiple copies of the same file. Because the data center does not generally release raw data, the overriding goal is simply to create unique filenames. When the raw data is ingested at the data center, meta-information about each file goes into the frame inventory, so the data center knows what the file is and what it contains. However, the data center does not have a mechanism to expose the frame inventory publicly. The only way for a user to extract this information is to read the file headers and parse them. The L1 files are more usefully named, e.g.

VISP_2022_06_03T19_33_53_514_00630205_I_AKNPB_L1.fits.

Again, this naming applies only to processed data, as does the creation of additional files such as the ASDF file. The figure below illustrates what you would see in the Globus web client once you have been given access to the raw data files for your proposal.

Each raw directory corresponds to one observing program (OP) execution ID, e.g.

eid_1_118_op4Asyox_R001.82591.14038621.

You should find N observe OPs that correspond to N * wavelength L1 science datasets. The raw data also contains OPs for the required calibration files, such as darks, PolCals, and gains. Within the observe frames, you will find FITS files that each correspond to a single modulator state at a single slit step, but again the only way to access this information is to open the files one by one and read the appropriate keywords.
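
Because the raw filenames carry no meaning, a local inventory built from the headers can make an OP directory easier to navigate. The following is a rough sketch, assuming astropy is installed and the OP directory has been transferred to local disk. The keyword list is a placeholder: `INSTRUME` and `DATE-OBS` are standard FITS keywords used here for illustration only, and the keywords that actually identify modulator state or slit step in DKIST raw headers would need to be substituted. All function names are hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def read_keywords(path, keywords):
    """Return {keyword: value} for one FITS file, with None for missing keywords."""
    # Lazy import so the helpers below also work with a custom reader.
    from astropy.io import fits

    found = {}
    with fits.open(path) as hdul:
        # Search every HDU, since the keywords may not be in the primary header.
        for hdu in hdul:
            for key in keywords:
                if key not in found and key in hdu.header:
                    found[key] = hdu.header[key]
    return {key: found.get(key) for key in keywords}


def build_inventory(op_directory, keywords, reader=read_keywords):
    """Read the chosen keywords from every .fits file under one OP directory."""
    rows = []
    for path in sorted(Path(op_directory).rglob("*.fits")):
        row = {"file": path.name}
        row.update(reader(path, keywords))
        rows.append(row)
    return rows


def group_by(rows, keyword):
    """Map each distinct value of a keyword to the list of files carrying it."""
    groups = defaultdict(list)
    for row in rows:
        groups[row.get(keyword)].append(row["file"])
    return dict(groups)


# Example (hypothetical directory; placeholder keywords):
# rows = build_inventory("raw/op_directory", ["INSTRUME", "DATE-OBS"])
# print(group_by(rows, "INSTRUME"))
```

Reading only the headers keeps this fast relative to loading the pixel data, but with massive raw datasets it may still be worth saving the inventory to disk rather than rebuilding it.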

Note that the user will also need to download the DKIST pipeline software in order to process the data themselves, augmenting the parameters and processing procedures as they see fit. (See: <Pipeline Install/Run Instructions>)
