1. Data analysis overview
With the first experiments at European XFEL starting in 2017, data analysis support covers:
- display of key experimental data during the experiment (see Data display during experiment)
- near real-time processing, e.g. calibration (see Integrating data analysis pipelines)
- download of data files (through the raw directory on the offline cluster initially and the metadata catalog later)
- offline and online tools (see Data analysis software)
1.1. Integrated detector calibration pipeline
A direct consequence of the high frame rates of the large megapixel detectors at European XFEL is that they generate raw data volumes of the order of 10 to 15 GB per second per detector, data rates unprecedented in photon science.
Additionally, the imaging detectors’ on-sensor memory-cell and multi-gain-stage architectures pose unique challenges in detector-specific data corrections and subsequent calibration of scientific data. As a result, the European XFEL implements a data processing and calibration concept which moves these data preparatory steps away from facility users and instead provides them with a fully corrected and calibrated dataset as the primary data product [KusterCal2014].
Such a concept, which has already been successfully deployed in other scientific communities such as astronomy, space science, and high-energy physics for more than a decade, is deemed highly beneficial to the user community.
Users need neither large amounts of computing resources nor in-depth expertise in detector physics to obtain state-of-the-art corrected and calibrated datasets for their experiments, and can thus focus on their scientific analysis. Additionally, comparing and aggregating data across different experiments and instruments is simplified, as calibration becomes user-independent.
A successful implementation of such a concept requires a clear separation of data streams, standardization and definition of data products and formats, as well as optimized infrastructure. Fig. 1.1 shows an overview of the data products, data flows, generalized infrastructure, and user roles foreseen for the European XFEL. Data acquisition starts at the detector (top left corner of the figure), and data flows via the detector front-end electronics (not shown in the figure) through the train builder, which provides standardized pixel-ordered output to the PC-layer. The PC-layer nodes reformat this data to the archival and stream formats which are used in subsequent data flows. Separate flows for raw data, calibrated data, and data taken for detector calibration and characterization are shown.
Available resources on the PC-layer may be used for experiment-specific online-monitoring tasks, which are in addition to standard monitoring tasks of the facility. A maximum continuous, but non-guaranteed, refresh rate of a few Hz is foreseen, as anything faster will not be comprehensible for operators or users anyway. Nevertheless, buffered burst modes are possible. Data integrity will always take precedence over rate, and archiving has highest priority.
1.1.1. Policies and prerequisites
The following describes a selection of the key policies and prerequisites for the aforementioned concept. I/O, processing, database access, and archival are performed within the European XFEL’s software framework, Karabo, wherever appropriate.
Raw data represents digitized detector signal, not altered by detector-specific corrections or calibrations; e.g., it is in the form of detector units such as analogue-to-digital units (ADU). Vetoing, by either hardware or software triggers, and zero-value suppression (e.g., by transferral to event lists) may have been performed and are irreversible. Raw data is the main archival data product at the European XFEL. It is not foreseen to be exported outside the facility [KusterCal2014].
Calibrated data is generated from raw data by applying detector-specific corrections and a transformation to physical units (calibration), such as photons per pixel. Calibrated data is the standard data product with which users will be provided. It is not archived; instead, if a calibrated dataset is requested but is no longer accessible through the online cache or user space, it will be reprocessed on the fly from the raw data repository using the appropriate calibration parameters provided by the calibration database.
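As a minimal sketch of what the raw-to-calibrated step involves, the Python below subtracts a dark offset and divides by a gain factor, choosing the constants by the gain stage in which each pixel was recorded. The names `GainStageConstants`, `calibrate_pixel`, and `calibrate_frame`, as well as all numeric values, are hypothetical illustrations and not part of Karabo or the facility's pipeline. Real corrections are detector-specific, use per-pixel and per-memory-cell constants, and include further steps.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class GainStageConstants:
    """Hypothetical calibration constants for one gain stage."""
    offset_adu: float      # dark offset, in ADU
    adu_per_photon: float  # gain: ADU produced by a single photon


def calibrate_pixel(raw_adu: float, constants: GainStageConstants) -> float:
    """Convert one raw value (ADU) to photons: offset correction, then gain."""
    if constants.adu_per_photon <= 0:
        raise ValueError("gain must be positive")
    return (raw_adu - constants.offset_adu) / constants.adu_per_photon


def calibrate_frame(
    raw_frame: List[List[float]],
    gain_stages: List[List[int]],
    constants_by_stage: Dict[int, GainStageConstants],
) -> List[List[float]]:
    """Calibrate a 2D frame, picking constants by each pixel's gain stage.

    Simplification: one set of constants per gain stage is shared by all
    pixels; a real detector needs constants per pixel and memory cell.
    """
    return [
        [
            calibrate_pixel(adu, constants_by_stage[stage])
            for adu, stage in zip(raw_row, stage_row)
        ]
        for raw_row, stage_row in zip(raw_frame, gain_stages)
    ]


# Illustrative values only: stage 0 = high gain, stage 1 = low gain.
constants = {
    0: GainStageConstants(offset_adu=100.0, adu_per_photon=50.0),
    1: GainStageConstants(offset_adu=20.0, adu_per_photon=2.0),
}
photons = calibrate_frame([[150.0, 24.0]], [[0, 1]], constants)
print(photons)  # [[1.0, 2.0]]
```

Because the constants are an explicit input, the same function can be rerun on archived raw data with whichever constants the calibration database supplies, which is what makes on-the-fly reprocessing possible.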
Alignment data is generated from dedicated alignment measurements, providing the position of each detector pixel and detector module in three-dimensional space. It is stored in the detector coordinate system (i.e., as pixel coordinates) and no additional interpolation or coordinate transformation will be applied. Alignment data is part of the standard data products with which users will be provided.
1.1.2. Services provided by the European XFEL facility
Calibration data production and storage into a calibration database are services provided by European XFEL’s detector group. Calibration data is prepared by detector experts using data from dedicated experimental campaigns in the European XFEL detector laboratory or at light and particle sources; in situ at the scientific instruments using, e.g., the XFEL beam; or as a part of a regular experimental procedure, e.g. dark image acquisition between runs. The detector group together with the detector manufacturers provides the necessary analysis tools and procedures to derive correction factors and calibration parameters. Importantly from a user perspective, the group also provides a database and references to the state-of-the-art calibration dataset for a given detector, as well as tools for its application as part of the online workflow.
This concept ensures that, in addition to a full history of calibration data, its validity for different time periods will be maintained and will be available to the user at any time. A calibration report is available for each run of the calibration or correction pipelines.
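The idea of a calibration history with validity periods can be sketched as a time-ordered lookup: each set of constants is valid from its start time until the next set begins. The class `CalibrationHistory` and the labels used below are hypothetical and do not reflect the actual calibration database schema or API; the sketch only illustrates how a run timestamp selects the constants that were valid at that time.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Any, List


class CalibrationHistory:
    """Hypothetical in-memory history of calibration constants.

    Each entry is valid from its start time until the next entry begins.
    """

    def __init__(self) -> None:
        self._starts: List[datetime] = []  # validity start times, kept sorted
        self._constants: List[Any] = []    # constants, parallel to _starts

    def add(self, valid_from: datetime, constants: Any) -> None:
        """Insert a set of constants, keeping the history ordered by time."""
        idx = bisect_right(self._starts, valid_from)
        self._starts.insert(idx, valid_from)
        self._constants.insert(idx, constants)

    def lookup(self, when: datetime) -> Any:
        """Return the constants that were valid at the given time."""
        idx = bisect_right(self._starts, when)
        if idx == 0:
            raise LookupError(f"no calibration constants valid at {when}")
        return self._constants[idx - 1]


# Illustrative entries only; the labels stand in for real constant sets.
history = CalibrationHistory()
history.add(datetime(2017, 10, 1), "dark_v2")
history.add(datetime(2017, 9, 1), "dark_v1")

print(history.lookup(datetime(2017, 9, 15)))  # dark_v1
print(history.lookup(datetime(2017, 10, 1)))  # dark_v2
```

Keeping older entries rather than overwriting them is what allows a dataset from an earlier period to be reprocessed with the constants that applied at the time.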
The detector group and the instrument groups will jointly measure and provide alignment data. The facility will provide the user with means of using this data to convert scientific data from detector coordinate space to other coordinate systems relevant for scientific analysis. These tools can be used either within a Karabo-based workflow, or in conjunction with other analysis tools. Alignment datasets will be shipped as part of calibrated data products.
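To illustrate the conversion from detector coordinate space to another coordinate system, the sketch below maps a pixel index on one detector module to a three-dimensional position, assuming the alignment data can be reduced to a module origin plus one displacement vector per row step and per column step. The class `ModuleGeometry`, the function `pixel_to_lab`, and all numbers are hypothetical; this is not the facility's geometry format or tooling.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class ModuleGeometry:
    """Hypothetical alignment description of one flat detector module."""
    origin: Vec3    # position of pixel (row=0, col=0), in mm
    col_step: Vec3  # displacement per column step, in mm
    row_step: Vec3  # displacement per row step, in mm


def pixel_to_lab(geom: ModuleGeometry, row: int, col: int) -> Vec3:
    """Map a pixel index to a 3D position: origin + col*col_step + row*row_step."""
    x, y, z = (
        o + col * c + row * r
        for o, c, r in zip(geom.origin, geom.col_step, geom.row_step)
    )
    return (x, y, z)


# Illustrative module: 0.25 mm pixel pitch, 200 mm downstream of the sample.
module = ModuleGeometry(
    origin=(10.0, -5.0, 200.0),
    col_step=(0.25, 0.0, 0.0),
    row_step=(0.0, 0.25, 0.0),
)
print(pixel_to_lab(module, row=2, col=4))  # (11.0, -4.5, 200.0)
```

The scientific data itself stays in pixel coordinates, consistent with the policy that no interpolation is applied; only the positions attached to each pixel are transformed.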
These concepts have been published in [KusterCal2014].
[KusterCal2014] Kuster, Markus, et al. “Detectors and calibration concept for the European XFEL.” Synchrotron Radiation News 27.4 (2014): 35-38. Available online: https://www.tandfonline.com/doi/abs/10.1080/08940886.2014.930809.