6. Data analysis software
6.1. Software to access and inspect data
6.1.1. Karabo Data
karabo_data is a Python library for accessing and working with data produced
at European XFEL. It can:
- Conveniently access data from an experimental run, which is often spread across dozens of ‘sequence’ files.
- Read data into pandas and xarray, two popular Python libraries which support powerful, efficient data analysis.
- Assemble image data for multi-module detectors like LPD and AGIPD, using geometry files in different formats.
- Stream data from files over a ZeroMQ socket. This stream of data can then be accessed using Karabo Bridge clients, to test live-processing tools with data from real experiments.
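To illustrate why a run-level view is convenient, the sketch below groups the files of a run by data aggregator and orders them by sequence number. The file-name pattern (for example `RAW-R0034-AGIPD00-S00000.h5`) and the helper `group_sequence_files` are assumptions made for illustration; karabo_data takes care of this for you, and this is not a description of how the library is implemented.

```python
import re
from collections import defaultdict

# Assumed file-name pattern: <RAW|CORR>-R<run>-<aggregator>-S<sequence>.h5
FILE_RE = re.compile(r"^(?:RAW|CORR)-R(\d+)-(\w+)-S(\d+)\.h5$")


def group_sequence_files(filenames):
    """Map each aggregator name to its files, ordered by sequence number."""
    groups = defaultdict(list)
    for name in filenames:
        match = FILE_RE.match(name)
        if match is None:
            continue  # skip files that do not follow the assumed pattern
        _run, aggregator, sequence = match.groups()
        groups[aggregator].append((int(sequence), name))
    return {agg: [n for _, n in sorted(files)] for agg, files in groups.items()}


# Hypothetical directory listing of one run
example = [
    "RAW-R0034-AGIPD00-S00001.h5",
    "RAW-R0034-AGIPD00-S00000.h5",
    "RAW-R0034-DA01-S00000.h5",
    "notes.txt",
]
grouped = group_sequence_files(example)
for aggregator, files in grouped.items():
    print(aggregator, files)
```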
6.1.2. Karabo Bridge Clients
We provide client libraries in Python and C++ to receive data from the Karabo Bridge, allowing users to integrate their tools with the Karabo framework and receive live data during an experiment run.
6.1.3. HDF5 command line tools
We hope most users will be able to access data through existing tools such as Karabo Data, or by converting it to standard formats such as CXI. But if none of these options work for you, you may need to look directly at the HDF5 files.
Basic HDF5 command line tools are available by default:
h52gif h5copy h5fc-64 h5perf_serial h5unjam h5c++ h5debug h5import h5redeploy h5c++-64 h5diff h5jam h5repack h5cc h5dump h5ls h5repart h5cc-64 h5fc h5mkgrp h5stat
You can inspect HDF5 files in the terminal with the
available in the
exfel_anaconda3 module (see Access on online and Maxwell cluster).
The hdfview interactive viewer is available in the xray module
(see https://confluence.desy.de/display/IS/hdfview). To use it, first run
module load xray.
6.2. Software with scientific purposes
6.2.1. karaboFAI
karaboFAI is a tool that provides online (real-time, as fast as the calibration pipeline) and offline data analysis and visualization for experiments at European XFEL that require azimuthal integration of diffraction data acquired with 2D detectors. It works with the AGIPD, LPD, JungFrau and FastCCD detectors.
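Azimuthal integration reduces a 2D detector image to a 1D profile of mean intensity versus distance from the beam center. The following is a minimal, dependency-free sketch of that idea and not karaboFAI's implementation; the function name `azimuthal_profile`, the one-pixel-wide radial bins, and the beam-center arguments are illustrative assumptions.

```python
import math


def azimuthal_profile(image, center_row, center_col, n_bins):
    """Mean intensity per integer-radius bin around (center_row, center_col)."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            radius = math.hypot(r - center_row, c - center_col)
            bin_index = int(radius)  # one-pixel-wide radial bins
            if bin_index < n_bins:
                sums[bin_index] += value
                counts[bin_index] += 1
    # Bins that received no pixels are reported as None
    return [s / n if n else None for s, n in zip(sums, counts)]


# Synthetic 5x5 image whose value equals the integer distance from the center
image = [[int(math.hypot(r - 2, c - 2)) for c in range(5)] for r in range(5)]
profile = azimuthal_profile(image, center_row=2, center_col=2, n_bins=3)
print(profile)
```

Real detector data would normally be handled as NumPy arrays with masking of bad pixels and geometry-aware pixel positions; plain lists are used here only to keep the sketch self-contained.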
6.2.2. Karabo Data Interactive
Adds a user interface for viewing detector images inside a Jupyter Notebook. The plots are interactive and can be zoomed and panned. Users can also pick which trains and pulses to plot, and which type of image to plot from the data file (e.g. data, mask, or gain).
It is recommended to use this application from the pre-installed path on the online and offline clusters (see Access on online and Maxwell cluster).
For more information about karabo_data_interactive, check the source code.
This tool calibrates the AGIPD detector geometry. It can be seen as an alternative to the calibration mode of CrystFEL’s hdfsee. The calibration can start either from an existing geometry that needs to be refined or from a completely new geometry. In the latter case, the initial geometry places all modules 29 px apart from each other, with a 4 px gap between ASICs within a module.
Geometry calibration is supported by two graphical user interfaces: a Qt-based interface and a Jupyter-notebook-based interface.
Using the Qt GUI
It is recommended to use this GUI application through the pre-installed path on the online and offline clusters (see Access on online and Maxwell cluster). The command is:
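As a rough illustration of these starting conditions, the sketch below computes pixel offsets along one axis from the two gap values. Only the 29 px and 4 px gaps come from the text; the ASIC and module dimensions, the function names, and the simple one-column stacking of modules are assumptions for illustration (a real detector layout arranges the modules in quadrants).

```python
ASIC_LENGTH = 64      # assumed ASIC size along the module's long axis, in pixels
ASICS_PER_MODULE = 8  # assumed number of ASICs per module
MODULE_WIDTH = 128    # assumed module size along its short axis, in pixels
ASIC_GAP = 4          # gap between ASICs within a module (from the text)
MODULE_GAP = 29       # gap between neighbouring modules (from the text)


def asic_offsets():
    """Start position of each ASIC along the module's long axis."""
    return [i * (ASIC_LENGTH + ASIC_GAP) for i in range(ASICS_PER_MODULE)]


def module_offsets(n_modules):
    """Start position of each module when stacked along the short axis."""
    return [i * (MODULE_WIDTH + MODULE_GAP) for i in range(n_modules)]


print("ASIC offsets:", asic_offsets())
print("Module offsets:", module_offsets(4))
print("Module length including gaps:", asic_offsets()[-1] + ASIC_LENGTH)
```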
The following optional arguments can be set via the command line:
Show help about these options
Do not start gui, create a notebook
Set default directory to save notebooks
Set file name of the notebook
The path to a run folder
Path to a CrystFEL format geometry file
Detector distance [m]
Photon energy [eV]
Display range for plotting
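Detector distance and photon energy are the quantities typically needed to convert a radius on the detector into a scattering angle and a momentum transfer. The sketch below shows that conversion under stated assumptions: the 200 µm pixel size, the function names, and the example values are illustrative and are not taken from the tool.

```python
import math

HC_EV_ANGSTROM = 12398.42  # Planck constant times speed of light, in eV·Å


def wavelength_angstrom(photon_energy_ev):
    """Photon wavelength in Å for a photon energy in eV."""
    return HC_EV_ANGSTROM / photon_energy_ev


def momentum_transfer(radius_px, distance_m, photon_energy_ev, pixel_size_m=200e-6):
    """q = 4π sin(θ) / λ in 1/Å, where 2θ is the scattering angle."""
    two_theta = math.atan2(radius_px * pixel_size_m, distance_m)
    wavelength = wavelength_angstrom(photon_energy_ev)
    return 4 * math.pi * math.sin(two_theta / 2) / wavelength


# Hypothetical example values: 500 px radius, 0.2 m distance, 9300 eV photons
q = momentum_transfer(radius_px=500, distance_m=0.2, photon_energy_ev=9300)
print(f"q = {q:.3f} 1/Å")
```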
If no run directory has been preselected using the -r/--run option, a
directory can be set by clicking the Run-dir button. Train IDs can be
selected after a run has been selected. The user can either display
images pulse by pulse or, if the signal is too weak or noisy, apply a
maximum or mean across all images in the train.
To do so, select the Max or Mean button instead of the default Sel #.
After an image number or function has been selected, the image can be
assembled using the Assemble button.
Optionally, a pre-defined geometry file can be loaded using the Load button.
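The Max and Mean options amount to a pixel-wise reduction over all pulse images of a train. A minimal sketch of that reduction follows; it is not the tool's code, the name `reduce_train` is hypothetical, and plain nested lists stand in for the NumPy arrays that would normally be used.

```python
from statistics import mean


def reduce_train(images, mode="max"):
    """Combine all pulse images of one train into one image, pixel by pixel."""
    if mode not in ("max", "mean"):
        raise ValueError("mode must be 'max' or 'mean'")
    reducer = max if mode == "max" else mean
    n_rows = len(images[0])
    n_cols = len(images[0][0])
    return [
        [reducer(img[r][c] for img in images) for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Two hypothetical 2x2 pulse images from the same train
train = [
    [[1.0, 5.0], [2.0, 0.0]],
    [[3.0, 1.0], [2.0, 4.0]],
]
max_image = reduce_train(train, "max")
mean_image = reduce_train(train, "mean")
print(max_image)
print(mean_image)
```

Taking the maximum emphasizes features that appear in any single pulse, while the mean suppresses uncorrelated noise across pulses.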
After the image is displayed, quadrants can be selected by clicking on them. They can be moved using the Ctrl+arrow-up/down/left/right key combinations. Circles that help to align quadrants are added with the Draw Helper Objects button. The radii of the circles can be adjusted using the radius spin box in the top left.
Once the quadrants have been positioned a geometry file can be saved by using the Save button.
Calibration Using Jupyter
The --notebook flag creates a Jupyter notebook in the home directory. This
notebook is self-explanatory.
Dependencies
If the user does not want to or cannot use the xfel module and wants to install the tool, the following Python packages should be available:
6.2.4. XAS Visualization
Application for real-time visualization of XAS (X-ray absorption spectroscopy) experiments on the online cluster. The app is the end-point of the real-time data analysis pipeline. The dataflow is:
XasTimProcessor (Python bound device) -> PipeToZeroMQ (Python bound device) -> xasVisualization
The above two devices can be found in the project ZMQ_BRIDGE in the SCS TOPIC.
- Python 3.6
It is recommended to use this Dash application through the pre-installed path on the online and offline clusters (see Access on online and Maxwell cluster). If you wish to install it on your local machine, follow the steps below.
On a local PC
git clone https://git.xfel.eu/gitlab/dataAnalysis/xasVisualization.git
cd xasVisualization
pip install -r requirements.txt
python app.py --TEST
Open a browser and enter the address localhost:8050.
- In the Karabo GUI:
output/ZMQ publisher port of the
- Instantiate the
- Start the
- On the online cluster
A web server will be started there. For now the user can log onto the online cluster with their own account and start the server. In the future, we might want to start the server in a controlled account.
- Open a browser on a PC in the control room and enter the IP address of the
online cluster and the port:
6.3. Access on online and Maxwell cluster
The tools described above are readily available on both the online cluster and the Maxwell cluster. We recommend using the pre-installed applications available in the XFEL-specific Anaconda3 distribution:
module load exfel exfel_anaconda3
This will provide access to all these libraries and tools for data analysis:
There are many other applications and libraries available on the Maxwell cluster, maintained by DESY. There is a listing at: https://confluence.desy.de/display/IS/Alphabetical+List+of+Packages
Not all of this software is on the online cluster (maintained by EuXFEL).
6.4. Adding extra software
Both the offline cluster and the online cluster environments feature a set of data analysis tools. If an experiment requires access to additional analysis packages or applications, this requirement shall be discussed and agreed in advance. CAS (Control and Analysis Software) keeps in active contact with the authors of various scientific data analysis packages, such as Mosflm and XDS, and maintains the availability of these tools.
In addition, users can bring their own tools and install them in their user space. These tools are then available for immediate use in both the offline and online environments.
Users can set aside space within an experiment group for further development of these tools, even during the experiment.
In general, users are encouraged to share the progress of such data analysis development with European XFEL. Wider collaborations are also welcome and are happily hosted and coordinated by CAS. In this case, dedicated space will be created and managed for such collaborative projects.