5. Compute environment

5.1. User account

User accounts are created during the initial registration step in the UPEX portal. At this point the account can only be used for UPEX itself. If the user account is associated with an accepted and scheduled proposal, the account is upgraded 4 weeks before the first scheduled beamtime of the given user. For the first early user period, the time between the upgrade of the accounts and the start of the experiment may be shorter. The upgraded account allows the user to access additional services such as the online safety training, the metadata catalog, and the computing infrastructure of the European XFEL.

By default upgraded user accounts are kept in this state for 1 year after the user’s last beamtime. An extension can be requested by the PI.

On-site guest WLAN (WiFi) is provided for all users. Users whose home institute provides an eduroam account can connect straightforwardly. Users without an eduroam account must complete a short registration procedure to obtain guest access for a limited time period. After connecting to the XFEL-Guest network (also when using a network patch cable) and opening a web browser, the user can register for use of the guest network. The registration is valid for 10 days and for 5 devices.

5.1.1. Tools

At different stages of the proposal, users are granted access to different services:

  • Proposal submission: access to the user portal (UPEX).
  • Approval of proposal and scheduling: lightweight account, ca. 2 months before beam-time start.
  • Preparation phase: access to the metadata catalog and the beamtime store filesystem; the LDAP account is upgraded for members of all accepted proposals. For first-time users this happens once the A-form is submitted and accepted; the deadline for A-form submission is normally 4 weeks before beam-time start.
  • Beam time: access to catalogs and dedicated online and offline services.
  • Data analysis: access to catalogs and shared offline computing resources, initially limited to a period of 1 year after the beamtime.

Importantly, first-time users should aim for a timely A-form submission. This ensures a time window of several weeks before the start of their beam time during which access to the Maxwell computing resources and the associated storage system (GPFS) is granted. An additional benefit is that users can work with example data and get accustomed to the peculiarities of EuXFEL data and workflows.

5.2. Online cluster

During beam time, a dedicated online cluster (ONC) is available exclusively to the experiment team members and instrument support staff.

European XFEL aims to keep the software provided on the ONC identical to that available on the offline cluster (which is the Maxwell cluster).

5.2.1. Online cluster nodes in SASE 1

Beamtime in SASE 1 is shared between the FXE and the SPB/SFX instruments, with alternating shifts: when the FXE shift stops, the SPB/SFX shift starts, and vice versa.

Within SASE 1, one node is reserved for the SPB/SFX experiments (sa1-onc-spb) and one node is reserved for the FXE experiments (sa1-onc-fxe). These can be used by the respective groups at any time during the experiment period (i.e. during shifts and between shifts).

Both the SPB/SFX and the FXE users have shared access to another 7 nodes. The default expectation is that these nodes are used during the users' shift and that usage stops at the end of the shift (so that the other experiment can use the machines during its shift). These are sa1-onc-01, sa1-onc-02, sa1-onc-03, sa1-onc-04, sa1-onc-05, sa1-onc-06 and sa1-ong-01.

Overview of available nodes and usage policy:

name                      purpose
sa1-onc-spb               reserved for SPB/SFX
sa1-onc-fxe               reserved for FXE
sa1-onc-01 to sa1-onc-06  shared between FXE and SPB/SFX; use only during shifts
sa1-ong-01                shared between FXE and SPB/SFX; GPU: Tesla V100 (16 GB)

These nodes do not have access to the Internet.

The node name prefix sa1-onc- stands for SAse1-ONlineCluster.

5.2.2. Online cluster nodes in SASE 2

Beamtime in SASE 2 is shared between the MID and the HED instruments, with alternating shifts: when the MID shift stops, the HED shift starts, and vice versa.

Within SASE 2, one node is reserved for the MID experiments (sa2-onc-mid) and one node is reserved for the HED experiments (sa2-onc-hed). These can be used by the respective groups at any time during the experiment period (i.e. during shifts and between shifts).

Both the MID and the HED users have shared access to another 7 nodes. The default expectation is that these nodes are used during the users' shift and that usage stops at the end of the shift (so that the other experiment can use the machines during its shift). These are sa2-onc-01, sa2-onc-02, sa2-onc-03, sa2-onc-04, sa2-onc-05, sa2-onc-06 and sa2-ong-01.

Overview of available nodes and usage policy:

name                      purpose
sa2-onc-mid               reserved for MID
sa2-onc-hed               reserved for HED
sa2-onc-01 to sa2-onc-06  shared between MID and HED; use only during shifts
sa2-ong-01                shared between MID and HED; GPU: Tesla V100 (16 GB)

These nodes do not have access to the Internet.

The node name prefix sa2-onc- stands for SAse2-ONlineCluster.

5.2.3. Online cluster nodes in SASE 3

Beamtime in SASE 3 is shared between the SQS and the SCS instruments, with alternating shifts: when the SQS shift stops, the SCS shift starts, and vice versa.

Within SASE 3, one node is reserved for the SCS experiments (sa3-onc-scs) and one node is reserved for the SQS experiments (sa3-onc-sqs). These can be used by the respective groups at any time during the experiment period (i.e. during and between shifts).

Both SASE 3 instrument user groups have shared access to another 7 nodes. The default expectation is that these nodes are used during the users' shift and that usage stops at the end of the shift (so that the other experiment can use the machines during its shift). These are sa3-onc-01, sa3-onc-02, sa3-onc-03, sa3-onc-04, sa3-onc-05, sa3-onc-06 and sa3-ong-01.

Overview of available nodes and usage policy:

name                      purpose
sa3-onc-scs               reserved for SCS
sa3-onc-sqs               reserved for SQS
sa3-onc-01 to sa3-onc-06  shared between SCS and SQS; use only during shifts
sa3-ong-01                shared between SCS and SQS; GPU: Tesla V100 (16 GB)

These nodes do not have access to the Internet.

The node name prefix sa3-onc- stands for SAse3-ONlineCluster.

Note that the usage policy on shared nodes is not strictly enforced. Scientists across instruments should liaise to agree on any usage other than specified here.

5.2.4. Access to online cluster

The ONC can only be accessed from workstations (Linux Ubuntu 16.04) in the control hutch or from dedicated access workstations located in the XFEL headquarters building on levels 1 and 2 (marked with an X in the maps below).

Location of the ONC access workstations (maps): workstations at Level 1 (three views) and a workstation at Level 2.

From these access computers, one can ssh directly into the online cluster nodes and also to the Maxwell cluster (see Offline cluster). The X display is forwarded automatically in both cases.
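
For example, from one of these access workstations a member of an SPB/SFX beamtime could log in to the reserved online node, or onwards to Maxwell via the login host described in the Offline cluster section (which nodes you may actually use depends on your instrument and proposal):

ssh sa1-onc-spb
ssh max-display.desy.de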

Direct Internet access from the online cluster is not possible.

5.2.5. Storage

The following storage resources are available on the Online user cluster:

  • raw: data stored by the DAQ (data cache) - not accessible (access via a reader service is planned in the long run)
  • usr: beamtime store. Users can upload files, data or scripts to this folder to be used during the beamtime. The folder is mounted from the offline cluster and thus immediately synchronised with the corresponding folder there. There is not a lot of space here (5 TB).
  • proc: can contain data processed by dedicated pipelines (e.g. calibrated data). Not used at the moment (May 2019).
  • scratch: folder where users can write temporary data, e.g. the output of customised calibration pipelines. This folder is intended for large amounts of processed data; if the processed data is small in volume, it is recommended to use usr instead.

Access to data storage is possible via the same path as on the Maxwell cluster:

/gpfs/exfel/exp/<instrument>/<instrument_cycle>/p<proposal_id>/(raw|usr|proc|scratch)
Folder    Permission   Quota   Retention
raw       none         none    migrated to offline storage, then removed
usr       read/write   5 TB    immediately synced with the Maxwell cluster
proc      read         none    removed after migration
scratch   read/write   none    removed when space is needed

To simplify access to files, symbolic links are in place that create a file structure as is visible on the online cluster.
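
For illustration, using a hypothetical proposal 2416 at SPB in cycle 201901 (instrument, cycle and proposal number are placeholders), the beamtime store of that proposal would be reachable under:

/gpfs/exfel/exp/SPB/201901/p002416/usr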

5.2.6. Access to data on the online cluster

Currently, no access to data files is possible from the online cluster: the raw directory is not readable and the proc directory is not populated with files.

Online analysis tools running on the online cluster thus have to be fed the currently recorded data through the Karabo Bridge.
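
As a minimal sketch of how an online tool might receive streamed data, assuming the karabo_bridge Python package is available in the loaded environment and that the bridge endpoint (here tcp://sa1-onc-spb:4545, a placeholder) is provided by the instrument group:

    # Minimal sketch: receive one train of streamed data via the Karabo Bridge.
    from karabo_bridge import Client

    client = Client("tcp://sa1-onc-spb:4545")  # placeholder endpoint; ask the instrument group
    data, metadata = client.next()             # blocks until the next train arrives
    print(sorted(data.keys()))                 # source names contained in this train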

File-based post-processing thus has to take place on the offline (= Maxwell) cluster after the files have been transferred at the end of a run. There is a delay of several minutes for this (depending on run length and on how busy the data transfer system is).

5.2.7. Home directory warning

The home directory /home/<username> of each user on the online cluster is not shared with the home directory /home/<username> on the offline (= Maxwell) cluster. Within the online cluster, the home directory is shared across all nodes of the online cluster; likewise, within the offline cluster, the home directory is shared across all nodes of the offline cluster.

To share files between the online and the offline cluster, the /gpfs/exfel/exp/<instrument>/<instrument_cycle>/p<proposal_id>/usr directory should be used: the files stored here show up in both the online and offline cluster, and are accessible to the whole group of users of this proposal.
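
For example, a script prepared on Maxwell can be made available on the online cluster by copying it into the beamtime store of the proposal (the path and file name are placeholders):

cp analysis_script.py /gpfs/exfel/exp/SPB/201901/p002416/usr/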

5.3. Offline cluster

The Maxwell cluster at DESY is available for data processing and analysis during and after the experiment. Users are welcome and encouraged to make themselves familiar with the Maxwell cluster and its environment well in advance of the beam time.

In the context of European XFEL experiments, the Maxwell cluster is also referred to as the “offline” cluster. Despite this name, you can connect to the internet from Maxwell. It is offline in that it can’t stream data directly from the experiments, unlike the “online cluster”.

5.3.1. Getting access

When a proposal is accepted, the main proposer will be asked to fill out the “A-form” which, besides information on the final selection of samples to be brought to the experiment, also contains a list of all the experiment’s participants. At the time of submission of the A-form, all participants have to have an active account in UPEX. This is the prerequisite for getting access to the facility’s computing and data resources. After submission of the A-form, additional participants can be granted access to the experiment’s data on request of the PI.

Users have access to:

  • HPC cluster
  • beamtime store, data repository and scratch space
  • web based tools

5.3.2. Graphical login

To use Maxwell with a remote desktop, you can either:

5.3.3. Jupyter

Jupyter notebooks can be used through https://max-jhub.desy.de

5.3.4. SSH access

ssh username@max-display.desy.de

Replace username with your EuXFEL username. Unlike most of the cluster, max-display is directly accessible from outside the DESY/EuXFEL network.
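
If you want to display graphical applications locally, standard X11 forwarding can be requested with the -X option (assuming an X server is running on your own machine):

ssh -X username@max-display.desy.de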

5.3.5. Running jobs

When you log in, you are on a ‘login node’, shared with lots of other people. You can try things out and run small computations here, but it’s bad practice to run anything for a long time or use many CPUs on a login node.

To run a bigger job, you should submit it to SLURM, our queueing system. If you can define your job in a script, you can submit it like this:

sbatch -p upex -t 8:00:00 myscript.sh
  • -p specifies the ‘partition’ to use. External users should use upex, while EuXFEL staff use exfel.

  • -t specifies a time limit: 8:00:00 means 8 hours. If your job doesn’t finish in this time, it will be killed. The default is 1 hour, and the maximum is 2 weeks.

  • Your script should start with a ‘shebang’, a line like #!/usr/bin/bash pointing to the interpreter it should run in, e.g.:

    #!/usr/bin/bash
    
    echo "Job started at $(date) on $(hostname)"
    
    # To use the 'module' command, source this script first:
    source /usr/share/Modules/init/bash
    module load exfel exfel_anaconda3
    
    python -c "print(9 * 6)"
    

To see your running and pending jobs, run:

squeue -u $USER

Once a job starts, a file like slurm-4192693.out will be created - the number is the job ID. This contains the text output of the script, which you would see if you ran it in a terminal. The programs you run will probably also write data files.

SLURM is a powerful tool, and this is a deliberately brief introduction. If you are submitting a lot of jobs, it’s worth spending some time exploring what it can do.

5.3.5.1. During beamtime

During your beamtime, a few nodes are reserved so that your group can run some jobs promptly even if there’s a backlog. To use your reservation, add an extra option when submitting your jobs:

sbatch --reservation=upex_002416 ...

Replace the number with your proposal number, padded to 6 digits.
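
Combining the options shown so far, a full submission using the reservation could look like this (myscript.sh and the proposal number are the placeholders from the examples above):

sbatch -p upex --reservation=upex_002416 -t 4:00:00 myscript.sh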

You can check the details of your reservation like this:

scontrol show res upex_002416

The output of this command tells you the period when the reservation is valid, the reserved nodes, and which usernames are allowed to submit jobs for it:

[@max-exfl001]~/reservation% scontrol show res upex_002416
ReservationName=upex_002416 StartTime=2019-03-07T23:05:00 EndTime=2019-03-11T14:00:00 Duration=3-14:55:00
Nodes=max-exfl[034-035,057,166] NodeCnt=4 CoreCnt=156 Features=(null) PartitionName=upex Flags=IGNORE_JOBS
TRES=cpu=312
Users=bob,fred,sally Accounts=(null) Licenses=(null) State=ACTIVE BurstBuffer=(null) Watts=n/a

5.3.6. Software available

The EuXFEL data analysis group provides a number of relevant tools, described in Data analysis software. In particular, a Python environment with relevant modules can be loaded by running:

module load exfel exfel_anaconda3
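
To check that the environment is active, you can verify which Python interpreter is now first on your path (assuming the module prepends its own Python, which is the usual behaviour):

which python
python --version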

5.3.7. Storage

Users will be given a single experiment folder per beam time (not per user) through which all data will be accessible, e.g.:

/gpfs/exfel/exp/<instrument>/<instrument_cycle>/p<proposal_id>/(raw|usr|proc|scratch)
Storage   Quota   Permission   Lifetime    Comments
raw       none    read         2 months    fast-accessible raw data
usr       5 TB    read/write   24 months   user data, results
proc      none    read         6 months    processed data, e.g. calibrated
scratch   none    read/write   6 months    temporary data (lifetime not guaranteed)

5.3.8. Synchronisation

The data in the raw directories are moved from the online cluster (at the experiment) to the offline (Maxwell) cluster as follows:

  • when the run stops (the user presses the button), the data is flagged as ready to be copied to the Maxwell cluster and is queued for a copy service (provided by DESY). The data is copied without the user noticing.

  • Once the data is copied, the data is ‘switched’ and becomes available on the offline cluster.

    The precise time at which this switch happens after the user presses the button cannot be predicted: if the data is copied already (in the background), it could be instantaneous, otherwise the copy process needs to finish first.

  • The actual copying process (before the switch) can take anything from minutes to hours, depending on (i) the size of the data and (ii) how busy the (DESY) copying queue is.
  • The usr folder is mounted from the Maxwell cluster, and is thus always identical between the online and offline systems. However, it is not optimised for dealing with large files and can thus be slow for larger files. There is a quota of 5 TB.

5.4. Running containers

Singularity is available on both the online and offline cluster. It can be used to run containers built with Singularity or Docker.

Running containers with Docker is experimental, and there are some complications with filesystem permissions. We recommend using Singularity to run your containers, but if you need Docker, it is available.

  • On the online cluster, Docker needs to be enabled for your account. Please email it-support@xfel.eu to request it.
  • On the offline cluster, Docker only works on nodes allocated for SLURM jobs (see Running jobs), not on login nodes.
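
As a minimal sketch of running a container, assuming Singularity is available in your session (the image names below are only examples):

singularity exec docker://python:3.8-slim python3 --version   # pulls from Docker Hub; needs Internet access, i.e. the offline cluster
singularity run my_container.sif                              # placeholder for a locally available Singularity image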

5.5. Compute environment FAQ

Frequently asked questions

tbd