Module description - High Performance Computing
Number | hpc |
ECTS | 4.0 |
Specification | Understand the various acceleration and execution options in the HPC environment. |
Level | Advanced |
Content | The scaling of applications should be ensured as generically as possible at various levels, from the utilization of a single computer system up to the utilization of an entire high-performance computing center. To maximize the utilization of all available resources, computational accelerators such as GPUs are also used. Abstraction of the hardware, parallelization of the computation, and communication between systems allow execution on different infrastructures. Specialization of the computation for specific hardware, such as GPU programming, on the other hand allows maximum performance at the cost of reduced portability. |
Learning outcomes |
Containers: The basics of container systems are known. Applications can be executed and maintained in various container formats optimized for HPC. With interactive containers, the computing power of an HPC cluster is used effectively from within data science tools.
Shared-memory systems: An overview of shared-memory systems and frameworks (for example OpenMP) is known. The most common parallelization paradigms can be implemented with these frameworks on example problems, so that the available resources of a shared-memory system are utilized to the maximum.
Distributed-memory systems: Various message-passing communication patterns are known and can be applied and executed in example applications. Message passing interconnects several independent computing resources in order to solve a problem in parallel; the correct use of the different patterns ensures efficient message exchange between the systems.
Hybrid programming on GPU / CPU: The basic functionality of the different computational accelerators in the HPC environment is known. The basic use of suitable frameworks ensures efficient utilization of all resources of hybrid CPU/GPU systems. The most commonly used acceleration patterns are known and can be applied to examples.
GPU programming: The multilayer parallel architecture of modern GPUs is understood. Where possible, algorithms can be adapted to the parallel execution model. Different techniques to minimize or hide data-transfer overhead can be applied to simple examples.
Illustrative code sketches for the shared-memory, message-passing, and hybrid CPU/GPU topics follow below the table. |
Evaluation | Mark |
Built on the following competences | Foundation in Programming |
Module type | Portfolio Module |
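
The shared-memory outcome names OpenMP as an example framework. A minimal sketch of one of the parallelization paradigms covered there, a parallel loop with a reduction, could look as follows; the problem size and data values are arbitrary illustration choices, not part of the module description.

    // Minimal OpenMP sketch: a parallel loop with a reduction.
    // Compile e.g. with: g++ -fopenmp omp_sum.cpp -o omp_sum
    #include <omp.h>
    #include <cstddef>
    #include <cstdio>
    #include <vector>

    int main() {
        const std::size_t n = 1000000;        // illustrative problem size
        std::vector<double> data(n, 1.0);     // shared data, read by all threads

        double sum = 0.0;
        // Each thread sums a chunk of the array; the reduction clause
        // combines the per-thread partial sums without a data race.
        #pragma omp parallel for reduction(+:sum)
        for (std::size_t i = 0; i < n; ++i) {
            sum += data[i];
        }

        std::printf("max threads: %d, sum = %f\n", omp_get_max_threads(), sum);
        return 0;
    }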
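
The distributed-memory outcome speaks of message-passing patterns without naming a library; the sketch below assumes MPI, the de facto standard in HPC, and shows one basic collective pattern, a reduction onto rank 0.

    // Message-passing sketch, assuming MPI (the module text does not name
    // a specific library). Shows a collective reduction onto rank 0.
    // Compile e.g. with: mpic++ mpi_reduce.cpp -o mpi_reduce
    // Run e.g. with:     mpirun -np 4 ./mpi_reduce
    #include <mpi.h>
    #include <cstdio>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);

        int rank = 0, size = 1;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        // Each rank contributes one value; MPI_Reduce combines them on
        // rank 0 -- one of the basic communication patterns.
        double local = static_cast<double>(rank + 1);
        double total = 0.0;
        MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0) {
            std::printf("sum over %d ranks = %f\n", size, total);
        }

        MPI_Finalize();
        return 0;
    }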
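
For the hybrid CPU/GPU outcome the module text only mentions "suitable frameworks"; OpenMP target offloading is one possible choice and is used in the sketch below purely as an assumption. The map clauses make the CPU-GPU data transfer explicit, which is also the starting point for the transfer-minimizing and transfer-hiding techniques mentioned under GPU programming.

    // Hybrid CPU/GPU sketch, assuming OpenMP target offloading as the
    // framework (the module text does not prescribe one).
    // Compile with an offload-capable compiler, e.g.
    //   clang++ -fopenmp -fopenmp-targets=nvptx64 offload_add.cpp
    #include <omp.h>
    #include <cstdio>
    #include <vector>

    int main() {
        const int n = 1 << 20;                // illustrative vector length
        std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n, 0.0f);
        float* pa = a.data();
        float* pb = b.data();
        float* pc = c.data();

        // Offload the loop to an accelerator if one is present; the map
        // clauses state which data is copied to and from device memory.
        // Without a device the loop simply runs on the host.
        #pragma omp target teams distribute parallel for \
            map(to: pa[0:n], pb[0:n]) map(from: pc[0:n])
        for (int i = 0; i < n; ++i) {
            pc[i] = pa[i] + pb[i];
        }

        std::printf("c[0] = %f, devices visible: %d\n", pc[0], omp_get_num_devices());
        return 0;
    }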