Cyfronet - Intel oneAPI workshop

Europe/Warsaw
room 303 (Nawojki 11, Krakow)

Description

Each day of this in-person workshop is dedicated to a different topic: 

Day 1 (Oct 24th): Programming with oneAPI
Day 2 (Oct 25th): Performance libraries and profiling
Day 3 (Oct 26th): OpenMP Offloading & MPI

The event is free of charge and open to anyone employed by or affiliated with an institution (university, research institute, enterprise, public administration unit) in Poland or any other EuroCC2 country.

Register for any (or all) of the days using the individual registration forms below. If registrations exceed the number of planned participants, we will maintain a waiting list.

Please note that this is an in-person only event in Krakow, Poland.

Speakers:

Stephen Blair-Chappell is an independent software consultant and an Intel-certified oneAPI instructor. He was formerly the Technical Director at Bayncore, where he led a team of consultants providing HPC and AI training on Intel Architecture. For 18 years he was a Technical Consulting Engineer at Intel, helping its strategic customers with software optimization and code modernization. He is the author of the book "Parallel Programming with Intel Parallel Studio XE".

Biagio Cosenza is a tenure-track Assistant Professor in the Department of Computer Science at the University of Salerno, Italy, and a member of the Khronos SYCL Working Group. He joined the University of Salerno in August 2019 through a national brain gain program (Attraction and International Mobility), and received the Abilitazione for Italian Associate Professorship. From 2015 to 2019, he was a Senior Researcher at TU Berlin, Germany, where he was Principal Investigator for the DFG project Celerity and received the Habilitation from Faculty IV. From 2011 to 2015, he was a postdoctoral researcher at the University of Innsbruck, Austria, where he contributed to the Insieme Compiler project and the DK-Plus multidisciplinary platform for Scientific Computing. His research is currently funded by the European HPC Joint Undertaking (LIGATE project), the Italian Ministry of Research (LibreRT project, PRIN 2022), and several industrial projects. Cosenza’s main research interests are in the field of high performance computing, in particular programming models, compiler technology, optimization and tuning.

    • 1
      Welcome
    • 2
      Introduction to oneAPI and the new Intel Developer Cloud (IDC) infrastructure

      Hardware Evolution: From CPUs to heterogeneous HW (GPUs, FPGAs) programming
      Concept and purpose of the oneAPI standardization initiative
      Intel’s oneAPI Solutions – Toolkits with Compilers, libs, analysis, and migration tools
      Transition from Intel Parallel Studio XE to Intel oneAPI toolkits
      IDC: service platform for developing with the latest Intel HW & SW

    • 10:30
      Coffee break
    • 3
      Direct programming with oneAPI DPC++/SYCL

      Intro to the heterogeneous programming model with SYCL 2020
      SYCL features and examples:
      • “Hello World” Example: buffer, accessor, queue, basic and nd-range kernels
      • Device Selection to offload kernel workloads
      • Execution Model
      • Compilation and Execution Flow
      • Memory Model: Buffers, Unified Shared Memory (USM)
      • Performance optimizations with SYCL features

    • 12:30
      Lunch break (lunch not provided)
    • 4
      Hands-on labs on CPU/GPU programming with SYCL

      Start Working with IDC, explore SYCL
      Understand the SYCL language and programming model
      Use device selection to offload kernel workloads
      Decide when to use basic parallel kernels and ND Range Kernels
      Create a host accessor
      Build your first SYCL application

    • 15:00
      Coffee break
    • 5
      Hands-on labs on CPU/GPU programming with SYCL (cont.)

      Continue your learning journey and use CPU/GPU

    • 6
      Welcome
    • 7
      Intel oneAPI Libraries: oneDPL, oneTBB, oneMKL

      Intel oneAPI libraries (oneMKL) for HPC - with demos
      Performance optimized libraries for numerical simulations and other purposes

    • 10:30
      Coffee break
    • 8
      Compatibility Tool (CUDA / SYCL conversion)

      Open-Source Compatibility tool for porting purposes (SYCLomatic) with demo
      Migrating CUDA-based GPU applications to SYCL

    • 9
      Hands-on labs on CUDA to SYCL Compatibility Tool

      Get your hands dirty with examples:
      • vecAdd
      • Rodinia NW

    • 12:30
      Lunch break (lunch not provided)
    • 10
      Profiling and analysing code performance with VTune

      VTune main functionality (Hot spot analysis…) starting with CPU
      Profiling Tools Interfaces for GPU
      Profile heterogeneous SYCL/OpenMP workloads with Intel VTune

    • 11
      Application profiling for CPU and mixed hardware with Intel Advisor

      Advisor main functionality (Vectorization and Roofline) starting with CPU
      Estimate potential performance gains with Offload Advisor (CPU -> HW Accelerator)
      Analyse heterogeneous SYCL/OpenMP workloads with Intel Advisor and Roofline analysis

    • 15:30
      Coffee break
    • 12
      Hands-on labs on VTune / Advisor

      Understand the basics of command line options in VTune Profiler to collect data and generate reports
      Profile a SYCL application using Intel® VTune™ Profiler on Intel® DevCloud
      Run Offload Advisor using command line syntax
      Use performance models and analyze generated reports
      See how Offload Advisor identifies and ranks parallelization opportunities for offload

    • 13
      Welcome
    • 14
      Offloading C++ code to GPU with OpenMP
    • 10:30
      Coffee break
    • 15
      Transitioning from ifort to ifx

      Intel Fortran Compiler provides full Fortran language standards support up through Fortran 2018 and expands OpenMP GPU offload support, speeding development of standards-compliant applications

    • 16
      Offloading Fortran Code - with Demos

      Automatic offloading using DO CONCURRENT
      Offloading using OpenMP 5.2
      Offloading using oneMKL

    • 17
      Intel® MPI in Heterogeneous Environments

      Intel MPI functionalities for GPU

    • 12:30
      Lunch break (lunch not provided)
    • 18
      Hands-on labs on code optimisation

      Use OpenMP Offload directives to execute code on GPU
      Use OpenMP constructs to effectively manage data transfers to and from the device

    • 15:00
      Coffee break
    • 19
      Hands-on labs on code optimisation (cont.)

      To be defined – could be user-provided codes for an initial run