Cyfronet - Intel oneAPI workshop

Europe/Warsaw
room 303 (Nawojki 11, Krakow)

Description

Each day of this in-person workshop is dedicated to a different topic: 

Day 1 (Oct 24th): Programming with oneAPI
Day 2 (Oct 25th): Performance libraries and profiling
Day 3 (Oct 26th): OpenMP Offloading & MPI

The event is free of charge and open to anyone employed by or affiliated with an institution (university, research institute, enterprise, public administration unit) from Poland or any other EuroCC2 country.

Register for any (or all) of the days using the individual registration forms below. If registrations exceed the planned number of participants, we will maintain a waiting list.

Please note that this is an in-person only event in Krakow, Poland.

Speakers:

Stephen Blair-Chappell is an independent software consultant and an Intel-certified oneAPI instructor. He was formerly the Technical Director at Bayncore, where he led a team of consultants providing HPC and AI training on Intel Architecture. For 18 years he was a Technical Consulting Engineer at Intel, helping its strategic customers with software optimization and code modernization. He is the author of the book "Parallel Programming with Intel Parallel Studio XE".

Biagio Cosenza is a tenure-track Assistant Professor in the Department of Computer Science at the University of Salerno, Italy, and a member of the Khronos SYCL Working Group. He joined the University of Salerno in August 2019 through a national brain gain program (Attraction and International Mobility), and received the Abilitazione for Italian Associate Professorship. From 2015 to 2019, he was a Senior Researcher at TU Berlin, Germany, where he was Principal Investigator for the DFG project Celerity and received the Habilitation from Faculty IV. From 2011 to 2015, he was a Postdoctoral Researcher at the University of Innsbruck, Austria, where he contributed to the Insieme Compiler project and the DK-Plus multidisciplinary platform for Scientific Computing. His research is currently funded by the European HPC Joint Undertaking (LIGATE project), the Italian Ministry of Research (LibreRT project, PRIN 2022), and several industrial projects. Cosenza’s main research interests are in the field of high-performance computing, in particular programming models, compiler technology, optimization and tuning.

Day 1 (Oct 24th): Programming with oneAPI

    • 09:30 09:40
      Welcome 10m
    • 09:40 10:30
      Introduction to oneAPI and the new Intel Developer Cloud (IDC) infrastructure 50m

      Hardware evolution: from CPUs to programming heterogeneous hardware (GPUs, FPGAs)
      Concept and purpose of the oneAPI standardization initiative
      Intel’s oneAPI solutions: toolkits with compilers, libraries, analysis tools, and migration tools
      Transition from Intel Parallel Studio XE to Intel oneAPI toolkits
      IDC: service platform for developing with the latest Intel hardware and software

    • 10:30 10:50
      Coffee break 20m
    • 10:50 12:30
      Direct programming with oneAPI DPC++/SYCL 1h 40m

      Intro to the heterogeneous programming model with SYCL 2020
      SYCL features and examples:
      • “Hello World” Example: buffer, accessor, queue, basic and nd-range kernels
      • Device Selection to offload kernel workloads
      • Execution Model
      • Compilation and Execution Flow
      • Memory Model: Buffers, Unified Shared Memory (USM)
      • Performance optimizations with SYCL features

    • 12:30 13:30
      Lunch break (lunch not provided) 1h
    • 13:30 15:00
      Hands-on labs on CPU/GPU programming with SYCL 1h 30m

      Start working with IDC and explore SYCL
      Understand the SYCL language and programming model
      Use device selection to offload kernel workloads
      Decide when to use basic parallel kernels and ND-range kernels
      Create a host accessor
      Build your first SYCL application

    • 15:00 15:15
      Coffee break 15m
    • 15:15 17:00
      Hands-on labs on CPU/GPU programming with SYCL (cont.) 1h 45m

      Continue your learning journey and use CPU/GPU

Day 2 (Oct 25th): Performance libraries and profiling

    • 09:30 09:40
      Welcome 10m
    • 09:40 10:30
      Intel oneAPI Libraries: oneDPL, oneTBB, oneMKL 50m

      Intel oneAPI libraries (oneMKL) for HPC - with demos
      Performance-optimized libraries for numerical simulations and other purposes

    • 10:30 10:50
      Coffee break 20m
    • 10:50 11:20
      Compatibility Tool (CUDA / SYCL conversion) 30m

      Open-source compatibility tool for porting (SYCLomatic), with demo
      Migrating CUDA-based GPU applications to SYCL

    • 11:20 12:30
      Hands-on labs on CUDA to SYCL Compatibility Tool 1h 10m

      Get your hands dirty with examples:
      • vecAdd
      • Rodinia NW

    • 12:30 13:30
      Lunch break (lunch not provided) 1h
    • 13:30 14:30
      Profiling and analysing code performance with VTune 1h

      VTune main functionality (hotspot analysis…), starting with CPU
      Profiling Tools Interfaces for GPU
      Profile heterogeneous SYCL/OpenMP workloads with Intel VTune

    • 14:30 15:30
      Application profiling for CPU and mixed hardware with Intel Advisor 1h

      Advisor main functionality (Vectorization and Roofline), starting with CPU
      Estimate potential performance gains with Offload Advisor (CPU -> HW accelerator)
      Analyse heterogeneous SYCL/OpenMP workloads with Intel Advisor and Roofline analysis

    • 15:30 15:45
      Coffee break 15m
    • 15:45 17:00
      Hands-on labs on VTune / Advisor 1h 15m

      Understand the basics of command line options in VTune Profiler to collect data and generate reports
      Profile a SYCL application using Intel® VTune™ Profiler on Intel® DevCloud
      Run Offload Advisor using command line syntax
      Use performance models and analyze generated reports
      See how Offload Advisor identifies and ranks parallelization opportunities for offload

Day 3 (Oct 26th): OpenMP Offloading & MPI

    • 09:30 09:40
      Welcome 10m
    • 09:40 10:30
      Offloading C++ code to GPU with OpenMP 50m
    • 10:30 10:45
      Coffee break 15m
    • 10:45 11:10
      Transitioning from ifort to ifx 25m

      The Intel Fortran Compiler provides full support for Fortran language standards up through Fortran 2018 and expands OpenMP GPU offload support, speeding development of standards-compliant applications.

    • 11:10 12:00
      Offloading Fortran code, with demos 50m

      Automatic offloading using DO CONCURRENT
      Offloading using OpenMP 5.2
      Offloading using oneMKL

    • 12:00 12:30
      Intel® MPI in a Heterogeneous Environment 30m

      Intel MPI functionalities for GPU

    • 12:30 13:30
      Lunch break (lunch not provided) 1h
    • 13:30 15:00
      Hands-on labs on code optimisation 1h 30m

      Use OpenMP offload directives to execute code on the GPU
      Use OpenMP constructs to effectively manage data transfers to and from the device

    • 15:00 15:15
      Coffee break 15m
    • 15:15 17:00
      Hands-on labs on code optimisation (cont.) 1h 45m

      To be defined – could be user-provided codes for an initial run