The Polish Open Science Conference 2024

Name: The Polish Open Science Conference 2024
Start: 2024-04-10T09:00:00+02:00
End: 2024-04-12T20:00:00+02:00
Location: Władysława Reymonta 7

Europe/Warsaw timezone

Contact

os-conf-pl@cyfronet.pl

Data Lineage in High-Performance Computing Environments

10 Apr 2024, 12:55

10m

Audytorium (Centrum Dydaktyki AGH, U-2), Władysława Reymonta 7, Kraków, Poland

Session I

Dr Mateusz Tykierko (Wroclaw Centre for Networking and Supercomputing)

As high-performance computing (HPC) systems grow more complex, data management becomes a critical challenge. In HPC environments, where data is processed on a massive scale, tracing data lineage (from a dataset's source to its use in analyses and computations) and understanding data provenance are key to ensuring data integrity, regulatory compliance, and performance optimization. Reproducibility of scientific results is likewise paramount in such environments. In this presentation, we analyze data lineage, provenance, and reproducibility in the context of HPC and discuss techniques, tools, and best practices for data management in such complex settings. We focus on identifying data sources, tracking the flow of data through successive processing stages, ensuring data consistency and quality, establishing data provenance, and facilitating the reproducibility of scientific results.
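
To make the idea of tracking data through processing stages concrete, here is a minimal Python sketch. It is not taken from the talk or from any specific tool. It assumes that datasets fit in memory as bytes, that a SHA-256 content hash is an acceptable dataset identifier, and that each stage is logged by hand. The names `LineageRecord`, `LineageLog`, `record`, and `trace` are hypothetical and exist only for this illustration.

```python
import hashlib
from dataclasses import dataclass, field


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest used here as a dataset identifier."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class LineageRecord:
    """One processing stage: which inputs produced which output (hypothetical schema)."""
    stage: str
    input_hashes: list
    output_hash: str
    parameters: dict = field(default_factory=dict)


class LineageLog:
    """In-memory log of processing stages, indexed by the hash of each output."""

    def __init__(self):
        self._by_output = {}

    def record(self, stage, inputs, output, **parameters):
        """Log one stage and return the hash that identifies its output."""
        rec = LineageRecord(
            stage=stage,
            input_hashes=[sha256_of(i) for i in inputs],
            output_hash=sha256_of(output),
            parameters=parameters,
        )
        self._by_output[rec.output_hash] = rec
        return rec.output_hash

    def trace(self, output_hash):
        """Walk back from an output towards its sources; newest stage first."""
        stages, pending, seen = [], [output_hash], set()
        while pending:
            h = pending.pop()
            if h in seen:
                continue
            seen.add(h)
            rec = self._by_output.get(h)
            if rec is None:
                continue  # no stage produced this hash: treat it as a raw source
            stages.append(rec.stage)
            pending.extend(rec.input_hashes)
        return stages


# Example: a two-stage pipeline (clean, then sum) over a tiny CSV-like input.
log = LineageLog()
raw = b"1,2,3\n"
cleaned = raw.strip()
total = str(sum(int(x) for x in cleaned.split(b","))).encode()

log.record("clean", [raw], cleaned)
final_hash = log.record("sum", [cleaned], total, op="add")
print(log.trace(final_hash))  # ['sum', 'clean']
```

Identifying data by content hash rather than by file path means that any change to an input yields a different identifier, so a stale or altered dataset cannot be silently mistaken for the one originally used. A production HPC setup would more plausibly capture this information automatically, for example from the scheduler or the workflow system, and would store it persistently.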

PLOSC24_S1_P3_Tykierko.pdf
