Date of Award
Master of Science
OpenMP target offload has been in the inception phase for some time but has been gaining traction in the recent years with more compilers supporting the constructs and optimising it at the general level. Its ease of programming compared to other models makes it quite desirable in the industry. This work investigates how different compilers are interacting with the different constructs at runtime and how the callbacks affect the performance of each compiler. We also dive into the programs in the Polybench benchmark test suite with the main focus on linear algebra to generate an OpenMP GPU target offload implementation with parallelization techniques obtained from DiscoPoP for the C Language. The main focus is on the DOALL and reduction loops which come under loop level parallelism. The problem encountered are issues with the device data mapping. We also analyze the compiler used and see its behaviour in code generation for OpenMP target offload. While converting these benchmarks, we faced a myriad of issues related to the data mapping of multi-dimensional data structures onto the target from the host while using the GCC compiler. The main work done to counter this issue was to suggest a code transformation algorithmic approach which efficiently resolves the issues without loosing the correctness of the said programs.
Bhattacharjee, Arijit, "Evaluation of GPU-specific device directives and multi-dimensional data structures in OpenMP" (2020). Graduate Theses and Dissertations. 18278.