***Timeline can be adjusted based on algorithm complexity.
I’m an HPC developer with 3+ years of experience in C++, CUDA, DPC++ (Intel), OpenMP, and OpenACC development.
I have hands-on experience with Intel ICC, IPP, TBB, MKL, and oneAPI, as well as multithreading concepts.
I have parallelised and optimised many large-scale CPU-based applications, porting them to CPU + GPU environments.
I have experience with CUDA-accelerated libraries.
I use a combination of custom CUDA kernels, CUDA-accelerated libraries, OpenMP, and OpenACC to improve application performance.
Workloads I have handled: image processing, spatial linear algebra, Monte Carlo methods, autonomous cars, and deep learning.
I have hands-on experience with profiling tools such as nvprof, nvvp, Nsight Compute, Nsight Systems, Intel VTune, and Intel Advisor.
If needed, I will provide documentation explaining the implementation.
I assure you of my best efforts.