This is a series on GPU optimization that explains in detail how to optimize CUDA kernels. I will introduce several basic kernel optimizations, including elementwise, reduce, sgemv, and sgemm. The performance of these kernels is at or near the theoretical limit.
CUDA C implementation of Principal Component Analysis (PCA) through Singular Value Decomposition (SVD) using a highly parallelisable version of the Jacobi eigenvalue algorithm.
The general idea of this project is to generate a Fibonacci sequence and sort it with hand-written sorting algorithms (bubble sort, quick sort, merge sort, and heap sort) and, at the same time, with Thrust library routines such as thrust::sort and thrust::transform.