In this post, I used example code from the Numba tutorial for the GTC 2017 conference.
- I created an environment named deep-learning in Anaconda Navigator (Anaconda3) and installed the required Python modules in it, for example cudatoolkit and numba.
- The code in the Jupyter Notebook begins by exporting the paths to the NVIDIA driver DLL and library folder.
- The first example uses the jit decorator. It shows the performance improvement of the compiled code over pure Python code.
- The second example uses the vectorize decorator. Here the GPU code is slower than the CPU code, because the number of computations is too small to offset the overhead of transferring data to and from the GPU.
- The third example shows a case where the GPU code is faster.
- The fourth example shows the baseline performance with host arrays.
- The fifth example shows how to transfer data to and from the GPU for faster execution.
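To give a feel for the jit comparison in the first example, here is a minimal sketch. It is not the tutorial's code: the function `hypot_sum_py`, the helper `time_call`, and the array size are my own illustrative choices, and the no-op fallback decorator is only there so the snippet still runs (without any speedup) when Numba is not installed.

```python
import math
import time

import numpy as np

try:
    from numba import jit
except ImportError:
    # Assumption: fall back to a no-op decorator so the sketch still runs
    # (without any speedup) when Numba is not installed.
    def jit(*args, **kwargs):
        def wrap(func):
            return func
        return wrap


def hypot_sum_py(xs, ys):
    """Sum of sqrt(x^2 + y^2) over two 1-D arrays, written as a plain loop."""
    total = 0.0
    for i in range(xs.shape[0]):
        total += math.sqrt(xs[i] * xs[i] + ys[i] * ys[i])
    return total


# Compile the same function with Numba in nopython mode.
hypot_sum_jit = jit(nopython=True)(hypot_sum_py)


def time_call(func, *args):
    """Return the function result and the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


rng = np.random.default_rng(0)
xs = rng.random(100_000)
ys = rng.random(100_000)

hypot_sum_jit(xs, ys)  # first call triggers compilation, so it is not timed
py_result, py_seconds = time_call(hypot_sum_py, xs, ys)
jit_result, jit_seconds = time_call(hypot_sum_jit, xs, ys)
print(f"pure Python: {py_seconds:.4f} s, jit: {jit_seconds:.4f} s")
```

The warm-up call matters: Numba compiles on first use, so timing that call would mostly measure compilation rather than execution.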
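The GPU examples with vectorize and explicit data transfer can be sketched roughly as follows. This is an illustration under assumptions, not the tutorial's code: `add_ufunc`, `add_on_gpu`, and the array size are hypothetical names and values I picked, and the plain NumPy fallback exists only so the snippet runs on machines without Numba or a CUDA-capable GPU.

```python
import numpy as np

try:
    from numba import cuda, vectorize
    HAVE_GPU = cuda.is_available()
except ImportError:
    # Assumption: without Numba we simply skip the GPU path.
    HAVE_GPU = False

if HAVE_GPU:
    # Element-wise addition compiled as a ufunc for the GPU.
    @vectorize(["float32(float32, float32)"], target="cuda")
    def add_ufunc(x, y):
        return x + y

    def add_on_gpu(x, y):
        """Add two host arrays on the GPU with explicit transfers."""
        x_dev = cuda.to_device(x)  # copy inputs host -> device once
        y_dev = cuda.to_device(y)
        # Allocate the output on the device so the result stays on the GPU.
        out_dev = cuda.device_array(shape=x.shape, dtype=x.dtype)
        add_ufunc(x_dev, y_dev, out=out_dev)
        return out_dev.copy_to_host()  # copy the result device -> host
else:
    def add_on_gpu(x, y):
        """CPU fallback used when no CUDA GPU (or Numba) is available."""
        return x + y


n = 1_000_000
x = np.arange(n, dtype=np.float32)
y = 2 * x

result = add_on_gpu(x, y)
print("ran on GPU:", HAVE_GPU)
```

Passing host arrays directly to the ufunc also works, but then each call copies inputs to the device and the result back. Keeping intermediate arrays on the device avoids those repeated transfers, which is where the speedup in the last example comes from.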