ClojureCuda¶
Installation¶
Insure that the paths are updated at the end of your .bashrc
file
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
The version minor version of the cuda library also matter. You also need
to simlink the exact libnvrt.so.MAJOR.MINOR
binary into
/usr/lib/x86_64-linux-gnu/
. For example
ln -s /usr/local/cuda/lib64/libnvrtc.so.11.0 /usr/lib/x86_64-linux-gnu/
Logic¶
The logic follows the cuda model, but the compilation happens with the
clojurecuda.core/compile!
function. Most example shows that you write
a *.cu
file and concatenate the sources, while headers are a map of
their name (e.g stdint.h
) to their code source. The best example if
taken from
uncomplicate.neanderthal.internal.device.cublas
namespace.
(let [src (str (slurp (io/resource "uncomplicate/clojurecuda/kernels/reduction.cu"))
(slurp (io/resource "uncomplicate/neanderthal/internal/device/cuda/number.cu"))
(slurp (io/resource "uncomplicate/neanderthal/internal/device/cuda/blas-plus.cu"))
(slurp (io/resource "uncomplicate/neanderthal/internal/device/cuda/vect-math.cu"))
(slurp (io/resource "uncomplicate/neanderthal/internal/device/cuda/random.cu")))
integer-src (slurp (io/resource "uncomplicate/neanderthal/internal/device/cuda/number.cu"))
standard-headers {"stdint.h" (slurp (io/resource "uncomplicate/clojurecuda/include/jitify/stdint.h"))
"float.h" (slurp (io/resource "uncomplicate/clojurecuda/include/jitify/float.h"))}
philox-headers
(merge standard-headers
{"Random123/philox.h"
(slurp (io/resource "uncomplicate/neanderthal/internal/device/include/Random123/philox.h"))
"features/compilerfeatures.h"
(slurp (io/resource "uncomplicate/neanderthal/internal/device/include/Random123/features/compilerfeatures.h"))
"nvccfeatures.h"
(slurp (io/resource "uncomplicate/neanderthal/internal/device/include/Random123/features/nvccfeatures.h"))
"array.h" (slurp (io/resource "uncomplicate/neanderthal/internal/device/include/Random123/array.h"))})]
(JCublas2/setExceptionsEnabled false)
(defn cublas-double [ctx hstream]
(in-context
ctx
(with-release
[prog (compile!
(program src philox-headers)
["-DNUMBER=double" "-DREAL=double" "-DACCUMULATOR=double"
"-DCAST(fun)=fun" #_"-use_fast_math" "-default-device"
(format "-DCUDART_VERSION=%s" (driver-version))])]
(let-release [modl (module prog)
handle (cublas-handle hstream)
hstream (get-cublas-stream handle)]
(->CUFactory
modl hstream (cu-double-accessor (current-context) hstream) native-double
(->DoubleVectorEngine handle modl hstream) (->DoubleGEEngine handle modl hstream)
(->DoubleTREngine handle modl hstream) (->DoubleSYEngine handle modl hstream))))))
(defn cublas-int [ctx hstream]
(in-context
ctx
(with-release
[prog (compile! (program integer-src)
["-DNUMBER=int" "-use_fast_math" "-default-device"])]
(let-release [modl (module prog)
handle (cublas-handle hstream)
hstream (get-cublas-stream handle)]
(->CUFactory modl hstream (cu-int-accessor (current-context) hstream) native-int
(->IntegerVectorEngine handle modl hstream) nil nil nil))))))
At the compile!
function call, we can see that the program
function
requires the concatenate sources, and the headers as map. The second
argument to compile!
are the arguments to nvcc
(or more precisely
the nvidia compiler).
Java, C++ and JNA¶
When we want to leverage other's code, we need to create bridge between
C++ and Java. One typical is to pass custom struct
. To solve this we
could use jna.