Yuanjun Xiong, MMLAB CUHK
This should be the first question, actually. The answer is to provide more flexibility in coding.
Mex-files are C programs that can be invoked directly from MATLAB, one of the most powerful pieces of math software on this planet. Usually, we write a standard C file and use the utility provided by MATLAB to compile and link it. Here is a taste of the C code in a mex source file.
#include "mex.h"
#include <stdio.h>
void mexFunction(int nlhs, mxArray *plhs[],
int nrhs, const mxArray *prhs[]){
printf("Hello, Matlab\n");
return;
}
Assume you save this snippet as hello.c. Then, in the interactive console of MATLAB, type and enter
mex hello.c
You will get the compiled mex file called hello.mexw64. You can invoke it in MATLAB as
hello()
Then you get the output
Hello, Matlab
However, if we want to debug this little program, things become a little involved. MATLAB does not provide a debugging tool of its own for mex code, since a mex file is just an extension. MathWorks suggests using Visual Studio’s “Attach to Process” feature for debugging.
Things become more tangled when we add some CUDA code to the mex file. MATLAB also has its own tool to compile CUDA-enabled code and, once again, no debugging tools. It is not difficult to imagine the mess when host code is mixed with device code and we have no idea what is happening in the code.
This is why I wrote this post.
Let’s try to compile this small program, say VS_CUDA_MEX.cu.
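#include "mex.h"
#include <cuda.h>
#include <cuda_runtime.h>
// First output argument of the mex function.
#define D_OUT plhs[0]
// Empty kernel: it only shows that device code compiles and launches.
__global__ void test_cuda_vs(int a, int b){
    return;
}
void mexFunction(int nlhs, mxArray *plhs[],
                 int nrhs, const mxArray *prhs[]){
    D_OUT = mxCreateDoubleMatrix(1,1,mxREAL);
    double* out = mxGetPr(D_OUT);
    test_cuda_vs<<<1,1>>>(1,2);
    out[0] = 0;  // return a dummy scalar to MATLAB
    return;
}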
In MATLAB, we can use the command mex VS_CUDA_MEX.cu to compile it. Before doing this, remember to copy this file
%matlabroot%\toolbox\distcomp\gpu\extern\src\mex\win64\mexopts.bat
to the folder where the source file resides, as suggested by MathWorks. You may refer to this for help setting up CUDA and other things.
Since we will be using VS for debugging anyway, why not compile the code with it as well? Let’s try this.
We should start by creating the project. Any configuration that can generate a DLL file will be fine. I use Visual Studio 2010, so we create an MFC DLL. In the IDE, choose New Project -> Visual C++ -> MFC DLL from the menu, name your project, and finish the new project wizard. In this example, the project is named MexTemp.
We need to export a function to communicate with MATLAB. Something similar exists in every DLL file. Let’s do this step by step.
Add a new file called mex_interface.cpp to the source folder.
The entry point that MATLAB calls is mexFunction(). We should write this function in the cpp file. For example, we write the code below. Since we are going to use CUDA, we add the CUDA headers.
#include "stdafx.h"
#include "mex.h"
#include <cuda.h>
#include <cuda_runtime.h>
extern "C" void tester(int);
void mexFunction(int nlhs, mxArray *plhs[],
int nrhs, const mxArray *prhs[]){
#define D_OUT plhs[0]
#define OPT_DIST prhs[3]
D_OUT = mxCreateDoubleMatrix(1,1,mxREAL);
double* out = mxGetPr(D_OUT);
tester(1);
out[0] = 0;
return;
}
Please refer to Matlab documentation on how to write mex functions.
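The macro OPT_DIST in the code above refers to prhs[3], so the function quietly assumes that MATLAB passed at least four inputs. A minimal sketch of a guard for that assumption follows. The helper name check_mex_arg_counts and the limits (four inputs, one output) are my own assumptions for illustration, not part of the MATLAB API. The helper is plain C++ so it can be compiled without mex.h.

```cpp
#include <cstddef>

// Hypothetical helper (not part of the MATLAB API): validates the argument
// counts that MATLAB passes to mexFunction(). Returns NULL when the counts
// are acceptable, otherwise a human-readable message.
// Assumption: the function needs at least 4 inputs (OPT_DIST is prhs[3])
// and produces at most 1 output (D_OUT is plhs[0]).
const char* check_mex_arg_counts(int nlhs, int nrhs)
{
    const int kMinInputs = 4;   // prhs[0] .. prhs[3]
    const int kMaxOutputs = 1;  // plhs[0]
    if (nrhs < kMinInputs) {
        return "Expected at least 4 input arguments.";
    }
    if (nlhs > kMaxOutputs) {
        return "Expected at most 1 output argument.";
    }
    return NULL;
}

// Intended use at the top of mexFunction() (requires mex.h):
//     const char* msg = check_mex_arg_counts(nlhs, nrhs);
//     if (msg != NULL) mexErrMsgIdAndTxt("MexTemp:args", "%s", msg);
```

Reporting the problem through mexErrMsgIdAndTxt returns control to the MATLAB prompt with an error, which is much friendlier than reading past the end of prhs. The identifier "MexTemp:args" is only a placeholder.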
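You may wonder why there is an extern "C" declaration in the code. We need a cpp file to export the mex function. However, the CUDA device code must be written in files with the .cu extension, which are compiled by NVCC. So we have to write a “wrapper” to expose the device code to the mex function. Declaring the wrapper as extern "C" gives it an unmangled name, so the declaration in the cpp file and the definition in the .cu file resolve to the same symbol at link time.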
To do this, add a new file called my_kernel.cu to the source folder. In this file, we write the __global__ CUDA function and the wrapper, say tester().
#include <cuda.h>
#include <cuda_runtime.h>
#include <stdio.h>
__global__ void test_cuda_vs(int a, int b){
return;
}
extern "C" void tester(int a)
{
int test = a;
test_cuda_vs<<< 1, 1 >>> (1, test);
}
This example is very simple. It runs one thread and does nothing. But it serves as a complete template for any new content. The function __global__ void test_cuda_vs(int a, int b) is the CUDA kernel. Please refer to the CUDA Programming Guide for more details.
The host function tester() here acts as the wrapper for the device function. Thus we can call this function in the mex function or anywhere else to implicitly invoke the device function.
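The template launches a single thread with <<< 1, 1 >>>. Once the kernel is extended to process n elements, the wrapper has to pick a grid size that covers all of them. Here is a minimal sketch of the usual ceiling-division arithmetic, written as plain host C++ so it can live in either the cpp or the .cu file. The name blocks_for and the block size of 256 in the usage comment are illustrative assumptions, not part of the original template.

```cpp
// Hypothetical helper, not part of the original template: returns the number
// of thread blocks needed so that blocks * threads_per_block >= n.
// Returns 0 when n is 0 or when threads_per_block is 0 (invalid input).
unsigned int blocks_for(unsigned int n, unsigned int threads_per_block)
{
    if (threads_per_block == 0) {
        return 0;
    }
    // Ceiling division written so that it cannot overflow for large n.
    return n / threads_per_block + (n % threads_per_block != 0 ? 1u : 0u);
}

// Intended use inside tester() in my_kernel.cu (assumed block size of 256):
//     test_cuda_vs<<< blocks_for(n, 256), 256 >>>(1, test);
```

A result of 0 means there is nothing to launch, so the wrapper should skip the kernel call in that case. Because the grid usually covers slightly more than n threads, the kernel itself would also need to check its index against n.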
To export the entry point to MATLAB, add a new file called mexTemp.def and write
LIBRARY "MexTemp"
EXPORTS mexFunction
By now, we have all the code we need. Let’s configure the project to compile it.
It’s straightforward to run the build command in Visual Studio to compile the code. But before that, we have to set some properties. Let’s go through them.
Set up a configuration
a. In the Solution Explorer (usually on the left side of the IDE), right click on the project name and click “Properties”. At the top of the pop-up dialog you should see Configuration Manager. Open it. In the dropdown list Active solution platform, select <new>, make sure x64 is chosen as the new platform, then click OK. Now we have the x64 platform.
b. Select the “x64” configuration as the active platform.
Set up the CUDA compiler
a. Right click on the project name again, click Build Customizations and check CUDA 5.5(.targets, .props). The name may vary with your CUDA version. This teaches VS to use NVCC to compile .cu files.
b. Once again in Properties -> CUDA C/C++ -> Device -> Code Generation, choose the compute capability of your GPU.
c. In Properties -> CUDA C/C++ -> Host -> Runtime Library, choose Multi-threaded Debug DLL (/MDd).
d. In Properties -> Linker -> Input -> Additional Dependencies, add cudart.lib. You may also specify the include and library paths of CUDA in Properties -> VC++ Directories -> Include Directories and Properties -> VC++ Directories -> Library Directories.
Set up the C++ compiler & linker
This part is even more involved, and some problems might occur occasionally. So good luck!
a. In Properties -> Linker -> Input -> Additional Dependencies, add the following entries. Note that you should locate these files and write their actual paths on your system.
C:\Program Files\MATLAB\R2013a\extern\lib\win64\microsoft\libmx.lib
C:\Program Files\MATLAB\R2013a\extern\lib\win64\microsoft\libmex.lib
C:\Program Files\MATLAB\R2013a\extern\lib\win64\microsoft\libmat.lib
b. In Properties -> Linker -> General -> Output File, change the name of the output file to $(OutDir)$(ProjectName).mexw64.
c. In Properties -> C/C++ -> General -> Additional Include Directories, add
C:\Program Files\MATLAB\R2013a\extern\include
d. In Properties -> C/C++ -> Code Generation -> Runtime Library, choose Multi-threaded Debug DLL (/MDd). This keeps the runtime library consistent between the code compiled by NVCC and the code compiled by the C++ compiler.
e. In Properties -> Linker -> Command Line, add the additional flags /export:mexFunction /dll.
f. In Properties -> General -> Target Extension, change the extension to mexw64 to eliminate a possible warning.
g. In Properties -> Linker -> Input -> Module Definition File, set the module definition file to .\mexTemp.def.
By now, you should be able to build the project successfully.
Many thanks to NVIDIA for providing Nsight. With it, we can debug the device code just like CPU code. I do not even want to recall the painful experience of trying to debug my GLSL code.
First, please download and install Nsight. Then add NSIGHT_CUDA_DEBUGGER=1 to the environment variables. Now let’s try it.
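If the breakpoints in the following steps are never hit, one thing worth ruling out is that MATLAB was started before the variable was set, since a process only sees the environment it was launched with. Here is a minimal sketch of a check that could be called from the host side of the mex file. env_var_equals is a hypothetical helper of mine. It uses only std::getenv and std::strcmp from the standard library.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns true when the environment variable `name`
// is visible to the current process and its value equals `expected`.
bool env_var_equals(const char* name, const char* expected)
{
    const char* value = std::getenv(name);  // NULL when the variable is unset
    return value != NULL && std::strcmp(value, expected) == 0;
}

// Intended use inside mexFunction() (requires mex.h):
//     if (!env_var_equals("NSIGHT_CUDA_DEBUGGER", "1"))
//         mexPrintf("NSIGHT_CUDA_DEBUGGER is not set to 1 in this process\n");
```

If the check fails inside MATLAB although the variable is set system-wide, restarting MATLAB (or logging off and on again) should make the new value visible.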
First open MATLAB and navigate to the folder containing the mex file we have just generated. Try running it with mexTemp(). It should work fine. Then we try Nsight’s debugging functionality.
In the IDE, add some breakpoints in the device function. Then open the menu Tools -> Attach to Process. In the pop-up dialog, switch the transport (the “proxy”) to Nsight GPU Debugger.
In the Available Processes list, select the MATLAB process and attach to it. You may first need to set the Qualifier to the name of your PC.
The IDE will now switch to the debugging interface. Try running the mex file in MATLAB. It should pause at the breakpoint you specified.
Now you can debug the code just like ordinary cpp code.
To also break on breakpoints in the host function, attach to MATLAB with the default transport.
Thanks for reading so many words. Here is a working solution template as a reward for your patience.
Set an environment variable matlabroot pointing to the install location of MATLAB (say, C:\Program Files\MATLAB\R2013a). You may need to log off and log in again to enable the variable.
Extract the zip file.
Open mexFunction.sln. There should be two projects in the solution: mexFunction and mexFunction_CUDA. The CUDA enabled project may not load if you don’t have CUDA installed or are running another version. But the pure mex project should be OK.
I have also provided a property sheet in \properties, called mex.props. You can import it in the Property Manager of VS. Then add a cpp file for the MATLAB mexFunction. It will teach the IDE to build mex files.
Look at these links for more reference
Trick and Tips: Debugging CUDA Matlab Mex file codes
StackOverflow: Debug cuda kernel on mexfunction by attaching Matlab with Nsight GPU Debuger
Contact me:
yjxiong@ie.cuhk.edu.hk
MMLAB, CUHK
Mar 5, 2014