Creating a New Project in CUDA

  • By ammasajan
  • Type: Tutorial
  • Posted on 2010-03-03 at 01:43:24

This tutorial shows how to create a new project for developing CUDA-enabled applications on an NVIDIA GPU platform.

Prerequisites:

  • GPU(s) - GeForce, Tesla, etc.

  • CUDA SDK - Installed

  • CUDA Driver - Installed

  • CUDA Toolkit - Installed

  • CUDA Samples - Installed with the Toolkit


Tutorial

1. Log in to the GPU machine (SSH access is also fine)

2. Set the PATH and LD_LIBRARY_PATH variables by adding these lines to ~/.bash_profile

  • export PATH=/usr/local/cuda/bin:$PATH

  • export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib (on 64-bit machines use /usr/local/cuda/lib64)


3. Add /usr/local/cuda/lib (on 64-bit machines, /usr/local/cuda/lib64) to the dynamic linker configuration read through /etc/ld.so.conf

  • Create a file called gpu.conf in the /etc/ld.so.conf.d directory

  • Add the line /usr/local/cuda/lib64 (or /usr/local/cuda/lib on 32-bit machines) to gpu.conf


4. Run ldconfig as the root user so that the linker cache picks up the new directory

5. Enable the GPU profiler (optional)

  • export CUDA_PROFILE=1


(If you enable the CUDA profiler and run your application, a file named cuda_profile.log is written to the current directory.)
Example:
 
# CUDA_PROFILE_LOG_VERSION 1.5
# CUDA_DEVICE 0 Tesla C1060
# TIMESTAMPFACTOR fd4920a156863f8
method,gputime,cputime,occupancy
method=[ memcpyHtoD ] gputime=[ 3.744 ] cputime=[ 2.000 ] 
method=[ memcpyHtoD ] gputime=[ 3.968 ] cputime=[ 1.000 ] 
method=[ _Z6vecAddPiS_S_ ] gputime=[ 6.656 ] cputime=[ 8.000 ] occupancy=[ 0.031 ] 
method=[ memcpyDtoH ] gputime=[ 4.416 ] cputime=[ 17.000 ]
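
Once a log like this exists, a short script can total the GPU time per method. The following is a minimal sketch, assuming the log keeps the "method=[ name ] gputime=[ value ]" layout shown above. The function name summarize_profile is made up for this example and is not part of the CUDA tools.

```shell
#!/bin/sh
# Total the gputime column per method in a CUDA profiler log.
# summarize_profile is a hypothetical helper, not a CUDA command.
summarize_profile() {
    awk '
        /^method=\[/ {
            # Fields look like: method=[ name ] gputime=[ 3.744 ] cputime=[ 2.000 ]
            name = $2
            gpu = 0
            for (i = 1; i <= NF; i++)
                if ($i == "gputime=[") gpu = $(i + 1)
            total[name] += gpu
            count[name]++
        }
        END {
            for (name in total)
                printf "%s calls=%d gputime_total=%.3f\n", name, count[name], total[name]
        }
    ' "$1" | sort
}

# Summarize the log in the current directory if the profiler has written one.
if [ -f cuda_profile.log ]; then
    summarize_profile cuda_profile.log
fi
```

Run in the directory where the application was started, this prints one line per method with its call count and summed gputime. Header and comment lines in the log are skipped because they do not start with "method=[".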

                                    
The basic development environment setup is done!
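
To confirm that steps 2 to 4 took effect, a quick check from a fresh login shell can help. This is only a sketch: check_cuda_env is a hypothetical helper name, and the check assumes a Linux system whose ldconfig supports the -p option for listing the linker cache.

```shell
#!/bin/sh
# Report whether nvcc is on PATH and libcudart is in the dynamic linker cache.
# check_cuda_env is a hypothetical helper, not part of the CUDA toolkit.
check_cuda_env() {
    rc=0
    if command -v nvcc >/dev/null 2>&1; then
        echo "nvcc: found at $(command -v nvcc)"
    else
        echo "nvcc: not found on PATH"
        rc=1
    fi
    # ldconfig often lives in /sbin, which may be missing from a normal user's PATH.
    if (PATH="$PATH:/sbin:/usr/sbin"; ldconfig -p) 2>/dev/null | grep -q 'libcudart\.so'; then
        echo "libcudart: found in linker cache"
    else
        echo "libcudart: not in linker cache"
        rc=1
    fi
    return "$rc"
}

check_cuda_env || echo "Revisit steps 2 to 4 above."
```

If nvcc is missing, the PATH line in ~/.bash_profile is the first thing to recheck. If libcudart is missing, recheck gpu.conf and rerun ldconfig.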


Hint: On 64-bit machines, an ld error about loading cudart may occur. To fix it, try these two steps as root:

  • ln -s /usr/local/cuda/lib64/libcudart.so /usr/lib/libcudart.so

  • ln -s /usr/lib64/libXi.so.6 /usr/lib64/libXi.so
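
The lib versus lib64 choice that this hint and the earlier steps depend on can be made by a script instead of by hand. A minimal sketch, assuming the /usr/local/cuda prefix used in this tutorial and considering only x86 machines; cuda_lib_dir is a hypothetical helper name.

```shell
#!/bin/sh
# Print the CUDA library directory for a machine type.
# cuda_lib_dir is a hypothetical helper; only x86 machine names are handled.
cuda_lib_dir() {
    machine=${1:-$(uname -m)}
    case "$machine" in
        x86_64|amd64) echo "/usr/local/cuda/lib64" ;;
        *)            echo "/usr/local/cuda/lib" ;;
    esac
}

echo "Library directory for this machine: $(cuda_lib_dir)"
```

The printed directory is the one to use in LD_LIBRARY_PATH and in gpu.conf.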


Check Installation

  • Edit /opt/sample/C/common/common.mk and set the CUDA install path to /usr/local/cuda

  • Go to /opt/sample/C

  • Run make to compile the samples. If any errors occur, recheck the previous steps

  • Optionally, execute a sample: /opt/sample/C/bin/linux/release/bandwidthTest
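
The build and the optional sample run can be wrapped into one function. This is a sketch under the tutorial's assumptions: the SDK lives in /opt/sample/C and make places binaries under bin/linux/release. The name check_cuda_samples is made up for this example.

```shell
#!/bin/sh
# Build the SDK samples and run bandwidthTest as a smoke test.
# Usage: check_cuda_samples [SDK_DIR]; SDK_DIR defaults to the tutorial's path.
# check_cuda_samples is a hypothetical helper name.
check_cuda_samples() {
    sdk_dir=${1:-/opt/sample/C}
    if [ ! -d "$sdk_dir" ]; then
        echo "SDK directory not found: $sdk_dir" >&2
        return 1
    fi
    # The subshell keeps the caller's working directory unchanged.
    (
        cd "$sdk_dir" || exit 1
        make || exit 1
        ./bin/linux/release/bandwidthTest
    )
}
```

Calling check_cuda_samples with no argument uses the default path. A nonzero return value means the directory is missing, the build failed, or the sample itself reported an error.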


Create a New Project (assuming all the steps above completed successfully)

  • cd /opt/sample/C/src

  • cp -R template/ yourprojectName

  • cd yourprojectName

  • Edit the Makefile

 
# Add source files here
EXECUTABLE	:= yourprojectName
# Cuda source files (compiled with cudacc)
CUFILES		:= yourprojectName.cu

                                    
(Make your changes to the yourprojectName.cu and yourprojectName_kernel.cu files, then build.)
  • make
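
The copy and Makefile edit above can be scripted. The following is a rough sketch, not the SDK's own tooling: new_cuda_project is a hypothetical helper, and it assumes the template's main source file is named template.cu, which this tutorial does not state. Only the EXECUTABLE and CUFILES lines are rewritten. Kernel files and any other Makefile variables are left for manual editing.

```shell
#!/bin/sh
# Copy the SDK template project and point its Makefile at the new name.
# Usage: new_cuda_project NAME [SRC_DIR]; SRC_DIR defaults to the tutorial's path.
# new_cuda_project is a hypothetical helper name.
new_cuda_project() {
    name=$1
    src_dir=${2:-/opt/sample/C/src}
    case "$name" in
        ''|*[!A-Za-z0-9_]*)
            echo "usage: new_cuda_project NAME [SRC_DIR] (letters, digits, _)" >&2
            return 2 ;;
    esac
    [ -d "$src_dir/template" ] || { echo "no template in $src_dir" >&2; return 1; }
    [ ! -e "$src_dir/$name" ] || { echo "$src_dir/$name already exists" >&2; return 1; }

    cp -R "$src_dir/template" "$src_dir/$name" || return 1

    # Assumption: the main source is called template.cu; rename it if present.
    if [ -f "$src_dir/$name/template.cu" ]; then
        mv "$src_dir/$name/template.cu" "$src_dir/$name/$name.cu"
    fi

    # Rewrite the two Makefile variables; every other line stays untouched.
    sed -e "s|^EXECUTABLE[[:space:]]*:=.*|EXECUTABLE := $name|" \
        -e "s|^CUFILES[[:space:]]*:=.*|CUFILES := $name.cu|" \
        "$src_dir/$name/Makefile" > "$src_dir/$name/Makefile.new" &&
        mv "$src_dir/$name/Makefile.new" "$src_dir/$name/Makefile"
}
```

For example, new_cuda_project yourprojectName creates /opt/sample/C/src/yourprojectName. The function refuses to overwrite an existing directory, so it is safe to rerun.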


Execute the GPU program

  • ../../bin/linux/release/yourprojectName
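
When the program is launched from a script, its exit status is the usual way to tell whether the run succeeded. A minimal sketch, assuming the program follows the shell convention of exiting with 0 on success; run_and_report is a hypothetical helper name.

```shell
#!/bin/sh
# Run a command and report success or failure based on its exit status.
# run_and_report is a hypothetical helper name.
run_and_report() {
    "$@"
    rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "OK: $1"
    else
        echo "FAILED ($rc): $1"
    fi
    return "$rc"
}
```

From the project directory, run_and_report ../../bin/linux/release/yourprojectName runs the binary directly and prints the outcome after the program's own output.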


Sample Code:

Makefile:
 
################################################################################

# Add source files here
EXECUTABLE	:= saj
# Cuda source files (compiled with cudacc)
CUFILES		:= saj.cu
# C/C++ source files (compiled with gcc / c++)
CCFILES		:=


################################################################################
# Rules and targets

include ../../common/common.mk

                                    
Source Code

 
/*
Hello world program that computes the element-wise sum of two arrays of size N on the GPU
(blockDim, blockIdx and grid concepts are not used, so that anybody can get familiar with CUDA)

@author Sajan Kumar.S
@email: nospam+ammasajan[A.T]gmail[.]com
*/


#include <stdio.h>
#include <stdlib.h>
#define N 20 // 20 elements

__global__ void vecAdd(int *A, int  *B, int *C){
         int i=threadIdx.x;

         __shared__ int s_A[N],s_B[N],s_C[N]; // N is limited by the size of shared memory

        // stage the inputs in shared memory
        s_A[i]=A[i];
        s_B[i]=B[i];

        __syncthreads();

        // without shared memory this would simply be C[i]=A[i]+B[i];
        s_C[i]=s_A[i]+s_B[i]; // compute the sum of the elements
        __syncthreads();

        C[i]=s_C[i];
}

int main(){

        int *h_a=0,*h_b=0,*h_c=0;
        int *d_a=0,*d_b=0,*d_c=0;
        int memSize=N*sizeof(int);

        // allocate host memory size of N
        h_a=(int *)malloc(memSize);
        h_b=(int *)malloc(memSize);
        h_c=(int *)malloc(memSize);

        // allocate GPU memory size of N
        cudaMalloc((void **)&d_a,memSize);
        cudaMalloc((void **)&d_b,memSize);
        cudaMalloc((void **)&d_c,memSize);

        // Init values to A and B arrays(clearing C array)
        for(int i=0;i<N;i++){
                h_a[i]=i+2;
                h_b[i]=i+3;
                h_c[i]=0;
        }

        // Copied the values to GPU arrays A and B
        cudaMemcpy(d_a,h_a,memSize,cudaMemcpyHostToDevice);
        cudaMemcpy(d_b,h_b,memSize,cudaMemcpyHostToDevice);

        // printing the A array and B array on CPU
        printf("\n Array A : \n");
        for(int i=0;i<N;i++)
                printf("%d\t",h_a[i]);
        printf("\n Array B : \n");
        for(int i=0;i<N;i++)
                printf("%d\t",h_b[i]);
        printf("\nCalculating Sum : ");
        vecAdd<<<1, N>>>(d_a,d_b,d_c); // one block of N threads

        // copy the output C from GPU memory to host memory
        cudaMemcpy(h_c,d_c,memSize,cudaMemcpyDeviceToHost);

        printf("\nSum of Arrays: \n");
        for(int i=0;i<N;i++)
                printf("%d\t",h_c[i]);

        cudaFree(d_a);
        cudaFree(d_b);
        cudaFree(d_c);

        free(h_a);
        free(h_b);
        free(h_c);

        return 0; // 0 signals success to the calling shell
}

                                    
