
TMarques

Member Since 12 Jun 2008
Offline Last Active May 19 2013 05:53 AM

Topics I've Started

CUDA function calling, what's the best approach?

10 May 2013 - 11:53 AM

Hello,

 

I'm working on a CUDA kernel and something interesting, I guess, crossed my mind. Maybe you could help me.

 

Say I have these two device functions:

 

__device__ float SimpleKernel1(float value1, float value2, float value3 ...float valueN)
{
    return value1 + value2 + value3 ... + valueN;
}

 

__device__ float SimpleKernel2(float *values)
{
    return values[0] + values[1] + values[2] ... + values[N-1];
}

 

Would SimpleKernel2 run faster? I know many factors are in play (e.g. memory interface, clock speed, number of threads), but speaking as generally as possible: a call to SimpleKernel1 passes sizeof(float)*N bytes, while a call to SimpleKernel2 passes only sizeof(float *) bytes, so maybe this would result in a significant speedup. Is this right or wrong, and does it really matter?
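To make the comparison concrete, here is a minimal host-side C++ sketch (plain C++, not CUDA) that assumes N = 4. The names sum_by_value, sum_by_pointer, bytes_by_value and bytes_by_pointer are hypothetical and only illustrate the two calling styles and the argument sizes mentioned above. It says nothing about how a GPU compiler actually passes or inlines the arguments.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for N in the post; chosen only for illustration.
constexpr std::size_t N = 4;

// By-value form: each of the N floats is a separate argument.
float sum_by_value(float v1, float v2, float v3, float v4) {
    return v1 + v2 + v3 + v4;
}

// Pointer form: only the address is passed; the floats are read through it.
float sum_by_pointer(const float* values) {
    assert(values != nullptr);
    float total = 0.0f;
    for (std::size_t i = 0; i < N; ++i) {
        total += values[i];  // still loads all N floats from memory
    }
    return total;
}

// Argument sizes at the call site, as described in the question.
constexpr std::size_t bytes_by_value = sizeof(float) * N;
constexpr std::size_t bytes_by_pointer = sizeof(float*);
```

Note that the pointer version passes fewer argument bytes but still has to read the same N floats through the pointer, so the amount of data the function touches is the same in both cases.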

 

Thanks!


OpenCL/OpenGL interoperability error

23 March 2013 - 11:45 AM

Hello,

 

I'm trying to set up interop between CL and GL. The tutorials are very straightforward and everything appears to be set up correctly. However, when I try to get a cl_mem from a VBO, clCreateFromGLBuffer fails with CL_INVALID_GL_OBJECT and I don't know what I'm doing wrong.

 

#include <SDL/SDL.h>
#include <GL/glew.h>
#include <GL/glx.h>
#include <stdio.h>

#ifdef __APPLE__
	#include <OpenCL/opencl.h>
#else
	#include <CL/cl.h>
	#include <CL/cl_gl.h>
#endif

//#define FULLSCREEN

int main(int argc, char *argv[])
{

	//******************
	//Setting up window.
	//******************

	int videoFlags;
	SDL_VideoInfo *videoInfo;

	SDL_Init(SDL_INIT_VIDEO);
	videoInfo = (SDL_VideoInfo *)SDL_GetVideoInfo();
	if(!videoInfo)
	{
		printf("Error: SDL failed to determine video info.\n");
		return 1;
	}

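	//SDL_GL_DOUBLEBUFFER is a GL attribute, not a video-mode flag, so it is set separately.
	SDL_GL_SetAttribute(SDL_GL_DOUBLEBUFFER, 1);
	videoFlags = SDL_OPENGL | SDL_HWSURFACE;
	#ifdef FULLSCREEN
		videoFlags |= SDL_FULLSCREEN;
	#endif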

	SDL_SetVideoMode(videoInfo->current_w,
			 videoInfo->current_h,
			 videoInfo->vfmt->BitsPerPixel,
			 videoFlags);

	//************************
	//Setting up openGL state.
	//************************

	glewInit();

	//******************************************************
	//Setting up openCL state and share context with openGL.
	//******************************************************

	cl_int state;
	cl_platform_id platform[100];
	cl_context context;
	cl_device_id devices[100];

	//Get platform.
	cl_uint numberOfPlatforms = 0;
	if(clGetPlatformIDs(100, platform, &numberOfPlatforms) != CL_SUCCESS)
	{
		printf("OpenCL Error: Platform couldn't be found.\n");
		return 0;
	}
	else
	{
		printf("%i platforms found.\n", numberOfPlatforms);
	}

	//Parameters needed to bind OpenGL's context to OpenCL's.
	cl_context_properties properties[] = {	CL_GL_CONTEXT_KHR, (cl_context_properties) glXGetCurrentContext(),
						CL_GLX_DISPLAY_KHR, (cl_context_properties) glXGetCurrentDisplay(),
						CL_CONTEXT_PLATFORM, (cl_context_properties) platform[0],
						0};

	//Find openGL devices.
	typedef CL_API_ENTRY cl_int (CL_API_CALL *CLpointer)(const cl_context_properties *properties, cl_gl_context_info param_name, size_t param_value_size, void *param_value, size_t *param_value_size_ret);
	CLpointer myCLGetGLContextInfoKHR = (CLpointer)clGetExtensionFunctionAddressForPlatform(platform[0], "clGetGLContextInfoKHR");
	if(!myCLGetGLContextInfoKHR)
	{
		//The extension entry point can be missing if the platform lacks GL sharing.
		printf("OpenCL Error: clGetGLContextInfoKHR is not available on this platform.\n");
		return 0;
	}

	size_t size;
	state = myCLGetGLContextInfoKHR(properties, CL_DEVICES_FOR_GL_CONTEXT_KHR, 100*sizeof(cl_device_id), devices, &size);

	if(state != CL_SUCCESS)
	{
		printf("OpenCL Error: Devices couldn't be resolved.\n");
		return 0;
	}
	else
	{
		printf("%i devices support OpenGL/OpenCL interoperability.\n", (int)(size/sizeof(cl_device_id)));
	}

	//Create context.
	context = clCreateContext(properties, 1, &devices[0], NULL, NULL, &state);
	if(state != CL_SUCCESS)
	{
		printf("OpenCL Error: Context couldn't be created.\n");
		return 0;
	}
	else
	{
		printf("Context successfully created.\n");
	}

	//Create VBO and grant read and write access to openCL.
	GLuint vbo;
	glGenBuffers(1, &vbo);
	glBindBuffer(GL_ARRAY_BUFFER, vbo);
	glBufferData(GL_ARRAY_BUFFER, 400, NULL, GL_STATIC_DRAW);
	cl_mem mem = clCreateFromGLBuffer(context, CL_MEM_WRITE_ONLY, vbo, &state);
	if(state != CL_SUCCESS)
	{
		printf("Error creating memory object from VBO! (CL_INVALID_GL_OBJECT = -60) and (state = %i)\n", state);
	}
	else
	{
		printf("Memory object created successfully from VBO!\n");
	}

	printf("GLCL finished. Bye!\n");
	return 0;
}

Has anyone ever encountered this issue and dealt with it successfully? I've been searching the AMD forums for a solution, but so far I have found nothing useful.

 

I'm running this program on Ubuntu and the programming language is C++. I've managed to get a plain OpenCL program working; the problem is getting OpenCL to access OpenGL objects. Also, I'm using SDL to create the OpenGL window.
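Since the program above only prints the raw status integer, a small lookup can make the output easier to read. This is a minimal sketch: cl_error_name is a hypothetical helper, not part of OpenCL, and it covers only the status codes that clCreateFromGLBuffer is documented to return. The numeric values are copied from the Khronos headers so the snippet compiles without an OpenCL SDK.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: maps a few OpenCL status codes to their names.
// Values match the Khronos cl.h / cl_gl.h headers; they are redeclared
// as literals here so the sketch builds without an OpenCL SDK installed.
std::string cl_error_name(int code) {
    switch (code) {
        case 0:   return "CL_SUCCESS";
        case -5:  return "CL_OUT_OF_RESOURCES";
        case -6:  return "CL_OUT_OF_HOST_MEMORY";
        case -30: return "CL_INVALID_VALUE";
        case -34: return "CL_INVALID_CONTEXT";
        case -60: return "CL_INVALID_GL_OBJECT";
        default:  return "unknown OpenCL error";
    }
}
```

In the program above it could be used as printf("state = %s\n", cl_error_name(state).c_str()); in place of printing the integer.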

 

Thanks and best regards!


Parallel programming question.

18 February 2013 - 11:34 AM

Hello,

 

My question is related to OpenCL, but I guess it applies to any parallel API (CUDA, MPI, pthreads...).

 

I have a kernel that's executed thousands of times, and each kernel instance holds a reference to a single memory location called "__global char *comp". Before executing the kernels, the host initializes "comp" to "true" and, during execution, a kernel instance only accesses "comp" if it needs to set its value to "false".

 

Do I have to worry about memory access synchronization in this particular case? I'm thinking it wouldn't matter if two kernel instances accessed the memory location concurrently, as the only possible outcome would be for it to hold the value "false".
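The pattern can be sketched on the host with C++ threads. This is an analogue, not OpenCL code, and the names run_clear_only, workers and first_writer are hypothetical. In C++, concurrent unsynchronized writes to a plain variable are formally a data race even when every writer stores the same value, so the sketch uses std::atomic<bool> with relaxed ordering, which adds no ordering guarantees beyond making each store well defined.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Starts `workers` threads. Every thread whose index is >= first_writer
// clears the shared flag; the others never touch it.
// Returns the flag value after all threads have finished.
bool run_clear_only(int workers, int first_writer) {
    assert(workers >= 0);
    std::atomic<bool> comp{true};  // the "host" initialises the flag to true
    std::vector<std::thread> pool;
    pool.reserve(static_cast<std::size_t>(workers));
    for (int i = 0; i < workers; ++i) {
        pool.emplace_back([&comp, i, first_writer] {
            if (i >= first_writer) {
                // Several threads may store the same value concurrently.
                comp.store(false, std::memory_order_relaxed);
            }
        });
    }
    for (auto& t : pool) {
        t.join();  // joining makes all stores visible to the caller
    }
    return comp.load(std::memory_order_relaxed);
}
```

However many threads write, the flag can only end up false if at least one thread cleared it and true otherwise, which is the outcome described above.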


SDL_Surface question.

19 January 2013 - 05:24 PM

Hello,

 

I'm trying to understand how SDL_Surfaces work. When you create one with the SDL_HWSURFACE flag, is the buffer stored in the GPU's memory? (Assuming there's a GPU to output video content.)

 

If so, does the pixels member of the SDL_Surface structure point to a location in the GPU's memory instead of the host's memory?

 

I tried reading the documentation at http://sdl.beuc.net/sdl.wiki/SDL_Surface, but couldn't find the answer there.

 

I apologise for any misunderstandings; I'm a newbie in this area.

 

Thanks in advance!


Debugging issue.

19 July 2012 - 02:47 PM

So, I'll try to keep it simple since I don't expect anybody to solve my problem; I'm just looking for some tips to help me get started on the issue.

I have a program that runs fine under "Run"; however, when run under "Debug", it crashes and the only information I get is: No source available for "pthread_join() at 0x3682609080"

What does this mean? Or is it too generic for a straight answer?

Language: C++
OS: Linux | Distribution
Environment: Eclipse
Debugger: GDB (Eclipse default I believe)

Thanks in advance!
