assainator

OpenGL
OpenCL OpenGL interop

Hello all,

I've been trying to put together an OpenCL particle demo, but I get an error and I have no idea what causes it or how to fix it.

Once execution reaches the call to clEnqueueNDRangeKernel(...), I get the error "Access violation writing location 0xFFFFFFFD", yet none of the arguments I pass has that value.

I am at a loss as to what the problem is.

My particle update code:

void UpdateFunc(int)
{
	glutTimerFunc(30, UpdateFunc, 30);

	// Get workgroup size
	size_t global = NUM_PARTICLES;
	size_t local;
	cl_int err = clGetKernelWorkGroupInfo(kernel, deviceID, CL_KERNEL_WORK_GROUP_SIZE, sizeof(local), &local, nullptr);
	if(err != CL_SUCCESS)
	{
		std::cout << "Couldn't get work group info!" << std::endl;
		getchar();
		exit(-1);
	}

	// Acquire position vbo.
	err = clEnqueueAcquireGLObjects(commandQueue, 1, &PositionArray, 0, nullptr, nullptr);
	if(err != CL_SUCCESS)
	{
		std::cout << "Error when acquiring OGL object." << std::endl;
		getchar();
		exit(-1);
	}

	// run kernel.
	cl_event wait;
	err = clEnqueueNDRangeKernel(commandQueue, kernel, 1, nullptr, &global, &local, 0, nullptr, &wait);
	if(err != CL_SUCCESS)
	{
		std::cout << "Error when running kernel." << std::endl;
		getchar();
		exit(-1);
	}
	clWaitForEvents(1, &wait);
	clFinish(commandQueue);

	// release position vbo.
	err = clEnqueueReleaseGLObjects(commandQueue, 1, &PositionArray, 0, nullptr, nullptr);

	if(err != CL_SUCCESS)
	{
		std::cout << "Error when releasing OGL object." << std::endl;
		getchar();
		exit(-1);
	}

	glutPostRedisplay();
}

 

And my whole source:

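#include <cstdio>   // FILE, fopen, fread, getchar
#include <cstdlib>  // malloc, rand, exit
#include <cstring>  // strdup
#include <iostream>
#include <GL/glew.h>
#include <GL/glut.h>
#include <CL/cl.h>
#include <CL/cl_gl.h>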

#define NUM_PARTICLES 1024

cl_platform_id platformID;
cl_device_id deviceID;
cl_program program;
cl_context context;
cl_command_queue commandQueue;
cl_kernel kernel;
cl_mem VelocityArray;
cl_mem PositionArray;
cl_event event;

GLuint vbo;

void RenderFunc();
void UpdateFunc(int);

char* getDeviceName(cl_device_id device);

int main(int argc, char **argv)
{
	std::cout << "Starting..." << std::endl;
	std::cout << "Using GPU config for acceleration." << std::endl;

	std::cout << "Simulating with " << NUM_PARTICLES << " particles." << std::endl;
	

	// Initialize glut.
	glutInit(&argc, argv);
	glutInitDisplayMode(GLUT_RGB | GLUT_DOUBLE | GLUT_DEPTH);
	glutInitWindowSize(1024, 768);
	glutInitWindowPosition(0,0);

	glutCreateWindow("OpenCL + OpenGL particles");

	// initialize glew
	glewInit();


	// Get first OpenCL platform.
	cl_int err = clGetPlatformIDs(1, &platformID, nullptr);
	if(err != CL_SUCCESS)
	{
		std::cout << "Couldn't find an OpenCL platform!" << std::endl;
		getchar();
		return -1;
	}

	// Get first OpenCL gpu device.
	err = clGetDeviceIDs(platformID, CL_DEVICE_TYPE_GPU, 1, &deviceID, nullptr);
	if(err != CL_SUCCESS)
	{
		std::cout << "Couldn't find an OpenCL-enabled GPU!" << std::endl;
		getchar();
		return -1;
	}
	std::cout << "Using OpenCL device: " << getDeviceName(deviceID) << std::endl;

	// Create OpenCL context properties
	cl_context_properties props[] =
	{
		CL_GL_CONTEXT_KHR, (cl_context_properties)wglGetCurrentContext(),
		CL_WGL_HDC_KHR, (cl_context_properties)wglGetCurrentDC(),
		CL_CONTEXT_PLATFORM, (cl_context_properties)platformID,
		0
	};

	// Create an OpenCL context.
	context = clCreateContext(props, 1, &deviceID, nullptr, nullptr, &err);
	if(!context || err != CL_SUCCESS)
	{
		std::cout << "Couldn't create an OpenCL context." << std::endl;
		getchar();
		return -1;
	}

	// Create a command queue.
	commandQueue = clCreateCommandQueue(context, deviceID, 0, &err);
	if(!commandQueue || err != CL_SUCCESS)
	{
		std::cout << "Couldn't create a command queue for OpenCL." << std::endl;
		getchar();
		return -1;
	}

	// Load OpenCL code.
	FILE* stream = fopen("particles.kernel.cl", "rb");
	fseek(stream, 0, SEEK_END);
	unsigned int len = ftell(stream);
	fseek(stream, 0, SEEK_SET);
	char * KernelSource = (char*)malloc(len + 1);
	fread(KernelSource, sizeof(char), len, stream);
	KernelSource[len] = '\0';
	fclose(stream);

	// Create a program.
	program = clCreateProgramWithSource(context, 1, (const char**)&KernelSource, nullptr, &err);
	if(!program || err != CL_SUCCESS)
	{
		std::cout << "Couldn't create a program." << std::endl;
		getchar();
		return -1;
	}

	// Build the program.
	err = clBuildProgram(program, 0, nullptr, nullptr, nullptr, nullptr);
	if(err != CL_SUCCESS)
	{
		std::cout << "Couldn't build the program:" << std::endl;
		std::cout << "Build log:" << std::endl;

		char buffer[2048];
		clGetProgramBuildInfo(program, deviceID, CL_PROGRAM_BUILD_LOG, sizeof(buffer), buffer, nullptr);
		std::cout << buffer << std::endl << "-- End of build log -- " << std::endl;

		getchar();
		return -1;
	}

	// Create a kernel.
	kernel = clCreateKernel(program, "particle_cycle", &err);
	if(!kernel || err != CL_SUCCESS)
	{
		std::cout << "Couldn't create a kernel from the program.";
		getchar();
		return -1;
	}

	
	glutDisplayFunc(RenderFunc);
	glutTimerFunc(30, UpdateFunc, 30);

	// initialize opengl
	glClearColor(0,0,0,0);
	glViewport(0,0,1024, 768);

	glMatrixMode(GL_PROJECTION);
	glLoadIdentity();
	gluPerspective(90.0, (float)1024/768, 0.1, 1000.0);

	glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
	glMatrixMode(GL_MODELVIEW);
	glLoadIdentity();
	glTranslatef(0,0, -1.0f);

	// Create position VBO
	glGenBuffers(1, &vbo);
	glBindBuffer(GL_ARRAY_BUFFER, vbo);

	// 4 floats (x, y, z, w) per particle; fill every component.
	float* initialPos = new float[4 * NUM_PARTICLES];
	for(unsigned int i = 0; i < 4 * NUM_PARTICLES; i++)
	{
		initialPos[i] = rand() / (float)RAND_MAX;
	}

	// Upload data to OGL
	glBufferData(GL_ARRAY_BUFFER, 4 * sizeof(float) * NUM_PARTICLES, initialPos, GL_DYNAMIC_DRAW);

	// Bind vbo to cl buffer
	PositionArray = clCreateFromGLBuffer(context, CL_MEM_READ_WRITE, vbo, nullptr);

	// Create velocity array.
	float *velocity = new float[4 * NUM_PARTICLES];
	for(unsigned int i = 0; i < 4 * NUM_PARTICLES; i++)
	{
		velocity[i] = rand() / (float)RAND_MAX;
	}

	// Create OpenCL buffer object.
	VelocityArray = clCreateBuffer(context, CL_MEM_READ_ONLY, 4 * sizeof(float) * NUM_PARTICLES, nullptr, nullptr);
	if(!VelocityArray)
	{
		std::cout << "Couldn't create a buffer for the particle velocities." << std::endl;
		getchar();
		return -1;
	}

	// Upload data and associate buffer with the context.
	err = clEnqueueWriteBuffer(commandQueue, VelocityArray, CL_TRUE, 0, 4 * sizeof(float) * NUM_PARTICLES, velocity, 0, nullptr, nullptr);
	if(err != CL_SUCCESS)
	{
		std::cout << "Couldn't upload particle velocities to the context." << std::endl;
		getchar();
		return -1;
	}

	// Set the kernel arguments.
	float dt = 0.03;
	clSetKernelArg(kernel, 0, sizeof(vbo), &vbo); 
	clSetKernelArg(kernel, 1, sizeof(cl_mem), &VelocityArray);
	clSetKernelArg(kernel, 2, sizeof(float), &dt);

	// start application
	glutMainLoop();
}

void RenderFunc()
{
	// Clear
	glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

	//Enable blending, set point size
	glEnable(GL_BLEND);
	glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
	glEnable(GL_POINT_SMOOTH);
	glPointSize(5.0f);

	// Bind buffer
	glBindBuffer(GL_ARRAY_BUFFER, vbo);

	// Enable drawing
	glVertexPointer(4, GL_FLOAT, 0, nullptr);
	glEnableClientState(GL_VERTEX_ARRAY);

	// Draw the array.
	glDrawArrays(GL_POINTS, 0, NUM_PARTICLES);

	// Disable drawing.
	glDisableClientState(GL_VERTEX_ARRAY);
	glBindBuffer(GL_ARRAY_BUFFER, 0);

	// Swap buffers
	glutSwapBuffers();
}

void UpdateFunc(int)
{
	glutTimerFunc(30, UpdateFunc, 30);

	// Get workgroup size
	size_t global = NUM_PARTICLES;
	size_t local;
	cl_int err = clGetKernelWorkGroupInfo(kernel, deviceID, CL_KERNEL_WORK_GROUP_SIZE, sizeof(local), &local, nullptr);
	if(err != CL_SUCCESS)
	{
		std::cout << "Couldn't get work group info!" << std::endl;
		getchar();
		exit(-1);
	}

	// Acquire position vbo.
	err = clEnqueueAcquireGLObjects(commandQueue, 1, &PositionArray, 0, nullptr, nullptr);
	if(err != CL_SUCCESS)
	{
		std::cout << "Error when acquiring OGL object." << std::endl;
		getchar();
		exit(-1);
	}

	// run kernel.
	cl_event wait;
	err = clEnqueueNDRangeKernel(commandQueue, kernel, 1, nullptr, &global, &local, 0, nullptr, &wait);
	if(err != CL_SUCCESS)
	{
		std::cout << "Error when running kernel." << std::endl;
		getchar();
		exit(-1);
	}
	clWaitForEvents(1, &wait);
	clFinish(commandQueue);

	// release position vbo.
	err = clEnqueueReleaseGLObjects(commandQueue, 1, &PositionArray, 0, nullptr, nullptr);

	if(err != CL_SUCCESS)
	{
		std::cout << "Error when releasing OGL object." << std::endl;
		getchar();
		exit(-1);
	}

	glutPostRedisplay();
}

char* getDeviceName(cl_device_id device)
{
	char deviceName[512];
	size_t deviceNameSize;
	cl_int err = clGetDeviceInfo(device, CL_DEVICE_NAME, sizeof(deviceName), deviceName, &deviceNameSize);

	if(err != CL_SUCCESS)
	{
		return nullptr;
	}
	else
	{
		return strdup(deviceName);
	}
}
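
A side note on the source above: the return values of clSetKernelArg and clCreateFromGLBuffer are never checked, so a bad argument can go unnoticed until a later call fails. A small checking helper can surface such mistakes earlier, at least on implementations that validate their arguments. The following is only a minimal sketch: checkCl is a hypothetical helper name, and cl_int, CL_SUCCESS and CL_INVALID_MEM_OBJECT are declared as stand-ins so the snippet compiles without the OpenCL headers (in the real program they come from <CL/cl.h>).

```cpp
#include <stdexcept>
#include <string>

// Stand-ins so this sketch builds without the OpenCL SDK.
// In the real program, remove these and include <CL/cl.h> instead.
using cl_int = int;
constexpr cl_int CL_SUCCESS = 0;
constexpr cl_int CL_INVALID_MEM_OBJECT = -38;

// Hypothetical helper: turn any non-success OpenCL status code into an
// exception that names the call that failed.
void checkCl(cl_int status, const std::string& what)
{
    if (status != CL_SUCCESS)
    {
        throw std::runtime_error(what + " failed with OpenCL error " +
                                 std::to_string(status));
    }
}
```

Each call that returns a status could then be wrapped, for example checkCl(clSetKernelArg(kernel, 1, sizeof(cl_mem), &VelocityArray), "clSetKernelArg(1)"). Whether an invalid argument is rejected at that point or only shows up later depends on the driver.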

The kernel:

__kernel void particle_cycle(__global float4* pos, __global const float4* vel, const float dt)
{
	const size_t i = get_global_id(0);

	float4 v = vel[i];
	float4 p = pos[i];

	p += (v * dt);

	pos[i] = p;
}
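
For checking the GPU results, the kernel's arithmetic (each position advanced by velocity times dt) can be mirrored on the CPU. This is a minimal sketch under assumptions: Float4 is a stand-in for OpenCL's float4, and particleCycleReference is a hypothetical name that does not appear in the demo.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Stand-in for OpenCL's float4 (x, y, z, w); an assumption for this sketch.
using Float4 = std::array<float, 4>;

// CPU reference for particle_cycle: pos[i] += vel[i] * dt for every particle.
// Only the particles present in both arrays are updated.
void particleCycleReference(std::vector<Float4>& pos,
                            const std::vector<Float4>& vel,
                            float dt)
{
    const std::size_t n = pos.size() < vel.size() ? pos.size() : vel.size();
    for (std::size_t i = 0; i < n; ++i)
    {
        for (std::size_t c = 0; c < 4; ++c)
        {
            pos[i][c] += vel[i][c] * dt;
        }
    }
}
```

Running this on a copy of the initial positions and comparing against the buffer read back from the device gives a quick sanity check, allowing for small floating-point differences.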

 

EDIT:

I forgot my setup:

- Windows 8 Pro x64
- Visual Studio 2012 Express
- AMD APP SDK v2.8
- AMD Radeon HD 7850

Thanks a lot in advance for any help.

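1) clSetKernelArg(kernel, 0, sizeof(vbo), &vbo) should be clSetKernelArg(kernel, 0, sizeof(cl_mem), &PositionArray). The kernel argument has to be the cl_mem object returned by clCreateFromGLBuffer, not the OpenGL buffer name.
2) Remove clWaitForEvents(1, &wait). It is not necessary, because clFinish(commandQueue) already blocks until the kernel has finished.
3) clEnqueueNDRangeKernel can take NULL as the local size argument, in which case the implementation picks the work-group size itself.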
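
On the local size: if one is passed explicitly, OpenCL 1.x requires the global size to be a multiple of it, otherwise clEnqueueNDRangeKernel returns CL_INVALID_WORK_GROUP_SIZE. Passing NULL avoids that constraint. If an explicit local size is kept, the global size can be rounded up for particle counts that do not divide evenly. A minimal sketch, where roundUpToMultiple is a hypothetical helper that is not part of the posted code:

```cpp
#include <cstddef>

// Round `count` up to the next multiple of `groupSize`.
// Assumes groupSize > 0.
std::size_t roundUpToMultiple(std::size_t count, std::size_t groupSize)
{
    const std::size_t remainder = count % groupSize;
    return remainder == 0 ? count : count + (groupSize - remainder);
}
```

If the global size is padded this way, the kernel needs a bounds check (for example, returning early when get_global_id(0) is not below the particle count) so the extra work-items do not touch memory past the end of the buffers.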