
jmakitalo

Member Since 05 Apr 2006
Offline Last Active Oct 15 2014 07:19 AM

Topics I've Started

Threads with SDL within C++ classes

24 September 2014 - 07:26 AM

I'm planning to try out SDL 1.2 threads in my cross-platform game to run some specific tasks, starting with a terrain data streamer (even though there may be better ways to do that directly). First I'm trying to figure out a C++ class-based design in which the required mutexes are contained within the class, so that the class interface is safe to use.

 

Below I tried to envision a class for processing some data (here just an array of integers) in a separate thread. The idea is that the main thread creates an instance of the class and then requests work via requestWorkToBeDone(). The main thread can then query the status of the work with getStatus() and acquire the processed data with getData(). The class should be responsible for the appropriate mutexes.

enum EStatus {
 statusIdle,
 statusWorking,
 statusDone
};

class MyClass
{
public:
 ~MyClass();
 bool init(); // Create thread and mutex, allocate data.

 static int threadFuncMediator(void *p) // Called by SDL_CreateThread to run threadFunc() in a separate thread.
 {
  return static_cast<MyClass*>(p)->threadFunc();
 }

 void getStatus(EStatus &_status) const
 {
  SDL_mutexP(mutex);
  _status = status;
  SDL_mutexV(mutex);
 }

 bool requestWorkToBeDone(int someParameter)
 {
  SDL_mutexP(mutex);
  if(status!=statusIdle){
   SDL_mutexV(mutex);
   return false;
  }

  status = statusWorking;

  // Use someParameter to specify work ...

  SDL_mutexV(mutex);

  return true;
 }

 bool getData(int *_data)
 {
  SDL_mutexP(mutex);
  if(status!=statusDone){
   SDL_mutexV(mutex);
   return false;
  }
  memcpy(_data, data, sizeof(int)*numData);
  status = statusIdle;
  SDL_mutexV(mutex);

  return true;
 }
private:
 SDL_Thread *thread;
 SDL_mutex *mutex;

 EStatus status;

 int numData;
 int *data;

 // This runs in a thread created by init() via threadFuncMediator().
 // Calls getStatus() to see if work should be done.
 // When work is done, calls setStatus(statusDone);
 int threadFunc();
};

The thing I'm worried about is that threadFunc() also has to call getStatus() to see whether it should do work or not. But the instance of the class is owned by the main thread, so is this function call itself safe, even though data handling within the function is protected by mutexes?
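For what it's worth, here is a minimal sketch of the same pattern using standard C++11 threads instead of SDL (all names here are mine, not SDL's). The point it illustrates: calling a member function from the worker thread is fine by itself; only the shared members (`status`, `parameter`, `result`) need to be guarded, and as long as every access goes through the same mutex the interface stays safe from either thread:

```cpp
#include <cassert>
#include <mutex>
#include <thread>

enum EStatus { statusIdle, statusWorking, statusDone };

class Worker {
public:
    Worker() : status(statusIdle), parameter(0), result(0) {}

    bool requestWork(int param) {
        std::lock_guard<std::mutex> lock(mutex);
        if (status != statusIdle) return false;
        parameter = param;
        status = statusWorking;
        return true;
    }

    // Safe to call from any thread: the only shared state it touches
    // is read under the mutex.
    EStatus getStatus() const {
        std::lock_guard<std::mutex> lock(mutex);
        return status;
    }

    bool getResult(int &out) {
        std::lock_guard<std::mutex> lock(mutex);
        if (status != statusDone) return false;
        out = result;
        status = statusIdle;
        return true;
    }

    // Runs in the worker thread. Polls status via the same mutex-guarded
    // accessor the main thread uses; does one job, then exits.
    void threadFunc() {
        for (;;) {
            if (getStatus() == statusWorking) {
                int p;
                {
                    std::lock_guard<std::mutex> lock(mutex);
                    p = parameter;
                }
                int r = p * 2; // the "work" (placeholder)
                std::lock_guard<std::mutex> lock(mutex);
                result = r;
                status = statusDone;
                return;
            }
        }
    }

private:
    mutable std::mutex mutex; // mutable so getStatus() can be const
    EStatus status;
    int parameter;
    int result;
};
```

Note the busy-poll loop is only for brevity; a condition variable would be the idiomatic way to wait for work.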


Advanced heightfields in Bullet physics

16 September 2014 - 01:19 AM

At the moment I'm using Bullet for physics and collision detection with a basic heightfield terrain. The heightmap is 2048x2048 and so cannot yield very high resolution for my roughly 8 km x 8 km terrain.

 

I'm using a vertex shader for some portions of the terrain, which locally masks a second heightmap that is repeated 100 times over the terrain. The heightmaps and the mask are read in the vertex shader to displace the terrain grid. The grid itself can be quite dense, so that the additional detail from the masked heightmap is visibly utilized. This way I can achieve a nice bumpy appearance for, e.g., forests and fields.

 

The problem is communicating this masked heightfield to the physics part of the game. My quick hack was to make a higher-resolution map, say 4096x4096, in which the two heightmaps are combined, and pass that to Bullet. Of course this still has insufficient resolution in some cases, and it wastes memory.

 

I started wondering whether it would be easy to modify Bullet's btHeightfieldTerrainShape.cpp module to allow for such an additional masked heightmap. A quick look gave me the impression that Bullet just samples the given heightmap on demand and doesn't do any costly pre-processing or caching. This suggests that I could add my masked heightmap easily by modifying the getRawHeightFieldValue function and storing the second heightmap and the mask in btHeightfieldTerrainShape. Can anyone confirm this?
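Independently of Bullet, the sampling I would want getRawHeightFieldValue to perform is just base height plus mask-weighted detail height, mirroring what the vertex shader does. A sketch of that arithmetic (all names are mine, and I use nearest-neighbour sampling for brevity where the shader would filter):

```cpp
#include <cmath>
#include <vector>

// Nearest-neighbour sample of a square single-channel map; (u, v) in
// [0,1) is tiled `repeat` times across the terrain, with wrap-around.
static float sampleWrapped(const std::vector<float> &map, int size,
                           float u, float v, float repeat) {
    int x = static_cast<int>(std::floor(u * repeat * size)) % size;
    int y = static_cast<int>(std::floor(v * repeat * size)) % size;
    if (x < 0) x += size;
    if (y < 0) y += size;
    return map[y * size + x];
}

// Combined height = base heightmap + mask * detail heightmap, where the
// detail map is tiled `detailRepeat` times over the terrain extent.
float combinedHeight(const std::vector<float> &base, int baseSize,
                     const std::vector<float> &detail, int detailSize,
                     const std::vector<float> &mask, int maskSize,
                     float u, float v,
                     float detailRepeat, float detailScale) {
    float h = sampleWrapped(base, baseSize, u, v, 1.0f);
    float m = sampleWrapped(mask, maskSize, u, v, 1.0f);
    float d = sampleWrapped(detail, detailSize, u, v, detailRepeat);
    return h + m * d * detailScale;
}
```

The physics shape and the shader would then agree as long as both use this same formula and the same tiling factor.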

 

Another related topic: in the future I may want to divide my terrain into, say, 1 km x 1 km blocks that have their own heightmaps. The heightmaps would then be stored at varying resolutions and streamed from disk. I would need to be able to update the Bullet heightfield quickly after a stream completes, and this requires that Bullet not do any heavy processing on heightfield updates. Does anyone have experience with anything similar?

 

Thank you in advance.


Efficient instancing in OpenGL

30 June 2014 - 06:32 AM

The game I'm working on should be able to render dense forests with many trees and detailed foliage. I have been using instancing for drawing pretty much everything, but even so, I have lately hit some performance issues.

 

My implementation is based on storing instance data in uniforms. I restrict the object transformations so that only translation, uniform scale and rotation about one axis are allowed. For the rotation part, I pass sin(angle) and cos(angle) as uniforms, so 6 floats are passed per instance. This way I can easily draw 256 instances at once by invoking glUniform4fv, glUniform2fv and glDrawElementsInstancedBaseVertex per batch. I use this particular draw command because I use large VBOs that store multiple meshes.
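For concreteness, the 6 floats per instance and their expansion into a model matrix might look like this (column-major to match GLSL conventions; the struct and function names are mine, and the shader would do the equivalent expansion on the GPU):

```cpp
#include <array>
#include <cassert>

// Per-instance data as uploaded: one vec4(tx, ty, tz, scale) and one
// vec2(sinA, cosA) per instance, i.e. 6 floats.
struct Instance {
    float tx, ty, tz, scale;
    float sinA, cosA;
};

// Expand to a column-major 4x4 model matrix: rotation about the Y axis,
// uniform scale, then translation.
std::array<float, 16> toMatrix(const Instance &i) {
    float s = i.scale, sa = i.sinA, ca = i.cosA;
    return {
        ca * s, 0.0f,  -sa * s, 0.0f,  // column 0
        0.0f,   s,      0.0f,   0.0f,  // column 1
        sa * s, 0.0f,   ca * s, 0.0f,  // column 2
        i.tx,   i.ty,   i.tz,   1.0f   // column 3
    };
}
```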

 

Lately I have noticed that the performance is too low for my purposes. I used gDEBugger in an attempt to find the bottleneck. The FPS was initially roughly 40. Lowering texture resolution had no effect. Disabling raster commands had negligible effect. Disabling draw commands boosted the FPS to over 100. So I guess the conclusion is that execution is neither CPU nor raster-operation bound, but has to do with vertex processing.

 

I'm also using impostors for the trees, and level of detail for the meshes, but I have the feeling that I should be able to draw more instances of the meshed trees than I currently can. I actually had a quite OK FPS of 80 with just the trees in place, but adding the foliage (a lot of instances of small low-poly meshes) dropped the FPS to 40. Disabling either the trees or the foliage increases the FPS significantly. Disabling the terrain, which uses a lot of polygons, has no effect, so I don't think the issue is just polygon count.

 

Could it be that uploading the uniform data is the limiting factor?

 

For some of the instanced object types, such as the trees, the transformation data is static and is stored in the leaf nodes of a bounding volume hierarchy (BVH) in contiguous arrays, so that glUniform* can be called without further assembly of data. It would then make sense to actually store these arrays in video memory. What is the best way to do this these days? I think VBOs are used in conjunction with glVertexAttribDivisor. To me this does not seem like a very neat approach, as "vertex attributes" are used for something that is clearly of a "uniform" nature. But anyway, I could then make a huge VBO for the entire BVH and store a base instance value and an instance count for each leaf node. To render a leaf node, I would then use glDrawElementsInstancedBaseVertexBaseInstance. This is GL 4.2 core, which might be a bit too high a requirement. Are there better options?

I also have objects (the foliage) for which the transformation data is dynamic (updated occasionally), as they are only placed around the camera. What would be the best way to store/transfer the transformation data in this case?
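The packing I have in mind (one big instance array for the whole BVH, with each leaf remembering its base instance and count for the draw call) can be sketched without any GL calls; the struct and function names here are mine:

```cpp
#include <cassert>
#include <vector>

struct Leaf {
    std::vector<float> instanceData; // 6 floats per instance, as before
    int baseInstance = 0;            // filled in by packLeaves
    int instanceCount = 0;           // filled in by packLeaves
};

// Concatenate all leaves' instance data into one array (uploaded once
// into a single VBO) and record each leaf's range, to be passed later
// as the baseInstance / instanceCount arguments of
// glDrawElementsInstancedBaseVertexBaseInstance.
std::vector<float> packLeaves(std::vector<Leaf> &leaves,
                              int floatsPerInstance) {
    std::vector<float> packed;
    int base = 0;
    for (Leaf &leaf : leaves) {
        leaf.baseInstance = base;
        leaf.instanceCount =
            static_cast<int>(leaf.instanceData.size()) / floatsPerInstance;
        packed.insert(packed.end(), leaf.instanceData.begin(),
                      leaf.instanceData.end());
        base += leaf.instanceCount;
    }
    return packed;
}
```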

 

Thank you in advance.


Gravity: Old school arcade game

08 March 2014 - 08:44 AM

I do a lot of 3D graphics and game programming and have been working on a rather large project for a few years now. Around a week ago, however, I started pondering whether I could make a simple yet entertaining game in a few evenings. I keenly remember a game called Lunar Lander from the past, which I liked a lot. I wanted to make something similar, but with a twist or two.

 

The result is a game called Gravity, where you fly a little space ship over the surface of a planet, trying to collect valuable diamonds, emeralds and the like. But it is not made easy: you have to fight gravity and various enemy ships, whose numbers increase as you proceed. Collect all the stars in a level to earn an extra score bonus. How high a score can you get?

 

Below is a picture. The graphics are simple on purpose, though, so test the game itself to see if it's any good!

 

gravity.png

 

 

Download the game from here. At the moment binaries are built for Windows and Linux (64-bit only).

 

Gamepads are also supported, but cannot yet be configured.

 

I highly appreciate your comments and suggestions.


Compressed texture array

25 August 2013 - 06:16 AM

I'm trying to create a texture array of DXT1/DXT3/DXT5-compressed images loaded from separate dds files (compressed, with mipmaps). I load a dds file with a function

unsigned char *load_dds(const char *filename, int &w, int &h, int &nmipmaps, GLenum &format, int &size);

where format is GL_COMPRESSED_RGBA_S3TC_DXT*_EXT and size is the total size of the data (including mipmaps). I have tested this for basic 2D textures and it seems to work.

 

I then create the texture and allocate data for all layers by calling glCompressedTexImage3D with no input data:

GLenum target = GL_TEXTURE_2D_ARRAY;
glCompressedTexImage3D(target, 0, format, w, h, nlayers, 0, size, 0);

This gives an "invalid value" error. I'm not sure what size should reflect here. Should it include mipmaps and all layers? Is it allowed to give a NULL data pointer here?

 

After allocating the data, I call, for each image i,

int mipWidth = w;
int mipHeight = h;
int blockSize = (format == GL_COMPRESSED_RGBA_S3TC_DXT1_EXT) ? 8 : 16;
int offset = 0;

for(int mip = 0; mip < numMipmaps; ++mip){
  int mipSize = ((mipWidth + 3) / 4) * ((mipHeight + 3) / 4) * blockSize;

  glCompressedTexSubImage3D(target, mip, 0, 0, i, mipWidth, mipHeight, 1,
                            format, mipSize, data + offset);

  mipWidth = std::max(mipWidth >> 1, 1);
  mipHeight = std::max(mipHeight >> 1, 1);

  offset += mipSize;
}

where data is the pointer returned by load_dds. This generates an "invalid operation" error. Is it necessary to call glCompressedTexImage3D prior to glCompressedTexSubImage3D to allocate the memory? What else could be wrong here? I would appreciate any hints.
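For reference, the per-mip sizes and offsets I expect inside one dds image follow the usual S3TC block formula; factored out of the GL calls above, the arithmetic is (function names mine):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// DXT1 blocks are 8 bytes, DXT3/DXT5 blocks are 16 bytes; each block
// covers a 4x4 texel tile, so dimensions round up to multiples of 4.
int dxtMipSize(int w, int h, int blockSize) {
    return ((w + 3) / 4) * ((h + 3) / 4) * blockSize;
}

// Byte offset of each mip level within one image's data, plus the
// total size of that image (what `size` should be for a single layer).
std::vector<int> dxtMipOffsets(int w, int h, int numMipmaps,
                               int blockSize, int &totalSize) {
    std::vector<int> offsets;
    totalSize = 0;
    for (int mip = 0; mip < numMipmaps; ++mip) {
        offsets.push_back(totalSize);
        totalSize += dxtMipSize(w, h, blockSize);
        w = std::max(w >> 1, 1);
        h = std::max(h >> 1, 1);
    }
    return offsets;
}
```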

