
Member Since 15 May 2010
Offline Last Active Oct 22 2016 09:57 PM

Posts I've Made

In Topic: Stupid Things I've Done

26 September 2016 - 05:00 PM

This one in particular I didn't do, but it was in the existing codebase of an image processing library that I'm still working on right now. My task is to take this library and port it to run on CUDA. I tried to keep all the code changes to a bare minimum, which is understandable because some of those algorithms are supremely complicated. Haralick features on their own are a project that could be taken up as a two-semester senior project.


I've managed to get one of the algorithms to run on CUDA and process the images. It's running through fine but it's failing to provide the correct values. It's not crashing, it's just that every value that comes out of it is 0. I can already see this is going to be a pain in the ass because I haven't modified this algorithm in any way. Stepping through the debugger with two instances of the code in parallel (the CPU version, which worked fine, and the GPU version, which gave me 0s for all the values), I finally encountered the culprit:

size2 = size2++;

Looking at it, I actually wondered how this would be executed. I assumed that first the value of size2 would be assigned to itself, and then incremented, which I guess is what happened on the CPU. Alternatively, even if the ++ was evaluated first, size2 + 1 would still have been used to overwrite the old value in size2. Those two ways would have made sense to me; however, the GPU had an even better way to do it. I haven't disassembled the code because, quite frankly, I didn't really care to visualize what really happened, but I guess behind the scenes the GPU must have cached the value of size2, incremented the original value of size2, and then evaluated the assignment, ergo overwriting the incremented value with the original value (size2 of course being initialized to 0 means size2 will always be 0).


Either way, this is an entirely redundant piece of code and serves only to confuse.
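For reference, under the C++ rules before C++17 the statement `size2 = size2++;` modifies size2 twice without any ordering between the two writes, so it is undefined behavior and every compiler is free to do something different. From C++17 on it is defined, and it leaves size2 unchanged, which matches what the GPU build did. Here is a minimal sketch of the fix, assuming the line was only ever meant to bump a counter. The function count_nonzero and its arguments are made up for illustration and are not from the library:

```cpp
#include <cassert>

// Hypothetical example, not library code: count the non-zero entries
// of an array using a counter named size2, as in the post.
int count_nonzero(const int* values, int n) {
    int size2 = 0;
    for (int i = 0; i < n; ++i) {
        if (values[i] != 0) {
            // The legacy line was `size2 = size2++;`.
            // A plain pre-increment has one well-defined meaning on every
            // compiler, host or device.
            ++size2;
        }
    }
    return size2;
}
```

With this form the CPU and GPU builds have no room to disagree about the value of the counter.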


Okay, while I'm here, maybe I'll rant a little bit (>.>). Of course this isn't the only silliness that happened in these algorithms. Sometimes I wonder if people just got bored when they were coding and just brute forced a solution that made the code work. For instance, take this function:

double imgmoments(pix_data *pixels, int width, int height, int x, int y)
{  double *xcoords,sum;
   xcoords=new double[width*height];
   int row,col;
   /* Generate a matrix with the x coordinates of each pixel. */
   for (row=0;row<height;row++)
     for (col=0;col<width;col++)
       /* ... */ ;

   /* Generate a matrix with the y coordinates of each pixel. */
   for (col=0;col<width;col++)
     for (row=0;row<height;row++)
     {
        if (y!=0)
        {  if (x==0) xcoords[row*width+col]=pow((double)(row+1),(double)y);
           /* ... */
        }
        sum+=xcoords[row*width+col]*get_pixel(pixels, width, height, col, row, 0).intensity;
     }

   delete xcoords;
   /* ... */
}

At first glance, this isn't necessarily a bad solution. If the second loop refers to multiple values calculated in the first loop on a single iteration, you maybe don't want to recalculate the pow function too many times, so you might look at it, assume the programmer knew what he/she was doing and was just caching the values to avoid redundant calculations, and leave it alone. My task, however, is to move the code to CUDA because the algorithms are slow as balls, so MY ultimate goal is actually optimization. Allocating memory in code that's running on the GPU is kind of detrimental in that regard, so of course I get skeptical when I see code that allocates dynamic memory the size of the whole image in doubles in a function that appears to be executed for every pixel in the image. Looking through the second loop, I tracked down the references to xcoords. There are 4. The array might have been worth keeping around if the algorithm addressed multiple different values of xcoords on a single iteration (even then it probably wouldn't be), but every access of xcoords happens at [row*width + col], so... literally the current coordinate, every time. To make sure I wasn't losing my mind, I copied the codebase over, modified the function to replace the whole dynamic array shtick with a single local variable in the inner for loop, and ran both the old and new functions on an image to make sure the moments were the same, which they were.
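A rough sketch of what the single-local-variable version can look like. It assumes the function is meant to compute the standard raw image moment, i.e. the sum over all pixels of (col+1)^x * (row+1)^y * intensity. The listing above does not show every branch, so that formula is an assumption rather than a transcription of the original. The pix_data struct here is a hypothetical stand-in with just an intensity field, and the name imgmoments_local is made up:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the library's pixel type.
struct pix_data { double intensity; };

// Raw moment of order (x, y), assuming the standard definition:
//   sum over pixels of (col+1)^x * (row+1)^y * intensity
// No heap allocation: the per-pixel weight lives in one local variable.
double imgmoments_local(const pix_data* pixels, int width, int height,
                        int x, int y) {
    double sum = 0.0;  // explicitly initialized accumulator
    for (int row = 0; row < height; ++row) {
        for (int col = 0; col < width; ++col) {
            const double weight = std::pow(col + 1.0, x) * std::pow(row + 1.0, y);
            sum += weight * pixels[row * width + col].intensity;
        }
    }
    return sum;
}
```

Since each iteration only touches its own pixel, the loop body maps naturally onto one thread per pixel followed by a reduction, with no per-call allocation to worry about.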


The whole codebase was kind of silly like that. I've encountered gotos, functions that spanned over 300 lines, sometimes both gotos and big-ass functions. 


I've fought with the code structure since I started this project. The ImageMatrix class that represented the loaded images contained over 20 methods. Whoever wrote it had never heard of the term "encapsulation", because the class's member variables were all public. I've also ended up arguing with my professor multiple times about this, because his argument is "code structure is not important to it running fast or on CUDA". What's actually a waste of my time, and of anyone else's who worked on this trash, is when I need to rearrange the code you've written (or let other students write, I mean it had multiple authors), and I have to effectively take every noodle of your spaghetti and separate it from the rest before being able to see how it's going to run on the GPU.


Okay, rant over XD Nevertheless, now that I've actually got everything up and running, the project isn't too bad. I'm still doing a bunch of copy-pasta, but it's not too bad, and the code is fairly important in the fields that use it. These algorithms provide a lot of useful information about images, and with the volume of data we get for processing (like some of those really big telescopes that take monstrous numbers of images of the cosmos every night), parallelizing them is a nice step forward.

In Topic: Variable range rope physics.

12 November 2015 - 07:33 PM

You know, I can't actually remember if we did bother with that. I think I still have the code on a branch somewhere so I might just go back and test it to see, but I think we'll roll with this setup from here forward.

In Topic: Variable range rope physics.

10 November 2015 - 07:16 PM

So we were never able to fix the bug with the rope using hinge joints, which is unfortunate because we like the rope's flimsiness when made out of discrete components. What we ended up doing is getting a static image for the rope and stretching it as the player changes the tongue's length. It actually looks moderately decent but still requires some work. The most important thing is that it is stable. I have also tried just drawing a curved rope based on the player's velocity, but it just ended up making the tongue look like a PVC pipe.
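The stretching itself is just a bit of vector math. Here is a minimal, engine-agnostic sketch, assuming the sprite is drawn centered between the two endpoints, rotated to point at the anchor, and scaled along its length. Vec2, SpriteTransform, stretch_between and sprite_rest_length are illustrative names, not anything from Unity or from our project:

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { double x, y; };

// Placement of a rope sprite stretched between two points.
struct SpriteTransform {
    Vec2 center;       // midpoint between player and anchor
    double angle_rad;  // rotation of the sprite's length axis
    double scale_len;  // scale factor along the sprite's length axis
};

// sprite_rest_length is the length of the unscaled image in world units.
SpriteTransform stretch_between(Vec2 player, Vec2 anchor,
                                double sprite_rest_length) {
    const double dx = anchor.x - player.x;
    const double dy = anchor.y - player.y;
    const double length = std::hypot(dx, dy);

    SpriteTransform t;
    t.center = Vec2{(player.x + anchor.x) / 2.0, (player.y + anchor.y) / 2.0};
    t.angle_rad = std::atan2(dy, dx);
    t.scale_len = length / sprite_rest_length;
    return t;
}
```

Because nothing here is simulated, there is nothing for the solver to blow up on, which is where the stability comes from. The price is that the rope is always a straight line.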

In Topic: Variable range rope physics.

04 November 2015 - 07:48 PM

That's the issue I'm facing. Under static conditions, the rope works ok. If a player just hangs off of it and nothing changes about the situation, Unity will be able to resolve the simulation correctly. The issue arises when the player's rope/tongue either attaches to a moving platform or gets pulled by some other object like the floating platform in the game prototype above. Unfortunately, that's the main method of locomotion in my game and I'm having a really hard time finding a solution to it.

In Topic: Singletons and Game Dev

01 November 2015 - 01:42 AM

Many successful products have shipped on terrible source code. A finished product built on a house of cards is always better than an unfinished product built on an ivory tower.


Heh, that reminds me of the lugaru source: http://hg.icculus.org/icculus/lugaru/file/97b303e79826/Source/GameTick.cpp#l7276


I guess it really comes down to a works-or-not result.



Global variables create hidden dependencies, which destroy your capacity to analyse access patterns and control their scheduling.
Hidden dependencies are also evil for countless other reasons too.


I've been feeling this pain ever since I started working at my current job. It's a web application for which the client is written in JavaScript. The way it behaves is really unpredictable. A lot of functions have really nasty side effects, and of course no dependencies are made obvious through the function interface because all data is stored in a global variable called "locals" (an oxymoron of the highest order). At least I can grep the source code to find instances of its use, but that's still not a very good "technique", merely a means for brute-force debugging.
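To make the hidden-dependency point concrete, here is a small sketch in C++ rather than JavaScript. AppState, g_locals, total_hidden and total_explicit are all invented for illustration and have nothing to do with the actual application:

```cpp
#include <cassert>

// Hypothetical application state.
struct AppState {
    int cart_items = 0;
    double unit_price = 0.0;
};

// Global-state style: the signature says nothing about what is read.
AppState g_locals;
double total_hidden() {
    return g_locals.cart_items * g_locals.unit_price;
}

// Explicit style: everything the function depends on is in its interface,
// so callers and reviewers can see the dependency without grepping.
double total_explicit(const AppState& state) {
    return state.cart_items * state.unit_price;
}
```

The explicit version can be called with any state you construct by hand, which also makes it far easier to test than a function that silently reads whatever happens to be in the global at the time.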