# Stupid Things I've Done


## Recommended Posts

I was writing a collision detection thingy and I was checking tiles against my cursor which only moves 50px at a time (perfect collisions based on x and y comparison). Anyway, I made an extra condition that I didn't need:

```cpp
if (collision)
{
    can_not_place_here = 1;
}
else if (not collision)
{
    can_not_place_here = 0;
}
```

This works only as long as your cursor is hovering over the first (or was it last?) object created. If one tile is not under your cursor, and it happens to be the last one checked, you get the wrong feedback. I removed the extra condition when I realised the only condition worth checking is "does it collide?"; "does it not collide?" is not required. After checking whether the value is 1, I reset the no-can-do value back to 0 once everything relevant has been processed, and then the loop runs again on the next program cycle.

Another thing: in a class member function, I cannot refer to 'x' or 'y' directly for some reason; I have to go through the SDL_Rect, as in SDL_Rect.x and SDL_Rect.y. I'll figure that out eventually! By far the funniest problem I've ever had was using Anjuta on Ubuntu. The IDE was so bad that the window for linking libraries was broken in the build I got, so I couldn't figure out how to link SDL. I got into using the terminal after that, but that wasn't my cup of tea! It didn't take me long to decide Anjuta is garbage. I use Code::Blocks for everything now and I can solve my own problems. The other day it took me a few hours to work out why my x/y coordinates were wrong: I was getting numbers in the 70,000 range, and I knew I hadn't placed anything that far off my screen!


This one in particular I didn't do, but it was in the existing codebase of an image processing library that I'm still working on right now. My task is to take this library and port it to CUDA, and I tried to keep all code changes to a bare minimum, which is reasonable because some of those algorithms are supremely complicated. Haralick features alone could be a two-semester senior project.

I've managed to get one of the algorithms to run on CUDA and process the images. It's running through fine but it's failing to provide the correct values. It's not crashing, it's just that every value that comes out of it is 0. I can already see this is going to be a pain in the ass because I haven't modified this algorithm in any way. Stepping through the debugger with two instances of the code in parallel (the CPU version, which worked fine, and the GPU version, which gave me 0s for all the values), I finally encountered the culprit:

```cpp
size2 = size2++;
```


Looking at it, I actually wondered how this would be executed. It turns out this is classic undefined behaviour in C and C++ (only C++17 finally pinned down an order for it), so the CPU and GPU compilers were both free to pick different answers. I had assumed the value of size2 would first be assigned to itself and then incremented, which I guess is what happened on the CPU; even if the ++ were evaluated first, size2 + 1 would still overwrite the old value. Those two readings would have made sense to me, but the GPU found an even better one. I haven't disassembled the code because, quite frankly, I didn't care to visualize what really happened, but behind the scenes the GPU must have cached the value of size2, incremented the original, and then evaluated the assignment, overwriting the incremented value with the cached one. Since size2 is initialized to 0, size2 stays 0 forever.

Either way, this is an entirely redundant piece of code and serves only to confuse.

Okay, while I'm here, maybe I'll rant a little bit (>.>). Of course this isn't the only silliness that happened in these algorithms. Sometimes I wonder if people just got bored when they were coding and just brute forced a solution that made the code work. For instance, take this function:

```cpp
double imgmoments(pix_data *pixels, int width, int height, int x, int y)
{
    double *xcoords, sum;
    xcoords = new double[width * height];
    int row, col;

    /* Generate a matrix with the x coordinates of each pixel. */
    for (row = 0; row < height; row++)
        for (col = 0; col < width; col++)
            xcoords[row * width + col] = pow((double)(col + 1), (double)x);

    sum = 0;
    /* Generate a matrix with the y coordinates of each pixel. */
    for (col = 0; col < width; col++)
        for (row = 0; row < height; row++)
        {
            if (y != 0)
            {
                if (x == 0)
                    xcoords[row * width + col] = pow((double)(row + 1), (double)y);
                else
                    xcoords[row * width + col] = pow((double)(col + 1), (double)y) * xcoords[row * width + col];
            }
            sum += xcoords[row * width + col] * get_pixel(pixels, width, height, col, row, 0).intensity;
        }

    delete[] xcoords;  /* original had "delete xcoords;", itself undefined behaviour on a new[] array */
    return sum;
}
```


At first glance, this isn't necessarily a bad solution. If the second loop referred to multiple values calculated in the first loop on a single iteration, you might not want to recalculate the pow function too many times, so you might look at it, assume the programmer knew what he/she was doing and was just caching values to avoid redundant calculations, and leave it alone. My task, however, is to move the code to CUDA because the algorithms are slow as balls, so MY ultimate goal is actually optimization. Allocating dynamic memory in code running on the GPU is detrimental in that regard, so of course I get skeptical when I see a function that allocates an array of doubles the size of the whole image and appears to be executed for every pixel in the image. Looking through the second loop, I tracked down the references to xcoords. There are four. The array might have been worth keeping around if the algorithm addressed multiple different entries of xcoords on a single iteration (even then it probably wouldn't be), but every access happens at [row*width + col], so... literally the current coordinate, every time. To make sure I wasn't losing my mind, I copied the codebase over, replaced the whole dynamic-array shtick with a single local variable in the inner for loop, and ran both the old and new functions on an image to make sure the moments came out the same, which they did.

The whole codebase was kind of silly like that. I've encountered gotos, functions spanning over 300 lines, and sometimes gotos inside those same big-ass functions.

I've fought with the code structure since I started this project. The ImageMatrix class that represents the loaded images contains over 20 methods, and whoever wrote it had never heard the term "encapsulation", because all of the class's member variables are public. I've also ended up arguing with my professor about this multiple times, because his argument is that "code structure is not important to it running fast or on CUDA". What actually wastes my time, and that of anyone else who has worked on this trash, is that when I need to rearrange the code he wrote (or let other students write, since it had multiple authors), I effectively have to take every noodle of the spaghetti and separate it from the rest before I can see how it's going to run on the GPU.

Okay, rant over XD. Nevertheless, now that I've actually got everything up and running, the project isn't too bad. I'm still doing a bunch of copy-pasta, but the code is genuinely important in the fields that use it. These algorithms provide a lot of useful information about images, and with the volume of data we get for processing (some of those really big telescopes take a monstrous number of images of the cosmos every night), parallelizing them is a nice step forward.
