
AI Algorithms on GPU



#1 WeirdoFu   Members   -  Reputation: 205


Posted 04 August 2005 - 12:55 PM

It seems that with the increasing power of GPUs on consumer graphics cards, more and more people are trying to figure out what to do with GPUs other than graphics. The scientific computing community has been extremely interested in their uses and has found that GPUs can sort large numbers of elements up to 50 times faster than a CPU can, thanks to their built-in parallelism. More on General-Purpose Computation on GPUs can be found here: http://www.gpgpu.org/

So, my question and idea is whether we can offload AI work from the CPU onto the GPU as well. Of course, to a certain extent it may not be practical, but it would just be one of those cool "hey look, my graphics card can do this too" kind of things. For AI in games, I guess the first question people will ask is, "Can we get a GPU to do the A* algorithm for us?" If we can, that may be a fairly interesting application too. I know there are pure GPU-based particle system engines, so why not GPU-based game AI engines? Steal a few cycles from the graphics and use them for better AI.

So, what do people think?


#2 Foxostro   Members   -  Reputation: 182


Posted 04 August 2005 - 01:55 PM

The problem is that it's expensive to get data off the graphics card, no matter how easy it is to put it on the card or do the operations. I don't think it will be practical, but I have no facts to back this statement up.

#3 BrianL   Members   -  Reputation: 530


Posted 04 August 2005 - 02:48 PM

Your primary problem will be the types of operations the GPU can do. If you are really interested in pursuing this, I would suggest working out the simplest algorithm you can and heading over to the graphics groups to see if anyone has any ideas.

Offhand, my guess is that your first difficulty would be handling the random access needed when exploring neighbors in a search algorithm. Graphics cards are designed more for algorithms that take small, predetermined data sets and work on them in parallel.
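
To make that access pattern concrete, here is a minimal CPU-side A* sketch in Python (the neighbors() and heuristic() callbacks are placeholders invented for illustration); the priority-queue pop and the per-neighbor lookups are exactly the serial, data-dependent memory accesses that don't map cleanly onto a fragment pipeline.

import heapq
from itertools import count

def a_star(start, goal, neighbors, heuristic):
    # Minimal A* over an arbitrary graph; neighbors(n) yields (next_node, step_cost).
    tie = count()                                        # tiebreaker so the heap never compares nodes
    open_set = [(heuristic(start, goal), next(tie), 0, start, None)]
    came_from, best_g = {}, {start: 0}
    while open_set:
        _, _, g, node, parent = heapq.heappop(open_set)  # serial: one node at a time
        if node in came_from:
            continue                                     # already expanded via a cheaper path
        came_from[node] = parent
        if node == goal:                                 # walk parent links back to the start
            path = [node]
            while came_from[path[-1]] is not None:
                path.append(came_from[path[-1]])
            return path[::-1]
        for nxt, cost in neighbors(node):                # scattered, data-dependent reads
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(open_set, (new_g + heuristic(nxt, goal), next(tie), new_g, nxt, node))
    return None                                          # goal unreachable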

#4 Anonymous Poster   Guests   -  Reputation:


Posted 04 August 2005 - 02:58 PM

A friend of mine showed me this:

http://graphics.cs.uiuc.edu/~jch/papers/gpucluster.pdf

A paper where they run the k-means clustering algorithm on a GPU and get a 1.5x - 3x speedup. It's pretty cool, but then I showed him this:

http://www.cs.cmu.edu/~dpelleg/download/kmeans.pdf

A paper where, through clever data structures, they speed up the same algorithm 170x.

An easier way to take advantage of hardware acceleration in AI-type algorithms is to use matrix math instead of loops wherever possible, then use a hardware-accelerated matrix library (like a machine-specific BLAS library). MATLAB programmers are really good at this and call it "vectorizing" an algorithm. It also often produces really elegant code. I'm not sure about A*, but a lot of neural network operations, for example, can be written in terms of matrix multiplications.
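
To illustrate what "vectorizing" buys you, here is a rough NumPy sketch (the weights and inputs are made up, not from any of the papers above): the same neural-network layer written first as a per-neuron loop and then as a single matrix-vector product that a tuned BLAS, or a GPU, can run in parallel.

import numpy as np

def forward_loop(weights, bias, inputs):
    # Naive per-neuron loop: one dot product at a time.
    out = np.empty(weights.shape[0])
    for i in range(weights.shape[0]):
        out[i] = np.dot(weights[i], inputs) + bias[i]
    return np.tanh(out)

def forward_vectorized(weights, bias, inputs):
    # The same layer "vectorized": one matrix-vector product,
    # which a tuned BLAS (or a GPU) can evaluate in parallel.
    return np.tanh(weights @ inputs + bias)

rng = np.random.default_rng(0)
W, b, x = rng.standard_normal((64, 32)), rng.standard_normal(64), rng.standard_normal(32)
assert np.allclose(forward_loop(W, b, x), forward_vectorized(W, b, x))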

#5 Anonymous Poster   Guests   -  Reputation:


Posted 04 August 2005 - 03:03 PM

The A* algorithm is actually pretty well suited to GPU programming, given the way it looks at its neighbors. If you could fit the map information into a float texture, this could, to my knowledge, be done. But then again, the trick of offloading the CPU by calculating stuff on the GPU is great because then you have more power left for the AI.

#6 Extrarius   Members   -  Reputation: 1412


Posted 04 August 2005 - 03:29 PM

I'd be pretty impressed if you could get a full A* algorithm working on the graphics card, but considering a real implementation should be able to handle an arbitrary graph (and not just a square grid), I don't imagine it'd run much faster.

#7 pinacolada   Members   -  Reputation: 834


Posted 04 August 2005 - 04:03 PM

Quote:
Original post by Foxostro
The problem is that it's expensive to get data off the graphics card, no matter how easy it is to put it on the card or do the operations. I don't think it will be practical, but I have no facts to back this statement up.


I don't know if that will be a huge problem. An A* algorithm only needs to return the completed path to be useful, and you can probably store the path in something like 500 bytes. An enormously complicated neural network only needs to return the values of the final layer. In general, the data that represents "what action should the AI take" is going to be pretty small. So as long as it's cheap for the GPU to send us back 1k or so of data, we'll be alright.
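
For scale, a quick back-of-the-envelope check of that figure, assuming a grid map where each waypoint is a pair of 16-bit coordinates (my numbers, just for illustration):

import numpy as np

# Rough sanity check of the "500 bytes" figure: a path of 125 waypoints,
# each a pair of 16-bit grid coordinates, packs into exactly 500 bytes.
path = np.zeros((125, 2), dtype=np.uint16)   # hypothetical waypoint list
print(path.nbytes)                           # 125 * 2 * 2 = 500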

#8 Foxostro   Members   -  Reputation: 182


Posted 04 August 2005 - 04:14 PM

Quote:
Original post by pinacolada
Quote:
Original post by Foxostro
The problem is that it's expensive to get data off the graphics card, no matter how easy it is to put it on the card or do the operations. I don't think it will be practical, but I have no facts to back this statement up.


I don't know if that will be a huge problem. An A* algorithm only needs to return the completed path to be useful, and you can probably store the path in something like 500 bytes. An enormously complicated neural network only needs to return the values of the final layer. In general, the data that represents "what action should the AI take" is going to be pretty small. So as long as it's cheap for the GPU to send us back 1k or so of data, we'll be alright.


Alright, nevermind then. I hope this idea pans out! :)

#9 Sneftel   Senior Moderators   -  Reputation: 1776


Posted 04 August 2005 - 04:39 PM

A* is not an algorithm that lends itself to parallelization, so it's unlikely that a GPU-based solution would be at all performant.

IMHO, the "Stupid GPU Tricks" fad is one that is fading, fast. The bottom line is that a graphics card is designed for graphics. It's more programmable than it used to be, but the transistor layout is still one that's optimized for the rasterization pipeline and not for other things. The need for generalized parallel processing hardware on consumer-level computers is clear, but the GPU is not the chip that will fill this niche. Time spent on bludgeoning a GPU into executing algorithm XYZ is time wasted.

#10 WeirdoFu   Members   -  Reputation: 205


Posted 04 August 2005 - 05:10 PM

Well, in many aspects, if memory doesn't fail me, some modern GPUs have higher transistor counts than many CPUs currently on the market, but I guess the main issue in trying to "bludgeon" a GPU into doing what you want has to do with the fact that you need to first "bludgeon" programmers into thinking in a different way.

Architecturally, a GPU operates at the opposite extreme from most general-purpose CPUs, if memory serves me correctly. Where standard CPUs are constantly swapping operations, the GPU is constantly swapping data (streaming). So, programming for the two requires programmers to think differently.

Currently, it seems that mainly algorithms that can be massively parallelized will get the benefit of GPU speedups, since your data is actually loaded in the form of a texture and then modified by a fragment program, which runs in parallel for all pixels in the texture. So, I guess A* may not be the best candidate out there, but who knows; with some twisting of things, someone might be able to think differently and solve it in some strange, convoluted way.
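
As a rough illustration of that pattern (NumPy standing in for the texture and the fragment program; the blur kernel is just an example I made up): the same small kernel is applied to every cell of a 2D array with no dependencies between cells, which is the shape of work a GPU handles well.

import numpy as np

def blur_kernel(tex):
    # "Fragment program" stand-in: each output cell is a function of a fixed,
    # local neighborhood of the input texture; every cell can run in parallel.
    padded = np.pad(tex, 1, mode="edge")
    return (padded[:-2, 1:-1] + padded[2:, 1:-1] +
            padded[1:-1, :-2] + padded[1:-1, 2:] + tex) / 5.0

texture = np.random.default_rng(0).random((256, 256)).astype(np.float32)
texture = blur_kernel(texture)   # one "render pass" over the whole texture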

I guess the main reason I started this thread was to see how people thought about the possibility, whether it's there or not, and just gauge reaction. As for anything concrete, I really don't have anything solid myself, which is something I'll probably work on in the future.

#11 Sneftel   Senior Moderators   -  Reputation: 1776


Posted 04 August 2005 - 06:08 PM

Quote:
Original post by WeirdoFu
Well, in many aspects, if memory doesn't fail me, some modern GPUs have higher transistor counts than many CPUs currently on the market, but I guess the main issue in trying to "bludgeon" a GPU into doing what you want has to do with the fact that you need to first "bludgeon" programmers into thinking in a different way.

Not really. Perhaps game programmers, or PC application developers in general, but they're hardly the totality of computer science. People have been making algorithms specifically for SIMD architectures for decades. Simply parallelizing an algorithm, however, is not the end of the battle. You still need to deal with the fact that the GPU expects to work on 4-element floating point vectors; that it wants to perform triangle rasterization; that it expects to only perform random access in the form of "textures"; that it's unable to perform any cross-core communication. None of these limitations are inherent to SIMD architectures, and none of them needs to be present in a consumer-level SIMD chip.

If you're interested in path planning on parallel architectures in general, I think that's a very noble goal. A quick Google Scholar search turns up a few links; it seems that most researchers have approached the problem using the potential field approach, which intuitively is more parallelizable than A*.
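
For intuition, here is a rough sketch of why a potential field parallelizes so naturally (Python/NumPy over a made-up grid world): each cell is updated only from its immediate neighbors, so every cell can be relaxed in the same pass, and agents then just walk downhill toward the goal.

import numpy as np

def relax_potential(obstacles, goal, iterations=200):
    # Jacobi-style relaxation: every free cell takes the average of its four
    # neighbors each pass; the goal is pinned at 0 and obstacles at 1 ("expensive").
    field = np.ones(obstacles.shape, dtype=np.float32)
    for _ in range(iterations):
        padded = np.pad(field, 1, mode="edge")
        avg = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
               padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0
        field = np.where(obstacles, 1.0, avg)
        field[goal] = 0.0
    return field                              # agents descend the gradient of this field

walls = np.zeros((64, 64), dtype=bool)
walls[20, 10:50] = True                       # a hypothetical wall across the map
potential = relax_potential(walls, goal=(50, 50))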

Quote:
I guess the main reason I started this thread was to see how people thought about the possibility, whether it's there or not, and just gauge reaction. As for anything concrete, I really don't have anything solid myself, which is something I'll probably work on in the future.

My reaction: I just got back from the SIGGRAPH conference. The last couple of years, there was an imperial buttload of papers and sketches and posters and courses on GPGPU. This year, there were noticeably fewer. Personally, I think researchers are thinking about the Cell processor, and about how much they hate having to think about pixel shaders in order to write database software. Mark my words: within five years, we'll see multi-core processing on Intel and AMD processors. Get ready for that.

#12 Programmer16   Crossbones+   -  Reputation: 1911


Posted 04 August 2005 - 06:13 PM

idea patent pending :D

#13 xEricx   Members   -  Reputation: 564


Posted 05 August 2005 - 03:12 AM

Just my 2 cents...

In the game we're currently working on, it's often the GPU that's falling behind... so having it do more computations would just be silly.

With next-gen consoles, I really don't see why we would want to compute things on the GPU, and as for the PC, there's too much different hardware to really rely on it.

Eric

#14 WeirdoFu   Members   -  Reputation: 205


Posted 05 August 2005 - 06:36 AM

Maybe we're approaching this from the wrong angle. Up to now, we've thought about how to adapt current AI algorithms for use on a GPU, but what about a graphical effect that can be enhanced with AI running on the GPU?

If we can build a complete particle system engine that is GPU-based, then why not start from there and say let's improve on the particle and make "smart" particles? So, let's say we use the particle system in a slightly different way. Say we have a game where nanotech is used and the players can use a system called "smart lighting dust." Basically, you could then create a particle system of light sources that swarm around the level or around the player, highlighting certain things or increasing visibility in general.

The original starting point of GPGPU was not just trying to get the GPU to do things other than graphics, but rather asking whether we can piggy-back things onto our graphics that benefit the overall cause. So, instead of thinking about how to translate AI stuff onto a GPU, why not ask whether we can piggy-back AI stuff into the graphics itself? Maybe "smart graphics," or even "smart textures."
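
One way to picture the "smart lighting dust" idea above is as a swarm update where every particle applies the same steering rule, so the whole swarm advances in one data-parallel pass. A rough NumPy sketch with made-up constants, just to show the shape of the computation:

import numpy as np

def update_swarm(positions, velocities, target, dt=0.016, accel=2.0):
    # Same steering rule for every particle, evaluated as array ops:
    # accelerate toward the target, damp, and integrate. No particle
    # depends on any other, so this maps directly onto a per-particle kernel.
    to_target = target - positions
    dist = np.linalg.norm(to_target, axis=1, keepdims=True) + 1e-6
    velocities = 0.98 * velocities + accel * dt * (to_target / dist)
    return positions + dt * velocities, velocities

rng = np.random.default_rng(0)
pos = rng.random((10_000, 3)).astype(np.float32) * 10.0   # hypothetical "light dust"
vel = np.zeros_like(pos)
player = np.array([5.0, 0.0, 5.0], dtype=np.float32)
pos, vel = update_swarm(pos, vel, player)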

#15 Rattrap   Members   -  Reputation: 1477


Posted 05 August 2005 - 07:09 AM

If I remember correctly, the slowdown from reading data back from a graphics card is less of a problem on PCI Express cards. The PCI Express port was developed with two-way communication in mind.

#16 AndyTX   Members   -  Reputation: 802


Posted 05 August 2005 - 07:38 AM

Quote:
Original post by Sneftel
You still need to deal with the fact that the GPU expects to work on 4-element floating point vectors; that it wants to perform triangle rasterization; that it expects to only perform random access in the form of "textures"; that it's unable to perform any cross-core communication. None of these limitations are inherent to SIMD architectures, and none of them needs to be present in a consumer-level SIMD chip.

This is merely an argument to use Sh or Brook to abstract that layer though. Both of these can target GPUs and provide a generalized stream processing interface. I also wouldn't be at all surprised if both of these can target Cell, multi-core, etc. in the future as well.

Quote:
Original post by Sneftel
Mark my words: within five years, we'll see multi-core processing on Intel and AMD processors. Get ready for that.

... both companies already have multi-core processors on the market. I don't think there has been any doubt of this evolution for the past few years at least. Still, even multi-core processors are not designed as efficient stream processing engines (like the Cell's SPEs). We do need generalized stream processing/DSP-like things in modern computers, that's for sure. Whether that will end up being provided by some evolution of GPUs, or built into the hardware is really immaterial. Either GPUs will become more general and not be just "GPUs" any more, or we'll convert to something like the Cell architecture and probably won't need GPUs any more.

In either case, GPUs are here now, they're cheap and fast stream processors, and thus I see no reason not to continue research using them... If a large chunk of the research were about finding ways around GPU limitations that would not exist in generalized DSPs, then maybe there'd be something to complain about (research that would probably be useless within the next five years). However, I don't think that this is the case at all: the problems that we're hitting now with GPUs are general stream processing and parallel algorithm problems for the most part. Sh and Brook already solve 99% of the interface problems.



#17 Sneftel   Senior Moderators   -  Reputation: 1776


Posted 05 August 2005 - 06:51 PM

Quote:
Original post by AndyTX
Quote:
Original post by Sneftel
You still need to deal with the fact that the GPU expects to work on 4-element floating point vectors; that it wants to perform triangle rasterization; that it expects to only perform random access in the form of "textures"; that it's unable to perform any cross-core communication. None of these limitations are inherent to SIMD architectures, and none of them needs to be present in a consumer-level SIMD chip.

This is merely an argument to use Sh or Brook to abstract that layer though. Both of these can target GPUs and provide a generalized stream processing interface. I also wouldn't be at all surprised if both of these can target Cell, multi-core, etc. in the future as well.

Brook and Sh handle the annoying programming interface issues, but not the underlying hardware design. They can't add in features the card doesn't have, such as inter-fragment communication. If you want to do something parallelizable and future-proof, you'd be better off with MPI (though of course that's geared less toward stream processing).

Quote:
Original post by AndyTX
Quote:
Original post by Sneftel
Mark my words: within five years, we'll see multi-core processing on Intel and AMD processors. Get ready for that.

... both companies already have multi-core processors on the market. I don't think there has been any doubt of this evolution for the past few years at least.

I think I was unclear there... when I say "multicore", I'm not talking about hyperthreading or the AMD X2. I'm talking about the sort of stream processing that GPUs and the Cell processor can do. Still, I think we're mostly on the same page here.



#18 Anonymous Poster   Guests   -  Reputation:


Posted 06 August 2005 - 12:40 AM

Quote:
Original post by Sneftel
A* is not an algorithm that lends itself to parallelization, so it's unlikely that a GPU-based solution would be at all performant.

IMHO, the "Stupid GPU Tricks" fad is one that is fading, fast. The bottom line is that a graphics card is designed for graphics. It's more programmable than it used to be, but the transistor layout is still one that's optimized for the rasterization pipeline and not for other things. The need for generalized parallel processing hardware on consumer-level computers is clear, but the GPU is not the chip that will fill this niche. Time spent on bludgeoning a GPU into executing algorithm XYZ is time wasted.



Maybe doing multiple A* searches in parallel? Some of the cards support shader programs of up to 60,000 instructions (I'm not sure the instruction set is adequate, though), and PCI Express speeds up constant updates of 'map' texture data in graphics memory (so much so that there was some talk of going back to 'system' memory).


Influence maps might be a good possibility?


#19 Borundin   Members   -  Reputation: 136


Posted 06 August 2005 - 09:29 AM

Well, there seems to be some pathfinding already written as pixel shaders. See "Pathfinding on the GPU" on the shadertech.com site (source is available).

Also found another one on their front page, but I haven't tried that one yet.



#20 AndyTX   Members   -  Reputation: 802


Posted 08 August 2005 - 04:19 AM

Quote:
Original post by Sneftel
Brook and Sh handle the annoying programming interface issues, but not the underlying hardware design. They can't add in features the card doesn't have, such as inter-fragment communication.

Well, excepting perhaps "scatter" (which can still be done on the GPU with various trickery, although not altogether efficiently), all of the standard stream processing operations CAN be done on the GPU. I'm not sure what you're referencing with inter-fragment communication, since to my knowledge NOT allowing that is a fundamental property that differentiates standard CPUs and DSPs (perhaps THE most important property). If you're speaking of things like "reductions", those CAN be implemented efficiently on a GPU (and are in Brook IIRC, and soon Sh).
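
For reference, a reduction on this kind of hardware is usually done as repeated halving passes. A rough NumPy sketch of the idea (the array stands in for a 1D texture, and each loop iteration for one render pass):

import numpy as np

def gpu_style_sum(values):
    # Repeatedly combine element i with element i + n/2, halving the
    # "texture" each pass; log2(n) passes instead of one serial loop.
    buf = np.asarray(values, dtype=np.float32)
    while buf.size > 1:
        if buf.size % 2:                      # pad odd sizes with the identity element
            buf = np.append(buf, np.float32(0))
        half = buf.size // 2
        buf = buf[:half] + buf[half:]         # one data-parallel pass
    return buf[0]

data = np.random.default_rng(0).random(1000).astype(np.float32)
assert np.isclose(gpu_style_sum(data), data.sum())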

Quote:
Original post by Sneftel
I think I was unclear there... when I say "multicore", I'm not talking about hyperthreading or the AMD X2. I'm talking about the sort of stream processing that GPUs and the Cell processor can do. Still, I think we're mostly on the same page here.

I won't disagree with you here. Like I said, either the GPU will morph into a general stream processor (and then optionally be integrated into the motherboard - that detail is really insignificant to our development efforts), or something more general will come along and be able to take over the role that the GPU is playing right now.

In any case I think we all agree that stream/parallel processing is becoming more and more important. Right now, I'd still argue though that GPUs are relatively cheap, powerful and readily available stream processors that - with the aid of Sh and/or Brook - are easily used as such.






