Selecting units in a 3D environment

8 comments, last by DiegoFloor 11 years, 9 months ago
I'm making a strategy game using OpenGL on Android.

It's time to upgrade my unit selection code, which currently consists of a simple ray-to-plane intersection.

Last time, I used my own math functions and transformations to figure out which unit the user was clicking.

So the question arises: once I've rendered a scene, is there an easy way to figure out which unit the user is tapping?

I'm not using any engine here, just OpenGL on Android with my home-brew math library.

My Oculus Rift Game: RaiderV

My Android VR games: Time-Rider & Dozer Driver

My browser game: Vitrage - A game of stained glass

My Android games: Enemies of the Crown & Killer Bees

There are two ways, as far as I know.

The first: convert the clicked screen coordinate back to world coordinates by applying your projection in reverse. This is done by inverting the model-view-projection matrix, which is going to be tricky if you aren't familiar with it. Then use the world coordinate to look up the nearest unit.

The second and simpler approach is to have an ID buffer. Objects (or their selection areas) should render a unique uint ID to a buffer, which you then sample at the point you click with glReadPixels. Then just have a lookup from ID to object.
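
For example, on OpenGL ES 2.0 (which has no integer render targets) a common variant is to pack the ID into an ordinary RGBA color during a separate picking pass and decode it after the read-back. A rough sketch of just the encode/read-back side; encodeId and pickObject are placeholder names, and the picking pass itself is your normal draw loop with each object tinted by its encoded ID:

[code]
// Sketch: ID picking on OpenGL ES 2.0 by packing the object ID into an RGBA
// color. encodeId() produces the uniform color for a bare-bones picking
// shader; pickObject() reads the tapped pixel back after the picking pass.
#include <GLES2/gl2.h>

// Pack a 24-bit object ID into an RGB color (0 is reserved for "background").
void encodeId(unsigned id, float rgb[3]) {
    rgb[0] = ((id >>  0) & 0xFF) / 255.0f;
    rgb[1] = ((id >>  8) & 0xFF) / 255.0f;
    rgb[2] = ((id >> 16) & 0xFF) / 255.0f;
}

// Call after rendering the picking pass and BEFORE swapping buffers.
// Window y is flipped because GL's origin is the bottom-left corner.
unsigned pickObject(int tapX, int tapY, int viewportHeight) {
    unsigned char px[4] = {0, 0, 0, 0};
    glReadPixels(tapX, viewportHeight - 1 - tapY, 1, 1,
                 GL_RGBA, GL_UNSIGNED_BYTE, px);
    return px[0] | (px[1] << 8) | (px[2] << 16);
}
[/code]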
[quote]
The first: convert the clicked screen coordinate back to world coordinates by applying your projection in reverse. This is done by inverting the model-view-projection matrix, which is going to be tricky if you aren't familiar with it. Then use the world coordinate to look up the nearest unit.
[/quote]
Why invert?

What I do sounds very similar to what he is already doing: I compute the camera matrix (the inverse of the view matrix, built directly rather than by inverting), which gives me the world-space position as well as the basis vectors of the camera. When I have window coordinates to pick, I convert them to normalized device coordinates, then use the world-space camera information plus the projection information (i.e. the near-plane distance) to construct a ray, and then perform a search to find the hit. If scaling is involved in the camera matrix calculation, I use the properties of matrices to avoid normalizing the basis/direction vectors.

I like this approach because I need the world-space camera information for other functions such as culling, billboarding, etc.

If you need pixel-perfect picking, then the selection-buffer approach described by Zoomulator is a simple solution, though if the user does not tap exactly on a pixel covered by the unit, it will not register a hit.
Stop twiddling your bits and use them already!
The inverse matrix exactly reverses the transformations you did to get the screen coordinates. I forgot to mention that you also need the depth buffer value as the z coordinate in this case, as well as the normalized xy coordinates you mentioned.

I also realised that it's probably just the view-projection you'd want to invert to get the coordinate. The model matrices can be left out.*

A single matrix multiplication and you're done, as long as you got the inverting right.
VPmatrix * WorldPos = NormScreenCoord
InverseVPmatrix * NormScreenCoord = WorldPos

No need for a raycast search or compensation for scaling, but still requires a spatial search of some kind to get the object.
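
For illustration, a minimal sketch of that unprojection, assuming a hypothetical column-major Mat4 standing in for a home-brew math library and the inverse view-projection already computed. The one extra step the formula above glosses over is the divide by w, which undoes the perspective divide of the forward transform:

[code]
// Sketch: unproject a tap back to world space with the inverse
// view-projection matrix. Mat4/Vec4 are placeholders for whatever the math
// library provides; depth is the value read from the depth buffer at the tap
// (0..1 with the default glDepthRange).
struct Vec4 { float x, y, z, w; };
struct Mat4 {
    float m[16];                       // column-major: m[col * 4 + row]
    Vec4 transform(const Vec4& v) const {
        return {
            m[0]*v.x + m[4]*v.y + m[8]*v.z  + m[12]*v.w,
            m[1]*v.x + m[5]*v.y + m[9]*v.z  + m[13]*v.w,
            m[2]*v.x + m[6]*v.y + m[10]*v.z + m[14]*v.w,
            m[3]*v.x + m[7]*v.y + m[11]*v.z + m[15]*v.w,
        };
    }
};

Vec4 unproject(const Mat4& inverseViewProj,
               float winX, float winY, float depth,
               float viewportW, float viewportH) {
    // Window coordinates -> normalized device coordinates, all in [-1, 1].
    Vec4 ndc = { 2.0f * winX / viewportW - 1.0f,
                 1.0f - 2.0f * winY / viewportH,   // window y grows downward
                 2.0f * depth - 1.0f,
                 1.0f };
    Vec4 world = inverseViewProj.transform(ndc);
    // Undo the perspective divide applied by the forward transform.
    world.x /= world.w; world.y /= world.w; world.z /= world.w; world.w = 1.0f;
    return world;
}
[/code]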

Regarding the ID buffer, maybe the tap could be resolved by sampling an area rather than a single pixel: check the pixels for IDs and, if more than one non-background ID shows up, pick whichever covers the most pixels. Or something like that; I don't really know how touch screens report those things.
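
Something along these lines could work; a sketch only, where readIdAt() is a hypothetical stand-in for however you sample the ID buffer (reading the whole region with one glReadPixels call would be cheaper than per-pixel reads):

[code]
#include <map>

// Hypothetical helper: returns the object ID stored in the ID buffer at
// (x, y), with 0 meaning "background". Bounds clamping omitted.
unsigned readIdAt(int x, int y);

// Sample a (2*radius+1)^2 box around the tap and return whichever
// non-background ID covers the most pixels; 0 if nothing was hit.
unsigned pickAroundTap(int tapX, int tapY, int radius) {
    std::map<unsigned, int> counts;
    for (int y = tapY - radius; y <= tapY + radius; ++y)
        for (int x = tapX - radius; x <= tapX + radius; ++x) {
            unsigned id = readIdAt(x, y);
            if (id != 0)
                ++counts[id];
        }
    unsigned best = 0;
    int bestCount = 0;
    for (const auto& kv : counts)
        if (kv.second > bestCount) { best = kv.first; bestCount = kv.second; }
    return best;
}
[/code]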

In my current project I'm using both: the ID buffer to look up objects, since it's a pretty fast mapping, and the screen-to-world transformation for moving things. I don't know how optimal it is, but it lets me get away with mapping things without keeping more advanced query structures that can do spatial and ray-cast searches.

* I'll have to look this up though
Does anyone have a link to an article, or some sample code?
Doesn't matter which language.


Sorry, I don't have any good example code for it; I worked it out myself. I used Khan Academy to get my matrix inversion right. The ID buffer should be easy enough if you know how to manage framebuffers in OpenGL.

I'm not using legacy GL though; I'm on 3.3 with shaders, with the fragment shader outputting to both a color buffer and an ID buffer. I guess it's OpenGL ES for Android? It's kind of like 3.3 scaled down?

You could also make two rendering passes, one for color and one for IDs, binding different buffers.

Once you've got the ID buffer, it's very easy to use glReadPixels to get at the information.
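
Roughly, the setup looks something like this, from memory and untested, so treat it as a sketch rather than working code. GL 3.3-style calls (OpenGL ES 3.0 has equivalents; ES 2.0 lacks integer textures); the width/height/tapX/tapY variables, the usual GL headers/loader, and a fragment shader writing the uint ID to output location 1 are all assumed:

[code]
// A framebuffer with the normal color attachment plus a single-channel
// unsigned-integer attachment that the fragment shader writes the object ID
// into (layout(location = 1) out uint id; in GLSL).
GLuint fbo = 0, colorTex = 0, idTex = 0;

void createPickingFbo(int width, int height) {
    glGenFramebuffers(1, &fbo);
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);

    // Ordinary color attachment at location 0.
    glGenTextures(1, &colorTex);
    glBindTexture(GL_TEXTURE_2D, colorTex);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, width, height, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                           GL_TEXTURE_2D, colorTex, 0);

    // Unsigned-integer ID attachment at location 1.
    glGenTextures(1, &idTex);
    glBindTexture(GL_TEXTURE_2D, idTex);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_R32UI, width, height, 0,
                 GL_RED_INTEGER, GL_UNSIGNED_INT, nullptr);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT1,
                           GL_TEXTURE_2D, idTex, 0);

    // Tell GL the fragment shader writes to both attachments.
    // (Attach a depth renderbuffer here as usual; omitted for brevity.)
    const GLenum bufs[] = { GL_COLOR_ATTACHMENT0, GL_COLOR_ATTACHMENT1 };
    glDrawBuffers(2, bufs);
}

// After rendering into the FBO, read back the ID under the tap.
GLuint readPickedId(int tapX, int tapY, int height) {
    GLuint id = 0;
    glBindFramebuffer(GL_READ_FRAMEBUFFER, fbo);
    glReadBuffer(GL_COLOR_ATTACHMENT1);
    glReadPixels(tapX, height - 1 - tapY, 1, 1,
                 GL_RED_INTEGER, GL_UNSIGNED_INT, &id);
    return id;
}
[/code]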

Sorry that it's still a bit thin on code. See it as an exercise ;)
Anything specific that you can't figure out? I'll try to help

[quote]
Does anyone have a link to an article, or some sample code? Doesn't matter which language.
[/quote]
While my personal code is kind of tailored to my application, MathGeoLib's frustum(.h|.cpp) code is very clean (though I personally prefer the "radar" approach to intersection testing.)

An example of what I described:
If your view transform is something like this:
Translate1*Scale*RotateZ*RotateX*RotateY*Translate2

Your camera transform is this:
Translate2' * RotateY' * RotateX' * RotateZ' * Scale' * Translate1'

Now you can extract the position of the camera from the translation part of the resulting matrix.
Note: you'll likely want the left/up/forward vectors to be normalized. As an optimization you can do the following:
Translate2' * RotateY' * RotateX' * RotateZ' * Translate1' * Scale' (only apply the inverse scaling to the translation part of the matrix).
That way the columns in the rotation part of the matrix are all unit length.

This gives you the camera information.

Take the coordinates of the tap and convert them from window coordinates to normalized device coordinates. Your ray's direction is then simply:
Forward * near-plane distance + Left * x (in NDC) + Up * y (in NDC)
where forward/left/up are extracted from the camera matrix as described above, and the ray starts from your camera's position (the translation part extracted from the matrix).
Note: you might have to change some of the signs in the direction equation depending on the convention you use for the window-coordinate origin.

Disclaimer: I don't remember why there's a conversion to NDC, nor do I have the time to verify it right now.
Also, I had written in my notes that this assumes a symmetric frustum, but I can't remember why either.
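
As an illustration, here is a sketch of that construction in code, assuming a symmetric frustum and hypothetical Vec3/Camera types standing in for a home-brew math library. The near-plane half extents (near * tan(fov/2) horizontally and vertically) are what the NDC coordinates get scaled by, and as noted above, the signs may need flipping for your window-coordinate convention:

[code]
// Minimal sketch of the tap-to-ray construction described above.
struct Vec3 { float x, y, z; };

Vec3 add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 mul(Vec3 v, float s) { return {v.x * s, v.y * s, v.z * s}; }

// Camera basis vectors and position, extracted from the camera matrix
// (inverse view matrix) as described in the post.
struct Camera {
    Vec3 position, forward, left, up;
    float nearDist;      // distance to the near plane
    float halfWidth;     // near-plane half extents: nearDist * tan(fov/2)
    float halfHeight;
};

// windowX/windowY is the tap position; viewportW/viewportH is the size in pixels.
void pickRay(const Camera& cam, float windowX, float windowY,
             float viewportW, float viewportH, Vec3& outOrigin, Vec3& outDir)
{
    // Window -> normalized device coordinates, both axes in [-1, 1].
    // Window y usually grows downward, hence the flip.
    float ndcX = 2.0f * windowX / viewportW - 1.0f;
    float ndcY = 1.0f - 2.0f * windowY / viewportH;

    // Direction to the tapped point on the near plane:
    // forward * near + left * x + up * y, with NDC scaled to near-plane units.
    Vec3 dir = mul(cam.forward, cam.nearDist);
    dir = add(dir, mul(cam.left, ndcX * cam.halfWidth));   // flip sign if needed
    dir = add(dir, mul(cam.up,   ndcY * cam.halfHeight));

    outOrigin = cam.position;
    outDir = dir; // normalize if your intersection code expects a unit direction
}
[/code]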
Stop twiddling your bits and use them already!
There's another alternative, less flexible but it should work nonetheless: make a 3D cursor like you would any other unit, as a 3D model hovering over the terrain, with its coordinates on the terrain controlled by the mouse. It won't work with the UI, though...

[quote]
There's another alternative, less flexible but it should work nonetheless: make a 3D cursor like you would any other unit, as a 3D model hovering over the terrain, with its coordinates on the terrain controlled by the mouse. It won't work with the UI, though...
[/quote]

This runs on a touch screen. Hence, there is no mouse-move event before the screen is tapped.



[quote]
This runs on a touch screen. Hence, there is no mouse-move event before the screen is tapped.
[/quote]

Ah, sorry. I didn't read that it was for mobile.

