Eye Movement Simulation

Started by
6 comments, last by Mythics 14 years, 10 months ago
I'm not sure whether this forum really fits this topic, but here goes.

I want to create an Eye Movement Simulation. It doesn't need to be 100% exact or anywhere near as fast as the actual human eye, of course; it's just a concept I'm trying to demonstrate to some friends of mine. My ultimate goal would be to read from the entire computer screen, but as that would most likely be a monumental task, I'd be happy with nearly anything (such as 'watching' a movie or similar).

My knowledge here isn't exactly at a professional level, so excuse any blunders and please correct me. To continue, I'm under the impression that the eye typically centers in on movement/luminescent differences/etc. So, for example, a white dot on a fully black screen will grab a person's attention. A moving white dot should in theory do even better.

What I'd like to do is simply track changes from one moment to the next, and display a 'center of focus' on the screen where the human eye is most likely being drawn. I probably can't keep up with live changes, which is why I thought that if I simply read an avi file or the like, I could pre-compute the 'center of focus' and then run the overlay the moment I start the video.

Does this make sense? I'm not terribly great at explaining myself sometimes.

My real question is: how the hell would I do something like this? Has anyone else attempted something like it? Is there a professional version of this kind of thing out there that I'm not finding? Most of what I've found so far doesn't really do what I'm looking to accomplish. Not to mention, I'd like to have the source for such a thing :P.

Thanks for any help, or just for reading all this.. lol.

-Mythics

Edit: Just to mention, I was only guessing regarding the "movement/luminescent differences" bit. I don't know for certain what the eye gets drawn to.

[Edited by - Mythics on June 23, 2009 9:12:10 AM]
I'm not exactly meaning to bump my own thread, but I wasn't sure if only editing the previous post will show up as 'unread' to others or not.


Can anyone help me out to at least confirm if I stuck this in the proper forum or not? The topic in itself is more of an Artificial Sense rather than Artificial Intelligence.

The idea came out of a discussion between me and a couple of friends: to create a form of AI that uses an Artificial Neural Network to 'read' onscreen data, I'd need a sort of 'front end' that feeds data to the ANN based on the 'center of focus'. That way, if the ANN were meant to confirm 'seeing' the letter A onscreen, it would be able to 'look around for it' rather than just learning from some static grid of on/off switches (like a digital clock, for example).

Again, my apologies if I'm not explaining myself easily.

Thanks,
Mythics
I'm not quite sure on what your simulation is supposed to do, is it like this?
Inputs:   Bitmap/screen    Last Eye Pos
                |_________________|
                         |
                      Process
Outputs:                 |
                    New Eye Pos

Another question is, what is the purpose of this sim?
I think the reason that no one jumped in was that your original post was a little scatter-shot and difficult to understand where you were headed.

Dave Mark - President and Lead Designer of Intrinsic Algorithm LLC
Professional consultant on game AI, mathematical modeling, simulation modeling
Co-founder and 10 year advisor of the GDC AI Summit
Author of the book, Behavioral Mathematics for Game AI
Blogs I write:
IA News - What's happening at IA | IA on AI - AI news and notes | Post-Play'em - Observations on AI of games I play

"Reducing the world to mathematical equations!"

Do you want to search an image for the spot that sticks out the most?
It's more of an image processing algorithm.. you could for example loop over all the pixels, and find the area of some radius where the image is the lightest, or where there are the clearest edges, etc. If you want to do it for movement, it's probably a lot harder. You would have to analyze several frames and find patterns that occur in all of them, and then see if the pattern has moved sufficiently, to pass some threshold needed to draw attention.
Try searching for computer vision, starting with wikipedia for example: http://en.wikipedia.org/wiki/Computer_vision.
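
The "lightest area of some radius" idea above can be sketched in a few lines. This is only a rough illustration, not a tuned implementation: it assumes the image is a plain list of rows of grayscale values, uses a square window rather than a true circle, and the name `brightest_window` is made up for the example.

```python
def brightest_window(image, radius):
    """Return (row, col) of the pixel whose square neighbourhood is brightest.

    `image` is assumed to be a list of rows of grayscale values (0-255).
    The window is the (2*radius+1)-wide square around each pixel, clipped
    at the image border. Ties go to the first pixel found in scan order.
    """
    height, width = len(image), len(image[0])
    best_score, best_pos = float("-inf"), (0, 0)
    for r in range(height):
        for c in range(width):
            # Sum the brightness of every pixel inside the clipped window.
            total = 0
            for rr in range(max(0, r - radius), min(height, r + radius + 1)):
                for cc in range(max(0, c - radius), min(width, c + radius + 1)):
                    total += image[rr][cc]
            if total > best_score:
                best_score, best_pos = total, (r, c)
    return best_pos
```

Swapping the inner sum for an edge-strength measure would give the "clearest edges" variant. The brute-force loops are slow on large frames, which is fine for a pre-computed overlay but not for live capture.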
Quote:Original post by Hodgman
I'm not quite sure on what your simulation is supposed to do, is it like this?
Inputs:   Bitmap/screen    Last Eye Pos
                |_________________|
                         |
                      Process
Outputs:                 |
                    New Eye Pos

Another question is, what is the purpose of this sim?


It is supposed to be like the ascii diagram you gave, where the process looks for what 'stands out' the most.
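
As a rough sketch of that process box, one possible shape is a function that takes the last eye position plus the spot that currently stands out and returns the new eye position. The step limit `max_step` and the name `next_eye_pos` are assumptions made for the illustration; nothing in the thread fixes how fast the simulated eye should move.

```python
import math


def next_eye_pos(last_pos, target, max_step):
    """Move the eye from last_pos toward target by at most max_step pixels.

    Positions are (row, col) pairs. `target` would come from whatever
    'stands out' detector is used; here it is simply passed in.
    """
    d_row = target[0] - last_pos[0]
    d_col = target[1] - last_pos[1]
    dist = math.hypot(d_row, d_col)
    if dist <= max_step:
        # Close enough to land on the target in a single step.
        return target
    scale = max_step / dist
    return (last_pos[0] + d_row * scale, last_pos[1] + d_col * scale)
```

Calling this once per frame makes the marker glide toward the target instead of teleporting, which is probably closer to what an overlay demo needs.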

The purpose of the simulation is to provide a front end for an Artificial Neural Network for image recognition. This way, instead of just feeding in a fixed 10x10 grid of pixels, for example, it could scan an image for 'noteworthy objects' to feed into the ANN. (I don't intend to actually try anything as complicated as all that; I just wanted to show off an example of what the front end would look like.)
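
To make the front-end idea concrete, here is a minimal sketch of cutting a fixed-size patch out of the image around the current center of focus, so the ANN always sees the same number of inputs. The function name `extract_patch`, the zero padding outside the image, and the list-of-rows image format are all assumptions for the example.

```python
def extract_patch(image, center, size=10, fill=0):
    """Copy a size x size window of `image` positioned around `center`.

    `image` is a list of rows of pixel values and `center` is (row, col).
    Pixels that fall outside the image are replaced by `fill`, so the
    result always has exactly size * size values.
    """
    height, width = len(image), len(image[0])
    row0 = center[0] - size // 2  # top edge of the window
    col0 = center[1] - size // 2  # left edge of the window
    return [
        [
            image[r][c] if 0 <= r < height and 0 <= c < width else fill
            for c in range(col0, col0 + size)
        ]
        for r in range(row0, row0 + size)
    ]
```

Flattening the returned rows gives a fixed-length input vector (100 values for the default 10x10 patch), regardless of where the focus sits on screen.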



Quote:Original post by Erik Rufelt
Do you want to search an image for the spot that sticks out the most?

For the most part, yes.

Quote:Original post by Erik Rufelt
It's more of an image processing algorithm.. you could for example loop over all the pixels, and find the area of some radius where the image is the lightest, or where there are the clearest edges, etc.

After putting some thought/research into how the human eye works, I believe the nearest thing I could really do is check for differences. White sticks out on black and black sticks out on white due to the difference in color. Take a checkerboard pattern as an example of a situation where the human eye can't figure out where it should center in on, and you might get a dizzying effect.


Quote:Original post by Erik Rufelt
If you want to do it for movement, it's probably a lot harder. You would have to analyze several frames and find patterns that occur in all of them, and then see if the pattern has moved sufficiently, to pass some threshold needed to draw attention.

The eye is in constant motion, so it is never truly stationary. If it ever was, it wouldn't take very long for your sight to go black due to no 'changes' being read from the eye.

So, my thought is to compare all pixels to their neighbors, calculate some amount of change between them, and 'guide' the center of focus towards the strongest and most dense area of change.

If I did something like that, I could account for movement by comparing the previous frame to the current frame as well (like a third dimension, if that helps to picture what I mean). That way I'd be comparing a pixel to its 8 neighbors within the current frame, and to its 9 neighbors in the previous frame (the same position plus the 8 around it).

Doing as I just described is probably far from what I actually want the program to do, but I think it would satisfy enough scenarios for the demonstration I wanted to present without getting too terribly far into the complexity of trying to simulate actual vision processing.
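
A minimal sketch of that neighbor-comparison idea, under some loud assumptions: frames are lists of rows of grayscale values, "amount of change" is taken to be the sum of absolute differences, and the 'strongest and most dense area of change' is approximated by a simple change-weighted centroid. The names `change_map` and `center_of_focus` are invented for the example.

```python
def change_map(prev, curr):
    """Score each pixel by how much it differs from its surroundings.

    For every pixel in `curr`, add up the absolute differences to its
    8 neighbours in `curr` and to the 9 pixels at and around the same
    position in `prev`. Neighbours outside the frame are skipped.
    """
    height, width = len(curr), len(curr[0])
    scores = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            total = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if not (0 <= rr < height and 0 <= cc < width):
                        continue
                    if (dr, dc) != (0, 0):
                        # Spatial change within the current frame.
                        total += abs(curr[r][c] - curr[rr][cc])
                    # Temporal change against the previous frame.
                    total += abs(curr[r][c] - prev[rr][cc])
            scores[r][c] = total
    return scores


def center_of_focus(scores):
    """Return the change-weighted centroid (row, col), or None if no change."""
    total = sum(sum(row) for row in scores)
    if total == 0:
        return None
    row = sum(r * v for r, line in enumerate(scores) for v in line) / total
    col = sum(c * v for line in scores for c, v in enumerate(line)) / total
    return (row, col)
```

A centroid gets pulled to the middle when two separate things change at once, so picking the peak of a blurred change map may suit some scenes better. It is kept simple here to match the scope of the demonstration.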

Quote:Original post by Erik Rufelt
Try searching for computer vision, starting with wikipedia for example: http://en.wikipedia.org/wiki/Computer_vision.

Thanks for the terminology 'computer vision' and the link. It does look like a starting point if nothing else. I was searching for all sorts of stuff but never thought of that, lol.



Thanks for the responses. I still don't know for sure if I'm explaining myself any better, but at least I'm getting some new ideas from ya :).

[Edited by - Mythics on June 24, 2009 12:16:39 PM]
You need to look up "visual salience" (or saliency, I've seen both spellings). This is probably a good introduction: http://www.scholarpedia.org/article/Visual_salience. Itti and some other researchers might have code for their models online.

In general, though, it's an open problem, and how the eye views the scene depends on many factors -- e.g. the task people are performing. It is also different for different people. And even the best existing models cannot predict it with 100% accuracy.
Quote:Original post by Gil Grissom
You need to look up "visual salience" (or saliency, I've seen both spellings). This is probably a good introduction: http://www.scholarpedia.org/article/Visual_salience. Itti and some other researchers might have code for their models online.


Thank you as well for this, it is most certainly helpful.


Quote:Original post by Gil Grissom
In general, though, it's an open problem, and how the eye views the scene depends on many factors -- e.g. the task people are performing. It is also different for different people.


I can't help but think that most infants would at least start out being drawn to very similar things. I would guess that the information the eye gathers over time leads each individual to behave differently as well. Myself, for example: I typically like to watch movies a second time to look for things in the background that certainly weren't drawing my attention the first time.

There's a book called On Intelligence, by Jeff Hawkins, that shows an illustration of a human face. The concept is that the saccades performed by the eye will typically lead it to points of interest. Just a neat piece of info imo.


Thanks for the info, I'll give it a proper look over at work today.
