threading on the objects inside the 'mind' course

Published April 23, 2022

JoeJ said somewhere that computers can only operate on and evaluate things once they have been transformed into a digital form. I want to dwell on that statement.

Realistic AI consists of two main stages. The first stage is making a reconstitution of the outer world in a virtual environment. The second stage is interpretation of the scene that results from reconstitution. Why am I using the term reconstitution? Because the information gathered through sensors by a robot is truncated. What a robot gathers is usually a mesh describing one side of an object; to create a complete mesh it would have to go around the object, which it often can't do (because of obstacles in the way) or doesn't have time to do. Reconstitution means that the partial meshes obtained through scanning are identified and replaced in the scene with the complete corresponding object meshes found in memory. You can't do a proper physics simulation unless you have the full shape of the bodies. Sensors also can't tell you the weight of objects (which is needed for a physics simulation), so all the information required for a physics simulation is obtained by matching the partial description of objects with objects previously saved in memory.

The object information loaded from memory is not limited to shape and weight; it can also cover the laws by which those objects operate. For a self-driving car, the information gathered through sensors can be a description of a person or a description of a car, but that alone is not enough to tell you the patterns by which the objects move. Pedestrians move along sidewalks, cars move only on the road and never cross onto the sidewalks, trains move along railways. This type of information can be saved as a complete description of the objects in the robot's memory and later loaded into the digital representation of a concrete real-life scene.
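To make the reconstitution step more concrete, here is a minimal sketch in Python. Everything in it is hypothetical (the ObjectModel and PartialScan classes, the crude bounding-box matching, the example numbers); a real system would use proper 3D shape descriptors or a learned matcher, but the flow of matching a partial scan against memory and loading the full shape, mass and motion rule is the point:

```python
from dataclasses import dataclass

@dataclass
class ObjectModel:
    name: str             # e.g. "pedestrian", "car"
    dimensions: tuple     # full bounding box (x, y, z) in metres
    mass_kg: float        # needed for physics, invisible to any sensor
    motion_rule: str      # e.g. "moves along sidewalks", "stays on the road"

@dataclass
class PartialScan:
    dimensions: tuple     # bounding box of the one-sided mesh we actually saw
    position: tuple       # where it was seen in the scene

def match(scan, memory):
    """Pick the stored model whose size best explains the partial scan."""
    def size_error(model):
        return sum((a - b) ** 2 for a, b in zip(scan.dimensions, model.dimensions))
    return min(memory, key=size_error)

def reconstitute(scans, memory):
    """Replace each partial mesh with the complete object it most likely is."""
    return [{"model": match(scan, memory), "position": scan.position}
            for scan in scans]

memory = [
    ObjectModel("pedestrian", (0.5, 0.4, 1.8), 75.0, "moves along sidewalks"),
    ObjectModel("car",        (4.5, 1.8, 1.5), 1400.0, "stays on the road"),
]
scans = [PartialScan((4.3, 0.3, 1.4), (10.0, 2.0, 0.0))]  # we only saw one side
for entry in reconstitute(scans, memory):
    m = entry["model"]
    print(f"{m.name} at {entry['position']}: {m.mass_kg} kg, rule: {m.motion_rule}")
```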

After reconstitution you still only have a bunch of objects, some of which are moving. So the next stage is figuring out what that means for your goals. I like to call this second stage the overmind.

So my goal in this post was to state, firstly, that the information gathered through sensors is always incomplete and, secondly, what needs to be done to fill in the missing data.


Comments

JoeJ

Realistic AI consists of two main stages. The first stage is making a reconstitution of the outer world in a virtual environment. The second stage is interpretation of the scene that results from reconstitution.

Traditionally, neither of those is an AI-related field. For the former you could look at how photogrammetry sensors and tools do it; for the latter there's rich work on 3D shape matching (which does not come close yet to the robustness of image classification, afaik).
In recent years, of course, both fields have seen contributions from the ML guys. For the former, there's a lot of hype about neural radiance fields (NeRF). But it's not clear to me which new problem this solves.

In games we don't have those problems. Our worlds are already composed of robust models, which have properties we can easily get, including mass and material properties, and function or purpose.
So if our AI agent sees a hammer, it can get all this information easily: I can pick it up, I can use it for this and that. The hammer may even link to animation data so the agent can use it to visually make a sword.
That's how we can avoid some hard problems real-world robots would have.
That's also why we implement our AI stuff in non-general but very specific ways, dictated by game design. For example, the hammer has the knowledge 'I can craft a sword from molten metal'. Or the map has the knowledge 'use this path to get from here to there'. So for us, intelligence can be implemented on the world, not the agents. This gives control over game design and also allows sharing work, which helps with efficiency.
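As a rough illustration of intelligence living on the world rather than on the agent, here is a smart-object style sketch in Python (names like SmartObject and Affordance are made up for the example; no particular engine does it exactly this way):

```python
from dataclasses import dataclass

@dataclass
class Affordance:
    action: str       # what an agent can do with the object
    requires: set     # what the agent must already have
    produces: str     # what the agent gets out of it

@dataclass
class SmartObject:
    name: str
    affordances: list

    def options_for(self, agent_inventory):
        """The object itself tells the agent what is possible right now."""
        return [a for a in self.affordances if a.requires <= agent_inventory]

hammer = SmartObject("hammer", [
    Affordance("pick up", requires=set(), produces="hammer in hand"),
    Affordance("craft sword", requires={"hammer in hand", "molten metal"},
               produces="sword"),
])

# The agent needs no crafting knowledge of its own; it just asks the hammer.
inventory = {"hammer in hand", "molten metal"}
for option in hammer.options_for(inventory):
    print(f"{hammer.name}: can '{option.action}' -> {option.produces}")
```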

We could say game AI is similar to the Borg in Star Trek, maybe. We can easily share an expensive tactics calculation over a whole group of agents.
We also have much easier sensing and association of the environment.

Though it may happen that the real-world robotics guys get such goodies too, if the 'internet of things' becomes widespread.

Reconstitution means that the partial meshes obtained through scanning are identified and replaced in the scene with the complete corresponding object meshes found in memory.

It's maybe better to work with incomplete data than to try to fix that data using models from a database.
Notice we humans do the same: we work with what we see. Assumptions about the occluded stuff are good enough in case we have no memory of it.

Sensors also can't tell you the weight of objects (which is needed for a physics simulation), so all the information required for a physics simulation is obtained by matching the partial description of objects with objects previously saved in memory.

Same argument here: we look at some object and we make assumptions: 'The object is small and separated from the ground, so I can lift it up. It's made of metal, so it's probably heavy. It has a long shape, so I can guess the inertia of the object as well.'
Then, while we lift the object, we see how heavy it really is. And by rotating it, we could measure inertia and store all that data in memory, linked to that object. Some ML could even make smart predictions and assumptions by comparing a new object with the objects we already have in our database.
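A small sketch of that guess-then-measure loop, in Python (hypothetical names and density numbers; a real agent would get the measured mass from its force sensors or the physics engine rather than from a hard-coded value):

```python
# Rough prior densities the agent can guess from appearance, in kg/m^3.
DENSITY_GUESS = {"metal": 7800.0, "wood": 600.0, "plastic": 1000.0}

memory = {}  # object id -> measured mass, filled in as the agent interacts

def guess_mass(material, volume_m3):
    """First assumption, made only from what the agent can see."""
    return DENSITY_GUESS.get(material, 1000.0) * volume_m3

def lift(object_id, material, volume_m3, measured_mass):
    """On interaction, replace the guess with the measured value and remember it."""
    guess = memory.get(object_id, guess_mass(material, volume_m3))
    memory[object_id] = measured_mass
    print(f"{object_id}: guessed {guess:.1f} kg, measured {measured_mass:.1f} kg")

lift("crowbar", "metal", 0.0006, 2.5)   # first time: visual guess, then measurement
print(memory["crowbar"])                # 2.5 kg, stored for the next encounter
```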
But the first step is to make your stuff work without having such data, which is an open problem for AI research. Building the database would just be a learning process on top of that, and we have pretty good AI learning capabilities now. But it feels like some other building blocks needed for true AI are still completely missing, and I doubt learning could model those missing things. I'm really not sure, because even naming those missing things is hard.

So my goal in this post was to state, firstly, that the information gathered through sensors is always incomplete and, secondly, what needs to be done to fill in the missing data.

There is also a lot of research on related geometry problems: detecting and filling holes, up to completing missing data in the scan from a database.
But as said, for games we don't really have this problem, so there's not much need to think about it.

April 23, 2022 02:05 PM
Calin

For the former you could look at how photogrammetry sensors and tools do it, for the latter there's rich work on 3D shape matching

When I said "interpreting" I wasn't referring to matching 3D shapes/objects; that's something done in the first stage. By interpreting I mean figuring out why the stuff that's going on in any particular real-life scenario is happening.
So the first stage is figuring out what is happening, like capturing the state of things (which could be objects of a certain type moving); the second stage is figuring out why they move, like the motive (or why the objects that are not moving were there the moment we entered the scene; this is important because knowing why they are there might tell us what could happen to them next).

[edit] I edited this reply several times

April 23, 2022 03:16 PM