Yeah that's how it works guys. I can use an image or drawing or 3d object like a soda can as a target. When you open the app and point the camera at a target it recognizes, it can spawn whatever you want within the game engine that can have whatever game logic you want.
At work I made cards as targets, with a 3D animated FBX character for each card. If you move the cards close enough to each other, the characters play animations and do damage until one dies. I also made intros and exits: when the camera recognizes a card, an intro particle effect plays and the character appears, playing an idle animation, and if the camera loses the target, a particle effect starts and the character disappears.
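To give an idea of the "move the cards close together and they fight" logic, here's a rough engine-agnostic sketch in Python. It is not the actual project code: the `Card` class, `ATTACK_RANGE`, the damage numbers and the state names are all made up for illustration, and a real engine would drive animations where this just sets a `state` string.

```python
import math
from dataclasses import dataclass

ATTACK_RANGE = 0.15  # assumed: max distance between card centres, in metres


@dataclass
class Card:
    name: str
    x: float          # tracked position of the card on the table plane
    z: float
    hp: int = 100     # assumed starting health
    damage: int = 10  # assumed damage per hit
    state: str = "idle"


def distance(a, b):
    """Distance between two cards on the table plane."""
    return math.hypot(a.x - b.x, a.z - b.z)


def fight_tick(a, b):
    """One combat tick: cards in range trade hits until one of them dies."""
    if "dead" in (a.state, b.state):
        return
    if distance(a, b) > ATTACK_RANGE:
        a.state = b.state = "idle"
        return
    a.state = b.state = "attack"
    for attacker, defender in ((a, b), (b, a)):
        if attacker.state == "dead":
            continue  # a character killed this tick does not hit back
        defender.hp = max(0, defender.hp - attacker.damage)
        if defender.hp == 0:
            defender.state = "dead"
            attacker.state = "idle"
```

You'd call `fight_tick` on a timer (or from animation events) for each pair of tracked cards, feeding in the positions the tracker reports each frame.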
You can rotate the cards, pick them up and as long as part of the target is still showing it's pretty good at following it.
Similarly, I was thinking of making kids' puzzles where once you put one together, you can view it as a 3D scene with animated animals and stuff.
For tower defense you could position your pieces on a game board or on a table top, and point your cam at it to watch it play out. Tablets work better than phones because of the bigger screen, but it runs smoothly on both.
I made a mob spawner that drops models at an interval, and they walk toward a waypoint. You could put your character cards in their path to make them fight, and if any of the enemies make it to their waypoint it counts against you.
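A spawner like that boils down to a timer plus a list of mobs. Here's a minimal Python sketch of the idea, assuming a straight 1-D path to keep it short. The `Mob` and `Spawner` classes and the `leaked` counter are hypothetical names, not anything from the actual project or engine.

```python
from dataclasses import dataclass, field


@dataclass
class Mob:
    pos: float = 0.0    # distance travelled along the path (1-D for simplicity)
    speed: float = 1.0  # assumed units per second


@dataclass
class Spawner:
    interval: float     # seconds between spawns
    waypoint: float     # path length from the spawn point to the goal
    timer: float = 0.0
    mobs: list = field(default_factory=list)
    leaked: int = 0     # mobs that reached the waypoint (counts against you)

    def update(self, dt):
        """Advance the simulation by dt seconds."""
        # Drop a new mob every `interval` seconds.
        self.timer += dt
        while self.timer >= self.interval:
            self.timer -= self.interval
            self.mobs.append(Mob())
        # Walk every mob toward the waypoint and count the ones that arrive.
        still_walking = []
        for mob in self.mobs:
            mob.pos += mob.speed * dt
            if mob.pos >= self.waypoint:
                self.leaked += 1
            else:
                still_walking.append(mob)
        self.mobs = still_walking
```

In a real game you'd call `update` once per frame with the frame time, and the combat check against the character cards would remove mobs before they reach the waypoint.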
But you can apply any game logic you'd use in a game engine. The camera feed is simply the input. Once a target is recognized you can spawn 1 or many models with or without animations, with or without AI...
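As a sketch of what "target recognized, spawn stuff" can look like in code: a lookup from target name to the models it spawns, plus found/lost handlers. This is plain Python for illustration only. The target names, model file names and the `Scene` class are invented, and the actual callbacks depend on whatever AR SDK you use.

```python
# Hypothetical target names and model files, purely for illustration.
TARGET_MODELS = {
    "card_knight": ["knight.fbx"],
    "card_orcs": ["orc.fbx", "orc.fbx", "orc_chief.fbx"],
}


class Scene:
    def __init__(self):
        self.active = {}  # target name -> models currently spawned on it
        self.log = []     # stand-in for particle effects / animations

    def on_target_found(self, target):
        """Tracker reports a target: play the intro and spawn its models."""
        if target in self.active or target not in TARGET_MODELS:
            return
        self.log.append(f"intro_fx:{target}")
        self.active[target] = list(TARGET_MODELS[target])

    def on_target_lost(self, target):
        """Tracker lost a target: play the exit and remove its models."""
        if self.active.pop(target, None) is not None:
            self.log.append(f"exit_fx:{target}")
```

Since `active` is keyed by target, several targets can be tracked at once, each with its own set of models.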
This was last weekend when I first got it working, but each target spawns a different set of models. In this vid it's only 1 target at a time, but now it can track dozens at the same time:
@mipmap yeah at work they were showing me the IKEA catalog and we're doing something similar with a different catalog to show as a demo to prospective buyers. You can spin the cards around to view the models from 360 degrees. And you can even add virtual buttons, where you map off a region of the target and, when that region is occluded from the camera stream, you can trigger a function in the game engine...
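The virtual button idea can be sketched roughly like this. The SDK handles the real occlusion detection internally, so this is only a toy model under my own assumptions: the target has known feature points in 0..1 target coordinates, and a button counts as pressed when most of the features inside its region stop being visible. `VirtualButton`, `threshold` and the rest are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VirtualButton:
    name: str
    rect: tuple                      # (x_min, y_min, x_max, y_max) on the target
    on_press: Callable[[str], None]  # function to trigger in the game
    threshold: float = 0.3           # assumed: pressed if under 30% is visible
    pressed: bool = False

    def contains(self, point):
        x, y = point
        x0, y0, x1, y1 = self.rect
        return x0 <= x <= x1 and y0 <= y <= y1

    def update(self, all_features, visible_features):
        """Fire on_press once when the button region becomes occluded."""
        inside = [p for p in all_features if self.contains(p)]
        if not inside:
            return  # no features in the region, nothing to measure
        visible = sum(1 for p in inside if p in visible_features)
        covered = visible / len(inside) < self.threshold
        if covered and not self.pressed:
            self.on_press(self.name)
        self.pressed = covered
```

The `pressed` flag is there so that holding a finger over the region fires the callback once instead of every frame.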
Edit: One idea I was thinking of was to go take pictures of landmarks in the city from common positions and use them as targets which you could spawn stuff around. I also made a screenshot script that will snap a shot of the virtual stuff over the real-life scene. So I could make targets of statues, and when you load the app and point the cam at the statue you can add ears, mustaches, noses, hats, etc. People can pose in front of it and it will all be in the image in their phone's gallery.