So I will probably implement this approach, but I have also thought of another approach that I'd like to try.
The solution that the neural network is learning is not the one I would choose. So I wonder whether it would be possible to solve the pursuit problem myself a few times and record the responses that I use. I would then train the neural network to reproduce how I solved the problem. Once it has learned my approach, I would let the neural network try to find a better solution to the original problem by itself. I hope that this kind of "assisted learning" would be a useful way of getting the learning on the right track from the start.
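To make the first two steps concrete, here is a rough sketch of what I have in mind, not a finished implementation. Everything in it is an assumption for illustration: the state is taken to be the target's position relative to the pursuer, `my_response` is a hypothetical stand-in for the responses I would actually record, and a single linear layer stands in for the neural network.

```python
import random

def my_response(state):
    # Hypothetical stand-in for a recorded human response:
    # steer toward the target in proportion to the offset.
    dx, dy = state
    return (0.5 * dx, 0.5 * dy)

def record_demonstrations(n, rng):
    # Collect (state, response) pairs, as if I had solved the problem n times.
    demos = []
    for _ in range(n):
        state = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        demos.append((state, my_response(state)))
    return demos

def pretrain(demos, epochs=50, lr=0.1):
    # Supervised training on the recorded pairs: plain stochastic
    # gradient descent on squared error for a 2x2 linear layer.
    w = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(epochs):
        for state, action in demos:
            for i in range(2):
                pred = w[i][0] * state[0] + w[i][1] * state[1]
                err = pred - action[i]
                for j in range(2):
                    w[i][j] -= lr * err * state[j]
    return w

def policy(w, state):
    # Response produced by the trained layer for a given state.
    return tuple(w[i][0] * state[0] + w[i][1] * state[1] for i in range(2))

rng = random.Random(0)
demos = record_demonstrations(200, rng)
weights = pretrain(demos)
```

The resulting `weights` would then be the starting point for the usual learning on the original problem, instead of a random initialisation.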
I am sure this is not the first time someone has used this approach. I've called it "assisted learning", but I'm sure it goes by another name in the literature. Does anyone know of any references to this kind of work?