Using machine learning to predict the collapse & stabilization of complex systems?

(Original post: http://stackoverflow.com/questions/26149416/using-machine-learning-to-predict-the-collapse-stabilization-of-complex-system)

 

 

I should add that, since the Stack Overflow post, I've also tried various methods ranging from windowing to convolution, FFTs, a variation on expansion meshing, and even an approach built around running a Monte Carlo tree search over a set of systematic samples (really overcomplicated, and it didn't work).

I'm at a serious loss for ideas right now. This is a project I'm doing with five university students, and all of us are clueless about how to approach this correctly.

If anyone has an idea on how to approach this, please let me know.


My guess is that the typical SO reader has no idea what you're trying to do. For example, the following is completely meaningless to me:

 


I've been working on a fuzzy logic SDK for the past 3 months now, and it's come to the point where I need to start heavily optimizing the engine.

As with most "utility" or "needs" based AI systems, my code works by placing various advertisements around the world, comparing said advertisements against the attributes of various agents, and "scoring" the advertisement [on a per agent basis] accordingly.

It's not clear to me which terms you're using as AI terms and which terms are problem domain specific. What does placing advertisements around the world have to do with a fuzzy logic SDK?

Perhaps if we understood the problem domain better, the solution could avoid machine learning for detecting expected crash conditions.


OK... so it sounds like you are trying to figure out which butterfly is going to cause the hurricane -- i.e. "solve chaos theory". Yeah... that's gonna be a bitch. Even if you just run the simulation looking for values that start to arc in one direction, you don't truly know how far it will get until it starts to correct.


I don't think I need to find out which one is going to be the butterfly; I only need to find out when a collapse will happen.

 

I've been asking various professors and math grad students, and I've gotten suggestions ranging from neural networks to multivariate Markov chains. Any opinions?

 

Markov chain description via Reddit:

For the non-periodic repetition you might be able to capture it using a multivariate Markov chain. It won't reproduce the exact behavior but it should be able to approximate it. You might want to do some sensitivity analysis to see how robust your system is to approximations of the attributes.



http://www.zubek.net/robert/publications/Needs-based-AI-draft.pdf

 

This should explain the context for what I mean by advertisements. My scoring system is a lot more complicated than the one described (various eigenvectors and whatnot), but it should give you a general idea.
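(For illustration, here is a minimal C++ sketch of the kind of needs-based scoring the linked paper describes; the struct names and the simple weighted sum are placeholders, not the actual eigenvector-based system.)

```cpp
#include <cstddef>
#include <vector>

// Illustrative sketch of needs-based advertisement scoring. Each advertisement
// promises to satisfy some needs; an agent scores it by how urgent those needs
// currently are. Names and sizes are placeholders.

constexpr std::size_t kNumNeeds = 8;   // e.g. hunger, sleep, social, ...

struct Advertisement {
    float promisedSatisfaction[kNumNeeds]; // how much of each need this ad claims to fill
};

struct Agent {
    float needUrgency[kNumNeeds];          // 0 = satisfied, 1 = desperate
};

// Higher score = more attractive advertisement for this agent.
float ScoreAdvertisement(const Agent& agent, const Advertisement& ad)
{
    float score = 0.0f;
    for (std::size_t i = 0; i < kNumNeeds; ++i)
        score += agent.needUrgency[i] * ad.promisedSatisfaction[i];
    return score;
}

// Pick the best-scoring advertisement for one agent (assumes ads is non-empty).
std::size_t PickBestAdvertisement(const Agent& agent,
                                  const std::vector<Advertisement>& ads)
{
    std::size_t best = 0;
    float bestScore = -1.0f;
    for (std::size_t i = 0; i < ads.size(); ++i) {
        const float s = ScoreAdvertisement(agent, ads[i]);
        if (s > bestScore) { bestScore = s; best = i; }
    }
    return best;
}
```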

 

In terms of speed, what I'm currently getting isn't that bad. I can calculate 20,000 iterations a second for a system with 4.3k agents and 500 advertisements. However, for the game I'm currently working on, that number would most likely need to increase by a factor of 3 - 5. (I figured out that if I could just copy segments of the graph rather than recalculating, the factor increase would be nearly 10. That number is very attractive to me haha)

I should also add that an iteration does not directly imply that an advertisement is being calculated (it's usually locked for X turns).
In actuality, it's close to 2k ads per second per entity.

If you have access to professors and math grad students, find one with a decent grasp on chaos mechanics and have them explain to you why you can't do what you're attempting to do.



 

There's probably an "n" in their answer.

 

For a shorter explanation, look up Poincaré's three-body problem.

Huh... alright, well, I looked into chaos theory a bit more heavily. There should still be a way to tell if I'm at an equilibrium and not about to collapse.

If I at least know that things are stable right now, I can still use an LOD system.

The whole point of chaos theory is that it is impossible to know how long an equilibrium might remain stable, and what might destabilize it. Attractors (regions where the system tends to settle) are not perfectly stable, and it takes only minuscule variations in conditions to radically alter which attractor(s) you land in.

The more complex your system, the more likely you are to run afoul of these principles. Fundamentally you just cannot do what you want to do - it isn't even mathematically feasible. The only way to know the state of a complex system at time T in the future is to wait until time T actually occurs.

Well, I imagine I don't need to predict into the future for this. How could I tell that I'm stable right now, rather than five minutes from now?
Is there a way to do that?

All I need to know is whether there is a method that would allow me to greatly reduce calculations during stability. If the graph just keeps repeating, there has to be a way I can detect that and keep copying until an interrupt happens or something significant occurs.
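(For illustration, a hedged C++ sketch of one way the "detect repetition and copy" idea could look: track each agent's recent decisions and check whether the last `period` decisions exactly repeat the `period` before them. The class and period handling are hypothetical, not part of the engine described here.)

```cpp
#include <cstddef>
#include <deque>

// Hypothetical sketch: detect that the most recent `period` decisions repeat the
// `period` before them, so cached results could be reused instead of rescoring.
// `DecisionId` is simply the index of the chosen advertisement.

using DecisionId = std::size_t;

class RepetitionDetector {
public:
    explicit RepetitionDetector(std::size_t maxHistory) : maxHistory_(maxHistory) {}

    void Record(DecisionId d)
    {
        history_.push_back(d);
        if (history_.size() > maxHistory_) history_.pop_front();
    }

    // True if the last `period` decisions exactly repeat the previous `period`.
    bool IsRepeating(std::size_t period) const
    {
        if (period == 0 || history_.size() < 2 * period) return false;
        const std::size_t n = history_.size();
        for (std::size_t i = 0; i < period; ++i)
            if (history_[n - 1 - i] != history_[n - 1 - period - i]) return false;
        return true;
    }

private:
    std::size_t maxHistory_;
    std::deque<DecisionId> history_;
};
```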


I am confused... why do you want to copy the graph? I thought it represented what your agents were doing in the simulation?


I didn't actually mean the graph. I meant the values for the next iteration.

 

But I solved it. (Well, I didn't; I sent the code to a friend who is a specialist in econometrics.) I'm storing Markov chains of the advertisements the agent picks. So if it decides to eat, then from its track record there is, say, a 15% chance that it'll want to sleep next. Roll a random number and execute the outcome it lands on. The law of large numbers means these probabilities approach their true values as the sample size tends to infinity, so I imagine this would produce quite accurate results. I could even just throw an if statement or two in there for good measure.
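(A minimal C++ sketch of the first-order Markov chain just described: count observed ad-to-ad transitions, then roll against the empirical distribution. The class name and fallback behaviour are illustrative, not the actual code.)

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Count observed ad-to-ad transitions for an agent, then sample the next ad
// from the empirical distribution.

class AdMarkovChain {
public:
    explicit AdMarkovChain(std::size_t numAds)
        : counts_(numAds, std::vector<std::size_t>(numAds, 0)) {}

    // Record that the agent picked `next` right after `prev`.
    void Observe(std::size_t prev, std::size_t next) { ++counts_[prev][next]; }

    // Sample the next ad given the last one, using the empirical probabilities.
    // Falls back to ad 0 if `prev` has never been observed.
    std::size_t SampleNext(std::size_t prev, std::mt19937& rng) const
    {
        const auto& row = counts_[prev];
        std::size_t total = 0;
        for (std::size_t c : row) total += c;
        if (total == 0) return 0;

        std::uniform_int_distribution<std::size_t> dist(1, total);
        std::size_t roll = dist(rng);
        for (std::size_t i = 0; i < row.size(); ++i) {
            if (roll <= row[i]) return i;
            roll -= row[i];
        }
        return row.size() - 1; // unreachable if counts are consistent
    }

private:
    std::vector<std::vector<std::size_t>> counts_;
};
```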

Think of it this way: how do you define "stable"?

Stable means you're not about to "collapse" within some finite window of future time. But you can't predict the future, so how can you measure if you are stable?


In other words: stability is a property of a system over time, not instantaneously.


Huh... that's a really good point. People believed they were stable right up until the recession, though, so I imagine you can also estimate whether you're stable from a history of data. Obviously that would be an approximation, and possibly a false positive, but at least it could be correct over a constant time interval.
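(As a concrete, hedged example of "knowing you're stable from a history of data": watch a rolling window of some scalar health metric and call the system stable only while that window stays within a small band. The metric, window, and threshold below are placeholders, and this is purely retrospective; it cannot rule out a future collapse.)

```cpp
#include <cstddef>
#include <deque>

// Hypothetical sketch: declare the system "currently stable" when a scalar
// metric (e.g. aggregate citizen happiness) has stayed within a small band
// over the last `window` iterations.

class StabilityMonitor {
public:
    StabilityMonitor(std::size_t window, double maxSpread)
        : window_(window), maxSpread_(maxSpread) {}

    void Record(double metric)
    {
        samples_.push_back(metric);
        if (samples_.size() > window_) samples_.pop_front();
    }

    bool LooksStable() const
    {
        if (samples_.size() < window_) return false; // not enough history yet
        double lo = samples_.front(), hi = samples_.front();
        for (double s : samples_) {
            if (s < lo) lo = s;
            if (s > hi) hi = s;
        }
        return (hi - lo) <= maxSpread_;
    }

private:
    std::size_t window_;
    double maxSpread_;
    std::deque<double> samples_;
};
```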

Excellent - now you understand the problem you are trying to solve that much better.

What you want is basically exactly what you just described: an approximate prediction that has an arbitrarily high probability of being correct based on how much input data is fed in and what timeframes are considered.

Now that you have a little more insight into what your requirements really are, you can select a tool for this job. Not surprisingly, there are many fields in which modeling chaotic systems and trying to predict them is really important - weather prediction being the classic example. The tools you wind up using to try and model your simulation are going to depend heavily on the type of data you have available, the quantity of data available, and how much time you can afford to spend recalculating things as the data changes (i.e. as new "Facts" are discovered about your system).

You mentioned Markov chains as a potential solution. Here's why I think that will ultimately wind up just frustrating you: think about how many inputs your agents have for things they can think about in order to make a decision. You dropped the number 2000 per agent earlier, for instance. In order to accurately model a decision process with N inputs, you typically need something like a Markov chain of length N, give or take. If each of your agents needs a 2000-long chain, and each chain has to be constantly recalibrated based on decisions propagating through time... you've just made a ton of work for yourself.

Can you get decent results with, say, an order-20 chain instead? Probably. But there will always be those mysterious cases where Markov fails you, and you get incredibly inaccurate predictions. This is partly because of chaos, and partly because your modeling tool is inherently limited.
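(To make that blow-up concrete; this arithmetic is an illustration, not from the thread: an order-N chain over K possible actions needs a transition entry for each of the K^N possible histories times K next actions, roughly K^(N+1) numbers. With, say, 50 distinct advertisements, order 2 means 50^3 = 125,000 entries, while order 20 means 50^21, about 5 x 10^35; no amount of recalibration keeps that tractable.)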

It really all depends on how badly you want to try to predict stability (or the lack thereof) in your system, and how badly you will be hurt if you cannot accurately and reliably do so. Ironically, as I mentioned earlier, oftentimes the best way to know what state a system is in at a given time is to just simulate it up to that time, and look.

If your simulation is fast enough to handle the agent and input volume you need, then I'm honestly not sure why all this difficult stuff has to be thrown around in the first place. If it's too slow, on the other hand, have you exhausted the possibilities of other kinds of optimization?


I hadn't fully considered that for the advertisements. Jeez.

 

And yeah, I've considered a lot more optimizations that can be made. We're trying to redesign the scoring system to be a lot more parallel.

That alone won't give the full performance increase we want, though.

 

We're using the engine to create a political simulator (just as a tech demo, not a final game). The player tries to design new laws (through a flow graph system), raise taxes, start new cities, etc., to keep his citizens happy. We also figured out that for this to be enjoyable we would need a few hundred thousand entities, and perhaps anywhere from 50k - 60k advertisements. We're aiming for a 10 second delay between turns. Where our AI engine stands right now, our in-house testing rig can run through all those entities in about 98 seconds. We're hoping adding GPGPU will bring that time down to a more reasonable 50 seconds, but that's still too long.

 

We also noticed that close to 80% of the time (when the player is playing well) the system was just creating repetitive patterns, and was highly stable. When the player didn't do well, that number was closer to 50%.

 

We've tried various prebaking methods, and while they greatly reduce computation time, they don't give as enjoyable a result.

 

On a side note:

We have various other optimizations that were made. For instance, originally each entity held its own copy of the AI events relating to its nearest neighbors. Neighbor A would remember if neighbor B committed some crime, and would view the event in a different light based on its opinion of the player and of that particular neighbor. When we first implemented this, it was brutally slow, increasing computation time by a huge factor. After various optimizations and whatnot, it no longer affects performance (well... it adds a second or two).


I did some research into stability theory. I didn't realize how huge this particular field of research is; however, I still don't know how to approach this. There are countless different theorems and algorithms, all of which I imagine have varying complexities for a C++ implementation.


Your best bet is to design the simulation with rubber-banding constraints, so that the farther certain things get out of whack, the less likely they are to proceed further in that direction. That's how many large-scale decision systems work anyway.
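(A hedged C++ sketch of what such a rubber-banding constraint might look like: the further a simulation variable drifts from its baseline, the harder it is pulled back, with the correction capped per tick. The constants are placeholders, not values from any real engine.)

```cpp
#include <algorithm>

// Hypothetical rubber-band constraint: a proportional restoring pull toward a
// baseline, capped per tick, so runaway feedback loops are damped.

struct RubberBand {
    double baseline;   // value the system is pulled toward
    double stiffness;  // 0 = no pull, 1 = snap back immediately
    double maxStep;    // cap on how fast we correct per tick (must be >= 0)

    double Apply(double value) const
    {
        const double pull = stiffness * (baseline - value);      // proportional restoring force
        const double step = std::clamp(pull, -maxStep, maxStep); // keep corrections gentle
        return value + step;
    }
};
```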


That could work. I'll try that tomorrow.

 

On another note though, regarding stabilization: one of my colleagues took a look at a paper on Popov's stability criterion for fuzzy logic control systems, and followed it quite closely. We were hoping that the system would stabilize after 10,000 - 20,000 iterations; however, in actuality it was stable by the time it had computed the second advertisement.

 

Screenshots in case you're curious: http://i.imgur.com/sjLwsnK.png

 

After 3k iterations, the system no longer had any noise. The only collapse that happened was at iteration 52k-ish.

 

Edit: I didn't realize I posted this from my personal account lol


I made various other optimizations, to the point where calculating a score takes so little time that it isn't really worth it for me to implement this anymore.
