Sign in to follow this  

Probabilistic algorithms to approximate function result?

This topic is 3595 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Months ago I read a Game Programming Gem about the Bloom filter, and how it could be used to approximate a function result. I'm writing a Virtual Machine for my new computer language, and I have been thinking since then how I could integrate that kind of probabilistic algorithm to the machine to speed up processing. What algorithms of this kind exists, and what efforts have been done to implement them in computers and computer languages?

Share this post


Link to post
Share on other sites
Quote:
Original post by cronodragon

What algorithms of this kind exists, and what efforts have been done to implement them in computers and computer languages?


None. Programming languages need to be deterministic.

If you want to approximate a square root function, then any attempt at generic approximation will be slower than what you have. Only custom derived function will yield acceptable results under certain circumstances.

For example, how will you approximate "SELECT * FROM foo WHERE bar;"? Or a function that counts returns number of nodes in a tree? Or better yet, how will you simulate the side-effects of a quicksort partitioning algorithm?

Share this post


Link to post
Share on other sites
Quote:
Original post by Antheus
None. Programming languages need to be deterministic.


But how could approximating a function result make a programming language non-deterministic? I think a programming language could bring a mechanism in which a "guessing" function can override a calculation function. If the guessing function returns "undefined result", then the actual calculation is performed. That's perfectly deterministic. And there are also degrees of approximation an algorithm can work with.

Another question, what about using training algorithms to guess function results?

Share this post


Link to post
Share on other sites
Quote:
Original post by cronodragon
Another question, what about using training algorithms to guess function results?
Assuming that all a function does is return the next number from from algorithmic sequence then one can write a genetic algorithm that guesses the next number by evolving virtual programs that are capable of producing the exact sequence being fed into it (hence being able to produce the next number in the sequence). I have done this myself before and it was able to learn any simple sequence that you could program yourself using a limted set of operations, and a rather limted number of total operations.
There's always the possibility that it will derive the wrong sequence and give a wrong answer though.
None of this is of any use in the area of optimisation, however.

Share this post


Link to post
Share on other sites
Quote:
Original post by cronodragon

Another question, what about using training algorithms to guess function results?


There's a time-space constraint on such optimizations.

You can use trivial cache and apply dynamic programming to the problem. It would yield deterministic and accurate results, but at expense of memory.

Then there's information constraint. This is where a conceptual optimization were possible. For example, C functions return int error codes, but in reality emit only 4 distinct values. As such, the state could be uniquely represented with 2 bits. But there's no optimization to be gained here, especially not when talking about run-time performance.

But then we have unbounded functions with side-effects (see the SQL query above). An INSERT query returns number of rows modified. It's trivial to determine the result at compile-time simply from query string. But optimizing the actual query function call would mean that the database never gets updated.

So it's not a problem of optimizing functions (fixtures, dynamic programming or constants can be trivially used) that have no side-effects.

But there's more. Side-effects may not be obvious. Cache trashing is big problem with table lookup. Any table over 64-512 bytes will trash your cache. This means that only small amount of information can be held by these estimators.


There is however a separate domain of expert languages. These rely on neural networks, fuzzy logic, or other heuristic approaches. But these are used not for performance (they can take hours, days, even weeks to run), but due to lack of deterministic function - so they cannot be programmed in imperative manner.

Notable applications of such applications include smart car systems, such as accelerator and break systems. These involve response function to f(time) = force. And those take months to develop, calibrate and test.

So no, run-time performance optimizations are still domain of graph theory and formal proofs. Heuristic approach is used, but for different purposes.



Share this post


Link to post
Share on other sites
Ok, maybe under the plain processing flow used by common programming languages, approximated-solution functions don't have a bright future with optimization. But what about mixing them with parallel processing?

For example, let's say we want to optimize the algorithm of finding the highest number inside an array. We could make it in a way the function is non-blocking. When the function finds the first highest number, it could signal the calling thread to use that approximation. If the function finds another highest number, it could signal again, and the calling thread would start again with a new value. If no more higher numbers are found, the calling thread would have advanced on it's calculation, saving the time spent by the function while trying to find the next solution.

Of course that would require a reversible processing frame (like an undo cache), and executing only reversible (undo-able) instructions. Non-reversible instructions would block the thread, so errors are not propagated outside of the frame.

I don't think every calculation can be optimized this way, but this could become another tool-feature for a programming language.

Share this post


Link to post
Share on other sites
Quote:
Original post by cronodragon

For example, let's say we want to optimize the algorithm of finding the highest number inside an array.


If this array is unsorted, then the complexity is constant O(n). Without any additional information, one needs to test every single value. This is the fastest possible way. Highest value could be the first, or last, or any in between.

There's a way to split this between multiple processes and achieve linear speed-up. SIMD might also be applied.


Quote:
We could make it in a way the function is non-blocking. When the function finds the first highest number, it could signal the calling thread to use that approximation. If the function finds another highest number, it could signal again, and the calling thread would start again with a new value. If no more higher numbers are found, the calling thread would have advanced on it's calculation, saving the time spent by the function while trying to find the next solution.


Task: Find richest person on earth. Send them an invitation to your party.

You start searching through all bank accounts for highest value. 3% through the list, your algorithm returns "richest person", and sends them an invitation. But, your algorithm then restarts with another richest person. And again. And again.

So now you have a party to which 20,000 people show up.

If you have n cores, the fastest way to find largest number in unsorted array is to split the array into n pieces, have each core find local max, then find maximum of all the results.

If n=1, searching through entire array will be fastest. Anything else, let alone threads, will drastically slow you down.

Share this post


Link to post
Share on other sites

This topic is 3595 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this