# Unity Algorithm for transforming equations

This topic is 4411 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Recommended Posts

I want to be able to transform equations so that they are as computationally cheap as possible. So given a set of rules (a*n + a = a*(n+1), a*n + a*m = a*(n+m), etc.), a set of operators and their computational cost, and an equation. The algorithm should be able to transform the equation using the rules to the cheapest possible equation with the same semantics (assuming all rules are correct). For a simple example imagine the following
R = { a*b=b*a,  a*b + a*c = a*(b+c) }
C = {a+b = 3,  a*b = 7}
E = a*3 + 2*b

Of course R is the set of rules, C is the set of costs and E is the equation. Here the algorithm would somehow find out that the best representation would be:
E = a * (3+2)

Because the cost would be: 7+3 = 10, where the cost of the original equation would be: 7+3+7 = 17. So does anyone know of an algorithm capable of doing this? Brute force is not an option, and I would like the algorithm to stay at O(n²) or lower, but if there exist no such thing I can do with algorithms with worse complexity. If I can reduce the complexity by using an algorithm which didn’t always produced perfect results that would also be good. I don’t need the data to be in any kind of data structure, so I don’t care if the algorithm wants the rules in a black-red tree with the costs being the keys, or if the algorithm wanted a circular linked list. EDIT: I can see this might seem like school work, but it's not. In a recent thread I saw some quite surprising assembly from a modern compiler, and as Jan Wassenberg points out, the assembly is actually quite bad. Therefore I chose to do some research in this kind of stuff, just to see where the problems really are.

##### Share on other sites
Dynamic programming will probably help keeping the time complexity acceptable. Basically, the idea is to cache (or memoize) intermediate results, so that overlapping subproblems will only get calculated once.

##### Share on other sites
Quote:
 Original post by SharlinDynamic programming will probably help keeping the time complexity acceptable. Basically, the idea is to cache (or memoize) intermediate results, so that overlapping subproblems will only get calculated once.

Thanks, I believe that can cut down the processing time quite a lot. I doubt it will change the actual complexity though.

##### Share on other sites
This is not a simple problem. Depending on your rewrite rules and initial expression, in this case:

R = { a*b=b*a, a*b + a*c = a*(b+c) }
E = a*3 + 2*b

it is entirely possible that the set of expressions that can be generated using the rewrite rules and are equivalent to your initial expression is infinite. That is, an algorithm enumerating them would not halt, so no optimum algorithm exists to solve the general case.

This means you'd have to use heuristics to search some finite subset for the best solutions.

Symbolic algebra systems like Macsyma and Maple do the same stuff, only they have heuristics to find the most convenient representation (for example, 3*x is more convenient than x + x + x).

Are you familiar with the concept of syntactic unification?

##### Share on other sites
Quote:
 Original post by Anonymous PosterDepending on your rewrite rules and initial expression, in this case:R = { a*b=b*a, a*b + a*c = a*(b+c) }E = a*3 + 2*bit is entirely possible that the set of expressions that can be generated using the rewrite rules and are equivalent to your initial expression is infinite. That is, an algorithm enumerating them would not halt, so no optimum algorithm exists to solve the general case.

I thought the set of costs could help to construct a finite number of combinations of rules, where these combinations would be the only way to decrease the cost. Well honestly I don't have enough understanding of the problem, so I'll just choose to believe that the general case is impossible to solve for now.

Quote:
 This means you'd have to use heuristics to search some finite subset for the best solutions.Symbolic algebra systems like Macsyma and Maple do the same stuff, only they have heuristics to find the most convenient representation (for example, 3*x is more convenient than x + x + x).

I though it might be necessary to do something like this.

Quote:
 Are you familiar with the concept of syntactic unification?

No, but I'm willing to read about it. To be honest I wouldn't mind reading 10 books to get a better understanding of the problem, it would just mean I had learned more in the end. So, I'll go read about syntactic unification now. I would really appreciate links to some articles or books which might give me some additional information about the problem (even if they assume knowledge of syntactic unification).

##### Share on other sites
Quote:
 Original post by Anonymous Posterit is entirely possible that the set of expressions that can be generated using the rewrite rules and are equivalent to your initial expression is infinite. That is, an algorithm enumerating them would not halt, so no optimum algorithm exists to solve the general case.

Indeed. This is an interesting problem. You can't straightforwardly prune branches that yield expressions more costly than the original, because they might *eventually* lead to cheaper solutions. But you *must* somehow cap the cost, otherwise the search will happily generate expressions such as original + 0 + 0 + 0 + 0 + 0 + 0 + 0, ad infinitum.

It also occurred to me that because the search space is an arbitrary graph, you *have* to keep track of opened nodes, lest you end up with an infinite loop. So memoization is pretty much a requirement, not just an optimization.

##### Share on other sites
Quote:
 I thought the set of costs could help to construct a finite number of combinations of rules, where these combinations would be the only way to decrease the cost.
Heh, that's a standard optimization called hill-climbing, only it requires that you can always go in the direction of decreasing cost. In other words, your cost function must be monotonic, without local minima.

Unfortunately, this problem is too complicated: you can't devise an algorithm that can guarantee to find the best expression in this way.

Quote:
 No, but I'm willing to read about it. To be honest I wouldn't mind reading 10 books to get a better understanding of the problem, it would just mean I had learned more in the end.

An excellent (and hands-on!) introduction is found in SICP:

http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-29.html#%_sec_4.4

Unification is a generalisation of pattern matching. It's hard to explain simply without getting into details, but it's what you intuitively do when noticing that, given these two equations:

x + y = z
x + 2 = z

you can deduce (by looking at the syntax only!) that y = 2 is consistent with both equations.

If you don't know scheme, it might be worth it to learn it for tackling this problem, because scheme is:

1) Laughably easy syntax; you can pick it up in a day.

2) Very well suited for representing and manipulating symbolic expressions.

You are likely to be scared off if you just go link-hunting: unification can get exceedingly technical in a rigorous mathematical formalization.

##### Share on other sites
This would be an interesting problem to apply Simulated Annealing or Genetic Algorithms to. Both have the ability to jump out of local minima while looking for a global minimum.

Unfortunately it wont give you a guaranteed compelxity, or even any guarantees at all, but it would be interesting.

- Phil

##### Share on other sites
Here's the way you might start attacking the problem:

1) Write a unification algorithm for your expression syntax. This is pretty simple if you've got a convenient data representation for your expressions; I'd use scheme or Common Lisp, because I've written unifiers in them and know it is simple from experience. Symbolic manipulation is one of lisps' strengths.

2) Your rewrite rules are transitions from one expression to another. That is, a rule looks like:

E0 -> E1

for example

x*y + 2 -> y*x + 1 + 1

or

0*x -> 0

where E0 is an expression containing variables and constants and E1 is another expression containing the same variables (or not, but certainly no NEW variables) that is equivalent to E0. Let's call E0 the head of the rule, and E1 the result.

So given an expression E, if E can be unified with a rule's head, it is equivalent to that rule's tail with the variables adequately substituted. (There may be more than one way to unify two expressions).

So you get a set of expressions that are reachable in one step from your original E expression and your set R of rewrite rules.

Of course, you can apply your rewrite rules to this new set of expressions too, yielding more equivalent expressions.

This yields a (potentially infinite) directed graph of the expressions that are syntactically derivable using your rewrite rules.

Now comes the hard part: how to search that potentially infinite expression graph for good expressions? You can apply genetic algorithms or simulated annealing as the poster above mentioned, A* with a heuristic of your own, simple breadth first search, symmetry pruning...

##### Share on other sites
Thanks a lot, if I could I would rate you up.

1. 1
Rutin
26
2. 2
3. 3
JoeJ
20
4. 4
5. 5

• 10
• 10
• 9
• 9
• 10
• ### Forum Statistics

• Total Topics
631751
• Total Posts
3002087
×