Algorithm for transforming equations

Started by
37 comments, last by GameDev.net 17 years, 9 months ago
Quote:Thanks a lot, if I could I would rate you up.
I'm glad I could help.

Be advised, these concepts are not simple, and understanding them takes sweat and hard work.

I'd be happy to walk you through writing the expression representation and the unification algorithm at least: it's fun and doing this sort of thing keeps me on my mental toes. So if you want to see how this sort of thing is done and don't mind using lisp or scheme, I'll post a step-by-step description of how to do this in lisp or scheme.

Just say the word.
Quote:Original post by Anonymous Poster
I'd be happy to walk you through writing the expression representation and the unification algorithm at least: it's fun and doing this sort of thing keeps me on my mental toes. So if you want to see how this sort of thing is done and don't mind using lisp or scheme, I'll post a step-by-step description of how to do this in lisp or scheme.

Just say the word.


If you want to help with a step-by-step description, I would really appreciate it. I don't mind you using lisp or scheme, or any other language for that matter; learning the syntax of a language doesn't take long, and the Lisp variants are often easy to learn.
Okay. :)

There are a few things we should be clear about before diving in:

1) That two mathematical expressions are theoretically equivalent doesn't mean they'll give the same answer when a computer calculates them, unfortunately. The whole field of numerical analysis is dedicated to finding ways to write an expression so that numerical error propagation is not catastrophic.

So this is a nice exercise, but to make it practically useful, error propagation must be taken into account. Good programmers usually write mathematical expressions with error propagation in mind, so rearranging them might not always be what the programmer wants.
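To make that concrete, here's a tiny sketch (in Python just because it's compact; it's not part of the system we're building): two algebraically identical formulas can behave completely differently in floating point.

```python
import math

x = 1e16
# Algebraically these two formulas are identical:
#   sqrt(x + 1) - sqrt(x)  ==  1 / (sqrt(x + 1) + sqrt(x))
# but in double precision x + 1 rounds back to x, so the
# subtraction cancels away every significant digit.
naive = math.sqrt(x + 1) - math.sqrt(x)
stable = 1.0 / (math.sqrt(x + 1) + math.sqrt(x))

print(naive)   # 0.0 -- all information lost to cancellation
print(stable)  # 5e-09 -- the correct answer
```

So a rewriting system that blindly swaps one form for the other would "simplify" the stable version into the useless one.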

2) If I tried to explain every concept we'd be here forever, so I'll provide links to concepts I think are explained better elsewhere.

3) I'll probably make mistakes, so try to find them. Give me as much feedback as possible for every post, and ask about what you don't understand; writing a monologue is not fun.

4) If real life intervenes, I might disappear for a few days at a time, depending on how busy I am. I'll try to get the first post with actual content done today before I head for the bars.

---------------------------------------------------------

The first part of the problem is: given an expression and some rewrite rules, figure out how to generate equivalent expressions using the rewrite rules, and be sure you're actually doing this correctly and not screwing up.

1) First we have to choose a representation for our mathematical expressions. Choosing a convenient representation is not as easy as one may think, and it must be done carefully and rigorously.

2) Next we have to choose a representation for our rewrite rules.

3) We have to think about what applying a rewrite rule to an expression *means*. We have to find an algorithm that applies a given rewrite rule to an expression in every possible way that makes sense. I've been thinking about this, and fortunately I think we can get away with a simple pattern matching algorithm: the full generality of unification is not needed, because expressions cannot contain pattern variables. But I'll leave this for much later.
Choosing a representation for our expressions, part 1:

First we have to decide WHAT we're actually trying to represent. We are talking about the expressions on the right-hand side of an assignment, like*:

3 + 6*y

in

x := 3 + 6*y

What do we need to bear in mind? Not the colon-equal sign, because it doesn't really mean equality here but assignment, and is not part of the actual expression which we want to rewrite.

We DO have to bear in mind the algebraic operators. Let's limit ourselves to *, +, /, and -. We can always add more later.

We have to have parentheses, like (4 + 5)*z.

We also have to bear in mind what those operators act upon: constants like 6 and variable identifiers like y.

So we know intuitively that the expressions we want to represent are like 4 + 5 or (x + y) / 5 . We have to formalize this intuition so that we can write code that manipulates these expressions as data structures.

---------------------------------------------------------------------------------------

So we have to first clarify and formalize our syntactic representation: only then can we start to code it.

We already know of one possible syntactic representation: the typical math notation we learn in school**. Indeed,

(x + y) / 5

or

4 - 9

are written in that form. We know that these expressions:

+3++/

or

)(-7**(6()

are nonsense, even though they use the symbols that are used in math notation. So we need a formal way to specify the rules in a given syntax, rules that tell us what the valid expressions are and what expressions are nonsense.

How do we formally specify and reason about syntax rules? Thank god that Backus and Naur already solved that problem for us, because it is thorny. We turn to Backus and Naur and ask them for help:

http://www.cs.man.ac.uk/~pjj/bnf/bnf.html

http://en.wikipedia.org/wiki/Backus-Naur_form

It is important that you understand this before we move on. Don't be ashamed to ask questions: again, these are difficult concepts. Invent a few BNF grammars and see how they work using pen and paper to get a feel for them.
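As a warm-up, here's one toy grammar of the kind you might invent on paper: it generates signed whole numbers such as 7, -42 and +105.

```
<digit>  ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
<number> ::= <digit> | <digit> <number>
<signed> ::= <number> | "+" <number> | "-" <number>
```

Deriving -42 by hand: <signed> → "-" <number> → "-" <digit> <number> → "-" "4" <digit> → "-" "4" "2". Try a few derivations like this yourself and you'll quickly get a feel for how the rules generate (and reject) strings.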




* I'm using := to represent assignment instead of = to distinguish assignment from actual equality.

**We will use another representation, because traditional infix algebra notation is (relatively) complicated to formalize as BNF rules. We'll use a BNF representation that is so simple you won't believe it at first.
Quote:Original post by Anonymous Poster
How do we formally specify and reason about syntax rules?

Ok, this doesn't seem so hard. You mention we won't be using the infix notation, but I tried to create a set of rules for it anyway:
digit ::= "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" | "0"
char ::= "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" | "y" | "z"
number ::= <digit> | <number> <digit>
string ::= <char> | <string> <char>
operator ::= "+" | "-" | "/" | "*"
constant ::= <number>
variable ::= <string>
operand ::= <constant> | <variable>
expression ::= <operand> |
    "(" <expression> ")" |
    <expression> <operator> <expression>
assignment ::= <variable> ":=" <expression>


I also tried to create a simple grammar for a Lisp like system (prefix), most is still the same only assignment and expression is changed and operand_expression_list is added.
digit ::= "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" | "0"
char ::= "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" | "y" | "z"
number ::= <digit> | <number> <digit>
string ::= <char> | <string> <char>
operator ::= "+" | "-" | "/" | "*"
constant ::= <number>
variable ::= <string>
operand ::= <constant> | <variable>
operand_expression_list ::=
    <operand> |
    <expression> |
    <operand_expression_list> " " <operand> |
    <operand_expression_list> " " <expression>
expression ::= <operand> | "(" <operator> " " <operand_expression_list> ")"
assignment ::= "(" "define" <variable> <expression> ")"


I believe both representations are about correct, I haven't verified it in any way though.

I have ignored whitespace since I can't see it doing anything but complicating the algorithm.

[Edited by - CTar on June 16, 2006 3:21:12 PM]
Excellent. I see you are a very quick learner. Quicker than me at least. I'll pick up the pace.

The disadvantage I see with infix notation is threefold:

1) Operator precedence rules must be codified into the BNF grammar if one wants a unique parse tree for every expression (a unique meaning). This is a pain because one is forced to mix syntax with semantics, and it makes the BNF more complicated.

Of course, if one is forced to use parentheses to mark operation precedence in infix notation, the readability advantage of prefix notation is negated.

2) The left-recursiveness of math notation must be worked around if one wants to implement a recursive descent parser. This is also a pain.

3) If we're using lisp and represent expressions in prefix notation, we get an evaluator for our expression representation for free!
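To spell out point 1 with the classic example: without precedence rules encoded in the grammar, 1 + 2 * 3 has two parse trees, and they denote different values. Writing out the two trees' arithmetic (a trivial Python sketch, just to make the ambiguity visible):

```python
# The string "1 + 2 * 3" has two parse trees under the naive
# infix grammar; each tree evaluates to a different number.
tree_a = (1 + 2) * 3  # parse tree ((1 + 2) * 3)
tree_b = 1 + (2 * 3)  # parse tree (1 + (2 * 3))

print(tree_a)  # 9
print(tree_b)  # 7
```

A grammar that admits both trees assigns the string no unique meaning, which is exactly why precedence has to be baked into the BNF for infix notation.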

Quote:
I believe both representations are about correct, I haven't verified it in any way though.
I'll take your word for it. :)

I see you are already familiar with lisp or scheme. That's great.

I realize that you wrote the BNF lisp forms there for a quick example's sake. But for anyone else reading this, you should be aware that define in scheme is a binding construct, not an assignment operator. For the language lawyers, this means that define binds a name to a value cell while set! (the assignment operator) mutates the value in a bound name's value cell. That's why you can get unbound-name errors using set!.

Also, this means that define's binding only extends within its enclosing lexical scope while set! actually modifies the value in the value cell bound to the name, which does have important consequences. For example:

(define *x* 5)

(define (test-binding)
  (define *x* 3)
  *x*)

(define (test-assignment)
  (begin
    (set! *x* 3)
    *x*))

;here's the difference:
> *global-var*
5
> (test-binding)
3
> *global-var*
5
> (test-assignment)
3
> *global-var*
3


So if any scheme newbie is reading this, remember that define is a binding construct and not an assignment operator!

Anyway, have a fun friday night! I'll proceed tomorrow or sunday, depending. :)
This is really interesting. I didn't know anything about lisp/scheme but they look useful.

But I'm not completely sure what you're working towards - Basically, a computer algebra system, right? But in this case you're taking into account the "cost" of each computation?
Oh the irony. I posted to unconfuse possible scheme newbies and probably confused them all the more.

I changed the global variable name in my example from *x* to *global-var* for clarity, but forgot to change it in the function definitions.

I am terribly sorry, the example should read:

(define *global-var* 5)

(define (test-binding)
  (define *global-var* 3)
  *global-var*)

(define (test-assignment)
  (begin
    (set! *global-var* 3)
    *global-var*))

;here's the difference:
> *global-var*
5
> (test-binding)
3
> *global-var*
5
> (test-assignment)
3
> *global-var*
3


Quote:
But I'm not completely sure what you're working towards - Basically, a computer algebra system, right? But in this case you're taking into account the "cost" of each computation?
Well, it's basically a computer algebra system where each expression has an associated cost, so that the computer can automatically transform expressions into equivalent, lower cost ones.
So, one way to formalize our syntax is to write down the typical math syntax in BNF, but this is complicated: we'd have to introduce artificial terms in the BNF syntax to make each expression have one unique parse tree (one unique meaning). This is a headache, and conflates two issues: we'd be mixing the syntax specification with the semantics. And we would have to find a BNF syntax that is not left recursive.

So, why don't we follow the example of Lisp and use parenthesised prefix notation? We lose no generality (any expression that can be written in typical math syntax can be written in prefix syntax) and the BNF specification becomes laughably simple, letting us concentrate on the problem at hand. We can always write a prefix -> infix and infix -> prefix parser later if we need it.

We have the added advantage of writing our expression representations in a language that lisp already understands, so if we use lisp as an implementation language we wouldn't have to implement our own evaluator for our expression representation: we can just use lisp to do it for us!
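To appreciate how little work an evaluator for this representation needs, here is a rough sketch of the same idea in Python, with nested lists standing in for Scheme's prefix lists (the names here are mine, not part of the thread's code; in Scheme itself we'd simply hand the expression to eval):

```python
import operator

# Map operator symbols to functions; nested lists play the role of
# Scheme's prefix lists, e.g. ['+', 3, ['*', 6, 'y']].
OPS = {'+': operator.add, '-': operator.sub,
       '*': operator.mul, '/': operator.truediv}

def evaluate(expr, env):
    if isinstance(expr, (int, float)):   # constant: its own value
        return expr
    if isinstance(expr, str):            # variable: look it up
        return env[expr]
    op, left, right = expr               # (operator subexpr1 subexpr2)
    return OPS[op](evaluate(left, env), evaluate(right, env))

print(evaluate(['+', 3, ['*', 6, 'y']], {'y': 2}))  # 15
```

A dozen lines in Python; zero lines in lisp, because the representation is already lisp.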

So here's a simple BNF grammar for expressions using *, +, -, / in prefix notation :

<E> ::= <V> | <K> | <mul-expr> | <sum-expr> | <div-expr> | <sub-expr>
<mul-expr> ::= (* <E> <E>)
<sum-expr> ::= (+ <E> <E>)
<div-expr> ::= (/ <E> <E>)
<sub-expr> ::= (- <E> <E>)
<V> ::= x | y | z | t | s ...			(variable names)
<K> ::= 0 | 1 | 2 | 3 ...			(constants)


This says that the terminals in our grammar are either variables or constants, and that an expression is either a terminal or a multiplication, addition, division or subtraction of two expressions. It's also consistent with scheme syntax for the aforementioned operations.

I will use pattern matching in function definitions next, so read up on it if you're not familiar with it. I've decided to use Dr. Scheme (PLT mzscheme) as the implementation language.

http://en.wikipedia.org/wiki/Pattern_matching

http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-29.html#%_idx_5176

http://download.plt-scheme.org/doc/103p1/pdf/match.pdf

http://www.codecomments.com/archive282-2004-8-252517.html

Unlike unification, it is a very straightforward concept. If you've ever used Haskell you know about it already.

For now we'll use it only for convenience to avoid tiresome nested if statements; later it will be an integral part of our expression rewriting algorithm, and we'll maybe write a pattern matcher ourselves to clarify the concept. It's simple.
Now, to code! The first thing we have to do is translate the BNF terms into Scheme data types, so we can actually manipulate them in scheme. We'll choose a Scheme representation for each BNF term, and we'll make a predicate function for each term that tells us whether a given scheme data object is an instance of that term type.

You will notice that the code used in each predicate is basically the BNF syntax. A direct translation.

We will include the pattern matching module in PLT mzscheme because it makes writing the operator predicates very straightforward:

(require (lib "match.ss"))


We will represent terminals of type K (numerical constants) using Scheme's number types: so any number Scheme recognizes is a valid K terminal. So the built-in predicate number? is all we need to tell if a term is a constant in our grammar:

(define constant? number?)


We will represent terminals of type V (variable identifiers) as symbols belonging to a set of predefined variable symbols. We'll store this set in the global variable *variable-names*:

(define *variable-names*
  '(x y z s t q r))

(define (variable? var)
  (if (member var *variable-names*)
      #t
      #f))


Now, the meat of the issue is the nonterminals of type mul-expr, sum-expr, etc. Just as the BNF definition of a given operator expression is recursive (operator expressions are defined in terms of expressions), the predicate for identifying them will be recursive. Let's make predicates for the *, /, + and - cases.

We represent these cases as lists of the form

(<operator> <E> <E>)

This common structure for these binary operators means we can abstract a function that makes predicates for a given binary operator. The function make-binary-operator-pred is a function that returns functions. We use that predicate-generating function to make predicates for the operators we'll use. This is the only nontrivial code written so far; the third link in my previous post is the documentation for the mzscheme pattern matcher, but check the fourth link for a simpler explanation:

(define (make-binary-operator-pred op-symbol)
  (lambda (expr)
    (match expr
      ((operator subexpr1 subexpr2)
       (and (eq? operator op-symbol)
            (expression? subexpr1)
            (expression? subexpr2)))
      (anything-else #f))))

(define mul-expr? (make-binary-operator-pred '*))
(define div-expr? (make-binary-operator-pred '/))
(define sum-expr? (make-binary-operator-pred '+))
(define sub-expr? (make-binary-operator-pred '-))


The predicate returned by make-binary-operator-pred is a function that tries to match its argument expr to the (operator subexpr1 subexpr2) pattern. If it matches, it ensures that operator is indeed the symbol that represents the operator the predicate recognizes (stored in op-symbol), and also ensures that subexpr1 and subexpr2 are valid subexpressions, by executing

(and (eq? operator op-symbol) (expression? subexpr1) (expression? subexpr2)).


anything-else is guaranteed to match anything at all. It works kind of like the else clause in a cond; in this case, I use it to mean that any pattern that is not matched by (operator subexpr1 subexpr2) is not a binary operation so the predicate should return false.

Now we can trivially define the expression? predicate, which looks exactly like the BNF definition for <E>:

(define (expression? expr)
  (or (constant? expr)
      (variable? expr)
      (mul-expr? expr)
      (div-expr? expr)
      (sum-expr? expr)
      (sub-expr? expr)))


Notice that, given the BNF definition, we directly translated each nonterminal into scheme code.

These predicates allow us to validate an expression's syntactic correctness. They will also allow us to dispatch control on expression type, which is what we need in order to manipulate symbolic expressions.
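As a taste of what dispatching on expression type buys us, here's a hypothetical sketch in Python (nested lists for expressions, names my own) that branches on the same three term types to collect an expression's variables:

```python
def collect_variables(expr, found=None):
    # Dispatch on expression type, mirroring the predicate-based
    # branching: constant, variable, or binary operator expression.
    if found is None:
        found = set()
    if isinstance(expr, (int, float)):   # constant: nothing to collect
        pass
    elif isinstance(expr, str):          # variable identifier
        found.add(expr)
    else:                                # (op subexpr1 subexpr2): recurse
        _op, left, right = expr
        collect_variables(left, found)
        collect_variables(right, found)
    return found

print(sorted(collect_variables(['+', 'y', ['-', 4, 'x']])))  # ['x', 'y']
```

Every manipulation we write later (rewriting, simplifying, costing) will have this same shape: inspect the term type, then act on it.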

Let's test it:

> (expression? '(+ 3 (- 4 5)))
#t
> (expression? 5)
#t
> (expression? '(y))
#f
> (expression? '(+ y (- 4 5)))
#t
> (expression? '(+ e (- 4 5)))
#f
> (expression? '(b y (- 4 5)))
#f
