Map, Reduce and co.

Started by
4 comments, last by SamLowry 16 years, 11 months ago
This thread will explain common higher-order functions such as map and reduce (also known as fold) as will it provide some exercises. map map takes a function and a list, and applies the function on every element of the list. The results are put in a new list, with the order preserved:

> (map sqr '(1 2 3 4 5))
(1 4 9 16 25)
I would suggest implementing this function yourself. fold This is a tricky one, but I find myself using it quite often. One way to see it, is that it applies associativity. In short (for long explanation, click link): The addition operation + is associative because 1+2+3=1+(2+3) = (1+2)+3. The power operation, ^, is not associative: 2^(3^4) != (2^3)^4 because 2417851639229258349412352 != 4096. So, when confronted with 2^3^4, you need a convention: you either evaluate it as (2^3)^4, known as left-associativity, or you go for 2^(3^4), right associativity. Given a binary function (= takes two arguments) and a list of arguments, fold-left will apply left associativity, and fold-right will apply right associativity:

(fold-left + '(1 2 3 4 5))
computes
(((1 + 2) + 3) + 4) + 5

(fold-right + '(1 2 3 4 5))
computes
1 + (2 + (3 + (4 + 5)))
In this example, both expressions will yield the same end result, as + is associative meaning that grouping from the left or the right does not matter. What happens though when you give one of the folds an empty list? This problem is solved by providing an extra argument, the "neutral element". One could say this neutral element is returned in the case of an empty list, but that wouldn't be fully correct. What really happens is that this neutral element will be added implicitly to the list:

(fold-left + 0 '(1 2 3 4 5))
==
((((0 + 1) + 2) + 3) + 4) + 5

(fold-right + '(1 2 3 4 5) 0)
==
1 + (2 + (3 + (4 + (5 + 0))))
Notice the position of the neutral element in the list: fold-left puts in on the left, fold-right on the right. Simply put, they each put it so that it will be part of the first subexpression evaluated. Another way of interpreting fold is to see it as an automatic accumulator. This view is easier to understand when your binary function does not take 2 arguments of the same type or if some processing goes on inside it. Instead of seeing it as applying associativity, you see it as having a list of values which you want to process them one by one, and each of them will add a small bit to the end result. As an example, let's write a binary-number-parser. You give it a list of binary digits, and it will return the number it represents:

> (binary->value '(1 1 0 0 1 0 1 1))
203
The algorithm I'll implement will process each digit in turn. Imagine you have parsed the binary number 110, and you know it to represent 6. But then you discover there's still a digit following, a 1. What do you do? Since the original 110 shifts one place to the left, you know its value is worth double your original thoughts: the 110 part is in fact 6*2=12, and you have to add 1 to that (the newest digit), so the correct value is 13. But then someone tells you there's yet another digit coming... If you repeat this system for each digit in the list, you'll eventually get the right result. Or for the more formally inclined: abcde = abcd * 2 + e So, how do we write this in Scheme?

(define (binary->value lst)
  (fold-left (lambda (x y) (+ (* 2 x) y))
             0
             lst))
If made a bit more verbose:

(define (binary->value lst)
  (fold-left (lambda (x y)
               (let ((result (+ (* 2 x) y)))
                 (display (format "Accumulated ~A, received new digit ~A ==> ~A * 2 + ~A = ~A~%"
                                  x
                                  y
                                  x
                                  y
                                  result))
                 result))
               0
               lst))

> (binary->value '(1 1 0 1))
Accumulated 0, received new digit 1 ==> 0 * 2 + 1 = 1
Accumulated 1, received new digit 1 ==> 1 * 2 + 1 = 3
Accumulated 3, received new digit 0 ==> 3 * 2 + 0 = 6
Accumulated 6, received new digit 1 ==> 6 * 2 + 1 = 13
The best way to understand fold is to make some exercises, which will soon be coming. [Edited by - SamLowry on May 9, 2007 5:35:46 PM]
Advertisement
Some exercises. These assume map is provided by your version of Scheme (otherwise you'll have to implement it yourself, see ex. 1 of the next section).

1. Implement a tree-map which works like map, but on trees instead of lists:
> (tree-map square '((1 2) (3 (4 5))))(tree-map square '((1 4) (9 (16 25))))


2. Implement (filter pred lst) which given a predicate (unary function returning true or false) and a list, will return a new list of all items for which the predicate returns true. Order must be maintained:
> (filter odd? '(1 2 3 4 5))(1 3 5)


3. Implement this filter function using map.
Hint:
(apply func '(x y z)) ==== (func x y z)

Second hint: append

4. Write an (index lst) function using map:
> (index '(a b c))((0 . a) (1 . b) (2 . c))

Note: this is in my opinion a very ugly way of solving this problem.

5. Write an (unordered-pairs lst) function:
> (unordered-pairs '(1 2 3 4))((3 . 4) (2 . 3) (2 . 4) (1 . 2) (1 . 3) (1 . 4))

Order is not important.

6. Write a (combinations lst) function which returns a list of all possible subsets (= list where order is not important, (1 2) = (2 1)):
> (combinations '(1 2 3))(() (3) (2) (2 3) (1) (1 3) (1 2) (1 2 3))

Order is not important, so
> (combinations '(1 2 3))(() (3) (2) (3 2) (3 1) (2 1) (2 1 3) (1))

is also acceptable.


; 1(define (tree-map func x)  (if (list? x)      (map (lambda (x) (tree-map func x)) x)      (func x))); 2(define (filter pred lst)  (cond ((null? lst)         ())        ((pred (car lst))         (cons (car lst)               (filter pred (cdr lst))))        (else         (filter pred (cdr lst))))); 3(define (filter pred lst)  (apply append         (map (lambda (x)                (if (pred x)                    (list x)                    ()))              lst))); 4(define (index lst)  (map (let ((counter -1))         (lambda (x)           (set! counter (+ 1 counter))           (cons counter x)))       lst)); 5(define (unordered-pairs lst)  (if (null? lst)      ()      (append (unordered-pairs (cdr lst))              (map (lambda (x)                     (cons (car lst) x))                   (cdr lst))))); 6(define (combinations lst)  (if (null? lst)      '(())      (let ((combs (combinations (cdr lst))))        (append combs                (map (lambda (x)                       (cons (car lst) x))                     combs)))))




Solve these without the help of any library functions.

1. Implement a (map func lst) function.

2. Implement a (fold-left func init lst) function.

3. Implement a (fold-right func lst init) function.

4. Implement map in terms of fold-left.

5. Implement map in terms of fold-right.

6. Implement (filter pred lst) using one of the folds.

7. Implement (partition pred lst) which separates elements into two lists depending on whether pred returns true or false. Use one of the folds.
> (car (partition odd? '(1 2 3 4 5)))(1 3 5)> (cdr (partition odd? '(1 2 3 4 5)))(2 4)


8. Implement (literal->value digits radix) which works on strings and converts a literal in any radix (up to 36, digits "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ") to a numeric value. No error checking necessary.
Hint: string->list, char->integer, char<?.
> (literal->value "1001" 2)9> (literal->value "F000" 16)61440


9. Implement (run-length-encode lst) (use fold) which groups equal consecutive elements with the sequence length into cons-cells as follows:
> (run-length-encode '(a a a b c c))((a . 3) (b . 1) (c . 2))


10. Implement (run-length-decode lst) which does the inverse of run-length-encode:
> (run-length-decode (run-length-encode '(a a a b c c)))(a a a b c c)


; 1(define (map func lst)  (if (null? lst)      ()      (cons (func (car lst))            (map func (cdr lst))))); 2(define (fold-left func init lst)  (if (null? lst)      init      (fold-left func                 (func init (car lst))                 (cdr lst)))); 3(define (fold-right func lst init)  (define (fold-right lst init)    (if (null? lst)        init        (fold-right (cdr lst)                    (func (car lst) init))))  (fold-right (reverse lst) init)); 4(define (map func lst)  (fold-left (lambda (acc x)               (append acc                       (list (func x))))             ()             lst)); 5; Note: the order in which the elements are computed is reversed(define (map func lst)  (fold-right (lambda (x acc)                (cons (func x) acc))              lst              ())); 6(define (filter pred lst)  (reverse (fold-left (lambda (acc x)                        (if (pred x)                            (cons x acc)                            acc))                      ()                      lst))); 7(define (partition pred lst)  (fold-right (lambda (x p)                (let ((yes (car p))                      (no (cdr p)))                  (if (pred x)                      (cons (cons x yes)                            no)                      (cons yes                            (cons x no)))))              lst              '(() . ()))); 8(define (literal->value literal radix)  (fold-left (lambda (total x)               (+ (* total radix) x))             0             (map (lambda (char)                                        (if (char<? char #\A)                        (- (char->integer char) (char->integer #\0))                        (+ (char->integer char) 10 (- (char->integer #\A)))))                  (string->list literal)))); 9(define (run-length-encode lst)  (reverse (fold-left (lambda (acc x)                        (if (or (null? acc)                                (not (eq? (caar acc) x)))                            (cons (cons x 1) acc)                            (cons (cons x (+ 1 (cdar acc)))                                  (cdr acc))))                      ()                      lst))); 10(define (run-length-decode lst)  (define (repeat x n)    (if (= n 0)        ()        (cons x (repeat x (- n 1)))))  (fold-left (lambda (acc x)               (append acc                       (repeat (car x)                               (cdr x))))             ()             lst))


[Edited by - SamLowry on May 9, 2007 7:14:29 PM]
My answers to 1-4, but not 3. I'm still thinking about problem 3, though, so give me some more time for that. Thanks a bunch for these exercises!
;Exercise 1(define (tree-map func tree)  (define (iter tree)        (if (list? tree)        (map iter tree)        (func tree)))  (iter tree));Exercise 2(define (filter pred list)  (if (null? list)      ()      (if (pred (car list))          (cons (car list) (filter pred (cdr list)))          (filter pred (cdr list)))));Exercise 3;I can't figure this one out!;Exercise 4;I had to 'cheat' here and used set!;I'm thinking that there's a better way?(define (index list)  (let ((a -1))    (define (index-item item)      (set! a (+ 1 a))      (cons a item))    (map index-item list)))


Edit: Oh! Just noticed that you have answers available. Still, I'd like some feedback on where we differ.
Regardless of whether solutions are available, it is always interesting to see other people's answers. I'm not perfect, it could very well be that there's a bug in my code, or that you'd find a more elegant solution, etc.

Exercise 1
There's not much difference except for the fact you declare a local function which can take one argument less. I'd say this has an impact on performance, but I wouldn't know if it would be in a positive or negative way.

I only use local helper functions if they're needed, e.g. when arguments need to be preprocessed, or if the return value needs to be postprocessed (such as a reversal).

Note that my solution would be nicer if there were currying: I would have been able to write (map (tree-map func) x) instead of (map (lambda (x) (tree-map func x)) x).


Exercise 2
No notable difference, exception for your nested if instead of my cond.

Exercise 3
Hint: you need some postprocessing:
(define (filter pred lst)  (postprocess (map some-func lst)))


Exercise 4
Using set! is what I had in mind. I personally think it's a horrible solution, but it shows an implicit dependency: you assume map processes the elements for left to right. This is not an issue when working in a functional way, but state can mess it all up.

Another way of solving it would be to use a zip function:
; Ignoring tail-recursiveness(define (range a b)  (if (< a b)      (cons a (range (+ 1 a) b))      ()))(define (zip xs ys)  (if (or (null? xs)          (null? ys))      ()      (cons (cons (car xs)                  (car ys))            (zip (cdr xs)                 (cdr ys)))))(define (index lst)  (zip (range 0 (length lst)) lst))> (zip '(1 2 3) '(a b c))((1 . a) (2 . b) (3 . c))> (index '(a b c))((0 . a) (1 . b) (2 . c))
(define (tree-map func tree)  (if (list? tree)      (map (lambda (x) (tree-map func x)) tree)      (func tree))); Straight-forward, low-level filter(define (filter predicate list)  (if (null? list)      ()      (let ((head (car list)) (tail (cdr list)))        (if (predicate head)            (cons head (filter predicate tail))            (filter predicate tail))))); First map attempt--fringe to the rescue!(define (fringe tree)  (if (list? tree)      (apply append (map fringe tree))      (list tree)))(define (filter-map predicate items)  (fringe (map (lambda (x) (if (predicate x) (list x) ())) items))); After finally getting what you meant by the apply hint (and seeing your implementation of fringe above :P)(define (filter-map-duh predicate items)  (apply append (map (lambda (x) (if (predicate x) (list x) ())) items))); No idea how to make a counter inside the lambda (or make one outside and increment it in the lambda); Haven't reached the part about side-effects in SICP yet :P(define (index items)  (map (lambda (x) (cons 1 x)) items))


If it's not too much to ask, would you mark the exercises requiring the use of set! (or whatever is used to implement C-like variable assignment)? I haven't reached that part yet in SICP.

Quote:Original post by Muhammad Haggag
*** Source Snippet Removed ***

If it's not too much to ask, would you mark the exercises requiring the use of set! (or whatever is used to implement C-like variable assignment)? I haven't reached that part yet in SICP.


No comment on your solutions, they all look fine, except for index which indeed needs state if you wish to implement it in terms of a single map. It is of course possible to write a purely functional index function, as shown in a previous post of mine in this thread.

None of the other exercises need state. Actually, the exercises are more about forcing you to solve it in in purely functional way (except for index). I hereby forbid you to use state :).

This topic is closed to new replies.

Advertisement