Monday, March 29, 2021

Functions: Recursion

A common approach to defining functions is to specify the value that the function takes on in one or a small number of cases, and then to specify larger values in terms of smaller values. Definitions like this are called recursive definitions, and the functions so defined are called recursive functions.

Here's an extremely artificial and obtuse example of all this: if you wanted to add the numbers from 1 to 5, well...

  • this is just 5 + the sum of the numbers from 1 to 4
  • and that last sum is 4 + the sum of the numbers from 1 to 3
  • and that last sum is 3 + the sum of the numbers from 1 to 2
  • and that last sum is 2 + the sum of the numbers from 1 to 1
  • but the sum of the numbers from 1 to 1 is just 1!

We then work our way back up that series of statements and get the answer: 5 + 4 + 3 + 2 + 1.

That last step - that the sum of the numbers from 1 to 1 is just 1 - is called a base case. Without base cases, recursive functions would not terminate, and we'd get the dreaded stack overflow error! What's a stack overflow error? This fun video explains it:

https://www.youtube.com/watch?v=-PX0BV9hGZY

Recursive functions like these work their way from larger inputs to smaller inputs, so if a base case is specified, how can a function NOT stop at it? Here are some ways:

  • we call the function with a non-integer value or a negative value
  • instead of decreasing the value by one as we did with the sum example, we decrement by two
  • instead of decrementing by a fixed amount, we divide the input by, e.g., ten at each step

The common problem in each case is that the function misses the base case as it works on smaller and smaller values. Each of these situations can be fixed, but if they're not anticipated, we get a stack overflow error and must then fix the base case in hindsight.

Here's an example of the third situation:

In [1]:
Out[1]:
f (generic function with 1 method)
In [2]:
Out[2]:
3
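
Since the cell contents themselves aren't shown, here's a sketch of a definition consistent with those outputs - the name f matches the stack trace below, while the body (a base case of n = 1 returning 0, and a recursive step dividing by ten) is an assumption based on the surrounding description:

    # A sketch (assumed, not the original cell): count how many times n
    # can be divided by ten before reaching 1.
    function f(n)
        if n == 1                # base case
            return 0
        end
        return 1 + f(n / 10)     # recursive step: divide the input by ten
    end

    f(1000)                      # 3, matching the output above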

This function works great when given a power of ten, like 1, 10, 100, 1000, etc., but fails on other values, because dividing by ten at each step then skips right past the base case of n = 1 without ever hitting it exactly. Let's try it:

In [3]:
StackOverflowError:

Stacktrace:
 [1] f(n::Float64) (repeats 79984 times)
   @ Main ./In[1]:5

See, a stack overflow error.

By the way, what exactly does this function calculate when it does terminate?

Later we'll see an example of a recursive function that uses two base cases.

The Factorial Function

For our first real example, consider the factorial function: this function, written n!, takes a positive integer n as input and returns the product of all the integers from one to that number. For example, 4! = 4 * 3 * 2 * 1 = 24. Here's how it can be defined recursively:

$$n! = \begin{cases} 1 & \text{if } n = 1 \\ n \cdot (n-1)! & \text{otherwise} \end{cases}$$

Here's how to translate that definition into a Julia function:

In [4]:
Out[4]:
factorial (generic function with 1 method)
In [5]:
Out[5]:
24
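
Since the cell isn't shown, here's a sketch of a direct translation of that definition (note that this shadows Julia's built-in factorial with our own one-method version, consistent with the output above):

    # A sketch (assumed): the recursive definition, translated directly.
    function factorial(n)
        if n == 1                    # base case
            return 1
        end
        return n * factorial(n - 1)  # recursive step
    end

    factorial(4)                     # 24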

Let's make sure we understand how this works by walking through it by hand:

factorial(4) = 4 * factorial(3)
             = 4 * (3 * factorial(2))
             = 4 * (3 * (2 * factorial(1)))
             = 4 * (3 * (2 * 1))
             = 4 * (3 * 2)
             = 4 * 6
             = 24

Whenever a function is called, a "program state" is saved to a "stack"; once the function is finished, that program state is popped off the stack and the program continues. As we can see, the factorial function as written consumes memory for a whole stack of program states, on top of performing the actual multiplications!

Can we do better?

Tail Recursion for the Factorial Function

One approach to improving the calculation is to use what is called tail recursion. This is a programming technique where the recursive call is the very last thing the function does. Note that the implementation of factorial we just wrote does not operate in this way: after the recursive call, we still have to do all the multiplication!

The idea of tail recursion is to maintain the intermediate results as an argument of the function. In a sense, this extra argument acts like an accumulator.

Ordinarily we think of the factorial function as requiring exactly one argument, so how do we get this extra argument? In languages that don't allow a function's arguments to have default values, one would have to write two functions:

  • one function with two arguments that does the actual tail recursion work, and
  • another function, a wrapper function, with one argument that calls the first function while initializing the value of the second argument

Here are the two functions, written without benefit of default values:

In [6]:
Out[6]:
factorial_2 (generic function with 1 method)
In [7]:
Out[7]:
24
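
Here's a sketch of what those two cells may have contained; the helper name factorial_helper is hypothetical, since only factorial_2 appears in the output above:

    # A sketch (assumed): the helper does the tail-recursive work...
    function factorial_helper(n, accumulator)
        if n == 1
            return accumulator                            # base case: the product is complete
        end
        return factorial_helper(n - 1, n * accumulator)   # multiply on the way down
    end

    # ...and the wrapper initializes the accumulator.
    factorial_2(n) = factorial_helper(n, 1)

    factorial_2(4)    # 24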

Fortunately, Julia allows a function's arguments to have default values! Here is how we would then write the factorial function using tail recursion, without the need for a wrapper function:

In [8]:
Out[8]:
factorial_TR (generic function with 2 methods)
In [9]:
Out[9]:
24
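
A sketch of that version follows; the default value is why Julia reports two methods, one taking just n and one taking both n and the accumulator:

    # A sketch (assumed): tail recursion with a default-valued accumulator.
    function factorial_TR(n, accumulator = 1)
        if n == 1
            return accumulator
        end
        return factorial_TR(n - 1, n * accumulator)
    end

    factorial_TR(4)   # 24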

Let's walk through the calculation by hand:

factorial_TR(4) = factorial_TR(4, 1) = factorial_TR(3, 4) = factorial_TR(2, 12) = factorial_TR(1, 24) = 24

The last line follows because the implementation of factorial_TR tells us to return the accumulator when n is 1. That's the base case.

Notice how the multiplication is performed as factorial_TR is called, instead of waiting for the recursion to "unwind" as in the first factorial implementation.

Can we do better?

Calculating Factorials Using a Loop

Recursion can be avoided, at least with factorials, by just following the original description of the factorial function as given above: "take a positive integer n as input and return the product of all integers from one to that number." Really, this is just a loop, so we can implement that description in Julia as follows:

In [10]:
Out[10]:
factorial_as_loop (generic function with 1 method)
In [11]:
Out[11]:
24
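
As a sketch, the loop version may have looked like this:

    # A sketch (assumed): multiply the integers from 1 to n together.
    function factorial_as_loop(n)
        product = 1
        for i in 1:n
            product *= i
        end
        return product
    end

    factorial_as_loop(4)   # 24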

This is perhaps the most understandable way of calculating factorials.

The Euclidean Algorithm

Suppose we wish to find the greatest common divisor (GCD) of two positive integers a and b. This is the largest whole number that divides evenly into both a and b. How can we do this?

A first solution one may try is to factor a and b into primes, and take the product of the common factors; if a and b have no primes in common, their GCD is just 1. For example, let a = 6750 and b = 1400. Then:

$$a = 6750 = 2^1 \cdot 3^3 \cdot 5^3$$

$$b = 1400 = 2^3 \cdot 5^2 \cdot 7^1$$

Both a and b have 2 as a factor, and the smallest number of 2s is found in a; 5 is a factor of both, and the smallest number of 5s is found in b, so:

$$\gcd(6750, 1400) = 2^1 \cdot 5^2 = 50$$

The problem with this approach is that it requires us to find the prime factorization of both numbers - not an easy task!

The immediate purpose of the Euclidean algorithm is to find the GCD of the two numbers without first factoring them into primes. The fundamental idea behind this algorithm is that the GCD of a and b (where a ≥ b) is the same as the GCD of a - b and b. This is not an obvious statement - see any elementary number theory textbook for a proof.

This fundamental idea tells us, for example, that

$$\gcd(6750, 1400) = \gcd(5350, 1400)$$

We can check this as follows:

$$a - b = 5350 = 2^1 \cdot 5^2 \cdot 107^1$$

$$b = 1400 = 2^3 \cdot 5^2 \cdot 7^1$$

and, using the common prime counting trick,

$$\gcd(5350, 1400) = 2^1 \cdot 5^2 = 50.$$

This key insight - that the GCD of a and b (where a ≥ b) is the same as the GCD of a - b and b - basically states how to convert the original task of finding the GCD into the same task, but with smaller numbers. That sounds like recursion!

Of course, if there were no base case, the Euclidean algorithm would never terminate. The base case is this: the GCD of a number b and 0 is just b.

Let's try implementing it!

In [12]:
Out[12]:
gcd (generic function with 1 method)
In [13]:
Out[13]:
50
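
Here's a sketch of an implementation consistent with the walkthrough below (the exact branch structure is an assumption): return the first argument when the second is zero, swap when the first is smaller, and otherwise subtract.

    # A sketch (assumed); note this shadows Julia's built-in gcd.
    function gcd(a, b)
        if b == 0
            return a          # base case: the GCD of any number and 0 is that number
        end
        if a < b
            return gcd(b, a)  # swap so the larger value comes first
        end
        return gcd(a - b, b)  # the fundamental idea: gcd(a, b) = gcd(a - b, b)
    end

    gcd(6750, 1400)   # 50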

Let's walk through the execution of this function, focusing on the times that a and b are swapped, and the point where the function calls itself:

gcd(6750, 1400) = gcd(5350, 1400) = gcd(3950, 1400) = gcd(2550, 1400) = gcd(1150, 1400) = gcd(1400, 1150) = gcd(250, 1150) = gcd(1150, 250) = gcd(900, 250) = gcd(650, 250) = gcd(400, 250) = gcd(150, 250) = gcd(250, 150) = gcd(100, 150) = gcd(150, 100) = gcd(50, 100) = gcd(100, 50) = gcd(50, 50) = gcd(0, 50) = gcd(50, 0) = 50

We can modify the implementation so that we aren't subtracting b from a over and over, as in the first four steps above where we repeatedly subtracted 1400. Really, the purpose of all this subtraction is to find the remainder - how much is left over after subtracting 1400 from 6750 as many times as possible while still getting something non-negative.

Julia has an operator for finding the remainder: %. Let's use it!

In [14]:
Out[14]:
gcd (generic function with 1 method)
In [15]:
Out[15]:
50
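
A sketch of the remainder version - the same shape as before, with the subtraction step replaced:

    # A sketch (assumed): one % replaces many repeated subtractions.
    function gcd(a, b)
        if b == 0
            return a
        end
        if a < b
            return gcd(b, a)   # still swapping explicitly
        end
        return gcd(a % b, b)
    end

    gcd(6750, 1400)   # 50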

Let's walk through this by hand to see what we gained:

gcd(6750, 1400) = gcd(1150, 1400) = gcd(1400, 1150) = gcd(250, 1150) = gcd(1150, 250) = gcd(150, 250) = gcd(250, 150) = gcd(100, 150) = gcd(150, 100) = gcd(50, 100) = gcd(100, 50) = gcd(0, 50) = gcd(50, 0) = 50

This is better, but notice that we have to swap the arguments each time. By definition, the remainder of a divided by b must be smaller than b, so we can swap them when we recursively call the function:

In [16]:
Out[16]:
gcd (generic function with 1 method)
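
As a sketch, the swap-on-the-recursive-call version might be:

    # A sketch (assumed): since a % b is always smaller than b, passing
    # (b, a % b) keeps the arguments in order without a separate swap afterwards.
    function gcd(a, b)
        if b == 0
            return a
        end
        if a < b
            return gcd(b, a)   # initial swap, removed in the next version
        end
        return gcd(b, a % b)
    end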

We can go further: there is no need to swap the numbers at the start, since b is always greater than a % b. This, then, is our final recursive implementation of Euclid's algorithm:

In [17]:
Out[17]:
gcd (generic function with 1 method)
In [18]:
Out[18]:
50
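
A sketch of this final version:

    # A sketch (assumed). If a < b on the first call, then a % b is just a,
    # so gcd(b, a % b) performs the swap automatically.
    function gcd(a, b)
        if b == 0
            return a
        end
        return gcd(b, a % b)
    end

    gcd(6750, 1400)   # 50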

To compare this with previous versions, here's a walkthrough of this last implementation:

gcd(6750, 1400) = gcd(1400, 1150) = gcd(1150, 250) = gcd(250, 150) = gcd(150, 100) = gcd(100, 50) = gcd(50, 0) = 50

Nice - no swaps and no repeated subtractions!

The Euclidean Algorithm - the Iterative Version

The above implementation is correct, but it doesn't quite match the way the Euclidean algorithm is described in many a modern elementary number theory textbook. Instead, the algorithm is presented as a loop - we're supposed to repeatedly subtract the smaller number from the larger one, swapping as we go along, until the smaller one is zero. At that point, the larger number is the GCD.

We can use the above insight about remainders to avoid the repeated subtractions, and here's how it would be implemented:

In [19]:
Out[19]:
gcd_as_loop (generic function with 1 method)
In [20]:
Out[20]:
50
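
As a sketch, here's an iterative version consistent with that output:

    # A sketch (assumed): replace (a, b) with (b, a % b) until b is zero.
    function gcd_as_loop(a, b)
        while b != 0
            a, b = b, a % b
        end
        return a
    end

    gcd_as_loop(6750, 1400)   # 50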

This implementation precisely matches the Euclidean algorithm as described in modern textbooks, where a typical example calculation might look like this, each step moving the value in column b over to column a and replacing b with the remainder:

a      b
6750   1400
1400   1150
1150   250
250    150
150    100
100    50
50     0

While this is surely an efficient method for computation of the GCD, does it maintain the insight as to why the Euclidean algorithm works?

The Fibonacci Numbers

Here's our last example: the Fibonacci numbers. This is a sequence of numbers related to the growth of plants, animal populations, etc. The first few values of this sequence are:

0, 1, 1, 2, 3, 5, 8, 13, 21, ...

The first value is zero, the second value is one, and subsequent values are the sum of the two preceding terms. Here's how it can be defined as a recurrence relation:

$$F_n = \begin{cases} 0 & \text{if } n = 0 \\ 1 & \text{if } n = 1 \\ F_{n-1} + F_{n-2} & \text{otherwise} \end{cases}$$

Again, this is easy to translate into a Julia function:

In [21]:
Out[21]:
fib (generic function with 1 method)
In [22]:
Out[22]:
13
In [23]:
Out[23]:
5
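
A sketch of the direct translation:

    # A sketch (assumed): the recurrence relation, translated directly.
    function fib(n)
        if n == 0
            return 0                     # first base case
        elseif n == 1
            return 1                     # second base case
        end
        return fib(n - 1) + fib(n - 2)   # recursive step
    end

    fib(7)   # 13
    fib(5)   # 5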

This Fibonacci function has not only the space and delayed-computation problems of the recursive factorial function, it also has a problem of repeated calculation! To see what I mean, let's walk through a small example:

fib(5) = fib(4) + fib(3)
       = (fib(3) + fib(2)) + (fib(2) + fib(1))
       = ((fib(2) + fib(1)) + (fib(1) + fib(0))) + ((fib(1) + fib(0)) + fib(1))
       = (((fib(1) + fib(0)) + fib(1)) + (fib(1) + fib(0))) + ((fib(1) + fib(0)) + fib(1))
       = (((1 + 0) + 1) + (1 + 0)) + ((1 + 0) + 1)
       = ((1 + 1) + 1) + (1 + 1)
       = (2 + 1) + 2
       = 3 + 2
       = 5

Notice how many times fib(3) had to be calculated!

What can be done about this?

Memoization

Memoization is a method for storing the results of calculations for later use: instead of calculating fib(2) over and over, we calculate it once, and then whenever the value is needed again, we look it up rather than repeat the calculation.

How should we store these calculations? One approach is to use dictionaries.

In [24]:
Out[24]:
Dict{Int64, Int64} with 2 entries:
  0 => 0
  1 => 1
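
The cell that produced this presumably created a dictionary seeded with the two base cases; as a sketch (the name FibonacciDict comes from the text below):

    # A sketch (assumed): a dictionary mapping n to the nth Fibonacci number,
    # seeded with the two base cases.
    FibonacciDict = Dict(0 => 0, 1 => 1)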

To make use of this, we make a new version of the fib function. Since FibonacciDict stores the first two values of the sequence, we don't need the base cases.

In [25]:
Out[25]:
fib_memoized (generic function with 1 method)
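
A sketch of the memoized version:

    # A sketch (assumed): look the value up if we already have it; otherwise
    # compute it once and store it. No explicit base cases are needed,
    # because FibonacciDict is seeded with them.
    function fib_memoized(n)
        if haskey(FibonacciDict, n)
            return FibonacciDict[n]      # already known: just look it up
        end
        value = fib_memoized(n - 1) + fib_memoized(n - 2)
        FibonacciDict[n] = value         # store for later use
        return value
    end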

Before running this function...

In [26]:
Out[26]:
Dict{Int64, Int64} with 2 entries:
  0 => 0
  1 => 1
In [27]:
Out[27]:
13

... and afterwards:

In [28]:
Out[28]:
8-element Vector{Pair{Int64, Int64}}:
 0 => 0
 1 => 1
 2 => 1
 3 => 2
 4 => 3
 5 => 5
 6 => 8
 7 => 13
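
Those three cells presumably amounted to something like this sketch:

    FibonacciDict                   # before: only the two seeded entries
    fib_memoized(7)                 # 13
    sort(collect(FibonacciDict))    # after: every intermediate value is stored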

What memoization does is trade the effort needed for duplicate calculations (time) for the cost of storing those calculations in memory (space). This type of space-time tradeoff is very common in computer programming.

Can we do better?

Tail Recursion for the Fibonacci Numbers

Let's try tail recursion. One way of implementing this is to use two extra arguments for storing the previous two values in the series.

As mentioned above, in languages where a function's arguments cannot have default values, we would use two functions: one to perform the calculation, and a wrapper function.

In [29]:
Out[29]:
fib_2_without_defaults (generic function with 1 method)
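
A sketch of the two-function version; the helper name fib_helper is hypothetical, since only fib_2_without_defaults appears in the output above:

    # A sketch (assumed): a and b hold two consecutive Fibonacci numbers.
    function fib_helper(n, a, b)
        if n == 0
            return a                        # after n shifts, a is the answer
        end
        return fib_helper(n - 1, b, a + b)  # shift the pair forward
    end

    # The wrapper seeds the pair with the first two Fibonacci numbers.
    fib_2_without_defaults(n) = fib_helper(n, 0, 1)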

Here's how we'd implement it using default values:

In [30]:
Out[30]:
fib_TR (generic function with 3 methods)
In [31]:
Out[31]:
13
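
A sketch of that version; the two default values are why Julia reports three methods:

    # A sketch (assumed): the same idea, with default values for the pair.
    function fib_TR(n, a = 0, b = 1)
        if n == 0
            return a
        end
        return fib_TR(n - 1, b, a + b)
    end

    fib_TR(7)   # 13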

Were there any duplicate calculations? Let's walk through this by hand to see:

fib_TR(7) = fib_TR(7, 0, 1) = fib_TR(6, 1, 1) = fib_TR(5, 1, 2) = fib_TR(4, 2, 3) = fib_TR(3, 3, 5) = fib_TR(2, 5, 8) = fib_TR(1, 8, 13) = fib_TR(0, 13, 21) = 13

So no duplicate calculations, yay!

Can we do better?

Fibonacci Numbers - the Iterative Version

Another way we can calculate the nth Fibonacci number is to start with 0 and 1, add them, then use that result to calculate the next number in the sequence, repeating this n times. Each time through the loop, we add the two previous values, then shift forward: the sum and its immediate predecessor become the new pair of previous values.

In [32]:
Out[32]:
fib_as_loop (generic function with 1 method)
In [33]:
Out[33]:
13
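
As a sketch, an implementation matching the trace table below:

    # A sketch (assumed): a and b hold consecutive Fibonacci numbers;
    # shift the pair forward n times.
    function fib_as_loop(n)
        a, b = 0, 1
        for i in 1:n
            a, b = b, a + b
        end
        return a
    end

    fib_as_loop(7)   # 13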

Here are the values of n, a, and b as fib_as_loop(7) executes:

n a b
- 0 1
1 1 1
2 1 2
3 2 3
4 3 5
5 5 8
6 8 13
7 13 21

This is a very intuitive approach, but still requires us to calculate all the previous terms in the Fibonacci sequence in order to get the number we want. Can we do better?

Closed Form of the Fibonacci Sequence

Our final approach to calculating the Fibonacci numbers is to use what is called the "closed form" of this sequence. Not all recurrence relations have a closed form, but as luck would have it, there is a formula, called Binet's formula, for calculating the nth Fibonacci number without calculating the prior values. This formula uses two constants, one of which is the golden ratio:

$$\phi = \frac{1 + \sqrt{5}}{2}$$

$$\psi = \frac{1 - \sqrt{5}}{2} = 1 - \phi$$

and the Binet formula itself is:

$$F_n = \frac{1}{\sqrt{5}} \left( \phi^n - \psi^n \right)$$

The golden ratio is available in Julia's Base.MathConstants module:

In [34]:
Out[34]:
φ = 1.6180339887498...
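
That output is how Julia displays the built-in constant, so the cell was presumably just:

    Base.MathConstants.golden   # φ = 1.6180339887498...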

and this makes Binet's formula even easier to convert to a Julia function:

In [35]:
Out[35]:
fib_Binet (generic function with 1 method)
In [36]:
Out[36]:
5.0
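
A sketch of the translation (note that everything here is ordinary Float64 arithmetic, which is why the result is 5.0 rather than the integer 5):

    # A sketch (assumed): Binet's formula, using the built-in golden ratio.
    function fib_Binet(n)
        ϕ = Base.MathConstants.golden
        ψ = 1 - ϕ
        return (ϕ^n - ψ^n) / sqrt(5)
    end

    fib_Binet(5)   # ≈ 5.0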

In exact arithmetic the formula is completely accurate and always returns an integer (which is in itself amazing), but we are working with limited-precision arithmetic, so we should expect rounding errors. We do indeed get them:

In [37]:
Out[37]:
832039.9999999999

We can fix this by rounding, but we can do better by noticing that

$$\left| \frac{\psi^n}{\sqrt{5}} \right| < \frac{1}{2}$$

for all n ≥ 0. Thus:

$$F_n = \left\lfloor \frac{\phi^n}{\sqrt{5}} + \frac{1}{2} \right\rfloor$$

and that is just $\phi^n / \sqrt{5}$ rounded to the nearest integer. Let's try it out!

In [38]:
Out[38]:
fib_Binet2 (generic function with 1 method)
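
A sketch of this improved version:

    # A sketch (assumed): drop the ψ term entirely and round ϕ^n / sqrt(5)
    # to the nearest integer.
    function fib_Binet2(n)
        ϕ = Base.MathConstants.golden
        return round(Int, ϕ^n / sqrt(5))
    end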

Using Binet's formula we can calculate extremely large Fibonacci numbers:

In [39]:
Out[39]:
7540113804746344448

Bear in mind that Float64 carries only about 16 significant digits, so the last few digits of the number above are rounding artifacts; the exact value of F92 is 7540113804746346429. Which of the other versions of the Fibonacci function can calculate F92 without encountering some sort of error?

Conclusion

In this chapter we used recursive functions to compute GCDs, factorials, and Fibonacci numbers... only to rewrite them in non-recursive forms!

Does this mean that recursive functions are to be avoided at all costs?

No! And here's why:

  1. Recursion is an intuitive and natural way for describing and solving problems (though this intuition only comes from experience)
  2. Many problems in compiler design are best solved using recursion
  3. For certain data structures, like trees, it makes sense to use recursion to operate upon them
  4. The performance of recursive algorithms can frequently be analyzed and understood
  5. It makes sense to produce certain types of graphics using recursion; we'll see this later.

There are extremely few hard-and-fast rules when it comes to the art of computer programming (despite the bilge expounded by "best practice experts," "clean coders," and "style guides"), and completely banishing recursion is not one of them. Like any other tool, recursion must be used intelligently in order to get good results.
