- Define the invariant for binary nodes that characterises
the ordering property of items in a binary search tree.
For all internal nodes in the tree (assuming node elements are Comparable):

( left.isExternal() || left.element().compareTo(this.element()) <= 0 ) &&(right.isExternal() ||right.element().compareTo(this.element()) >= 0 )

- Define the invariant for nodes in an AVL tree.
Assume that nodes have a query method
`height()`

.invariant for a BST node above && // height balanced ( isExternal() || Math.abs(left.height() - right.height()) <= 1 ) && // check on height ( isExternal() ? height() == 0 : height() == 1 + Math.max(left.height() - right.height())

- Trace the effect of
**inserting**keys into an initially empty AVL Tree in the following order. Depict the state of the tree before any required rotations (showing where the tree is out of balance), and the final state of the tree:`5, 2, 1, 6, 8, 3, 4`

5* 2 / => / \ 2 (single) 1 5 / 1 2 2 / \ / \ 1 5* 1 6 \ => / \ 6 (single) 5 8 \ 8 2* 2* 5 / \ / \ / \ 1 6 => 1 5 => 2 6 / \ (double 1) / \ (double 2) / \ \ 5 8 3 6 1 3 8 / \ 3 8 Final Tree: 5 / \ 2 6 / \ \ 1 3 8 \ 4

- Now trace the effect of
**removing**keys from the resulting tree, in the same order that they were inserted.5 6* 6* 3 / \ / \ / \ / \ 2 6 => 2 8 => 3 8 => 2 6 / \ \ (remove) / \ (double 1) / \ (double 2) / / \ 1 3 8 1 3 2 4 1 4 8 \ \ / 4 4 1 3 6 \ / \ 6 3 8 / \ => \ 4 8 (single) 4

Note: by convention, when the left-right and right-right grandchildren have equal height, we choose a single rotation (See G&T, page 399).8 4 / => / \ 3 (double) 3 8 \ 4

**3. Insertion and Removal in (2,4) Trees**

- Trace the effect of
**inserting**keys into an initially empty (2,4) Tree in the following order:`1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14`

+-----+ | 3 | +---------+ +-----+ | 1 2 3 4 | => / \ +---------+ +-----+ +-----+ | 1 2 | | 4 | +-----+ +-----+ +-----+ +---------+ | 3 | | 3 6 | +-----+ +---------+ / \ => / | \ +-----+ +---------+ +-----+ +-----+ +-----+ | 1 2 | | 4 5 6 7 | | 1 2 | | 4 5 | | 7 | +-----+ +---------+ +-----+ +-----+ +-----+ +---------+ +-------------+ | 3 6 | | 3 6 9 | +---------+ +-------------+ / | \ => / | | \ +-----+ +-----+ +----------+ +-----+ +-----+ +-----+ +----+ | 1 2 | | 4 5 | | 7 8 9 10 | | 1 2 | | 4 5 | | 7 8 | | 10 | +-----+ +-----+ +----------+ +-----+ +-----+ +-----+ +----+ +-------------+ +----------------------+ | 3 6 9 | | 3 6 9 12 | +-------------+ => +----------------------+ / | | \ / | | | \ +---+ +---+ +---+ +-------------+ +---+ +---+ +---+ +-----+ +----+ |1 2| |4 5| |7 8| |9 10 11 12 13| |1 2| |4 5| |7 8| |10 11| | 13 | +---+ +---+ +---+ +-------------+ +---+ +---+ +---+ +-----+ +----+ +-------+ | 9 | +-------+ / \ +---------+ +------+ => | 3 6 | | 12 | +---------+ +------+ / | \ / \ +---+ +---+ +---+ +-----+ +----+ |1 2| |4 5| |7 8| |10 11| | 13 | +---+ +---+ +---+ +-----+ +----+

- Now trace the effect of
**removing**keys from the resulting tree, in the same order that they were inserted.+-------+ +-------+ | 9 | | 9 | +-------+ +-------+ / \ / \ +-----+ +----+ +-----+ +----+ | 3 6 | | 12 | => | 4 6 | | 12 | +-----+ +----+ +-----+ +----+ / | \ / \ / | \ / \ +--+ +---+ +---+ +-----+ +-----+ +---+ +---+ +---+ +-----+ +-----+ | | |4 5| |7 8| |10 11| |13 14| | 3 | | 5 | |7 8| |10 11| |13 14| +--+ +---+ +---+ +-----+ +-----+ +---+ +---+ +---+ +-----+ +-----+ +-------+ +-------+ | 9 | | 9 | +-------+ +-------+ / \ / \ +------+ +------+ +-----+ +------+ | 4 6 | | 12 | => | 6 | | 12 | +------+ +------+ +-----+ +------+ / | \ / \ / \ / \ +--+ +---+ +---+ +-----+ +-----+ +---+ +---+ +-----+ +-----+ | | | 5 | |7 8| |10 11| |13 14| |4 5| |7 8| |10 11| |13 14| +--+ +---+ +---+ +-----+ +-----+ +---+ +---+ +-----+ +-----+ +-------+ +-------+ | 9 | | 9 | +-------+ +-------+ / \ / \ +-----+ +------+ +-----+ +------+ | 6 | | 12 | => | 7 | | 12 | +-----+ +------+ +-----+ +------+ / \ / \ / \ / \ +---+ +-----+ +-------+ +-------+ +---+ +---+ +-------+ +-------+ | | | 7 8 | | 10 11 | | 13 14 | | 6 | | 8 | | 10 11 | | 13 14 | +---+ +-----+ +-------+ +-------+ +---+ +---+ +-------+ +-------+ +-------+ +-------+ | 9 | | 9 | +-------+ +-------+ / \ / \ +-----+ +------+ +---+ +------+ | 7 | | 12 | => | | | 12 | +-----+ +------+ +---+ +------+ / \ / \ | / \ +---+ +-----+ +-------+ +-------+ +--------+ +-------+ +-------+ | | | 8 | | 10 11 | | 13 14 | | 7 8 | | 10 11 | | 13 14 | +---+ +-----+ +-------+ +-------+ +--------+ +-------+ +-------+ +---------+ | 9 12 | => +---------+ / | \ +-----+ +-------+ +-------+ | 7 8 | | 10 11 | | 13 14 | +-----+ +-------+ +-------+ +---------+ +----------+ | 9 12 | | 10 12 | +---------+ +----------+ / | \ => / | \ +---+ +-------+ +-------+ +-----+ +----+ +-------+ | | | 10 11 | | 13 14 | | 9 | | 11 | | 13 14 | +---+ +-------+ +-------+ +-----+ +----+ +-------+ +----------+ +------+ | 10 12 | | 12 | +----------+ +------+ / | \ => / \ +---+ +----+ +-------+ +-------+ +-------+ | | | 11 | | 13 14 | | 10 11 | | 13 14 | +---+ +----+ +-------+ +-------+ +-------+ +------+ +------+ | 12 | | 13 | +------+ +------+ / \ => / \ +---+ +-------+ +----+ +----+ | | | 13 14 | | 12 | | 14 | +---+ +-------+ +----+ +----+ +------+ | 13 | +------+ +-------+ / \ => | 13 14 | +---+ +----+ +-------+ | | | 14 | +---+ +----+

**4. Height of AVL Trees**

- Show that the
**minimum**number`N`

of items in an AVL tree of height`H`

is`Fib(H+3)-1`

*Hint: the recurrence relation for this was discussed in lectures, and the textbook, so you just need to use induction to prove that the solution to the recurrence is the shifted Fibonacci sequence.*The recurrence relation (see lecture slides or text book) is: N(0) = 1 N(1) = 2 N(H) = 1 + N(H-1) + N(H-2) if H > 1 Compare this with Fib(H+3): H : 0 1 2 3 4 5 .. N(H) : 1 2 4 7 12 20 .. K : 0 1 2 3 4 5 6 7 8 .. Fib(K): 0 1 1 2 3 5 8 13 21 .. Required to prove: N(H) = Fib(H+3) - 1 Proof by induction: When H=0, N(0) = 1 = Fib(3) - 1 When H=1, N(1) = 2 = Fib(4) - 1 Inductive assumption: assume result for values up to H > 1: Then N(H) = 1 + N(H-1) + N(H-2) [given recurrence] = 1 + ( Fib(H+2) - 1 ) + ( Fib(H+1) - 1 ) [inductive assumption] = Fib(H+3) - 1 [defn of Fib] as required

- What is the
**maximum**number`N`

for a given`H`

?`2^(h+1) - 1`

- For a tree with 1023 (= 2^10 - 1) items what is the minimum height? the maximum?
The minimum possible height is h = 10.

fib(16) = 987 and fib(17) = 1597. It follows that for a tree of height h = 15, n >= 1596. So with n =1023, the maximum height is h = 14.

- For a tree with 1048575 (= 2^20 - 1) nodes what are the minimum and maximum?
Minimum: h = 20.

fib(30) = 832040, fib(31) = 1346269, so maximum height is: h = 28.

**CHALLENGE Questions** (for the mathematically inclined)

- How may differently shaped binary trees with
`N`

items are there?

Starting with`N=0`

, show that successive values are:`1,1,2,5,14...`

by enumerating the possible tree shapes. These numbers are called Catalan numbers.n=0: 1 tree: the empty tree n=1: 1 tree with a single root n=2: 2 trees x x / \ x x n=3: 5 trees x x x x x / / / \ \ \ x x x x x x / \ / \ x x x x n=4: 14 trees x x x x x / / / / / x x x x x / / / \ \ \ x x x x x x / \ / \ x x x x x x / \ / \ x x x x / \ x x x x / \ / \ x x x x / \ x x x x x x x \ \ \ \ \ x x x x x / / / \ \ \ x x x x x x / \ / \ x x x x

Derive a recurrence relation for the Catalan numbers, and an explicit formula in terms of`N`

.

*The last part is harder. You may like to do a search on Catalan numbers to find the explicit formula, and prove by induction that it is the solution to your recurrence relation.*Let t(n) be the number of different binary trees shapes with n items. We have: t(0) = 1, [there is 1 empty tree] With n+1 nodes, we can count the number of trees of different shapes, by allocating 1 position at the root, and then allocating k nodes to the left and n - k nodes to the right subtree, for each value of k = 0 .. n. So we get the following recurrence relation: t(n+1) = t(0)*t(n) + t(1)*t(n-1) + t(2)*t(n-2) + .. + t(n) * t(0) The solution to this is (you can verify this using a proof by induction) is: t(n) = (2n)! / ( n! * (n+1)! ) Here is a tabulation: n : 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 t(n) : 1 1 2 5 14 42 132 429 1430 4862 16796 58786 208012 742900 2674440 9694845 35357670 Asymptotically, as the number of items increases by 1, the number of trees quadruples.

- How many different ways can you write balanced parenthetical expressions using
`N`

pairs of parentheses? For example:`()()()()`

and`(())()`

are valid expressions, whereas`(()))(`

is not.

A simple grammar describing such expressions is:P ::= empty expression | '(' P ')' P

*Hint: think about binary trees representing parenthesised expressions.*For n = O: 1 valid expression the empty expression For n = 1: 1 valid expression () For n = 2: 2 valid expressions (()) ()() For n = 3: 5 valid expressions ((())) (()()) (())() ()(()) ()()() For n = 4: 14 valid expressions (((()))) ((()())) ((())()) (()(())) (()()()) ((()))() (()())() (())(()) (())()() ()((())) ()(()()) ()(())() ()()(()) ()()()()

The number of valid expressions with n pairs of parentheses is the same as the number of binary trees with n items. The correspondence can be seen from the grammar if you interpret it as a binary tree generator: the`'(' P ')'`

is an instruction to make a left sub-tree, and the following`P`

is an instruction to make a right sub-tree.Also note that this grammar corresponds to valid sequences of stack operations, where

`P`

represents sequences of pushes and pops, with no pops on an empty stack,`'('`

represents the pushes, and`')'`

represents the pops. - How many differently shaped AVL trees with a given height
`H`

are there?

Let c(h) be the number of AVL trees with height h. For h = 0, c = 1; h = 1, c = 1; h = 2, c = 3; Recurrence for h+1: to get an AVL tree of height (h+1), both sub-trees must have height (h-1) or h, and at least one of them must have height h, so c(h+1) = c(h) * c(h) + c(h-1) * c(h) + c(h) * c(h-1) = c(h) * ( c(h) + 2*c(h-1) ) This recurrence has no simple closed form solution that I am aware of. Here are the first few terms (I used Haskell to generate these values): h c 0 1 1 1 2 3 3 15 4 315 5 108675 6 11878720875 7 141106591466142946875 8 19911070158545297149037891328865229296875 9 39645071485851304455281818836461083701971963604987697945684203361075660 0341796875 10 15717316931182601571960181064602210498576956361668877698434766859653851 16930439741446186333028489448438308850843238872672700842128674239678896 45644664764404296875 This grows very very quickly. It is easy to derive a lower bound for this sequence: c(h) > c(h-1) for h > 1 So, using the recurrence above: c(h+1) > c(h) * (c(h) + 2*c(h)) = 3 * c(h)^2 The recurrence: c(2) >= 1 c(h+1) >= 3 * c(h)^2 has the solution (provable by induction): c(h) >= 3 ^ (2 ^ (h-1)) This double exponential provides a (very conservative) lower bound on the number of AVL trees of height h.

How may differently shaped AVL trees with`N`

items are there?

Because AVL tree construction is constrained to be be balanced, to derive a recurrence for this, we need to factor the constraint in, by including the height h as a parameter of the recursive construction. Otherwise the construction is similar to that for binary trees considered above. Let b(n,h) be the number of AVL trees with n items and height h, and a(n) the number with n items. Then: a(n) = sum [ b(n, h) | h <- [0 .. ] ] and b(n, 0) | n == 0 = 1 | otherwise = 0 b(n, 1) | n == 1 = 1 | otherwise = 0 b(n, h) = sum [ b(k, h-1) * b(n-k-1, h-1) | k <- [0..n-1] ] + 2 * sum [ b(k, h-2) * b(n-k-1, h-1) | k <- [0..n-1] ] Here is the first part of the table for b: n\h | 0 1 2 3 4 5 6 7 ----|------------------------------------------- 0 | 1 0 0 0 0 0 0 0 1 | 0 1 0 0 0 0 0 0 2 | 0 0 2 0 0 0 0 0 3 | 0 0 1 0 0 0 0 0 4 | 0 0 0 4 0 0 0 0 5 | 0 0 0 6 0 0 0 0 6 | 0 0 0 4 0 0 0 0 7 | 0 0 0 1 16 0 0 0 8 | 0 0 0 0 32 0 0 0 9 | 0 0 0 0 44 0 0 0 10 | 0 0 0 0 60 0 0 0 11 | 0 0 0 0 70 0 0 0 12 | 0 0 0 0 56 128 0 0 13 | 0 0 0 0 28 448 0 0 14 | 0 0 0 0 8 864 0 0 15 | 0 0 0 0 1 1552 0 0 16 | 0 0 0 0 0 2720 0 0 17 | 0 0 0 0 0 4288 0 0 18 | 0 0 0 0 0 6312 0 0 19 | 0 0 0 0 0 9004 0 0 20 | 0 0 0 0 0 11992 4096 0 21 | 0 0 0 0 0 14372 22528 0 22 | 0 0 0 0 0 15400 67584 0 23 | 0 0 0 0 0 14630 159744 0 24 | 0 0 0 0 0 11968 334080 0 25 | 0 0 0 0 0 8104 644992 0 26 | 0 0 0 0 0 4376 1195008 0 27 | 0 0 0 0 0 1820 2158912 0 28 | 0 0 0 0 0 560 3811904 0 29 | 0 0 0 0 0 120 6617184 0 30 | 0 0 0 0 0 16 11307904 0 31 | 0 0 0 0 0 1 18978576 0 32 | 0 0 0 0 0 0 31327104 0 and some values for a(n) (the row sums): n : 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 a(n): 1 1 2 1 4 6 4 17 32 44 60 70 184 476 872 1553 2720 4288 6312 9004 16088 n : 21 22 23 24 25 26 27 28 29 30 31 32 a(n): 36900 82984 174374 346048 653096 1199384 2160732 3812464 6617304 11307920 18978577 31327104 By looking at the trend asymptotically, as the number of items increases by 1, the number of AVL trees approximately doubles (actually it seems closer to a factor of 1.9). Note that the number of trees c(h) for a given height h, as previously determined, also appears here as the column sums.

See Haskell implementations for these AVL tree counting functions if you wish.

Authors: Alan Blair, John Potter