Wednesday, December 3, 2008

Problem Solving Episode 2 – Program Correctness

In the duration of the course, we covered program correctness, proving code by precondition implies postcondition. By this, I wanted to introduce a technique my partner and I worked on for solving question 1 for assignment 2 instead of complete induction. For this episode, I will show the approach of how to create your own code (or my approach) for the ternary tree question.

Question: For arbitrary natural number n, find a (possibly recursive) formula for the number of non-equivalent ternary trees within nodes.

First: Understand the problem

First we have understand that we want to create ternary trees where we won’t have any duplicates trees given n nodes. This seems simple enough. Now I remember being told in highschool that the best way to understand the problem is by visually seeing the problem.
1) Draw cases for 1- 4 nodes (or how many you feel comfortable with to see a pattern).
So you won’t miss any trees, I always start from left to right so create trees manipulating the left subtree first, then the middle and lastly the right subtree by depth (remember this from csc148).
2) Ask yourself before writing your code, “Which coding style(s) would be the easiest for me (recursive, while loops, etc)?”

Second: Devise a Plan

From (2) of understand the problem, we need to put this to good use. Here is where we now need to our visual aspect of the problem and convert that to code. After finishing the cases of drawing, we can see trees of 2 nodes that repeat from 3 nodes. So there should have some recursion in it.

For ex. For trees with 3 nodes. Start with 0 nodes in the left subtree and find all the different combinations of ternary trees you can have. So we can have middle subtree with 2 nodes and right subtree with 0, middle subtree with 1 node and right subtree with 1. Now here is where the recursion comes along. We can find the different number of trees for 0,1 or 2 nodes by using a recursive function. Repeating the process with left subtree having 1 node and once more with 2 nodes. (note: we can’t have left subtree with 3 nodes since it is 3 nodes allowed = 1 (root) + 2). Then find the total.

In order for success in this code, we have to understand and see the pattern which is a combination of trees where you have the root (1 node) and the subtrees total nodes is n-1 nodes, where n is the total number of nodes.

Third: Carrying out the plan

Here is where the actual writing of the code starts to come together. There are a few key points we know already which will help us writing the code.
1) we have to iterate through the possibilities of the subtrees from having 0 nodes in the left subtree to n-1 nodes in the left subtree, where n is the total number of nodes.
2) when iterating, we are finding a combination of trees that we can have without overlapping.
3) since we are using some sort of recursion, we need a base case for nodes of 0 or 1 since there are only 1 possible ternary tree.

Pseudo-code:

# n is number of nodes, L number of node in left subtree (from (1) above)
def ter_tree(n,L):

#if the ternary tree is a tree with 0 or 1 nodes, return 1
#since there is only one possible way of creating a tree with
#0 or 1 nodes (from (3) above)
if n is 0 or 1: return 1

# Stop iterating when n is equal to L. Return 0 since total
# does not change when adding 0.
# total + 0 = total (at end of iterating)
if n is equal to L then stop iterating since n can’t be equal to L:
return 0

# Since each subtree can have n-1 nodes (n nodes minus the root)
# Return every possible subtree with n-1 nodes where the number of
# left(L), middle(M) and right(R) nodes equal to n-1.
# Ex*. Possible subtrees with 2 nodes are (in the form [L,M,R]):
# [[0,0,2],[0,1,1],[0,2,0],[1,0,1],[1,1,0],[2,0,0]]
total = 0
for i in range(n-L):
# the total number of possible ways is the number of
# ways for ternary tree with L nodes times the number
# of ways for ternary tree with i nodes times
# number of ways for ternary tree with n-i-1-L nodes.
# Note: This finds all possible ways with left subtree
# with L nodes and multiplication because
# it is combinations of subtrees.
total += ter_tree(L,0)*ter_tree(i,0)*ter_tree(n-i-1-L,0)
where ter_tree(L,0) will get the possibly trees with L nodes, and the rest of the calls is a manipulation to get the possibly combinations with n-L-1 nodes (i + (n-i-l-L) = n-L-1) and no overlapping.

# When done iterating through for-loop, find all possible ways.
# by making checking the number of ways with L+1 nodes.
# Then return the final number of ways to make a ternary tree
# of n nodes.
return total plus the total from the next possible combination for L+1 nodes in the left subtree with ter_tree(n,L+1)

Fourth: Wrapping Up

As you can see I did slightly half pseudo-code/half actual code. If you understand combinations well (if you are in statistics) then following or tracing your own code and seeing why it does what should be easy. Now what I have is one solution, there are tonnes of other ways to do it, but as long as it follows a double summation format, getting all the right combinations.

Monday, November 17, 2008

Problem Solving Episode- Inequalities

I guess I might as well do my first problem solving episode using the approach done by Polya.
I will show a way how to prove inequalities. For example, The proof will be very similar to proving for all natural numbers n, 2^n > 10n, but I will show how to prove k^n > 10n.

Question: Prove for all natural numbers n, there exist a natural number k, k^n > 10n

First: Understand the problem.

First we want to find a claim that suits the proof. So lets use predicate P(n), with parameter n. Claim: P(n): Prove for all natural numbers n, (*) there exist a natural number k, k^n > 10n. Now as you can see there is the unknown k. You get to choose a k. Now the claim should seem straight forward that you want to show that given any value of n, k^n will always be greater than 10n.

Second: Devise a Plan

As you notice, there is a missing part of the claim by (*). We must find a breakpoint that that it will satisfy the inequality for all natural numbers (we want in the form of n>=B, where B is the breakpoint). Here is where it gets slightly “messy”.

(1) By messiness, we will need to do a bit a scratch work in order to get a value for the breakpoint that satisfies the inequality.

(2) First look at the set of natural numbers (the set [0, inf]). Take the very base case which is n=0. As you can see k^0 = 1 is greater than 10(0) = 0. But don’t get ahead of yourself.

(3) Here is where the tricky part comes along. As we can see (2) does satisfy the proof (since 1 > 0) but don't assume that the breakpoint can be n>=0. Take a look at values n=1. Then we have k^1 > 10(1) but if we choose k to be 10 or less, this statement will fail.

(4) When you find two consecutive numbers that satisfy the inequality, then this is where we can call it our breakpoint.

For example- let k=2. If we have n=5 then 2^5 = 32 is not greater than 10(5) =50. Then when n=6, 2^6 = 64 is greater than 10(6) = 60. Check the next number for n=7 and we can see that 2^7 = 128 is greater than 10(7). Now we don't have to check anymore. The reason is because if you look at the types of functions that 10n and 2^n creates, 10n is a linear function and will always increase at a constant rate of 10 whereas 2^n is a curve and will increase more rapidly at the breakpoint. So our breakpoint would be n>=6 in this example.
Use this idea to find our breakpoint. So we can now say P(n): Prove for all natural numbers n, there exist B n >= B, there exist a natural number k, k^n > 10n where b is the breakpoint.

(5) Figure out if what type of induction you want to use. For simple inequalities like this one, simple induction seems to be the best way.

Third: Carrying out the plan

Finally we can start with our proof!

(1) As always for simple induction, it follows the usual proof structure of Base Case and Inductive Step.

(i) Base Case - find the base value n that satisfies the inequality. In the example above, your base case, n would be 6 (the breakpoint). Then P(6) is true.

(ii) Inductive Step- This is the “meat” of the code. Again following the same structure, we assume n is a natural number for n>=B, P(n). This is our Induction Hypothesis so we are assuming that k^n > 10n

Then we want to show P(n) implies P(n+1). So start with one side of the inequality. To be consistent, lets start with the left side.

So we get k^(n+1). Now when doing this, we have to break up the k^(n+1) so it we will be able to see k^n which is where we can apply the Induction Hypothesis.

Note:

(1) If k=2, then 2^(n+1) = 2^n + 2^n

(2) If k=3, then 3^(n+1) = 3^n + 3^n + 3^n

(3) In general, if k={k_1, k_2, …, k_k} where each value in the set is equal to k and there are k numbers of the values. Then k^(n+1) = k_1^n + k_2^n + … + k_k^n

So k^(n+1) = (k_1)^n + (k_2)^n + … + (k_k)^n

Now we can use our Induction hypothesis of the fact that k^n > 10n. Then

(k_1)^n + (k_2)^n + … + (k_k)^n

> 10n + 10n + … + 10n

> 10n + 10n

= 10(n+1)

We’re done! We’ve show that k^(n+1) > 10(n+1). So now we can say that P(n+1) is true and P(n) implies P(n+1)

Fourth: Wrapping up


Since P(n) implies P(n+1) relates only to the inductive step, we must conclude our statement in a general statement. So conclude that for all natural numbers n, there is exist B, n>=B, P(n)

Tuesday, November 11, 2008

Test 2

I was supposed to write about the test over the weekend but didn't have the chance to. Well another test down and one more to go (and by looking at the schedule....its on the very last day of school). I say this test went much better than Test 1. I didn't have any time management problems, I understood and read the questions correctly (unlike from the last test, the Fibonacci question). For the first question I got was the time complexity question. I basically knew how to do it since it was very similar to the question in the assignment. So I think by knowing that easily, it gave me more time to focus on the other two questions.

The second question was the precondition/postcondition proof and I was comfortable enough that I think i answered it correctly but something on the back of my mind says the proof isn't as "solid" as it should be. I was discussing with one of my friends about one of the cases where the the length of the string was greater than one (meaning it has to call the recursive function) where he had done sub cases for if there are no vowels found, and if the vowel was found. For me, I didn't take that approach and kind of merged the both sub cases together. I think i would have been better to do sub cases but I hope that it really only matters of how you explain yourself.

Finally, the last question was the loop invariant proof and I think this is where I will loose my marks. I think my understanding of loop invariant wasn't as clear to me as I thought after discussing with a couple of friends after the test. I think I might have to stop by Danny's office hour so I can get some help with that particular part. I really don't know how they will mark the question you don't get the initial conjecture or in this case the loop invariant since that is the main part of proving your point.

Overall, I think i did fine, its now just a matter of waiting...but at least we get our marks very quickly unlike my csc209 course (for those who are taking it this semester) that we haven't gotten our A1 mark that was due over a month ago as well our A2.

Tuesday, October 28, 2008

Assignment 2

Well yet another assignment done and one more to go! I must say that this assignment was a bigger challenge than the first (as expected). I am glad that question one from assignment 1 was pushed back. My partner and I had figured out the number of nodes a while back (in the time of assignment 1 before Danny announced the extension) but we had struggled for a while on how to prove it. I think the first challenge for most after looking at the responses on the bulletin board, the struggle people had to get a formula that is suitable for the nodes. But the tough part of this question would definitely be the proving. When I had a formula, I wasn't sure how I can put into words in a proof so I decided to do something I was a bit more comfortable for me. I wrote a python code for a ternary tree of n nodes. It made me visualize and assess the problem much easier. For any n, I was able to trace my code going through it recursively and making sure that it made sense for proving purposes. We used the proving of the "precondition implies postcondition" method that we learnt recently in class. In doing this, it gave us a better understanding of how to prove preconditions. Although we had a good format down since we had done writing out question 4 first.

For question 2, we had thought we got the proof easy but then we had a discussion about what the constant should be for the time complexity. We were looking at the course slides for week 4 (I think) for the recursive functions on time complexity. On the day slides, the time complexity had 4 + T(ceiling(n/2)) but then for the exact same question in the evening lecture slide they had 3 + T(ceiling(n/2)). This made my partner and I confused. Until we realized that the constant didn't really matter after reading a post done on the bulletin board. The only condition was the constant be not zero because the algorithm would fail. Other than that part, we found the rest of the problem easy.

Now as for question 3, We basically wrote a bunch of scratch work until we saw an algorithm but could'nt figure it out until we unwound it and found the connection. Then it was finally easy to put into words (well sort of).

I can't complain about the assignment, it was tough but with enough time, you can pull it off with no problems.

Friday, October 10, 2008

Test Aftermath

Well just had my first term test for csc236 today and went fairly decent. Some problems arised trying to figure out the Fibonnaci question which led to poor time management for the rest of the questions. I understood the question, but slightly had my understanding of the question a tad off. It had said "...greater than 5"! For some odd reason, I was thinking greater or equal to 5 which then completely made me loose track of where I was in the proof (even though it was a small fix, but sometimes its the small things that cause the most problems). I wish I could have finished the test, as I knew the answer to the last question I wrote but due to my lack of time management, I paid for it. Good thing that this will probably go towards the lower breakdown of the course and also good thing that i have my A1 mark to back up any faults I may have had during this test.

Time management has been one of my flaws for many of the tests I've done. Its either I get too caught up in a question (like this one today) or if I don't know the answer but have to do bunch of scratch work to come to the answer, it takes up too much of my time. Although my friends have offered advice on how to deal with this, in what way do you find to manage your time better? If you have 10 minutes left and you have two questions (but each take 10 minutes at least on it), do you dedicate your time on one of the two or split your time in half for both even though you know you can't finish?

Monday, September 29, 2008

Assignment 1...good or bad?

Well finally I've finished the assignment and glad to have that out of my hands for a while. After looking back at the A1 (even though we just finsihed it), my partner and I both felt the menu question was probably the toughest to write about but easy to come up with an algorithm. One reason being since I had created an algorithm for binary numbers for a wrap-around in summer 2007 CSC165. We just use that to our advantage and was finally able to put it down on paper as a proof. As for question 3 with the golden ratio, I think that the comphension of the golden ratio was a bit...tough (for most). Overall, it was one of those Danny Heap signature assignments...start off simple and as the questions progress, more thinking is involved which is probably one of the most effective ways of learning. Learn the basics and apply those basics onto tougher problems. Even though this assignment is done that question #4 will soon haunt me. I have come up with "something" but the only issue I know i will run into is how to prove my formula. Yes it works but putting that onto paper will raise an issue for me so I am happy that this question has been delayed for it will give us a lot more time to put the missing pieces together. But in the meantime, will have to focus on all these other assignments that are due soon like 209 to 336.

Saturday, September 13, 2008

Moving on

From CSC165 to CSC236...

It has only been one week since the next semester started. I'm ready to take on CSC236 although to be honest, I feel that my knowledge from CSC165 is in the back of my brain on "vacation" so I'm not getting the first few classes so quickly as I used to. But I'm guessing its because I took CSC165 a while ago, its taking some time to process back.

So first week of class, I can't really complain about anything right now. We reviewed induction mainly this week with the structure being a lot looser in comparison to CSC165 so it gives us a bit more freedom in explaining and proving a particular statement.

Thats about it...for now