Saturday, March 29, 2014

Week 11

This is the last topic of CSC148 --- sorting and efficiency.  The sorting methods we are looking at put a list of items in ascending order.  We've done different kinds of sorting methods in CSC108, including bubble sort, selection sort and insertion sort.  But in CSC148, we are digging deeper into sorting algorithms and learning to analyze their efficiency on different sizes of lists, with the addition of some new sorting algorithms (including quick sort, merge sort, count sort and the built-in/tim sort).  In order to do the running-time analysis, the 'big-oh', O(g), was introduced.  The 'big-oh' describes an algorithm's behaviour over a large input of size n.  I hadn't heard of the 'big-oh' before, so the idea was a little vague to me at first.  But after Professor Heap explained it, it became much clearer.


There are two very important, new sorting algorithms introduced.  One is quick sort, the other is merge sort.

1) Quick sort picks a pivot and moves values greater than the pivot to its right and values less than the pivot to its left.  After doing that, the pivot is in its correct final position.  It then recursively quick sorts the sublists on either side of the pivot.
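Here's roughly how I'd sketch quick sort in Python (a minimal version that uses the first element as the pivot, not necessarily the exact code from lecture):

```python
def quick_sort(L):
    """Return a new sorted version of list L."""
    if len(L) < 2:
        return L[:]  # base case: empty or one-element lists are already sorted
    pivot = L[0]
    smaller = [x for x in L[1:] if x < pivot]
    larger = [x for x in L[1:] if x >= pivot]
    # the pivot lands in its final position between the two sorted sublists
    return quick_sort(smaller) + [pivot] + quick_sort(larger)
```

Building new lists like this is simpler to read than the in-place partition, at the cost of extra memory.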


2) Merge sort splits the list into halves recursively, sorts each half, and merges the sorted halves back together.
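A minimal sketch of that split-then-merge idea (again my own practice version, not necessarily the lecture code):

```python
def merge(left, right):
    """Merge two already-sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # one of the slices below is empty; the other holds the leftovers
    return result + left[i:] + right[j:]

def merge_sort(L):
    """Return a new sorted version of list L."""
    if len(L) < 2:
        return L[:]  # base case
    mid = len(L) // 2
    return merge(merge_sort(L[:mid]), merge_sort(L[mid:]))
```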


In the average case, both sorting algorithms are O(nlogn).  This is more efficient than the sorting methods we learnt before.  To fully understand and get familiar with these sorting algorithms, I practiced writing their code on my own.  That also helped me understand their 'big-oh's.  For quick sort, the recursion goes about logn levels deep in the average case, and for merge sort, the list is divided in half logn times.  Each level does O(n) work partitioning or merging.  Therefore, the overall complexity is O(nlogn).


However, this is not always the case.  For example, in the worst case of quick sort, the complexity is O(n^2).  So we need to analyze the best, average and worst cases of sorting algorithms separately.  In the last lab, we tested three situations for each sort: 1) a random list  2) a sorted list  3) a sorted-then-reversed list.  After we plotted the data, the graph gave us a very clear picture of how the different sorting algorithms perform.  Moreover, the performance also depends on how the code is written.  I discussed a lot with my partner to figure out why different implementations result in better or worse efficiency.
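The lab's timing setup looked something like this (a sketch from memory; `time_sort` is a name I'm making up here, and `sorted` stands in for whichever sort is being measured):

```python
import random
import time

def time_sort(sort, lst):
    """Return how many seconds `sort` takes on a copy of `lst`."""
    copy = lst[:]  # copy so repeated timings start from the same input
    start = time.perf_counter()
    sort(copy)
    return time.perf_counter() - start

n = 1000
cases = {
    'random': [random.randint(0, n) for _ in range(n)],
    'sorted': list(range(n)),
    'reversed': list(range(n, 0, -1)),
}
for name, lst in cases.items():
    print(name, time_sort(sorted, lst))
```

Running this over growing values of n and plotting the times is what made the O(n^2) vs O(nlogn) difference visible.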


In addition, we looked at count sort, which sorts integers from a small known range.  And there is the very efficient built-in sort (i.e. timsort).  By testing it in the lab, we found it much faster than any of the other sorting algorithms.  But it also seems more complex compared to the other ones.  Even though I studied the above sorting algorithms, it was still quite challenging for me to analyze the ones in the test.  I still have to practice 'big-oh' analysis more.
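Count sort is short enough to sketch here (my own version; it assumes the values are non-negative integers up to some known maximum):

```python
def count_sort(L, max_value):
    """Sort a list of ints in range(max_value + 1) by counting
    occurrences -- no comparisons, roughly O(n + max_value) work."""
    counts = [0] * (max_value + 1)
    for x in L:
        counts[x] += 1  # tally how many times each value appears
    result = []
    for value, count in enumerate(counts):
        result.extend([value] * count)  # emit each value count times
    return result
```

It avoids comparisons entirely, which is why it can beat O(nlogn), but only when the range of values is small.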


Sunday, March 23, 2014

Week 10

This week we finished assignment 2 part II.  We revised the first function, 'is_regex', many times since there are a lot of cases that need to be considered.  It is also tightly related to the 'build_regex_tree' function; their code is quite similar.  As each regex type has a very different form, it is difficult to handle them all together.  The hardest part, I found, was checking long regexes with parentheses, since it took a long time to figure out how to separate a BarRegex or DotRegex into two parts to check whether each part is a regex.
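The trick that eventually worked for the splitting was scanning while counting parenthesis depth.  Here is a generic sketch of that idea (`find_split` is a hypothetical helper name, not our actual assignment code):

```python
def find_split(s):
    """Given a parenthesized string like '(left|right)' or '(left.right)',
    return the index of the top-level '|' or '.', or None if there isn't one."""
    depth = 0
    for i, ch in enumerate(s):
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
        elif ch in '|.' and depth == 1:
            # depth 1 means we're inside the outermost parentheses only,
            # so this operator is the one that splits the whole regex
            return i
    return None
```

Operators buried in nested parentheses have depth greater than 1, so they get skipped, which is exactly what made long regexes manageable.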


I didn't expect to finish writing 'all_regex_permutations' so quickly.  When I first read the instructions, I thought it would be really hard.  But it turned out to be quite a short function to write.  I think it is probably because I've gotten better at recursion.


This week's lab was really tough.  I understood the instructions, and our TA helped us a lot and gave us some hints.  However, we still didn't manage to find a way to write the function.  I found it especially challenging to do recursion on the linked list node and binary tree node classes.  I need much more practice on that.

Saturday, March 15, 2014

Week 9

The topic we are working on this week is the binary search tree.  Basically, a binary search tree arranges its nodes in a very orderly way: nodes greater than the root value are in the right subtree and nodes smaller than the root value are in the left subtree.  Based on this special property, methods such as finding a node are much more efficient to implement.  If the item is bigger than the root value, it is certain to be in the right subtree; if the item is smaller, it is certain to be in the left subtree, and so on.
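That search idea in code, as I'd sketch it (using a minimal BTNode class of my own, not necessarily the course's exact one):

```python
class BTNode:
    """A minimal binary tree node, just for illustration."""
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def find(node, item):
    """Return True iff item is in the BST rooted at node."""
    if node is None:
        return False  # fell off the tree: item isn't here
    if item < node.value:
        return find(node.left, item)   # smaller values live on the left
    if item > node.value:
        return find(node.right, item)  # larger values live on the right
    return True  # item == node.value
```

Each comparison discards a whole subtree, which is where the efficiency comes from.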


In the insertion and deletion functions, we need to assign a new variable to the original BTNode.  After some tracing, I realized that the new variable is an alias of the original node, so when we change the original node's left or right subtree, the change shows through the new variable as well.  That's why we can return the new variable at the end.
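The return-the-node pattern looks something like this (my own sketch of insertion, assuming a minimal BTNode class; deletion follows the same shape but with more cases):

```python
class BTNode:
    """A minimal binary tree node, just for illustration."""
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def insert(node, item):
    """Insert item into the BST rooted at node, returning the
    (possibly new) root of that subtree."""
    if node is None:
        return BTNode(item)  # empty spot: the new node IS the subtree now
    if item < node.value:
        node.left = insert(node.left, item)
    elif item > node.value:
        node.right = insert(node.right, item)
    return node  # same node object, with its subtree possibly updated
```

Because each call returns the subtree's root, the caller can reattach it with `node.left = insert(node.left, item)`, and the alias means mutations show up everywhere the node is referenced.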


Based on this property, functions can be written recursively by recursing on either the left subtree or the right subtree.  In the lab, we practiced different ways of writing the same function.


We also did Exercise 3 this week.  It is the hardest exercise I've ever done.  The sum_to_deepest function took me quite a long time.  I didn't think of using recursion that decreases an index on each recursive call at first.  The make_tree function was not so hard to figure out at first, but the auto-marking reported minor mistakes several times.  There were some cases that I didn't consider.  Also, even though I checked the docstrings using pep8, there was still one more error reported.  Luckily, the due date was extended so I could fix those small problems.

Saturday, March 8, 2014

Week 8

We are dealing with 'LinkedList' now.  It is quite different from the other classes we've learnt before.  It links a LinkedList object to another LinkedList object, one after another.  The class itself is harder to understand, and so are its functions.  Adding or deleting an item from the LinkedList isn't so hard to figure out because it simply involves assigning the attributes to new values.  But I found the 'reverse' function very confusing.  I couldn't even manage to trace it.  It gets more and more confusing with recursion and things like 'self.nxt' or 'self.nxt.nxt'.  Also, there are two different ways of building a LinkedList.  One is that an item points to another LinkedList containing all the remaining items.  The other is that an item points to a single next node (and another and another...), which is why we need a 'node' class.  Overall, this class seems a lot more complicated than the ones we did before.
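To untangle 'reverse' for myself, I found it helps to write the iterative version first (a sketch with a minimal node-style class; the recursive version from class juggles 'self.nxt.nxt' to achieve the same relinking):

```python
class LinkedList:
    """A minimal node-style linked list: an item plus a reference
    to the rest of the list (or None at the end)."""
    def __init__(self, item, nxt=None):
        self.item, self.nxt = item, nxt

def reverse(front):
    """Return the front node of the reversed list."""
    prev = None
    while front is not None:
        # flip this node's pointer backwards, then step forward
        front.nxt, prev, front = prev, front, front.nxt
    return prev
```

Tracing this on a three-node list (drawing the arrows flipping one at a time) is what finally made the pointer shuffling click for me.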


Moreover, I found this week's lab quite challenging.  I have not fully understood how to manage linked lists yet.  The functions all require some recursion, and it is not easy for me to figure out recursion inside a LinkedList.  I had to spend more time working on this lab afterwards.

Thursday, February 27, 2014

Week 7

This week I'm going to write something about recursion.  Professor Heap mentioned this topic in the first week of class.  It sounded totally fresh to me, but after these weeks of learning and practice, recursion no longer sounds so intimidating.  In a CS context, recursion is defined as the approach of solving a problem by having 'functions call themselves to solve smaller subproblems'.  Based on my understanding, recursion involves writing the code for a function using the function itself.  It helps avoid redundant code, so that sometimes we can write a neat one-line return statement for a complicated function.


The first function we touched upon was 'sum_list', which sums up all the values in a nested list.  Without recursion, we would have to write similar code to take the sum of each sublist in the nested list.  With the use of recursion, plus the help of the ternary if and list comprehensions, we were able to write the function in one line: return sum([sum_list(x) if isinstance(x, list) else x for x in L]).  Notice that the function 'sum_list' itself appears in this return statement, so the function is executed for each nested list recursively.  In the beginning, it was a little challenging to understand recursion, so I practiced tracing the function many times to get comfortable with this approach.
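Written out as a complete function, that one-liner looks like this:

```python
def sum_list(L):
    """Sum all the numbers in a possibly nested list of numbers."""
    # recurse on sublists; plain numbers contribute themselves
    return sum([sum_list(x) if isinstance(x, list) else x for x in L])
```

The base case is hidden in the ternary if: a non-list element is used directly, so the recursion bottoms out without an explicit `if` statement.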


Then we got to see a lot of examples, such as finding the depth of a nested list and the maximum in a nested list.  We also got to practice quite a lot of problems in the lab.  After doing recursion over and over again, I eventually figured out my own efficient way of writing recursive code.  First, I carefully read the description of the function (what it is trying to do).  Secondly, I list out the different situations and organize them into 'if' statements.  Then I check which situations need to call the function itself.  Finally, I re-organize everything into a simple return statement.


I think the hardest part is usually not figuring out where to use recursion, but putting all the conditions together and making the code concise.  For example, the code for the nested depth of a list was return (1 + max([nested_depth(x) for x in L] + [0]) if isinstance(L, list) else 0).  It is very easy to mess up the '[]' and '()', and where to add the 1.  Also, the '+ [0]' might easily be missed because it only serves the special case of an empty list.  Here, the case of a non-list is called the base case, where no recursion is executed.
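As a complete function, with the tricky pieces commented:

```python
def nested_depth(L):
    """Depth of nesting: 0 for a non-list, 1 for a flat list, and so on."""
    # '+ [0]' keeps max() from failing on an empty list, and makes [] depth 1
    return (1 + max([nested_depth(x) for x in L] + [0])
            if isinstance(L, list) else 0)
```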


Moreover, we practiced recursion in Assignment 1.  In tour.py, recursion was vital for recording the series of moves.  It was such good practice for me since it integrated different approaches.

Friday, February 14, 2014

Week 6

We finished Assignment 1 this week.  Finally!!!  It was a great practice on recursion and it was actually a lot of fun working on this assignment.
So this week we started a new class of object that involves a great amount of recursion --- the binary tree.  We first looked at three different traversals --- pre-order, in-order and post-order.  All three traversals use recursion, and it is actually quite straightforward to write them.  The other functions are a little more complicated, including 'height', 'count', 'leaf_count' and 'arity'.  List comprehensions were very useful in writing these functions, along with built-in functions like 'sum' and 'max'.  I found that it is very important to consider the base case in recursive code.  For example, in the 'count' function, we need to have a '+ 1' at the beginning so that a node without children still counts.
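Two of those functions, sketched on a minimal binary tree node class of my own (not necessarily the course's class):

```python
class BTNode:
    """A minimal binary tree node, just for illustration."""
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def inorder(node):
    """In-order traversal: left subtree, this node, right subtree."""
    if node is None:
        return []  # base case: an empty tree contributes nothing
    return inorder(node.left) + [node.value] + inorder(node.right)

def count(node):
    """Number of nodes in the tree rooted at node."""
    if node is None:
        return 0
    # the leading 1 counts this node itself, even if it has no children
    return 1 + count(node.left) + count(node.right)
```

Pre-order and post-order only differ in where `[node.value]` goes in the concatenation.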


The lab this week was on converting 'ternary ifs' into regular if statements.  In general, the tasks were easier than in any of the previous labs; we finished quite early.  Moreover, we got to practice with 'any' and 'all'.  These two are very efficient when combined with list comprehensions.  They check each condition in the comprehension -- 'any' returns True if at least one is true; 'all' returns True only if all are true.  It would take a lot more work to check the conditions one by one.
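A quick example of the 'any'/'all' pattern with a comprehension:

```python
L = [2, 4, 6, 7]
has_odd = any(x % 2 == 1 for x in L)   # True: 7 is odd
all_even = all(x % 2 == 0 for x in L)  # False: 7 is odd
print(has_odd, all_even)
```

One line each, instead of a loop with a flag variable for every check.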


Sunday, February 9, 2014

Week 5

The midterms are coming.  I just started working on Assignment 1 this week.  It was really hard to understand the assignment at first, so I spent time discussing it with my partner to make things clear.  Then it didn't take us long to finish the first four steps.  It is really cool to see the console controller working.  I played the game over and over again to double check and have some fun with it.  However, I'm stuck on Step 5 now because the recursion involved is kind of complicated.  Professor Heap gave us a quite clear explanation of the recursion, though.  It is still challenging to imagine the simple recursive code working on this complicated Tour of Hanoi model.  Compared to the easier recursive code we've dealt with in the lab, this recursion assignment is much harder to trace.  Besides the recursion, the most difficult part turns out to be how to deal with the "i" involved.  We are still in the process of figuring it out.
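The assignment's version is more involved than this, but the classic three-stool recursion it builds on can be sketched like so (my own practice version, not the assignment's code):

```python
def hanoi_moves(n, source, dest, spare):
    """Return the list of (from, to) moves to shift n discs from
    source to dest, using spare, on the classic three-stool puzzle."""
    if n == 0:
        return []  # base case: nothing to move
    return (hanoi_moves(n - 1, source, spare, dest)   # clear the top n-1 aside
            + [(source, dest)]                         # move the biggest disc
            + hanoi_moves(n - 1, spare, dest, source)) # restack the n-1 on top
```

Tracing how the roles of the three stools rotate between the two recursive calls is what helped me believe such short code actually works.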


We practiced a lot of recursion in this week's lab.  Most problems in the lab were fine, but the optional ones were extremely challenging for me.  We didn't manage to finish them in the lab, so I might need to work on them later when I have time.  This week we also talked about "scope" and unittest.  The scope part is pretty straightforward: Python looks from local -> nonlocal -> global -> built-in.  We looked into some code showing the results with different scopes.  Python gives different outputs for assignments to the same variable under different scopes.
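A small example of that lookup order (similar in spirit to what we saw in class, though this exact code is mine):

```python
x = 'global'

def outer():
    x = 'nonlocal'  # shadows the global x inside outer

    def inner():
        x = 'local'  # shadows both enclosing x's inside inner
        return x     # local wins: Python checks the local scope first

    return inner(), x  # outer still sees its own 'nonlocal' x

print(outer())  # each scope's assignment only affects that scope
print(x)        # the global x was never touched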