Cs Sorting Analysis Essay, Research Paper
For this lab I tested ten different sorting methods by comparing the time it took to sort ordered, reverse ordered, and random ordered arrays of Integer objects of different sizes. I had to write a Timer class that started the timer before the sorting method was called and stopped the timer when the method was finished sorting the array. The elapsed time was calculated by another method in the Timer class so that the actual run time could be printed once the method completed. My test program contained methods to run all of the sorting methods given the size, type of array (ordered, reverse or random), and the number times to run the sorting method (all entered by the user) before the timer was stopped.
For my timing experiment I carried out tests on an array of size 1,000 run 100 times through the method before printing the time (since it would be relatively fast for one iteration), an array of size 10,000 run 10 times, an array of size 25,000 run 1 time, and an array of 50,000 run 1 time. For the arrays of 1,000, 10,000, and 25,000 I did five trials for ordered, reverse, and random arrays to get a good average of the time it took since each run varied slightly. Since the array of 50,000 took a long time to run I conducted only two trials for each size on the three different types of arrays and found the average. The run times are included at the end of this report. I then calculated the time it would take to do one iteration through each of the different sizes for the different types of arrays (i.e I divided the average time it took to run the ordered array of 1,000 100 times by 100 to get the average time it would take to run it once). This data is also available at the end of this report.
The outcome of the experiments proved to be very similar to what was expected. As seen in the many graphs provided one can see that there may or may not be a best case or worse case for each of the sorts. Below are brief descriptions of the results for each sort on average
Selection Sort: As seen the first graph, Sort Comparisons, selection sort does not really have a best-case scenario. All three types of arrays took approximately the same amount of time. This is what was expected from the theoretical results.
Insertion Sort: On the contrary, it can be seen that insertion sort does indeed have a best case and a worse case. The worse case (a reverse ordered list) ends up being roughly like selection sort but when given an already sorted list the method does not take hardly any time at all to run. Given a random array for insertion sort ended up being about the average of the two. This sort also performed as the theoretical results predicted.
All of the methods tested between insertion sort and bubble sort (simple quick sort, improved quick sort, merge sort recursive, heap sort, shell sort, and non recursive merge sort) all were approximately the same for all cases, which as seen in the graph is significantly less time than selection or insertion sort.
Simple Quick Sort: When given either the reverse or ordered array the middle pivot was always the middle most value so the times were very much alike. When given a random array it is not likely that the middle element will be the exact middle value of all those in the array so the time to complete the sort takes a little longer. This was true for all of the different sizes of arrays tested.
Improved Quick Sort: Improved quick sort was fastest for the ordered list by a small amount for each of the sizes. Reverse and random were roughly the same. Overall the times for improved quick sort were slightly faster than simple quick sort, especially noticeably with the random list since there was a better chance to find a pivot closer to the middle valued element. Both quick sorts performed as thought.
Merge Sort: Since merge sort divides and conquers regardless of the order of the array (it makes all comparisons anyway even if ordered) all three tests were about the same. The theoretical results predicted that each would be the same so the experimental results do match.
Heap Sort: Surprisingly heap sort, which should be equal to the quick sorts and merge sort was about the same if not a little slower in all cases. This is not quite as predicted by the theoretical results. The difference is probably due to having to build the heap each time which can be O(n log n) or O(n).
Shell Sort: Shell sort was significantly faster than insertion sort even though the worse case is still O(n2). All cases were approximately as fast as the O(n log n) methods but started to deviate more as the size increased. This is consistent with predicted results.
Merge Sort Non Recursive: This sort performed about at the same speed as the other middle methods . The tests on random, reverse and ordered arrays were all about the same and very comparable to the recursive merge sort.
Bubble Sort: Of all the sorts bubble sort was definitely the worst. Its best case was on the reverse ordered list, worse case was the ordered list and average was the random list. Due to all the comparisons it has to do it is no surprise that is runs the slowest.
Radix Sort: Radix sort ran a lot slower than expected. This is probably due to having to build the heaps every time and by using a linked queue instead of an array-based queue. All of the sizes took about an equal amount of time since each element was composed of a maximum of 4 digits.
In my opinion the best sort depends on what kind of list you are wanting to sort. If given an almost ordered list it is clear from the experiments that insertion sort would be the best bet because it would do the least amount of work. If given an almost reverse ordered list simple quick sort would be the best. In all cases it ran slightly faster than improved quick sort on reverse ordered lists. If given a totally random list improved quick sort instead of simple quick sort since it gives the best chance of finding the element that will be closest to the middle.
Since there is no significant difference between ordered and reverse ordered lists for the two quick sorts and improved quick sort runs faster on random lists I would say the best sort of the ten tested would be improved quick sort. Though there are sorts that run faster on ordered lists improve quick sort runs the best across the board and it is not slow on lists even of size 50,000. It is an in place method so it does not use any significant extra space like merge sort does.