數據結構lecture20sorting1_第1頁
數據結構lecture20sorting1_第2頁
數據結構lecture20sorting1_第3頁
數據結構lecture20sorting1_第4頁
數據結構lecture20sorting1_第5頁
已閱讀5頁,還剩23頁未讀 繼續免費閱讀

下載本文檔

版權說明:本文檔由用戶提供并上傳,收益歸屬內容提供方,若內容存在侵權,請進行舉報或認領

文檔簡介

1、11 Advanced Sorting ConceptsOne of the most com mon applicati ons in computer scie nee issorting, the process through whichdata are arranged according to their values. We are surrounded by data. If the data were not ordered, we would spe nd hours trying to find a sin gle piece of in formatio n. I ma

2、gi ne the difficulty of finding some one s telepho ne nu mber in a teleph one book that had no internal order.The history of sorting dates back to the roots of computing, Herman Hollerith s electrictabulating mach ine, which was used to tally the 1890 U.S. Cen sus and was one of the first modem sort

3、 ing machines. Sorting was also on the scene when general-purpose computers first came into use. According to Kn uth, There is evide nee that a sorting rout ine was the first program ever writte n for a stored program computer. Although computer scie ntists have n ot developed a major new algorithm

4、in more than 30 years (the newest algorithm in this book is heap sort, which was developed in 1964), sorting is still one of the most important concepts in computing today.11-1 GENERAL SORT CONCEPTSWe discuss six internal sorts in this chapter; in serti on sort, bubble sort, select ion sort, shell s

5、ort, heap sort, and quick sort. The first three are useful only for sorting very small lists, but they form the basis of the last three, which are all useful gen eral-purpose sorting con cepts. After discuss ing the internal sorts, we will in troduce the basic con cepts used in exter nal sorts.Note

6、sorting is one of the most com mon data-process ing applicati ons.Sorts are gen erally classified as either internal or exter nal sorts. An in ternal soft is a sort in which all of the data are held in primary memory during the sorting process. An external soft uses primary memory for the data curre

7、 ntly being sorted and sec on dary storage for any data that will not fit in primary memory. For example, a file of 20.000 records may be sorted using an array that holds only 1000 records. During the sorting process, only 1000 records are therefore in memory at any one time; the other 19.000 are ke

8、pt i n a file in sec on dary storage.Note sorting algorithms are classified as either internal or externalInternal sorting algorithms have bee n grouped into several differe nt classificati ons depe nding on their general approach to sorting. Knuth identified five different classifications: insertio

9、n, select ion, excha nging, merging, and distributi on sorts. In this text we cover the first three. Distribution sorts, although interesting, have minimal use in computers. The different sorts are show n in Figure 11-1.Sort OrderData may be sorted in either ascending sequence or descending sequenee

10、. The soft order identifies the seque nce of the sorted data, asce nding or desce ndin g. If the order of the sort is not specified, It is assumed to be asce nding- Examples of com mon data sorted In asce nding seque nce are the dictio nary arid the teleph one book. Examples of desce nding data are

11、perce ntages of games won in a sport ing event such as baseball or grade point averages for hono rstude nts.* Insertion Selection * Bubble Shell* Heap* Quick365 blue212 gren876 white212 yellow119 purple737 green212 blue443 nad57 yellaw(a) Unsorted data19rJl:-1w&5期I 2 2 2 3 4purpl green yellow blue b

12、iue red567 yellow 737 green 876 while(b) Siable son119212purple blue green yellow blue red yefilow grn whrte(c) Unstable sortSort StabilitySort stability is an attribute of a sort in dicat ing that data with equal keys main tai n their relative in put order iii the output. Stability is best see n in

13、 an exampLe. In Figure 11-2(a) the un sorted data con tain three en tries with ide ntical keys (2121. If the data are sorted with a stable sort, the order in Figure 11-2(b) is guara nteed. That Is, 212 gree n is guara nteed to be the first of the three in the output, 212 yellow is guara nteed to be

14、the sec ond, and 212 blue is guara nteed to be the last, If thenA1.5亠 logn.nlogn-n嚴nA221.252,000,0001,500,0001,000,000500,000n (x 10,000)Algorithm Performa nee (II)2Algorithm Performa neesort is not stable, records with Identical keys may occur in any order, including the stable order show n in Figu

15、re 11-2(b). Figure 11-2(c) is one example of the six differe nt seque nces that could occur in an un stable sort. Note that in this example, blue comes out first even though it was the last of the equal keys. Of the sort algorithms we will discuss in this text, the insertionSort (page 442). Select i

16、on sort (page 453), and bubblesort (page 461) are stable; the others are un stable.Sort EfficiencySort efficiency is a measure of the relative efficiency of a sort. It is usually an estimate of the number of comparisons and moves required to order an unordered list. We discuss the sort efficie ncy o

17、f each of the internal sorts we cover in this chapter. Gen erally speak ing, however, the best possible sorting algorithms are on the order of nlog2n; that is, they are 0(nlog2n) sorts.3 Three of the sorts we study are 0(n 2). The best, quicksort, is 0(nl og2 n).PassesDuring the sorting process. the

18、 data are traversed many times. Each traversal of the data is referred to as a soft pass. Depending on the algorithm. the sort pass may traverse the whole list or Just a sect ion of the list. Also, characteristic of a sort pass is the placeme nt of one or more eleme nts in a sorted list.11-2 INSERTI

19、ON SORTSInsertion sorting is one of the most com mon sort ing tech niq ues used by card players. As they pick up each card, they in sert it into the proper seque nee in their hand. (As an aside, card sort ing is an example of a sort that uses two pieces of data to sort: suit and ran k.) The con cept

20、 exte nds well into computer sort ing. In each pass of an in serti on sort, one or more pieces of data are in serted into their correct locati on in an ordered List. I n this sect ion we study two in serti on sorts, the straight in serti on sort arid the shell sort.Straight Insertion Sort3In the str

21、aight in serti on sort, the list is divided into two parts: sorted and un sorted. In each pass, the first element of the unsorted sublist is transferred to the sorted sublist by inserting it at the appropriate place. If we have a list of it eleme nts, it will take at most it - 1 passes to sort the d

22、ata. This con cept is show n in Figure 11-3. In this figure, we have placed a visual wall betwee n the sorted and un sorted porti ons of the list.Straight insertion Sort ExampleFigure 11-4 traces the in serti on sort through a list of six numbers. Sorting these data requires five sort passes. Each p

23、ass moves the wall one eleme nt to the right as an eleme nt is removed from the un sorted sublist and in serted into the sorted sublist.23 78 45 B 32 56Original listUnsortedAfter pass 1Wall0W sortedUnsortedLast8 32 56Sorted UrisortedAfter pass 28 23 45 76Attar pass 3Insertion sort algorithmThe desig

24、n of the insertion sort follows the pattern in the example. Each execution of the outer loop inserts the first eleme nt from the un sorted list i nto the sorted list. The inner loop steps through the sorted list, start ing at the high en d, look ing for the correct insertion location. The pseudocode

25、 is shown in Algorithm 11-1.Note in the straight insertion sort, the list at any moment isSorted8 23 32 45 7856After pass 4Soiled0 23 32 45 56 78Afler pass 5divided into sorted and un sorted sublists. In each pass, the first eleme nt of the un sorted sublist is in serted into the sorted sublist.Ano

26、ther point you should study is the work ings of the inner loop. To make the sort as efficie nt as possible, we start with the high end of the sorted list and work toward the begi nning of the sorted area. For the first Insertion, this approach requires aSortedmaximum of one eleme nt to be shifted. F

27、or the sec on d, it requires a maximum of two eleme nts.The result is that only a porti on of the list is exam ined in each sort phase.Shell SortThe shell sort algorithm, named after its creator, Donald L. Shell, is an improved version of the straight insertion sort. It was one of the first fast sor

28、ting algorithms.Note The shell sort is an improved version of the straight insertion sort in which diminishing partiti ons are used to sort the data.In the shell sort, a list of N eleme nts is divided in to K segments where K is known as the increment. Each segment contains N/K or more elements. Fig

29、ure 11-5 contains a graphic represe ntati on of the segme nts in a shell sort. Note that the segme nts are dispersed throughout thelist. In Figure 11-5, the in creme nt is 3; the first, fourth, seve nth, and tenth eleme nts make up segment 1: the second, fifth, and eighth elements make up segment 2;

30、 and the third, sixth, and ninth eleme nts make up segme nt 3. After each pass through the data, the data in each segme nt are ordered. Thus, if there are three segments, as we see in Figure 11-5, there are three different ordered lists. If there are two segme nts, there are two ordered lists; if th

31、ere is on ly one segme nt,then the list is sorted.Shell Sort AlgorithmEach pass through the data starts with the firstAD A1 A|田 A|3 Afl A5| AS| A|7A9aSegment 11I waJkarJ- -Scgm&n|lHE ISAd K:II(h) First Increment: K = 5 aurrsnC IA0 2 x K A0 * 3 x K|7730 | | 77 | | 腳 | | 29 | B 弗 | 30 | | 77 |M r| |77

32、 Ml器 7o| psil2 lzE zkj而-兩eleme nt in the array and progresses through the array, compari ng adjace nt eleme nts in each segment. In Figure 11-5, we begin by comparing elements 1 and 4 from the first segment, then 2 and 5 from the sec ond segme nt, and the n 3 and 6 from the third segme nt. We the n

33、compare 4 and 7 from the first segme nt, 5 and 8 from the sec ond segme nt, and so forth un til fin ally we compare elements 7 and 10. If the elements are out of sequenee, they are exchanged. Study Figure 11-5 carefully until you see each of these comparisons. Be sure to note also that the first seg

34、ment has four eleme nts, whereas the other two have only three. The nu mber of eleme nts in the segme nts varies because the list size (10) is n ot evenly divisible by the in creme nt (3).To compare adjace nt keys in a segme nt, we add the in creme nt to the curre nt in dex, as show n below.Iistcur:

35、 listcur + increAfter each pass through the data, the in creme nt is reduced un til, i n the final pass, it is 1. Although the only absolute is that the last i ncreme nt must be 1, the size of the in creme nts in flue nces the efficie ncy of the sort. We will discuss this issue separately later. The

36、 diminishing in creme nt is show n for an array of ten eleme nts and In creme nts of 5, 2, and 1 in Figure 11-6.After each pass through the data, the elements in each segment are ordered. To ensure that each segme nt is ordered at the end of the pass, whe never an excha nge is made, we drop back one

37、 in creme nt and test the adjace nt eleme nts. If they are out of seque nee, we excha nge them and drop back again. If necessary, we keep exchanging and dropping back until we find two elements are ordered. We now have all of the eleme nts of the shell sort. Its pseudocode is show n in Algorithm11-2

38、.Selecting the Increment SizeFirst, recog nize that no in creme nt size is best for all situations. The overriding considerations in the sort are to complete the sort with the minimum number of passes (increments) and to minimize the number of elements that appear in more than one segment. One metho

39、d to eliminate completely an element being in more tha n one list is to use prime nu mbers. Unfortunately, the dynamic calculation of prime nu mbers is a relatively slow process.Most texts use the simple series we proposed inI14El匸rfein 1 1 M Pi*2 rZi nu 叵叵(b) Second increment: K = 2(c) Third increm

40、ient: K = 1 pal rsFl 7o 圉 ssl rl 155| 亙k 叭30 | | 5 | | 羽 |62 |而廿陰 I I 70 I Ij6g | 艸 眄 TO | I舲| ot n 殆 sg irn ri rsn25-I 2T 國 ri rai 55 |_J LZzE(d) Sorted aryay_ lM Iw1 HI 西 raalAlgorithm 11-2, sett ing the in creme nt to half the list size and dividi ng by 2 each pass. Knuth suggests, however, that

41、you should not start with an in creme nt greater tha n on e-third of the list size. Other computer scie ntists have suggested that the in creme nts be a power of 2 minus 1 or a Fib on acci series. These variati ons may result in a slightly more efficie nt sort, but they are relatively complex. One s

42、imple variation of the dlvision-bv-2 approach would be to add 1 whe never the in creme nt is even. Doi ng so would tend to reduce the nu mber of eleme nts that appear in multiple segme nts.Although you can use more complex in creme nt-sett ing algorithms, the efficie ncy of a shell sort will never a

43、pproach that of a quick sort. Therefore, if the objective is to obtain the most efficient sort, the soluti on is to use the quick sort rather tha n trying to optimize the in creme nt size in the shell sort. On the other hand, the shell sort is a much simpler sort and at the same time is reas on ably

44、 efficie nt.Insertion Sort AlgorithmsSort algorithms determine the sort effort for a given sort. Sort effort is defined as the relative efficie ncy of a sort. It can be determ ined in several ways, but we will use the nu mber of loops in the sort. Ano ther com mon measure is the nu mber of moves and

45、 compares n eeded to sort the list. Of course, the best measure would be the time it takes to actually run the sort. Time, however, varies by the efficie ncy of the program impleme ntati on and the speed of the computer being used. For analyzing different sorts, therefore, the first two measures are

46、 more meaningful. Let s now analyze the straight insertion and shell sort algorithms to determine their relative efficiency.Straight Insertion sortReferri ng to Algorithm 11- 1,“ Straight insertion so rtwe find the follow ing two loops:loop (curre nt 0 AND hold.key listwalker.key)The outer loop exec

47、utes n-1 times, from 1 through last. For each outer loop, the Inner Loop executes from 0 to current times, depending on the relationship between hold, key and list walker. key. On the average, we can expect the inner loop to process through the data in half of the sorted list. Because the inner loop

48、 depends on the outer loop s setting for current, we have a depe ndent quadratic loop, which is mathematically stated as f(n)=n(n+1)/2) in big-O no tati on, the depe ndent quadratic loop is 0(n 2).Note The straight i nsertio n sort efficie ncy is 0( n2).Shell sortNow let s look at Algdthm 11-2,“ She

49、ll so”This algorithm contains the nested loops shownbelow.loop (incre not 0)loop (curre nt = 0 AND hold.key listwalker.key)Because we are dividi ng the in creme nt by 2 in each loop, the outer loop is logarithmic; it is executed log2 n times. The first i nner loop executes n in creme nt times for ea

50、ch of the outer loops: the first time it loops through 50% of the array (n = (n /2), the sec ond time It loops through 75% of the array (n - (n /4), and so forth un til it loops through all of the eleme nts. The total nu mber of iterations for the outer loop and the first inner loop is shown below.T

51、he inn ermost loop is the most difficult to an alyze. The first limit keeps us from falli ng off the begi nning of the array. The sec ond limit determ ines whether we have to loop at all: We loop only when the data are out of order. Sometimes the inner loop is executed zero times, sometimes it is ex

52、ecuted any where from one to incre times. If we were able to derive a formula for the third factor, the total sort effort would be the product of the three loops. The first two loops have a comb ined efficiency of O(nlog2n). However, we still need to include the third loop. We can see, therefore, th

53、at the result is someth ing greater tha n O(nl og2 n).Knuth7 tells us that the sort effort for the shell sort cannot be derived mathematically, lie estimates from his empirical studies that the average sod effort is 15n 1.25. Reducing Knuth s analysis to abig-O no tati on, we see that the shell sort

54、 is O(n 1.25).Note The shell sort efficie ncy is O(n 1.25).SummaryOur analysis indicates that the shell sort is more efficient than the straight insertion sort. Table 11-1 shows the nu mber of loops for each sort with differe nt array sizes.One important point to note in analyzing Table 11-1 is that

55、 big-O notation is only an approximation. At small array sizes, it may not accurately indicate the relative merit of one sort over another. For example, heuristic studies indicate that the straight insertion sort is more efficie nt tha n the shell sort for small lists.11-3 SELECTION SORTSSelect ion

56、sorts are among the most in tuitive of all sorts. Give n a list of data to be sorted, we simply select the smallest item and place it in a sorted list. These steps are the n repeated un til we have sorted all of the data. In this sect ion we study two select ion sorts, the straight selecti on sort a

57、nd the heap sort.Straight SelectionSortIn the straight select ion sort, the list at any mome nt is divided into two sublists, sorted and unsorted, which are divided by an imaginary wall. We select the smallest element from the unsorted sublist and exchange It with the element at the beginning of the unsorted data. After each selection aid exchange, the wall between the two sublists moves one element, increasing the nu mber of sorted eleme nts and decreas ing the nu mber of un sorted on es. Each t

溫馨提示

  • 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
  • 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯系上傳者。文件的所有權益歸上傳用戶所有。
  • 3. 本站RAR壓縮包中若帶圖紙,網頁內容里面會有圖紙預覽,若沒有圖紙預覽就沒有圖紙。
  • 4. 未經權益所有人同意不得將文件中的內容挪作商業或盈利用途。
  • 5. 人人文庫網僅提供信息存儲空間,僅對用戶上傳內容的表現方式做保護處理,對用戶上傳分享的文檔內容本身不做任何修改或編輯,并不能對任何下載內容負責。
  • 6. 下載文件中如有侵權或不適當內容,請與我們聯系,我們立即糾正。
  • 7. 本站不保證下載資源的準確性、安全性和完整性, 同時也不承擔用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。

評論

0/150

提交評論