Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
409 views
in Technique[技术] by (71.8m points)

selection - Understanding "median of medians" algorithm

I want to understand "median of medians" algorithm on the following example:

We have 45 distinct numbers divided into 9 group with 5 elements each.

48 43 38 33 28 23 18 13 8

49 44 39 34 29 24 19 14 9 

50 45 40 35 30 25 20 15 10

51 46 41 36 31 26 21 16 53

52 47 42 37 32 27 22 17 54
  1. The first step is sorting every group (in this case they are already sorted)
  2. Second step recursively, ?nd the "true" median of the medians (50 45 40 35 30 25 20 15 10) i.e. the set will be divided into 2 groups:

    50 25
    
    45 20 
    
    40 15
    
    35 10
    
    30
    

    sorting these 2 groups

    30 10
    
    35 15 
    
    40 20
    
    45 25
    
    50
    

the medians is 40 and 15 (in case the numbers are even we took left median) so the returned value is 15 however "true" median of medians (50 45 40 35 30 25 20 15 10) is 30, moreover there are 5 elements less then 15 which are much less than 30% of 45 which are mentioned in wikipedia

and so T(n) <= T(n/5) + T(7n/10) + O(n) fails.

By the way in the Wikipedia example, I get result of recursion as 36. However, the true median is 47.

So, I think in some cases this recursion may not return true median of medians. I want to understand where is my mistake.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The problem is in the step where you say to find the true median of the medians. In your example, you had these medians:

50 45 40 35 30 25 20 15 10

The true median of this data set is 30, not 15. You don't find this median by splitting the groups into blocks of five and taking the median of those medians, but instead by recursively calling the selection algorithm on this smaller group. The error in your logic is assuming that median of this group is found by splitting the above sequence into two blocks

50 45 40 35 30

and

25 20 15 10

then finding the median of each block. Instead, the median-of-medians algorithm will recursively call itself on the complete data set 50 45 40 35 30 25 20 15 10. Internally, this will split the group into blocks of five and sort them, etc., but it does so to determine the partition point for the partitioning step, and it's in this partitioning step that the recursive call will find the true median of the medians, which in this case will be 30. If you use 30 as the median as the partitioning step in the original algorithm, you do indeed get a very good split as required.

Hope this helps!


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...