Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
682 views
in Technique[技术] by (71.8m points)

mapreduce - What is the use of grouping comparator in hadoop map reduce

I would like to know why grouping comparator is used in secondary sort of mapreduce.

According to the definitive guide example of secondary sorting

We want the sort order for keys to be by year (ascending) and then by temperature (descending):

1900 35°C
1900 34°C
1900 34°C
...
1901 36°C
1901 35°C

By setting a partitioner to partition by the year part of the key, we can guarantee that records for the same year go to the same reducer. This still isn’t enough to achieve our goal, however. A partitioner ensures only that one reducer receives all the records for a year; it doesn’t change the fact that the reducer groups by key within the partition.

Since we would have already written our own partitioner which would take care of the map output keys going to particular reducer,so why should we group it.

Thanks in advance

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

In support of the chosen answer I add:

Following on from this explanation

**Input**:

    symbol time price
    a      1    10
    a      2    20
    b      3    30

**Map output**: create composite keyvalues like so:

> symbol-time time-price
>
>**a-1**         1-10
>
>**a-2**         2-20
>
>**b-3**         3-30

The Partitioner: will route the a-1 and a-2 keys to the same reducer despite the keys being different. It will also route the b-3 to a separate reducer.

GroupComparator: once the composites keyvalue arrive at the reducer instead of the reducer getting

>(**a-1**,{1-10})
>
>(**a-2**,{2-20})

the above will happen due to the unique key values following composition.

the group comparator will ensure the reducer gets:

(a-1,{**1-10,2-20**})

The key of the grouped values will be the one which comes first in the group. This can be controlled by Key comparator.

**[[In a single reduce method call.]]**

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

57.0k users

...