Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

Amazon Web Services - Hadoop on EC2 vs Elastic MapReduce

I'm trying to evaluate the differences between these two options. Here are some pros and cons I can think of:

  • Elastic MapReduce: better support from Amazon, no need to administer the cluster, more expensive (?)

  • EC2 + Hadoop: more control over your Hadoop configuration, cheaper (?)

Has anyone benchmarked the performance of EC2 + Hadoop against EMR? Is there any significant difference in cost for large cluster deployments? What other differences exist?



1 Reply


We use both approaches (EMR and EC2) at my job.

The advantages of EMR that Amar mentioned are more or less true, so if you want simplicity it may be the way to go.

But there are other considerations:

  • The Hadoop version on EMR is far behind Apache head. It is approximately 0.20.205, whereas head is at 2.x, which is essentially three versions ahead (1.0, 1.1, 2.0, ...).

hadoop@domU-12-31-39-07-B9-97:~$ ll hadoop*.jar
lrwxrwxrwx 1 hadoop hadoop 73 Feb 5 12:00 hadoop-examples-0.20.205.jar -> /home/hadoop/.versions/0.20.205/share/hadoop/hadoop-examples-0.20.205.jar
lrwxrwxrwx 1 hadoop hadoop 69 Feb 5 12:00 hadoop-test-0.20.205.jar -> /home/hadoop/.versions/0.20.205/share/hadoop/hadoop-test-0.20.205.jar
lrwxrwxrwx 1 hadoop hadoop 69 Feb 5 12:00 hadoop-core-0.20.205.jar -> /home/hadoop/.versions/0.20.205/share/hadoop/hadoop-core-0.20.205.jar
lrwxrwxrwx 1 hadoop hadoop 70 Feb 5 12:00 hadoop-tools-0.20.205.jar -> /home/hadoop/.versions/0.20.205/share/hadoop/hadoop-tools-0.20.205.jar
lrwxrwxrwx 1 hadoop hadoop 68 Feb 5 12:00 hadoop-ant-0.20.205.jar -> /home/hadoop/.versions/0.20.205/share/hadoop/hadoop-ant-0.20.205.jar

  • As a direct consequence, I had to recode and restructure my MapReduce program because contrib modules were missing from the older version running on EMR.

  • You have less opportunity to use non-MapReduce algorithms than you would with an updated version of MapReduce.

  • Running your own cluster on EC2 gives you the flexibility to mix and match versions of the Hadoop ecosystem.
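On the cost question from the original post: EMR is billed as a per-instance-hour fee on top of the underlying EC2 price, so a rough comparison is simple arithmetic. The sketch below is only an illustration. The node count, hours, and both hourly prices are hypothetical placeholders, not real AWS rates, and it ignores the administration time that a self-managed cluster costs you.

```python
def cluster_cost(nodes, hours, ec2_hourly, emr_fee_hourly=0.0):
    """Total cost in dollars for a cluster billed per instance-hour."""
    return nodes * hours * (ec2_hourly + emr_fee_hourly)


# Hypothetical figures for illustration only; look up current AWS pricing.
NODES = 100             # assumed cluster size
HOURS = 24 * 30         # assumed one month of continuous use
EC2_HOURLY = 0.50       # assumed EC2 price per instance-hour
EMR_FEE_HOURLY = 0.10   # assumed EMR surcharge per instance-hour

self_managed = cluster_cost(NODES, HOURS, EC2_HOURLY)
emr = cluster_cost(NODES, HOURS, EC2_HOURLY, EMR_FEE_HOURLY)

print(f"Hadoop on EC2: ${self_managed:,.2f}")
print(f"EMR:           ${emr:,.2f}")
print(f"EMR premium:   ${emr - self_managed:,.2f}")
```

Because the surcharge scales with instance-hours, the absolute premium grows linearly with cluster size, so it is worth plugging in your own instance types and real prices before deciding.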

