mapreduce - What are SUCCESS and part-r-00000 files in hadoop

Question


1 Reply

深蓝 · Answer 1 · 2021-10-23T18:29:27+0000

See http://www.cloudera.com/blog/2010/08/what%E2%80%99s-new-in-apache-hadoop-0-21/

On the successful completion of a job, the MapReduce runtime creates a _SUCCESS file in the output directory. This may be useful for applications that need to see if a result set is complete just by inspecting HDFS. (MAPREDUCE-947)

This would typically be used by job scheduling systems (such as Oozie) to signal that follow-on processing of this directory's contents can begin, since all the data has been written.
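As a rough illustration of how a downstream step might use the marker, here is a minimal sketch. It assumes the job's output directory is reachable as an ordinary local path (for example, copied down or mounted); the function name `job_output_is_complete` is made up for this example and is not part of Hadoop.

```python
from pathlib import Path


def job_output_is_complete(output_dir):
    """Return True if the _SUCCESS marker file exists in output_dir.

    Assumption: output_dir is a local (or locally mounted) path,
    not an hdfs:// URI.
    """
    return (Path(output_dir) / "_SUCCESS").is_file()
```

When the output lives on HDFS itself, the same check can be done from a shell with `hdfs dfs -test -e <output dir>/_SUCCESS`, which exits with status 0 if the file exists.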

Update (in response to comment)

By default, the output files are named part-x-yyyyy, where:

x is 'm' if the file was written by a map task (a map-only job), or 'r' if it was written by a reduce task
yyyyy is the zero-based mapper or reducer task number, zero-padded to five digits

So a job that has 32 reducers will have files named part-r-00000 to part-r-00031, one for each reducer task.
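The naming scheme can be reproduced in a few lines. This is only a sketch of the default pattern described above; `part_file_name` is a hypothetical helper, not a Hadoop API, and jobs that configure a different output base name will not match it.

```python
def part_file_name(task_type, task_number):
    """Build a default-style output file name such as part-r-00000.

    task_type: 'm' for a map task, 'r' for a reduce task.
    task_number: zero-based task number, padded to five digits.
    """
    if task_type not in ("m", "r"):
        raise ValueError("task_type must be 'm' or 'r'")
    return f"part-{task_type}-{task_number:05d}"


# File names expected from a job with 32 reducers.
reducer_outputs = [part_file_name("r", i) for i in range(32)]
```

Here `reducer_outputs` starts at part-r-00000 and ends at part-r-00031, matching the 32-reducer example.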
