
hadoop - How to keep YARN's log files?

Suddenly, my YARN cluster has stopped working: everything I submit fails with "Exit code 1". I want to track down the problem, but as soon as an application fails, YARN deletes its log files. What configuration setting do I have to adjust to make YARN keep these log files?


1 Reply


It seems your container is exiting with exit code 1.

You are unable to see the logs in the UI because, by default, log aggregation is disabled. The parameter that controls it is "yarn.log-aggregation-enable" (it is "false" when log aggregation is disabled).

If this is set to "false", the node managers store container logs in a local directory, determined by the configuration parameter "yarn.nodemanager.log-dirs".

For example, in my case this is set to:

  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>e:\hdpdata\hadoop\logs</value>
  </property>

So, all my container logs for a particular application will be found under "e:\hdpdata\hadoop\logs\{application-id}\{container-id}" on the Node Manager machine where that container ran.

Let's assume that my application "application_1443377528298_0010" FAILED. In the YARN RM's web UI (its address is determined by the config parameter yarn.resourcemanager.webapp.address), you can find out the node on which the Application Master ran. In my case, the Application Master ran on the machine "120243" (shown in a screenshot of the RM UI, not reproduced here).

If you log in to this machine and look in the folder "e:\hdpdata\hadoop\logs\application_1443377528298_0010", you can see the logs for all the containers of application "application_1443377528298_0010".
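
A side note on these local files: even with aggregation disabled, the Node Manager cleans them up after a while, which may be why your logs keep disappearing. The relevant parameter is "yarn.nodemanager.log.retain-seconds" (default 10800 seconds, i.e. 3 hours); there is also "yarn.nodemanager.delete.debug-delay-sec", which delays the cleanup of an application's local directories after it finishes. A minimal sketch for yarn-site.xml, if you want to keep local logs around for a day while debugging (the values 86400 and 600 are just example values, not from my setup):

  <property>
    <!-- Keep local container logs for 24 hours; only applies when log aggregation is disabled -->
    <name>yarn.nodemanager.log.retain-seconds</name>
    <value>86400</value>
  </property>
  <property>
    <!-- Wait 10 minutes after an application finishes before deleting its local dirs and logs -->
    <name>yarn.nodemanager.delete.debug-delay-sec</name>
    <value>600</value>
  </property>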

But if you want to see the logs through the YARN RM web UI, you need to enable log aggregation. For that, set the following parameters in yarn-site.xml:

  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/app-logs</value>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
    <value>logs</value>
  </property>

With the above settings, my logs are aggregated in HDFS under "/app-logs/{username}/logs/". Under this folder, you can find the logs for all the applications run so far. Again, how long these aggregated logs are retained is determined by the configuration parameter "yarn.log-aggregation.retain-seconds".
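
If you want to control that retention explicitly, here is a minimal sketch for yarn-site.xml (604800 seconds, i.e. 7 days, is just an example value; the default of -1 keeps the aggregated logs indefinitely):

  <property>
    <!-- How long to keep aggregated logs in HDFS; -1 (the default) never deletes them -->
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>604800</value>
  </property>

Once aggregation is enabled, you can also fetch the logs from the command line with "yarn logs -applicationId application_1443377528298_0010", which reads them straight from the aggregated location in HDFS.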

While a MapReduce application is running, you can access its logs from YARN's web UI. Once the application has completed, the logs are served through the Job History Server.

In your case, if you want to see the logs in the web UI after the application has terminated, you also need to run the MapReduce Job History Server. To enable it, set the following configuration parameters in mapred-site.xml:

  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>{job-history-hostname}:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>{job-history-hostname}:19888</value>
  </property>

And set the following configuration parameter in yarn-site.xml:

  <property>
    <name>yarn.log.server.url</name>
    <value>http://{job-history-hostname}:19888/jobhistory/logs</value>
  </property>
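
One more practical note: after editing yarn-site.xml and mapred-site.xml, restart the ResourceManager and the Node Managers so the new settings take effect, and make sure a Job History Server process is actually running (on a typical Linux Hadoop 2.x install it is started with "sbin/mr-jobhistory-daemon.sh start historyserver"; on the Windows HDP install it usually runs as a Windows service).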

I have replicated these settings from an HDP installation on Windows, and they work for me; they should work for you as well. For a description of each of the configuration parameters mentioned above, refer to the links below:

https://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-common/yarn-default.xml

https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml

