Monday, February 20, 2012

Retired Job

An explanation of retired jobs

Job Retirement Policy (pre-H21 only!)

Once a job is complete it is kept in memory (up to mapred.jobtracker.completeuserjobs.maximum) and on disk as per the above. There is a configuration value that controls the overall retirement policy of completed jobs:
Key: mapred.jobtracker.retirejob.interval
Default: 24 * 60 * 60 * 1000 (1 day, in milliseconds). Note: completed jobs are automatically cleared (retired) after one day.
In other words, completed jobs are retired after one day by default. The check for jobs to be retired is done by default every minute and can be controlled with:
Key: mapred.jobtracker.retirejob.check
Default: 60 * 1000 (60 s, in milliseconds). Note: this is how often the retirement check runs, not how long a job is kept.
The check runs continually while the JobTracker is running. When a job is retired, it is simply removed from the JobTracker's in-memory list (along with all of the job's Tasks, etc.). A job is never retired until at least 1 minute after its finish time (this minimum is hardcoded in JobTracker.java). The retire call also removes the JobTracker Local (see above) file for the job. All that is left are the two files per retired job in the history directory (hadoop.job.history.location) plus, if enabled, the Per Job files (hadoop.job.history.user.location).
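
To make the interaction between the retirement interval and the hardcoded one-minute minimum concrete, here is a minimal Python sketch of the rule as described above. It is not the actual JobTracker.java code: the names CompletedJob, is_retirable, retire_pass and MIN_TIME_BEFORE_RETIRE_MS are hypothetical, and the use of strict comparisons is an assumption. It also ignores the separate mapred.jobtracker.completeuserjobs.maximum limit.

```python
from dataclasses import dataclass

# Defaults quoted in the text above (all values in milliseconds).
RETIRE_JOB_INTERVAL_MS = 24 * 60 * 60 * 1000  # mapred.jobtracker.retirejob.interval
RETIRE_JOB_CHECK_MS = 60 * 1000               # mapred.jobtracker.retirejob.check
# Hypothetical name for the hardcoded one-minute minimum mentioned in the text.
MIN_TIME_BEFORE_RETIRE_MS = 60 * 1000


@dataclass
class CompletedJob:
    """Illustrative stand-in for a finished job held in JobTracker memory."""
    job_id: str
    finish_time_ms: int


def is_retirable(job, now_ms, retire_interval_ms=RETIRE_JOB_INTERVAL_MS):
    """Return True if the job is older than both the interval and the minimum."""
    age_ms = now_ms - job.finish_time_ms
    # Assumption: both comparisons are strict.
    return age_ms > MIN_TIME_BEFORE_RETIRE_MS and age_ms > retire_interval_ms


def retire_pass(jobs, now_ms, retire_interval_ms=RETIRE_JOB_INTERVAL_MS):
    """One pass of the periodic check: split jobs into (kept, retired) lists."""
    kept, retired = [], []
    for job in jobs:
        if is_retirable(job, now_ms, retire_interval_ms):
            retired.append(job)
        else:
            kept.append(job)
    return kept, retired


if __name__ == "__main__":
    now = 2 * RETIRE_JOB_INTERVAL_MS
    jobs = [
        CompletedJob("job_old", finish_time_ms=0),                # two days old
        CompletedJob("job_recent", finish_time_ms=now - 30_000),  # 30 seconds old
    ]
    kept, retired = retire_pass(jobs, now)
    print("kept:", [j.job_id for j in kept])
    print("retired:", [j.job_id for j in retired])
```

In this sketch, setting the interval to 0 still keeps a job for at least one minute after it finishes, which mirrors the hardcoded minimum described above. Because the check itself only runs once per check period, the actual retirement could lag the threshold by up to that period.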

[Reference]
http://www.cloudera.com/blog/2010/11/hadoop-log-location-and-retention/
