
MapReduce job fails with error: File file job.jar does not exist

I tried to submit the job to YARN by running the main method directly from Java, but got the following error:

2018-08-26 10:25:37,544 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1375)) - Job job_1535213323614_0010 failed with state FAILED due to: Application application_1535213323614_0010 failed 2 times due to AM Container for appattempt_1535213323614_0010_000002 exited with  exitCode: -1000 due to: File file:/tmp/hadoop-yarn/staging/nasuf/.staging/job_1535213323614_0010/job.jar does not exist
.Failing this attempt.. Failing the application.

Moreover, the log directory under HADOOP_HOME contains no logs at all for this job.

The mapper code:

import java.io.IOException;

import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WCMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        
        String line = value.toString();
        String[] words = StringUtils.split(line, " ");
        
        for (String word: words) {
            context.write(new Text(word), new LongWritable(1));
        }
        
    }

}

The reducer code:

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WCReducer extends Reducer<Text, LongWritable, Text, LongWritable>{
    
    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context) 
            throws IOException, InterruptedException {
        
        long count = 0;
        for (LongWritable value: values) {
            count += value.get();
        }
        
        context.write(key, new LongWritable(count));
        
    }

}

The main method:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WCRunner {
    
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        conf.set("mapreduce.job.jar", "wc.jar");
        conf.set("mapreduce.framework.name", "yarn");
        conf.set("yarn.resourcemanager.hostname", "hdcluster01");
        conf.set("yarn.nodemanager.aux-services", "mapreduce_shuffle");
        Job job = Job.getInstance(conf);
        
        // Specify which jar contains the classes used by this job
        job.setJarByClass(WCRunner.class);
        
        // The mapper and reducer classes used by this job
        job.setMapperClass(WCMapper.class);
        job.setReducerClass(WCReducer.class);
        
        // The reducer's output key/value types (if the mapper's output types
        // below were not set, these would apply to the mapper as well)
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        
        // The mapper's output key/value types
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(LongWritable.class);
        
        // Location of the input data
        FileInputFormat.setInputPaths(job, new Path("hdfs://hdcluster01:9000/wc/srcdata"));
        
        // Output path for the results
        FileOutputFormat.setOutputPath(job, new Path("hdfs://hdcluster01:9000/wc/output3"));
        
        // Submit the job to the cluster and wait for it to finish
        job.waitForCompletion(true);
        
    }

}

My local machine runs macOS under the username nasuf; the remote Hadoop deployment is in pseudo-distributed mode, with HDFS and YARN on a single server owned by the user parallels.
I checked the path mentioned in the log, /tmp/hadoop-yarn/staging/nasuf/.staging/job_1535213323614_0010/job.jar, and it indeed does not exist; there is no /hadoop-yarn directory under /tmp at all.

What is causing this problem?
Thanks, everyone.

Answer

陌上花

The problem is solved. Either copy core-site.xml onto the classpath, or add the following configuration:

conf.set("hadoop.tmp.dir", "/home/parallels/app/hadoop-2.4.1/data/");
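The fix most likely works because, without the cluster's filesystem configuration on the client classpath, the client falls back to the local file:// filesystem and stages job.jar under a local /tmp path on the Mac, which the NodeManager on the server then cannot find (hence the file:/tmp/hadoop-yarn/staging/... error). As a sketch, the relevant core-site.xml entries would look roughly like this, with the fs.defaultFS value taken from the HDFS paths in the question and the hadoop.tmp.dir value from the answer above; adjust both for your own deployment:

```xml
<configuration>
  <!-- Default filesystem; without this the client may fall back to file:// -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hdcluster01:9000</value>
  </property>
  <!-- Base directory for temporary/staging data (path from the answer above) -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/parallels/app/hadoop-2.4.1/data/</value>
  </property>
</configuration>
```

With this file on the client classpath, the explicit conf.set("hadoop.tmp.dir", ...) call should no longer be needed.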
March 23, 2018, 01:12