
How to debug the Hadoop MapReduce process

1. Hadoop 2.7.4, HBase 1.2.6
2. Goal: read data from HBase and write it to HDFS
3. The run output is as follows

Map-Reduce Framework
        Map input records=2
        Map output records=0
        Input split bytes=62
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=38
        CPU time spent (ms)=1450
        Physical memory (bytes) snapshot=213590016
        Virtual memory (bytes) snapshot=2123476992
        Total committed heap usage (bytes)=99090432

Map output records is always 0, and I don't know how to inspect the intermediate steps. The HBase user table has two rows, so input=2 is correct.
The code is as follows:

    public class Hdfs {
    private static Logger logger = Logger.getLogger(Hdfs.class); // was Mysql.class, a copy-paste slip
    public static class HbaseMapper extends TableMapper<Text, Text> {
        @Override
        protected void map(ImmutableBytesWritable key, Result value, Context context) throws IOException, InterruptedException {
            // Debug marker record, so at least one output is expected per input row.
            context.write(new Text("test"), new Text("value"));
            for (Cell kv : value.listCells()) {
                // kv.getValue() returns byte[]; concatenating a byte[] with a String
                // produces "[B@..."-style garbage, so decode the bytes first.
                context.write(new Text(key.get()),
                        new Text(new String(CellUtil.cloneValue(kv)) + "sss"));
            }
        }
    }

    public static class HdfsReducer extends Reducer<Text,Text,Text,Text>{
        private Text result = new Text();

        @Override
        protected void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
            for(Text val:values){
                result.set(val);
            }
            context.write(key, result);
        }
    }

    public static void main(String[] args)throws Exception{

        String output = "hdfs://*.*.*.*:9000/output";
        System.setProperty("hadoop.home.dir","/Users/*/hadoop-2.7.4");
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum","master");
        conf.set("fs.default.name","hdfs://*.*.*.*:9000");
        conf.set("mapreduce.app-submission.cross-platform","true");
        conf.set("mapreduce.framework.name","yarn");
        conf.set("mapred.jar","/Users/*/Downloads/WordCount/target/hadoop_m2-1.0-SNAPSHOT.jar");

        FileSystem fs = FileSystem.get(conf);
        Path p = new Path(output);
        if(fs.exists(p)){fs.delete(p,true);}
        Job job = Job.getInstance(conf,"hbase2hdfs");
        job.setJarByClass(Hdfs.class);
        Scan s = new Scan();
        TableMapReduceUtil.initTableMapperJob("user", s,HbaseMapper.class, Text.class, Text.class, job);
        job.setReducerClass(HdfsReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        // With 0 reduce tasks this is a map-only job: HdfsReducer above is never
        // invoked and map output goes straight to the output path.
        job.setNumReduceTasks(0);
        FileOutputFormat.setOutputPath(job, p);
        System.exit(job.waitForCompletion(true)?0:1);
    }
}
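One concrete bug in the mapper can be reproduced without any Hadoop dependency: in Java, `byte[] + String` compiles, but it concatenates the array's default `toString()` (something like `[B@1a2b3c`) rather than the cell contents. A minimal, Hadoop-free sketch:

```java
// Demonstrates why `kv.getValue() + "sss"` writes garbage: concatenating a
// byte[] with a String uses the array's identity toString(), not its contents.
public class ByteConcatDemo {
    public static void main(String[] args) {
        byte[] value = "hello".getBytes(java.nio.charset.StandardCharsets.UTF_8);

        String wrong = value + "sss";   // "[B@..." + "sss" -- array identity, not data
        String right = new String(value, java.nio.charset.StandardCharsets.UTF_8) + "sss";

        System.out.println(wrong.startsWith("[B@")); // true: byte[] toString() is "[B@<hash>"
        System.out.println(right);                   // hellosss
    }
}
```

In the mapper itself the equivalent fix is to decode the cell bytes first (e.g. via `CellUtil.cloneValue(kv)` or `Bytes.toString`) before appending the suffix.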
Answer

清夢
  1. Changing the Hadoop map phase requires repackaging the jar (you cannot see intermediate output while the job runs)
  2. The main function does not need repackaging
  3. For debugging, try MRUnit
March 5, 2017, 20:49
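In the spirit of the MRUnit suggestion, the cheapest first step is often to pull the per-cell transformation out of the mapper into a plain method and exercise it with no cluster at all. A hypothetical sketch (the `cellToOutput` helper is not in the original code; it mirrors what `HbaseMapper.map` intends to do):

```java
// Unit-testing the map logic in isolation, with no Hadoop/HBase on the classpath.
public class MapLogicDemo {
    // Hypothetical extraction of the per-cell logic: decode the stored bytes
    // and append the "sss" suffix, as the mapper intends to.
    static String cellToOutput(byte[] cellValue) {
        return new String(cellValue, java.nio.charset.StandardCharsets.UTF_8) + "sss";
    }

    public static void main(String[] args) {
        byte[] stored = "alice".getBytes(java.nio.charset.StandardCharsets.UTF_8);
        System.out.println(cellToOutput(stored)); // prints "alicesss"
    }
}
```

Once the pure logic is verified, MRUnit's `MapDriver` can drive the real `TableMapper` with a constructed `Result`, and setting `mapreduce.framework.name` to `local` lets the whole job run in-process under an IDE debugger instead of on YARN.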