Taiwan Hadoop Forum

Taiwan Hadoop technical discussion board




[ 3 posts ]
Post subject: WordCount program fails when run against HDFS — where is the problem?
Posted: 2013-12-23, 04:02

Joined: 2013-10-15, 21:01
Posts: 50
package testMap;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.*;
import org.apache.hadoop.mapreduce.lib.output.*;
import org.apache.hadoop.util.*;

public class WordCount extends Configured implements Tool {

    public static class Map extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        // Emit (token, 1) for every whitespace-separated token in the line.
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            int count = 0; // debug counter for the println below
            while (itr.hasMoreTokens()) {
                count++;
                word.set(itr.nextToken());
                System.out.println("word=" + word.toString() + ",count=" + count);
                context.write(word, one);
            }
        }
    }

    public static class Reduce extends
            Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        // Sum all the counts emitted for a given word.
        public void reduce(Text key, Iterable<IntWritable> values,
                Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                System.out.println("val=" + val.get());
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public int run(String[] args) throws Exception {
        boolean useJobTracker = true;
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://localhost:9000");
        if (useJobTracker)
            conf.set("mapred.job.tracker", "localhost:9001");
        else
            conf.set("mapred.job.tracker", "local");

        FileSystem hdfs = FileSystem.get(conf);

        Job job = new Job(conf, "WordCount");
        job.setJarByClass(WordCount.class);
        job.setJobName("WordCount");
        job.setInputFormatClass(TextInputFormat.class);

        job.setMapperClass(Map.class);
        job.setCombinerClass(Reduce.class);
        job.setReducerClass(Reduce.class);

        FileInputFormat.setInputPaths(job, new Path(args[0]));

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        job.setOutputFormatClass(TextOutputFormat.class);

        // Remove the output directory first if it already exists, since
        // the job refuses to write into an existing output path.
        Path dst_path = new Path(args[1]);
        if (hdfs.exists(dst_path)) {
            hdfs.delete(dst_path, true);
            System.out.println("File already exists");
        } else {
            System.out.println("File does not exist");
        }
        FileOutputFormat.setOutputPath(job, dst_path);

        boolean success = job.waitForCompletion(true);
        return success ? 0 : 1; // return 0 on success, 1 on failure
    }

    public static void main(String[] args) throws Exception {
        int ret = ToolRunner.run(new WordCount(), args);
        System.out.println("new WordCount");
        System.exit(ret);
    }
}


Here is the error message:
13/12/23 03:58:24 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/12/23 03:58:25 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
13/12/23 03:58:25 INFO input.FileInputFormat: Total input paths to process : 3
13/12/23 03:58:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/12/23 03:58:25 WARN snappy.LoadSnappy: Snappy native library not loaded
13/12/23 03:58:25 INFO mapred.JobClient: Running job: job_201312222115_0016
13/12/23 03:58:26 INFO mapred.JobClient: map 0% reduce 0%
13/12/23 03:58:55 INFO mapred.JobClient: Task Id : attempt_201312222115_0016_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: testMap.WordCount$Map
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:867)
at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:199)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:719)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Unknown Source)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: testMap.WordCount$Map
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Unknown Source)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:865)
... 8 more


Could the experts here please advise? Thanks!! (I wrote this with windoop.)
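
I also notice the warning `No job jar file set. User classes may not be found.` near the top of the log, which may be related: without a job jar, the TaskTracker's child JVM would have no way to load `testMap.WordCount$Map`. Since I am launching the class straight from the IDE rather than with `hadoop jar`, would setting the jar explicitly help? A sketch of what I mean (the jar path below is only a placeholder, not from my actual setup):

Code:
// Untested sketch: when submitting from an IDE instead of `hadoop jar`,
// hand the JobTracker an explicit job jar via the "mapred.jar" property.
// "C:/workspace/WordCount.jar" is a placeholder path — substitute your own.
conf.set("mapred.jar", "C:/workspace/WordCount.jar");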


Post subject: Re: WordCount program fails when run against HDFS — where is the problem?
Posted: 2013-12-24, 21:30

Joined: 2009-11-09, 19:52
Posts: 2897
Works for me here.

Code:
jazz@vmm:~/my_code$ ant
Buildfile: /home/jazz/my_code/build.xml

compile:
    [mkdir] Created dir: /home/jazz/my_code/class
    [javac] /home/jazz/my_code/build.xml:14: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds
    [javac] Compiling 1 source file to /home/jazz/my_code/class

doc:
    [mkdir] Created dir: /home/jazz/my_code/doc
  [javadoc] Generating Javadoc
  [javadoc] Javadoc execution
  [javadoc] Loading source file /home/jazz/my_code/src/WordCount.java...
  [javadoc] Constructing Javadoc information...
  [javadoc] Standard Doclet version 1.6.0_26
  [javadoc] Building tree for all the packages and classes...
  [javadoc] Building index for all the packages and classes...
  [javadoc] Building index for all classes...

jar:
      [jar] Building jar: /home/jazz/my_code/WordCount.jar

BUILD SUCCESSFUL
Total time: 2 seconds
jazz@vmm:~/my_code$ cat input
this is a test
that is also a test
jazz@vmm:~/my_code$ hadoop fs -put input input
jazz@vmm:~/my_code$ hadoop fs -ls
Found 1 items
-rw-r--r--   1 jazz supergroup         35 2013-12-24 21:26 /user/jazz/input
jazz@vmm:~/my_code$ hadoop jar WordCount.jar testMap.WordCount input output
File does not exist
13/12/24 21:26:35 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/12/24 21:26:35 INFO input.FileInputFormat: Total input paths to process : 1
13/12/24 21:26:35 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/12/24 21:26:35 WARN snappy.LoadSnappy: Snappy native library not loaded
13/12/24 21:26:35 INFO mapred.JobClient: Running job: job_201312242120_0003
13/12/24 21:26:36 INFO mapred.JobClient:  map 0% reduce 0%
13/12/24 21:26:50 INFO mapred.JobClient:  map 100% reduce 0%
13/12/24 21:27:02 INFO mapred.JobClient:  map 100% reduce 100%
13/12/24 21:27:07 INFO mapred.JobClient: Job complete: job_201312242120_0003
13/12/24 21:27:07 INFO mapred.JobClient: Counters: 29
13/12/24 21:27:07 INFO mapred.JobClient:   Job Counters
13/12/24 21:27:07 INFO mapred.JobClient:     Launched reduce tasks=1
13/12/24 21:27:07 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=11803
13/12/24 21:27:07 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/12/24 21:27:07 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/12/24 21:27:07 INFO mapred.JobClient:     Launched map tasks=1
13/12/24 21:27:07 INFO mapred.JobClient:     Data-local map tasks=1
13/12/24 21:27:07 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=10002
13/12/24 21:27:07 INFO mapred.JobClient:   File Output Format Counters
13/12/24 21:27:07 INFO mapred.JobClient:     Bytes Written=37
13/12/24 21:27:07 INFO mapred.JobClient:   FileSystemCounters
13/12/24 21:27:07 INFO mapred.JobClient:     FILE_BYTES_READ=67
13/12/24 21:27:07 INFO mapred.JobClient:     HDFS_BYTES_READ=137
13/12/24 21:27:07 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=43807
13/12/24 21:27:07 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=37
13/12/24 21:27:07 INFO mapred.JobClient:   File Input Format Counters
13/12/24 21:27:07 INFO mapred.JobClient:     Bytes Read=35
13/12/24 21:27:07 INFO mapred.JobClient:   Map-Reduce Framework
13/12/24 21:27:07 INFO mapred.JobClient:     Map output materialized bytes=67
13/12/24 21:27:07 INFO mapred.JobClient:     Map input records=2
13/12/24 21:27:07 INFO mapred.JobClient:     Reduce shuffle bytes=0
13/12/24 21:27:07 INFO mapred.JobClient:     Spilled Records=12
13/12/24 21:27:07 INFO mapred.JobClient:     Map output bytes=71
13/12/24 21:27:07 INFO mapred.JobClient:     CPU time spent (ms)=2640
13/12/24 21:27:07 INFO mapred.JobClient:     Total committed heap usage (bytes)=401997824
13/12/24 21:27:07 INFO mapred.JobClient:     Combine input records=9
13/12/24 21:27:07 INFO mapred.JobClient:     SPLIT_RAW_BYTES=102
13/12/24 21:27:07 INFO mapred.JobClient:     Reduce input records=6
13/12/24 21:27:07 INFO mapred.JobClient:     Reduce input groups=6
13/12/24 21:27:07 INFO mapred.JobClient:     Combine output records=6
13/12/24 21:27:07 INFO mapred.JobClient:     Physical memory (bytes) snapshot=356114432
13/12/24 21:27:07 INFO mapred.JobClient:     Reduce output records=6
13/12/24 21:27:07 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=1210056704
13/12/24 21:27:07 INFO mapred.JobClient:     Map output records=9
new WordCount
jazz@vmm:~/my_code$ hadoop fs -cat output/part*
a   2
also   1
is   2
test   2
that   1
this   1
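
By the way, both of our logs print `Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.` That warning shows up because run() builds its own `new Configuration()` instead of reusing the one ToolRunner injects through Configured. A minimal sketch of the usual Tool pattern — only the top of run() changes, the rest of the job setup stays the same:

Code:
// Sketch: reuse the Configuration prepared by ToolRunner (via getConf()),
// so generic command-line options such as -D, -fs and -jt are honored.
public int run(String[] args) throws Exception {
    Configuration conf = getConf();       // instead of: new Configuration()
    Job job = new Job(conf, "WordCount");
    job.setJarByClass(WordCount.class);   // resolves the jar when run via `hadoop jar`
    // ... the remaining job setup is unchanged ...
    return job.waitForCompletion(true) ? 0 : 1;
}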


Post subject: Re: WordCount program fails when run against HDFS — where is the problem?
Posted: 2013-12-27, 17:18

Joined: 2013-10-15, 21:01
Posts: 50
Thanks for the answer, jazz!!

