Taiwan Hadoop Forum

Taiwan Hadoop technical discussion board
Current time: 2022-08-19, 20:14

All times are UTC + 8 hours

[ 5 posts ]
 Subject: A question about Hadoop Pipes
Posted: 2015-04-15, 12:30
Offline

Joined: 2014-06-22, 01:33
Posts: 7
I set up a pseudo-distributed cluster with Hadoop 2.6

and successfully ran the Hadoop Pipes wordcount example on it.

I then tried calling an empty CUDA kernel from the map function, and the job stopped running correctly.

Does running CUDA code under Hadoop Pipes require any special configuration changes, or is there a mistake in my code?

The map function and the job output are attached below.
Code:
class WordCountMapper : public HadoopPipes::Mapper {
public:
  // constructor: does nothing
  WordCountMapper( HadoopPipes::TaskContext& context ) { }

  // map function: receives a line, outputs (word, "1") to the reducer
  void map( HadoopPipes::MapContext& context ) {

    // call the (empty) CUDA kernel -- this is the line that breaks the job
    hello<<<1, 1>>>();

    //--- get line of text ---
    string line = context.getInputValue();

    //--- split it into words ---
    vector< string > words =
      HadoopUtils::splitString( line, " " );

    //--- emit each word tuple (word, "1") ---
    for ( unsigned int i = 0; i < words.size(); i++ ) {
      context.emit( words[i], HadoopUtils::toString( 1 ) );
    }
  }
};

Code:
DEPRECATED: Use of this script to execute mapred command is deprecated.
Instead use the mapred command for it.

15/04/15 12:06:47 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/04/15 12:06:47 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/04/15 12:06:47 WARN mapreduce.JobSubmitter: No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
15/04/15 12:06:47 INFO mapred.FileInputFormat: Total input paths to process : 1
15/04/15 12:06:48 INFO mapreduce.JobSubmitter: number of splits:1
15/04/15 12:06:48 INFO Configuration.deprecation: hadoop.pipes.java.recordreader is deprecated. Instead, use mapreduce.pipes.isjavarecordreader
15/04/15 12:06:48 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
15/04/15 12:06:48 INFO Configuration.deprecation: hadoop.pipes.java.recordwriter is deprecated. Instead, use mapreduce.pipes.isjavarecordwriter
15/04/15 12:06:48 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1429003001270_0004
15/04/15 12:06:48 INFO mapred.YARNRunner: Job jar is not present. Not adding any jar to the list of resources.
15/04/15 12:06:49 INFO impl.YarnClientImpl: Submitted application application_1429003001270_0004
15/04/15 12:06:49 INFO mapreduce.Job: The url to track the job: http://es-OptiPlex-760:8088/proxy/application_1429003001270_0004/
15/04/15 12:06:49 INFO mapreduce.Job: Running job: job_1429003001270_0004
15/04/15 12:06:56 INFO mapreduce.Job: Job job_1429003001270_0004 running in uber mode : false
15/04/15 12:06:56 INFO mapreduce.Job:  map 0% reduce 0%
15/04/15 12:07:02 INFO mapreduce.Job: Task Id : attempt_1429003001270_0004_m_000000_0, Status : FAILED
Container [pid=30929,containerID=container_1429003001270_0004_01_000002] is running beyond virtual memory limits. Current usage: 284.8 MB of 4 GB physical memory used; 23.8 GB of 8.4 GB virtual memory used. Killing container.
Dump of the process-tree for container_1429003001270_0004_01_000002 :
   |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
   |- 30966 30933 30929 30929 (cuda_wordcount) 1 3 21612838912 5512 /home/es/hadoop/tmp/nm-local-dir/usercache/es/appcache/application_1429003001270_0004/container_1429003001270_0004_01_000002/cuda_wordcount
   |- 30929 30927 30929 30929 (bash) 0 0 17043456 313 /bin/bash -c /usr/lib/jvm/java-7-openjdk-amd64/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN  -Xmx3072m -Djava.io.tmpdir=/home/es/hadoop/tmp/nm-local-dir/usercache/es/appcache/application_1429003001270_0004/container_1429003001270_0004_01_000002/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/home/es/hadoop-2.6.0/logs/userlogs/application_1429003001270_0004/container_1429003001270_0004_01_000002 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 127.0.1.1 53186 attempt_1429003001270_0004_m_000000_0 2 1>/home/es/hadoop-2.6.0/logs/userlogs/application_1429003001270_0004/container_1429003001270_0004_01_000002/stdout 2>/home/es/hadoop-2.6.0/logs/userlogs/application_1429003001270_0004/container_1429003001270_0004_01_000002/stderr 
   |- 30933 30929 30929 30929 (java) 612 22 3955175424 67072 /usr/lib/jvm/java-7-openjdk-amd64/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx3072m -Djava.io.tmpdir=/home/es/hadoop/tmp/nm-local-dir/usercache/es/appcache/application_1429003001270_0004/container_1429003001270_0004_01_000002/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/home/es/hadoop-2.6.0/logs/userlogs/application_1429003001270_0004/container_1429003001270_0004_01_000002 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 127.0.1.1 53186 attempt_1429003001270_0004_m_000000_0 2

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

15/04/15 12:07:12 INFO mapreduce.Job: Task Id : attempt_1429003001270_0004_m_000000_1, Status : FAILED
Container [pid=30995,containerID=container_1429003001270_0004_01_000003] is running beyond virtual memory limits. Current usage: 277.4 MB of 4 GB physical memory used; 23.8 GB of 8.4 GB virtual memory used. Killing container.
Dump of the process-tree for container_1429003001270_0004_01_000003 :
   |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
   |- 31032 30999 30995 30995 (cuda_wordcount) 1 3 21612838912 5511 /home/es/hadoop/tmp/nm-local-dir/usercache/es/appcache/application_1429003001270_0004/container_1429003001270_0004_01_000003/cuda_wordcount
   |- 30995 30993 30995 30995 (bash) 0 0 17043456 313 /bin/bash -c /usr/lib/jvm/java-7-openjdk-amd64/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN  -Xmx3072m -Djava.io.tmpdir=/home/es/hadoop/tmp/nm-local-dir/usercache/es/appcache/application_1429003001270_0004/container_1429003001270_0004_01_000003/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/home/es/hadoop-2.6.0/logs/userlogs/application_1429003001270_0004/container_1429003001270_0004_01_000003 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 127.0.1.1 53186 attempt_1429003001270_0004_m_000000_1 3 1>/home/es/hadoop-2.6.0/logs/userlogs/application_1429003001270_0004/container_1429003001270_0004_01_000003/stdout 2>/home/es/hadoop-2.6.0/logs/userlogs/application_1429003001270_0004/container_1429003001270_0004_01_000003/stderr 
   |- 30999 30995 30995 30995 (java) 617 22 3943096320 65195 /usr/lib/jvm/java-7-openjdk-amd64/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx3072m -Djava.io.tmpdir=/home/es/hadoop/tmp/nm-local-dir/usercache/es/appcache/application_1429003001270_0004/container_1429003001270_0004_01_000003/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/home/es/hadoop-2.6.0/logs/userlogs/application_1429003001270_0004/container_1429003001270_0004_01_000003 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 127.0.1.1 53186 attempt_1429003001270_0004_m_000000_1 3

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

15/04/15 12:07:21 INFO mapreduce.Job: Task Id : attempt_1429003001270_0004_m_000000_2, Status : FAILED
Container [pid=31057,containerID=container_1429003001270_0004_01_000004] is running beyond virtual memory limits. Current usage: 287.6 MB of 4 GB physical memory used; 23.8 GB of 8.4 GB virtual memory used. Killing container.
Dump of the process-tree for container_1429003001270_0004_01_000004 :
   |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
   |- 31094 31061 31057 31057 (cuda_wordcount) 1 4 21612838912 5511 /home/es/hadoop/tmp/nm-local-dir/usercache/es/appcache/application_1429003001270_0004/container_1429003001270_0004_01_000004/cuda_wordcount
   |- 31061 31057 31057 31057 (java) 613 25 3952521216 67807 /usr/lib/jvm/java-7-openjdk-amd64/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx3072m -Djava.io.tmpdir=/home/es/hadoop/tmp/nm-local-dir/usercache/es/appcache/application_1429003001270_0004/container_1429003001270_0004_01_000004/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/home/es/hadoop-2.6.0/logs/userlogs/application_1429003001270_0004/container_1429003001270_0004_01_000004 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 127.0.1.1 53186 attempt_1429003001270_0004_m_000000_2 4
   |- 31057 31055 31057 31057 (bash) 0 0 17043456 314 /bin/bash -c /usr/lib/jvm/java-7-openjdk-amd64/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN  -Xmx3072m -Djava.io.tmpdir=/home/es/hadoop/tmp/nm-local-dir/usercache/es/appcache/application_1429003001270_0004/container_1429003001270_0004_01_000004/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/home/es/hadoop-2.6.0/logs/userlogs/application_1429003001270_0004/container_1429003001270_0004_01_000004 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 127.0.1.1 53186 attempt_1429003001270_0004_m_000000_2 4 1>/home/es/hadoop-2.6.0/logs/userlogs/application_1429003001270_0004/container_1429003001270_0004_01_000004/stdout 2>/home/es/hadoop-2.6.0/logs/userlogs/application_1429003001270_0004/container_1429003001270_0004_01_000004/stderr 

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

15/04/15 12:07:30 INFO mapreduce.Job:  map 100% reduce 100%
15/04/15 12:07:31 INFO mapreduce.Job: Job job_1429003001270_0004 failed with state FAILED due to: Task failed task_1429003001270_0004_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

15/04/15 12:07:31 INFO mapreduce.Job: Counters: 9
   Job Counters
      Failed map tasks=4
      Launched map tasks=4
      Other local map tasks=3
      Data-local map tasks=1
      Total time spent by all maps in occupied slots (ms)=103744
      Total time spent by all reduces in occupied slots (ms)=0
      Total time spent by all map tasks (ms)=25936
      Total vcore-seconds taken by all map tasks=25936
      Total megabyte-seconds taken by all map tasks=106233856
Exception in thread "main" java.io.IOException: Job failed!
   at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836)
   at org.apache.hadoop.mapred.pipes.Submitter.runJob(Submitter.java:264)
   at org.apache.hadoop.mapred.pipes.Submitter.run(Submitter.java:503)
   at org.apache.hadoop.mapred.pipes.Submitter.main(Submitter.java:518)


 Subject: Re: A question about Hadoop Pipes
Posted: 2015-04-16, 00:48
Offline

Joined: 2009-11-09, 19:52
Posts: 2897
I'm not sure whether you have the CUDA runtime installed on every TaskTracker node.
The cases I have seen so far run CUDA through Hadoop Streaming;
Hadoop Pipes may have some issues with library calls, but I don't have much experience with that part.
You could start by reading:
http://wiki.apache.org/hadoop/CUDA%20On%20Hadoop

- Jazz


 Subject: Re: A question about Hadoop Pipes
Posted: 2015-04-16, 19:46
Offline

Joined: 2014-06-22, 01:33
Posts: 7
Thanks for the reply, Jazz.

My Hadoop setup is a single machine running in pseudo-distributed mode; could that be a factor?

I'll keep experimenting, and perhaps try other approaches such as JCuda or Streaming. Thank you.

One more question while I'm at it: can Hadoop directly invoke programs installed on the host machine?


 Subject: Re: A question about Hadoop Pipes
Posted: 2015-04-21, 00:00
Offline

Joined: 2009-11-09, 19:52
Posts: 2897
saiwayneliao wrote:
Thanks for the reply, Jazz.
My Hadoop setup is a single machine running in pseudo-distributed mode; could that be a factor?
I'll keep experimenting, and perhaps try other approaches such as JCuda or Streaming. Thank you.
One more question while I'm at it: can Hadoop directly invoke programs installed on the host machine?
Pseudo-distributed mode should not matter much. What does stand out is this message:

Code:
Container [pid=30929,containerID=container_1429003001270_0004_01_000002] is running beyond virtual memory limits. Current usage: 284.8 MB of 4 GB physical memory used; 23.8 GB of 8.4 GB virtual memory used. Killing container.


It looks like YARN killed the container for exceeding its virtual memory limit (23.8 GB used against an 8.4 GB cap), so the task never got a chance to run normally.
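If the virtual-memory check really is what kills the container (CUDA programs typically reserve a very large virtual address space at startup, which would explain the 23.8 GB figure), one common workaround is to relax or disable that check in yarn-site.xml. This is only a sketch of the usual approach, not something I have verified with Pipes + CUDA:

```xml
<!-- yarn-site.xml: sketch of a workaround, assuming the vmem check is the culprit -->

<!-- Option 1: disable the virtual-memory check entirely -->
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>

<!-- Option 2: raise the allowed virtual-to-physical memory ratio (default 2.1) -->
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>10</value>
</property>
```

Restart the NodeManager after changing either property so it takes effect.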

Hadoop MapReduce can invoke programs on the host directly; Hadoop Streaming is the way to do it.
Here is an example I put together, for reference:
http://trac.3du.me/cloud/wiki/III150110/Lab7
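The idea behind Streaming is that any executable which reads lines from stdin and writes key/value lines to stdout can serve as a mapper or reducer, including binaries already installed on the host. A minimal sketch of such an invocation (the streaming jar path and HDFS paths here are assumptions; substitute the ones from your installation):

```
# Sketch only: the streaming jar path and the HDFS input/output paths
# are assumptions -- adjust them to your installation.
# /bin/cat and /usr/bin/wc are ordinary host programs; Streaming simply
# feeds each one its input split on stdin and collects its stdout.
hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-2.6.0.jar \
  -input  /user/es/input \
  -output /user/es/cat-wc-out \
  -mapper /bin/cat \
  -reducer /usr/bin/wc
```

The same pattern would let you wrap your CUDA code in a standalone executable and pass it with -mapper, avoiding the Pipes library issues entirely.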

- Jazz


 Subject: Re: A question about Hadoop Pipes
Posted: 2015-04-28, 21:46
Offline

Joined: 2014-06-22, 01:33
Posts: 7
OK, I'll look into it further. Thanks, Jazz.

