Friday, March 21, 2014

Is your MapReduce job running locally?

I spent almost half a day figuring out why my Mahout job was running locally, even though I had not set the MAHOUT_LOCAL flag. It turned out that if you do not set the property mapreduce.framework.name to yarn in mapred-site.xml, your job is going to run locally.


<property>
     <name>mapreduce.framework.name</name>
     <value>yarn</value>
     <description>Runtime framework for executing MapReduce jobs. Falls back to local when unset.</description>
</property>
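
A quick way to check whether a job would fall back to the local runner is to read mapreduce.framework.name out of mapred-site.xml. The following is only a minimal Python sketch: the helper names get_property and runs_locally and the inline sample are made up for illustration, and it looks at a single file, so it ignores overrides passed on the command line or set in job code.

```python
import xml.etree.ElementTree as ET


def get_property(xml_text, name):
    """Return the value of a Hadoop <property> by name, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def runs_locally(xml_text):
    """True if this config text would leave MapReduce on the local runner."""
    # When the property is missing, Hadoop's built-in default is "local".
    framework = get_property(xml_text, "mapreduce.framework.name") or "local"
    return framework == "local"


# Illustrative sample, not a real cluster configuration.
SAMPLE = """<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>"""

print(runs_locally(SAMPLE))                             # False
print(runs_locally("<configuration></configuration>"))  # True
```

To use it on a real node, read the contents of your own mapred-site.xml (often under /etc/hadoop/conf, though that location depends on the distribution) and pass the text to runs_locally.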


On a related note, set yarn.application.classpath (in yarn-site.xml) so that your MapReduce-related jars are on the classpath.



<property>
    <name>yarn.application.classpath</name>
    <value>/etc/hadoop/conf,/usr/lib/hadoop/*,/usr/lib/hadoop/lib/*,/usr/lib/hadoop-hdfs/*,/usr/lib/hadoop-hdfs/lib/*,/usr/lib/hadoop-yarn/*,/usr/lib/hadoop-yarn/lib/*,/usr/lib/hadoop-mapreduce/*,/usr/lib/hadoop-mapreduce/lib/*</value>
  </property>
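
A classpath value like the one above is easy to get wrong, since a single mistyped directory silently drops jars. As a rough sanity check, the sketch below splits the comma-separated value and reports entries that match nothing on the local disk. The function name missing_classpath_entries is hypothetical, and the sketch does not expand environment variables such as $HADOOP_CONF_DIR, so it only suits literal paths like the ones used here.

```python
import glob


def missing_classpath_entries(classpath):
    """Return entries of a comma-separated classpath that match nothing on disk."""
    missing = []
    for entry in classpath.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # glob covers both plain paths and trailing /* wildcards;
        # an empty result means the path is missing or the directory is empty.
        if not glob.glob(entry):
            missing.append(entry)
    return missing


# Example call with two of the entries from the property above.
print(missing_classpath_entries("/etc/hadoop/conf,/usr/lib/hadoop/*"))
```

An empty list means every entry resolved to at least one file or directory on the machine where the script ran; it says nothing about other nodes in the cluster.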
