
First start the Hadoop cluster, then submit the example jar with spark-submit --master yarn --class org.apache.spark.examples.SparkPi ./spark-examples-1.6.0-hadoop2.6.0.jar 10.
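For reference, the full launch sequence looks roughly like the sketch below. The Hadoop 2.x sbin start scripts and the $HADOOP_HOME/$SPARK_HOME variables are assumptions about the local setup; only the spark-submit command itself comes from the run above.

    # Start HDFS and YARN (Hadoop 2.x sbin scripts)
    $HADOOP_HOME/sbin/start-dfs.sh
    $HADOOP_HOME/sbin/start-yarn.sh

    # Submit the SparkPi example; --master yarn defaults to yarn-client mode in Spark 1.6
    $SPARK_HOME/bin/spark-submit \
      --master yarn \
      --class org.apache.spark.examples.SparkPi \
      ./spark-examples-1.6.0-hadoop2.6.0.jar 10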

The problem that appeared after submitting:

2018-11-01 09:33:44 INFO BlockManagerInfo:54 - Updated broadcast_0_piece0 in memory on test1:48384 (current size: 1181.0 B, original size: 1181.0 B, free: 413.9 MB)
2018-11-01 09:33:44 ERROR Client:91 - Failed to contact YARN for application application_1541034627389_0001.
java.io.InterruptedIOException: Call interrupted
    at org.apache.hadoop.ipc.Client.call(Client.java:1469)
    at org.apache.hadoop.ipc.Client.call(Client.java:1412)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
    at com.sun.proxy.$Proxy15.getApplicationReport(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:191)
    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy16.getApplicationReport(Unknown Source)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:430)
    at org.apache.spark.deploy.yarn.Client.getApplicationReport(Client.scala:300)
    at org.apache.spark.deploy.yarn.Client.monitorApplication(Client.scala:1059)
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend$MonitorThread.run(YarnClientSchedulerBackend.scala:109)
Solution:
It turned out that spark-env.sh under spark/conf was missing the setting export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop, which is what caused the problem.
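Concretely, spark-env.sh needs a line like the following (a minimal sketch, assuming HADOOP_HOME already points at the Hadoop installation). With it, the Spark YARN client can read yarn-site.xml and locate the ResourceManager instead of failing with the Call interrupted error above.

    # spark/conf/spark-env.sh: tell the Spark YARN client where the YARN configuration lives
    export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop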

The example computes an approximation of Pi and prints the result (the argument 10 only gives 10 slices for the sampling, so the estimate deviates a little from the true value).

Summary:
Three key settings for launching Spark on YARN (see the sample spark-env.sh after this list):
HADOOP_HOME: the Hadoop installation directory;
HADOOP_CONF_DIR: the directory containing the Hadoop configuration files;
YARN_CONF_DIR: the directory containing the YARN configuration files, same as above;
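Putting the three together, a minimal spark-env.sh might look like this; the /opt/hadoop-2.6.0 path is a placeholder for illustration, not the actual installation path from this cluster.

    # spark/conf/spark-env.sh (example paths, adjust to your installation)
    export HADOOP_HOME=/opt/hadoop-2.6.0            # Hadoop installation directory (placeholder)
    export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop  # Hadoop configuration files (core-site.xml, hdfs-site.xml)
    export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop    # YARN configuration files (yarn-site.xml), same directory as above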
