python - "Connection refused" when executing a Hive query from a Python script using Thrift

Question

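Hi all,

I am trying to run Hive queries from a Python script using the Thrift library for Python. Queries that do not launch an M/R job, such as create table and select * from table, run fine. But when I run a query that does launch an M/R job (such as select * from table where...), I get the following exception.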

starting hive server...

Hive history file=/tmp/root/hive_job_log_root_201212171354_275968533.txt
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
java.net.ConnectException: Call to sp-rhel6-01/172.22.193.79:54311 failed on connection exception: java.net.ConnectException: Connection refused

Job Submission failed with exception 'java.net.ConnectException(Call to sp-rhel6-01/172.22.193.79:54311 failed on connection exception: java.net.ConnectException: Connection refused)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask

I have a multi-node Hadoop cluster, and Hive is installed on the namenode. I am running the Python script on that same namenode.
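Since the failure is a plain TCP "Connection refused", a quick check of whether anything is listening at the address from the exception may help narrow it down. The following is a minimal sketch using only the Python 3 standard library. The helper name port_is_open is hypothetical, and treating 172.22.193.79:54311 as the JobTracker address (the value of mapred.job.tracker) is an assumption based on the failure happening at job submission, not something the log states.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be established.

    Hypothetical helper for illustration; it only tests reachability and
    says nothing about which service is actually listening on the port.
    """
    try:
        # create_connection raises OSError (including ConnectionRefusedError
        # and timeouts) when the connection cannot be made.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling port_is_open('172.22.193.79', 10000) and port_is_open('172.22.193.79', 54311) from the namenode would show whether the Hive Thrift server and the address from the exception accept connections. If the second call returns False, the refusal can be reproduced without Hive, which points at the service on that port (or the address it is bound to) rather than at the Thrift client code.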

The Python script is:

from hive_service import ThriftHive
from thrift import Thrift
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol

# Connect to the Hive Thrift server on the namenode (port 10000)
transport = TSocket.TSocket('172.22.193.79', 10000)
transport = TTransport.TBufferedTransport(transport)
protocol = TBinaryProtocol.TBinaryProtocol(transport)

client = ThriftHive.Client(protocol)
transport.open()

# This query launches a MapReduce job, which is where the failure occurs
client.execute("select count(*) from example")
print(client.fetchAll())
transport.close()

Can someone help me understand what I am doing wrong?

-Sushant
