python - How to read other files in hadoop jobs?

Question

I need to read in a dictionary file to filter content specified in the hdfs_input, and I have uploaded it to the cluster using the put command, but I don't know how to access it in my program.

I tried to access it using path on the cluster like normal files, but it gives the error information: IOError: [Errno 2] No such file or directory

Besides, is there any way to maintain only one copy of the dictionary for all the machines that runs the job ?

So what's the correct way of access files other than the specified input in hadoop jobs?

score 0 · Accepted Answer

conf ファイルの-fileオプションまたはオプションで必要なファイルを追加することで問題が解決しました。file=

python - How to read other files in hadoop jobs?

1 に答える 1

Related

Reference