configuration - How can I use a configuration file containing a list of arguments in a Hadoop program?

Question

I'm writing a Hadoop program, and I know I can pass arguments to it directly through args[]. Currently I do this:

ToolRunner.run(new Configuration(), new RunDear(), args);

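But when there are many arguments, can I instead create a configuration file like the one below and pass it to Hadoop? And where does this file need to be placed: on the local file system or on HDFS?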

sample_size 200
input_genotype_file /data/genotypes.txt 
input_phenotype_file /data/phenotypes.txt
output_directory /outout 
mtry 200
ntree 3000
distance 0 (e.g. 0=euclidean, 1=mahalanobis)
variable_important 0 (e.g. 0=information gain, 1=permutation)
etc….

score 1 · Accepted Answer

You can use conf.addResource(new Path("/path/to/local/file")). This makes the file available to all tasks.
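One caveat worth sketching: as far as I know, addResource expects Hadoop's XML layout (a `<configuration>` element containing `<property>` entries with `<name>` and `<value>`), not the flat "key value" lines shown in the question. The helper below is a hypothetical converter, not part of Hadoop. It assumes each line is a key, whitespace, then the value, and that explanatory notes such as "(e.g. 0=euclidean, ...)" have been removed first, because everything after the key is taken as the value.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper (not part of Hadoop): turns "key value" lines into the
// <configuration>/<property> XML layout used by Hadoop configuration files.
final class FlatFileToHadoopXml {

    // Builds the XML text; blank lines, "#" comments and keys without a value are skipped.
    static String toXml(List<String> lines) {
        StringBuilder xml = new StringBuilder("<?xml version=\"1.0\"?>\n<configuration>\n");
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            // First token is the key, the rest of the line is the value.
            String[] parts = line.split("\\s+", 2);
            if (parts.length < 2) {
                continue;
            }
            xml.append("  <property><name>").append(escape(parts[0]))
               .append("</name><value>").append(escape(parts[1]))
               .append("</value></property>\n");
        }
        return xml.append("</configuration>\n").toString();
    }

    // Escapes the characters that would otherwise break the XML.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Usage: java FlatFileToHadoopXml <flat-input-file> <xml-output-file>
    public static void main(String[] args) throws IOException {
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        Files.write(Paths.get(args[1]), toXml(lines).getBytes(StandardCharsets.UTF_8));
    }
}
```

With the generated XML on the local file system, the driver could call conf.addResource on it before ToolRunner.run, and a task could then read a value with something like context.getConfiguration().get("mtry"). Since the driver already goes through ToolRunner, the generic `-conf <file>` command-line option may also load such a file without code changes, provided the Tool uses getConf().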

score 1 · Accepted Answer

You can put the file in the distributed cache and then pass the file's name to the tasks through the configuration.

score 0 · Accepted Answer

You can create a wrapper class that reads these arguments, puts them into the args array, and passes that array along.
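A minimal sketch of such a wrapper, using only the Java standard library. The class name ArgsFileLoader and the convention of alternating key and value in the array are assumptions made for illustration; the Tool's run method would have to parse the array using that same convention.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper helper: reads "key value" lines and flattens them into
// an args-style array such as {"sample_size", "200", "mtry", "200"}.
final class ArgsFileLoader {

    // Converts the lines to an array; blank lines and "#" comments are skipped.
    static String[] toArgs(List<String> lines) {
        List<String> args = new ArrayList<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            // First token is the key, the rest of the line (if any) is the value.
            String[] parts = line.split("\\s+", 2);
            args.add(parts[0]);
            if (parts.length == 2) {
                args.add(parts[1]);
            }
        }
        return args.toArray(new String[0]);
    }

    // Reads a local file and converts it to an args array.
    static String[] load(String fileName) throws IOException {
        return toArgs(Files.readAllLines(Paths.get(fileName), StandardCharsets.UTF_8));
    }

    public static void main(String[] cliArgs) throws IOException {
        String[] args = load(cliArgs[0]);
        // In the real driver this array would replace the command-line args, e.g.
        // ToolRunner.run(new Configuration(), new RunDear(), args);
        System.out.println(String.join(" ", args));
    }
}
```

Because the file is read in the driver before the job is submitted, this variant only needs the file on the local file system of the machine that launches the job.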

