I ran into this problem at work recently. It concerns FLATTEN in Pig. I'll illustrate it with a simple example.
Two files:
===file1===
1_a
2_b
4_d
===file2 (tab-separated)===
1 a
2 b
3 c
Pig script 1:
a = load 'file1' as (str:chararray);
b = load 'file2' as (num:int, ch:chararray);
a1 = foreach a generate flatten(STRSPLIT(str,'_',2)) as (num:int, ch:chararray);
c = join a1 by num, b by num;
dump c; -- exception java.lang.String cannot be cast to java.lang.Integer
Pig script 2:
a = load 'file1' as (str:chararray);
b = load 'file2' as (num:int, ch:chararray);
a1 = foreach a generate flatten(STRSPLIT(str,'_',2)) as (num:int, ch:chararray);
a2 = foreach a1 generate (int)num as num, ch as ch;
c = join a2 by num, b by num;
dump c; -- exception java.lang.String cannot be cast to java.lang.Integer
Pig script 3:
a = load 'file1' as (str:chararray);
b = load 'file2' as (num:int, ch:chararray);
a1 = foreach a generate flatten(STRSPLIT(str,'_',2));
a2 = foreach a1 generate (int)$0 as num, $1 as ch;
c = join a2 by num, b by num;
dump c; -- works
I don't understand why scripts 1 and 2 fail while script 3 works. I'd also like to know whether there is a more concise way to get relation c. Thanks.