When I run my Pig script in grunt, the output looks fine. Here is an example:

2013-07-08 16:58:40,640 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2013-07-08 16:58:40,647 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2013-07-08 16:58:40,647 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
((email,r@gmail.com),{(rrr24,(full_name,rachana),email,(state,ca),(birth_year,2013),(gender,female),{}),(rrr,(full_name,rachana),email,(state,ca),(birth_year,2013),(gender,female),{}),(rrr10,(full_name,rachana),email,(state,ca),(birth_year,2013),(gender,female),{}),(rrr20,(full_name,rachana),email,(state,ca),(birth_year,2013),(gender,female),{}),(rrr23,(full_name,rachana),email,(state,ca),(birth_year,2013),(gender,female),{}),(rrr9,(full_name,rachana),email,(state,ca),(birth_year,2013),(gender,female),{}),(rrr8,(full_name,rachana),email,(state,ca),(birth_year,2013),(gender,female),{}),(rrr22,(full_name,rachana),email,(state,ca),(birth_year,2013),(gender,female),{}),(rrr21,(full_name,rachana),email,(state,ca),(birth_year,2013),(gender,female),{})})
((email,zzzz@gmail.com),{(rrr0,(full_name,rachana),email,(state,ca),(birth_year,2013),(gender,female),{}),(rrr6,(full_name,rachana),email,(state,ca),(birth_year,2013),(gender,female),{}),(rrr7,(full_name,rachana),email,(state,ca),(birth_year,2013),(gender,female),{}),(rrr3,(full_name,rachana),email,(state,ca),(birth_year,2013),(gender,female),{}),(rrr1,(full_name,rachana),email,(state,ca),(birth_year,2013),(gender,female),{}),(rrr5,(full_name,rachana),email,(state,ca),(birth_year,2013),(gender,female),{}),(rrr4,(full_name,rachana),email,(state,ca),(birth_year,2013),(gender,female),{}),(rrr2,(full_name,rachana),email,(state,ca),(birth_year,2013),(gender,female),{})})
grunt> 

I can see full_name, email, birth_year and gender there, but when I run the same thing from Java:

package com.chegg.hwh.tracking.dao;

import org.apache.pig.ExecType;
import org.apache.pig.PigServer;

public class HWHDataPigMapReduce {

    public static void main(String args[]) throws Exception {
        PigServer pigServer = new PigServer(ExecType.LOCAL);

        pigServer.registerQuery("rows = LOAD 'cassandra://hwh_tracking/users' USING org.apache.cassandra.hadoop.pig.CassandraStorage();");
        pigServer.registerQuery("emailgroup = group rows by email;");
        pigServer.dumpSchema("emailgroup");

    }

}

Output:

emailgroup: {group: (name: chararray,value: chararray),rows: {(key: chararray,full_name: (name: chararray,value: chararray),email: (name: chararray,value: chararray),state: (name: chararray,value: chararray),birth_year: (name: chararray,value: long),gender: (name: chararray,value: chararray),columns: {(name: chararray,value: bytearray)})}}

I tried using as (full_name:chararray), but it makes no difference. What am I missing here? Can anyone help?

1 Answer

In your Java code you are calling dumpSchema(String alias), which is similar to DESCRIBE in grunt: it prints the schema of the alias, not its data. That is why the output is different.

You can store the result of the query like this: pigServer.store("emailgroup", "out");

You could also try getExamples(), although I have never used it myself.
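
If the goal is to see the rows themselves from Java (closer to what DUMP prints in grunt), one option is PigServer.openIterator(String), which runs the plan and hands back the tuples of an alias. The following is only a rough sketch: it reuses the LOAD and GROUP statements from the question, the class name DumpEmailGroup and the QUERIES constant are made up for illustration, and it assumes the Pig and Cassandra jars are on the classpath and that Cassandra is configured the same way as for the grunt session.

```java
import java.util.Iterator;

import org.apache.pig.ExecType;
import org.apache.pig.PigServer;
import org.apache.pig.data.Tuple;

// Hypothetical class name, used only for this illustration.
public class DumpEmailGroup {

    // The two Pig Latin statements taken from the question.
    public static final String[] QUERIES = {
        "rows = LOAD 'cassandra://hwh_tracking/users' USING org.apache.cassandra.hadoop.pig.CassandraStorage();",
        "emailgroup = GROUP rows BY email;"
    };

    public static void main(String[] args) throws Exception {
        PigServer pigServer = new PigServer(ExecType.LOCAL);
        try {
            for (String query : QUERIES) {
                pigServer.registerQuery(query);
            }
            // Unlike dumpSchema, openIterator executes the plan and
            // returns the actual tuples of the alias.
            Iterator<Tuple> it = pigServer.openIterator("emailgroup");
            while (it.hasNext()) {
                System.out.println(it.next());
            }
        } finally {
            // Release the resources held by the PigServer instance.
            pigServer.shutdown();
        }
    }
}
```

Swapping the openIterator loop for pigServer.store("emailgroup", "out") would write the same tuples to the "out" location instead of printing them.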

http://pig.apache.org/docs/r0.11.1/api/org/apache/pig/PigServer.html

answered 2013-07-09T12:10:30.287