
I used Hive on Amazon Elastic MapReduce to create a table, import data into it, and partition it. Now I am running a query that counts the most frequently used words in one of the table's fields.

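I first ran the query on a cluster with 1 master instance and 2 core instances, and the computation took 180 seconds. I then reconfigured the cluster to 1 master and 10 core instances, and it still took 180 seconds. Why isn't it any faster?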

The output is almost identical whether I run with 2 or 10 core instances:

Total MapReduce jobs = 2
Launching Job 1 out of 2

Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201208251929_0003, Tracking URL = http://ip-10-120-250-34.ec2.internal:9100/jobdetails.jsp?jobid=job_201208251929_0003
Kill Command = /home/hadoop/bin/hadoop job  -Dmapred.job.tracker=10.120.250.34:9001 -kill job_201208251929_0003
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
2012-08-25 19:38:47,399 Stage-1 map = 0%,  reduce = 0%
2012-08-25 19:39:00,482 Stage-1 map = 3%,  reduce = 0%
2012-08-25 19:39:03,503 Stage-1 map = 5%,  reduce = 0%
2012-08-25 19:39:06,523 Stage-1 map = 10%,  reduce = 0%
2012-08-25 19:39:09,544 Stage-1 map = 18%,  reduce = 0%
2012-08-25 19:39:12,563 Stage-1 map = 24%,  reduce = 0%
2012-08-25 19:39:15,583 Stage-1 map = 35%,  reduce = 0%
2012-08-25 19:39:18,610 Stage-1 map = 45%,  reduce = 0%
2012-08-25 19:39:21,631 Stage-1 map = 53%,  reduce = 0%
2012-08-25 19:39:24,652 Stage-1 map = 67%,  reduce = 0%
2012-08-25 19:39:27,672 Stage-1 map = 75%,  reduce = 0%
2012-08-25 19:39:30,692 Stage-1 map = 89%,  reduce = 0%
2012-08-25 19:39:33,715 Stage-1 map = 94%,  reduce = 0%, Cumulative CPU 23.11 sec
2012-08-25 19:39:34,723 Stage-1 map = 94%,  reduce = 0%, Cumulative CPU 23.11 sec
2012-08-25 19:39:35,730 Stage-1 map = 94%,  reduce = 0%, Cumulative CPU 23.11 sec
2012-08-25 19:39:36,802 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 62.57 sec
2012-08-25 19:39:37,810 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 62.57 sec
2012-08-25 19:39:38,819 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 62.57 sec
2012-08-25 19:39:39,827 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 62.57 sec
2012-08-25 19:39:40,835 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 62.57 sec
2012-08-25 19:39:41,845 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 62.57 sec
2012-08-25 19:39:42,856 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 62.57 sec
2012-08-25 19:39:43,865 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 62.57 sec
2012-08-25 19:39:44,873 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 62.57 sec
2012-08-25 19:39:45,882 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 62.57 sec
2012-08-25 19:39:46,891 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 62.57 sec
2012-08-25 19:39:47,900 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 62.57 sec
2012-08-25 19:39:48,908 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 62.57 sec
2012-08-25 19:39:49,916 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 62.57 sec
2012-08-25 19:39:50,924 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 62.57 sec
2012-08-25 19:39:51,934 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 62.57 sec
2012-08-25 19:39:52,942 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 62.57 sec
2012-08-25 19:39:53,950 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 62.57 sec
2012-08-25 19:39:54,958 Stage-1 map = 100%,  reduce = 72%, Cumulative CPU 62.57 sec
2012-08-25 19:39:55,967 Stage-1 map = 100%,  reduce = 72%, Cumulative CPU 62.57 sec
2012-08-25 19:39:56,976 Stage-1 map = 100%,  reduce = 72%, Cumulative CPU 62.57 sec
2012-08-25 19:39:57,990 Stage-1 map = 100%,  reduce = 90%, Cumulative CPU 62.57 sec
2012-08-25 19:39:59,001 Stage-1 map = 100%,  reduce = 90%, Cumulative CPU 62.57 sec
2012-08-25 19:40:00,011 Stage-1 map = 100%,  reduce = 90%, Cumulative CPU 62.57 sec
2012-08-25 19:40:01,022 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 72.86 sec
2012-08-25 19:40:02,031 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 72.86 sec
2012-08-25 19:40:03,041 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 72.86 sec
2012-08-25 19:40:04,051 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 72.86 sec
2012-08-25 19:40:05,060 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 72.86 sec
2012-08-25 19:40:06,070 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 72.86 sec
2012-08-25 19:40:07,079 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 72.86 sec
MapReduce Total cumulative CPU time: 1 minutes 12 seconds 860 msec
Ended Job = job_201208251929_0003
Counters:
Launching Job 2 out of 2
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201208251929_0004, Tracking URL = http://ip-10-120-250-34.ec2.internal:9100/jobdetails.jsp?jobid=job_201208251929_0004
Kill Command = /home/hadoop/bin/hadoop job  -Dmapred.job.tracker=10.120.250.34:9001 -kill job_201208251929_0004
Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 1
2012-08-25 19:40:30,147 Stage-2 map = 0%,  reduce = 0%
2012-08-25 19:40:43,241 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 7.48 sec
2012-08-25 19:40:44,254 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 7.48 sec
2012-08-25 19:40:45,262 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 7.48 sec
2012-08-25 19:40:46,272 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 7.48 sec
2012-08-25 19:40:47,282 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 7.48 sec
2012-08-25 19:40:48,290 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 7.48 sec
2012-08-25 19:40:49,298 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 7.48 sec
2012-08-25 19:40:50,306 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 7.48 sec
2012-08-25 19:40:51,315 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 7.48 sec
2012-08-25 19:40:52,323 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 7.48 sec
2012-08-25 19:40:53,331 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 7.48 sec
2012-08-25 19:40:54,339 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 7.48 sec
2012-08-25 19:40:55,347 Stage-2 map = 100%,  reduce = 33%, Cumulative CPU 7.48 sec
2012-08-25 19:40:56,357 Stage-2 map = 100%,  reduce = 33%, Cumulative CPU 7.48 sec
2012-08-25 19:40:57,365 Stage-2 map = 100%,  reduce = 33%, Cumulative CPU 7.48 sec
2012-08-25 19:40:58,374 Stage-2 map = 100%,  reduce = 100%, Cumulative CPU 10.85 sec
2012-08-25 19:40:59,384 Stage-2 map = 100%,  reduce = 100%, Cumulative CPU 10.85 sec
2012-08-25 19:41:00,393 Stage-2 map = 100%,  reduce = 100%, Cumulative CPU 10.85 sec
2012-08-25 19:41:01,407 Stage-2 map = 100%,  reduce = 100%, Cumulative CPU 10.85 sec
2012-08-25 19:41:02,420 Stage-2 map = 100%,  reduce = 100%, Cumulative CPU 10.85 sec
2012-08-25 19:41:03,431 Stage-2 map = 100%,  reduce = 100%, Cumulative CPU 10.85 sec
2012-08-25 19:41:04,443 Stage-2 map = 100%,  reduce = 100%, Cumulative CPU 10.85 sec
MapReduce Total cumulative CPU time: 10 seconds 850 msec
Ended Job = job_201208251929_0004
Counters:
MapReduce Jobs Launched: 
Job 0: Map: 2  Reduce: 1   Accumulative CPU: 72.86 sec   HDFS Read: 4920 HDFS Write: 8371374 SUCCESS
Job 1: Map: 1  Reduce: 1   Accumulative CPU: 10.85 sec   HDFS Read: 8371850 HDFS Write: 456 SUCCESS
Total MapReduce CPU Time Spent: 1 minutes 23 seconds 710 msec

2 Answers

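Each job has only one reducer, and it is doing most of the work. I think that is why adding more core instances does not make the query any faster.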
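The log line "Number of reduce tasks not specified. Estimated from input data size: 1" suggests how that single reducer came about. Below is a rough Python sketch of that kind of estimate, not Hive's actual code. The function name `estimate_reducers` is made up for illustration, the default values (1,000,000,000 bytes per reducer, at most 999 reducers) are what I believe Hive releases of that period used and should be treated as assumptions, and the 8 MB input size is hypothetical.

```python
def estimate_reducers(input_bytes, bytes_per_reducer=1_000_000_000, max_reducers=999):
    """Rough model of a size-based reducer estimate (hypothetical helper).

    bytes_per_reducer stands in for hive.exec.reducers.bytes.per.reducer and
    max_reducers for hive.exec.reducers.max; both defaults are assumptions.
    """
    # Integer ceiling division: one reducer per started chunk of input.
    wanted = -(-input_bytes // bytes_per_reducer)
    # Never fewer than one reducer, never more than the configured cap.
    return max(1, min(max_reducers, wanted))


# Hypothetical input of about 8 MB: far below the assumed per-reducer load,
# so the estimate is a single reducer no matter how many nodes exist.
small_input = 8_000_000
print(estimate_reducers(small_input))

# Lowering the per-reducer load to 1 MB spreads the same input over more reducers.
print(estimate_reducers(small_input, bytes_per_reducer=1_000_000))
```

If this model is roughly right, the settings the log itself prints are the ones to try: lower `hive.exec.reducers.bytes.per.reducer`, or fix the count with `set mapred.reduce.tasks=<number>`. Note that the second job reports its reducer count as "determined at compile time", so these settings may not affect it.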

Answered 2012-08-26T07:39:04.873