I ran the Mahout Wikipedia example using a 7 GB wiki backup, but when I test the classifier I get an OutOfMemoryError.
The output is pasted below. I set both the Mahout heap size and the Java heap size to 2500m.
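For reference, a minimal sketch of how heap settings like these are commonly applied before launching the job. It assumes the environment-variable route: `MAHOUT_HEAPSIZE` is what makes `bin/mahout` print "run with heapsize 2500" and pass `-Xmx2500m`, while the use of `HADOOP_HEAPSIZE` for the Java/Hadoop side is an assumption and not confirmed by the log.

```shell
# Assumption: heap sizes are supplied through environment variables (values in MB).
export MAHOUT_HEAPSIZE=2500   # bin/mahout converts this to -Xmx2500m
export HADOOP_HEAPSIZE=2500   # assumed setting for Hadoop's launcher scripts

# Show what the launcher scripts will see.
echo "MAHOUT_HEAPSIZE=${MAHOUT_HEAPSIZE} HADOOP_HEAPSIZE=${HADOOP_HEAPSIZE}"

# The test command from this post would then be run in the same shell:
# $MAHOUT_HOME/bin/mahout testclassifier -m wikipediamodel -d wikipediainput
```

Both variables must be exported in the same shell session that starts `bin/mahout`, otherwise the launcher falls back to its default heap size.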
$MAHOUT_HOME/bin/mahout testclassifier -m wikipediamodel -d wikipediainput
run with heapsize 2500
-Xmx2500m
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using HADOOP_HOME=/home/hduser/hadoop/hadoop
No HADOOP_CONF_DIR set, using /home/hduser/hadoop/hadoop/conf
MAHOUT-JOB: /home/nauman/mahout/examples/target/mahout-examples-0.7-SNAPSHOT-job.jar
12/04/10 00:06:18 INFO common.HadoopUtil: Deleting wikipediainput-output
12/04/10 00:06:18 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
12/04/10 00:06:18 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/04/10 00:06:18 INFO mapred.FileInputFormat: Total input paths to process : 1
12/04/10 00:06:18 INFO mapred.JobClient: Running job: job_local_0001
12/04/10 00:06:18 INFO mapred.FileInputFormat: Total input paths to process : 1
12/04/10 00:06:18 INFO mapred.MapTask: numReduceTasks: 1
12/04/10 00:06:18 INFO mapred.MapTask: io.sort.mb = 100
12/04/10 00:06:19 INFO mapred.MapTask: data buffer = 79691776/99614720
12/04/10 00:06:19 INFO mapred.MapTask: record buffer = 262144/327680
12/04/10 00:06:19 INFO bayes.BayesClassifierMapper: Bayes Parameter {basePath=wikipediamodel, classifierType=bayes, dataSource=hdfs, alpha_i=1.0, gramSize=1, verbose=false, encoding=UTF-8, confusionMatrix=null, defaultCat=unknown, testDirPath=wikipediainput}
12/04/10 00:06:19 INFO bayes.BayesClassifierMapper: {basePath=wikipediamodel, classifierType=bayes, dataSource=hdfs, alpha_i=1.0, gramSize=1, verbose=false, encoding=UTF-8, confusionMatrix=null, defaultCat=unknown, testDirPath=wikipediainput}
12/04/10 00:06:19 INFO bayes.BayesClassifierMapper: Testing Bayes Classifier
12/04/10 00:06:19 INFO mapred.JobClient: map 0% reduce 0%
12/04/10 00:06:20 INFO bayes.SequenceFileModelReader: Read 50000 feature weights
12/04/10 00:06:20 INFO bayes.SequenceFileModelReader: Read 100000 feature weights
12/04/10 00:06:21 INFO bayes.SequenceFileModelReader: Read 150000 feature weights
12/04/10 00:06:21 INFO bayes.SequenceFileModelReader: Read 200000 feature weights
12/04/10 00:06:21 INFO bayes.SequenceFileModelReader: Read 250000 feature weights
12/04/10 00:06:21 INFO mapred.LocalJobRunner: file:/home/nauman/wikipediainput/part-r-00000:0+33554432
12/04/10 00:06:22 INFO bayes.SequenceFileModelReader: Read 300000 feature weights
12/04/10 00:06:22 INFO bayes.SequenceFileModelReader: Read 350000 feature weights
12/04/10 00:06:22 INFO bayes.SequenceFileModelReader: Read 400000 feature weights
12/04/10 00:06:22 INFO bayes.SequenceFileModelReader: Read 450000 feature weights
12/04/10 00:06:23 INFO bayes.SequenceFileModelReader: Read 500000 feature weights
12/04/10 00:06:23 INFO bayes.SequenceFileModelReader: Read 550000 feature weights
12/04/10 00:06:23 INFO bayes.SequenceFileModelReader: Read 600000 feature weights
12/04/10 00:06:23 INFO bayes.SequenceFileModelReader: Read 650000 feature weights
12/04/10 00:06:23 INFO bayes.SequenceFileModelReader: Read 700000 feature weights
12/04/10 00:06:23 INFO bayes.SequenceFileModelReader: Read 750000 feature weights
12/04/10 00:06:23 INFO bayes.SequenceFileModelReader: Read 800000 feature weights
12/04/10 00:06:23 INFO bayes.SequenceFileModelReader: Read 850000 feature weights
12/04/10 00:06:24 INFO bayes.SequenceFileModelReader: Read 900000 feature weights
12/04/10 00:06:25 INFO bayes.SequenceFileModelReader: Read 950000 feature weights
12/04/10 00:06:25 INFO bayes.SequenceFileModelReader: Read 1000000 feature weights
12/04/10 00:06:25 INFO bayes.SequenceFileModelReader: Read 1050000 feature weights
12/04/10 00:06:26 INFO bayes.SequenceFileModelReader: Read 1100000 feature weights
12/04/10 00:06:26 INFO bayes.SequenceFileModelReader: Read 1150000 feature weights
12/04/10 00:06:26 INFO bayes.SequenceFileModelReader: Read 1200000 feature weights
12/04/10 00:06:26 INFO bayes.SequenceFileModelReader: Read 1250000 feature weights
12/04/10 00:06:26 INFO bayes.SequenceFileModelReader: Read 1300000 feature weights
12/04/10 00:06:26 INFO bayes.SequenceFileModelReader: Read 1350000 feature weights
12/04/10 00:06:26 INFO bayes.SequenceFileModelReader: Read 1400000 feature weights
12/04/10 00:06:26 INFO bayes.SequenceFileModelReader: Read 1450000 feature weights
12/04/10 00:06:26 INFO bayes.SequenceFileModelReader: Read 1500000 feature weights
12/04/10 00:06:27 INFO bayes.SequenceFileModelReader: Read 1550000 feature weights
12/04/10 00:06:27 INFO bayes.SequenceFileModelReader: Read 1600000 feature weights
12/04/10 00:06:27 INFO bayes.SequenceFileModelReader: Read 1650000 feature weights
12/04/10 00:06:27 INFO bayes.SequenceFileModelReader: Read 1700000 feature weights
12/04/10 00:06:27 INFO bayes.SequenceFileModelReader: Read 1750000 feature weights
12/04/10 00:06:27 INFO bayes.SequenceFileModelReader: Read 1800000 feature weights
12/04/10 00:06:28 INFO bayes.SequenceFileModelReader: Read 1850000 feature weights
12/04/10 00:06:28 INFO bayes.SequenceFileModelReader: Read 1900000 feature weights
12/04/10 00:06:30 INFO bayes.SequenceFileModelReader: Read 1950000 feature weights
12/04/10 00:06:30 INFO bayes.SequenceFileModelReader: Read 2000000 feature weights
12/04/10 00:06:30 INFO bayes.SequenceFileModelReader: Read 2050000 feature weights
12/04/10 00:06:30 INFO bayes.SequenceFileModelReader: Read 2100000 feature weights
12/04/10 00:06:30 INFO bayes.SequenceFileModelReader: Read 2150000 feature weights
12/04/10 00:06:30 INFO bayes.SequenceFileModelReader: Read 2200000 feature weights
12/04/10 00:06:30 INFO bayes.SequenceFileModelReader: Read 2250000 feature weights
12/04/10 00:06:30 INFO bayes.SequenceFileModelReader: Read 2300000 feature weights
12/04/10 00:06:30 INFO bayes.SequenceFileModelReader: Read 2350000 feature weights
12/04/10 00:06:31 INFO bayes.SequenceFileModelReader: Read 2400000 feature weights
12/04/10 00:06:31 INFO bayes.SequenceFileModelReader: Read 2450000 feature weights
12/04/10 00:06:31 INFO bayes.SequenceFileModelReader: Read 2500000 feature weights
12/04/10 00:06:31 INFO bayes.SequenceFileModelReader: Read 2550000 feature weights
12/04/10 00:06:31 INFO bayes.SequenceFileModelReader: Read 2600000 feature weights
12/04/10 00:06:31 INFO bayes.SequenceFileModelReader: Read 2650000 feature weights
12/04/10 00:06:31 INFO bayes.SequenceFileModelReader: Read 2700000 feature weights
12/04/10 00:06:32 INFO bayes.SequenceFileModelReader: Read 2750000 feature weights
12/04/10 00:06:32 INFO bayes.SequenceFileModelReader: Read 2800000 feature weights
12/04/10 00:06:32 INFO bayes.SequenceFileModelReader: Read 2850000 feature weights
12/04/10 00:06:32 INFO bayes.SequenceFileModelReader: Read 2900000 feature weights
12/04/10 00:06:32 INFO bayes.SequenceFileModelReader: Read 2950000 feature weights
12/04/10 00:06:32 INFO bayes.SequenceFileModelReader: Read 3000000 feature weights
12/04/10 00:06:32 INFO bayes.SequenceFileModelReader: Read 3050000 feature weights
12/04/10 00:06:33 INFO bayes.SequenceFileModelReader: Read 3100000 feature weights
12/04/10 00:06:33 INFO bayes.SequenceFileModelReader: Read 3150000 feature weights
12/04/10 00:06:33 INFO bayes.SequenceFileModelReader: Read 3200000 feature weights
12/04/10 00:06:33 INFO bayes.SequenceFileModelReader: Read 3250000 feature weights
12/04/10 00:06:33 INFO bayes.SequenceFileModelReader: Read 3300000 feature weights
12/04/10 00:06:33 INFO bayes.SequenceFileModelReader: Read 3350000 feature weights
12/04/10 00:06:33 INFO bayes.SequenceFileModelReader: Read 3400000 feature weights
12/04/10 00:06:34 INFO bayes.SequenceFileModelReader: Read 3450000 feature weights
12/04/10 00:06:34 INFO bayes.SequenceFileModelReader: Read 3500000 feature weights
12/04/10 00:06:34 INFO bayes.SequenceFileModelReader: Read 3550000 feature weights
12/04/10 00:06:34 INFO bayes.SequenceFileModelReader: Read 3600000 feature weights
12/04/10 00:06:39 INFO bayes.SequenceFileModelReader: Read 3650000 feature weights
12/04/10 00:06:39 INFO bayes.SequenceFileModelReader: Read 3700000 feature weights
12/04/10 00:06:40 INFO bayes.SequenceFileModelReader: Read 3750000 feature weights
12/04/10 00:06:40 INFO bayes.SequenceFileModelReader: Read 3800000 feature weights
12/04/10 00:06:42 INFO bayes.SequenceFileModelReader: Read 3850000 feature weights
12/04/10 00:06:42 INFO bayes.SequenceFileModelReader: Read 3900000 feature weights
12/04/10 00:06:42 INFO bayes.SequenceFileModelReader: Read 3950000 feature weights
12/04/10 00:06:42 INFO bayes.SequenceFileModelReader: Read 4000000 feature weights
12/04/10 00:06:42 INFO bayes.SequenceFileModelReader: Read 4050000 feature weights
12/04/10 00:06:44 INFO bayes.SequenceFileModelReader: Read 4100000 feature weights
12/04/10 00:06:44 INFO bayes.SequenceFileModelReader: Read 4150000 feature weights
12/04/10 00:06:45 INFO bayes.SequenceFileModelReader: Read 4200000 feature weights
12/04/10 00:06:45 INFO bayes.SequenceFileModelReader: Read 4250000 feature weights
12/04/10 00:06:45 INFO bayes.SequenceFileModelReader: Read 4300000 feature weights
12/04/10 00:06:47 INFO bayes.SequenceFileModelReader: Read 4350000 feature weights
12/04/10 00:06:47 INFO bayes.SequenceFileModelReader: Read 4400000 feature weights
12/04/10 00:06:47 INFO bayes.SequenceFileModelReader: Read 4450000 feature weights
12/04/10 00:06:47 INFO bayes.SequenceFileModelReader: Read 4500000 feature weights
12/04/10 00:06:50 INFO bayes.SequenceFileModelReader: Read 4550000 feature weights
12/04/10 00:06:50 INFO bayes.SequenceFileModelReader: Read 4600000 feature weights
12/04/10 00:06:51 INFO bayes.SequenceFileModelReader: Read 4650000 feature weights
12/04/10 00:06:51 INFO bayes.SequenceFileModelReader: Read 4700000 feature weights
12/04/10 00:06:53 INFO bayes.SequenceFileModelReader: Read 4750000 feature weights
12/04/10 00:06:53 INFO bayes.SequenceFileModelReader: Read 4800000 feature weights
12/04/10 00:06:53 INFO bayes.SequenceFileModelReader: Read 4850000 feature weights
12/04/10 00:06:56 INFO bayes.SequenceFileModelReader: Read 4900000 feature weights
12/04/10 00:06:56 INFO bayes.SequenceFileModelReader: Read 4950000 feature weights
12/04/10 00:06:56 INFO bayes.SequenceFileModelReader: Read 5000000 feature weights
12/04/10 00:06:59 INFO bayes.SequenceFileModelReader: Read 5050000 feature weights
12/04/10 00:06:59 INFO bayes.SequenceFileModelReader: Read 5100000 feature weights
12/04/10 00:06:59 INFO bayes.SequenceFileModelReader: Read 5150000 feature weights
12/04/10 00:07:01 INFO bayes.SequenceFileModelReader: Read 5200000 feature weights
12/04/10 00:07:02 INFO bayes.SequenceFileModelReader: Read 5250000 feature weights
12/04/10 00:07:02 INFO bayes.SequenceFileModelReader: Read 5300000 feature weights
12/04/10 00:07:04 INFO bayes.SequenceFileModelReader: Read 5350000 feature weights
12/04/10 00:07:04 INFO bayes.SequenceFileModelReader: Read 5400000 feature weights
12/04/10 00:07:07 INFO bayes.SequenceFileModelReader: Read 5450000 feature weights
12/04/10 00:07:07 INFO bayes.SequenceFileModelReader: Read 5500000 feature weights
12/04/10 00:07:10 INFO bayes.SequenceFileModelReader: Read 5550000 feature weights
12/04/10 00:07:12 INFO bayes.SequenceFileModelReader: Read 5600000 feature weights
12/04/10 00:07:12 INFO bayes.SequenceFileModelReader: Read 5650000 feature weights
12/04/10 00:07:15 INFO bayes.SequenceFileModelReader: Read 5700000 feature weights
12/04/10 00:07:17 INFO bayes.SequenceFileModelReader: Read 5750000 feature weights
12/04/10 00:07:20 INFO bayes.SequenceFileModelReader: Read 5800000 feature weights
12/04/10 00:07:23 INFO bayes.SequenceFileModelReader: Read 5850000 feature weights
12/04/10 00:07:25 INFO bayes.SequenceFileModelReader: Read 5900000 feature weights
12/04/10 00:07:28 INFO bayes.SequenceFileModelReader: Read 5950000 feature weights
12/04/10 00:07:33 INFO bayes.SequenceFileModelReader: Read 6000000 feature weights
12/04/10 00:07:38 INFO bayes.SequenceFileModelReader: Read 6050000 feature weights
12/04/10 00:07:46 INFO bayes.SequenceFileModelReader: Read 6100000 feature weights
12/04/10 00:08:04 INFO bayes.SequenceFileModelReader: Read 6150000 feature weights
12/04/10 00:08:20 INFO bayes.SequenceFileModelReader: Read 6200000 feature weights
12/04/10 00:08:47 INFO bayes.SequenceFileModelReader: Read 6250000 feature weights
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at sun.misc.FloatingDecimal.toJavaFormatString(FloatingDecimal.java:887)
at java.lang.Double.toString(Double.java:179)
at java.text.DigitList.set(DigitList.java:272)
at java.text.DecimalFormat.format(DecimalFormat.java:584)
at java.text.DecimalFormat.format(DecimalFormat.java:507)
at java.text.NumberFormat.format(NumberFormat.java:269)
at org.apache.hadoop.util.StringUtils.formatPercent(StringUtils.java:119)
at org.apache.hadoop.mapred.JobClient.monitorAndPrintJob(JobClient.java:1283)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1251)
at org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesClassifierDriver.runJob(BayesClassifierDriver.java:87)
at org.apache.mahout.classifier.bayes.TestClassifier.classifyParallel(TestClassifier.java:288)
at org.apache.mahout.classifier.bayes.TestClassifier.main(TestClassifier.java:191)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
12/04/10 00:17:15 WARN mapred.LocalJobRunner: job_local_0001
java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
... 5 more
Caused by: java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
... 10 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
... 13 more
Caused by: java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapCharBuffer.<init>(HeapCharBuffer.java:39)
at java.nio.CharBuffer.allocate(CharBuffer.java:312)
at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:760)
at org.apache.hadoop.io.Text.decode(Text.java:350)
at org.apache.hadoop.io.Text.decode(Text.java:327)
at org.apache.hadoop.io.Text.toString(Text.java:254)
at org.apache.mahout.common.StringTuple.readFields(StringTuple.java:143)
at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1836)
at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1876)
at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:95)
at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:38)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
at com.google.common.collect.Iterators$5.hasNext(Iterators.java:525)
at com.google.common.collect.ForwardingIterator.hasNext(ForwardingIterator.java:43)
at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadFeatureWeights(SequenceFileModelReader.java:72)
at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadModel(SequenceFileModelReader.java:46)
at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.initialize(InMemoryBayesDatastore.java:72)
at org.apache.mahout.classifier.bayes.ClassifierContext.initialize(ClassifierContext.java:44)
at org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesClassifierMapper.configure(BayesClassifierMapper.java:121)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)