When I use any of the classification algorithms in Weka, I get results in the following form:
=== Stratified cross-validation ===
=== Summary ===
Correctly Classified Instances            302      63.3124 %
Incorrectly Classified Instances          175      36.6876 %
Kappa statistic                        0.3536
Mean absolute error                    0.3464
Root mean squared error                0.4176
Relative absolute error             85.5832 %
Root relative squared error         92.8684 %
Total Number of Instances                 477
=== Detailed Accuracy By Class ===
               TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area   Class
                 0.801     0.407       0.686    0.801       0.739      0.659   1
                 0.748     0.243       0.549    0.748       0.633      0.718   2
                     0         0           0        0           0      0.478   3
Weighted Avg.    0.633     0.283       0.516    0.633       0.568      0.641
=== Confusion Matrix ===
   a   b   c   <-- classified as
 201  50   0 |   a = 1
  34 101   0 |   b = 2
  58  33   0 |   c = 3
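For reference, the accuracy, precision, and recall figures in this output can be recomputed from the confusion matrix alone. The following is a minimal sketch in plain Java, not Weka's own code; the class name `ConfusionMetrics` and its method names are made up for illustration, and returning 0 when a denominator is 0 is an assumed convention that happens to match the 0 shown for class 3.

```java
import java.util.Locale;

/** Recomputes Weka-style summary figures from a confusion matrix (illustrative helper). */
final class ConfusionMetrics {

    /** Fraction of instances on the diagonal, i.e. correctly classified. */
    static double accuracy(int[][] m) {
        int correct = 0;
        int total = 0;
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[i].length; j++) {
                total += m[i][j];
                if (i == j) {
                    correct += m[i][j];
                }
            }
        }
        return total == 0 ? 0.0 : (double) correct / total;
    }

    /** Recall (TP rate) of class k: diagonal cell divided by its row sum. */
    static double recall(int[][] m, int k) {
        int rowSum = 0;
        for (int j = 0; j < m[k].length; j++) {
            rowSum += m[k][j];
        }
        return rowSum == 0 ? 0.0 : (double) m[k][k] / rowSum;
    }

    /** Precision of class k: diagonal cell divided by its column sum. */
    static double precision(int[][] m, int k) {
        int colSum = 0;
        for (int[] row : m) {
            colSum += row[k];
        }
        return colSum == 0 ? 0.0 : (double) m[k][k] / colSum;
    }

    public static void main(String[] args) {
        // Rows are actual classes 1..3, columns are predicted classes,
        // copied from the confusion matrix above.
        int[][] m = { {201, 50, 0}, {34, 101, 0}, {58, 33, 0} };
        System.out.printf(Locale.ROOT, "accuracy = %.4f%n", accuracy(m));
        for (int k = 0; k < m.length; k++) {
            System.out.printf(Locale.ROOT, "class %d: precision = %.3f, recall = %.3f%n",
                    k + 1, precision(m, k), recall(m, k));
        }
    }
}
```

The point of the sketch is that every one of these numbers needs a predicted class to compare against the actual class, which a classifier provides directly.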
But when I use k-means, the results take the following form:
=== Model and evaluation on training set ===
kMeans
======
Number of iterations: 9
Within cluster sum of squared errors: 297.46622082142716
Missing values globally replaced with mean/mode
Cluster centroids:
                                      Cluster#
Attribute         Full Data          0          1          2
                      (477)      (136)      (172)      (169)
============================================================
Religion             8.6939     7.6691     8.9709     9.2367
Vote_Criterion       2.7736     2.8971     2.4942     2.9586
Sex                  1.4906     1.4559          2          1
DateBirth         1930.7652  1937.5147  1920.2965  1935.9882
Educ                 3.2201     3.2721     3.2209     3.1775
Immigrant            1.6415     1.6838     1.5872     1.6627
Income               2.4675        2.5     2.5523      2.355
Occupation           3.6184     3.8162     3.2907     3.7929
Vote2013                  1          2          1          1
Time taken to build model (full training data) : 0.06 seconds
=== Model and evaluation on training set ===
Clustered Instances
0 136 ( 29%)
1 172 ( 36%)
2 169 ( 35%)
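However, I want to see the correctly classified instances, precision, recall, and so on, as the other algorithms report. Why is this happening, and how can I make Weka display the k-means results in the first format?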
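For context, k-means is a clusterer rather than a classifier, so the cluster numbers 0, 1, 2 are not predictions of a class, and there is nothing for Weka to count as "correct" until clusters are matched to classes. Weka's Explorer offers a "Classes to clusters evaluation" mode on the Cluster tab for this purpose. The idea can be sketched in plain Java as follows. This sketch assumes a simple majority-vote mapping, which is not necessarily the assignment Weka computes; the class name `ClustersToClasses` is made up for illustration, and the `clusters` and `classes` arrays are hypothetical toy data, not the 477 instances from the run above.

```java
import java.util.Arrays;

/** Maps each cluster to its majority class, then scores that mapping (illustrative helper). */
final class ClustersToClasses {

    /** counts[c][k] = number of instances in cluster c whose true class is k. */
    static int[][] crossTab(int[] clusters, int[] classes, int numClusters, int numClasses) {
        int[][] counts = new int[numClusters][numClasses];
        for (int i = 0; i < clusters.length; i++) {
            counts[clusters[i]][classes[i]]++;
        }
        return counts;
    }

    /** For each cluster, the class index that occurs most often in it (ties go to the lower index). */
    static int[] majorityClass(int[][] counts) {
        int[] best = new int[counts.length];
        for (int c = 0; c < counts.length; c++) {
            for (int k = 1; k < counts[c].length; k++) {
                if (counts[c][k] > counts[c][best[c]]) {
                    best[c] = k;
                }
            }
        }
        return best;
    }

    /** Fraction of instances whose cluster's mapped class equals their own class. */
    static double accuracy(int[] clusters, int[] classes, int[] mapping) {
        int correct = 0;
        for (int i = 0; i < clusters.length; i++) {
            if (mapping[clusters[i]] == classes[i]) {
                correct++;
            }
        }
        return clusters.length == 0 ? 0.0 : (double) correct / clusters.length;
    }

    public static void main(String[] args) {
        // Hypothetical toy data: cluster index and true class index per instance.
        int[] clusters = {0, 0, 0, 1, 1, 1, 2, 2};
        int[] classes  = {0, 0, 1, 1, 1, 0, 1, 1};
        int[][] counts = crossTab(clusters, classes, 3, 2);
        int[] mapping = majorityClass(counts);
        System.out.println("cluster -> class: " + Arrays.toString(mapping));
        System.out.println("accuracy: " + accuracy(clusters, classes, mapping));
    }
}
```

Once each cluster has a class label, the mapped labels can be treated like predictions and tabulated into a confusion matrix, from which precision and recall follow as in the classifier output. Note that with majority voting, several clusters may map to the same class, as happens in this toy data.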