machine-learning - Learning Weka - Precision and Recall - Wiki example to .Arff file

Question

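I am new to WEKA and to advanced statistics, and I am starting from scratch to understand the WEKA measures. I have worked through all of the @rushdi-shams examples, which are great resources.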

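The Wikipedia article at http://en.wikipedia.org/wiki/Precision_and_recall explains the idea with a simple example: video recognition software detects 7 dogs in a group of 9 real dogs and some cats. I fully understand the example and the recall calculation. As a first step, I want to reproduce this data in Weka. How do I create such an .ARFF file? With the file below I get a wrong confusion matrix and a wrong Detailed Accuracy By Class: the recall should not be 1, it should be 4/9 (0.4444).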

@relation 'dogs and cat detection'

@attribute              'realanimal'      {dog,cat}
@attribute              'detected'        {dog,cat}
@attribute              'class'           {correct,wrong}

@data
dog,dog,correct
dog,dog,correct
dog,dog,correct
dog,dog,correct
cat,dog,wrong
cat,dog,wrong
cat,dog,wrong
dog,?,?
dog,?,?
dog,?,?
dog,?,?
dog,?,?
cat,?,?
cat,?,?

Weka output (no filters)

=== Run information ===

Scheme:weka.classifiers.rules.ZeroR 
Relation:     dogs and cat detection
Instances:    14
Attributes:   3
          realanimal
          detected
          class
Test mode:10-fold cross-validation

=== Classifier model (full training set) ===

ZeroR predicts class value: correct

Time taken to build model: 0 seconds

=== Stratified cross-validation ===
=== Summary ===

Correctly Classified Instances           4               57.1429 %
Incorrectly Classified Instances         3               42.8571 %
Kappa statistic                          0     
Mean absolute error                      0.5   
Root mean squared error                  0.5044
Relative absolute error                100      %
Root relative squared error            100      %
Total Number of Instances                7     
Ignored Class Unknown Instances          7     

=== Detailed Accuracy By Class ===

           TP Rate   FP Rate   Precision   Recall  F-Measure   ROC Area  Class
             1         1          0.571     1         0.727      0.65     correct
             0         0          0         0         0          0.136    wrong
Weighted Avg.    0.571     0.571      0.327     0.571     0.416      0.43 

=== Confusion Matrix ===

 a b   <-- classified as
 4 0 | a = correct
 3 0 | b = wrong

Is something wrong with how I encoded the false-negative dogs, or is my ARFF approach completely wrong and do I need a different kind of attribute?

Thanks

score 6 · Accepted Answer

Let's start with the basic definitions of precision and recall.

Precision = TP/(TP+FP)
Recall = TP/(TP+FN)

Here TP is the number of true positives, FP the number of false positives, and FN the number of false negatives.
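As a minimal sketch of the two definitions, the helpers below compute precision and recall from raw counts. The function names and the sample counts are hypothetical, and returning 0.0 for a zero denominator is an assumption made here to keep the functions total; strictly, the ratio is undefined in that case.

```python
def precision(tp: int, fp: int) -> float:
    """TP / (TP + FP); returns 0.0 when nothing was predicted positive."""
    denom = tp + fp
    return tp / denom if denom else 0.0


def recall(tp: int, fn: int) -> float:
    """TP / (TP + FN); returns 0.0 when there are no actual positives."""
    denom = tp + fn
    return tp / denom if denom else 0.0


# Hypothetical counts, for illustration only.
print(precision(tp=8, fp=2))  # 0.8
print(recall(tp=8, fn=8))     # 0.5
```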

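In the dog.arff file above, Weka took only the first 7 tuples into account and ignored the remaining 7, because their class value is missing (`?`). The output shows that ZeroR classified all 7 remaining tuples as correct (the 4 tuples labeled correct plus the 3 labeled wrong).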
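The following sketch replays that behavior on the rows from the question, outside Weka. It is not Weka's implementation: it simply drops rows whose class is `?` and picks the majority class, which is what ZeroR amounts to here. The second half rests on an assumption about what the asker intended, namely that `realanimal` is the ground truth, `detected` is the prediction, and `?` means "not detected as a dog".

```python
from collections import Counter

# Data rows copied from the @data section of the question's ARFF file,
# in the order: realanimal, detected, class.
ROWS = (
    ["dog,dog,correct"] * 4
    + ["cat,dog,wrong"] * 3
    + ["dog,?,?"] * 5
    + ["cat,?,?"] * 2
)
rows = [line.split(",") for line in ROWS]

# Rows whose class value is missing ("?") cannot be scored, so they are dropped.
labeled = [r for r in rows if r[2] != "?"]
ignored = len(rows) - len(labeled)

# ZeroR always predicts the most frequent class among the labeled rows.
class_counts = Counter(r[2] for r in labeled)
majority = class_counts.most_common(1)[0][0]
print(len(labeled), ignored, majority)  # 7 7 correct

# Assumed reading of the question: realanimal is the truth, detected is the
# prediction, and "?" means the animal was not detected as a dog.
tp = sum(1 for real, det, _ in rows if real == "dog" and det == "dog")
fp = sum(1 for real, det, _ in rows if real == "cat" and det == "dog")
fn = sum(1 for real, det, _ in rows if real == "dog" and det != "dog")
print(tp, fp, fn)                # 4 3 5
print(round(tp / (tp + fn), 4))  # 0.4444 (recall for "dog")
print(round(tp / (tp + fp), 4))  # 0.5714 (precision for "dog")
```

Under that reading, the five `dog,?,?` rows are exactly the false negatives needed to reach 4/9, and they are the rows Weka discards when `class` is the target attribute.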

Let's calculate precision and recall for the correct and wrong classes. First, for the correct class:

Prec = 4/(4+3) = 0.571428571
Recall = 4/(4+0) = 1.

For the wrong class:

Prec = 0/(0+0) = 0 (strictly, 0/0 is undefined; Weka reports it as 0 here)
Recall = 0/(0+3) = 0
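These hand calculations can be checked against the confusion matrix Weka printed. The sketch below derives per-class precision and recall from that matrix and then the class-weighted averages. The helper names are hypothetical, and treating 0/0 as 0 is an assumption chosen to match the output shown above.

```python
# Confusion matrix from the Weka output: rows are actual classes,
# columns are predicted classes.
LABELS = ["correct", "wrong"]
MATRIX = [
    [4, 0],  # actual = correct
    [3, 0],  # actual = wrong
]


def safe_div(num: float, den: float) -> float:
    """Divide, returning 0.0 for a zero denominator (assumed convention)."""
    return num / den if den else 0.0


def per_class_metrics(matrix, labels):
    """Return {label: (precision, recall)} computed from a confusion matrix."""
    metrics = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp                 # rest of the actual-class row
        fp = sum(row[i] for row in matrix) - tp  # rest of the predicted column
        metrics[label] = (safe_div(tp, tp + fp), safe_div(tp, tp + fn))
    return metrics


metrics = per_class_metrics(MATRIX, LABELS)
total = sum(sum(row) for row in MATRIX)

# Weighted averages: each class is weighted by its number of actual instances.
weighted_precision = sum(
    sum(MATRIX[i]) * metrics[label][0] for i, label in enumerate(LABELS)
) / total
weighted_recall = sum(
    sum(MATRIX[i]) * metrics[label][1] for i, label in enumerate(LABELS)
) / total

for label in LABELS:
    p, r = metrics[label]
    print(label, round(p, 3), round(r, 3))
# correct 0.571 1.0
# wrong 0.0 0.0
print(round(weighted_precision, 3), round(weighted_recall, 3))  # 0.327 0.571
```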
