r - How does R determine the positive and negative levels of a factor variable?

Question

I want to do binary classification where one level is "top" and the other is "bottom". I used gbm from the h2o package and got "bottom" as the positive class and "top" as the negative class. Here is my code:

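library(h2o)    # h2o.gbm, h2o.predict
library(caret)  # confusionMatrix

# Fit a GBM with 10-fold cross-validation
fit <- h2o.gbm(x = regr.var, y = max.var,
               training_frame = ddd,
               nfolds = 10,
               distribution = 'multinomial',
               balance_classes = TRUE)

# Score the new data and label each row by the predicted probability of 'top'
pred <- as.data.frame(h2o.predict(fit, newdata = eee))
threshold <- 0.5
pred1 <- factor(ifelse(pred[, 'top'] > threshold, 'top', 'bottom'))
err.res <- confusionMatrix(pred1, hh$score_class)
err.res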

The result is:

Confusion Matrix and Statistics
          Reference
Prediction bottom top
    bottom    420 123
    top         1   6
Accuracy : 0.7745          
95% CI : (0.7373, 0.8088)
No Information Rate : 0.7655          
P-Value [Acc > NIR] : 0.3279          

Kappa : 0.0657          
Mcnemar's Test P-Value : <2e-16          

Sensitivity : 0.99762         
Specificity : 0.04651         
Pos Pred Value : 0.77348         
Neg Pred Value : 0.85714         
Prevalence : 0.76545         
Detection Rate : 0.76364         
Detection Prevalence : 0.98727         
Balanced Accuracy : 0.52207         

'Positive' Class : bottom

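However, I would like to predict "top" more accurately. I tried changing the threshold to 0.3, and performance improved. Still, do I need to change the fitting process, for example by using a metric such as "ROC", so that more predictions go to "top"? Do I need to flip the classes so that "top" is the positive class and "bottom" is the negative class? If so, how can I change that?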

score 0 · Accepted Answer

I recommend using h2o.confusionMatrix, which lets you build the confusion matrix at different thresholds.

e.g. h2o.confusionMatrix(object = fit, threshold = 0.3)
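
To illustrate what comparing thresholds looks like, here is a minimal base-R sketch. The probabilities and labels are made-up toy data, and `confusion_at` and `top_recall` are hypothetical helpers, not part of h2o or caret. They apply the same `ifelse` rule as in the question at each cutoff.

```r
# Toy data (made up for illustration): predicted P(top) and the true classes.
p_top  <- c(0.10, 0.20, 0.35, 0.40, 0.45, 0.55, 0.25, 0.60)
actual <- factor(c("bottom", "bottom", "top", "bottom", "top", "top", "bottom", "top"),
                 levels = c("bottom", "top"))

# Hypothetical helper: cross-tabulate predictions against truth at one cutoff.
confusion_at <- function(prob, truth, threshold) {
  pred <- factor(ifelse(prob > threshold, "top", "bottom"),
                 levels = levels(truth))
  table(Prediction = pred, Reference = truth)
}

# Hypothetical helper: share of true "top" rows that are predicted "top".
top_recall <- function(cm) cm["top", "top"] / sum(cm[, "top"])

# Compare the default cutoff with a lower one.
for (th in c(0.5, 0.3)) {
  cm <- confusion_at(p_top, actual, th)
  cat("threshold =", th, " recall for top =", top_recall(cm), "\n")
  print(cm)
}
```

Lowering the cutoff catches more of the true "top" rows at the cost of labeling some "bottom" rows as "top", so it is worth looking at the whole matrix rather than a single statistic.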

Thanks,

Avni
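
On the question in the title: as far as I know, caret's `confusionMatrix` treats the first level of a two-level factor as the positive class unless told otherwise, and R orders factor levels alphabetically by default, which would explain why "bottom" is reported as positive. Below is a minimal sketch with toy labels (made up here) showing the default level order and one way to change it with base R's `relevel`. With caret, passing `positive = "top"` to `confusionMatrix` should have the same effect on the reported statistics.

```r
# Toy labels, made up for illustration.
score_class <- factor(c("top", "bottom", "bottom", "top", "bottom"))

# Default level order is alphabetical, so "bottom" is the first level.
levels(score_class)

# Put "top" first so that tools which use the first level as the
# positive class pick "top" instead.
score_class_top <- relevel(score_class, ref = "top")
levels(score_class_top)

# The underlying labels are unchanged; only the level order differs.
identical(as.character(score_class), as.character(score_class_top))
```

Flipping the positive class only changes which class the sensitivity and specificity refer to. It does not change the model's predictions, so the threshold still controls how many rows end up labeled "top".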
