I'm using randomForest to find the most significant variables. I expected output that reports the accuracy of the model and ranks the variables by importance, but I am a bit confused now. I fitted a model with randomForest and then ran importance() to extract the variable importance. Then I came across another function, rfcv (Random Forest Cross-Validation for feature selection), which seems more appropriate for this purpose. My question about it is: how do I get the list of the most important variables after running it? How do I inspect its output, and which command should I use?

Another thing: What is the difference between randomForest and predict.randomForest?

I am not very familiar with randomForest or R, so any help would be appreciated.

Thank you in advance!

1 Answer

Once you have built a randomForest model, you use predict.randomForest to apply that model to new data. For example, build a random forest on your training data, then run your validation data through that model with predict.randomForest.
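A minimal sketch of that train-then-predict workflow, assuming the built-in iris data and an arbitrary 100/50 split (both are illustrative choices, not part of the question). Calling predict() on a fitted forest dispatches to the predict.randomForest method.

```r
library(randomForest)

set.seed(42)                                # reproducible split and forest
idx   <- sample(nrow(iris), 100)            # assumed 100/50 train/validation split
train <- iris[idx, ]
valid <- iris[-idx, ]

# randomForest() fits the model on the training data
rf <- randomForest(Species ~ ., data = train)

# predict() on a "randomForest" object uses predict.randomForest
pred <- predict(rf, newdata = valid)

# Validation accuracy and confusion table
acc <- mean(pred == valid$Species)
print(acc)
print(table(predicted = pred, actual = valid$Species))
```

Printing `rf` itself also shows the out-of-bag error estimate, which is a quick accuracy check that needs no separate validation set.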

As for rfcv, it has a recursive option, which (from the help) controls:

whether variable importance is (re-)assessed at each step of variable reduction

It's all in the help file.
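As a rough sketch of how the rfcv output can be inspected, again assuming the iris data purely for illustration: the returned list holds the numbers of predictors tried (n.var) and the cross-validated error for each (error.cv), but not the names of the variables. A ranked list of names still comes from importance() on a fitted forest.

```r
library(randomForest)

set.seed(42)
x <- iris[, 1:4]                            # predictors (illustrative data)
y <- iris$Species                           # response

# Cross-validated error as predictors are dropped step by step;
# recursive = TRUE re-assesses importance at each reduction step
cv <- rfcv(trainx = x, trainy = y, cv.fold = 5, recursive = TRUE)

print(cv$n.var)                             # numbers of predictors tried
print(cv$error.cv)                          # CV error rate for each of those

# Ranked variable names come from importance() on a fitted forest
rf  <- randomForest(x, y, importance = TRUE)
imp <- importance(rf, type = 1)             # type = 1: mean decrease in accuracy
print(imp[order(imp[, 1], decreasing = TRUE), , drop = FALSE])
```

One way to combine the two: read off from `cv$error.cv` how many predictors are worth keeping, then take that many names from the top of the importance ranking.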

answered 2012-07-11T15:02:46.140