vowpalwabbit - How to extract the output policy from a contextual bandit in Vowpal Wabbit?

Question

I am running this example for contextual bandits. The example data is theirs:

1:2:0.4 | a c  
3:0.5:0.2 | b d  
4:1.2:0.5 | a b c  
2:1:0.3 | b c  
3:1.5:0.7 | a d
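
Each line above follows Vowpal Wabbit's contextual bandit label format, `action:cost:probability`, followed by `|` and the feature names. As a minimal sketch of how one such line breaks down (the helper name `parse_cb_example` is hypothetical and not part of VW):

```python
def parse_cb_example(line):
    """Split 'action:cost:probability | features' into its parts."""
    # Everything before '|' is the label, everything after is the feature list.
    label, _, features = line.partition("|")
    action, cost, prob = label.strip().split(":")
    return int(action), float(cost), float(prob), features.split()


print(parse_cb_example("1:2:0.4 | a c"))
# prints (1, 2.0, 0.4, ['a', 'c'])
```

Read this way, the first line says that action 1 was taken with probability 0.4, incurred cost 2, and had features `a` and `c`.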

I used the command as suggested: vw -d train.dat --cb 4 --cb_type dr -f traindModel

I would like to know how to extract the policy from this command and how to interpret it.

Then I ran

vw -d train.dat --invert_hash traindModel

and received this output:

Num weight bits = 18
learning rate = 0.5
initial_t = 0
power_t = 0.5
using no cache
Reading datafile = ../r-mkosinski/train.dat
num sources = 1
average    since         example     example  current  current  current
loss       last          counter      weight    label  predict features
1.000000   1.000000          1      1.0     1.0000   0.0000        3
4.439352   7.878704          2      2.0     3.0000   0.1931        3
4.457758   4.476164          4      4.0     2.0000   1.4285        3

finished run
number of examples per pass = 5
passes used = 1
weighted example sum = 5
weighted label sum = 13
average loss = 4.14973
best constant = 2.6
total feature number = 16

How do I interpret these results? How do I extract the policy?

I also tried this command:

vw -d train.dat --cb 4 --cb_type dr  --invert_hash p2222.txt

and got the following result:

Version 7.8.0
Min label:0.000000
Max label:5.000000
bits:18
0 pairs: 
0 triples: 
lda:0
0 ngram: 
0 skip: 
options: --cb 4 --cb_type dr --csoaa 4
:0
 ^a:108232:0.263395
 ^a:108233:-0.028344
 ^a:108234:0.140435
 ^a:108235:0.215673
 ^a:108236:0.234253
 ^a:108238:0.203977
 ^a:108239:0.182416
 ^b:129036:-0.061075
 ^b:129037:0.242713
 ^b:129038:0.229821
 ^b:129039:0.206961
 ^b:129041:0.185534
 ^b:129042:0.137167
 ^b:129043:0.182416
 ^c:219516:0.264300
 ^c:219517:0.242713
 ^c:219518:-0.158527
 ^c:219519:0.206961
 ^c:219520:0.234253
 ^c:219521:0.185534
 ^c:219523:0.182416
 ^d:20940:-0.058402
 ^d:20941:-0.028344
 ^d:20942:0.372860
 ^d:20943:-0.056001
 ^d:20946:0.326036
Constant:202096:0.263742
Constant:202097:0.242226
Constant:202098:0.358272
Constant:202099:0.205581
Constant:202100:0.234253
Constant:202101:0.185534
Constant:202102:0.326036
Constant:202103:0.182416

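Why are there only 5 records for d in the output, but 7 records each for a, b, and c? Does this correspond to the fact that features a, b, and c each appear 3 times in the data, while d appears only 2 times? There are also 8 Constant rows. What do they correspond to?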
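
To check counts like these without tallying by hand, here is a rough sketch that groups the weight lines of the readable model by feature name. It assumes each weight line has the form `name:index:weight`, as in the dump above; the function name `group_weights` is hypothetical and not part of VW.

```python
from collections import defaultdict


def group_weights(lines):
    """Group 'name:index:weight' lines from a readable model dump by name."""
    groups = defaultdict(list)
    for raw in lines:
        parts = raw.strip().rsplit(":", 2)
        if len(parts) != 3:
            continue  # header lines such as "bits:18" have fewer fields
        name, index, weight = parts
        try:
            groups[name].append((int(index), float(weight)))
        except ValueError:
            continue  # not a weight line
    return dict(groups)


# A few lines copied from the dump above.
sample = [
    "bits:18",
    " ^d:20940:-0.058402",
    " ^d:20941:-0.028344",
    "Constant:202096:0.263742",
]
for name, entries in group_weights(sample).items():
    print(name, len(entries))
```

Running it over the whole of p2222.txt (for example, `group_weights(open("p2222.txt"))`) should give the per-feature row counts described above. It only tabulates the dump and does not explain what each index means.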

score 4 · Accepted Answer


answered 2015-02-12T11:30:45.863
