r - 表データの 2 パラメーターワイブル分布の近似

Question

ワイブル分布を 1 つの集計データに調整しようとしています。点群を処理した後、高さ 1 メートルのスライスごとにリターンの数を示す列を取得します。例：

a = matrix(c(7,12,10,10,20,3,15,40,33,57,58,60,79,132,174,201,191,184,115,70,22,2,0),1,23)
colnames(a) <- c(13.5,14.5,15.5,16.5,17.5,18.5,19.5,20.5,21.5,22.5,23.5,24.5,25.5,26.5,27.5,28.5,29.5,30.5,31.5,32.5,33.5,34.5,35.5)

上記の私の例では、中心が 13.5 メートルの高さクラスには、内部に 7 つのポイントがあります。

行列をプロットすると、データ分布を視覚化できます。

barplot(a)

ここに画像の説明を入力

ワイブル 2 パラメータをその集計データに適合させる方法について誰か提案がありますか?

前もって感謝します！

score 6 · Accepted Answer

打ち切りデータに対して最尤法を実行できます。

a = matrix(c(7,12,10,10,20,3,15,40,33,57,58,60,79,132,174,201,191,184,115,70,22,2,0),1,23)
colnames(a) <- c(13.5,14.5,15.5,16.5,17.5,18.5,19.5,20.5,21.5,22.5,23.5,24.5,25.5,26.5,27.5,
                 28.5,29.5,30.5,31.5,32.5,33.5,34.5,35.5)


centers <- as.numeric(colnames(a))
low <- centers - .5
up <- centers + .5

ll.weibullCensored <- function(par, dat){
    shape <- par[1]
    scale <- par[2]
    # Get the probability for each 'bin' and take the log
    log.ps <- log(pweibull(up, shape, scale) - pweibull(low, shape, scale))
    # Sum the logs of the bin probabilities as many times
    # as they should be as dictated by the data
    sum(rep(log.ps, dat))
}

# Use optim or any other function to find a set
# of parameters that maximizes the log likelihood
o.optim <- optim(c(9, 28), 
                 ll.weibullCensored, 
                 dat = as.numeric(a), 
                 # this tells it to find max instead of a min
                 control=list(fnscale=-1))

これは本質的に AndresT と同じ推定値を提供しますが、彼らのアプローチは、すべてのデータがビンの中心にあると仮定し、その帰属データセットに対して最尤法を実行することでした。大した違いはありませんが、この方法では必ずしも他のパッケージは必要ありません。

編集: AndresT のソリューションと私のソリューションが非常によく似た推定値を示しているという事実は、各メソッドで最大化しているものを見ると非常に理にかなっています。私の場合、各「ビン」に入る確率を調べています。AndreT のソリューションでは、この確率の代わりに、ビンの中心にある分布の密度を使用します。各ビンのビンに入る確率とビンの中心の密度値の比率を見ることができます (私のソリューションから得られた形状とスケールを使用)。

# Probability of each bin
> ps
 [1] 0.0005495886 0.0009989085 0.0017438767 0.0029375471 0.0047912909
 [6] 0.0075863200 0.0116800323 0.0174991532 0.0255061344 0.0361186335
[11] 0.0495572085 0.0656015797 0.0832660955 0.1004801353 0.1139855466
[16] 0.1197890284 0.1144657811 0.0971503491 0.0711370586 0.0433654456
[21] 0.0210758647 0.0077516837 0.0020274896
# Density evaluated at the center of the bin
> ps.cent
 [1] 0.0005418957 0.0009868040 0.0017254545 0.0029103746 0.0047524364
 [6] 0.0075325510 0.0116083397 0.0174078328 0.0253967142 0.0359988789
[11] 0.0494450583 0.0655288551 0.0832789134 0.1006305707 0.1143085230
[16] 0.1202647955 0.1149865305 0.0975322358 0.0712125315 0.0431169222
[21] 0.0206762531 0.0074246320 0.0018651941
# Ratio of the probability and the density
> ps/ps.cent
 [1] 1.0141963 1.0122663 1.0106767 1.0093364 1.0081757 1.0071382 1.0061760
 [8] 1.0052459 1.0043084 1.0033266 1.0022682 1.0011098 0.9998461 0.9985051
[15] 0.9971745 0.9960440 0.9954712 0.9960845 0.9989402 1.0057639 1.0193271
[22] 1.0440495 1.0870127

これらの比率はすべて 1 に近いため、2 つの方法は本質的に同じ可能性を最大化しようとしています。

score 2 · Accepted Answer

より良い方法で再形成する方法があると確信していますが、これはうまくいくかもしれません。

library('fitdistrplus')
    library('reshape2')



    a = matrix(c(7,12,10,10,20,3,15,40,33,57,58,60,79,132,174,201,191,184,115,70,22,2,0),1,23)
    colnames(a) <- c(13.5,14.5,15.5,16.5,17.5,18.5,19.5,20.5,21.5,22.5,23.5,24.5,25.5,26.5,27.5,
                     28.5,29.5,30.5,31.5,32.5,33.5,34.5,35.5)

    barplot(a)

    a2 = melt(a)
    a3= (rep(a2[,2],a2[,3]))

    fitdist(a3, "weibull")

descdist(a3,boot=5000)

r - 表データの 2 パラメーター ワイブル分布の近似

2 に答える 2

Related

Reference

r - 表データの 2 パラメーターワイブル分布の近似