I have a data frame; suppose it looks like this:
names<-c("a","a","a","a","a","b","b","b","b","b","c","c","c","c","c","c","c","c")
var1<-c(0.942999593,0.935507266,0.973589623,0.969415912,0.95230801,0.935507266,0.888740961,0.91750551,0.944482672,0.945468585,1.457579147,0.922206277,0.941511433,0.954724791,0.941014244,0.941511433,0.941511433,1.50511433)
var2<-c(-0.012678088,0.014313763,0.001138275,-0.020568206,0.012987126,0.001217192,0.03360358,0.009758172,0.015066932,-0.037879492,0.020471157,0.010738162,0.010952531,0.019377213,0.027140572,0.031116892,-0.018530676,-8.90E-05)
# data.frame() keeps var1 and var2 numeric; cbind() would coerce every column to character
df <- data.frame(names, var1, var2)
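I would like to convert the outliers in columns var1 and var2 to NA. However, I want the outliers to be computed separately for each category in the names column. So the outliers of var1 for "a" would be the ones found using only the first 5 rows of var1.
The rule for detecting outliers is: any value that falls below the 0.25 quantile or above the 0.75 quantile of its group.
Is there an easy way to do this in R?
Thank you very much in advance.
Tina.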
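To make the rule concrete, here is a rough sketch of what I mean for a single group, using only the var1 values of "a". The helper name flag_outliers is just something I made up for illustration, and I am assuming the default quantile() method is acceptable:

```r
# var1 values for group "a" (the first five rows of the data above)
x <- c(0.942999593, 0.935507266, 0.973589623, 0.969415912, 0.95230801)

# Hypothetical helper: set to NA anything below the 0.25 quantile
# or above the 0.75 quantile of the vector it is given
flag_outliers <- function(v) {
  q <- quantile(v, probs = c(0.25, 0.75), na.rm = TRUE)
  v[v < q[1] | v > q[2]] <- NA
  v
}

# The 2nd and 3rd values fall outside the two quantiles and become NA
result <- flag_outliers(x)
result
```

What I am missing is a tidy way to apply this per category and to both columns at once.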