r - Kmeans and SAS: how to get the proc fastclus outseed, converge, and strict options

Question

This is the SAS code I want to replicate in R:

proc fastclus data = in.stores_standard
maxclusters = 20
outseed= in.out_seed
maxiter = 1000
converge = 0 
strict=5.0; 
var storesize sales_per_sqft sales_per_visits tothhsinta;
id store_nbr;
run;

My attempt:

library(amap)
set.seed(1)
kmeans_object=Kmeans(stores_standard, 20, iter.max = 1000, nstart = 1, method = c("euclidean"))
p=do.call(rbind, kmeans_object)

What I can't achieve: 1) Run k-means on only the following variables: storesize, sales_per_sqft, sales_per_visits, tothhsinta

2) Keep store_nbr as the ID

3) An equivalent of the outseed option in R

Thanks!

score 4 · Accepted Answer

1) This is very easy: select only the columns you want before clustering.

want <- c("storesize", "sales_per_sqft", "sales_per_visits", "tothhsinta")
Kmeans(stores_standard[, want], 20, iter.max = 1000, nstart = 1,
       method = c("euclidean"))

For 2):

 ## a 2-dimensional example from ?Kmeans
 x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
            matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
 colnames(x) <- c("x", "y")
 cl <- Kmeans(x, 2)

Now look at cl:

R> str(cl)
List of 4
 $ cluster : int [1:100] 2 2 2 2 2 2 2 2 2 2 ...
 $ centers : num [1:2, 1:2] 1.0245 -0.017 1.0346 0.0375
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:2] "1" "2"
  .. ..$ : chr [1:2] "x" "y"
 $ withinss: num [1:2] 0.00847 0.22549
 $ size    : int [1:2] 50 50
 - attr(*, "class")= chr "kmeans"

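The cluster component of the list contains the assigned cluster IDs. They are in the same order as the samples in the input data. If you want to add the cluster component as a column of the input data, do the following: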

R> x <- cbind(x, Cluster = cl$cluster)
R> head(x)
               x            y Cluster
[1,] -0.24251497  0.532012889       2
[2,]  0.10957740  0.225168920       2
[3,] -0.35563544 -0.428798979       2
[4,] -0.41251306  0.529953489       2
[5,] -0.61212001 -0.003443993       2
[6,]  0.04435213  0.086595025       2

For your data, do the following:

stores_standard <- cbind(stores_standard, Cluster = kmeans_object$cluster)
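
Putting 1) and 2) together, here is a minimal sketch of the whole workflow. It assumes a simulated stand-in for stores_standard (the values and the store numbers are made up) and uses kmeans() from base R rather than amap::Kmeans(), so that it runs without extra packages. The idea is the same: cluster on the four variables only, then pair the assignments with store_nbr by row order.

```r
# Simulated stand-in for stores_standard; all values are made up for illustration.
set.seed(1)
stores_standard <- data.frame(
  store_nbr        = 1001:1200,
  storesize        = rnorm(200),
  sales_per_sqft   = rnorm(200),
  sales_per_visits = rnorm(200),
  tothhsinta       = rnorm(200)
)

want <- c("storesize", "sales_per_sqft", "sales_per_visits", "tothhsinta")

# Cluster on the four variables only; store_nbr is not passed to kmeans(),
# so it plays no part in the distance computation.
fit <- kmeans(stores_standard[, want], centers = 20, iter.max = 1000, nstart = 1)

# Assignments come back in input row order, so they line up with store_nbr.
result <- data.frame(store_nbr = stores_standard$store_nbr,
                     Cluster   = fit$cluster)
head(result)
```

Keeping store_nbr in a separate result table, instead of passing it to the clustering call, avoids the ID being treated as a numeric feature.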

As for 3), that does not appear to be possible with kmeans() in standard R or with Kmeans() in the amap package.
