r - Extract data from row names and insert it into the table

Question

I have a data frame like this:

                ML1    ML1 SD       ML2    ML2 SD ...
aPhysics0 0.8730469 0.3329205 0.5950521 0.4908820
aPhysics1 0.8471074 0.3598839 0.6473829 0.4777848
aPhysics2 0.8593750 0.3476343 0.7031250 0.4568810
aPhysics3 0.8875000 0.3159806 0.7000000 0.4582576
aPhysics4 0.7962963 0.4027512 0.7654321 0.4237285
...

Using the row names, I would like to create a data frame like this:

     Institution Subject Class       ML1    ML1 SD       ML2    ML2 SD ...
[1,]           A Physics     0 0.8730469 0.3329205 0.5950521 0.4908820
[2,]           A Physics     1 0.8471074 0.3598839 0.6473829 0.4777848
[3,]           A Physics     2 0.8593750 0.3476343 0.7031250 0.4568810
[4,]           A Physics     3 0.8875000 0.3159806 0.7000000 0.4582576
[5,]           A Physics     4 0.7962963 0.4027512 0.7654321 0.4237285
...

What is the best way to do this?

score 3 · Accepted Answer

Assuming your row names have the form (one lowercase letter, a string of letters, one digit), you can use the following regular expression with gsub.

#test data
x <- data.frame(ML1=runif(5),ML2=runif(5),row.names=paste0("aPhysics",1:5))

#logic: group 1 = institution letter, group 2 = subject, group 3 = class digit
pattern <- "^([a-z])([a-zA-Z]+)([0-9])$"
transform(x,
          Institution = toupper(gsub(pattern, "\\1", rownames(x))),
          Subject     = gsub(pattern, "\\2", rownames(x)),
          Class       = gsub(pattern, "\\3", rownames(x)))
                 ML1       ML2 Institution Subject Class
aPhysics1 0.51680701 0.4102757           A Physics     1
aPhysics2 0.60388358 0.7438400           A Physics     2
aPhysics3 0.26504243 0.7598557           A Physics     3
aPhysics4 0.55900273 0.5263205           A Physics     4
aPhysics5 0.05589591 0.7903568           A Physics     5
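
The output above appends the new columns on the right and leaves Class as text, whereas the question shows them on the left. A minimal sketch of one way to get that layout, assuming the same row-name format, is to capture all three pieces in one pass with utils::strcapture (available in R 3.4.0 and later). The sample values and the names parts and out are made up for illustration.

```r
# Small illustrative data set (values are made up)
x <- data.frame(ML1 = c(0.87, 0.85), ML2 = c(0.60, 0.65),
                row.names = c("aPhysics0", "aPhysics1"))

# Capture the three groups at once; the prototype fixes column names and types
parts <- strcapture("^([a-z])([a-zA-Z]+)([0-9])$", rownames(x),
                    proto = data.frame(Institution = character(),
                                       Subject = character(),
                                       Class = integer(),
                                       stringsAsFactors = FALSE))
parts$Institution <- toupper(parts$Institution)

# Put the extracted columns first, then drop the original row names
out <- cbind(parts, x)
rownames(out) <- NULL
out
```

Row names that do not match the pattern come back as NA in all three new columns, which makes unexpected formats easy to spot.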

score 3 · Accepted Answer

Assuming your data.frame is called df:

# Pad "Physics" with spaces, then split each row name into its three parts
header <- as.data.frame(do.call(rbind, strsplit(gsub("Physics", " Physics ", 
                rownames(df)), " ")))
names(header) <- c("Institution", "Subject", "Class")
# Prepend the new columns and upper-case the institution letter
df.out <- cbind(header, df)
df.out$Institution <- toupper(df.out$Institution)

If you have more subjects (a generalized solution):

header <- as.data.frame(do.call(rbind, strsplit(gsub("^([a-z])(.*)([0-9])$", 
                 "\\1 \\2 \\3", rownames(df)), " ")))
names(header) <- c("Institution", "Subject", "Class")
df.out <- cbind(header, df)
df.out$Institution <- toupper(df.out$Institution)
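
One caveat with the generalized pattern: because (.*) is greedy and the class group matches exactly one digit, a row name such as "aMaths12" would be split into Subject "Maths1" and Class "2". If class numbers can have more than one digit, a possible variant (a sketch, not part of the original answer) is to restrict the subject to letters and allow one or more digits. The row names below are hypothetical.

```r
# Hypothetical row names: several subjects and a two-digit class number
rn <- c("aPhysics0", "bChemistry3", "aMaths12")

# Letters-only subject, one or more digits for the class
pattern <- "^([a-z])([A-Za-z]+)([0-9]+)$"
header <- as.data.frame(do.call(rbind,
                                strsplit(sub(pattern, "\\1 \\2 \\3", rn), " ")),
                        stringsAsFactors = FALSE)
names(header) <- c("Institution", "Subject", "Class")
header$Institution <- toupper(header$Institution)
header$Class <- as.integer(header$Class)  # store the class as a number
header
```

This assumes subject names contain only letters; names with spaces or digits would need a different pattern.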
