I want to replace non-ASCII characters (for now, only Spanish ones) with their ASCII equivalents. For example, "á" should become "a", and so on.

I wrote the function below, and it works fine, but I would rather not use a loop (including implicit loops such as sapply).

latin2ascii <- function(x) {
  if (!is.character(x)) stop("input must be a character object")
  mapL <- c("á","é","í","ó","ú","Á","É","Í","Ó","Ú","ñ","Ñ","ü","Ü")
  mapA <- c("a","e","i","o","u","A","E","I","O","U","n","N","u","U")
  # Replace each accented character in turn with its ASCII equivalent
  for (y in seq_along(mapL)) {
    x <- stringr::str_replace_all(x, mapL[y], mapA[y])
  }
  x
}

Is there a more elegant way to do this? Any help, suggestions, or modifications are welcome.

2 Answers

gsubfn(), from the package of the same name, is really handy for this kind of thing.

library(gsubfn)

# Create a named list, in which:
#   - the names are the strings to be looked up
#   - the values are the replacement strings
mapL <- c("á","é","í","ó","ú","Á","É","Í","Ó","Ú","ñ","Ñ","ü","Ü")
mapA <- c("a","e","i","o","u","A","E","I","O","U","n","N","u","U")

# ll <- setNames(as.list(mapA), mapL) # An alternative to the 2 lines below
ll <- as.list(mapA)
names(ll) <- mapL


# Try it out
string <- "ÍÓáÚ"
gsubfn("[áéíóúÁÉÍÓÚñÑüÜ]", ll, string)
# [1] "IOaU"

Edit:

G. Grothendieck points out that base R also has a function for this:

A <- paste(mapA, collapse="")
L <- paste(mapL, collapse="")
chartr(L, A, "ÍÓáÚ")
# [1] "IOaU"
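
As a minimal sketch of how the chartr() approach could replace the loop in the question, the mapping can be wrapped in a function. The name latin2ascii is reused from the question, the two mapping strings are the collapsed forms of mapL and mapA, and the example inputs are made up for illustration. This assumes a UTF-8 session.

```r
# Sketch only: same mapping as in the question, applied with base R's chartr().
latin2ascii <- function(x) {
  if (!is.character(x)) stop("input must be a character object")
  mapL <- "áéíóúÁÉÍÓÚñÑüÜ"  # characters to look up
  mapA <- "aeiouAEIOUnNuU"  # replacements, position by position
  # chartr() translates character by character and is vectorized over x
  chartr(mapL, mapA, x)
}

latin2ascii(c("ÍÓáÚ", "año", "pingüino"))
# [1] "IOaU"     "ano"      "pinguino"
```

Because chartr() works on a whole character vector at once, no explicit or implicit loop is needed. It only handles one-to-one character replacements, which is all this mapping requires.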
Answered 2012-05-22T15:47:22.307

I like Josh's version, but thought I would add another "vectorized" solution. It returns a vector of unaccented strings, and it relies only on base functions.

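x <- c('íÁuÚ', 'uíÚÁ')

mapL <- c("á","é","í","ó","ú","Á","É","Í","Ó","Ú","ñ","Ñ","ü","Ü")
mapA <- c("a","e","i","o","u","A","E","I","O","U","n","N","u","U")
# Split each string into single characters
chars <- strsplit(x, split = '')
# Position of each character in mapL (NA if the character is not in the map)
m <- lapply(chars, match, mapL)
# Swap in the ASCII equivalent where there is a match, then reassemble
mapply(function(ch, m) paste(ifelse(is.na(m), ch, mapA[m]), collapse = ''), chars, m)
# "iAuU" "uiUA"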
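
The split-and-match steps above could also be packaged as a reusable function, roughly as sketched below. The name strip_accents is hypothetical, the third example input is made up, and the code assumes a UTF-8 session so that strsplit() splits on whole characters.

```r
# Sketch only: strip_accents is a hypothetical name, not part of any package.
strip_accents <- function(x) {
  stopifnot(is.character(x))
  mapL <- c("á","é","í","ó","ú","Á","É","Í","Ó","Ú","ñ","Ñ","ü","Ü")
  mapA <- c("a","e","i","o","u","A","E","I","O","U","n","N","u","U")
  vapply(strsplit(x, split = ""), function(chars) {
    idx <- match(chars, mapL)      # NA where the character is not in mapL
    hit <- !is.na(idx)
    chars[hit] <- mapA[idx[hit]]   # overwrite only the matched characters
    paste(chars, collapse = "")
  }, character(1))
}

strip_accents(c("íÁuÚ", "uíÚÁ", "niño"))
# [1] "iAuU" "uiUA" "nino"
```

Like the lapply() and mapply() calls it is based on, vapply() still iterates internally, so this is a tidier packaging of the same idea, not a loop-free solution.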
Answered 2012-05-22T16:10:45.017