0

式のリストを使用して新しいフィールドをコーディングしたいと考えています。

私のデータフレームでは、Bisaccategory1 に本のカテゴリの完全な説明が含まれています。このフィールドの部分的な値を表す特定の文字列を使用して、「ジャンル」と呼ばれる新しいフィールドを定義できます。特定のジャンルの 1 つは「ノンフィクション」で、これは 25 の一意の完全な説明にマップされます。それらに含まれる特定のパターンを指定することで、これらの完全な説明を識別できます。

  nonfiction<-c("BIOGRAPHY & AUTOBIOGRAPHY","BODY, MIND & SPIRIT","BUSINESS & ECONOMICS","COMICS & GRAPHIC NOVELS",
                  "COMPUTERS","COOKING","FAMILY & RELATIONSHIPS","HEALTH & FITNESS","HISTORY","HOUSE & HOME","HUMOR",
                  "LITERARY CRITICISM","NATURE","PERFORMING 
ARTS","PETS","PHOTOGRAPHY","POETRY","POLITICAL SCIENCE","RELIGION",
                      "SCIENCE","SELF-HELP","SOCIAL SCIENCE","SPORTS & RECREATION","TRANSPORTATION","TRUE CRIME")

これらの文字列を照合して、次のように Biscategory1 値を完成させることができます。

matches <- unique (grep(paste(nonfiction,collapse="|"), 
                                detail$Bisaccategory1, value=TRUE))

しかし、これらの「一致」を使用して、「ノンフィクション」という値を新しいジャンル フィールドに割り当てる方法がよくわかりません。

これはサンプルデータです:

structure(list(Author = c("James Swallow", "Billy Crystal", "Mark Divine", 
"Charles Cumming", "Victoria Schwab", "Louise Penny", "Elizabeth Warren", 
"Linda Castillo", "Paul Fischer", "Sandy Hall", "Louise Penny", 
"Louise Penny", "Lisa Scottoline", "Linda Castillo", "Evan Osnos", 
"Porter Erisman"), Title = c("24: Deadline", "700 Sundays - Still Foolin' 'Em", 
"8 Weeks to Sealfit", "A Colder War", "A Dark Shade of Magic", 
"A Fatal Grace", "A Fighting Chance", "A Hidden Secret", "A Kim Jong-Il Production", 
"A Little Something Different", "A Rule Against Murder", "A Trick of the Light", 
"Accused", "After the Storm", "Age of Ambition", "Alibaba's World"
), Bisac = c("FICTION / Thrillers / General", "BIOGRAPHY & AUTOBIOGRAPHY / Entertainment & Performing Arts", 
"HEALTH & FITNESS / Exercise", "FICTION / Thrillers / Espionage", 
"FICTION / Fantasy / Historical", "FICTION / Mystery & Detective / Traditional", 
"BIOGRAPHY & AUTOBIOGRAPHY / Political", "FICTION / Mystery & Detective / Police Procedural", 
"HISTORY / Asia / Korea", "JUVENILE FICTION / Love & Romance", 
"FICTION / Mystery & Detective / Traditional", "FICTION / Mystery & Detective / Traditional", 
"FICTION / Thrillers / Legal", "FICTION / Mystery & Detective / Police Procedural", 
"HISTORY / Asia / China", "BUSINESS & ECONOMICS / E-Commerce / General"
)), .Names = c("Author", "Title", "Bisac"), class = "data.frame", row.names = c(NA, 
-16L))

私は次のようなことができることを知っています:

df$Genre[Bisaccategory1=="BODY, MIND & SPIRIT / Inspiration & Personal Growth"]<-"nonfiction"

しかし、何百ものカテゴリがあり、これは実際にはスケーラブルではありません。提案をいただければ幸いです。

4

2 に答える 2