
I have a data frame, dogs, that looks like this:

url 
https://en.wikipedia.org/wiki/Dog
https://en.wikipedia.org/wiki/Dingo
https://en.wikipedia.org/wiki/Canis_lupus_dingo

I want to pass all of the URLs to rvest, but I can't work out how.

I tried this:

dogstext <-html(dogs$url) %>%
    html_nodes("p:nth-child(4)") %>%
    html_text() 

But I got this error:

Error in UseMethod("parse") : 
  no applicable method for 'parse' applied to an object of class "factor"

2 Answers


As the error says, you need to convert the factor column to character before parsing.

dogs$url<-as.character(dogs$url)

Then the rest of your code can follow.

Update:
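
For reference, here is a minimal sketch of the conversion on its own. It assumes the data frame was built with factor columns, which was the default for data.frame() before R 4.0.0; the explicit stringsAsFactors = TRUE below is only there to reproduce that situation on newer versions of R.

```r
# Recreate a data frame whose url column is a factor.
# Assumption: stringsAsFactors = TRUE mimics the pre-R-4.0.0 default.
dogs <- data.frame(
  url = c("https://en.wikipedia.org/wiki/Dog",
          "https://en.wikipedia.org/wiki/Dingo",
          "https://en.wikipedia.org/wiki/Canis_lupus_dingo"),
  stringsAsFactors = TRUE
)

class(dogs$url)  # "factor"

# Convert the factor to a plain character vector.
dogs$url <- as.character(dogs$url)

class(dogs$url)  # "character"
```

Building the data frame with stringsAsFactors = FALSE in the first place avoids the conversion step altogether.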

dog<-data.frame(url=c("https://en.wikipedia.org/wiki/Dog","https://en.wikipedia.org/wiki/Dingo","https://en.wikipedia.org/wiki/Canis_lupus_dingo"))
> str(dog)
'data.frame':   3 obs. of  1 variable:
 $ url: Factor w/ 3 levels "https://en.wikipedia.org/wiki/Canis_lupus_dingo",..: 3 2 1
> lapply(as.character(dog$url),function(i)dogstext <-html(i) %>%
          html_nodes("p:nth-child(4)") %>%
            html_text() )
[[1]]
[1] "The domestic dog (Canis lupus familiaris or Canis familiaris) is a domesticated canid which has been selectively bred for millennia for various behaviors, sensory capabilities, and physical attributes.[2] The global dog population is estimated to between 700 million[3] to over one billion, thus making the dog the most abundant member of order Carnivora.[4]"

[[2]]
[1] "The dingo's habitat ranges from deserts to grasslands and the edges of forests. Dingoes will normally make their dens in deserted rabbit holes and hollow logs close to an essential supply of water."

[[3]]
character(0)
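
In more recent rvest releases, html() has been replaced by read_html(), so the same loop might look like the sketch below. The helper name scrape_text and the tryCatch fallback are my own additions for illustration, not part of rvest; the fallback returns NA for a page that cannot be fetched or parsed so that one bad URL does not stop the whole run.

```r
library(rvest)

dogs <- data.frame(
  url = c("https://en.wikipedia.org/wiki/Dog",
          "https://en.wikipedia.org/wiki/Dingo",
          "https://en.wikipedia.org/wiki/Canis_lupus_dingo"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: fetch one page and return the text of every node
# matching `css`. Returns NA_character_ if the page cannot be read.
scrape_text <- function(url, css = "p:nth-child(4)") {
  tryCatch(
    read_html(url) %>%
      html_nodes(css) %>%
      html_text(),
    error = function(e) NA_character_
  )
}

# Each call handles a single URL, so loop over the vector with lapply().
urls <- as.character(dogs$url)
dogstext <- lapply(urls, scrape_text)
names(dogstext) <- urls
```

A page with no node matching the selector still yields character(0), as in the third result above, so it may be worth checking lengths(dogstext) before using the results.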
Answered 2015-06-02T13:17:40.680