json - JSON to R for data mining

Question

I am trying to retrieve tweets with the Topsy Otter API so that I can do some data mining for my dissertation.

So far I have:

library(RJSONIO)
library(RCurl)
tweet_data <- getURL("http://otter.topsy.com/search.json?q=PSN&mintime=1301634000&perpage=10&maxtime=1304226000&apikey=xxx")
fromJSON(tweet_data)

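This works fine. However, I now want to return just two fields from this response: "content" and "trackback_date". I can't figure out how to do it. I have tried piecing together a few examples, but I couldn't extract what I need.

Here is what I have tried so far: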

trackback_date <- lapply(tweet_data$result, function(x){x$trackback_date})

content <- lapply(tweet_data$result, function(x){x$content})

Any help would be greatly appreciated, thanks.

Edit: I have also tried:

library("rjson")
# use rjson

tweet_data <- fromJSON(paste(readLines("http://otter.topsy.com/search.json?q=PSN&mintime=1301634000&perpage=10&maxtime=1304226000&apikey=xxx"), collapse=""))
# get a data from Topsy Otter API
# convert JSON data into R object using fromJSON()

trackback_date <- lapply(tweet_data$result, function(x){x$trackback_date})

content <- lapply(tweet_data$result, function(x){x$content})

score 5 · Accepted Answer

Basic processing of the Topsy Otter API response:

library(RJSONIO)
library(RCurl)
tweet_data <- getURL("http://otter.topsy.com/search.json?q=PSN&mintime=1301634000&perpage=10&maxtime=1304226000&apikey=xxx")

#
# Addition to your code
#
tweets <- fromJSON(tweet_data)$response$list
content <- sapply(tweets, function(x) x$content)
trackback_date <- sapply(tweets, function(x) x$trackback_date)
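To see why the attempts in the question returned nothing useful, here is a minimal offline sketch. It assumes the parsed response is nested as `response` -> `list` -> items, which is inferred only from the code above; the `parsed` object, its sample tweets, and the `raw` string are hypothetical stand-ins, not real API output.

```r
# Hypothetical stand-in for fromJSON(tweet_data); the layout
# (response -> list -> items) is inferred from the answer's code.
parsed <- list(response = list(list = list(
  list(content = "PSN is down again", trackback_date = 1301634100),
  list(content = "PSN outage update", trackback_date = 1301720500)
)))

# First attempt in the question: tweet_data was still the raw JSON text
# returned by getURL(), and `$` cannot be used on a character vector.
raw <- "{\"response\": {\"list\": []}}"
raw_attempt <- tryCatch(raw$result, error = function(e) "error")

# Second attempt: the data was parsed, but there is no top-level
# `result` element in this layout, so lapply() had nothing to iterate over.
missing_path <- parsed$result

# Following the response -> list path reaches the individual tweets.
tweets <- parsed$response$list
content <- sapply(tweets, function(x) x$content)
trackback_date <- sapply(tweets, function(x) x$trackback_date)
```

When the layout of a response is unknown, running `str(parsed, max.level = 2)` on the parsed object is a quick way to find the right path before writing the extraction code.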

Edit: handling multiple pages

This function fetches 100 items from the specified page:

pagetweets <- function(page){
  url <- paste("http://otter.topsy.com/search.json?q=PSN&mintime=1301634000&page=",page,
               "&perpage=100&maxtime=1304226000&apikey=xxx",
               collapse="", sep="")
  tweet_data <- getURL(url)
  fromJSON(tweet_data)$response$list
}

Now it can be applied to multiple pages:

tweets <- unlist(lapply(1:10, pagetweets), recursive=FALSE)

And voilà, this code:

content <- sapply(tweets, function(x) x$content)
trackback_date <- sapply(tweets, function(x) x$trackback_date)

returns 1000 records (10 pages of 100 items each).
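For the data mining step it is usually handier to hold both fields in one data frame. The following is a rough sketch, not part of the original answer: the `tweets` list here is a hypothetical two-item stand-in for the real one, and treating `trackback_date` as Unix epoch seconds is an assumption based on the `mintime` and `maxtime` values in the query URL.

```r
# Hypothetical stand-in for the `tweets` list collected above
tweets <- list(
  list(content = "PSN is down again", trackback_date = 1301634000),
  list(content = "PSN outage update", trackback_date = 1301720400)
)

# vapply() requires exactly one value of the stated type per tweet,
# so a record with a missing field fails loudly instead of silently.
content <- vapply(tweets, function(x) as.character(x$content), character(1))
trackback_secs <- vapply(tweets, function(x) as.numeric(x$trackback_date), numeric(1))

tweet_df <- data.frame(
  content = content,
  # Assumption: trackback_date holds Unix epoch seconds
  trackback_date = as.POSIXct(trackback_secs, origin = "1970-01-01", tz = "UTC"),
  stringsAsFactors = FALSE
)
```

With the real data, `nrow(tweet_df)` should match the number of records collected, and the date column can then be used for filtering or grouping by day.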
