I want to use LDA from Python via RPy2. I have already tried this with the gensim package, but I would still like to try RPy2.
When working in R, I use the following code:
library(RTextTools)
library(topicmodels)
library(tm)
# ... get data here and store it in `data` ...
matrix <- create_matrix(as.vector(data$body),
                        language = "english",
                        removeNumbers = TRUE,
                        removePunctuation = TRUE,
                        stemWords = FALSE,
                        weighting = weightTf)
mat <- as.matrix(matrix)
list <- rowSums(matrix)
rowTotals <- apply(matrix , 1, sum)
matrix.new <- matrix[rowTotals > 0, ]
lda <- LDA(matrix.new, 250)
I want to convert the code above to Python using RPy2. This is what I have tried so far:
import rpy2.robjects as robjects
from rpy2.robjects.packages import importr
RTT = importr('RTextTools')
tmod = importr('topicmodels')
tm = importr('tm')
# CommentBunch is a list of strings.
matrix = RTT.create_matrix(CommentBunch,
                           language = "english",
                           removeNumber = True,
                           removePunctuation = True,
                           stemWords = True,
                           weighting = tm.weightTf)
lda = tmod.LDA(matrix, 250)
Here is the debug log:
Connection to database established!
Error in (function (x, k, method = "VEM", control = NULL, model = NULL, :
Each row of the input matrix needs to contain at least one non-zero entry
Traceback (most recent call last):
File "C:\Requirements\Python27\lib\site-packages\rpdb2.py", line 14499, in <module>
ret = rpdb2.main()
File "C:\Requirements\Python27\lib\site-packages\rpdb2.py", line 14470, in main
StartServer(_rpdb2_args, fchdir, _rpdb2_pwd, fAllowUnencrypted, fAllowRemote, secret)
File "C:\Requirements\Python27\lib\site-packages\rpdb2.py", line 14220, in StartServer
imp.load_source('__main__', _path)
File "c:\requirements\msr\msr14-mysql\rpylda.py", line 69, in <module>
lda = tmod.LDA(matrix, 20)
File "C:\Requirements\Python27\lib\site-packages\rpy2\robjects\functions.py", line 86, in __call__
return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
File "C:\Requirements\Python27\lib\site-packages\rpy2\robjects\functions.py", line 35, in __call__
res = super(Function, self).__call__(*new_args, **new_kwargs)
rpy2.rinterface.RRuntimeError: Error in (function (x, k, method = "VEM", control
= NULL, model = NULL, :
Each row of the input matrix needs to contain at least one non-zero entry
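As far as I can tell, the error is about documents that end up with no terms at all once numbers and punctuation are stripped. Below is a minimal plain-Python sketch of that check, without RPy2. The helper names `term_counts` and `drop_empty_documents` and the sample `comments` list are hypothetical, and the tokenizer is only a rough stand-in for what `create_matrix` does (the real preprocessing may empty additional documents).

```python
import re


def term_counts(doc):
    """Count lowercase alphabetic tokens, ignoring digits and punctuation."""
    counts = {}
    for token in re.findall(r"[a-z]+", doc.lower()):
        counts[token] = counts.get(token, 0) + 1
    return counts


def drop_empty_documents(docs):
    """Return (kept_docs, kept_indices) for documents with at least one term."""
    kept_docs, kept_indices = [], []
    for index, doc in enumerate(docs):
        # Same idea as `rowTotals > 0` in the R code: keep non-empty rows only.
        if sum(term_counts(doc).values()) > 0:
            kept_docs.append(doc)
            kept_indices.append(index)
    return kept_docs, kept_indices


# Hypothetical sample data standing in for CommentBunch.
comments = ["Fix the build", "12345", "", "!!!", "Add tests for parser"]
kept, indices = drop_empty_documents(comments)
print(kept)     # ['Fix the build', 'Add tests for parser']
print(indices)  # [0, 4]
```

Keeping the original indices makes it possible to map topics back to the source comments after the empty ones are dropped.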
How can I convert the R code to Python code using RPy2? Please help!