r - How can knitr's cached results be used to reproduce the environment at a given chunk?

Question

tl;dr

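My question: Within an R session, is there a way to use knitr's cached results to "fast-forward" to the environment (i.e. the set of objects) available at a given code block, the way knit() itself does?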

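The setup:

knitr's built-in caching of code chunks is one of its killer features.

It is especially useful when some chunks contain time-consuming calculations. Unless they (or a chunk they depend on) are changed, the calculations only need to be performed the first time the document is knitted: on all subsequent calls to knit, the objects created by the chunk are simply loaded from the cache.

Here is a minimal example, a file called "lotsOfComps.Rnw":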
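
As a rough illustration of that compute-once-then-reload pattern, here is a base-R sketch. It is not how knitr implements its cache: `cached_compute`, `slow`, and the `.rds` file are hypothetical stand-ins, and the sketch does no invalidation when the code changes, which knitr does.

```r
# Hypothetical helper: run `compute()` only if no cached file exists yet.
cached_compute <- function(cache_file, compute) {
  if (file.exists(cache_file)) {
    return(readRDS(cache_file))   # later calls: just load the stored result
  }
  result <- compute()             # first call: do the slow work...
  saveRDS(result, cache_file)     # ...and store it for next time
  result
}

cache_file <- tempfile(fileext = ".rds")

# Stand-in for a time-consuming computation
slow <- function() {
  Sys.sleep(0.5)
  sample(1:10, size = 2)
}

first  <- cached_compute(cache_file, slow)  # runs the computation
second <- cached_compute(cache_file, slow)  # read back from disk
identical(first, second)
```

The second call skips `slow()` entirely, which is also why the random draw comes back unchanged.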

\documentclass{article}
\begin{document}

The calculations in this chunk take a looooong time.

<<slowChunk, cache=TRUE>>=
Sys.sleep(30)  ## Stands in for some time-consuming computation
x <- sample(1:10, size=2)
@

I wish I could `fast-forward' to this chunk, to view the cached value of 
\texttt{x}

<<interestingChunk>>=
y <- prod(x)^2
y
@

\end{document}

Time required to knit and TeXify "lotsOfComps.Rnw":

## First time
system.time(knit2pdf("lotsOfComps.Rnw"))
##   user  system elapsed
##   0.07    0.02   31.81

## Second (and subsequent) runs
system.time(knit2pdf("lotsOfComps.Rnw"))
##   user  system elapsed
##   0.03    0.02    1.28

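My question:

Within an R session, is there a way to use knitr's cached results to "fast-forward" to the environment (i.e. the set of objects) available at a given code block, the way knit() itself does?

I can't simply run purl("lotsOfComps.Rnw") and then execute the code in "lotsOfComps.R", because all of the objects along the way would have to be recomputed.

Ideally, it would be possible to do something like the following to arrive at the environment that exists at the start of <<interestingChunk>>=: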

spin("lotsOfComps.Rnw", chunk="interestingChunk")
ls()
# [1] "x"
x
# [1] 3 8

Since spin() isn't (yet?) available, what is the best way to get an equivalent result?

score 6 · Accepted Answer

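This has to be one of the ugliest kludges I've written in a while...

The basic idea is to scan the .Rnw file for chunks, extract their names, detect which ones are cached, and determine which ones need to be loaded. Once that is done, we step through each chunk name that needs to be loaded, detect the database name in the cache folder, and load it using lazyLoad. After all of the chunks are loaded, we need to force evaluation. It's ugly and I'm sure it has some bugs, but it seems to work for the simple example you gave and a few other minimal examples I made. This assumes that the .Rnw file is in the current working directory...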

load_cache_until <- function(file, chunk, envir = parent.frame()){
    require(knitr)

    # kludge to detect chunk names, which come before the chunk of
    # interest, and which are cached... there has to be a nicer way...
    text <- readLines(file)
    chunks <- grep("^<<.*>>=", text, value = TRUE)
    chunknames <- gsub("^<<([^,>]*)[,>]*.*", "\\1", chunks)
    #detect unnamed chunks
    tmp <- grep("^\\s*$", chunknames)
    chunknames[tmp] <- paste0("unnamed-chunk-", seq_along(tmp))
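    id <- which(chunk == chunknames)
    # fail early if the requested chunk is missing or ambiguous
    if (length(id) != 1) stop("chunk '", chunk, "' not found (or not unique) in ", file)
    previouschunks <- chunknames[seq_len(id - 1)]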
    cachedchunks <- chunknames[grep("cache\\s*=\\s*T", chunks)]

    # These are the names of the chunks we want to load
    extractchunks <- cachedchunks[cachedchunks %in% previouschunks]

    oldls <- ls(envir, all = TRUE)
    # For each chunk...
    for(ch in extractchunks){   
        # Detect the file name of the database...
        # (assumes cache files are named <chunk name>_<hash>.rdb)
        pat <- paste0("^", ch, "_.*\\.rdb$")
        val <- sub("\\.rdb$", "", dir("cache", pattern = pat))
        # Lazy load the database
        lazyLoad(file.path("cache", val), envir = envir)
    }
    # Detect the new objects added
    newls <- ls(envir, all = TRUE)
    # Force evaluation...  There is probably a better way
    # to do this too...
    # (look the objects up in `envir`, not in lapply's calling frame)
    lapply(setdiff(newls, oldls), get, envir = envir)

    invisible()

}

load_cache_until("lotsOfComps.Rnw", "interestingChunk")
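
To make the name-detection step easier to follow in isolation, here is a small sketch that applies the same kind of pattern matching to a few hand-written chunk headers. The `headers` vector is made up for illustration, and the regular expressions are simplified versions of the ones in the function above.

```r
# Example chunk header lines as they might appear in an .Rnw file
headers <- c("<<slowChunk, cache=TRUE>>=",
             "<<interestingChunk>>=",
             "<<>>=")

# Capture everything after "<<" up to the first comma or ">"
labels <- sub("^<<([^,>]*).*$", "\\1", headers)

# Empty labels stand for unnamed chunks; number them in order of appearance
unnamed <- !nzchar(trimws(labels))
labels[unnamed] <- paste0("unnamed-chunk-", seq_len(sum(unnamed)))

# Chunks whose header asks for caching (matches cache=TRUE and cache=T)
cached <- grepl("cache\\s*=\\s*T", headers)

labels
cached
```

One known limitation of this approach: a header that has options but no label, such as <<cache=TRUE>>=, would have "cache=TRUE" mistaken for its name.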

Making the code more robust is left as an exercise for the reader.
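
On the "force evaluation" step: as far as I understand it, lazyLoad binds each object as a promise, so nothing is actually read until the object is first accessed. The sketch below imitates that behaviour with delayedAssign instead of a real cache database; `env`, `state`, and the value of `x` are purely illustrative.

```r
env   <- new.env()   # stands in for the environment the cache is loaded into
state <- new.env()   # used only to count how often the promise is evaluated
state$evaluations <- 0

# Bind "x" to a promise; the expression is not run yet
delayedAssign("x", {
  state$evaluations <- state$evaluations + 1
  c(3, 8)
}, assign.env = env)

"x" %in% ls(env)     # the name is already visible...
state$evaluations    # ...but the expression has not been evaluated

# Touch every object once to force the promises
for (nm in ls(env)) get(nm, envir = env)

state$evaluations    # evaluated once; later accesses reuse the value
```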

score -3 · Answer

These are exactly the same as the data files produced by save. Taking the knitr-cache example from its new location gives:

> library(knitr)
> knit("./005-latex.Rtex")
> load("cache/latex-my-cache_d9835aca7e54429f59d22eeb251c8b29.RData")
> ls()
 [1] "x"
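
For comparison, here is a self-contained sketch of the plain save/load round trip that this relies on. It uses a temporary file rather than a real knitr cache, and loads into a separate environment so the workspace is left alone. Whether a particular knitr cache file has this layout depends on the knitr version and is not checked here.

```r
x <- c(3, 8)

# Write the object to an .RData file, as save() would for any R object
rdata_file <- tempfile(fileext = ".RData")
save(x, file = rdata_file)

# Load it into a fresh environment instead of the global workspace
restored <- new.env()
loaded <- load(rdata_file, envir = restored)

loaded       # names of the objects that were restored
restored$x   # the restored value
```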
