ruby - Is there a built-in lazy Hash in Ruby?

Question

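I need to populate a Hash with various values. Some of the values are accessed fairly often, and others are accessed only rarely.

The problem is that each value takes some computation to obtain, so populating the Hash becomes very slow when there are many keys.

Using some sort of cache is not an option in my case.

Is there a way to make the Hash compute a value only when its key is first accessed, rather than when the key is added?

That way, rarely used values would not slow down the filling process.

I'm looking for something that is "kind of async", or lazily accessed.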

score 12 · Accepted Answer

There are many ways to approach this. I recommend using an instance of a class you define instead of a Hash. For example, instead of...

# Example of slow code using regular Hash.
h = Hash.new
h[:foo] = some_long_computation
h[:bar] = another_long_computation
# Access value.
puts h[:foo]

...create your own class and define methods, like this...

class Config
  def foo
    some_long_computation
  end

  def bar
    another_long_computation
  end
end

config = Config.new
puts config.foo
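
Note that each call to config.foo above runs the computation again. If repeated calls should reuse the result, one common option is to memoize inside the method. A minimal sketch, where MemoizedConfig, foo_runs, and expensive_foo are hypothetical names standing in for the real computation:

```ruby
# Hypothetical variant of Config that remembers the result of foo.
class MemoizedConfig
  attr_reader :foo_runs   # counts how often the slow path ran (demo only)

  def initialize
    @foo_runs = 0
  end

  def foo
    # defined? keeps a cached nil or false from being recomputed.
    return @foo if defined?(@foo)

    @foo_runs += 1
    @foo = expensive_foo
  end

  private

  # Placeholder for some_long_computation.
  def expensive_foo
    (1..1_000).reduce(:+)
  end
end

memo = MemoizedConfig.new
first_result = memo.foo
second_result = memo.foo   # returned from @foo, expensive_foo is not called again
```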

If you want a simple way to cache the long computations, or if it absolutely must be a Hash rather than your own class, you can wrap the Config instance in a Hash.

config = Config.new
h = Hash.new {|h,k| h[k] = config.send(k) }
# Access foo.
puts h[:foo]
puts h[:foo]  # Not computed again. Cached from previous access.

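One issue with the above example is that h.keys will not include :bar, because you haven't accessed it yet. So you can't, for example, iterate over all the keys or entries in h, because they don't exist until they are actually accessed. Another potential issue is that your keys need to be valid Ruby identifiers, so arbitrary string keys containing spaces won't work when you define them on Config.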
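
Both the caching and the missing-keys behavior can be checked with a small sketch. DemoConfig and its calls counter are hypothetical stand-ins for Config and the slow computations, added only to make the behavior observable:

```ruby
# Hypothetical stand-in for Config; `calls` only counts how often work runs.
class DemoConfig
  attr_reader :calls

  def initialize
    @calls = Hash.new(0)
  end

  def foo
    @calls[:foo] += 1
    :foo_value
  end

  def bar
    @calls[:bar] += 1
    :bar_value
  end
end

demo_config = DemoConfig.new
# The default block runs only for missing keys and stores what it computes.
lazy = Hash.new { |hash, key| hash[key] = demo_config.public_send(key) }

lazy[:foo]
lazy[:foo]                           # served from the hash, foo is not called again
foo_calls = demo_config.calls[:foo]
known_keys = lazy.keys               # :bar is absent until it is accessed
```

Iterating with each or keys never triggers the default block, which is why unaccessed entries stay invisible.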

If this matters to you, there are various ways to handle it. One way is to populate the hash with thunks and force each thunk on access.

class HashWithThunkValues < Hash
  def [](key)
    val = super
    if val.respond_to?(:call)
      # Force the thunk to get actual value.
      val = val.call
      # Cache the actual value so we never run long computation again.
      self[key] = val
    end

    val
  end
end

h = HashWithThunkValues.new
# Populate hash.
h[:foo] = ->{ some_long_computation }
h[:bar] = ->{ another_long_computation }
h["invalid Ruby name"] = ->{ a_third_computation }  # Some key that's an invalid ruby identifier.
# Access hash.
puts h[:foo]
puts h[:foo]  # Not computed again. Cached from previous access.
puts h.keys  #=> [:foo, :bar, "invalid Ruby name"]

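One caveat with this last example is that it won't work if your values are themselves callable, because it can't tell the difference between a thunk that needs to be forced and a plain value.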
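
To make the caveat concrete, here is a small sketch. ThunkForcingHash is a hypothetical name for a class with the same [] logic as HashWithThunkValues above, repeated so the snippet runs on its own:

```ruby
# Same forcing logic as HashWithThunkValues, under a hypothetical name.
class ThunkForcingHash < Hash
  def [](key)
    val = super
    if val.respond_to?(:call)
      val = val.call
      self[key] = val
    end
    val
  end
end

h = ThunkForcingHash.new
h[:adder] = ->(x) { x + 2 }    # meant as a value, not as a thunk
h[:callback] = -> { :done }    # also meant as a value

begin
  h[:adder]
  outcome = :returned
rescue ArgumentError
  outcome = :argument_error    # the lambda was called with no arguments
end

callback = h[:callback]        # the lambda was called and replaced by its result
```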

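Again, there are ways to handle this. One way is to store a flag that marks whether a value has been evaluated, but that requires extra memory for every entry. A better way is to define a new class that marks a hash value as an unevaluated thunk.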

class Unevaluated < Proc
end

class HashWithThunkValues < Hash
  def [](key)
    val = super

    # Only call if it's unevaluated.
    if val.is_a?(Unevaluated)
      # Force the thunk to get actual value.
      val = val.call
      # Cache the actual value so we never run long computation again.
      self[key] = val
    end

    val
  end
end

# Now you must populate like so.
h = HashWithThunkValues.new
h[:foo] = Unevaluated.new { some_long_computation }
h[:bar] = Unevaluated.new { another_long_computation }
h["invalid Ruby name"] = Unevaluated.new { a_third_computation }  # Some key that's an invalid ruby identifier.
h[:some_proc] = Unevaluated.new { Proc.new {|x| x + 2 } }

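The downside of this is that you have to remember to use Unevaluated.new when populating your hash. If you want all values to be lazy, you could also override []=. I don't think it would actually save much typing, because you would still need to use Proc.new, proc, lambda, or ->{} to create the block in the first place. But it might be worthwhile. If you did, it might look something like this.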

class HashWithThunkValues < Hash
  def []=(key, val)
    super(key, val.respond_to?(:call) ? Unevaluated.new(&val) : val)
  end
end

So here's the full code.

class HashWithThunkValues < Hash

  # This can be scoped inside now since it's not used publicly.
  class Unevaluated < Proc
  end

  def [](key)
    val = super

    # Only call if it's unevaluated.
    if val.is_a?(Unevaluated)
      # Force the thunk to get actual value.
      val = val.call
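      # Cache the actual value so we never run long computation again.
      # Use Hash#store, not self[key] = val: the []= override below would
      # re-wrap a callable result (such as :some_proc) as Unevaluated.
      store(key, val)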
    end

    val
  end

  def []=(key, val)
    super(key, val.respond_to?(:call) ? Unevaluated.new(&val) : val)
  end

end

h = HashWithThunkValues.new
# Populate.
h[:foo] = ->{ some_long_computation }
h[:bar] = ->{ another_long_computation }
h["invalid Ruby name"] = ->{ a_third_computation }  # Some key that's an invalid ruby identifier.
h[:some_proc] = ->{ Proc.new {|x| x + 2 } }
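
As a rough check of the intended behavior, here is a self-contained sketch. LazyValueHash and the runs counter are hypothetical names, and writing the forced value back with Hash#store is an assumption of this sketch: it keeps the []= override from wrapping a callable result a second time.

```ruby
class LazyValueHash < Hash
  # Marker for values that still need to be computed.
  class Unevaluated < Proc; end

  def [](key)
    val = super
    return val unless val.is_a?(Unevaluated)

    # store is Hash's own writer, so the []= override is bypassed here.
    store(key, val.call)
  end

  def []=(key, val)
    super(key, val.respond_to?(:call) ? Unevaluated.new(&val) : val)
  end
end

runs = 0
h = LazyValueHash.new
h[:slow] = -> { runs += 1; :slow_value }
h["any string key"] = -> { :other }
h[:some_proc] = -> { proc { |x| x + 2 } }

keys_before = h.keys          # every key exists before anything is computed
first = h[:slow]
second = h[:slow]             # cached, the lambda is not run again
adder = h[:some_proc]         # forced once, yields the inner proc
adder_again = h[:some_proc]   # still the inner proc, it is not called
```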

score 3 · Accepted Answer

You can use this:

class LazyHash < Hash

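  def [] key
    # Pending lazy values live in @self; delete returns the stored proc or nil.
    thunk = (@self || {}).delete(key)
    # Force the proc once and store its result, so the first access already
    # returns the value. Otherwise fall back to the normal Hash lookup.
    thunk ? (self[key] = thunk.call) : super
  end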

  def lazy_update key, &proc
    (@self ||= {})[key] = proc
    self[key] = proc
  end

end

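The lazy hash behaves just like a normal Hash, because it actually is a real Hash.

See the live demo here.

***UPDATE - Answer to the nested procs question***

Yes, it would work, but it is cumbersome.

See the updated answer.

Use lazy_update instead of []= to add "lazy" values to the hash.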
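
The difference between the two writers can be sketched as follows. LazyHashSketch and @pending are hypothetical names for a simplified reimplementation of the same idea, so this is an illustration rather than the class from the answer:

```ruby
class LazyHashSketch < Hash
  def [](key)
    # delete returns the pending block for this key, or nil if there is none.
    thunk = (@pending || {}).delete(key)
    thunk ? (self[key] = thunk.call) : super
  end

  def lazy_update(key, &block)
    (@pending ||= {})[key] = block
    self[key] = block   # placeholder so the key shows up in keys
  end
end

h = LazyHashSketch.new
h.lazy_update(:answer) { 6 * 7 }
h[:callback] = -> { :kept_as_is }   # plain []= stores the lambda itself

keys = h.keys
answer = h[:answer]       # forced on first access
callback = h[:callback]   # returned untouched
```

Because only values registered through lazy_update are forced, a Proc stored with []= stays a Proc, which is what makes nested procs workable.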

score 2 · Accepted Answer

You can define your own indexer with something like this:

class MyHash
  def initialize
    @cache = {}
  end

  def [](key)
    @cache[key] || (@cache[key] = compute(key))
  end

  def []=(key, value)
    @cache[key] = value
  end

  def compute(key)
    @cache[key] = 1
  end
end

Use it like this:

1.9.3p286 :014 > hash = MyHash.new
 => #<MyHash:0x007fa0dd03a158 @cache={}> 

1.9.3p286 :019 > hash["test"]
 => 1 

1.9.3p286 :020 > hash
 => #<MyHash:0x007fa0dd03a158 @cache={"test"=>1}>
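
One thing to watch with @cache[key] || ...: a cached nil or false counts as a miss, so such values are recomputed on every access. A possible variant uses Hash#fetch with a block, which only runs the block when the key is absent. ComputingCache and the block passed to it are hypothetical names for this sketch:

```ruby
class ComputingCache
  # The block given to new plays the role of compute(key).
  def initialize(&compute)
    @cache = {}
    @compute = compute
  end

  def [](key)
    # fetch runs the block only for a missing key, so cached nil/false are kept.
    @cache.fetch(key) { @cache[key] = @compute.call(key) }
  end

  def []=(key, value)
    @cache[key] = value
  end
end

calls = 0
cache = ComputingCache.new do |key|
  calls += 1
  key == "missing" ? nil : key.upcase
end

cache["missing"]
cache["missing"]            # nil is cached, the block does not run again
calls_after_nil = calls
upcased = cache["test"]
```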

score -1 · Accepted Answer

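This isn't strictly an answer to the body of the question, but Enumerable::Lazy will definitely be part of Ruby 2.0 (it shipped there as Enumerator::Lazy). It lets you lazily evaluate compositions of iterators.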

lazy = [1, 2, 3].lazy.select(&:odd?)
# => #<Enumerator::Lazy: #<Enumerator::Lazy: [1, 2, 3]>:select>
lazy.to_a
# => [1, 3]
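
Laziness matters most when the source is unbounded. A short sketch, using an infinite range chosen purely for illustration:

```ruby
# Nothing is computed until elements are actually requested.
evens_squared = (1..Float::INFINITY).lazy
                                    .select(&:even?)
                                    .map { |n| n * n }

# first(3) pulls only as many elements through the chain as it needs.
first_three = evens_squared.first(3)
```

Without .lazy, select would try to walk the whole infinite range and never return.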
