ruby - How can I search for strings in a very large txt file in Ruby?

Translated from: https://stackoverflow.com/questions/19752886 2013-11-03T12:21:05.717

Viewed 56 times

I have run into a problem that I cannot find a good way to solve.

Problem description:

File 1: short_map.txt contains over 2 million lines; each line consists of a short URL (like the ones used on Twitter) and its corresponding full web URL.

(Example: " http://bit.ly/18sy7Fzhttp://www.london24.com/spurs_star_townsend_deemed_hodgson_joke_a_compliment_1_2903643?utm_source=Daily+News&utm_medium=twitter " )

File 2: html_index.txt contains about 50k lines; each line is a full web URL.

(Example: " http://www.redbubble.com/people/tipptoggy/works/10898437-rock-of-cashel ")

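For each web URL in html_index.txt, I want to get the corresponding short URL and write it to a new txt file.

My approach is to read each line of html_index.txt and compare it against every line of short_map.txt. This finds everything I need, but it is far too slow.

Can anyone help me with a faster algorithm for this?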

Solved: using a hash table works. See the first answer. Thanks!
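
For readers who want to see the idea, here is a minimal sketch of a hash-based lookup. It is not the code from the answer, just one way it might look. Comparing every line against every line costs up to roughly 50k × 2 million string comparisons, while a hash lookup is a single step per line. The sketch indexes the smaller file (html_index.txt) in a Hash and streams short_map.txt once. The method names (`build_lookup`, `match_short_urls`, `run`) and the output layout are made up for illustration, and it assumes the short URL and the full URL on each line are separated by whitespace, which the question does not actually show.

```ruby
# Build a Hash keyed by the full URLs we are interested in.
# Values start as nil and are filled in when a short URL is found.
def build_lookup(wanted_lines)
  wanted_lines.each_with_object({}) do |line, lookup|
    url = line.strip
    lookup[url] = nil unless url.empty?
  end
end

# Stream the big map once; each line costs one hash lookup.
# ASSUMPTION: a line looks like "<short url><whitespace><full url>".
# Adjust the split if the real separator is different.
def match_short_urls(map_lines, lookup)
  map_lines.each do |line|
    short, full = line.strip.split(/\s+/, 2)
    next if full.nil?                      # skip malformed lines
    # Keep the first short URL seen for a given full URL.
    lookup[full] = short if lookup.key?(full) && lookup[full].nil?
  end
  lookup
end

# Hypothetical driver: reads both files line by line and writes
# "short<TAB>full" for every full URL that had a match.
def run(map_path, index_path, out_path)
  lookup = build_lookup(File.foreach(index_path))
  match_short_urls(File.foreach(map_path), lookup)
  File.open(out_path, "w") do |out|
    lookup.each { |full, short| out.puts("#{short}\t#{full}") if short }
  end
end

# Example call (file names from the question, output name is made up):
# run("short_map.txt", "html_index.txt", "result.txt")
```

Indexing the 50k-line file keeps memory use small, because the 2-million-line file is never held in memory. If several short URLs map to the same full URL and all of them are needed, the Hash values could be arrays instead.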
