c# - Fastest way to read a large volume of data from a DB

Question

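I have a table containing about 500 million records. I am reading the data from the table and storing it in a Dictionary.

EDIT: I am loading the data into a dictionary because it needs to be compared with another volume of data coming from an indexing server.

My code is as follows: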

public static void GetDetailsFromDB()
{
    string sqlStr = "SELECT ID, Name ,Age, email ,DOB ,Address ,Affiliation ,Interest ,Homepage FROM Author WITH (NOLOCK) ORDER BY ID";
    SqlCommand cmd = new SqlCommand(sqlStr, _con);
    cmd.CommandTimeout = 0;

    using (SqlDataReader reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            //Author Class
            Author author = new Author();

            author.id = Convert.ToInt32(reader["ID"].ToString());
            author.Name = reader["Name"].ToString().Trim();
            author.age = Convert.ToInt32(reader["Age"].ToString());
            author.email = reader["email"].ToString().Trim();
            author.DOB = reader["DOB"].ToString().Trim();
            author.Address = reader["Address"].ToString().Trim();
            author.Affiliation = reader["Affiliation"].ToString().Trim();
            author.Homepage = reader["Homepage"].ToString().Trim();

            string interests = reader["Interest"].ToString().Trim();
            author.interest = interests.Split(new char[] { ',' }, StringSplitOptions.RemoveEmptyEntries).Select(p => p.Trim()).ToList();

            if (!AuthorDict.ContainsKey(author.id))
            {
                AuthorDict.Add(author.id, author);
            }

            if (AuthorDict.Count % 1000000 == 0)
            {
                Console.WriteLine("{0}M author loaded.", AuthorDict.Count / 1000000);
            }
        }
    }
}

This process takes a long time to read and store all 500 million records from the DB. RAM usage also becomes very high.

Can this be optimized? Can the running time be reduced? Any help is appreciated.

score 3 · Accepted Answer

Off the top of my head, I can come up with the following optimizations:

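Store the ordinal positions of the reader fields in local variables and use those ordinal variables to reference the fields.
Do not call ToString on the reader values and then convert them. Just get the value out in the correct type.
Check whether the author ID already exists in AuthorDict as soon as you have the ID. Do not create an Author instance if you do not need it.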

using (SqlDataReader reader = cmd.ExecuteReader())
{
    var idOrdinal = reader.GetOrdinal("ID");
    //extract other ordinal positions and store here

    while (reader.Read())
    {
        var id = reader.GetInt32(idOrdinal);

        if (!AuthorDict.ContainsKey(id))
        {
            Author author = new Author();
            author.id = id; // already read above, no need to ask the reader again
            ...
        }
    }
}
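
To make the three points above concrete without needing a database, here is a minimal sketch that runs against an in-memory DataTable instead of the real Author table. The cut-down Author class, the AuthorLoader helper, the sample rows and the choice to map NULL to "" or 0 are all assumptions made for illustration. It also assumes ID and Age are int columns and Name is a string column, since the typed getters throw if the column type does not match.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.Common;

// Hypothetical, cut-down version of the Author class from the question.
public sealed class Author
{
    public int Id;
    public string Name = "";
    public int Age;
}

public static class AuthorLoader
{
    // Reads every row from the reader into a dictionary keyed by ID.
    public static Dictionary<int, Author> Load(DbDataReader reader)
    {
        var authors = new Dictionary<int, Author>();

        // Resolve the column positions once, outside the loop.
        int idOrdinal = reader.GetOrdinal("ID");
        int nameOrdinal = reader.GetOrdinal("Name");
        int ageOrdinal = reader.GetOrdinal("Age");

        while (reader.Read())
        {
            int id = reader.GetInt32(idOrdinal);
            if (authors.ContainsKey(id))
            {
                continue; // duplicate ID: skip before allocating an Author
            }

            authors.Add(id, new Author
            {
                Id = id,
                // Typed getters throw on NULL, so check IsDBNull first.
                // Mapping NULL to "" or 0 is an assumption of this sketch.
                Name = reader.IsDBNull(nameOrdinal) ? "" : reader.GetString(nameOrdinal).Trim(),
                Age = reader.IsDBNull(ageOrdinal) ? 0 : reader.GetInt32(ageOrdinal),
            });
        }

        return authors;
    }

    // Small in-memory table standing in for the Author table.
    public static DataTable SampleTable()
    {
        var table = new DataTable();
        table.Columns.Add("ID", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Age", typeof(int));
        table.Rows.Add(1, " Ada ", 36);
        table.Rows.Add(2, DBNull.Value, DBNull.Value); // row with NULLs
        table.Rows.Add(1, "Duplicate", 99);            // repeated ID
        return table;
    }
}

public static class Program
{
    public static void Main()
    {
        using (DbDataReader reader = AuthorLoader.SampleTable().CreateDataReader())
        {
            Dictionary<int, Author> authors = AuthorLoader.Load(reader);
            Console.WriteLine("{0} authors loaded.", authors.Count);
        }
    }
}
```

Because Load takes a DbDataReader, the same method should accept the SqlDataReader returned by cmd.ExecuteReader() in the question, provided the column types match the getters used.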
