python - Representing a table as a matrix

Question

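Suppose I have a database table with three fields: string title, int A, and int B. A and B each range from 1 to 500.
I want to represent some of the values as a 5x5 matrix, so that (1, 1) is the string with the lowest A and the lowest B, (5, 5) is the string with the highest A and the highest B, and (1, 5) has the lowest A and the highest B.
Which algorithm should I use?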

score 1 · Accepted Answer

Do you have data like this:

title  A  B
one    1  1
two    1  2
three  2  1
four   3  3
five   4  4
six    5  5
seven  5  1
eight  1  5

and so on...?

Reduced to a 3x3 matrix, that would look like this:

a/b  1      2     3
1    one    two   eight
2    three  four  ?
3    seven  ?     six

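The question is what (2,2) refers to. The average? Fine, but then what about a 5x5 matrix? Your definition is missing some information.

An algorithm for the matrix above would be:

Compute the minimum, maximum, and average for A and for B
Query the database for the tuples (Amin, Bmin), (Aavg, Bmin), (Amax, Bmin), and so on
Fill the matrix with the values

Addition: if there is no exact match, try querying ranges around the minimum, maximum, and average instead.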
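The steps above can be sketched in Python roughly as follows. This is only an illustration under assumptions: the `rows` list stands in for the database table, and `anchors`, `build_matrix`, and the `tolerance` parameter are hypothetical names, with `tolerance` being one possible reading of "try ranges" when nothing matches exactly.

```python
# Hypothetical in-memory stand-in for the database table (title, A, B).
rows = [
    ("one", 1, 1), ("two", 1, 2), ("three", 2, 1), ("four", 3, 3),
    ("five", 4, 4), ("six", 5, 5), ("seven", 5, 1), ("eight", 1, 5),
]

def anchors(values):
    """Return the (min, avg, max) anchor values for one column."""
    return (min(values), sum(values) / len(values), max(values))

def build_matrix(rows, tolerance=0.0):
    """Build a 3x3 matrix of title lists, one cell per (A anchor, B anchor).

    A row lands in a cell when both of its values lie within `tolerance`
    of that cell's anchors; tolerance=0.0 means exact matches only.
    """
    a_anchors = anchors([a for _, a, _ in rows])
    b_anchors = anchors([b for _, _, b in rows])
    matrix = []
    for a_target in a_anchors:
        line = []
        for b_target in b_anchors:
            hits = [title for title, a, b in rows
                    if abs(a - a_target) <= tolerance
                    and abs(b - b_target) <= tolerance]
            line.append(hits)
        matrix.append(line)
    return matrix

exact = build_matrix(rows)                  # corners match, middle cells stay empty
widened = build_matrix(rows, tolerance=0.5)  # the centre cell now picks up "four"
```

With exact matching only the four corner cells are filled for this sample, which shows why the fallback to ranges is needed: the averages are rarely values that actually occur in the table.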

score 1 · Accepted Answer

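I've set up a simulation here; the comments explain the steps.

First, I generate some data: a series of records, each containing a string and two random numbers representing the scores A and B.

Next, I divide the ranges of A and B into five equally spaced bins, each holding the minimum and maximum values for a cell.

Then I run a serial query over the data set to pull out the strings for each cell.

There are a hundred ways to optimize this, depending on the actual data structures and storage you are using.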

from random import random

# Generate data and keep record of scores
data = []
a_list = []
b_list = []
for i in range(50):
    a = int(random()*500)+1
    b = int(random()*500)+1
    rec = { 's' : 's%s' % i,
            'a' : a,
            'b' : b
             }
    a_list.append(a)
    b_list.append(b)
    data.append(rec)

# divide A and B ranges into five bins

def make_bins(f_list):
    f_min = min(f_list)
    f_max = max(f_list)
    f_step_size = (f_max - f_min) / 5.0
    f_steps = [ (f_min + i * f_step_size,
                 f_min + (i+1) * f_step_size)
                for i in range(5) ]
    # adjust top bin to be just larger than maximum
    top = f_steps[4]
    f_steps[4] = ( top[0], f_max+1 )
    return f_steps

a_steps = make_bins(a_list)
b_steps = make_bins(b_list)

# collect the strings that fit into any of the bins
# thus all the strings in cell[4,3] of your matrix
# would fit these conditions:
# string would have a Score A that is
# greater than or equal to the first element in a_steps[3]
# AND less than the second element in a_steps[3]
# AND it would have a Score B that is
# greater than or equal to the first element in b_steps[2]
# AND less than the second element in b_steps[2]
# NOTE: there is a need to adjust the pointers due to
#       the way you have numbered the cells of your matrix

def query_matrix(ptr_a, ptr_b):
    ptr_a -= 1
    from_a = a_steps[ptr_a][0]
    to_a = a_steps[ptr_a][1]

    ptr_b -= 1
    from_b = b_steps[ptr_b][0]
    to_b = b_steps[ptr_b][1]

    results = []
    for rec in data:
        s = rec['s']
        a = rec['a']
        b = rec['b']
        if (a >= from_a and
            a < to_a and
            b >= from_b and
            b < to_b):
            results.append(s)
    return results

# Print out the results for a visual check
# (print() calls, so this part runs on Python 3)
total = 0
for i in range(5):
    for j in range(5):
        print('=' * 80)
        print('Cell: ', i+1, j+1, ' contains: ', end=' ')
        hits = query_matrix(i+1, j+1)
        total += len(hits)
        print(hits)
print('=' * 80)
print('Total number of strings found: ', total)
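As one possible optimization, each record's cell can be computed directly with arithmetic, so the data is scanned once instead of once per cell. The sketch below is an assumption-laden alternative, not part of the simulation above: `bin_index`, `build_cells`, and `sample` are hypothetical names, the records use the same `'s'`, `'a'`, `'b'` keys, and the input is assumed to be non-empty. The bin boundaries should agree with `make_bins` up to floating-point rounding at the edges.

```python
from collections import defaultdict

def bin_index(value, lo, hi, bins=5):
    """Map a value in [lo, hi] to a 1-based bin number in 1..bins."""
    if hi == lo:
        # Degenerate range: every value falls into the first bin.
        return 1
    idx = int((value - lo) * bins / (hi - lo))
    # The maximum itself would compute to `bins`; clamp it into the top bin.
    return min(idx, bins - 1) + 1

def build_cells(records, bins=5):
    """Group record strings by their (A bin, B bin) cell in a single pass."""
    a_vals = [r['a'] for r in records]
    b_vals = [r['b'] for r in records]
    a_lo, a_hi = min(a_vals), max(a_vals)
    b_lo, b_hi = min(b_vals), max(b_vals)
    cells = defaultdict(list)
    for r in records:
        key = (bin_index(r['a'], a_lo, a_hi, bins),
               bin_index(r['b'], b_lo, b_hi, bins))
        cells[key].append(r['s'])
    return cells

# Small hand-made sample covering three corners and the centre.
sample = [
    {'s': 'low', 'a': 1, 'b': 1},
    {'s': 'high', 'a': 500, 'b': 500},
    {'s': 'mixed', 'a': 1, 'b': 500},
    {'s': 'middle', 'a': 250, 'b': 250},
]
cells = build_cells(sample)
```

Here `cells[(1, 5)]` holds the strings with the lowest A and the highest B, matching the cell numbering used in the question.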
