I'm working with this numpy matrix:
>>> print matrix
[['L' 'G' 'T' 'G' 'A' 'P' 'V' 'I']
 ['A' 'A' 'S' 'G' 'P' 'S' 'S' 'G']
 ['A' 'A' 'S' 'G' 'P' 'S' 'S' 'G']
 ['G' 'L' 'T' 'G' 'A' 'P' 'V' 'I']]
I already have this code:
import collections
import itertools

for i, j in itertools.combinations(range(len(matrix.T)), 2):
    c = matrix[:, [i, j]]  # columns i and j side by side, one pair per row
    counts = collections.Counter(map(tuple, c))
    print 'columns {} and {}'.format(i, j)
    for AB in counts:
        freq_AB = float(counts[AB]) / len(c)
        print 'Frequency of {} = {}'.format(AB, freq_AB)
    print
which produces:
columns 0 and 1
Frequency of ('A', 'A') = 0.5
Frequency of ('G', 'L') = 0.25
Frequency of ('L', 'G') = 0.25
columns 0 and 2
Frequency of ('A', 'S') = 0.5
Frequency of ('G', 'T') = 0.25
Frequency of ('L', 'T') = 0.25
[...]
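What I'd like to add to the code is this: for each pair of letters taken from columns i and j, also report how often each letter occurs within its own column (i or j). In other words, output like this: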
columns 0 and 1
Frequency of ('A', 'A') = 0.5
Freq of 'A' in column 0 = 0.5
Freq of 'A' in column 1 = 0.5
Frequency of ('G', 'L') = 0.25
Freq of 'G' in column 0 = 0.25
Freq of 'L' in column 1 = 0.25
Frequency of ('L', 'G') = 0.25
Freq of 'L' in column 0 = 0.25
Freq of 'G' in column 1 = 0.25
columns 0 and 2
Frequency of ('A', 'S') = 0.5
Freq of 'A' in column 0 = 0.5
Freq of 'S' in column 2 = 0.5
Frequency of ('G', 'T') = 0.25
Freq of 'G' in column 0 = 0.25
Freq of 'T' in column 2 = 0.5
Frequency of ('L', 'T') = 0.25
Freq of 'L' in column 0 = 0.25
Freq of 'T' in column 2 = 0.5
[...]
Any help is much appreciated.
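One possible approach, offered as a minimal sketch rather than a definitive answer: count the letters of every column once up front, then look those per-column frequencies up while looping over the column pairs. The names `column_frequencies`, `pair_report` and `rows` are hypothetical and not part of the original code. The sketch assumes the data is a plain list of rows (for the numpy array above, `matrix.tolist()` should give that), and it sorts the pairs so the output order is stable. It uses `print(...)` with a single argument so it runs on Python 3 as well as Python 2.

```python
import collections
import itertools

# Assumption: plain nested lists; for the numpy array use rows = matrix.tolist()
rows = [list('LGTGAPVI'),
        list('AASGPSSG'),
        list('AASGPSSG'),
        list('GLTGAPVI')]


def column_frequencies(rows):
    """Return one {letter: frequency} dict per column (hypothetical helper)."""
    n_rows = len(rows)
    return [{letter: float(count) / n_rows
             for letter, count in collections.Counter(column).items()}
            for column in zip(*rows)]  # zip(*rows) transposes rows to columns


def pair_report(rows):
    """Build the report lines for every pair of columns (hypothetical helper)."""
    n_rows = len(rows)
    n_cols = len(rows[0])
    col_freqs = column_frequencies(rows)  # computed once, reused for each pair
    lines = []
    for i, j in itertools.combinations(range(n_cols), 2):
        lines.append('columns {} and {}'.format(i, j))
        pair_counts = collections.Counter((row[i], row[j]) for row in rows)
        for (a, b), count in sorted(pair_counts.items()):
            lines.append('Frequency of {} = {}'.format((a, b), float(count) / n_rows))
            lines.append("Freq of '{}' in column {} = {}".format(a, i, col_freqs[i][a]))
            lines.append("Freq of '{}' in column {} = {}".format(b, j, col_freqs[j][b]))
    return lines


print('\n'.join(pair_report(rows)))
```

Precomputing the per-column counts avoids recounting the same column for every pair it takes part in. If a blank line between column pairs is wanted, as in the original loop, an empty string could be appended to `lines` after each pair.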