python - Combining multiple queries with numpy masks

Question

I am trying to plot two sets of rows from a file that looks like this:

number          pair        atom       count         shift      error
 1            ALA ALA       CA         7624           1.35           0.13
 1            ALA ALA       HA         7494          19.67          11.44
38            ARG LYS       CA         3395          35.32           9.52
38            ARG LYS       HA         3217           1.19           0.38
38            ARG LYS       CB         3061           0.54           1.47
39            ARG MET       CA         1115          35.62          13.08
39            ARG MET       HA         1018           1.93           0.20
39            ARG MET       CB          976           1.80           0.34

I want to use the atom column to select the rows for atoms CA and CB and plot one against the other. So basically I want to do:

atomtypemask_ca = data['atom'] == 'CA'
xaxis = np.array(data['shift'][atomtypemask_ca])
aa, atom = data['aa'][atomtypemask_ca], data['atom'][atomtypemask_ca]

atomtypemask_cb = data['atom'] == 'CB'
yaxis = np.array(data['shift'][atomtypemask_cb])

plot (xaxis, yaxis)

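What is ruining my day is that some residues have no CB entry, so the two arrays end up with different lengths. How can I plot this while ignoring entries where only one of the two atom values is present? I could of course write a loop for it, but I think this should be possible with masks, which would give cleaner code.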

score 2 · Accepted Answer

I'm guessing the first column is the residue number. Use that. I don't know your data structure or what `shift` refers to, but you should be able to do something like this:

In : residues
Out: array([ 1,  1, 38, 38, 38, 39, 39, 39])

In : atom
Out: 
array(['CA', 'HA', 'CA', 'HA', 'CB', 'CA', 'HA', 'CB'], 
      dtype='|S2')

In : shift
Out: array([7624, 7494, 3395, 3217, 3061, 1115, 1018,  976])

# rows with name 'CB'
In : cb = atom=='CB'

# rows with name 'CA' _and_ residues same as 'CB'
In : ca = numpy.logical_and(numpy.in1d(residues, residues[cb]), atom=='CA')
# or if in1d is not available
# ca = numpy.logical_and([(residue in residues[cb]) for residue in residues], atom=='CA')

In : shift[ca]
Out: array([3395, 1115])

In : shift[cb]
Out: array([3061,  976])
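
For reference, here is a self-contained sketch of the same idea, assuming the columns are already loaded into three NumPy arrays. The names `residues`, `atom`, and `shift` follow the session above, while `is_ca`, `is_cb`, and `both` are names introduced here for illustration. It uses `np.isin`, which newer NumPy versions recommend over `np.in1d`, takes the `shift` values from the shift column of the question's table, and filters both masks, so a residue with a CB but no CA would be dropped as well.

```python
import numpy as np

# Columns copied from the table in the question.
residues = np.array([1, 1, 38, 38, 38, 39, 39, 39])
atom = np.array(['CA', 'HA', 'CA', 'HA', 'CB', 'CA', 'HA', 'CB'])
shift = np.array([1.35, 19.67, 35.32, 1.19, 0.54, 35.62, 1.93, 1.80])

# One boolean mask per atom type.
is_ca = atom == 'CA'
is_cb = atom == 'CB'

# Residue numbers that have both a CA row and a CB row.
both = np.intersect1d(residues[is_ca], residues[is_cb])

# Keep only the CA/CB rows whose residue appears in `both`.
ca = is_ca & np.isin(residues, both)
cb = is_cb & np.isin(residues, both)

xaxis = shift[ca]
yaxis = shift[cb]
```

The pairing of `xaxis` with `yaxis` relies on each residue having at most one CA row and one CB row, with residues appearing in the same order for both atoms, as they do in the sample file. Under that assumption the two arrays have equal length and can be passed straight to `plot(xaxis, yaxis)`.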
