python - Flatten/ravel/collapse a 3-dimensional xr.DataArray (Xarray) into 2 dimensions along an axis?

Question

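I have a dataset in which I store replicates for different classes/subtypes (I'm not sure what to call them), along with the attributes measured for each one. Essentially, there are 5 subtypes/classes, 4 replicates per subtype/class, and 100 attributes that are measured.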

Is there a way, similar to np.ravel or np.flatten, to merge 2 dimensions in Xarray?

I want to merge the subtype dim with the replicates dim, so that I end up with a 2D array (or pd.DataFrame) of attributes vs. subtype/replicates.

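It doesn't need to be in a format like "coord_1 | coord_2"; it would be useful to keep the original coordinate names. Maybe something like groupby could do this? Groupby always confuses me, so it would be great if there were something native to xarray.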

import xarray as xr
import numpy as np

# Set up xr.DataArray
dims = (5,4,100)
DA_data = xr.DataArray(np.random.random(dims), dims=["subtype","replicates","attributes"])
DA_data.coords["subtype"] = ["subtype_%d"%_ for _ in range(dims[0])]
DA_data.coords["replicates"] = ["rep_%d"%_ for _ in range(dims[1])]
DA_data.coords["attributes"] = ["attr_%d"%_ for _ in range(dims[2])]

# DA_data.coords
# Coordinates:
#   * subtype     (subtype) <U9 'subtype_0' 'subtype_1' 'subtype_2' ...
#   * replicates  (replicates) <U5 'rep_0' 'rep_1' 'rep_2' 'rep_3'
#   * attributes  (attributes) <U7 'attr_0' 'attr_1' 'attr_2' 'attr_3' ...
# DA_data.dims
# ('subtype', 'replicates', 'attributes')

# Naive way to collapse the replicate dimension into the subtype dimension
desired_columns = list()
for subtype in DA_data.coords["subtype"]:
    for replicate in DA_data.coords["replicates"]:
        desired_columns.append(str(subtype.values) + "|" + str(replicate.values))
desired_columns
# ['subtype_0|rep_0',
#  'subtype_0|rep_1',
#  'subtype_0|rep_2',
#  'subtype_0|rep_3',
#  'subtype_1|rep_0',
#  'subtype_1|rep_1',
#  'subtype_1|rep_2',
#  'subtype_1|rep_3',
#  'subtype_2|rep_0',
#  'subtype_2|rep_1',
#  'subtype_2|rep_2',
#  'subtype_2|rep_3',
#  'subtype_3|rep_0',
#  'subtype_3|rep_1',
#  'subtype_3|rep_2',
#  'subtype_3|rep_3',
#  'subtype_4|rep_0',
#  'subtype_4|rep_1',
#  'subtype_4|rep_2',
#  'subtype_4|rep_3']

score 5 · Accepted Answer

Yes, this is exactly what .stack is for:

In [33]: stacked = DA_data.stack(desired=['subtype', 'replicates'])

In [34]: stacked
Out[34]:
<xarray.DataArray (attributes: 100, desired: 20)>
array([[ 0.54020268,  0.14914837,  0.83398895, ...,  0.25986503,
         0.62520466,  0.08617668],
       [ 0.47021735,  0.10627027,  0.66666478, ...,  0.84392176,
         0.64461418,  0.4444864 ],
       [ 0.4065543 ,  0.59817851,  0.65033094, ...,  0.01747058,
         0.94414244,  0.31467342],
       ...,
       [ 0.23724934,  0.61742922,  0.97563316, ...,  0.62966631,
         0.89513904,  0.20139552],
       [ 0.21157447,  0.43868899,  0.77488211, ...,  0.98285015,
         0.24367352,  0.8061804 ],
       [ 0.21518079,  0.234854  ,  0.18294781, ...,  0.64679141,
         0.49678393,  0.32215219]])
Coordinates:
  * attributes  (attributes) |S7 'attr_0' 'attr_1' 'attr_2' 'attr_3' ...
  * desired     (desired) object ('subtype_0', 'rep_0') ...
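Since the question asks for something like np.ravel, here is a minimal sketch checking that the stacked values match a plain NumPy reshape of the same data. The seeded generator and the names `shape`, `manual` and `same` are illustrative and not part of the original post; the check assumes the dimension order (subtype, replicates, attributes) from the question.

```python
import numpy as np
import xarray as xr

# Rebuild the array from the question (seeded so the run is reproducible).
rng = np.random.default_rng(0)
shape = (5, 4, 100)
DA_data = xr.DataArray(
    rng.random(shape),
    dims=["subtype", "replicates", "attributes"],
    coords={
        "subtype": [f"subtype_{i}" for i in range(shape[0])],
        "replicates": [f"rep_{i}" for i in range(shape[1])],
        "attributes": [f"attr_{i}" for i in range(shape[2])],
    },
)

# Merge the two leading dimensions; the new dimension is placed last.
stacked = DA_data.stack(desired=["subtype", "replicates"])

# NumPy-only equivalent: collapse the first two axes, then transpose so that
# attributes come first, as in the stacked result.
manual = DA_data.values.reshape(shape[0] * shape[1], shape[2]).T

same = np.array_equal(stacked.values, manual)
print(stacked.dims, stacked.shape, same)
```

The difference from the raw reshape is that .stack also carries the subtype and replicate labels along with the data.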

The resulting stacked coordinate is a pandas.MultiIndex, whose values are given by tuples:

In [35]: stacked['desired'].values
Out[35]:
array([('subtype_0', 'rep_0'), ('subtype_0', 'rep_1'),
       ('subtype_0', 'rep_2'), ('subtype_0', 'rep_3'),
       ('subtype_1', 'rep_0'), ('subtype_1', 'rep_1'),
       ('subtype_1', 'rep_2'), ('subtype_1', 'rep_3'),
       ('subtype_2', 'rep_0'), ('subtype_2', 'rep_1'),
       ('subtype_2', 'rep_2'), ('subtype_2', 'rep_3'),
       ('subtype_3', 'rep_0'), ('subtype_3', 'rep_1'),
       ('subtype_3', 'rep_2'), ('subtype_3', 'rep_3'),
       ('subtype_4', 'rep_0'), ('subtype_4', 'rep_1'),
       ('subtype_4', 'rep_2'), ('subtype_4', 'rep_3')], dtype=object)
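The question also mentions wanting a pd.DataFrame of attributes vs. subtype/replicates. A minimal sketch of one way to get there from the stacked array, assuming a reasonably recent xarray and pandas, is shown here; the names `df`, `flat_labels` and `df_flat` are illustrative and not from the original post.

```python
import numpy as np
import xarray as xr

# Rebuild the array from the question (seeded so the run is reproducible).
rng = np.random.default_rng(0)
shape = (5, 4, 100)
DA_data = xr.DataArray(
    rng.random(shape),
    dims=["subtype", "replicates", "attributes"],
    coords={
        "subtype": [f"subtype_{i}" for i in range(shape[0])],
        "replicates": [f"rep_{i}" for i in range(shape[1])],
        "attributes": [f"attr_{i}" for i in range(shape[2])],
    },
)

stacked = DA_data.stack(desired=["subtype", "replicates"])

# A 2D DataArray converts to a DataFrame: rows are attributes, columns are
# the (subtype, replicates) MultiIndex built by .stack.
df = stacked.to_pandas()

# Optional: flatten the MultiIndex into "subtype|replicate" strings, matching
# the desired_columns list from the question.
flat_labels = ["|".join(pair) for pair in df.columns]
df_flat = df.copy()
df_flat.columns = flat_labels

print(df.shape)
print(flat_labels[:4])
```

Keeping the MultiIndex (rather than the joined strings) preserves the original coordinate names, and stacked.unstack("desired") can be used to split the merged dimension back into subtype and replicates.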
