3

I have a DataFrame like this:

                       OPEN    HIGH     LOW   CLOSE         VOL
2012-01-01 19:00:00  449000  449000  449000  449000  1336303000
2012-01-01 20:00:00     NaN     NaN     NaN     NaN         NaN
2012-01-01 21:00:00     NaN     NaN     NaN     NaN         NaN
2012-01-01 22:00:00     NaN     NaN     NaN     NaN         NaN
2012-01-01 23:00:00     NaN     NaN     NaN     NaN         NaN
...
                         OPEN      HIGH       LOW     CLOSE          VOL
2013-04-24 14:00:00  11700000  12000000  11600000  12000000  20647095439
2013-04-24 15:00:00  12000000  12399000  11979000  12399000  23997107870
2013-04-24 16:00:00  12399000  12400000  11865000  12100000   9379191474
2013-04-24 17:00:00  12300000  12397995  11850000  11850000   4281521826
2013-04-24 18:00:00  11850000  11850000  10903000  11800000  15546034128

I need to fill the NaN values according to this rule:

When OPEN, HIGH, LOW and CLOSE are all NaN,

  • set VOL to 0
  • set OPEN, HIGH, LOW, CLOSE to the CLOSE value of the previous candle

Otherwise, keep the NaN values.

4

3 Answers

0

Here is a way to do it with masking.

Simulate a frame with some holes (A is your 'close' field):

In [19]: import numpy as np, pandas as pd

In [20]: df = pd.DataFrame(np.random.randn(10,3),index=pd.date_range('20130101',periods=10,freq='min'),
   ....:                   columns=list('ABC'))

In [21]: df.iloc[1:3,:] = np.nan

In [22]: df.iloc[5:8,1:3] = np.nan

In [23]: df
Out[23]: 
                            A         B         C
2013-01-01 00:00:00 -0.486149  0.156894 -0.272362
2013-01-01 00:01:00       NaN       NaN       NaN
2013-01-01 00:02:00       NaN       NaN       NaN
2013-01-01 00:03:00  1.788240 -0.593195  0.059606
2013-01-01 00:04:00  1.097781  0.835491 -0.855468
2013-01-01 00:05:00  0.753991       NaN       NaN
2013-01-01 00:06:00 -0.456790       NaN       NaN
2013-01-01 00:07:00 -0.479704       NaN       NaN
2013-01-01 00:08:00  1.332830  1.276571 -0.480007
2013-01-01 00:09:00 -0.759806 -0.815984  2.699401

The rows that are all NaN:

In [24]: mask_0 = pd.isnull(df).all(axis=1)

In [25]: mask_0
Out[25]: 
2013-01-01 00:00:00    False
2013-01-01 00:01:00     True
2013-01-01 00:02:00     True
2013-01-01 00:03:00    False
2013-01-01 00:04:00    False
2013-01-01 00:05:00    False
2013-01-01 00:06:00    False
2013-01-01 00:07:00    False
2013-01-01 00:08:00    False
2013-01-01 00:09:00    False
Freq: T, dtype: bool

The rows where we want to propagate A:

In [26]: mask_fill = pd.isnull(df['B']) & pd.isnull(df['C'])

In [27]: mask_fill
Out[27]: 
2013-01-01 00:00:00    False
2013-01-01 00:01:00     True
2013-01-01 00:02:00     True
2013-01-01 00:03:00    False
2013-01-01 00:04:00    False
2013-01-01 00:05:00     True
2013-01-01 00:06:00     True
2013-01-01 00:07:00     True
2013-01-01 00:08:00    False
2013-01-01 00:09:00    False
Freq: T, dtype: bool

Propagate first:

In [28]: df.loc[mask_fill,'C'] = df['A']

In [29]: df.loc[mask_fill,'B'] = df['A']

Then fill in the 0s:

In [30]: df.loc[mask_0] = 0

Done:

In [31]: df
Out[31]: 
                            A         B         C
2013-01-01 00:00:00 -0.486149  0.156894 -0.272362
2013-01-01 00:01:00  0.000000  0.000000  0.000000
2013-01-01 00:02:00  0.000000  0.000000  0.000000
2013-01-01 00:03:00  1.788240 -0.593195  0.059606
2013-01-01 00:04:00  1.097781  0.835491 -0.855468
2013-01-01 00:05:00  0.753991  0.753991  0.753991
2013-01-01 00:06:00 -0.456790 -0.456790 -0.456790
2013-01-01 00:07:00 -0.479704 -0.479704 -0.479704
2013-01-01 00:08:00  1.332830  1.276571 -0.480007
2013-01-01 00:09:00 -0.759806 -0.815984  2.699401
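
Applied to the frame in the question, the same masking idea might look like the sketch below. It assumes the column names OPEN, HIGH, LOW, CLOSE and VOL from the question, uses a small made-up frame for illustration, and takes "previous CLOSE" to mean the most recent non-NaN CLOSE above the row (obtained with `ffill`). Rows where only some of the price columns are missing are left untouched.

```python
import numpy as np
import pandas as pd

# Made-up sample data, shaped like the frame in the question.
idx = pd.to_datetime([
    "2012-01-01 19:00", "2012-01-01 20:00", "2012-01-01 21:00",
    "2012-01-01 22:00", "2012-01-01 23:00",
])
df = pd.DataFrame(
    {
        "OPEN":  [449000.0, np.nan, np.nan, 450000.0, np.nan],
        "HIGH":  [449000.0, np.nan, np.nan, 451000.0, np.nan],
        "LOW":   [449000.0, np.nan, np.nan, 449500.0, 448000.0],
        "CLOSE": [449000.0, np.nan, np.nan, 450500.0, np.nan],
        "VOL":   [1336303000.0, np.nan, np.nan, 5000.0, np.nan],
    },
    index=idx,
)

price_cols = ["OPEN", "HIGH", "LOW", "CLOSE"]

# Rows where every price column is missing (an empty candle).
empty = df[price_cols].isna().all(axis=1)

# Most recent known CLOSE at each row (NaN if there is none yet).
prev_close = df["CLOSE"].ffill()

# Empty candles get zero volume and a flat bar at the previous CLOSE.
df.loc[empty, "VOL"] = 0
for col in price_cols:
    df.loc[empty, col] = prev_close[empty]
```

The last sample row has a LOW but no other prices, so it is not treated as an empty candle and keeps its NaN values, including VOL.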
answered 2013-05-09T17:14:42.130
0

This illustrates pandas' missing data behaviour. The incantation you are looking for is the fillna method, which takes a value:

In [1381]: df2
Out[1381]: 
        one       two     three four   five           timestamp
a       NaN  1.138469 -2.400634  bar   True                 NaT
c       NaN  0.025653 -1.386071  bar  False                 NaT
e  0.863937  0.252462  1.500571  bar   True 2012-01-01 00:00:00
f  1.053202 -2.338595 -0.374279  bar   True 2012-01-01 00:00:00
h       NaN -1.157886 -0.551865  bar  False                 NaT

In [1382]: df2.fillna(0)
Out[1382]: 
        one       two     three four   five           timestamp
a  0.000000  1.138469 -2.400634  bar   True 1970-01-01 00:00:00
c  0.000000  0.025653 -1.386071  bar  False 1970-01-01 00:00:00
e  0.863937  0.252462  1.500571  bar   True 2012-01-01 00:00:00
f  1.053202 -2.338595 -0.374279  bar   True 2012-01-01 00:00:00
h  0.000000 -1.157886 -0.551865  bar  False 1970-01-01 00:00:00
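
In the output above, `fillna(0)` also replaced the NaT entries in `timestamp`. If only some columns should be filled, `fillna` also accepts a dict mapping column names to fill values. A minimal sketch on made-up data (the frame `sample` and its values are for illustration only):

```python
import numpy as np
import pandas as pd

sample = pd.DataFrame(
    {
        "one": [np.nan, 0.863937, np.nan],
        "timestamp": [pd.NaT, pd.Timestamp("2012-01-01"), pd.NaT],
    },
    index=list("aeh"),
)

# Only column 'one' is filled; missing timestamps remain NaT.
filled = sample.fillna({"one": 0})
```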

You can also propagate values forwards or backwards:

In [1384]: df
Out[1384]: 
        one       two     three
a       NaN  1.138469 -2.400634
c       NaN  0.025653 -1.386071
e  0.863937  0.252462  1.500571
f  1.053202 -2.338595 -0.374279
h       NaN -1.157886 -0.551865

In [1385]: df.fillna(method='pad')
Out[1385]: 
        one       two     three
a       NaN  1.138469 -2.400634
c       NaN  0.025653 -1.386071
e  0.863937  0.252462  1.500571
f  1.053202 -2.338595 -0.374279
h  1.053202 -1.157886 -0.551865
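
In recent pandas releases the `method=` argument of `fillna` is deprecated, and `ffill()` and `bfill()` are the direct spellings for forward and backward propagation. A minimal sketch on a made-up single-column frame (the name `demo` is for illustration only):

```python
import numpy as np
import pandas as pd

demo = pd.DataFrame({"one": [np.nan, np.nan, 0.5, 1.0, np.nan]},
                    index=list("acefh"))

forward = demo.ffill()   # carry the last valid value down; leading NaNs stay NaN
backward = demo.bfill()  # pull the next valid value up; trailing NaNs stay NaN
```

Neither call modifies `demo`; both return a new frame.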

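For your specific case, I think you need something like the following (note that this fills every NaN, not only the rows where all four price columns are missing):

df['VOL'] = df['VOL'].fillna(0)    # fillna returns a new object, so assign it back
df['CLOSE'] = df['CLOSE'].ffill()  # carry the previous CLOSE forward
for col in ['OPEN', 'HIGH', 'LOW']:
    df[col] = df[col].fillna(df['CLOSE'])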
answered 2013-05-09T16:41:50.840