結合操作で非常に堅牢なエラーが発生します。マージ(left_index、right_index)も試みましたが、同じ結果になりました。
インデックスは(設計上)同一であり、両方のインデックスでindex.is_unique(TRUE)とindex.get_duplicates()(EMPTY)によってチェックされます。
基本バージョン:
df1.join(series)
merge(df1, series_as_df,
print tempres.index
<class'pandas.tseries.index.DatetimeIndex'> [2013-01-14 17:04:45、...、2013-01-14 16:53:05]長さ:89、頻度:なし、タイムゾーン:なし
奇妙なことは、値を出力することです:print tempres.index.values [1970-01-16 121:04:45 1970-01-16 121:04:35 1970-01-16 121:04:25 1970-01-16 121:04:15 1970-01-16 121:04:05 1970-01-16 121:03:55 1970-01-16 121:03:45 1970-01-16 121:03:35 1970-01-16 121:03:25 1970-01-16 121:03:15 1970-01-16 121:03:05 1970-01-16 121:02:55 1970-01-16 121:02:45 1970-01-16 121:02:35 1970-01-16 121:02:25 1970-01-16 121:02:15 1970-01-16 121:02:05 1970-01-16 121:01:55 1970-01-16 121:01:45 1970-01-16 121:01:35 1970-01-16 121:01:25 ...]
必要に応じて、漬物のセリエとdfを追加できます...
最新のパンダバージョン0.10.xを使用
ありがとう、
リュック
私のコード(より大きなコードから切り取ったもの)
XYTparams (existing dataframe)
prep_functions[funcname] = [list of values, same length as XYTparams]
iSeries = Series(prep_functions[funcname], index = XYTparams.index, name = funcname)
XYTparams = XYTparams.join(iSeries)
私の問題のレビュー:
基本的なDataFrameでマージと結合を連続して使用します。ある時点で、次のマージ/結合を試みるときにエラーが発生し始めます。簡単なテストではこれを再現できませんでしたが、問題が発生する前にデータフレームを保存しました。
何が問題なのかわかりません。
base_df = load('SPOparams.pic')
lookup_df = load('lookup.pic')
print base_df
print lookup_df
print base_df.count()
print base_df['VKCSKEY1']
print lookup_df['traf_key']
# reset index does not change a thing
base_df = base_df.reset_index(drop=True)
print base_df.index
print base_df.index.get_duplicates()
print lookup_df.index
print lookup_df.index.get_duplicates()
# checking value matches
for k in lookup_df['traf_key']:
print k, k in base_df['VKCSKEY1'].values
# why does this merge is unsuccesfull ???
# in any combination of the parameters
df_result =merge(base_df, lookup_df,
how='left',
#how = 'outer',
left_on ='VKCSKEY1',
right_on ='traf_key',
#left_index=True,
#right_index = True,
#sort=True,
#suffixes=('', '.m'), copy=True
)
print df_result
出力:
1.6.1
0.10.1
<class 'pandas.core.frame.DataFrame'>
Int64Index: 89 entries, 0 to 88
Data columns:
T 89 non-null values
X 89 non-null values
Y 89 non-null values
precip_quantity_1hour 89 non-null values
pressure 89 non-null values
rel_humidity 89 non-null values
temp 89 non-null values
temp_max 0 non-null values
temp_min 0 non-null values
wind_direction 89 non-null values
wind_speed 89 non-null values
BC_TRAF 89 non-null values
closest 89 non-null values
closest.m 89 non-null values
AGGP.P50_ID 89 non-null values
AGGP.FUNC_CLASS 89 non-null values
AGGP.SPEED_CAT 89 non-null values
LINK_ID 89 non-null values
FUNC_CLASS 89 non-null values
SPEED_CAT 89 non-null values
AR_AUTO 89 non-null values
AR_BUS 89 non-null values
AR_TAXIS 89 non-null values
AR_CARPOOL 89 non-null values
AR_PEDEST 89 non-null values
AR_TRUCKS 89 non-null values
STCA20_PCT 89 non-null values
VKC_LINKNR 89 non-null values
TRVIC150R1 89 non-null values
closest.m 89 non-null values
closest.m.m 89 non-null values
VKCP.LINK_ID 89 non-null values
VKCP.FUNC_CLASS 89 non-null values
VKCP.SPEED 89 non-null values
VKCP.LINKNR 89 non-null values
VKCP.TWIN_ID 89 non-null values
VKCSKEY1 89 non-null values
dtypes: datetime64[ns](1), float64(13), int64(9), object(14)
<class 'pandas.core.frame.DataFrame'>
Index: 30 entries, (60744, 0) to (58314, 0)
Data columns:
traf_key 30 non-null values
weekday_nr 30 non-null values
linknr 30 non-null values
weekday 30 non-null values
vr0 30 non-null values
vr1 30 non-null values
vr2 30 non-null values
vr3 30 non-null values
vr4 30 non-null values
vr5 30 non-null values
vr6 30 non-null values
vr7 30 non-null values
vr8 30 non-null values
vr9 30 non-null values
vr10 30 non-null values
vr11 30 non-null values
vr12 30 non-null values
vr13 30 non-null values
vr14 30 non-null values
vr15 30 non-null values
vr16 30 non-null values
vr17 30 non-null values
vr18 30 non-null values
vr19 30 non-null values
vr20 30 non-null values
vr21 30 non-null values
vr22 30 non-null values
vr23 30 non-null values
au0 30 non-null values
au1 30 non-null values
au2 30 non-null values
au3 30 non-null values
au4 30 non-null values
au5 30 non-null values
au6 30 non-null values
au7 30 non-null values
au8 30 non-null values
au9 30 non-null values
au10 30 non-null values
au11 30 non-null values
au12 30 non-null values
au13 30 non-null values
au14 30 non-null values
au15 30 non-null values
au16 30 non-null values
au17 30 non-null values
au18 30 non-null values
au19 30 non-null values
au20 30 non-null values
au21 30 non-null values
au22 30 non-null values
au23 30 non-null values
sn0 30 non-null values
sn1 30 non-null values
sn2 30 non-null values
sn3 30 non-null values
sn4 30 non-null values
sn5 30 non-null values
sn6 30 non-null values
sn7 30 non-null values
sn8 30 non-null values
sn9 30 non-null values
sn10 30 non-null values
sn11 30 non-null values
sn12 30 non-null values
sn13 30 non-null values
sn14 30 non-null values
sn15 30 non-null values
sn16 30 non-null values
sn17 30 non-null values
sn18 30 non-null values
sn19 30 non-null values
sn20 30 non-null values
sn21 30 non-null values
sn22 30 non-null values
sn23 30 non-null values
dtypes: float64(24), int64(50), object(2)
T 89
X 89
Y 89
precip_quantity_1hour 89
pressure 89
rel_humidity 89
temp 89
temp_max 0
temp_min 0
wind_direction 89
wind_speed 89
BC_TRAF 89
closest 89
closest.m 89
AGGP.P50_ID 89
AGGP.FUNC_CLASS 89
AGGP.SPEED_CAT 89
LINK_ID 89
FUNC_CLASS 89
SPEED_CAT 89
AR_AUTO 89
AR_BUS 89
AR_TAXIS 89
AR_CARPOOL 89
AR_PEDEST 89
AR_TRUCKS 89
STCA20_PCT 89
VKC_LINKNR 89
TRVIC150R1 89
closest.m 89
closest.m.m 89
VKCP.LINK_ID 89
VKCP.FUNC_CLASS 89
VKCP.SPEED 89
VKCP.LINKNR 89
VKCP.TWIN_ID 89
VKCSKEY1 89
0 (60744, 0)
1 (60744, 0)
2 (60744, 0)
3 (60750, 0)
4 (60768, 0)
5 (60768, 0)
6 (60758, 0)
7 (60758, 0)
8 (69223, 0)
9 (69223, 0)
10 (69223, 0)
11 (64265, 0)
12 (64265, 0)
13 (64265, 0)
14 (64265, 0)
15 (64265, 0)
16 (64265, 0)
17 (64265, 0)
18 (64265, 0)
19 (64265, 0)
20 (64216, 0)
21 (64216, 0)
22 (64216, 0)
23 (64216, 0)
24 (64216, 0)
25 (64216, 0)
26 (64216, 0)
27 (64216, 0)
28 (64216, 0)
29 (57085, 0)
30 (57085, 0)
31 (57085, 0)
32 (57085, 0)
33 (57085, 0)
34 (57085, 0)
35 (57014, 0)
36 (57033, 0)
37 (57033, 0)
38 (64065, 0)
39 (64065, 0)
40 (64065, 0)
41 (64065, 0)
42 (64065, 0)
43 (57070, 0)
44 (64062, 0)
45 (64062, 0)
46 (64062, 0)
47 (64062, 0)
48 (57070, 0)
49 (64061, 0)
50 (64061, 0)
51 (64061, 0)
52 (64061, 0)
53 (59849, 0)
54 (59415, 0)
55 (58487, 0)
56 (58054, 0)
57 (58054, 0)
58 (58054, 0)
59 (52551, 0)
60 (58054, 0)
61 (58054, 0)
62 (58054, 0)
63 (58054, 0)
64 (52551, 0)
65 (58054, 0)
66 (58488, 0)
67 (58488, 0)
68 (58028, 0)
69 (58464, 0)
70 (58028, 0)
71 (57989, 0)
72 (58595, 0)
73 (58027, 0)
74 (57989, 0)
75 (58595, 0)
76 (58595, 0)
77 (58019, 0)
78 (58595, 0)
79 (58595, 0)
80 (58019, 0)
81 (58595, 0)
82 (58595, 0)
83 (66715, 0)
84 (58595, 0)
85 (59295, 0)
86 (67614, 0)
87 (58314, 0)
88 (58314, 0)
Name: VKCSKEY1, Length: 89
VKCSKEY1
(60744, 0) (60744, 0)
(60750, 0) (60750, 0)
(60768, 0) (60768, 0)
(60758, 0) (60758, 0)
(69223, 0) (69223, 0)
(64265, 0) (64265, 0)
(64216, 0) (64216, 0)
(57085, 0) (57085, 0)
(57014, 0) (57014, 0)
(57033, 0) (57033, 0)
(64065, 0) (64065, 0)
(57070, 0) (57070, 0)
(64062, 0) (64062, 0)
(64061, 0) (64061, 0)
(59849, 0) (59849, 0)
(59415, 0) (59415, 0)
(58487, 0) (58487, 0)
(58054, 0) (58054, 0)
(52551, 0) (52551, 0)
(58488, 0) (58488, 0)
(58028, 0) (58028, 0)
(58464, 0) (58464, 0)
(57989, 0) (57989, 0)
(58595, 0) (58595, 0)
(58027, 0) (58027, 0)
(58019, 0) (58019, 0)
(66715, 0) (66715, 0)
(59295, 0) (59295, 0)
(67614, 0) (67614, 0)
(58314, 0) (58314, 0)
Name: traf_key
Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88], dtype=int64)
[]
Index([(60744, 0), (60750, 0), (60768, 0), (60758, 0), (69223, 0), (64265, 0), (64216, 0), (57085, 0), (57014, 0), (57033, 0), (64065, 0), (57070, 0), (64062, 0), (64061, 0), (59849, 0), (59415, 0), (58487, 0), (58054, 0), (52551, 0), (58488, 0), (58028, 0), (58464, 0), (57989, 0), (58595, 0), (58027, 0), (58019, 0), (66715, 0), (59295, 0), (67614, 0), (58314, 0)], dtype=object)
[]
(60744, 0) True
(60750, 0) True
(60768, 0) True
(60758, 0) True
(69223, 0) True
(64265, 0) True
(64216, 0) True
(57085, 0) True
(57014, 0) True
(57033, 0) True
(64065, 0) True
(57070, 0) True
(64062, 0) True
(64061, 0) True
(59849, 0) True
(59415, 0) True
(58487, 0) True
(58054, 0) True
(52551, 0) True
(58488, 0) True
(58028, 0) True
(58464, 0) True
(57989, 0) True
(58595, 0) True
(58027, 0) True
(58019, 0) True
(66715, 0) True
(59295, 0) True
(67614, 0) True
(58314, 0) True
Traceback (most recent call last):
File "L:\temp\pandas_join_bug.py", line 43, in <module>
right_on ='traf_key',
File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 36, in merge
return op.get_result()
File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 185, in get_result
ldata, rdata = self._get_merge_data()
File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 277, in _get_merge_data
copydata=False)
File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 1194, in _maybe_rename_join
to_rename = self.items.intersection(other.items)
File "C:\Python27\lib\site-packages\pandas\core\index.py", line 666, in intersection
indexer = self.get_indexer(other.values)
File "C:\Python27\lib\site-packages\pandas\core\index.py", line 812, in get_indexer
raise Exception('Reindexing only valid with uniquely valued Index '
Exception: Reindexing only valid with uniquely valued Index objects
エラーが発生すると、マージまたは結合ステートメントを成功させることができません。最初は、エラーが繰り返しのマージ/結合アクションにリンクされていることはわかりませんでした。最新のセットの単一のマージ/結合が機能するようになりました。別のマージ/結合が必要になるとすぐに、同じエラーが発生します。