I get a very persistent error on a join operation. I also tried merge (left_index, right_index), with the same result.

The indexes are identical (by design), and I checked both of them with index.is_unique (True) and index.get_duplicates() (empty).
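The checks described above cover the row index only. A minimal sketch of a check that also looks at the column labels follows, assuming a current pandas release (where `get_duplicates()` is no longer available and `Index.duplicated()` is used instead). The helper name `label_report` and the sample frame are hypothetical and not part of the original code.

```python
import pandas as pd

def label_report(df):
    """Report uniqueness for both the row index and the column labels."""
    return {
        "index_unique": df.index.is_unique,
        "columns_unique": df.columns.is_unique,
        # Labels that occur more than once, listed once each.
        "duplicate_rows": list(df.index[df.index.duplicated()].unique()),
        "duplicate_columns": list(df.columns[df.columns.duplicated()].unique()),
    }

# Hypothetical frame: unique row index, but one column label repeated.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "b", "b"])
report = label_report(df)
print(report)
```

A frame can pass the row-index check and still carry repeated column labels, so it may be worth running both checks before a join or merge.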

Basic version:

df1.join(series)
merge(df1, series_as_df, 

print tempres.index

<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-14 17:04:45, ..., 2013-01-14 16:53:05]
Length: 89, Freq: None, Timezone: None


The strange thing is what printing the values gives:

print tempres.index.values

[1970-01-16 121:04:45 1970-01-16 121:04:35 1970-01-16 121:04:25 1970-01-16 121:04:15 1970-01-16 121:04:05 1970-01-16 121:03:55 1970-01-16 121:03:45 1970-01-16 121:03:35 1970-01-16 121:03:25 1970-01-16 121:03:15 1970-01-16 121:03:05 1970-01-16 121:02:55 1970-01-16 121:02:45 1970-01-16 121:02:35 1970-01-16 121:02:25 1970-01-16 121:02:15 1970-01-16 121:02:05 1970-01-16 121:01:55 1970-01-16 121:01:45 1970-01-16 121:01:35 1970-01-16 121:01:25 ...]

If needed, I can add the pickled series and df...

Using the latest pandas version, 0.10.x.

Thanks,

Luc

My code (cut from a larger code base):

XYTparams (existing dataframe)
prep_functions[funcname] = [list of values, same length as XYTparams]

iSeries = Series(prep_functions[funcname], index = XYTparams.index, name = funcname)
XYTparams = XYTparams.join(iSeries)
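The two lines above can be sketched as a self-contained example. This is only an illustration under assumptions: `params`, `values`, and the helper `join_once` are hypothetical stand-ins for `XYTparams`, `prep_functions[funcname]`, and the surrounding loop, and the code targets a current pandas release rather than 0.10.x. The guard reflects one possible way to keep the same column name from being added twice.

```python
import pandas as pd

def join_once(df, values, name):
    """Attach `values` as a new column called `name`, aligned on df's index.

    Raises ValueError if a column with that name already exists, so that
    repeated calls cannot silently pile up clashing column labels.
    """
    if name in df.columns:
        raise ValueError("column %r already present" % name)
    new_col = pd.Series(values, index=df.index, name=name)
    return df.join(new_col)

# Hypothetical stand-ins for XYTparams and prep_functions[funcname].
params = pd.DataFrame({"X": [1.0, 2.0, 3.0], "Y": [4.0, 5.0, 6.0]})
values = [10, 20, 30]

params = join_once(params, values, "func_a")
print(params)
```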

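Review of my problem:

I use merge and join repeatedly on a base DataFrame. At some point I start getting an error when I attempt the next merge/join. I could not reproduce this in a simple test, but I saved the dataframes just before the problem occurs.

I cannot figure out what is wrong.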

base_df = load('SPOparams.pic')
lookup_df = load('lookup.pic')

print base_df
print lookup_df

print base_df.count()

print base_df['VKCSKEY1']
print lookup_df['traf_key']

# reset index does not change a thing
base_df = base_df.reset_index(drop=True)

print base_df.index
print base_df.index.get_duplicates()
print lookup_df.index
print lookup_df.index.get_duplicates()


# checking value matches
for k in lookup_df['traf_key']:
    print k, k in  base_df['VKCSKEY1'].values

# why is this merge unsuccessful???
# in any combination of the parameters
df_result =merge(base_df, lookup_df, 
             how='left', 
             #how = 'outer',
             left_on ='VKCSKEY1', 
             right_on ='traf_key',
             #left_index=True, 
             #right_index = True,
             #sort=True, 
             #suffixes=('', '.m'), copy=True
             )
print df_result

Output:

1.6.1
0.10.1
<class 'pandas.core.frame.DataFrame'>
Int64Index: 89 entries, 0 to 88
Data columns:
T                        89  non-null values
X                        89  non-null values
Y                        89  non-null values
precip_quantity_1hour    89  non-null values
pressure                 89  non-null values
rel_humidity             89  non-null values
temp                     89  non-null values
temp_max                 0  non-null values
temp_min                 0  non-null values
wind_direction           89  non-null values
wind_speed               89  non-null values
BC_TRAF                  89  non-null values
closest                  89  non-null values
closest.m                89  non-null values
AGGP.P50_ID              89  non-null values
AGGP.FUNC_CLASS          89  non-null values
AGGP.SPEED_CAT           89  non-null values
LINK_ID                  89  non-null values
FUNC_CLASS               89  non-null values
SPEED_CAT                89  non-null values
AR_AUTO                  89  non-null values
AR_BUS                   89  non-null values
AR_TAXIS                 89  non-null values
AR_CARPOOL               89  non-null values
AR_PEDEST                89  non-null values
AR_TRUCKS                89  non-null values
STCA20_PCT               89  non-null values
VKC_LINKNR               89  non-null values
TRVIC150R1               89  non-null values
closest.m                89  non-null values
closest.m.m              89  non-null values
VKCP.LINK_ID             89  non-null values
VKCP.FUNC_CLASS          89  non-null values
VKCP.SPEED               89  non-null values
VKCP.LINKNR              89  non-null values
VKCP.TWIN_ID             89  non-null values
VKCSKEY1                 89  non-null values
dtypes: datetime64[ns](1), float64(13), int64(9), object(14)
<class 'pandas.core.frame.DataFrame'>
Index: 30 entries, (60744, 0) to (58314, 0)
Data columns:
traf_key      30  non-null values
weekday_nr    30  non-null values
linknr        30  non-null values
 weekday      30  non-null values
vr0           30  non-null values
vr1           30  non-null values
vr2           30  non-null values
vr3           30  non-null values
vr4           30  non-null values
vr5           30  non-null values
vr6           30  non-null values
vr7           30  non-null values
vr8           30  non-null values
vr9           30  non-null values
vr10          30  non-null values
vr11          30  non-null values
vr12          30  non-null values
vr13          30  non-null values
vr14          30  non-null values
vr15          30  non-null values
vr16          30  non-null values
vr17          30  non-null values
vr18          30  non-null values
vr19          30  non-null values
vr20          30  non-null values
vr21          30  non-null values
vr22          30  non-null values
vr23          30  non-null values
au0           30  non-null values
au1           30  non-null values
au2           30  non-null values
au3           30  non-null values
au4           30  non-null values
au5           30  non-null values
au6           30  non-null values
au7           30  non-null values
au8           30  non-null values
au9           30  non-null values
au10          30  non-null values
au11          30  non-null values
au12          30  non-null values
au13          30  non-null values
au14          30  non-null values
au15          30  non-null values
au16          30  non-null values
au17          30  non-null values
au18          30  non-null values
au19          30  non-null values
au20          30  non-null values
au21          30  non-null values
au22          30  non-null values
au23          30  non-null values
sn0           30  non-null values
sn1           30  non-null values
sn2           30  non-null values
sn3           30  non-null values
sn4           30  non-null values
sn5           30  non-null values
sn6           30  non-null values
sn7           30  non-null values
sn8           30  non-null values
sn9           30  non-null values
sn10          30  non-null values
sn11          30  non-null values
sn12          30  non-null values
sn13          30  non-null values
sn14          30  non-null values
sn15          30  non-null values
sn16          30  non-null values
sn17          30  non-null values
sn18          30  non-null values
sn19          30  non-null values
sn20          30  non-null values
sn21          30  non-null values
sn22          30  non-null values
sn23          30  non-null values
dtypes: float64(24), int64(50), object(2)
T                        89
X                        89
Y                        89
precip_quantity_1hour    89
pressure                 89
rel_humidity             89
temp                     89
temp_max                  0
temp_min                  0
wind_direction           89
wind_speed               89
BC_TRAF                  89
closest                  89
closest.m                89
AGGP.P50_ID              89
AGGP.FUNC_CLASS          89
AGGP.SPEED_CAT           89
LINK_ID                  89
FUNC_CLASS               89
SPEED_CAT                89
AR_AUTO                  89
AR_BUS                   89
AR_TAXIS                 89
AR_CARPOOL               89
AR_PEDEST                89
AR_TRUCKS                89
STCA20_PCT               89
VKC_LINKNR               89
TRVIC150R1               89
closest.m                89
closest.m.m              89
VKCP.LINK_ID             89
VKCP.FUNC_CLASS          89
VKCP.SPEED               89
VKCP.LINKNR              89
VKCP.TWIN_ID             89
VKCSKEY1                 89
0     (60744, 0)
1     (60744, 0)
2     (60744, 0)
3     (60750, 0)
4     (60768, 0)
5     (60768, 0)
6     (60758, 0)
7     (60758, 0)
8     (69223, 0)
9     (69223, 0)
10    (69223, 0)
11    (64265, 0)
12    (64265, 0)
13    (64265, 0)
14    (64265, 0)
15    (64265, 0)
16    (64265, 0)
17    (64265, 0)
18    (64265, 0)
19    (64265, 0)
20    (64216, 0)
21    (64216, 0)
22    (64216, 0)
23    (64216, 0)
24    (64216, 0)
25    (64216, 0)
26    (64216, 0)
27    (64216, 0)
28    (64216, 0)
29    (57085, 0)
30    (57085, 0)
31    (57085, 0)
32    (57085, 0)
33    (57085, 0)
34    (57085, 0)
35    (57014, 0)
36    (57033, 0)
37    (57033, 0)
38    (64065, 0)
39    (64065, 0)
40    (64065, 0)
41    (64065, 0)
42    (64065, 0)
43    (57070, 0)
44    (64062, 0)
45    (64062, 0)
46    (64062, 0)
47    (64062, 0)
48    (57070, 0)
49    (64061, 0)
50    (64061, 0)
51    (64061, 0)
52    (64061, 0)
53    (59849, 0)
54    (59415, 0)
55    (58487, 0)
56    (58054, 0)
57    (58054, 0)
58    (58054, 0)
59    (52551, 0)
60    (58054, 0)
61    (58054, 0)
62    (58054, 0)
63    (58054, 0)
64    (52551, 0)
65    (58054, 0)
66    (58488, 0)
67    (58488, 0)
68    (58028, 0)
69    (58464, 0)
70    (58028, 0)
71    (57989, 0)
72    (58595, 0)
73    (58027, 0)
74    (57989, 0)
75    (58595, 0)
76    (58595, 0)
77    (58019, 0)
78    (58595, 0)
79    (58595, 0)
80    (58019, 0)
81    (58595, 0)
82    (58595, 0)
83    (66715, 0)
84    (58595, 0)
85    (59295, 0)
86    (67614, 0)
87    (58314, 0)
88    (58314, 0)
Name: VKCSKEY1, Length: 89
VKCSKEY1
(60744, 0)    (60744, 0)
(60750, 0)    (60750, 0)
(60768, 0)    (60768, 0)
(60758, 0)    (60758, 0)
(69223, 0)    (69223, 0)
(64265, 0)    (64265, 0)
(64216, 0)    (64216, 0)
(57085, 0)    (57085, 0)
(57014, 0)    (57014, 0)
(57033, 0)    (57033, 0)
(64065, 0)    (64065, 0)
(57070, 0)    (57070, 0)
(64062, 0)    (64062, 0)
(64061, 0)    (64061, 0)
(59849, 0)    (59849, 0)
(59415, 0)    (59415, 0)
(58487, 0)    (58487, 0)
(58054, 0)    (58054, 0)
(52551, 0)    (52551, 0)
(58488, 0)    (58488, 0)
(58028, 0)    (58028, 0)
(58464, 0)    (58464, 0)
(57989, 0)    (57989, 0)
(58595, 0)    (58595, 0)
(58027, 0)    (58027, 0)
(58019, 0)    (58019, 0)
(66715, 0)    (66715, 0)
(59295, 0)    (59295, 0)
(67614, 0)    (67614, 0)
(58314, 0)    (58314, 0)
Name: traf_key
Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88], dtype=int64)
[]
Index([(60744, 0), (60750, 0), (60768, 0), (60758, 0), (69223, 0), (64265, 0), (64216, 0), (57085, 0), (57014, 0), (57033, 0), (64065, 0), (57070, 0), (64062, 0), (64061, 0), (59849, 0), (59415, 0), (58487, 0), (58054, 0), (52551, 0), (58488, 0), (58028, 0), (58464, 0), (57989, 0), (58595, 0), (58027, 0), (58019, 0), (66715, 0), (59295, 0), (67614, 0), (58314, 0)], dtype=object)
[]
(60744, 0) True
(60750, 0) True
(60768, 0) True
(60758, 0) True
(69223, 0) True
(64265, 0) True
(64216, 0) True
(57085, 0) True
(57014, 0) True
(57033, 0) True
(64065, 0) True
(57070, 0) True
(64062, 0) True
(64061, 0) True
(59849, 0) True
(59415, 0) True
(58487, 0) True
(58054, 0) True
(52551, 0) True
(58488, 0) True
(58028, 0) True
(58464, 0) True
(57989, 0) True
(58595, 0) True
(58027, 0) True
(58019, 0) True
(66715, 0) True
(59295, 0) True
(67614, 0) True
(58314, 0) True
Traceback (most recent call last):
  File "L:\temp\pandas_join_bug.py", line 43, in <module>
    right_on ='traf_key',
  File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 36, in merge
    return op.get_result()
  File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 185, in get_result
    ldata, rdata = self._get_merge_data()
  File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 277, in _get_merge_data
    copydata=False)
  File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 1194, in _maybe_rename_join
    to_rename = self.items.intersection(other.items)
  File "C:\Python27\lib\site-packages\pandas\core\index.py", line 666, in intersection
    indexer = self.get_indexer(other.values)
  File "C:\Python27\lib\site-packages\pandas\core\index.py", line 812, in get_indexer
    raise Exception('Reindexing only valid with uniquely valued Index '
Exception: Reindexing only valid with uniquely valued Index objects

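Once the error occurs, I cannot get any merge or join statement to succeed. At first I did not realize that the error was linked to repeated merge/join actions. A single merge/join on the latest set now works. As soon as another merge/join is needed, the same error occurs.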
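One way repeated merges could lead to trouble is through the suffixes: the output above lists `closest`, `closest.m`, `closest.m` and `closest.m.m` among the columns of base_df, and the commented-out `suffixes=('', '.m')` hints at how such names might arise. The sketch below is a simplified pure-Python model of suffix naming, not the pandas implementation. It ignores the special handling of identically named key columns, and the column lists are hypothetical.

```python
from collections import Counter

def merged_columns(left, right, suffixes=("", ".m")):
    """Simplified model of the column names a merge would produce.

    Names present on both sides get the left or right suffix appended;
    all other names pass through unchanged.
    """
    overlap = set(left) & set(right)
    lsuf, rsuf = suffixes
    out = [c + lsuf if c in overlap else c for c in left]
    out += [c + rsuf if c in overlap else c for c in right]
    return out

# Hypothetical inputs: both sides already carry 'closest' and 'closest.m'
# from an earlier merge that used the same suffixes.
left = ["VKCSKEY1", "closest", "closest.m"]
right = ["other_key", "closest", "closest.m"]

cols = merged_columns(left, right)
dupes = [name for name, n in Counter(cols).items() if n > 1]
print(cols)
print(dupes)
```

With an empty left suffix, the left `closest.m` keeps its name while the right `closest` is renamed to `closest.m`, so the two collide. Using distinct suffixes for each merge step is one way to avoid this.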

1 Answer

You get this error when there are duplicate column names. Try removing the duplicated column names.
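In the question's output, `closest.m` is listed twice among the columns of base_df, which fits this explanation. A minimal sketch of finding and dropping repeated column labels before merging follows, assuming a current pandas release (the `Index.duplicated()` idiom is not available in 0.10.x). The frames and the column names `key`, `x`, and `vr0` values are hypothetical stand-ins, and keeping the first occurrence is just one choice; renaming the columns instead would preserve the data.

```python
import pandas as pd

# Hypothetical frame with a repeated column label, as in the question.
base = pd.DataFrame(
    [[1, 10, 11, 12], [2, 20, 21, 22]],
    columns=["key", "closest.m", "closest.m", "x"],
)
lookup = pd.DataFrame({"traf_key": [1, 2], "vr0": [5, 6]})

# True for the second and later occurrences of each label.
dup_mask = base.columns.duplicated()
print(list(base.columns[dup_mask]))

# Keep only the first occurrence of every column label.
deduped = base.loc[:, ~dup_mask]

# With unique column labels the merge goes through.
result = deduped.merge(lookup, how="left", left_on="key", right_on="traf_key")
print(result)
```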

answered 2013-08-29T20:11:48.220