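Short note:
If your data has duplicate column names, make sure to rename one of the columns when you read the file.
If your data contains NaN values or similar, drop them.
Then merge using the correct answer below.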
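For anyone skimming, the three steps in the note might look roughly like the sketch below. The in-memory CSV text and the shortened column list are made up for illustration and are not my real files; only `read_csv`, `dropna` and `merge` are actual pandas calls.

```python
import io
import pandas as pd

# Hypothetical in-memory stand-ins for the two CSV files (not the real data).
underlying_csv = io.StringIO("20040326,3.579987\n20040329,3.690494\n20040330,\n")
options_csv = io.StringIO("20040326,SVXY,32.5\n20040329,SVXY,\n")

# Step 1: give every column a unique name at read time.
underlying = pd.read_csv(underlying_csv, header=None, names=["dt1", "price"])
options = pd.read_csv(options_csv, header=None, names=["dt2", "ticker", "strike"])

# Step 2: drop rows that contain NaN before merging.
underlying = underlying.dropna()
options = options.dropna()

# Step 3: merge on the date keys, keeping rows from both sides.
merged = underlying.merge(options, left_on="dt1", right_on="dt2", how="outer")
print(merged)
```

Because the key columns have different names here, both `dt1` and `dt2` survive in the result; rows without a match on one side get NaN in that side's columns.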
This is probably a very simple question.
I read in two datasets using pandas.read_csv().
My data is in two separate CSV files.
I am using the following code:
import mibian
import pandas as pd
# Read both files, assigning column names explicitly
underlying = pd.read_csv("txt1.csv", names=['dt1', 'price'])
options = pd.read_csv("txt2.txt", names=['dt2', 'ticker', 'maturity', 'strike', 'cP', 'px', 'yield', 'rF', 'T', 'rlzd10'])
# Join the two frames on their date columns
merged = underlying.merge(options, left_on='dt1', right_on='dt2')
The heads of my two datasets look like this:
>>> underlying.head()
          0         1
0  20040326  3.579987
1  20040329  3.690494
2  20040330  3.755247
3  20040331  3.719373
4  20040401  3.728671
and
>>> options.head()
          0     1         2     3     4      5     6  7      8         9        10
0  20130628  SVXY  20130817  32.5  call  39.22  32.5  0  0.005  0.136986  0.411224
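So column 0 in each dataset is the key I want to merge on, and I want to keep all of the data from both datasets in the result.
How do I do that? Every example I found online requires a named key, but my data doesn't have one.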
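To make the goal concrete, here is a minimal sketch of the kind of result I am after, assuming the unnamed columns can simply be given names after loading. The tiny frames and the shared key name `dt` are hypothetical; `how="outer"` and `indicator=True` are standard `DataFrame.merge` options.

```python
import pandas as pd

# Hypothetical frames as they would look when read without headers
# (columns are labelled 0, 1, 2, ...).
underlying = pd.DataFrame([[20040326, 3.579987], [20040329, 3.690494]])
options = pd.DataFrame([[20040326, "SVXY", 32.5], [20130628, "SVXY", 32.5]])

# Name the columns so that column 0 becomes a shared key called "dt".
underlying.columns = ["dt", "price"]
options.columns = ["dt", "ticker", "strike"]

# An outer merge keeps every row from both sides; indicator=True adds a
# "_merge" column saying where each row came from.
merged = underlying.merge(options, on="dt", how="outer", indicator=True)
print(merged)
```

With a shared key name there is only one `dt` column in the output, and `_merge` shows `both`, `left_only` or `right_only` for each row.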
But when I do the join I get the following error:
Traceback (most recent call last):
File "",line 1,in
File "/Applications/Spyder.app/Contents/Resources/lib/python2.7/spyderlib/widgets/externalshell/sitecustomize.py",line 540,in runfile
execfile(filename,namespace)
File "/Users/jasonmellone/.spyder2/.temp.py",line 12,in
merged = underlying.merge(options,right_on='dt2',how='outer');
File "/Library/Python/2.7/site-packages/pandas-0.13.0-py2.7-macosx-10.9-intel.egg/pandas/core/frame.py",line 3723,in merge
suffixes=suffixes,copy=copy)
File "/Library/Python/2.7/site-packages/pandas-0.13.0-py2.7-macosx-10.9-intel.egg/pandas/tools/merge.py",line 40,in merge
return op.get_result()
File "/Library/Python/2.7/site-packages/pandas-0.13.0-py2.7-macosx-10.9-intel.egg/pandas/tools/merge.py",line 197,in get_result
result_data = join_op.get_result()
File "/Library/Python/2.7/site-packages/pandas-0.13.0-py2.7-macosx-10.9-intel.egg/pandas/tools/merge.py",line 722,in get_result
return BlockManager(result_blocks,self.result_axes)
File "/Library/Python/2.7/site-packages/pandas-0.13.0-py2.7-macosx-10.9-intel.egg/pandas/core/internals.py",line 1954,in __init__
self._set_ref_locs(do_refs=True)
File "/Library/Python/2.7/site-packages/pandas-0.13.0-py2.7-macosx-10.9-intel.egg/pandas/core/internals.py",line 2091,in _set_ref_locs
'have _ref_locs set' % (block,labels))
AssertionError: Cannot create BlockManager._ref_locs because block [IntBlock: [dt1],1 x 372145,dtype: int64] with duplicate items [Index([u'dt1',u'price',u'dt2',u'ticker',u'maturity',u'strike',u'cP',u'px',u'yield',u'rF',u'T',u'rlzd10'],dtype='object')] does not have _ref_locs set
I searched my dataset and found no duplicates.
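For reference, a check along these lines is one way to look for both kinds of duplicates, since the error message talks about duplicate items in the column labels rather than in the rows. This is only a sketch on a made-up frame, not my real data and not necessarily the cause of the error.

```python
import pandas as pd

# Hypothetical frame with a repeated column label and a repeated key value.
df = pd.DataFrame(
    [[20040326, 3.5, 3.5], [20040329, 3.6, 3.6], [20040326, 3.7, 3.7]],
    columns=["dt", "price", "price"],
)

# Column labels that occur more than once.
dup_labels = df.columns[df.columns.duplicated()].tolist()

# Number of key values that repeat an earlier key value.
dup_keys = int(df["dt"].duplicated().sum())

print(dup_labels, dup_keys)
```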
Thanks!