First of all: I am running a Scrapy script (version 0.16.2) on Python 2.7.2+. The problem I have is with my pipeline file and the SQL statement I build using the MySQLdb library. The code I am trying to run is:
import MySQLdb

class basketpipeline(object):
    def __init__(self):
        # open one connection and cursor for the lifetime of the pipeline
        self.conn = MySQLdb.connect(user="blabla", passwd="blabla", db="basketball", host="blabla")
        self.cursor = self.conn.cursor()

    def process_item(self, item, spider):
        # declare variables for item elements
        Game_id = item['game_id']
        Attendance = item['attendance']
        Referees = item['referees']
        # form sql for database implementing variables above into the SQL
        self.cursor.execute("""INSERT INTO BasketGame (Game_id,Referees,Attendance) VALUES (%s,%s,%s)""", (Game_id, Referees, Attendance))
        self.conn.commit()
        return item
And after running the script, this is what I get in the shell:
2013-11-14 00:32:59+0200 [scrapy] INFO: Scrapy 0.16.2 started (bot: basketbase)
2013-11-14 00:32:59+0200 [scrapy] DEBUG: Enabled extensions: LogStats, TelnetConsole,
CloseSpider, WebService, CoreStats, SpiderState
2013-11-14 00:32:59+0200 [scrapy] DEBUG: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, RedirectMiddleware, CookiesMiddleware, HttpCompressionMiddleware, ChunkedTransferMiddleware, DownloaderStats
2013-11-14 00:32:59+0200 [scrapy] DEBUG: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2013-11-14 00:32:59+0200 [scrapy] DEBUG: Enabled item pipelines: basketpipeline
2013-11-14 00:32:59+0200 [basketsp] INFO: Spider opened
2013-11-14 00:32:59+0200 [basketsp] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2013-11-14 00:32:59+0200 [scrapy] DEBUG: Telnet console listening on 0.0.0.0:6023
2013-11-14 00:32:59+0200 [scrapy] DEBUG: Web service listening on 0.0.0.0:6080
2013-11-14 00:33:01+0200 [basketsp] DEBUG: Crawled (200) <GET http://www.euroleague.net/main/results/showgame?gamecode=46&s%09code=E2013> (referer: None)
2013-11-14 00:33:01+0200 [basketsp] ERROR: Error processing %(item)s
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.16.2-py2.7.egg/scrapy/middleware.py", line 62, in _process_chain
return process_chain(self.methods[methodname], obj, *args)
File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.16.2-py2.7.egg/scrapy/utils/defer.py", line 65, in process_chain
d.callback(input)
File "/usr/local/lib/python2.7/dist-packages/Twisted-12.2.0-py2.7-linux-i686.egg/twisted/internet/defer.py", line 368, in callback
self._startRunCallbacks(result)
File "/usr/local/lib/python2.7/dist-packages/Twisted-12.2.0-py2.7-linux-i686.egg/twisted/internet/defer.py", line 464, in _startRunCallbacks
self._runCallbacks()
--- <exception caught here> ---
File "/usr/local/lib/python2.7/dist-packages/Twisted-12.2.0-py2.7-linux-i686.egg/twisted/internet/defer.py", line 551, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/home/vy/basketbase/basketbase/pipelines.py", line 41, in process_item
self.cursor.execute("""INSERT INTO BasketGame (Game_id,Referees,Attendance) VALUES (%s,%s,%s)""", (Game_id,Referees,Attendance))
File "/usr/lib/pymodules/python2.7/MySQLdb/cursors.py", line 174, in execute
self.errorhandler(self, exc, value)
File "/usr/lib/pymodules/python2.7/MySQLdb/connections.py", line 36, in defaulterrorhandler
raise errorclass, errorvalue
_mysql_exceptions.ProgrammingError: (1064, 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near \'),("\'BULTO, VICENTE (ESP), SAHIN, TOLGA (ITA), VYKLICKY, ROBERT (CZE)\'",),("\'695\' at line 1')
The error:
`_mysql_exceptions.ProgrammingError: (1064, 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near \'),("\'BULTO, VICENTE (ESP), SAHIN, TOLGA (ITA), VYKLICKY, ROBERT (CZE)\'",),("\'695\' at line 1')`
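If I use only one variable (any one of the three), there is no problem and it works fine, but as soon as I use two or more variables I get the same error. I tried to solve this myself and read the documentation: http://mysql-python.sourceforge.net/MySQLdb.html
I also found a nice answer here on Stack Overflow: Python MySQL Parameterized Queries
However, I still get the error when I use multiple variables. I am new to Python/Scrapy/SQL, so I know I may be missing something here, but I would be grateful if someone could help. Thanks in advance.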
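
One possible reading of the error text: the fragments `("'695'",)` near the reported position look like escaped one-element sequences, which would suggest that each item field holds a list (for example, the result of a selector's `extract()` call) rather than a plain string. That is an assumption, since the spider code is not shown. Under that assumption, a minimal sketch of reducing each field to a single value before it reaches the `%s` placeholders could look like this. The helper names `as_scalar` and `build_params` and the sample item are hypothetical and only for illustration.

```python
def as_scalar(value):
    """Return a single value suitable for one %s placeholder.

    Assumption: a field may be a list or tuple (such as the output of
    a selector's extract() call); only its first element is kept.
    """
    if isinstance(value, (list, tuple)):
        return value[0] if value else None
    return value


def build_params(item):
    # Order matches the column list (Game_id, Referees, Attendance).
    return (
        as_scalar(item['game_id']),
        as_scalar(item['referees']),
        as_scalar(item['attendance']),
    )


# Hypothetical item whose fields are one-element lists.
sample_item = {
    'game_id': ['46'],
    'referees': ['BULTO, VICENTE (ESP), SAHIN, TOLGA (ITA), VYKLICKY, ROBERT (CZE)'],
    'attendance': ['695'],
}
params = build_params(sample_item)
```

In the pipeline, the resulting tuple would be passed as the second argument of `self.cursor.execute(...)` in place of `(Game_id, Referees, Attendance)`. Whether this matches the real cause depends on what the spider actually stores in each field, so printing `repr(item['referees'])` inside `process_item` would be a quick way to check.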