Here is my query, written in Python with SQLAlchemy. I don't think SQLAlchemy itself is slow; I just don't know how to write a fast query. The query takes about 8 seconds and returns 45,000 results.
# Join each box score to its game, drop rows where total or a_line is 999, newest first.
games = s.query(Box_Score, Game).join(Game, Box_Score.espn_game_id == Game.espn_game_id)\
    .filter(Game.total != 999)\
    .filter(Game.a_line != 999)\
    .order_by(Box_Score.date.desc()).all()
This is the same query in plain SQL:
SELECT box_scores.date AS box_scores_date, box_scores.id AS box_scores_id, box_scores.player_name AS box_scores_player_name, box_scores.team_name AS box_scores_team_name, box_scores.espn_player_id AS box_scores_espn_player_id, box_scores.espn_game_id AS box_scores_espn_game_id, box_scores.pass_attempt AS box_scores_pass_attempt, box_scores.pass_made AS box_scores_pass_made, box_scores.pass_yards AS box_scores_pass_yards, box_scores.pass_td AS box_scores_pass_td, box_scores.pass_int AS box_scores_pass_int, box_scores.pass_longest AS box_scores_pass_longest, box_scores.run_carry AS box_scores_run_carry, box_scores.run_yards AS box_scores_run_yards, box_scores.run_td AS box_scores_run_td, box_scores.run_longest AS box_scores_run_longest, box_scores.reception AS box_scores_reception, box_scores.reception_yards AS box_scores_reception_yards, box_scores.reception_td AS box_scores_reception_td, box_scores.reception_longest AS box_scores_reception_longest, box_scores.interception_lost AS box_scores_interception_lost, box_scores.interception_won AS box_scores_interception_won, box_scores.fg_attempt AS box_scores_fg_attempt, box_scores.fg_made AS box_scores_fg_made, box_scores.fg_longest AS box_scores_fg_longest, box_scores.punt AS box_scores_punt, box_scores.first_down AS box_scores_first_down, box_scores.penalty AS box_scores_penalty, box_scores.penalty_yards AS box_scores_penalty_yards, box_scores.fumbles AS box_scores_fumbles, box_scores.possession AS box_scores_possession, games.id AS games_id, games.espn_game_id AS games_espn_game_id, games.date AS games_date, games.status AS games_status, games.time AS games_time, games.season AS games_season, games.h_name AS games_h_name, games.a_name AS games_a_name, games.league AS games_league, games.h_q1 AS games_h_q1, games.h_q2 AS games_h_q2, games.h_q3 AS games_h_q3, games.h_q4 AS games_h_q4, games.h_ot AS games_h_ot, games.h_score AS games_h_score, games.a_q1 AS games_a_q1, games.a_q2 AS games_a_q2, games.a_q3 AS games_a_q3, 
games.a_q4 AS games_a_q4, games.a_ot AS games_a_ot, games.a_score AS games_a_score, games.possession_h2 AS games_possession_h2, games.d_yards_h1 AS games_d_yards_h1, games.f_yards_h1 AS games_f_yards_h1, games.h_ml AS games_h_ml, games.a_ml AS games_a_ml, games.h_h1_ml AS games_h_h1_ml, games.a_h1_ml AS games_a_h1_ml, games.h_q1_ml AS games_h_q1_ml, games.a_q1_ml AS games_a_q1_ml, games.h_h2_ml AS games_h_h2_ml, games.a_h2_ml AS games_a_h2_ml, games.h_line AS games_h_line, games.h_price AS games_h_price, games.a_line AS games_a_line, games.a_price AS games_a_price, games.h_open_line AS games_h_open_line, games.h_open_price AS games_h_open_price, games.a_open_line AS games_a_open_line, games.a_open_price AS games_a_open_price, games.h_h1_line AS games_h_h1_line, games.h_h1_price AS games_h_h1_price, games.a_h1_line AS games_a_h1_line, games.a_h1_price AS games_a_h1_price, games.h_q1_line AS games_h_q1_line, games.h_q1_price AS games_h_q1_price, games.a_q1_line AS games_a_q1_line, games.a_q1_price AS games_a_q1_price, games.h_h2_line AS games_h_h2_line, games.h_h2_price AS games_h_h2_price, games.a_h2_line AS games_a_h2_line, games.a_h2_price AS games_a_h2_price, games.total AS games_total, games.o_price AS games_o_price, games.u_price AS games_u_price, games.total_h1 AS games_total_h1, games.o_h1_price AS games_o_h1_price, games.u_h1_price AS games_u_h1_price, games.total_q1 AS games_total_q1, games.o_q1_price AS games_o_q1_price, games.u_q1_price AS games_u_q1_price, games.total_h2 AS games_total_h2, games.o_h2_price AS games_o_h2_price, games.u_h2_price AS games_u_h2_price
FROM box_scores JOIN games ON box_scores.espn_game_id = games.espn_game_id
WHERE games.total != :total_1 AND games.a_line != :a_line_1 ORDER BY box_scores.date DESC
This query takes more than 3 seconds and returns 55,000 results:
box_scores = s.query(Box_Score).all()
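I must be doing something wrong. I know people routinely work with databases that hold millions of entries, so I don't see why selecting 50,000 rows should be a big deal. I also tried joining on Box_Score instead of Game, and I tried removing the order_by() part, but neither change improved performance.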
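One way to narrow this down is to time the same join through the standard-library sqlite3 module, which shows how much of the time goes to SQLite itself rather than to building ORM objects. The following is only a sketch: the helper names time_query and build_demo_db are hypothetical, and the demo tables are a tiny in-memory stand-in that reuses a few of the column names above. To measure the real data, one would connect to the actual database file instead.

```python
import sqlite3
import time

# Same join and filters as the generated SQL, with positional parameters.
JOIN_SQL = """
SELECT box_scores.*, games.*
FROM box_scores JOIN games ON box_scores.espn_game_id = games.espn_game_id
WHERE games.total != ? AND games.a_line != ?
ORDER BY box_scores.date DESC
"""


def time_query(conn, sql, params=()):
    """Run sql, fetch every row, and return (elapsed_seconds, row_count)."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    return time.perf_counter() - start, len(rows)


def build_demo_db():
    """Hypothetical stand-in data: two games, five box score rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE games (id INTEGER PRIMARY KEY, espn_game_id INTEGER, "
        "total REAL, a_line REAL)"
    )
    conn.execute(
        "CREATE TABLE box_scores (id INTEGER PRIMARY KEY, espn_game_id INTEGER, "
        "date TEXT, player_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO games (espn_game_id, total, a_line) VALUES (?, ?, ?)",
        [(1, 45.5, 3.0), (2, 999, 999)],
    )
    conn.executemany(
        "INSERT INTO box_scores (espn_game_id, date, player_name) VALUES (?, ?, ?)",
        [
            (1, "2015-09-13", "A"),
            (1, "2015-09-13", "B"),
            (1, "2015-09-13", "C"),
            (2, "2015-09-20", "D"),
            (2, "2015-09-20", "E"),
        ],
    )
    conn.commit()
    return conn


conn = build_demo_db()
elapsed, count = time_query(conn, JOIN_SQL, (999, 999))
print(f"{count} rows in {elapsed:.4f} s")
```

If the raw query turns out to be fast on the real file, most of the 8 seconds is probably spent on the Python side; if it is just as slow, the time is being spent inside SQLite.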
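Update: To answer the question below, I am trying to learn what fragmentation is. I don't fully understand it yet, but I ran the command PRAGMA page_count -> 64,785, which doesn't seem like a big number. I also ran sqlite nfl.db "VACUUM"; and then ran the query again, but performance did not improve.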
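The same check can be scripted. Below is a minimal sketch using the standard-library sqlite3 module; the helper names storage_stats and vacuum_and_compare are hypothetical. PRAGMA page_count, PRAGMA page_size and PRAGMA freelist_count are standard SQLite pragmas: page_count multiplied by page_size gives the file size in bytes, and freelist_count is the number of unused pages that VACUUM would reclaim.

```python
import sqlite3


def storage_stats(conn):
    """Return page_count, page_size and freelist_count for an open database."""
    stats = {}
    for name in ("page_count", "page_size", "freelist_count"):
        stats[name] = conn.execute(f"PRAGMA {name}").fetchone()[0]
    return stats


def vacuum_and_compare(path):
    """Run VACUUM on the database at path and return (before, after) stats."""
    # isolation_level=None puts the connection in autocommit mode;
    # VACUUM cannot run inside an open transaction.
    conn = sqlite3.connect(path, isolation_level=None)
    try:
        before = storage_stats(conn)
        conn.execute("VACUUM")
        after = storage_stats(conn)
    finally:
        conn.close()
    return before, after
```

Calling vacuum_and_compare("nfl.db") would report the numbers before and after the rebuild. If freelist_count is already near zero beforehand, VACUUM has little to reclaim, which would be consistent with the query time not changing.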