I'm attempting to debug a query which is slow on production, but fast on my development machine. My dev box has a snapshot of the prod database which is only a couple of days old, so the contents of both DBs are roughly the same.
The query is:
select count(*) from big_table where search_column in ('something')
Notes:
- big_table is a snapshot materialized view with about 35M rows and is refreshed daily
- search_column has a b-tree index
- prod is 9.1 on Ubuntu
- dev is 9.0 on OS X
Query Plan
The results of explain analyze:
prod:
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate (cost=1119843.20..1119843.21 rows=1 width=0) (actual time=467388.276..467388.278 rows=1 loops=1)
   -> Bitmap Heap Scan on big_table (cost=10432.55..1118804.45 rows=415497 width=0) (actual time=116891.126..466949.331 rows=210053 loops=1)
         Recheck Cond: ((search_column)::text = 'something'::text)
         -> Bitmap Index Scan on big_table_search_column_index (cost=0.00..10328.68 rows=415497 width=0) (actual time=8467.901..8467.901 rows=337164 loops=1)
               Index Cond: ((search_column)::text = 'something'::text)
 Total runtime: 467389.534 ms
(6 rows)
dev:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate (cost=524011.38..524011.39 rows=1 width=0) (actual time=209.852..209.852 rows=1 loops=1)
   -> Bitmap Heap Scan on big_table (cost=5131.43..523531.22 rows=192064 width=0) (actual time=33.792..194.730 rows=209551 loops=1)
         Recheck Cond: ((search_column)::text = 'something'::text)
         -> Bitmap Index Scan on big_table_search_column_index (cost=0.00..5083.42 rows=192064 width=0) (actual time=27.568..27.568 rows=209551 loops=1)
               Index Cond: ((search_column)::text = 'something'::text)
 Total runtime: 209.938 ms
(6 rows)
The actual results of the two queries for prod and dev are 210053 and 209551 rows, respectively.
Although the structure of the two plans is the same, what could explain the differences in the costs above, and the much larger difference in actual run time, given that there are roughly the same number of rows in this table in each DB?
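To put numbers on the gap, here is a quick sketch in Python that compares the figures from the two plans. All values are copied from the explain analyze output above; the dictionary keys and the `summarize` helper are my own names, not anything from PostgreSQL.

```python
# Figures copied from the prod and dev explain analyze output in this post.
plans = {
    "prod": {
        "total_cost": 1119843.21,   # Aggregate upper cost estimate
        "est_rows": 415497,         # planner row estimate for the heap scan
        "heap_rows": 210053,        # actual rows from the Bitmap Heap Scan
        "index_rows": 337164,       # actual rows from the Bitmap Index Scan
        "runtime_ms": 467389.534,
    },
    "dev": {
        "total_cost": 524011.39,
        "est_rows": 192064,
        "heap_rows": 209551,
        "index_rows": 209551,
        "runtime_ms": 209.938,
    },
}


def summarize(plan):
    """Return per-plan ratios: estimate accuracy and index/heap row gap."""
    return {
        "est_over_actual": plan["est_rows"] / plan["heap_rows"],
        "index_minus_heap": plan["index_rows"] - plan["heap_rows"],
    }


for name, plan in plans.items():
    print(name, summarize(plan))

prod, dev = plans["prod"], plans["dev"]
cost_ratio = prod["total_cost"] / dev["total_cost"]
est_rows_ratio = prod["est_rows"] / dev["est_rows"]
runtime_ratio = prod["runtime_ms"] / dev["runtime_ms"]

print(f"cost ratio prod/dev:      {cost_ratio:.2f}")
print(f"est. rows ratio prod/dev: {est_rows_ratio:.2f}")
print(f"runtime ratio prod/dev:   {runtime_ratio:.0f}")
```

As far as I can tell from these ratios, the estimated cost on prod is roughly double because the planner expects roughly double the rows, but that does not come close to accounting for a run time that is over two thousand times longer. On prod the index scan also returns 127111 more entries than the heap scan keeps, whereas on dev the two counts match.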
Bloat
On @bma's suggestion, here are the results of the "bloat" query for prod and dev and the relevant table/index:
prod:
current_database | schemaname | tablename | tbloat | wastedbytes | iname | ibloat | wastedibytes
------------------+------------+---------------------------------+--------+-------------+---------------------------------------------------------------+--------+--------------
my_db | public | big_table | 1.6 | 7965433856 | big_table_search_column_index | 0.1 | 0
dev:
current_database | schemaname | tablename | tbloat | wastedbytes | iname | ibloat | wastedibytes
------------------+------------+---------------------------------+--------+-------------+---------------------------------------------------------------+--------+--------------
my_db | public | big_table | 0.8 | 0 | big_table_search_column_index | 0.1 | 0
Voila, there is a difference here.
I have run vacuum analyze big_table; but that doesn't seem to have made any significant difference to the run time of the count query.
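For scale, here is a small Python sketch that converts the prod wastedbytes figure into GiB and compares it with the two memory settings from the prod configuration listed under Config. The bloat query only produces an estimate, so this is a rough indication rather than an exact measurement; the variable names are my own.

```python
# Values copied from the prod rows of the bloat and config output in this post.
wasted_bytes = 7965433856          # wastedbytes for big_table on prod (an estimate)
shared_buffers_gib = 2             # shared_buffers = 2GB
effective_cache_size_gib = 6       # effective_cache_size = 6GB

GIB = 1024 ** 3                    # PostgreSQL's "GB" unit is 1024-based
wasted_gib = wasted_bytes / GIB

print(f"estimated wasted space in big_table on prod: {wasted_gib:.2f} GiB")
print("exceeds shared_buffers:", wasted_gib > shared_buffers_gib)
print("exceeds effective_cache_size:", wasted_gib > effective_cache_size_gib)
```

If the estimate is anywhere near right, the wasted space alone in big_table on prod is larger than either setting, while dev reports none.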
Config
Results of SELECT name, current_setting(name), source FROM pg_settings WHERE source NOT IN ('default', 'override'); as suggested by bma:
prod:
name | current_setting | source
----------------------------+----------------------------------+----------------------
application_name | psql | client
DateStyle | ISO, MDY | configuration file
default_text_search_config | pg_catalog.english | configuration file
effective_cache_size | 6GB | configuration file
external_pid_file | /var/run/postgresql/9.1-main.pid | configuration file
listen_addresses | * | configuration file
log_line_prefix | %t | configuration file
log_timezone | localtime | environment variable
max_connections | 100 | configuration file
max_stack_depth | 2MB | environment variable
port | 5432 | configuration file
shared_buffers | 2GB | configuration file
ssl | on | configuration file
TimeZone | localtime | environment variable
unix_socket_directory | /var/run/postgresql | configuration file
(15 rows)
dev:
name | current_setting | source
----------------------------+-------------------------+----------------------
application_name | psql | client
DateStyle | ISO, MDY | configuration file
default_text_search_config | pg_catalog.english | configuration file
effective_cache_size | 4GB | configuration file
lc_messages | en_US | configuration file
lc_monetary | en_US | configuration file
lc_numeric | en_US | configuration file
lc_time | en_US | configuration file
listen_addresses | * | configuration file
log_destination | syslog | configuration file
log_directory | ../var | configuration file
log_filename | postgresql-%Y-%m-%d.log | configuration file
log_line_prefix | %t | configuration file
log_statement | all | configuration file
log_timezone | Australia/Hobart | command line
logging_collector | on | configuration file
maintenance_work_mem | 512MB | configuration file
max_connections | 50 | configuration file
max_stack_depth | 2MB | environment variable
shared_buffers | 2GB | configuration file
ssl | off | configuration file
synchronous_commit | off | configuration file
TimeZone | Australia/Hobart | command line
timezone_abbreviations | Default | command line
work_mem | 100MB | configuration file
(25 rows)