tags:

views:

168

answers:

4

I have query which is going in the SENDING DATA state for very long period of time. Can someone please help me with this : below are details

Mysql Query:

select a.msgId,a.senderId,a.destination,a.inTime,a.status as InStatus,b.status as SubStatus,c.deliverTime,substr(c.receipt,82,7) as DlvStatus from inserted_history a left join submitted_history b on b.msgId = a.msgId left join delivered_history c on a.msgId = c.msgId where a.inTime between '2010-08-10 00:00:00' and '2010-08-010 23:59:59' and a.systemId='ND_arber'

Total records in delivered_history : 223870168

Total records in inserted_history : 264817239

Total records in submitted_history : 226637058

Explain query returns:

id , select_type , table , type , possible_keys , key , key_len , ref , rows , Extra
1 , SIMPLE , a , ref , systemId,idx_time , systemId , 14 , const , 735310 , Using where
1 , SIMPLE , b , ref , PRIMARY , PRIMARY , 66 , gwreports2.a.msgId , 2270405 ,
1 , SIMPLE , c , ref , PRIMARY , PRIMARY , 66 , gwreports2.a.msgId , 2238701 ,

CREATE TABLE for delivered_history

CREATE TABLE delivered_history ( msgId VARCHAR(64) NOT NULL, systemId VARCHAR(12) NOT NULL, deliverTime TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00', smscId VARCHAR(64) NOT NULL, smsc VARCHAR(20) NOT NULL, receipt BLOB NULL, errcode INT(11) NULL DEFAULT NULL, PRIMARY KEY (msgId, deliverTime), INDEX systemId (systemId), INDEX smsc (smsc), INDEX idx_time (deliverTime) ) ROW_FORMAT=DEFAULT

CREATE TABLE for inserted_history

CREATE TABLE inserted_history ( msgId VARCHAR(64) NOT NULL, systemId VARCHAR(12) NOT NULL, senderId VARCHAR(15) NOT NULL, destination VARCHAR(15) NOT NULL, inTime TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00', status VARCHAR(20) NOT NULL, msgText BLOB NULL, msgType VARCHAR(15) NULL DEFAULT NULL, PRIMARY KEY (msgId, inTime), INDEX systemId (systemId), INDEX senderId (senderId), INDEX destination (destination), INDEX status (status), INDEX idx_time (inTime) ) ROW_FORMAT=DEFAULT

CREATE TABLE for submitted_history

CREATE TABLE submitted_history ( msgId VARCHAR(64) NOT NULL, systemId VARCHAR(12) NOT NULL, submitTime TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00', status VARCHAR(20) NOT NULL, smscId VARCHAR(64) NOT NULL, smsc VARCHAR(16) NOT NULL, errcode INT(6) NULL DEFAULT '0', PRIMARY KEY (msgId, submitTime), INDEX systemId (systemId), INDEX smsc (smsc), INDEX status (status), INDEX idx_time (submitTime) ) ROW_FORMAT=DEFAULT

ALL TABLES ARE DATE PARTIONED on the timestamp fields

List of the global variables in Mysql Server

Variable_name , Value
auto_increment_increment , 1
auto_increment_offset , 1
autocommit , ON
automatic_sp_privileges , ON
back_log , 50
basedir , /usr/
big_tables , OFF
binlog_cache_size , 32768
binlog_format , STATEMENT
bulk_insert_buffer_size , 8388608
character_set_client , latin1
character_set_connection , latin1
character_set_database , latin1
character_set_filesystem , binary
character_set_results , latin1
character_set_server , latin1
character_set_system , utf8
character_sets_dir , /usr/share/mysql/charsets/
collation_connection , latin1_swedish_ci
collation_database , latin1_swedish_ci
collation_server , latin1_swedish_ci
completion_type , 0
concurrent_insert , 1
connect_timeout , 10
datadir , /var/lib/mysql/
date_format , %Y-%m-%d
datetime_format , %Y-%m-%d %H:%i:%s
default_week_format , 0
delay_key_write , ON
delayed_insert_limit , 100
delayed_insert_timeout , 300
delayed_queue_size , 1000
div_precision_increment , 4
engine_condition_pushdown , ON
error_count , 0
event_scheduler , OFF
expire_logs_days , 10
flush , OFF
flush_time , 0
foreign_key_checks , ON
ft_boolean_syntax , + -><()~*: &|
ft_max_word_len , 84
ft_min_word_len , 4
ft_query_expansion_limit , 20
ft_stopword_file , (built-in)
general_log , OFF
general_log_file , /var/run/mysqld/mysqld.log
group_concat_max_len , 1024
have_community_features , YES
have_compress , YES
have_crypt , YES
have_csv , YES
have_dynamic_loading , YES
have_geometry , YES
have_innodb , YES
have_ndbcluster , NO
have_openssl , DISABLED
have_partitioning , YES
have_query_cache , YES
have_rtree_keys , YES
have_ssl , DISABLED
have_symlink , YES
hostname , smscdb
identity , 0
ignore_builtin_innodb , OFF
init_connect ,
init_file ,
init_slave ,
innodb_adaptive_hash_index , ON
innodb_additional_mem_pool_size , 1048576
innodb_autoextend_increment , 8
innodb_autoinc_lock_mode , 1
innodb_buffer_pool_size , 8388608
innodb_checksums , ON
innodb_commit_concurrency , 0
innodb_concurrency_tickets , 500
innodb_data_file_path , ibdata1:10M:autoextend
innodb_data_home_dir ,
innodb_doublewrite , ON
innodb_fast_shutdown , 1
innodb_file_io_threads , 4
innodb_file_per_table , OFF
innodb_flush_log_at_trx_commit , 1
innodb_flush_method ,
innodb_force_recovery , 0
innodb_lock_wait_timeout , 50
innodb_locks_unsafe_for_binlog , OFF
innodb_log_buffer_size , 1048576
innodb_log_file_size , 5242880
innodb_log_files_in_group , 2
innodb_log_group_home_dir , ./
innodb_max_dirty_pages_pct , 90
innodb_max_purge_lag , 0
innodb_mirrored_log_groups , 1
innodb_open_files , 300
innodb_rollback_on_timeout , OFF
innodb_stats_on_metadata , ON
innodb_support_xa , ON
innodb_sync_spin_loops , 20
innodb_table_locks , ON
innodb_thread_concurrency , 8
innodb_thread_sleep_delay , 10000
innodb_use_legacy_cardinality_algorithm , ON
insert_id , 0
interactive_timeout , 28800
join_buffer_size , 131072
keep_files_on_create , OFF
key_buffer_size , 1073741824
key_cache_age_threshold , 300
key_cache_block_size , 1024
key_cache_division_limit , 100
language , /usr/share/mysql/english/
large_files_support , ON
large_page_size , 0
large_pages , OFF
last_insert_id , 0
lc_time_names , en_US
license , GPL
local_infile , ON
locked_in_memory , OFF
log , OFF
log_bin , ON
log_bin_trust_function_creators , OFF
log_bin_trust_routine_creators , OFF
log_error ,
log_output , FILE
log_queries_not_using_indexes , OFF
log_slave_updates , OFF
log_slow_queries , OFF
log_warnings , 1
long_query_time , 10.000000
low_priority_updates , OFF
lower_case_file_system , OFF
lower_case_table_names , 0
max_allowed_packet , 536870912
max_binlog_cache_size , 4294963200
max_binlog_size , 104857600
max_connect_errors , 10
max_connections , 151
max_delayed_threads , 20
max_error_count , 64
max_heap_table_size , 16777216
max_insert_delayed_threads , 20
max_join_size , 18446744073709551615
max_length_for_sort_data , 1024
max_prepared_stmt_count , 16382
max_relay_log_size , 0
max_seeks_for_key , 4294967295
max_sort_length , 1024
max_sp_recursion_depth , 0
max_tmp_tables , 32
max_user_connections , 0
max_write_lock_count , 4294967295
min_examined_row_limit , 0
multi_range_count , 256
myisam_data_pointer_size , 6
myisam_max_sort_file_size , 2146435072
myisam_recover_options , BACKUP
myisam_repair_threads , 1
myisam_sort_buffer_size , 8388608
myisam_stats_method , nulls_unequal
myisam_use_mmap , OFF
net_buffer_length , 16384
net_read_timeout , 30
net_retry_count , 10
net_write_timeout , 60
new , OFF
old , OFF
old_alter_table , OFF
old_passwords , OFF
open_files_limit , 20000
optimizer_prune_level , 1
optimizer_search_depth , 62
optimizer_switch , index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on
pid_file , /var/run/mysqld/mysqld.pid
plugin_dir , /usr/lib/mysql/plugin
port , 3306
preload_buffer_size , 32768
profiling , OFF
profiling_history_size , 15
protocol_version , 10
pseudo_thread_id , 0
query_alloc_block_size , 8192
query_cache_limit , 1073741824
query_cache_min_res_unit , 4096
query_cache_size , 536870912
query_cache_type , ON
query_cache_wlock_invalidate , OFF
query_prealloc_size , 8192
rand_seed1 ,
rand_seed2 ,
range_alloc_block_size , 4096
read_buffer_size , 131072
read_only , OFF
read_rnd_buffer_size , 33554432
relay_log ,
relay_log_index ,
relay_log_info_file , relay-log.info
relay_log_purge , ON
relay_log_space_limit , 0
report_host ,
report_password ,
report_port , 3306
report_user ,
rpl_recovery_rank , 0
secure_auth , OFF
secure_file_priv ,
server_id , 3
skip_external_locking , ON
skip_networking , OFF
skip_show_database , OFF
slave_compressed_protocol , OFF
slave_exec_mode , STRICT
slave_load_tmpdir , /tmp
slave_net_timeout , 3600
slave_skip_errors , OFF
slave_transaction_retries , 10
slow_launch_time , 2
slow_query_log , OFF
slow_query_log_file , /var/run/mysqld/mysqld-slow.log
socket , /var/run/mysqld/mysqld.sock
sort_buffer_size , 67108864
sql_auto_is_null , ON
sql_big_selects , ON
sql_big_tables , OFF
sql_buffer_result , OFF
sql_log_bin , ON
sql_log_off , OFF
sql_log_update , ON
sql_low_priority_updates , OFF
sql_max_join_size , 18446744073709551615
sql_mode ,
sql_notes , ON
sql_quote_show_create , ON
sql_safe_updates , OFF
sql_select_limit , 18446744073709551615
sql_slave_skip_counter ,
sql_warnings , OFF
ssl_ca ,
ssl_capath ,
ssl_cert ,
ssl_cipher ,
ssl_key ,
storage_engine , MyISAM
sync_binlog , 0
sync_frm , ON
system_time_zone , IST
table_definition_cache , 256
table_lock_wait_timeout , 50
table_open_cache , 500
table_type , MyISAM
thread_cache_size , 8
thread_handling , one-thread-per-connection
thread_stack , 196608
time_format , %H:%i:%s
time_zone , SYSTEM
timed_mutexes , OFF
timestamp , 1282125419
tmp_table_size , 16777216
tmpdir , /tmp
transaction_alloc_block_size , 8192
transaction_prealloc_size , 4096
tx_isolation , REPEATABLE-READ
unique_checks , ON
updatable_views_with_limit , YES
version , 5.1.37-1ubuntu5-log
version_comment , (Ubuntu)
version_compile_machine , i486
version_compile_os , debian-linux-gnu
wait_timeout , 28800
warning_count , 0

A: 

Does filtering the left join tables help to speed up the query?

John M
A: 

Main Idea

Are you using InnoDB? It looks like your buffer pool is only 8MB. That could easily be the problem, you're dealing with a lot of data and InnoDB doesn't have much memory. Can you bump the innodb_buffer_pool_size up? You'll have to restart MySQL, but I'm betting that would make a HUGE difference, even if you only give it 256 or 512MB.

Update: I see your storage engine and table format seem to default to MyISAM, so unless you specified otherwise this wouldn't apply. I wonder if the myisam_sort_buffer_size would help? We don't use MyISAM so I'm not familiar with tuning it.

Random Thought

I wonder if having having the primary key be alphanumeric (especially VARCHAR) has anything to do with it. I remember we had problems with performance on non-numeric primary keys, but that database dated from 4.0 or 4.1, so that may not apply (or ever have been true).

Secondary Idea

After the memory thing above, my best guess would be to give MySQL more hints. When I have a query that's running slow, I often find giving it more information helps it out. You have messageId/time indexes on each table. Maybe something more like this would work better:

select a.msgId,a.senderId,a.destination,a.inTime,a.status as InStatus,
b.status as SubStatus,c.deliverTime,substr(c.receipt,82,7) as DlvStatus
from inserted_history a left join submitted_history b on b.msgId = a.msgId
left join delivered_history c on a.msgId = c.msgId
where a.inTime between '2010-08-10 00:00:00' and '2010-08-010 23:59:59'
and a.systemId='ND_arber' AND c.inTime between b.inTime >= a.inTime
and c.inTime >= b.inTime

I'm guessing things get inserted into A, then B, then C. If you have better limits (say when something goes in A, it's always sent out and submitted within one day) add that information could help.

I wonder about this both because I've seen it help my query performance in some situations, but also because you have the data partitioned on the datetime. That may help the optimizer.

My other suggestion would be to run your query for a short amount of time, say 10 minutes instead of a full day, and make sure the results are right. Then try 30. Increase it and see when it falls off into "come back tomorrow" territory. That may tell you something.

MBCook
A: 

Would it be possible to alter the index of inserted_history,

systemId (systemId)

to be

systemId (systemId, inTime). Or add an additional index

My logic being that this should help to speed up the selection of the inserted_history (a) rows which forms the basis of the join.

The where clause "where a.inTime between '2010-08-10 00:00:00' and '2010-08-010 23:59:59' and a.systemId='ND_arber'" would all be selectable by index. At present, rows are selectable by systemId but then all those rows need to be scanned for the time.

Just as a matter of interest, how many records would there be (on average) for each system id. Also as msgid is not unique on its own, how many records (on average) in the other tables will have teh same msgid.

Jaydee
A: 

Your explain plan that you gave:

id , select_type , table , type , possible_keys      , key      , key_len  , ref                 , rows , Extra
1  , SIMPLE      , a     , ref  , systemId idx_time) , systemId , 14       , const              , 735310 , Using where
1  , SIMPLE      , b     , ref  , PRIMARY            , PRIMARY  , 66       , gwreports2.a.msgId , 2270405 ,
1  , SIMPLE      , c     , ref  , PRIMARY            , PRIMARY  , 66       , gwreports2.a.msgId , 2238701 , 

shows that you are hitting: 735310 * 2270405 * 2238701 = 3T rows!!!!!! Effectively your not using your indexes to their fullest potential.

How to interpret your 'explain plan': For every row in table 'a' (735310 ), you hit table 'b' 2270405 times. For every row you hit in table 'b', you hit table 'c' 2238701 times. As you can see, this is an exponential problem.

Yes, the 8MB of InnoDb Buffer space is small, but getting your explain plan down to xxxx * 1 * 1 will result in incredible speeds, even for 8MB of Buffer Space.

Given your Query:

SELECT a.msgId,a.senderId,a.destination,a.inTime,a.status as InStatus,b.status as SubStatus,c.deliverTime,substr(c.receipt,82,7) as DlvStatus 
  FROM inserted_history a
  LEFT JOIN submitted_history b ON b.msgId = a.msgId  -- USES 1 column of PK
  LEFT JOIN delivered_history c ON a.msgId = c.msgId  -- USES 1 column of PK 
 WHERE a.inTime BETWEEN '2010-08-10 00:00:00' AND '2010-08-010 23:59:59' -- NO key 
   AND a.systemId='ND_arber' -- Uses non-unique PK

Here are the problems I see: A) Your _history tables are partitioned on the columns with 'Timestamp' datatype, YET you are NOT those columns in your JOIN/WHERE criteria. The engine must hit EVERY partition without that information. B) Access to submitted_history and delivered_history is using only 1 column of a 2-column PK. You are only getting partial benefit of the PK. Can you get more columns to be part of the JOIN? You must get the # of rows found for this table as close to '1' as possible.
C) msgID = varchar(64) and this is the 1st column of the PK for each table. Your Keys on each table are ** HUGE **!!
- Try to reduce the size of columns for the PK, or use different columns.

Your data patterns of the other keys shows that you have LOTS of disk/ram space tied up in non-PK keys.

Question 1) What does "Show Indexes FROM " (Link) for each of the tables report?? The column 'Cardinality' will show you how effective each of your keys really are. The smaller the cardinality is, the WORST/Less effective that index is. You want cardinality as close to "total rows" as possible for ideal performance.

Question 2) Can you re-factor the SQL such that the JOIN'd columns of each table are those with the highest cardinality for that table?

Question 3) Is the columns of 'timestamp' datatype really the best column for the partitioning? If your access patterns always use 'msgId', and msgId is the 1st column of the PK, then .

Question 4) Is msgId unique? My guess is yes, and the 2nd column of the PK is not really necessary.

Read up on Optimizing SQL (Link) and have the index cardinality reports of your tables. This is the path to figure out how to optimize an query. You want the 'rows' of the explain plan to be N * 1 * 1.

SIDE NOTE:InnoDb & MyISAM engines does NOT automatically update table cardinality for non-unique columns, the DBA needs to manually run 'Analyze Table' periodically to ensure its accuracy.

Good Luck.

J Jorgenson