tags:

views:

92

answers:

1

Hello, I have a test table. The test table is as follows:

CREATE TABLE  `mytest` (
  `num1` int(10) unsigned NOT NULL,
  KEY `key1` (`num1`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

I inserted 50 million rows in this table.

When I do show table status the avg_row_length is 7. I was expecting to see 4 since mySQL uses 4 bytes for integers. Does the key have an effect on the avg_row_length? When I look at my .MYD file the size is 334 MB which is exactly what it should be given that the avg_row_length is 7, however I was really expecting to see 190 MB given that I only have an int.

+----------------+--------+---------+------------+----------+----------------+-------------+------------------+--------------+-----------+----------------+---------------------+---------------------+------------+-------------------+----------+----------------+---------+
| Name           | Engine | Version | Row_format | Rows     | Avg_row_length | Data_length | Max_data_length  | Index_length | Data_free | Auto_increment | Create_time         | Update_time         | Check_time | Collation         | Checksum | Create_options | Comment |
+----------------+--------+---------+------------+----------+----------------+-------------+------------------+--------------+-----------+----------------+---------------------+---------------------+------------+-------------------+----------+----------------+---------+
| mytest         | MyISAM |      10 | Fixed      | 50000000 |              7 |   350000000 | 1970324836974591 |    600518656 |         0 |           NULL | 2010-05-22 09:15:06 | 2010-05-22 19:32:53 | NULL       | latin1_swedish_ci |     NULL |                |         |

I have included the output of show table status for mytest table. Sorry about the formatting :D Thanks in advance!

Alessandro Ferrucci

+1  A: 

I think the problem is with the pointer size MySQL uses by default.

A citation from MySQL reference:

AVG_ROW_LENGTH

An approximation of the average row length for your table. You need to set this only for large tables with variable-size rows.

When you create a MyISAM table, MySQL uses the product of the MAX_ROWS and AVG_ROW_LENGTH options to decide how big the resulting table is. If you don't specify either option, the maximum size for MyISAM data and index files is 256TB by default. (If your operating system does not support files that large, table sizes are constrained by the file size limit.) If you want to keep down the pointer sizes to make the index smaller and faster and you don't really need big files, you can decrease the default pointer size by setting the myisam_data_pointer_size system variable. (See Section 5.1.4, “Server System Variables”.) If you want all your tables to be able to grow above the default limit and are willing to have your tables slightly slower and larger than necessary, you can increase the default pointer size by setting this variable. Setting the value to 7 allows table sizes up to 65,536TB.

Try to set MAX_ROWS and AVG_ROW_LENGTH yourself on table creation or with ALTER TABLE statement, and see, if that helped.

newtover
Interesting. I tried setting just the AVG_ROW_LENGTH since that's really what I was looking to constrict. That took a really short time and did nothing (AVG_ROW_LENGTH was still 7). I then set both variables with the following statement:alter table mytest MAX_ROWS=50000000 AVG_ROW_LENGTH=4;and now the AVG_ROW_LENGTH is 5 (bytes I'm assuming).I'm not sure what MyISAM is using that extra byte for... byte padding? but why would it need to pad an already even-number byte size?
alessandro ferrucci
I forgot to add, now the entire .MYD size is 239 MB, which does not compute to 5 * 500,000,000 . Anyone know if altering a table and changing the max row length and avg_row_size allows myISAM to do some space saving optimizations in the way it stores the data?
alessandro ferrucci
@alessandro ferrucci: I am not sure, but I think that the `AVG_ROW_LENGTH` and `MAX_ROWS` are just advisable values. And probably, AVG_ROW_SIZE is even calculated based on current data size and row_count. But I do not think that MySQL reserves space exactly for the current number of rows, thus you obtain the difference. These are my own speculations though.
newtover
@alessandro ferrucci: setting both values (MAX_ROWS and AVG_ROW_SIZE) to a MyISAM table instructs MySQL to recalculate poiner size for the table. I read about it in a book about MySQL tuning (written by authors of http://www.mysqlperformanceblog.com/)
newtover
These parameters only affect the index, overriding the variable "myisam_data_pointer_size" - which defaults to 6 bytes since 5.0. Each index entry contains either a pointer (for dynamic format rows) or a record number (for fixed format) which are used to locate the appropriate data record in the table file (MYD).
Martin
@Martin: altering the table to have those 2 attributes however did both 1) change the size of AVG_ROW_LENGTH and decrease the size of the table file. I do not understand how it could only affect the index if it shrunk my table file size.
alessandro ferrucci
For a table containing a single non-nullable int, I expect the avg_row_length to be 5: 4-bytes for the int, 1-byte for the delete flag. On linux, I get 6-bytes, on windows I get 7. For a table containing 2 non-nullable ints, I expect the avg_row_length to be 9 bytes: 8-bytes for the ints, 1-byte for the delete flag. On linux and windows, this is exactly what happens. I think you may have found a bug in the myisam engine for the case of a single non-nullable int.
Martin
@Martin: nice to learn about the delete flag.
newtover
@Martin: Thank you very much for that piece of information. I will look for a dev/user listserv for myISAM to report/ask about this.
alessandro ferrucci