views:

116

answers:

3

What is the easiest way to determine which table takes most disk space ?

A handicap: I have no MySQL server, only the file with all data (dump.sql)

+3  A: 

You may want to download MySQL server, install it on your localhost, import the dump file, and then use:

SELECT   table_schema, 
         table_name, 
         data_length, 
         index_length 
FROM     information_schema.tables
ORDER BY data_length DESC;
Daniel Vassallo
I told I search a solution without MySQL server, I'm interested in text processing ideas only
astropanic
@astropanic: It's true, but why can't you install it on your local machine? There is a version of MySQL for every platform, and it installs in minutes. You can also install it in a VM.
Daniel Vassallo
A: 

If you're on linux, you can scan the dump file for the longest line, which may (or may not!) be the largest table depending on indexes, structure etc, but it's going to be a decent guess if you can't get a server up.

awk ' { if ( length > L ) { L=length ;s=$0 } }END{ print L,"\""s"\"" }' /root/sql/_common.all.301009.sql  | cut -c 1-100

This will show you the first 100 characters of the longest line. This may not work if you have, for example, multiple inserts per table in your dump file (with --extended-insert or --opt options).

Andy
A: 

A quick solution might be to do something like that for each table in the dump:

cat dump.sql | grep -i '^INSERT INTO `table1`' | wc -c

(I hope you are using Linux, otherwise you can install Cygwin to get some of the functionally of the linux command line on a windows system)

This command will filter out the inserts statements of the dump for a specific table and print out the total bytes of those filtered insert statements. Of course, the size of the insert statements isn't the same as the size of the stored rows in the table, but it might be a good approximation if you only need to decide which table is (probably) the largest.

tux21b
You'll need to preprocess the dump file into individual tables to use this command - it'll only output the total character count in this form
Andy
That's why there is the `grep` command between the `cat` and the `wc -c` command. This will create such a "individual" dump on the fly...
tux21b
I don't understand how this works, when I execute I only get the sum of characters from all rows?
Andy
When you execute only the first part of the command `cat dump.sql | grep -i '^INSERT INTO `table1`'` you should only see lines starting with "INSERT INTO table1" and no lines starting with "INSERT INTO table2 or something like that. Those statements are filtered. `wc -c` then display the bytes of only the matched lines.
tux21b