What would be the easiest way to count the new records that are inserted into a database? Is it possible to include a count query in with the load query?

Or is something more complex needed, such as recording the existing last record and counting everything added after it?

edit:

I have a cron job that uses LOAD DATA INFILE in a script that is passed directly to mysql. This data is used by a PHP web application. As part of that application, I need to generate weekly reports, including how many records were inserted in the last week.

I am unable to patch MySQL or drastically change the database schema/structure, but I am able to add new tables or fields. I would prefer not to count records from the CSV file and store the result in a text file or something similar. Instead, I would prefer to do everything from within PHP with queries.

A: 

That would probably depend on what is determined as being new. Is it entries entered into the database in the last five minutes or 10 minutes etc? Or is it any record past a certain Auto ID?

If you are looking at a time-based method of determining what's new, you can add a field (probably of type datetime) that records the time when each record was inserted. To get the number, you simply do a...

select count(*) from your_table where inserted_at > 'time-you-consider-to-be-new'

If you don't want to record the time, you can use an auto-increment key and simply keep track of the last inserted ID. For example, if the last ID one hour ago was 10000, then every record with an ID greater than 10000 has been inserted since then. Count those records, store the new last insert ID, and repeat whenever needed.
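
A minimal sketch of the auto-increment approach, assuming the loaded table has an auto-increment column. The names YOUR_TABLE, id and id_bookmarks are placeholders for illustration, not taken from the actual schema:

```sql
-- Hypothetical bookmark table: one row per checkpoint.
CREATE TABLE id_bookmarks (
  ts      DATETIME     NOT NULL,
  last_id INT UNSIGNED NOT NULL
);

-- Step 1 (report time): count rows added since the most recent checkpoint.
-- Returns 0 if no checkpoint has been recorded yet.
SELECT COUNT(*) AS new_rows
FROM YOUR_TABLE
WHERE id > (SELECT last_id FROM id_bookmarks ORDER BY ts DESC LIMIT 1);

-- Step 2: record a new checkpoint for the next report.
INSERT INTO id_bookmarks (ts, last_id)
SELECT NOW(), COALESCE(MAX(id), 0) FROM YOUR_TABLE;
```

Both statements can be issued from PHP as ordinary queries. A checkpoint has to exist before the first report, so step 2 would be run once up front.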

Steve Obbayi
each time a load is done, the entries loaded in will be considered new, regardless of their date.
Joshxtothe4
What exactly do you mean by "load". There's no LOAD keyword in SQL. Do you mean INSERT?
Dustin Fineout
There is a LOAD command in mysql...
Joshxtothe4
If you're doing a LOAD DATA INFILE, why can't you just count the number of records in the file you're importing?
scwagner
because I load data infile daily, and want to generate a report only weekly. I would rather do this from within PHP...
Joshxtothe4
A: 

Your question is a bit ambiguous, but the MySQL C API provides a function, "mysql_affected_rows", that you can call after each query to get the number of affected rows. For an insert it returns the number of rows inserted. Be aware that for updates it returns the number of rows actually changed, not the number of rows that matched the where clause.

If you are performing a number of queries and need to know how many were inserted the most reliable way would probably be doing a count before and after the queries.

As noted in sobbayi's answer adding a "created at" timestamp to your tables would allow you to query for records created after (or before) a given time.

UPDATE: OK, here is what you need to do to get a count before and after. Create a table for the counts:

create table row_counts (ts timestamp not null, row_count integer not null);

In your script, add the following before and after your LOAD DATA INFILE statement:

insert into row_counts (ts,row_count) select now(),count(0) from YOUR_TABLE;
load data infile ......
insert into row_counts (ts,row_count) select now(),count(0) from YOUR_TABLE;

The row_counts table will now have the count before and after your load.
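
For the weekly report, one way to read these snapshots back is sketched below. It assumes the table only grows between snapshots (no deletes) and that the report window is the last seven days; the column alias is made up for illustration:

```sql
-- Net rows added over the last 7 days, derived from the snapshots
-- stored in row_counts. Assumes rows are never deleted from YOUR_TABLE.
-- Returns NULL if no snapshot falls inside the window.
SELECT MAX(row_count) - MIN(row_count) AS rows_added_last_week
FROM row_counts
WHERE ts >= NOW() - INTERVAL 7 DAY;
```

This can be run from PHP as an ordinary query, so the report never needs to touch the CSV files.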

Craig
creating a timestamp field is not an option, as the structure can not be changed. I don't do the load from within php, so affected rows will not help me here.
Joshxtothe4
How do you load the data?
Craig
load data infile from a csv file
Joshxtothe4
I'd be happy to insert the date in a format that is easier to subtract from another date...
Joshxtothe4
A: 

Where do you load the data from? You might consider counting the records before you insert them into the database. If it's an SQL script, you could write a quick and dirty bash script (with grep or something similar) to count the fields.

I would rather count after insert than before, just in case of any inaccuracies
Joshxtothe4
Well, if you run the insert script, you will get errors if it does not work. If you have no errors, the number is correct. If you already have data in the table and you don't want to do it like this, you either have to add a special field for the current insert or do some pretty complicated stored procedure thingy. But there's probably something I haven't thought of. And as Craig asked above, how you want to insert your data is a pretty good question.
I'd much rather do it from within php if possible, due to the existing setup
Joshxtothe4
+1  A: 

Assuming you're using MySQL 5 or greater, you could create a trigger which would fire upon inserting into a specific table. Note that an "insert" trigger also fires with the "LOAD" command.

Using a trigger would require you to persist the count information in a separate table. Basically you'd need to create a new table with 1 row/column to hold the count. The trigger would then update that value with the amount of data loaded.

Here's the MySQL manual page on triggers; the syntax is fairly straightforward. http://dev.mysql.com/doc/refman/5.0/en/create-trigger.html

edit

Alternatively, if you don't want to persist the data within the database, you could perform a select count(*) on the table before you begin the "Load" and again after the "Load" is complete. You would just need to subtract the first value from the second to determine how many rows were inserted during the Load. Note that MySQL does not allow LOAD DATA INFILE inside a stored routine, so the load itself has to stay in your script; a stored procedure could only wrap the counting around it.

Here's the MySQL manual page on procedures. http://dev.mysql.com/doc/refman/5.0/en/create-procedure.html
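
As a rough sketch of the trigger approach described above: a one-row counter table that an AFTER INSERT trigger increments for every row loaded. The names load_counter, inserted and your_table_count_insert are hypothetical, and YOUR_TABLE stands in for the table being loaded:

```sql
-- Hypothetical one-row table holding a running total of inserted rows.
CREATE TABLE load_counter (
  id       TINYINT         NOT NULL PRIMARY KEY,
  inserted BIGINT UNSIGNED NOT NULL DEFAULT 0
);
INSERT INTO load_counter (id, inserted) VALUES (1, 0);

-- Fires once per inserted row, including rows added by LOAD DATA INFILE.
-- A single-statement body needs no BEGIN/END or DELIMITER change.
CREATE TRIGGER your_table_count_insert
AFTER INSERT ON YOUR_TABLE
FOR EACH ROW
  UPDATE load_counter SET inserted = inserted + 1 WHERE id = 1;

-- Read the running total, e.g. from the PHP report.
SELECT inserted FROM load_counter WHERE id = 1;
```

The counter is cumulative, so a weekly figure would come from storing the value at each report and subtracting the previous one. Note that MySQL versions before 5.7 allow only one AFTER INSERT trigger per table.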

CR
Can you expand this a bit more with an example?
Joshxtothe4
A: 

If you are not looking at a specific table, you can use the following:

 show global status like "Com_%";

This will show you statistics for every type of query. These numbers just keep on counting, so if you want to use them, record the initial number when starting to track the queries, and subtract this from your final number (but yea, that's a given).

If you are looking for pure statistics, I can recommend using Munin with the MySQL plugins.

Evert
So I can call this from PHP and store the result somehow, to use each time a report is generated?
Joshxtothe4
Yes, you could even store the result of that in mysql! Just record the number for example every day, and subtract from multiple days to find out how many queries happened in that time.
Evert
Careful though: this will count *every* query on *every* database on the server.
Eli
So I can't use this for just one table?
Joshxtothe4
If you want to count just one table, use a trigger instead. Also, please be more elaborate in your questions.
Evert
A: 

You say you can't change the structure. Does that mean you can't change the table you are inserting into, or you can't change the database at all? If you can add a table, then just create a table with 2 columns - a timestamp and the key of the table you are loading. Before you load your csv file, create another csv file with just those two columns, and load that csv after your main one.
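
A sketch of what that second table and the weekly query might look like. The names load_times, row_key and loaded_at are hypothetical, and it assumes the main table (written here as YOUR_TABLE) has an integer key column called id:

```sql
-- Hypothetical side table: one row per key loaded, stamped with the load time.
-- The second csv file would carry these two columns, with the timestamp
-- written as 'YYYY-MM-DD HH:MM:SS'.
CREATE TABLE load_times (
  row_key   INT      NOT NULL,
  loaded_at DATETIME NOT NULL,
  PRIMARY KEY (row_key)
);

-- Weekly report: rows of the main table whose key was loaded in the last 7 days.
SELECT COUNT(*) AS rows_loaded_last_week
FROM YOUR_TABLE AS t
JOIN load_times AS l ON l.row_key = t.id
WHERE l.loaded_at >= NOW() - INTERVAL 7 DAY;
```

The join keeps the main table's structure untouched while still attaching a load time to every row.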

Peter Recore
I can add tables. How would you create another csv file to be loaded in automatically?
Joshxtothe4
just write a script that reads in each line of the primary csv file, grabs the primary key column, and appends a new line to the second csv file. When you want to run your report at the end of the week, you would join the main data table with this second table using the key. that was you could associate a timestamp with each of your rows. Whatever script/program/cron job loads in the main file would also have to load the second file.
Peter Recore
dangit - that should say "that *way* you could"
Peter Recore
A: 

This might be simpler than you want, but what about a Nagios monitor to track the row count? (Also consider asking around on serverfault.com; this stuff is totally up their alley.)

ojrac
A: 

Perhaps you could write a small shell script that queries the database for the number of rows. You could then have a Cron job that runs every minute/hour/day etc and outputs the COUNT to a log file. Over time, you could review the log file and see the rate at which the database is growing. If you also put a date in the log file, you could review it easier over longer periods.

Gav
A: 

See if this is the kind of MySQL data collection you're interested in: http://code.google.com/p/google-mysql-tools/wiki/UserTableMonitoring.

If that is the case, Google offers a MySQL patch (to apply to a clean mysql directory source) at http://google-mysql-tools.googlecode.com/svn/trunk/mysql-patches/all.v4-mysql-5.0.37.patch.gz. You can read more about the patch at http://code.google.com/p/google-mysql-tools/wiki/Mysql5Patches.

If this is not what you're looking for, I suggest you explain yourself a little more in order for us to help you better.

tomzx
Patching mysql is not an option. I am not certain what is unclear about my question? I use LOAD DATA INFILE from a cron script that runs daily, and wish to generate a weekly report of records inserted using PHP.
Joshxtothe4
A: 

Could you use a trigger on the table which will insert into a table you created, which in the structure has a timestamp?

You could then use a date calculation on a period range to find the information needed.

I don't know what version of MySQL you are using, but here is a link to the syntax for trigger creation in version 5.0: http://dev.mysql.com/doc/refman/5.0/en/create-trigger.html
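
Roughly, that could look like the sketch below. The names insert_log, inserted_at and your_table_log_insert are made up for illustration, and YOUR_TABLE stands in for the table being loaded:

```sql
-- Hypothetical log table: one timestamped row per insert into the main table.
CREATE TABLE insert_log (
  inserted_at DATETIME NOT NULL,
  KEY (inserted_at)
);

-- Fires for every row inserted, including rows added by LOAD DATA INFILE.
CREATE TRIGGER your_table_log_insert
AFTER INSERT ON YOUR_TABLE
FOR EACH ROW
  INSERT INTO insert_log (inserted_at) VALUES (NOW());

-- Date calculation over a period: rows inserted in the last 7 days.
SELECT COUNT(*) AS rows_inserted_last_week
FROM insert_log
WHERE inserted_at >= NOW() - INTERVAL 7 DAY;
```

The log grows by one row per inserted record, so old entries may need pruning once the reports no longer need them. MySQL versions before 5.7 allow only one AFTER INSERT trigger per table.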

Good luck,

Matt

Lima
A: 

Well, if you need exhaustive information (which rows were inserted, updated or deleted), it might make sense to create an additional audit table to store those things with a timestamp. You could do this with triggers. I would also write a stored procedure that runs as a scheduled event and erases old entries (whatever you consider old).

Refer to the link posted by Lima on how to create triggers in MySQL.

Refer to page 655 of "MySQL Cookbook" by Paul Dubois (2nd Edition) or page 158 of "SQL for smarties" by Joe Celko.

MadH
A: 

So will the 'load' only insert new data into the table, or rewrite the whole table?

If it only loads new data, then you can do a

select count(*) from yourtable
once before the loading and once after the loading. The difference will show you how many new records were inserted.

If, on the other hand, you rewrite the whole table and want to find the records that differ from the previous version, then you would need a completely different approach.

Which one is it?

Gaby
This sounds about right! I am only adding new data to a table with load, not rewriting anything. How would I store the result of the count in another table, to be able to persist the result for my reports?
Joshxtothe4
A: 
show global status like 'Com_insert';

flush status and show session status... will work for just the current connection.

see http://dev.mysql.com/doc/refman/5.1/en/server-status-variables.html#statvar_Com_xxx

ʞɔıu
A: 

Since you asked for the easiest way, I would suggest using a trigger on insert. You could use a single-column, single-row table as a counter and update it with the trigger.

df