I'm running a cron job (every 15 minutes) which takes about a minute to execute. It makes lots of API calls and stores data to the database.

Right now I create a MySQL connection at the beginning and use the same connection throughout the script. Most of the time is spent making the API calls.

Will it be more efficient to create a new database connection only when it's time to store the data (below)?

  1. Kill the last connection
  2. Wait for API call to complete
  3. Create new DB connection
  4. Execute query
  5. Goto 1
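The loop above could be sketched like this (a sketch only; `make_connection` and `call_api` are hypothetical stand-ins passed in as parameters — in real code `make_connection` would be something like `MySQLdb.connect(...)`):

```python
def run_batch(make_connection, call_api, tasks):
    """Reconnect-per-write: do the slow API call with no DB connection
    held, then open a short-lived connection just to store the result."""
    stored = 0
    for task in tasks:
        data = call_api(task)        # step 2: wait for the API call
        conn = make_connection()     # step 3: fresh DB connection
        try:
            conn.execute("INSERT INTO results VALUES (%s)", (data,))  # step 4
            stored += 1
        finally:
            conn.close()             # step 1/5: kill the connection, go again
    return stored
```

Whether this beats holding one connection open depends on how expensive a connect is versus how long the connection would otherwise sit idle.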

[Edit] Here's the MySQL report. I'm new to MySQL - is there any reason to re-connect to the DB based on the following report?

  MySQL 5.1.26-rc-5.1.26r  uptime 0 1:8:58        Tue Jun 15 21:25:03 2010

  __ Key _________________________________________________________________
  Buffer used    33.00k of  24.00M  %Used:   0.13
    Current       4.52M            %Usage:  18.84
  Write hit      33.33%
  Read hit       69.16%

  __ Questions ___________________________________________________________
  Total           1.75k     0.4/s
    COM_QUIT    319.92k    77.3/s  %Total: 18312.
    -Unknown    319.90k    77.3/s          18311.
    DMS           1.53k     0.4/s           87.58
    Com_            199     0.0/s           11.39
    QC Hits           1     0.0/s            0.06
  Slow              144     0.0/s            8.24  %DMS:   9.41
  DMS             1.53k     0.4/s           87.58
    SELECT        1.22k     0.3/s           69.83         79.74
    INSERT          155     0.0/s            8.87         10.13
    UPDATE          155     0.0/s            8.87         10.13
    REPLACE           0       0/s            0.00          0.00
    DELETE            0       0/s            0.00          0.00
  Com_              199     0.0/s           11.39
    check            86     0.0/s            4.92
    show_status      41     0.0/s            2.35
    set_option       23     0.0/s            1.32

  __ SELECT and Sort _____________________________________________________
  Scan              653     0.2/s %SELECT:  53.52
  Range               0       0/s            0.00
  Full join           0       0/s            0.00
  Range check         0       0/s            0.00
  Full rng join       0       0/s            0.00
  Sort scan           0       0/s
  Sort range        590     0.1/s
  Sort mrg pass       0       0/s

  __ Query Cache _________________________________________________________
  Memory usage   43.57k of  12.00M  %Used:   0.35
  Block Fragmnt  25.35%
  Hits                1     0.0/s
  Inserts           916     0.2/s
  Insrt:Prune     916:1     0.2/s
  Hit:Insert     0.00:1

  __ Table Locks _________________________________________________________
  Waited              0       0/s  %Total:   0.00
  Immediate       1.65k     0.4/s

  __ Tables ______________________________________________________________
  Open               47 of 1024    %Cache:   4.59
  Opened             54     0.0/s

  __ Connections _________________________________________________________
  Max used            3 of   60      %Max:   5.00
  Total         319.92k    77.3/s

  __ Created Temp ________________________________________________________
  Disk table          2     0.0/s
  Table              28     0.0/s
  File                5     0.0/s

  __ Threads _____________________________________________________________
  Running             3 of    3
  Cached              0 of    4      %Hit:    100
  Created             3     0.0/s
  Slow                0       0/s

  __ Aborted _____________________________________________________________
  Clients             0       0/s
  Connects      319.86k    77.3/s

  __ Bytes _______________________________________________________________
  Sent           52.36M   12.7k/s
  Received       23.17M    5.6k/s
+1  A: 

It's a bit hard to give an opinion, considering that we have no idea what happens in step 2 (the API calls).

Remember that the first rule of optimization is: "Don't do it." Unless you have a good reason to address a performance problem (the DB is slow for other users, the CPU is maxed out during your cron run, and so on), it may be better not to do anything.

If you do have a reason to improve the program's efficiency, then you will also have hard numbers to compare against (for example: your cron batch takes so long that you had to skip some runs, or it finishes too late to satisfy user requirements, or it fills up the rollback structures). In that case, simply apply your modification in your test environment (it looks like a simple fix, unless you forgot to tell us that it would be very complicated to implement) and check whether it improves the measurements you found lacking at the start.

I am sorry, but "I wonder if this could be more efficient", without an idea of what problem you are actually trying to address, is a recipe for trouble.

p.marino
I have edited the question. Most of the things in the report are pretty alien to me, as I'm new to MySQL. Can you tell whether there's a problem here?
Yeti
+1  A: 

If the bottleneck is that you do not have enough free slots to connect to the DB, then yes, close the connection when possible.
Otherwise, use it and reuse it (at least within the same request).

Itay Moav
+1  A: 

It's rarely advantageous to drop connections and re-establish them. Making a connection to a DB is normally a fairly heavy operation. Many apps create connection pools precisely to avoid this: open some reasonable number of connections and keep them for long periods, maybe even forever, letting each thread or user take a connection from the pool when needed and give it back afterwards.
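A minimal pool along those lines can be sketched in a few lines (a sketch only; real applications would normally use an existing pooling library, and the `factory` parameter here is a hypothetical stand-in for whatever creates a real DB connection):

```python
import queue

class ConnectionPool:
    """Create `size` connections up front and hand them out on demand,
    so the per-query cost of connecting is paid only once per slot."""

    def __init__(self, factory, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(factory())

    def acquire(self, timeout=None):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        return self._idle.get(timeout=timeout)

    def release(self, conn):
        # Return the connection for the next caller to reuse.
        self._idle.put(conn)
```

`queue.Queue` is thread-safe, so multiple worker threads can call `acquire`/`release` without extra locking.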

If you're having a problem with connections getting orphaned -- the query fails and you never manage to free the connection -- the real solution is to implement proper exception handling so that doesn't happen.
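One common way to guarantee the connection is always freed is a try/finally around the query (a sketch; the `acquire`/`release` pair here is a hypothetical checkout/checkin interface on whatever pool or connection manager is in use):

```python
def store_row(pool, sql, params):
    """Run one query and always return the connection to the pool,
    even if the query raises, so nothing is orphaned on failure."""
    conn = pool.acquire()
    try:
        conn.execute(sql, params)
    finally:
        pool.release(conn)   # runs on success and on exception alike
```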

If you have one thread sitting on an inactive connection while other threads are failing because the database has hit its connection maximum, then, yes, you need to look at freeing up connections.

Side note: MySQL claims that it can make connections very quickly, so that this is less of an issue than for other database engines. I have never benchmarked this.

Jay