I've inherited a front end written by a third party. That front end interacts with Oracle through stored procedures written by a different third party. The procedure in question takes 2 minutes and 36 seconds to return search results when executed manually. I can't see the procedure's source, and that team has suggested that I increase the timeout in the Web application (hosted on a shared server).

In my world, anything over 30 seconds would require a performance fix before being deployed to production with a few exceptions (legacy code, crazy reports, etc.). The option that was suggested to me was to increase the timeout from 30 seconds (explicitly added by the front end developer) to 180 seconds.

My question to you: What are the risks with taking the easy approach and increasing the timeout? If possible, please provide links to articles that support your views so I can reference them.

Also feel free to chime in if you believe this is a non-issue.

+2  A: 

If you really want the procedure to complete, you should run it as a background task (not in the script context) and report its state in a resource accessible via AJAX or a plain page.

If the procedure runs in the script context, then whenever the script dies, the Oracle session dies too, and the procedure is rolled back if it's not complete yet.

This can be caused by reasons other than the timeout (the connection dies, the user closes the page, etc.).
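One way to sketch this background-task approach (a minimal, hypothetical example: run_search, the search_requests table, and the job name are made up and would need to fit the real front end):

```sql
-- Hypothetical sketch: run the search in a background job, poll a status table.
CREATE TABLE search_requests (
  request_id  NUMBER PRIMARY KEY,
  status      VARCHAR2(10) DEFAULT 'RUNNING',   -- RUNNING / DONE / FAILED
  submitted   DATE DEFAULT SYSDATE
);

BEGIN
  DBMS_SCHEDULER.create_job(
    job_name   => 'SEARCH_REQ_42',              -- hypothetical job name
    job_type   => 'PLSQL_BLOCK',
    job_action => 'BEGIN run_search(42);
                     UPDATE search_requests SET status = ''DONE''
                      WHERE request_id = 42;
                     COMMIT;
                   END;',
    enabled    => TRUE);
END;
/
-- The web tier then polls (via AJAX or a plain page):
--   SELECT status FROM search_requests WHERE request_id = 42;
```

The job survives the web request that submitted it, so a dead HTTP connection no longer rolls the work back.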

Quassnoi
The comment "return search results" implies a synchronous process where the user is waiting for the data. For reports or other asynchronous processes where the user isn't waiting, it makes total sense to run in the background through schedulers and task-management infrastructure.
Adam Musch
@Adam: Sorry, I don't see how a search "implies a synchronous process". A search can take minutes (which any person who has ever used online flight booking service is aware of) and on a properly written engine the user should be able to see the progress or even close the page and be able to return later.
Quassnoi
@Quassnoi: I disagree, given his other comments about exceptions and "crazy reports". Reports to me are asynchronous (you get them when they're done) and searches are synchronous (you need the search to complete to decide what to do next). It would be nice if the OP would clarify. I can't see actual progress when searching for flights on Orbitz, and I've never had it search for minutes without me resubmitting my search. Regardless, as my downvote seems to be deeply offensive, I'll remove it if you update your answer so I can do so.
Adam Musch
@Adam: I don't care much about the reputation, I just wanted to see your point. Do I get it right: you want to say that asynchronous search is a poor design and any search function should be synchronous or not implemented at all?
Quassnoi
@Quassnoi: I would not say that synchronous search is a poor design; I would say that the business case described does imply synchronicity, or the user/front end serializing on the result, if you prefer. If the front end is serializing on the results, performance better be snappy -- 30 seconds being the OP's standard, and I've seen stricter -- whether that means tuning the process or requiring greater selectivity of user input.
Adam Musch
+5  A: 

You should not increase the timeout to hide the problem. You've positively identified a performance problem -- it's something that should be fixed, not swept under the rug.

Two things to do:

Get a SQL trace of the stored procedure:

-- trace the current session, including wait events (bind values omitted)
exec dbms_monitor.session_trace_enable(binds => false, waits => true);
exec poor_performing_procedure;
exec dbms_monitor.session_trace_disable;

Look to see what statements are being run, how often, and how much time is spent running them.
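Once the trace is captured, you need to locate the trace file and format it. A sketch, assuming 11g or later (on 10g, look in the user_dump_dest directory instead), with tkprof run from the OS shell:

```sql
-- Locate the trace file for the current session (11g+).
SELECT value AS trace_file
  FROM v$diag_info
 WHERE name = 'Default Trace File';

-- Then, from the OS shell, format it with tkprof, sorting the worst
-- statements to the top by elapsed execution and fetch time:
--   tkprof <trace_file> report.txt sort=exeela,fchela
```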

Add hooks into DBMS_PROFILER in the code for the stored procedure.

I've got code like this in all my packages so that I can determine whether or not to profile them by setting a package variable:

PROCEDURE profiler_control(p_start_stop IN VARCHAR2, p_run_comm IN VARCHAR2, p_ret OUT BOOLEAN) AS
  l_ret_code INTEGER;
BEGIN
  l_ret_code:=dbms_profiler.internal_version_check;
  IF l_ret_code !=0 THEN
    p_ret:=FALSE;
  ELSIF p_start_stop NOT IN ('START','STOP') THEN
    p_ret:=FALSE;
  ELSIF p_start_stop = 'START' THEN
    l_ret_code:=DBMS_PROFILER.START_PROFILER(run_comment1 => p_run_comm);
    IF l_ret_code=0 THEN
      p_ret:=TRUE;
    ELSE
      p_ret:=FALSE;
    END IF;
  ELSIF p_start_stop = 'STOP' THEN
    l_ret_code:=DBMS_PROFILER.FLUSH_DATA;
    l_ret_code:=DBMS_PROFILER.STOP_PROFILER;
    IF l_ret_code=0 THEN
      p_ret:=TRUE;
    ELSE
      p_ret:=FALSE;
    END IF;
  END IF;
END profiler_control;

Inside the procedures, then, there's code like:

create or replace procedure poorly_performing_procedure
as
begin
  if run_profiler then   -- package-level boolean switch
    profiler_control('START', 'poorly_performing_procedure', g_retval);
  end if;
...
  if run_profiler then
    profiler_control('STOP', 'poorly_performing_procedure', g_retval);
  end if;
end poorly_performing_procedure;
/

Oracle provides scripts (one named profiler.sql) you can use to get pretty reports showing how many times each PL/SQL statement/operation was executed during a run; see the DBMS_PROFILER documentation for 10g.
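If you'd rather query the results directly than run profiler.sql, here is a sketch against the standard profiler tables created by proftab.sql (total_time is reported in nanoseconds, so the conversion below is an assumption worth checking on your release):

```sql
-- Summarize time per line for the most recent profiler run.
SELECT u.unit_name,
       d.line#,
       d.total_occur,
       ROUND(d.total_time / 1e9, 3) AS total_secs   -- total_time is in nanoseconds
  FROM plsql_profiler_units u
  JOIN plsql_profiler_data  d
    ON d.runid = u.runid
   AND d.unit_number = u.unit_number
 WHERE u.runid = (SELECT MAX(runid) FROM plsql_profiler_runs)
 ORDER BY d.total_time DESC;
```

The lines at the top of that listing are where the procedure actually spends its 2 minutes and 36 seconds.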

Adam Musch
I won't downvote your answer, but the OP clearly stated that the procedures are written by a third party and are out of his control.
Quassnoi
Voting up; these are the 2 methods I would use to see what is going on for code I could not touch. The learning curve on trace files is steep, but it's a good tool to have in your DBA toolbox. Profiling doesn't have to be invasive: you can do it on the command line with simple start_profiler()/run_SP()/stop_profiler() commands to gather info if the code isn't wrapped. Once you get a bead on slow SQL statements, you may well be able to fix the issue yourself with indexes or by gathering more accurate statistics. (No code changes!)
David Mann
@Quassnoi: Even if the poorly performing code's out of his hands, the OP could create his own procedure, wrapping it in DBMS_PROFILER calls. That would help demonstrate to the owner of the code where inefficiencies or other problems lie.
Adam Musch
Vote up. The SQL trace MAY show that the problem isn't in the PL/SQL but in a query which could be improved through database changes [eg stats, index...]
Gary
+1 even if you can't view or change the code, if the performance issue is some SQL, you can sometimes use stored outlines to change the query plan.
Jeffrey Kemp
+2  A: 

The issue with increasing timeouts globally is that you could run into several issues down the road:

  1. Denial of service attacks.
  2. Resource exhaustion on the server.
  3. Reduced throughput.

When you increase timeouts what you are telling the server is that it needs to keep the thread that it is using to service that request running. The server will have a limited number of threads, so a thread that it keeps running for extended periods of time is a thread that is not available to service other requests. If you have a lot of requests that take a long time to run, then eventually you run out of threads and the server becomes unresponsive.

Whether this matters at all for you will depend on how many requests are made to that particular stored procedure. If there is only one request made every so often, then it's not a big deal. However, the problem with setting the timeout globally is that it now applies to ALL requests, so if there are other requests that might take a long time to run, you will be extending their duration as well.

TskTsk
+1  A: 

I don't think increasing the timeout to 180 seconds is a good idea. I worked for a fast-growing company for 2 years. During that time we had dozens of stored procedures whose execution times kept growing: they started out running in under 1 second, then took 30 seconds, and eventually 2 or 3 minutes. Once they hit 2 minutes they would cause a timeout in the site; we would catch it and rewrite the proc to be more efficient. Long story short, if you up the timeout to 180 seconds, you might find yourself increasing it to 360 seconds in a month and 720 seconds in 2 months. You can see where this is going. If others disagree, you need to understand where they are coming from, because any kind of data growth is going to slow your performance down.

RandomBen
+1  A: 

You say that you can't see the procedure. Do you have access to the database, and can you make changes to it? If you can't access the database and/or can't make changes to it, I think your options are limited to:

  1. Increase the timeout, and
  2. Encourage the outside vendors to address the problem.

As others have said, increasing the timeout is not a good solution for all sorts of reasons. Encouraging the outside vendors to help (by, for example, threatening to replace their application with a competitor's, or refusing to pay license fees until your problems are addressed) may be your best bet. As a rule, the person you want to talk to about this is the SALES GUY, not the technical staff. The techies do not, in general, give a hoot about losing revenue because it doesn't affect them directly. The sales guy DOES care, because if you bail or refuse to pay it probably has a direct impact on his paycheck, so he has a vested interest in keeping you happy, and he probably has some level of influence over the techs.

Key catchphrases for dealing with sales guys are "...performance is unacceptable in our environment...", "...unable to support the business using your application...", "...will not pay one cent more for this application until it meets our needs...", and the always popular "...we will be evaluating alternative solutions...". Remember, the sales dude is there to be your interface to his company - so get inter his face - or better yet, have some manager type chew on the sales dude for a while. (That's what managers are for...) The squeaky wheel gets the grease...

If, on the other hand, you can modify the database, it might serve you well to do as @Adam Musch suggested and figure out which SQL statements are running so doggone slow. You might be able to dramatically improve the situation by adding an index or two.
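For example, if you can query the data dictionary, something like this (a sketch; 'APP_OWNER' is a hypothetical schema name you'd replace with the vendor's) will surface the most expensive statements in the shared pool, which is often enough to spot a missing index:

```sql
-- Top 10 statements by total elapsed time (elapsed_time is in microseconds).
SELECT *
  FROM (SELECT sql_id,
               executions,
               ROUND(elapsed_time / 1e6, 1) AS elapsed_secs,
               buffer_gets,
               sql_text
          FROM v$sql
         WHERE parsing_schema_name = 'APP_OWNER'   -- hypothetical schema
         ORDER BY elapsed_time DESC)
 WHERE ROWNUM <= 10;
```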

Good luck.

Bob Jarvis