views:

1959

answers:

3

I need some pointers on how to diagnose and fix this problem. I don't know if this is a simple server setup problem or an application design problem (or both).

Once or twice every few months this Oracle XE database reports ORA-4031 errors. It doesn't point to any particular part of the sga consistently. A recent example is:

ORA-04031: unable to allocate 8208 bytes of shared memory ("large pool","unknown object","sort subheap","sort key")

When this error comes up, if the user keeps refreshing, clicking on different links, they'll generally get more of these kinds of errors at different times, then soon they'll get "404 not found" page errors.

Restarting the database usually resolves the problem for a while, then a month or so later it comes up again, but rarely at the same location in the program (i.e. it doesn't seem linked to any particular portion of code) (the above example error was raised from an Apex page which was sorting 5000+ rows from a table).

I've tried increasing sga_max_size from 140M to 256M and hope this will help things. Of course, I won't know if this has helped since I had to restart the database to change the setting :)

I'm running Oracle XE 10.2.0.1.0 on a Oracle Enterprise Linux 5 box with 512MB of RAM. The server only runs the database, Oracle Apex (v3.1.2) and Apache web server. I installed it with pretty much all default parameters and it's been running quite well for a year or so. Most issues I've been able to resolve myself by tuning the application code; it's not intensively used and isn't a business critical system.

These are some current settings I think may be relevant:

pga_aggregate_target        41,943,040
sga_max_size              268,435,456
sga_target                146,800,640
shared_pool_reserved_size   5,452,595
shared_pool_size          104,857,600

If it's any help here's the current SGA sizes:

Total System Global Area  268435456 bytes
Fixed Size                  1258392 bytes
Variable Size             251661416 bytes
Database Buffers           12582912 bytes
Redo Buffers                2932736 bytes
A: 

Check out this link

Oracle Help - ORA04031

Gandalf
Thanks, I have read links like that. Since I'm using ASMM on 10g so I think most of the info in that link is n/a here? (see also http://download.oracle.com/docs/cd/B19306_01/server.102/b14237/initparams193.htm)
Jeffrey Kemp
Didnt know you were using ASMM
Gandalf
Yes - if sga_target is set that means the buffer cache, shared pool, large pool, java pool, and streams pool are automatically sized.
Jeffrey Kemp
+1  A: 

Even though you are using ASMM, you can set a minimum size for the large pool (MMAN will not shrink it below that value). You can also try pinning some objects and increasing SGA_TARGET.

Kathryn

sounds reasonable, I'll give these a go.
Jeffrey Kemp
I've set the min size for the large pool and increased sga_target to the same as sga_max_size. I'll see how it goes, thanks.
Jeffrey Kemp
I'll accept this as the answer because I think it's the best advice, even though to test this will take the next month or so to see if the errors happen again.
Jeffrey Kemp
Well, it's been a year now, and our regular spike of activity in April/May so far has yielded no memory errors or server downtime at all. Looking good :)
Jeffrey Kemp
+1  A: 

Don't forget about fragmentation. If you have a lot of traffic, your pools can be fragmented and even if you have several MB free, there could be no block larger than 4KB. Check size of largest free block with a query like:

 select
  '0 (<140)' BUCKET, KSMCHCLS, KSMCHIDX,
  10*trunc(KSMCHSIZ/10) "From",
  count(*) "Count" ,
  max(KSMCHSIZ) "Biggest",
  trunc(avg(KSMCHSIZ)) "AvgSize",
  trunc(sum(KSMCHSIZ)) "Total"
from
  x$ksmsp
where
  KSMCHSIZ<140
and
  KSMCHCLS='free'
group by
  KSMCHCLS, KSMCHIDX, 10*trunc(KSMCHSIZ/10)
UNION ALL
select
  '1 (140-267)' BUCKET,
  KSMCHCLS,
  KSMCHIDX,
  20*trunc(KSMCHSIZ/20) ,
  count(*) ,
  max(KSMCHSIZ) ,
  trunc(avg(KSMCHSIZ)) "AvgSize",
  trunc(sum(KSMCHSIZ)) "Total"
from
  x$ksmsp
where
  KSMCHSIZ between 140 and 267
and
  KSMCHCLS='free'
group by
  KSMCHCLS, KSMCHIDX, 20*trunc(KSMCHSIZ/20)
UNION ALL
select
  '2 (268-523)' BUCKET,
  KSMCHCLS,
  KSMCHIDX,
  50*trunc(KSMCHSIZ/50) ,
  count(*) ,
  max(KSMCHSIZ) ,
  trunc(avg(KSMCHSIZ)) "AvgSize",
  trunc(sum(KSMCHSIZ)) "Total"
from
  x$ksmsp
where
  KSMCHSIZ between 268 and 523
and
  KSMCHCLS='free'
group by
  KSMCHCLS, KSMCHIDX, 50*trunc(KSMCHSIZ/50)
UNION ALL
select
  '3-5 (524-4107)' BUCKET,
  KSMCHCLS,
  KSMCHIDX,
  500*trunc(KSMCHSIZ/500) ,
  count(*) ,
  max(KSMCHSIZ) ,
  trunc(avg(KSMCHSIZ)) "AvgSize",
  trunc(sum(KSMCHSIZ)) "Total"
from
  x$ksmsp
where
  KSMCHSIZ between 524 and 4107
and
  KSMCHCLS='free'
group by
  KSMCHCLS, KSMCHIDX, 500*trunc(KSMCHSIZ/500)
UNION ALL
select
  '6+ (4108+)' BUCKET,
  KSMCHCLS,
  KSMCHIDX,
  1000*trunc(KSMCHSIZ/1000) ,
  count(*) ,
  max(KSMCHSIZ) ,
  trunc(avg(KSMCHSIZ)) "AvgSize",
  trunc(sum(KSMCHSIZ)) "Total"
from
  x$ksmsp
where
  KSMCHSIZ >= 4108
and
  KSMCHCLS='free'
group by
  KSMCHCLS, KSMCHIDX, 1000*trunc(KSMCHSIZ/1000);

Code from

slovon
+1 for the query. Next time I get these errors I'll use it again to see if that's the problem. Thanks!
Jeffrey Kemp