I'm building a business app that will hold somewhere between 50,000 and 150,000 companies. Each company (db row) is represented by 4-5 properties/columns (title, location, ...). The ORM is LINQ2SQL.

I have to do some calculations, and for that I run a lot of queries for a specific company. Right now I go to the db every time I need something, which produces 50-200 queries depending on the calculation's complexity. I tried putting all companies into the cache; 10,000 rows (companies) in the db take around 5.5 MB of cache. In that scenario, I have only one query.

This application will be on a shared hosting server, so my resources are limited. I'm interested in what will happen if I try to load, let's say, 100,000 companies (rows, objects), or put that many in the cache.
Is there any RAM limit that the average hosting company gives to an ASP.NET application? Does it depend on the application pool (I can put the app in a dedicated pool)?

My options are:
- load the whole table into C# objects. I did some memory profiling; 10,000 objects need about 5 MB of RAM
- query the db to get referenced objects when needed.

The task: for a given company A, build a tree of connected companies.

Tables and columns:
Company: IdCompany, Title, Address, Contact
CompanyConnection: IdParentCompany, IdChildCompany
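
To illustrate, this is roughly what I have in mind for option 1 (the type names below are just illustrative, and it assumes there are no cycles in the connections):

    using System.Collections.Generic;
    using System.Linq;

    // Rough sketch of option 1: load the CompanyConnection table once into a
    // lookup, then build the tree for a given root company entirely in memory.
    public class CompanyNode
    {
        public int IdCompany;
        public List<CompanyNode> Children = new List<CompanyNode>();
    }

    public static class CompanyTreeBuilder
    {
        // childrenByParent: IdParentCompany -> all IdChildCompany values
        public static CompanyNode BuildTree(int rootId, ILookup<int, int> childrenByParent)
        {
            var node = new CompanyNode { IdCompany = rootId };
            foreach (int childId in childrenByParent[rootId])
                node.Children.Add(BuildTree(childId, childrenByParent));
            return node;
        }
    }

    // Usage: build the lookup once from all CompanyConnection rows, then reuse it:
    // var childrenByParent = connections.ToLookup(c => c.IdParentCompany, c => c.IdChildCompany);
    // var tree = CompanyTreeBuilder.BuildTree(idCompanyA, childrenByParent);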

+1  A: 

Your shared host will likely be IIS 7 on Windows Server running as a virtual machine. This machine will behave as any ordinary machine would - it is not 'aware' of being shared or virtualised.

You should expect Windows to begin paging to disk when it runs out of physical RAM; out-of-memory errors only get thrown once the page file has filled the disk. Of course, you don't ever want any part of the warm cache paged out to disk.

Windows itself can begin nagging you about being low on memory, but this does not carry the same urgency: applications will still be able to request RAM and it will still be granted (albeit serviced from the page file).

If your application could crash and leave corrupt state or a partial transaction, then you should code defensively and check that memory is available before embarking upon an action.
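
As a rough sketch of such a check - assuming you can estimate the size of the load up front and that System.Runtime.MemoryFailPoint is available on your framework; the names and figures are illustrative:

    using System;
    using System.Runtime;

    // Illustrative guard: ask the CLR up front whether roughly this many
    // megabytes can be satisfied before starting a big in-memory load.
    public static class MemoryGuard
    {
        public static bool TryRun(Action loadAction, int estimatedMegabytes)
        {
            try
            {
                using (new MemoryFailPoint(estimatedMegabytes))
                {
                    loadAction();   // e.g. fill the in-memory company cache
                }
                return true;
            }
            catch (InsufficientMemoryException)
            {
                // Not enough headroom: fall back to querying the db per request.
                return false;
            }
        }
    }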

Create the expected number of objects in a loop with pretend data and watch the memory consumption on the box - the Working Set of the worker process is the one to watch. You can do this in Task Manager.
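
For example, a throwaway console harness along these lines (the Company shape mirrors the columns in your question; the count is just an example):

    using System;
    using System.Collections.Generic;
    using System.Diagnostics;

    // Allocate N pretend Company objects and report the managed heap growth
    // and the process working set, to estimate the cost of caching the table.
    class Company
    {
        public int IdCompany;
        public string Title;
        public string Address;
        public string Contact;
    }

    class MemoryProbe
    {
        static void Main()
        {
            const int count = 100000;
            long before = GC.GetTotalMemory(true);

            var companies = new List<Company>(count);
            for (int i = 0; i < count; i++)
            {
                companies.Add(new Company
                {
                    IdCompany = i,
                    Title = "Company " + i,
                    Address = "Some Street " + i,
                    Contact = "contact" + i + "@example.com"
                });
            }

            long after = GC.GetTotalMemory(true);
            long workingSet = Process.GetCurrentProcess().WorkingSet64;

            Console.WriteLine("Managed bytes for {0:N0} companies: {1:N0}", companies.Count, after - before);
            Console.WriteLine("Process working set: {0:N0} bytes", workingSet);
        }
    }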

Watch for hard page faults. These are events where a memory access had to be serviced from disk.

Also, very large sets of objects can cause long garbage collection cycles (>1 second). This can be a big issue in time-sensitive applications like trading and market data.

Hope that helps.

Update: I do a similar caching thang for a mega data-mining application.

Each ORM type has a GetObject method which checks a giant cache first and only goes to the database on a miss, then updates the cache: Person.GetPerson( check people cache, go to db, add to people cache )

Now my queries return just the unique keys of the results. Then each key is fetched using the above method. This is slow initially until the cache builds up but...

The point being that each query result points to the same instance in memory! This means the RAM footprint is much smaller due to sharing.
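
Applied to your company tables, a minimal sketch of the idea could look like this - CompanyDataContext and its Companies/CompanyConnections members are assumed LINQ to SQL names, not necessarily yours:

    using System.Collections.Generic;
    using System.Linq;

    // Illustrative identity-map style cache: every lookup for the same key
    // returns the same in-memory instance, so query results share objects.
    public static class CompanyCache
    {
        private static readonly Dictionary<int, Company> cache = new Dictionary<int, Company>();
        private static readonly object sync = new object();

        public static Company GetCompany(int idCompany, CompanyDataContext db)
        {
            lock (sync)
            {
                Company company;
                if (cache.TryGetValue(idCompany, out company))
                    return company;                                             // hit: shared instance

                company = db.Companies.Single(c => c.IdCompany == idCompany);   // miss: go to db
                cache[idCompany] = company;                                     // add to cache
                return company;
            }
        }

        // Queries return only the keys; each key is then resolved through the
        // cache above, so every result set points at the same shared instances.
        public static List<Company> GetChildren(int idParentCompany, CompanyDataContext db)
        {
            List<int> childIds = db.CompanyConnections
                                   .Where(cc => cc.IdParentCompany == idParentCompany)
                                   .Select(cc => cc.IdChildCompany)
                                   .ToList();

            return childIds.Select(id => GetCompany(id, db)).ToList();
        }
    }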

The query results are then cached, too. Of course.

Where objects are not immutable, each object-write updates its own instance in the giant cache but also causes all query caches that concern that type of object to void themselves!
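
A standalone sketch of that write path, with the same assumed type names (the query-cache shape here is just for illustration):

    using System.Collections.Generic;

    // Sketch of the invalidation rule for Company: a write updates the single
    // shared instance in the object cache and throws away every cached query
    // result that concerns companies.
    public static class CompanyWriteCache
    {
        private static readonly object sync = new object();
        private static readonly Dictionary<int, Company> objectCache = new Dictionary<int, Company>();
        private static readonly Dictionary<string, List<int>> queryCache = new Dictionary<string, List<int>>();

        public static void Save(Company company, CompanyDataContext db)
        {
            lock (sync)
            {
                objectCache[company.IdCompany] = company;  // all query results see the update
                db.SubmitChanges();                        // persist via LINQ to SQL
                queryCache.Clear();                        // void all Company query caches
            }
        }
    }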

Of course, in this application, writes are rare as it's mainly reference data.

Luke Puplett