I'm trying to parse through e-mails in Outlook 2007. I need to streamline it as fast as possible and seem to be having some trouble.

Basically it's:

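// Assumes: using Microsoft.Office.Interop.Outlook;
foreach( Folder fld in outllookApp.Session.Folders )
{
    // A Folder is not enumerable itself; iterate its Items collection.
    foreach( object item in fld.Items )
    {
        // Folders can also hold non-mail items (meeting requests, reports), so check the type.
        MailItem mailItem = item as MailItem;
        if( mailItem == null ) continue;
        string body = mailItem.Body;
    }
}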

and for 5000 e-mails, this takes over 100 seconds. It doesn't seem to me like this should be taking anywhere near this long.

If I add:

string entry = mailItem.EntryID;

It ends up being an extra 30 seconds.

I'm doing all sorts of string manipulations including regular expressions with these strings and writing out to database and still, those 2 lines take 50% of my runtime.

I'm using Visual Studio 2008

+1  A: 

I do not know if this will address your specific issue, but the latest Office 2007 service pack made a significant performance improvement for Outlook with large numbers of messages.

JonnyBoats
Ah, came out a little over a week ago. Trying it out.
McAden
5-10% improvement. Had hoped for more, but it's something. Thanks for the heads-up. I was hoping for some other way to access the mail items, but it's looking like I'm bound by Outlook's I/O on this one.
McAden
A: 

Are you just reading in those strings in this loop, or are you reading in a string, processing it, then moving on to the next? You could try reading all the messages into a Hashtable inside your loop and then processing them after they've been loaded; it might buy you some gains.

Any kind of UI updates are extremely expensive; if you're writing out text or incrementing a progress bar it's best to do so sparingly.

STW
In the end, for what I'm trying to do, I'm doing parsing and processing. However, the question is merely regarding the code above. Purely assignment (my processing is commented out) is taking roughly 100-130 seconds. With processing it takes 190 seconds. All this is backend.
McAden
+1  A: 

Doing this kind of thing will take a long time because you have to pull the data from the Exchange store for each item.

I think that you have a couple of options here.

Process this information out of band, using CDO/RDO in some other process. Or use MAPI tables, as this is the fastest way to get properties. There are caveats with this, though, and you may be doing things in your processing that can be brought into a table.

Redemption wrapper - http://www.dimastr.com/redemption/mapitable.htm

MAPI Tables http://msdn.microsoft.com/en-us/library/cc842056.aspx

76mel
I was hoping to avoid having to purchase additional licenses, but this is definitely a possibility. Thanks!
McAden
CDO is a free MS library (see www.cdolive.com). You would have to use it out of process, though, as your code above looks like OOM. You connect to Exchange directly and work on the items.
76mel
Ooh, I forgot that there are Outlook tables now in 2007. This is the fastest way to get data: http://msdn.microsoft.com/en-us/library/bb147822.aspx
string filter = "";
Outlook.Table inboxTable = inboxFolder.GetTable(filter, Outlook.OlTableContents.olUserItems);
Use a filter to select what you want and don't want.
76mel
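For readers following along, the Outlook 2007 Table approach from the comment above might look roughly like the sketch below. It is not the commenter's code: the class and method names (TableSketch, DumpFolder) and the choice of columns are hypothetical, and it assumes a reference to the Outlook 2007 primary interop assembly. As far as I know, large properties such as Body cannot be returned as Table columns, so this helps most with fields like EntryID and Subject; the body would still have to be read from the item itself.

```csharp
using System;
using Outlook = Microsoft.Office.Interop.Outlook;

static class TableSketch
{
    // Hypothetical helper: prints EntryID and Subject for every row in a folder
    // without instantiating a MailItem for each message.
    public static void DumpFolder(Outlook.Folder folder)
    {
        // An empty filter returns every item; a filter string narrows the rows.
        string filter = "";
        Outlook.Table table = folder.GetTable(filter, Outlook.OlTableContents.olUserItems);

        // Drop the default columns and request only the ones that are needed.
        table.Columns.RemoveAll();
        table.Columns.Add("EntryID");
        table.Columns.Add("Subject");

        while (!table.EndOfTable)
        {
            Outlook.Row row = table.GetNextRow();
            string entryId = row["EntryID"] as string;
            string subject = row["Subject"] as string; // may be null for items without a subject
            Console.WriteLine("{0}\t{1}", entryId, subject);
        }
    }
}
```

Calling TableSketch.DumpFolder with any Outlook.Folder (for example the inboxFolder from the comment) walks the folder once. Whether it beats the item loop for a given workload is worth measuring rather than assuming.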
I think that's about the best it's going to get. The fact of the matter is that I'm stuck with Outlook's performance being what it is due to how it stores and retrieves data and that you've given me the best way to deal with the situation. Thanks much!
McAden
I'm using RDOMail objects within Redemption. I went from processing (tokenizing, parsing, writing out to database) 7000 e-mails/attachments in 2200 seconds to doing it all in 1100 seconds.
McAden
If you are using Redemption and you only need certain fields, a MAPITable will be one of the fastest ways to roll through the data. A table restrict may also help.
76mel
A: 

We had exactly the same problem even when the folders were local and there was no network delay.

We got a 10x speedup by storing a copy of every email in a local SQL Server CE table tuned for the search we needed. We also used update events to make sure the local database remained in sync with the Outlook/Exchange folders.

To totally eliminate user lag we took the search out of the Outlook thread and put it in its own thread. The perception of lagging was worse than the actual delay it seems.

rfreytag