I'm writing a solution around MOSS 2007 and storing fairly large quantities of data in a list.

My first question is: can lists handle large quantities of data, around 200,000 items? I've already read up on it, and it seems the limitation of lists is on the number of items a view can display (2000). So the question is: is this a recommendation or a real limitation? No documentation actually confirms this.

Second question: if it is a physical limitation on how many items a view can display, does this mean it's impossible to check for duplicates in a SharePoint list that contains vast quantities of data?

In the sense that to perform a wsList.getListItems you have to pass a view: if the list contains 100,000 records and the view can only contain 2000 records, how is it possible to check for duplicates?

Thanks

+1  A: 

You can have very large lists, but the performance is going to SUCK.

We had lists with 50,000+ items in a project and we found the best way we could query and process the contents was using SPSiteDataQuery and CrossListQueryCache and formatting the queries in the obscure, annoying SharePoint CAML dialect.

axel_c
Does that mean this 2000 limitation is theoretical, and not a hard coded enforced limit?
JL
You can query many more than those 2000 items. (You can specify RowLimit on SPQuery, SPSiteDataQuery or CrossListQueryCache.)
Janis Veinbergs
No, there's no limitation, or at least we never ran into it.
axel_c
+1  A: 

If possible, breaking the items up into containers such as folders would help with performance. If one of the list item fields is some type of classification lookup, it could be replaced by putting the items in folders named after that classification.

Dan Revell
+2  A: 

Huge list performance

You may want to read "Scaling to Extremely Large Lists and Performant Access Methods" and "Best Practices for LARGE SharePoint Lists and Documents Libraries".

Another thing these articles do not mention is that you should avoid adding list items with SPList.Items.Add, because on a large list it carries a huge performance penalty. What you do instead is build an efficient query that returns no items and then add the item to that collection (I read somewhere that the web services perform well when adding items, but I can't find that article any more).

You can also see some tests (or other tests) on how huge lists perform.

As for duplicates

You may want to create a scheduled job (SPJobDefinition) that runs sometime at night and checks for duplicates.

A better idea than looping over all the SPListItems and then querying the list for each item to check for duplicates would probably be to get a DataTable (SPListItemCollection.GetDataTable()) for all items and use some technique to determine the duplicates in memory.
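As a rough illustration of the "load the rows once, then find duplicates in memory" idea, here is a minimal sketch. It is written in Python purely to show the grouping logic and does not call any SharePoint API. `fetch_page` is a hypothetical stand-in for whatever paged query you use (for example, one limited by RowLimit), `fake_fetch_page` and `ITEMS` are made-up demo data, and the trim-and-lowercase normalisation of key values is an assumption about what counts as a duplicate.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterator, List, Optional, Sequence, Tuple

Row = Dict[str, str]
# Hypothetical paging contract: (position, page_size) -> (rows, next_position).
# next_position is None once the last page has been returned.
FetchPage = Callable[[Optional[int], int], Tuple[List[Row], Optional[int]]]


def iter_pages(fetch_page: FetchPage, page_size: int) -> Iterator[Row]:
    """Yield every row, requesting one page at a time."""
    position: Optional[int] = None
    while True:
        rows, position = fetch_page(position, page_size)
        yield from rows
        if position is None:
            break


def find_duplicates(rows: Iterator[Row], key_fields: Sequence[str]) -> Dict[Tuple[str, ...], List[str]]:
    """Group item IDs by normalised key; keep only keys seen more than once."""
    seen: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for row in rows:
        # Assumption: values differing only in case or surrounding
        # whitespace are treated as the same key.
        key = tuple(row.get(field, "").strip().lower() for field in key_fields)
        seen[key].append(row["ID"])
    return {key: ids for key, ids in seen.items() if len(ids) > 1}


# Made-up demo data standing in for list items.
ITEMS: List[Row] = [
    {"ID": "1", "Title": "Alpha"},
    {"ID": "2", "Title": "Beta"},
    {"ID": "3", "Title": "alpha "},
    {"ID": "4", "Title": "Gamma"},
    {"ID": "5", "Title": "Beta"},
]


def fake_fetch_page(position: Optional[int], page_size: int) -> Tuple[List[Row], Optional[int]]:
    """In-memory stand-in for a paged list query."""
    start = position or 0
    end = start + page_size
    next_position = end if end < len(ITEMS) else None
    return ITEMS[start:end], next_position


if __name__ == "__main__":
    duplicates = find_duplicates(iter_pages(fake_fetch_page, 2), ["Title"])
    for key, ids in duplicates.items():
        print(key, ids)
```

The point of the design is that the list is read once, page by page, and each row costs a single dictionary lookup, instead of issuing one query per item. The same grouping can be done over the rows of a DataTable.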

As for views

Filter the items, order them so the relevant ones come first, and define your RowLimit. That's the key for views: you just need the most relevant items, don't you?

Janis Veinbergs