Originally published 1/17/2013 and reproduced here for reference
Recently, I needed to return distinct items from a SharePoint 2010 list based on a field in the list. Since CAML doesn't support this type of query directly, people have posted various approaches to solve the problem. Unfortunately, all of the approaches I found involve loading the entire list into memory. This is fine for small lists, but it is unacceptable for larger lists because it can cause out-of-memory errors on the server.
My solution was to create an extension method for the SPList class, which you can download and add to your own projects.
The extension method is named CamlDistinct and takes a page size and a group-by field as arguments. The return value is a List<string> of the distinct values found in the group-by field. The page size controls how many items are brought back from the list per query, so you can trade batch size against the number of queries. This makes it work well even for larger lists. The following sample code shows how to use it in a console application against a contacts list to return the set of distinct company names appearing in the list:
    // Requires: using System; using System.Collections.Generic; using Microsoft.SharePoint;
    uint pageSize = 3;
    string groupField = "Company";
    string listName = "Contacts";
    string siteCollectionUrl = "http://dev.wingtip.com";

    // Both SPSite and SPWeb are disposed when the block exits.
    using (SPSite siteCollection = new SPSite(siteCollectionUrl))
    using (SPWeb site = siteCollection.OpenWeb())
    {
        SPList list = site.Lists[listName];
        List<string> values = list.CamlDistinct(pageSize, groupField);

        Console.WriteLine("Distinct Values for " + groupField);
        foreach (string value in values)
        {
            Console.WriteLine(value);
        }
    }
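
For readers curious how paging of this kind can keep memory use bounded, here is a minimal sketch. It is not the downloadable CamlDistinct code, only an illustration of the general idea under some assumptions: the class name SPListDistinctSketch and the method name DistinctByPaging are hypothetical, the field is assumed to be addressed by its internal name, and empty values are skipped by choice. It relies on the server object model's SPQuery.RowLimit and ListItemCollectionPosition to fetch one page at a time, so only the current page and the set of distinct values are held in memory.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.SharePoint;

// Hypothetical illustration; not the author's CamlDistinct implementation.
public static class SPListDistinctSketch
{
    public static List<string> DistinctByPaging(
        this SPList list, uint pageSize, string fieldInternalName)
    {
        var seen = new HashSet<string>();    // values already encountered
        var results = new List<string>();    // distinct values, in first-seen order

        var query = new SPQuery
        {
            RowLimit = pageSize,              // items fetched per round trip
            ViewFields = "<FieldRef Name='" + fieldInternalName + "' />",
            ViewFieldsOnly = true             // return only the field we need
        };

        do
        {
            // Fetch the next page of items.
            SPListItemCollection page = list.GetItems(query);

            foreach (SPListItem item in page)
            {
                object raw = item[fieldInternalName];
                if (raw == null)
                {
                    continue;                 // assumption: skip empty values
                }

                string value = raw.ToString();
                if (seen.Add(value))          // Add returns true only for new values
                {
                    results.Add(value);
                }
            }

            // A null position means there are no more pages.
            query.ListItemCollectionPosition = page.ListItemCollectionPosition;
        }
        while (query.ListItemCollectionPosition != null);

        return results;
    }
}
```

Under these assumptions it would be called the same way as the sample above, for example list.DistinctByPaging(3, "Company"). A larger page size means fewer queries but more items held in memory per query.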