I am trying to use Linq2Sql to return all rows that contain values from a list of strings. The linq2sql class object has a string property that contains words separated by spaces.

public class MyObject
{
    public string MyProperty { get; set; }
}

Example MyProperty values are:

MyObject1.MyProperty = "text1 text2 text3 text4"
MyObject2.MyProperty = "text2"

For example, using a string collection, I pass the below list

var list = new List<string>() { "text2", "text4" };

This would return both items in my example above, as they both contain the value "text2".

I attempted this with the code below. However, because of my extension method, the query cannot be translated by Linq2Sql.

public static IQueryable<MyObject> WithProperty(this IQueryable<MyProperty> qry,
    IList<string> p)
{
    return from t in qry
        where t.MyProperty.Contains(p, ' ')
        select t;
}

I also wrote an extension method

public static bool Contains(this string str, IList<string> list, char seperator)
{
    if (str == null) return false;
    if (list == null) return true;

    var splitStr = str.Split(new char[] { seperator },
        StringSplitOptions.RemoveEmptyEntries);

    bool retval = false;
    int matches = 0;

    foreach (string s in splitStr)
    {
        foreach (string l in list)
        {
            if (String.Compare(s, l, true) == 0)
            {
                retval = true;
                matches++;
            }
        }
    }

    return retval && (splitStr.Length > 0) && (list.Count == matches);
}

Any help or ideas on how I could achieve this?

A: 

I haven't tried, but if I remember correctly, this should work:

from t in ctx.Table
where list.Any(x => t.MyProperty.Contains(x))
select t

You can replace Any() with All() if you want all strings in the list to match.

EDIT:

To clarify what I was trying to do with this, here is a similar query written without LINQ operators, to explain the use of Any and All:

where list.Any(x => t.MyProperty.Contains(x))

Translates to:

where t.MyProperty.Contains(list[0]) || t.MyProperty.Contains(list[1]) ||
      t.MyProperty.Contains(list[n])

And

where list.All(x => t.MyProperty.Contains(x))

Translates to:

where t.MyProperty.Contains(list[0]) && t.MyProperty.Contains(list[1]) &&
      t.MyProperty.Contains(list[n])
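
The difference between the two operators can be checked in memory with LINQ to Objects. This is only a minimal sketch: the class name AnyAllDemo and its two methods are made up for illustration, it does not touch the database, and it does not avoid the LINQ to SQL error mentioned in the comments below.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, for illustration only (LINQ to Objects, no database).
public static class AnyAllDemo
{
    // Keeps rows that contain at least one of the search strings (OR semantics).
    public static List<string> MatchAny(IEnumerable<string> rows, IList<string> list)
    {
        return rows.Where(r => list.Any(x => r.Contains(x))).ToList();
    }

    // Keeps rows that contain every search string (AND semantics).
    public static List<string> MatchAll(IEnumerable<string> rows, IList<string> list)
    {
        return rows.Where(r => list.All(x => r.Contains(x))).ToList();
    }
}
```

With the rows "text1 text2 text3 text4" and "text2" and the list { "text2", "text4" }, MatchAny keeps both rows, while MatchAll keeps only the first one.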
Sander Rijken
I'm not sure this makes sense, as MyProperty contains a full string such as "text1 text2 text3", which needs to be broken up into individual values and then compared to the list. I tried your suggestion and got the following error: Local sequence cannot be used in LINQ to SQL implementation of query operators except the Contains() operator
David Liddle
Your answer has nothing to do with the question!? It doesn't work, and the Any and All operators do something completely different from what the OP wants.
Philip Daubmeier
@phild: The OP already said that it doesn't work, but the Any and All operators are correct when used this way. See the explanation in my answer. If it weren't for the error he mentions, this would just work.
Sander Rijken
@Sander: Oh, I see. I'm sorry, of course this works, except that it does not compare the words individually but rather searches for the string anywhere in the field.
Philip Daubmeier
+1  A: 

You're on the right track. The first parameter of your extension method WithProperty has to be of type IQueryable<MyObject>, not IQueryable<MyProperty>.

Anyway, you don't need an extension method on the IQueryable. Just use your Contains method in a lambda for filtering. This should work:

List<string> searchStrs = new List<string>() { "text2", "text4" };

IEnumerable<MyObject> myFilteredObjects = dataContext.MyObjects
                   .Where(myObj => myObj.MyProperty.Contains(searchStrs, ' '));

Update:

The above code snippet does not work, because the Contains method cannot be converted into a SQL statement. I thought about the problem for a while and arrived at a solution by asking 'how would I do that in SQL?': you could query for each single keyword and union all the results together. Sadly, the deferred execution of Linq-to-SQL prevents doing that all in one query, so I came up with this compromise of a compromise. It runs one query per keyword. For a row to match, the keyword must be one of the following:

  • equal to the whole string
  • in between two separators
  • at the start of the string and followed by a separator
  • or at the end of the string and preceded by a separator

This yields a valid expression tree that Linq-to-SQL can translate into SQL. After each query I do not defer execution; instead I fetch the data immediately and store it in a list. All lists are unioned afterwards.

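// Requires: using System.Collections.Generic; using System.Linq;
// Extension method, so it must be declared inside a static class.
public static IEnumerable<MyObject> ContainsOneOfTheseKeywords(
        this IQueryable<MyObject> qry, List<string> keywords, char sep)
{
    string s = sep.ToString();

    // Start from an empty sequence so an empty keyword list returns no rows.
    IEnumerable<MyObject> union = Enumerable.Empty<MyObject>();

    foreach (string keyw in keywords)
    {
        // Build the patterns up front so the query only sees plain strings.
        string inMiddle = s + keyw + s;   // keyword between two separators
        string atStart = keyw + s;        // keyword is the first word
        string atEnd = s + keyw;          // keyword is the last word

        // One database round trip per keyword: ToList() runs the query now.
        List<MyObject> part = (
            from obj in qry
            where obj.MyProperty == keyw ||
                  obj.MyProperty.Contains(inMiddle) ||
                  obj.MyProperty.StartsWith(atStart) ||
                  obj.MyProperty.EndsWith(atEnd)
            select obj).ToList();

        // Union drops objects that already matched an earlier keyword.
        union = union.Union(part);
    }

    return union.ToList();
}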

And use it:

List<string> searchStrs = new List<string>() { "text2", "text4" };

IEnumerable<MyObject> myFilteredObjects = dataContext.MyObjects
                    .ContainsOneOfTheseKeywords(searchStrs, ' ');

That solution is really anything but elegant. For 10 keywords, I have to query the database 10 times, and each time fetch the data and store it in memory. This wastes memory and performs badly. I just wanted to demonstrate that it is possible in Linq (maybe it can be optimized here or there, but I don't think it will ever be perfect).
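
One possible way to avoid the repeated round trips is to combine the per-keyword conditions into a single OR-ed predicate, built by hand with System.Linq.Expressions, and pass it to one Where call. The following is a sketch under assumptions, not a tested LINQ to SQL solution: the names KeywordQueryExtensions, WithAnyKeyword and Call are hypothetical, the MyObject class is repeated only to keep the example self-contained, and I have only reasoned about it against an in-memory IQueryable (where a null MyProperty would throw).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class MyObject
{
    public string MyProperty { get; set; }
}

// Hypothetical extension class, for illustration only.
public static class KeywordQueryExtensions
{
    // Builds one predicate of the form (cond(k1) || cond(k2) || ...) and applies it once.
    public static IQueryable<MyObject> WithAnyKeyword(
        this IQueryable<MyObject> qry, IEnumerable<string> keywords, char sep)
    {
        string s = sep.ToString();
        ParameterExpression obj = Expression.Parameter(typeof(MyObject), "obj");
        MemberExpression prop = Expression.Property(obj, "MyProperty");
        Expression body = null;

        foreach (string keyw in keywords)
        {
            // Same four cases as above: whole string, middle, start, end.
            Expression match = Expression.OrElse(
                Expression.OrElse(
                    Expression.Equal(prop, Expression.Constant(keyw)),
                    Call(prop, "Contains", s + keyw + s)),
                Expression.OrElse(
                    Call(prop, "StartsWith", keyw + s),
                    Call(prop, "EndsWith", s + keyw)));

            body = body == null ? match : Expression.OrElse(body, match);
        }

        // No keywords: nothing can match.
        if (body == null) return qry.Where(o => false);

        return qry.Where(Expression.Lambda<Func<MyObject, bool>>(body, obj));
    }

    // Builds target.method(arg) for a string instance method taking one string.
    private static Expression Call(Expression target, string method, string arg)
    {
        return Expression.Call(
            target,
            typeof(string).GetMethod(method, new[] { typeof(string) }),
            Expression.Constant(arg));
    }
}
```

Usage would look like dataContext.MyObjects.WithAnyKeyword(searchStrs, ' ').ToList(). The result stays an IQueryable until it is enumerated, so the filtering would run as a single query, provided the provider can translate the combined predicate.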

I would strongly recommend moving the logic of that function into a stored procedure on your database server: one single query, optimized by the database server, and no wasted memory.

Another alternative would be to rethink your database design. If you want to query the contents of one field (you are treating this field like an array of keywords, separated by spaces), you may simply have chosen an inappropriate database design. You would rather create a new table with a foreign key to your table, where each row holds exactly one keyword. The queries would be much simpler, faster and easier to understand.
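
To give an idea of how simple the query becomes with such a keyword table, here is a small sketch. The class MyObjectKeyword, its properties and the method IdsWithAnyKeyword are assumptions made up for this example, not part of the question. It relies only on Contains over a local list, which is the one operator the error message quoted in the comments on the other answer says is allowed.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical row of the separate keyword table:
// exactly one keyword per row, plus a foreign key to the owning MyObject.
public class MyObjectKeyword
{
    public int MyObjectId { get; set; }
    public string Keyword { get; set; }
}

// Hypothetical query helper, for illustration only.
public static class KeywordTableQueries
{
    // Returns the ids of all objects that have at least one of the keywords.
    public static List<int> IdsWithAnyKeyword(
        IQueryable<MyObjectKeyword> keywordRows, List<string> keywords)
    {
        return keywordRows
            .Where(k => keywords.Contains(k.Keyword)) // local list: Contains only
            .Select(k => k.MyObjectId)
            .Distinct()                               // one id per object
            .ToList();
    }
}
```

The matching MyObject rows could then be loaded by id, or the keyword table could be joined to the object table in the same query.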

Philip Daubmeier
No, sorry, this doesn't work. My custom Contains method is not valid here (it was just an example to explain what I was trying to do), as it cannot be evaluated because it cannot be translated to SQL.
David Liddle
You're right, I updated my answer.
Philip Daubmeier
The second code snippet does work now, I just tested it with a sample database.
Philip Daubmeier
Yes, I was contemplating just doing this with a stored procedure, as it's very inefficient with linq2sql. I chose to store the values in one column instead of keeping the information linked in another table for ease of use when creating/editing MyObjects. It seems I have two options: either do it in a custom sproc or split the MyProperty values into another table.
David Liddle
I'd go for the second alternative. Deleting is just as simple, as triggers can delete all dependent rows that have foreign keys to the deleted row. Adding is not really more complex either.
Philip Daubmeier
In most cases it is better to optimize it for querying, and take a little loss in insert/delete/update complexity.
Philip Daubmeier