views:

404

answers:

4

Hi,

I have the following LINQ query:

var aKeyword = "ACT";
var results = from a in db.Activities
              where a.Keywords.Split(',').Contains(aKeyword) == true
              select a;

Keywords is a comma delimited field.

Everytime I run this query I get the following error:

"LINQ to Entities does not recognize the method 'Boolean ContainsString' method, and this method cannot be translated into a store expression."

What is the alternative for what I am trying to do?

+7  A: 

Your problem is that LINQ-to-Entites has to translate everything you give it into SQL to send to the database.

If that is really what you need to do, you'll have to force LINQ-to-Entities to pull back all the data and LINQ-to-Objects to evaluate the condition.

Ex:

var aKeyword = "ACT";
var results = from a in db.Activities.ToList()
              where a.Keywords.Split(',').Contains(aKeyword) == true
              select a;

Be aware though, that this will pull back all the objects from the Activities table. An alternative may be to let the DB do a bit of an initial filter, and filter down the rest of the way afterwards:

var aKeyword = "ACT";
var results = (from a in db.Activities
              where a.Keywords.Contains(aKeyword)
              select a).ToList().Where(a => a.KeyWords.Split(',').Contains(aKeyword));

That will let LINQ-to-Entities do the filter it understands (string.Contains becomes a like query) that will filter down some of the data, then apply the real filter you want via LINQ-to-Objects once you have the objects back. The ToList() call forces LINQ-to-Entities to run the query and build the objects, allowing LINQ-to-Objects to be the engine that does the second part of the query.

Jonathan
Thanks worked a treat! Is this going to cost me a lot of performance as my table grows?
James
@James - yes, the former will cost you quite a bit of performance. The latter less so (it'll require a table/index scan, but will only return candidate rows). If this is something you'll be running a lot of, I'd recommend storing the keywords in the DB pre-split (ie. a table with one row per keyword per activity -- Yannick's suggestion). Then, you can write it as a query the DB can handle well.
Jonathan
Yeah I agree, I have decided to use Yannick's suggestion!
James
When using this approach I would suggest to just do two matches, so all the processing still happens server-side:where a.Keywords.Contains("," + aKeyword) || a.Keywords.Contains(aKeyword + ",")
Yannick M.
@Yannick - that won't take care of everything -- it'll still match keywords that begin with/end with the keyword in question, and you'll want to also add || a.Keywords == aKeyword. You'll still need the post-filter. Still, a separate table is the real way to go.
Jonathan
A: 

My guess is the way you are calling Split. It should take an array. Maybe there is another Split in Linq it is finding and giving you an unusual error:

This works for Linq to Objects:

 var dataStore = new List<string>
                    {
                        "foo,bar,zoo",
                        "yelp,foo",
                        "fred",
                        ""
                    };
 var results = from a in dataStore
               where a.Split(new[] {','}).Contains("foo")
               select a;

 foreach (var result in results)
 {
     Console.WriteLine("Match: {0}", result);
 }

Outputs the following:

Match: foo,bar,zoo
Match: yelp,foo

Actually, thinking about it, do you need the split at all? a.Contains("foo") may be enough for you (unless you don't want to hit foobar).

Ray Hayes
Yeah defo need the split (as you mentioned) keywords can be part of other keywords. Need an exact match.
James
A: 

You may want to look at this question about L2E and .Contains for a solution that should be more efficient than guessing at a superset before filtering client side.

Marc
+1  A: 

In response to your performance considerations on a big dataset:

You are going to be doing non indexed wildcard string matching on the client, so yes, there will be performance loss.

Is there a reason why you have multiple keywords in one table field? You could normalize that out, to have a ActivityKeywords table where for each Activity you have a number of Keyword records.

Activities(activity_id, ... /* remove keywords field */) ---> ActivityKeywords(activity_id, keyword_id) ---> Keywords(keyword_id, value)

Check out Non-first normal form: http://en.wikipedia.org/wiki/Database%5Fnormalization

EDIT: Also even if you were to stick with the one column, there is a way to do everything serverside (if you have a strict syntax: 'keyword1, keyword2, ..., keywordN'):

var aKeyword = "ACT";
var results = (from a in db.Activities
              where a.Keywords.Contains("," + aKeyword) || a.Keywords.Contains(aKeyword + ",")
              select a;
Yannick M.
It was really just for ease of use. Most activities will only have a few keywords hence I didn't see the need for another table for it. however, I never actually thought of doing what you have suggested...it would probably be a change for the better! Thanks.
James
I do have a strict syntax for the keywords and it is as you suggested i.e. Keyword, Keyword, Keyword. However, I decided to go with the other table. Thanks.
James