tags:

views:

178

answers:

6

What is the difference between

var q_nojoin = from o in one
               from t in two
               where o.SomeProperty == t.SomeProperty
               select new { o, t };

and

var q_join = from o in one
             join t in two on o.SomeProperty equals t.SomeProperty
             select new { o, t };

They seem to give me the same results.

+1  A: 

A JOIN is a means for combining fields from two (or more) tables by using values common to each.

A WHERE clause specifies that a SQL (data manipulation language) statement should only affect rows that meet specified criteria (think of a WHERE clause as a FILTER).

Joubert Nel
Doesn't answer my question at all :(
Difference Engine
ah, are the semantics of a LINQ query different (w.r.t. WHERE and JOIN) from that of an ANSI SQL query?
Joubert Nel
@rooibos: Surely even in an SQL query this doesn’t answer the question. You say “A JOIN is a means for combining fields from two tables” but surely so is `from a, b where a.Field=b.Field`, so it doesn’t answer the question what the difference is.
Timwi
@timwi Of course a join (explicit or implicit) can contain a WHERE clause (a WHERE clause acting as a filter). Perhaps I should have addressed the specific *example* instead of explaining the conceptual difference between a join and a WHERE clause.
Joubert Nel
The big difference in SQL is clarity - when using JOIN syntax, it's much easier, for example, to detect an incomplete join condition that can lead to a Cartesian product. It helps differentiate join conditions from filtering / restricting conditions.
Adam Musch
A: 

in practice, depending on lots of other factors, you can get performance gains by using one over another. I would imagine (though I have no basis in fact for this) that joins are more sargable than WHERE clauses.

edit: turns out I'm totally wrong. There (should be) no difference to the performance between the two types. However, the newer style (using JOIN) is a lot clearer to read (imo) and also, Microsoft have said that they won't be support the older style (outer-join using WHERE) indefinately.

Tom Morgan
+11  A: 

They give the same result, but the join is very much faster, unless you use LINQ to SQL so that the database can optimise the queries.

I made a test with two arrays containing 5000 items each, and the query with a join was about 450 times faster (!) than the query without a join.

If you use LINQ to SQL, the database will optimise both queries to do the same job, so there is no performance difference in that case. However, an explicit join is considered more readable.

If you are using LINQ against a different data source, there is no optimising layer, so there is a significant difference in how the queries work. The join uses a hash table or similar to quickly look up matching values, while the query without a join will compare all items in one table with each item in the other table. The complexity of the join is roughly O(n+m), while the complexity of the query without the join is O(n*m). This means not only that the query without the join is slower, but also that it scales badly, so as the data grows it will get exponentially slower.

Guffa
A: 

Actually in SQL, join-on statements can be written in from-where statements(if you really want). But you know we have left join left outer join and etc in SQL statements, which make us easier to express what we want (of course you can also use from-where but it will make your code look crazy). So we always use where if we want to filter our result, while use join if there are relationship between tables.

Danny Chen
A: 

Very interesting issue. Here is a blog post about it:

http://blogs.msdn.com/b/mattwar/archive/2007/09/04/linq-building-an-iqueryable-provider-part-vii.aspx

A citation from this blog:

"The answer to that is a bit involved, but I'll try to make a stab at it here. The explicit join requires me to specify two key expressions to match; in database parlance this is known as an equi-join. The nested from clauses allow more flexibility. The reason the explicit join is so restrictive is that by being restrictive the LINQ to Objects implementation can be efficient in execution without needing to analyze and rewrite the query. The good news here is that almost all joins used in database queries are equi-joins. "

Andrzej Nosal
A: 

The first query is saying, in effect, "Do a cross join on these collections (creating essensially a NxM matrix), then take only those that are along the diagonal, and give them to me"

The second query is, in effect, "Create a list of just the combined items where the proeprties match".

The results are the same, but the process of getting there is a bit different.

Since SQL databases are generally highly optimized, so when you ask for the first, the server just says "Idiot user....", and substitutes the second.

In non-SQL environments (like Linq-to-Objects), if you ask for the first, that's what it will do, and you'll see a significant performance hit.

James Curran