I'm entirely new at SQL, but let's say that on the StackExchange Data Explorer, I just want to list the top 15 users by reputation, and I wrote something like this:

SELECT TOP 15
  DisplayName, Id, Reputation, Reputation/1000 As RepInK
FROM
  Users
WHERE
  RepInK > 10
ORDER BY Reputation DESC

Currently this gives an Error: Invalid column name 'RepInK', which makes sense, I think, because RepInK is not a column in Users. I can easily fix this by saying WHERE Reputation/1000 > 10, essentially repeating the formula.

So the questions are:

  • Can I actually use the RepInK "column" in the WHERE clause?
    • Do I perhaps need to create a virtual table/view with this column, and then do a SELECT/WHERE query on it?
  • Can I name an expression, e.g. Reputation/1000, so I only have to repeat the names in a few places instead of the formula?
    • What do you call this? A substitution macro? A function? A stored procedure?
  • Is there an SQL quicksheet, glossary of terms, language specification, anything I can use to quickly pick up the syntax and semantics of the language?
    • I understand that there are different "flavors"?
+4  A: 

Can I actually use the RepInK "column" in the WHERE clause?

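No. The WHERE clause is logically evaluated before the SELECT list, so the alias does not exist yet at that point. You can generally expect, though, that your database will evaluate (Reputation / 1000) only once per row, even if you write it both in the SELECT fields and in the WHERE clause.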

Do I perhaps need to create a virtual table/view with this column, and then do a SELECT/WHERE query on it?

Yes, a view is one option to simplify complex queries.
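
A minimal sketch of the view approach, assuming a SQL Server database where you are allowed to create objects (the public Data Explorer may not let you do this). The view name UsersWithRepInK is made up for illustration.

```sql
-- Hypothetical view that exposes the computed value as a real column name.
CREATE VIEW UsersWithRepInK AS
SELECT DisplayName, Id, Reputation, Reputation / 1000 AS RepInK
FROM Users;
GO

-- RepInK is now an ordinary column of the view, so WHERE can see it.
SELECT TOP 15 DisplayName, Id, Reputation, RepInK
FROM UsersWithRepInK
WHERE RepInK > 10
ORDER BY Reputation DESC;
```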

Can I name an expression, e.g. Reputation/1000, so I only have to repeat the names in a few places instead of the formula?

You could create a user-defined function, called something like convertToK, which would receive the rep value as an argument and return that argument divided by 1000. However, it is often not practical for a trivial case like the one in your example.
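
As a rough sketch, such a function could look like this in SQL Server syntax. The name dbo.ConvertToK and the INT parameter type are assumptions for illustration, and you need permission to create functions on the database.

```sql
-- Hypothetical scalar function: reputation in, reputation in thousands out.
CREATE FUNCTION dbo.ConvertToK (@Rep INT)
RETURNS INT
AS
BEGIN
    RETURN @Rep / 1000;  -- integer division, same as in the original query
END;
GO

-- The formula now lives in one place; only the function name is repeated.
SELECT TOP 15
    DisplayName, Id, Reputation, dbo.ConvertToK(Reputation) AS RepInK
FROM Users
WHERE dbo.ConvertToK(Reputation) > 10
ORDER BY Reputation DESC;
```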

Is there an SQL quicksheet, glossary of terms, language specification, anything I can use to quickly pick up the syntax and semantics of the language?

I suggest practice. You may want to start following the mysql tag on Stack Overflow, where many beginner questions are asked every day. Download MySQL, and when you think there's a question within your reach, try to go for the solution. I think this will help you pick up speed, as well as awareness of the language's features. There's no need to post the answer at first, because there are some pretty fast guns on the topic over here, but with some practice I'm sure you'll be able to bring home some points :)

I understand that there are different "flavors"?

The flavors are actually extensions to ANSI SQL. Database vendors usually augment the SQL language with extensions such as Transact-SQL and PL/SQL.

Daniel Vassallo
@Daniel: would you advise against learning SQL by querying Stack Exchange Data Explorer? Because that was my plan. No need to install anything, lots of people querying it already so I can look up other people's queries, real data that I'm familiar with and care about, etc.
polygenelubricants
@polygenelubricants: No, not at all. That's certainly good practice, especially if you have some specific curiosities that you want to satisfy from the SE data dumps :) ... I suggested following SO questions simply because real problems often expose many more practical SQL situations. In addition, you'll also see how the experts respond, from whom there's plenty to learn. The mysql tag typically gathers simpler problems than the sql-server tags, or others. The syntax is not all that different: most simple queries will run on most databases.
Daniel Vassallo
@polygenelubricants: It looks like you're not the only one [interested in learning SQL from the Stack Exchange Data Explorer](http://meta.stackoverflow.com/questions/51395/can-we-become-our-own-northwind-for-teaching-sql-databases) :)
Daniel Vassallo
+5  A: 

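You could simply rewrite the WHERE clause to compare the raw value (note that if Reputation is an integer column, Reputation/1000 > 10 strictly means Reputation >= 11000, so pick the cutoff you actually want):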

where reputation > 10000

This won't always be convenient. As an alternative, you can use an inline view:

SELECT
  a.DisplayName, a.Id, a.Reputation, a.RepInK
FROM
   (
        SELECT TOP 15
          DisplayName, Id, Reputation, Reputation/1000 As RepInK
        FROM
          Users
        ORDER BY Reputation DESC
    ) a
WHERE
  a.RepInK > 10
ORDER BY a.Reputation DESC
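
The same idea can also be sketched as a common table expression, which SQL Server supports and which some people find easier to read than a nested subquery. The name TopUsers is made up for illustration.

```sql
-- Name the inner query first, then filter on its alias in the outer query.
WITH TopUsers AS (
    SELECT TOP 15
        DisplayName, Id, Reputation, Reputation / 1000 AS RepInK
    FROM Users
    ORDER BY Reputation DESC
)
SELECT DisplayName, Id, Reputation, RepInK
FROM TopUsers
WHERE RepInK > 10
ORDER BY Reputation DESC;
```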
APC
+1  A: 

You CAN refer to RepInK in the Order By clause, but in the Where clause you must repeat the expression. But, as others have said, it will only be executed once.
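
A short sketch of that difference, using the query from the question:

```sql
SELECT TOP 15
    DisplayName, Id, Reputation, Reputation / 1000 AS RepInK
FROM Users
WHERE Reputation / 1000 > 10   -- the alias is not visible here, so repeat the expression
ORDER BY RepInK DESC;          -- the alias is visible here
```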

Chris
+1  A: 

There are good answers for the technical problem already, so I'll only address some of the rest of your questions.

If you're just working with the Data Explorer, you'll want to familiarize yourself with SQL Server syntax, since that's what it's running. The best place to find that, of course, is MSDN's reference.

Yes, there are different variations in SQL syntax. For example, the TOP clause in the query you gave is SQL Server specific; in MySQL you'd use the LIMIT clause instead (and these keywords don't necessarily appear in the same spot in the query!).
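
To illustrate the difference in placement, here is the same "first 15 rows" request sketched in both dialects (the column list is shortened for brevity):

```sql
-- SQL Server (what the Data Explorer runs): TOP goes right after SELECT.
SELECT TOP 15 DisplayName, Reputation
FROM Users
ORDER BY Reputation DESC;

-- MySQL: LIMIT goes at the very end of the query.
SELECT DisplayName, Reputation
FROM Users
ORDER BY Reputation DESC
LIMIT 15;
```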

Jon Seigel
+2  A: 

Regarding something like named expressions: while there are several possible alternatives, the query optimizer is going to do best if you just write out the formula Reputation / 1000 longhand. If you really need to run a whole group of queries using the same evaluated value, your best bet is to create a view with the field defined, but you wouldn't want to do that for a one-off query.

As an alternative, (and in cases where performance is not much of an issue), you could try something like:

SELECT TOP 15
    DisplayName, Id, Reputation, RepInk
FROM (
     SELECT DisplayName, Id, Reputation, Reputation / 1000 AS RepInk
     FROM Users
) AS u
WHERE u.RepInk > 10
ORDER BY Reputation DESC

though I don't believe that's supported by all SQL dialects and, again, the optimizer is likely to do a much worse job with this kind of thing (since it will run the SELECT against the full Users table and then filter that result). Still, for some situations this sort of query is appropriate (there's a name for this... I'm drawing a blank at the moment).
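
Another way to name an expression once without wrapping the whole query, sketched here on the assumption that you are on SQL Server 2008 or later, is CROSS APPLY with a VALUES row. The aliases u and v are arbitrary.

```sql
-- CROSS APPLY attaches a one-column, one-row table to each Users row,
-- so v.RepInK can be used in SELECT, WHERE, and ORDER BY alike.
SELECT TOP 15
    u.DisplayName, u.Id, u.Reputation, v.RepInK
FROM Users AS u
CROSS APPLY (VALUES (u.Reputation / 1000)) AS v (RepInK)
WHERE v.RepInK > 10
ORDER BY u.Reputation DESC;
```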

Personally, when I started out with SQL, I found the W3Schools reference to be my constant stopping-off point. It fits my style for being something I can glance at to find a quick answer and move on. Eventually, however, to really take advantage of the database it is necessary to delve into the vendor's documentation.

Although SQL is "standardized", unfortunately (though, to some extent, fortunately), each database vendor implements their own version with their own extensions, which can lead to quite different syntax being the most appropriate (for a discussion of the incompatibilities of various databases on one issue, see the SQLite documentation on NULL handling). In particular, standard functions, e.g., for handling DATEs and TIMEs, tend to differ per vendor, and there are other, more drastic differences (particularly in not supporting subselects or not properly handling JOINs). If you care for some of the details, this document provides both the standard forms and deviations for several major databases.

ig0774
(Re: named queries and optimizer) At this point I care about writing good readable queries; I'll worry about optimization/performance later. I think that's the better way to learn most languages; I hope it's also true for SQL.
polygenelubricants