+2  Q: 

Some SQL Questions

Hello there,

I have been using SQL for years, but have mostly been using the query designer within SQL Studio (etc.) to put together my queries. I've recently found some time to actually "learn" what everything is doing and have set myself the following fairly simple tasks. Before I begin, I'd like to ask the SOF community their thoughts on the questions, possible answers and any tips they may have.

The questions are:

  1. Find all records w/ a duplicate in a particular column (e.g. a linking id is in more than 1 record throughout table)
  2. SUM price from a linked table within the same query (select within a select?)
  3. Explain the difference between the 4 joins; LEFT, RIGHT, OUTER, INNER
  4. Copy data from one table to another based on SELECT and WHERE criteria

Input welcomed & appreciated.

Chris

+1  A: 

Hello,

Let's say you have 2 tables named :

  • [OrderLine] with the columns [Id, OrderId, ProductId, Qty, Status]
  • [Product] with [Id, Name, Price]

1) All orders having more than 1 order line (technically the same as looking for duplicates on OrderId :) :

select OrderId, count(*) 
from OrderLine
group by OrderId
having count(*) > 1

2) Total price of all order lines of order 1000:

select sum(p.Price * ol.Qty) as Price
from OrderLine ol
inner join Product p on ol.ProductId = p.Id
where ol.OrderId = 1000
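
To try the query above without a database server, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for the OP's database, and the sample rows are made up for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database,
# and the rows below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (Id INTEGER PRIMARY KEY, Name TEXT, Price REAL);
    CREATE TABLE OrderLine (Id INTEGER PRIMARY KEY, OrderId INTEGER,
                            ProductId INTEGER, Qty INTEGER, Status TEXT);
    INSERT INTO Product VALUES (1, 'Pen', 2.0), (2, 'Book', 10.0);
    INSERT INTO OrderLine VALUES
        (1, 1000, 1, 3, 'Open'),    -- 3 pens  =  6.0
        (2, 1000, 2, 2, 'Open'),    -- 2 books = 20.0
        (3, 1001, 2, 1, 'Closed');  -- belongs to another order
""")

# Same query as in the answer, with the order id passed as a parameter.
order_total = conn.execute("""
    SELECT SUM(p.Price * ol.Qty) AS Price
    FROM OrderLine ol
    INNER JOIN Product p ON ol.ProductId = p.Id
    WHERE ol.OrderId = ?
""", (1000,)).fetchone()[0]

print(order_total)  # 26.0
```

The line for order 1001 is filtered out by the WHERE clause, so only the two lines of order 1000 contribute to the sum.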

3) difference between joins:

  • a inner join b => take all a that have a match in b. If no matching b is found, a will not be returned
  • a left join b => take all a, match them with b, include a even if b is not found
  • a right join b => b left join a
  • a full outer join b => (a left join b) union (a right join b)
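
The four cases above can be checked with a small sketch. It assumes Python's built-in sqlite3 module and two made-up single-column tables a and b. To stay portable across SQLite versions, the right and full outer joins are written with the equivalences from the list rather than with RIGHT JOIN or FULL OUTER JOIN.

```python
import sqlite3

# Assumption: hypothetical tables a and b sharing only the key 'y'.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (k TEXT);
    CREATE TABLE b (k TEXT);
    INSERT INTO a VALUES ('x'), ('y');   -- 'x' exists only in a
    INSERT INTO b VALUES ('y'), ('z');   -- 'z' exists only in b
""")

def rows(sql):
    """Run a query and return its rows as a set (order does not matter)."""
    return set(conn.execute(sql).fetchall())

inner = rows("SELECT a.k, b.k FROM a INNER JOIN b ON a.k = b.k")
left = rows("SELECT a.k, b.k FROM a LEFT JOIN b ON a.k = b.k")
# a right join b, written as b left join a
right = rows("SELECT a.k, b.k FROM b LEFT JOIN a ON a.k = b.k")
# a full outer join b, written as (a left join b) union (a right join b)
full = rows("""
    SELECT a.k, b.k FROM a LEFT JOIN b ON a.k = b.k
    UNION
    SELECT a.k, b.k FROM b LEFT JOIN a ON a.k = b.k
""")

print(inner)  # only the matching pair ('y', 'y')
print(left)   # adds ('x', None): a row without a match in b
print(right)  # adds (None, 'z'): b row without a match in a
print(full)   # all three rows
```

Note that UNION removes duplicate rows, so the emulation only equals a true full outer join when the joined rows are distinct, as they are here.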

4) copy order lines to a history table :

insert into OrderLinesHistory
(CopiedOn, OrderLineId, OrderId, ProductId, Qty)
select 
  getDate(), Id, OrderId, ProductId, Qty
from 
  OrderLine
where 
  status = 'Closed'
Manitra Andriamitondra
Your answer to 1 is incorrect: the OP wants a count of duplicates
RichardOD
Thank you manitra! This question seems to have caused all sorts of controversy, so I will accept yours as the answer. Thank you. :) +1
Chris Laythorpe
I corrected the #1 (w/ and w/o confusion :)
Manitra Andriamitondra
A: 

To answer #4, and perhaps to show at least some understanding of SQL and that this isn't HW, just me trying to learn best practice:

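-- Fragment of a stored procedure body; @what, @productid and @colorid
-- are assumed to be its parameters.
SET NOCOUNT ON;
DECLARE @rc int;
IF @what = 1
BEGIN
    -- Look for an existing mapping of this color to this product.
    SELECT id FROM color_mapper WHERE product = @productid AND color = @colorid;
    SELECT @rc = @@ROWCOUNT;
    -- No row found: call the saving procedure.
    IF @rc = 0
    BEGIN
        EXEC doSavingSPROC @colorid, @productid;
    END
END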
Chris Laythorpe
-1 bad example, does not answer #4, not really relevant to the questions (and does not show understanding of SQL since you more likely copied it from a SP written by someone else)
finnw
Actually I wrote the SQL. It isn't the best way to do it as Steve N has shown, but the purpose of the question is to show my understanding of SQL - as many assumed the questions were for HW.
Chris Laythorpe
Further, your attitude is poor. Do you just visit questions with no intention to help, but to mark down answers?
Chris Laythorpe
+1  A: 

I recommend that you start by following some tutorials on this topic. Your questions are not uncommon questions for someone moving from a beginner to intermediate level in SQL. SQLZoo is an excellent resource for learning SQL so consider following that.

In response to your questions:

1) Find all records with a duplicate in a particular column

There are two steps here: find duplicate records and select those records. To find the duplicate records you should be doing something along the lines of:

select possible_duplicate_field, count(*) 
from   table 
group by possible_duplicate_field 
having count(*) > 1

What we're doing here is selecting from a table and grouping the rows by the field we want to check for duplicates. The count function then gives us the number of rows within each group. The HAVING clause indicates that we want to filter AFTER the grouping to only show the groups which have more than one entry.

This is all fine in itself but it doesn't give you the actual records that have those values on them. If you knew the duplicate values then you'd write this:

select * from table where possible_duplicate_field = 'known_duplicate_value'

We can use a SELECT within a SELECT (a subquery) to get a list of the matches:

select * 
from table 
where possible_duplicate_field in (
  select possible_duplicate_field 
  from   table 
  group by possible_duplicate_field 
  having count(*) > 1
)
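
Here is a minimal runnable sketch of both steps, assuming Python's built-in sqlite3 module. The table name records, the column link_id and the sample rows are made up, standing in for table and possible_duplicate_field.

```python
import sqlite3

# Assumption: hypothetical table "records" with a possibly duplicated link_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE records (id INTEGER PRIMARY KEY, link_id INTEGER);
    INSERT INTO records VALUES (1, 10), (2, 20), (3, 10), (4, 30), (5, 20);
""")

# Step 1: which values occur more than once, and how often?
dup_counts = conn.execute("""
    SELECT link_id, COUNT(*)
    FROM records
    GROUP BY link_id
    HAVING COUNT(*) > 1
    ORDER BY link_id
""").fetchall()

# Step 2: fetch the full rows whose value is one of those duplicates.
dup_rows = conn.execute("""
    SELECT *
    FROM records
    WHERE link_id IN (
        SELECT link_id
        FROM records
        GROUP BY link_id
        HAVING COUNT(*) > 1
    )
    ORDER BY id
""").fetchall()

print(dup_counts)  # [(10, 2), (20, 2)]
print(dup_rows)    # [(1, 10), (2, 20), (3, 10), (5, 20)]
```

Row 4 is left out because its link_id of 30 appears only once.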

2) SUM price from a linked table within the same query

This is a simple JOIN between the two tables with a SUM over the joined rows:

select sum(tableA.X + tableB.Y) 
from  tableA 
join  tableB on tableA.keyA = tableB.keyB

What you're doing here is joining two tables together where those two tables are linked by a key field. In this case, this is an inner join (which is what a plain JOIN gives you), and it operates as you would expect (i.e. get me everything from the left table which has a matching record in the right table).

3) Explain the difference between the 4 joins; LEFT, RIGHT, OUTER, INNER

Consider two tables A and B. The concept of "LEFT" and "RIGHT" is slightly clearer if you read your SQL from left to right. So, when I say:

select x from A join B ...

The left table is "A" and the right table is "B". When you explicitly say LEFT or RIGHT in the SQL statement, you are declaring which of the two tables is the primary table. What I mean by this is: Which table keeps all of its rows, even those with no match in the other table? LEFT and RIGHT are both outer joins (LEFT JOIN is shorthand for LEFT OUTER JOIN). Incidentally, if you omit LEFT or RIGHT and write a plain JOIN, SQL performs an INNER join.

INNER and OUTER declare what to do when matches don't exist in one of the tables. INNER declares that you want only the rows which have a matching record in both tables. Hence, if table A contains keys "X", "Y" and "Z", and table B contains keys "X" and "Z", then an INNER join will only return "X" and "Z" records from the two tables.

When OUTER is used (say A LEFT OUTER JOIN B, so A is the primary table and B the secondary), we're saying: Give me everything from the primary table and anything that matches from the secondary table. Hence, in the previous example, we'd get "X", "Y" and "Z" records in the output record set. However, there would be NULLs in the fields which should have come from the secondary table for key value "Y" as it doesn't exist in the secondary table. A FULL OUTER join keeps the unmatched rows from both tables.
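
The "X", "Y", "Z" example can be reproduced with a short sketch, assuming Python's built-in sqlite3 module. The table names primary_t and secondary_t and the value columns are hypothetical.

```python
import sqlite3

# Assumption: made-up tables mirroring the X/Y/Z example in the text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE primary_t (k TEXT, p_val TEXT);
    CREATE TABLE secondary_t (k TEXT, s_val TEXT);
    INSERT INTO primary_t VALUES ('X', 'px'), ('Y', 'py'), ('Z', 'pz');
    INSERT INTO secondary_t VALUES ('X', 'sx'), ('Z', 'sz');  -- no 'Y'
""")

# INNER: only keys present in both tables.
inner_rows = conn.execute("""
    SELECT p.k, s.s_val
    FROM primary_t p
    INNER JOIN secondary_t s ON p.k = s.k
    ORDER BY p.k
""").fetchall()

# LEFT OUTER: every primary row; NULL where the secondary has no match.
outer_rows = conn.execute("""
    SELECT p.k, s.s_val
    FROM primary_t p
    LEFT OUTER JOIN secondary_t s ON p.k = s.k
    ORDER BY p.k
""").fetchall()

print(inner_rows)  # [('X', 'sx'), ('Z', 'sz')]
print(outer_rows)  # [('X', 'sx'), ('Y', None), ('Z', 'sz')]
```

Python shows SQL NULL as None, so the "Y" row in the outer join is the one with no match in the secondary table.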

4) Copy data from one table to another based on SELECT and WHERE criteria

This is pretty trivial and I'm surprised you've never encountered it. It's a simple nested SELECT in an INSERT statement (this may not be supported by your database - if not, try the next option):

insert into new_table select * from old_table where x = y

This assumes the tables have the same structure. If you have different structures then you'll need to specify the columns:

insert into new_table (list, of, fields) 
    select list, of, fields from old_table where x = y
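
As a rough end-to-end sketch of the column-list form, assuming Python's built-in sqlite3 module as a stand-in for your database. The column names and sample rows are made up, and the two tables deliberately have different structures.

```python
import sqlite3

# Assumption: hypothetical old_table/new_table layouts and sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE old_table (id INTEGER PRIMARY KEY, status TEXT, note TEXT);
    CREATE TABLE new_table (src_id INTEGER, note TEXT);
    INSERT INTO old_table VALUES
        (1, 'closed', 'a'), (2, 'open', 'b'), (3, 'closed', 'c');
""")

# Copy only the rows matching the WHERE criteria, naming the columns
# explicitly because the two tables do not share the same structure.
conn.execute("""
    INSERT INTO new_table (src_id, note)
        SELECT id, note FROM old_table WHERE status = ?
""", ("closed",))
conn.commit()

copied = conn.execute(
    "SELECT src_id, note FROM new_table ORDER BY src_id").fetchall()
remaining = conn.execute("SELECT COUNT(*) FROM old_table").fetchone()[0]

print(copied)     # [(1, 'a'), (3, 'c')]
print(remaining)  # 3
```

The source table still holds all three rows afterwards, since INSERT ... SELECT copies rather than moves.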
Steve N
Fantastic Steve. Well written and the examples explain things clearly. You should write a book! ;)
Chris Laythorpe