GROUP BY
would probably allow the SQL engine to better optimize your query but would make it harder to read due to the large number of grouping parameters.
Another option as recommended by the SQL Team is to consider using Sub queries. This can often make the GROUP BY
statements much simpler and makes the overall query much easier to read.
Using a Sub query:
SELECT orders.billerID,
orders.invoiceDate,
orders.txnID,
orders.bName,
orders.bStreet1,
orders.bStreet2,
orders.bCity,
orders.bState,
orders.bZip,
orders.bCountry,
orders.sName,
orders.sStreet1,
orders.sStreet2,
orders.sCity,
orders.sState,
orders.sZip,
orders.sCountry,
orders.paymentType,
orders.invoiceNotes,
orders.pFee,
orders.shipping,
orders.tax,
orders.reasonCode,
orders.txnType,
orders.customerID,
customers.firstName AS firstName,
customers.lastName AS lastName,
customers.businessName AS businessName,
orderStatus.statusName AS orderStatus,
orderItem.fees + orders.shipping + orders.tax AS orderTotal,
orderItem.fees + orders.shipping + orders.tax - payments.amount AS orderBalance
FROM orders
LEFT JOIN customers ON orders.customerID = customers.id
LEFT JOIN orderStatus ON orders.orderStatus = orderStatus.id
LEFT JOIN
(
SELECT orderID, SUM(itemPrice * itemQuantity) as fees
FROM orderItems
GROUP BY orderID
) orderItems ON orderItems.orderID = orders.id
LEFT JOIN
(
SELECT orderID, SUM(amount) as amount
FROM payments
GROUP BY orderID
) payments ON payments.orderID = orders.id
Using a GROUP BY
:
SELECT orders.billerID,
orders.invoiceDate,
orders.txnID,
orders.bName,
orders.bStreet1,
orders.bStreet2,
orders.bCity,
orders.bState,
orders.bZip,
orders.bCountry,
orders.sName,
orders.sStreet1,
orders.sStreet2,
orders.sCity,
orders.sState,
orders.sZip,
orders.sCountry,
orders.paymentType,
orders.invoiceNotes,
orders.pFee,
orders.shipping,
orders.tax,
orders.reasonCode,
orders.txnType,
orders.customerID,
customers.firstName AS firstName,
customers.lastName AS lastName,
customers.businessName AS businessName,
orderStatus.statusName AS orderStatus,
SUM(orderItems.itemPrice * orderItems.itemQuantity) + orders.shipping + orders.tax AS orderTotal,
SUM(orderItems.itemPrice * orderItems.itemQuantity) + orders.shipping + orders.tax - SUM(payments.amount) AS orderBalance
FROM orders
LEFT JOIN customers ON orders.customerID = customers.id
LEFT JOIN orderStatus ON orders.orderStatus = orderStatus.id
LEFT JOIN orderItems ON orderItems.orderID = orders.id
LEFT JOIN payments ON payments.orderID = orders.id
GROUP BY
orders.billerID,
orders.invoiceDate,
orders.txnID,
orders.bName,
orders.bStreet1,
orders.bStreet2,
orders.bCity,
orders.bState,
orders.bZip,
orders.bCountry,
orders.sName,
orders.sStreet1,
orders.sStreet2,
orders.sCity,
orders.sState,
orders.sZip,
orders.sCountry,
orders.paymentType,
orders.invoiceNotes,
orders.pFee,
orders.shipping,
orders.tax,
orders.reasonCode,
orders.txnType,
orders.customerID,
customers.firstName,
customers.lastName,
customers.businessName,
orderStatus.statusName
GROUP BY
Explained:
You can thing of GROUP BY
as collecting records together that have similar data. For my example I am going to use a simple produce table with Category
, Name
and Price
columns. If I group the data by Category
I can aggregate ( i.e. SUM
, COUNT
, MIN
, MAX
, etc.) based on any of the other columns. Since I am grouping by the Category
column the resulting records will have a unique value for Category
. Any of the other columns might be return different value and therefore cannot be included in the select statement.
Name, Category, Price
Green Peppers, Peppers, 1.50
Orange Peppers, Peppers, 2.50
Yellow Peppers, Peppers, 2.50
Lemons, Citrus, 1.00
Oranges, Citrus, 1.00
Limes, Citrus, 1.00
SELECT
Category, /* This is unique because it is in the GROUP BY clause */
AVG(Price) AS AveragePrice,
MAX(Price) AS MaxPrice,
MIN(Price) AS MinPrice
/* , Name */ /* This is invalid because it is not in the GROUP BY clause */
/* The values are not unique so SQL does not know what to return */
FROM Produce
GROUP BY Category