Group By Clause

With Group by we can arrange rows in groups and apply aggregate functions against those groups.

USE TSQL2012;
SELECT COUNT(*) AS numorders
FROM Sales.Orders;

This query generates the following output.

numorders
———–
830

Because there is no GROUP BY clause, all rows queried from the Sales.Orders table are arranged in one group, and then the COUNT(*) function counts the number of rows in that group. Grouped queries return one result row per group, and because the query defines only one group, it returns only one row in the result set.

Now Using GROUP BY clause, you can group the rows based on a specified grouping set of expressions.

For example, the following query groups the rows by shipper ID and
counts the number of rows (orders, in this case) per each distinct group.
SELECT shipperid, COUNT(*) AS numorders
FROM Sales.Orders
GROUP BY shipperid;

This query generates the following output.
shipperid     numorders
———–          ———–
1                    249
2                   326
3                   255

The query identifies three groups because there are three distinct shipper IDs.

The grouping set can be made of multiple elements. For example, the following query groups the rows by shipper ID and shipped year.
SELECT shipperid, YEAR(shippeddate) AS shippedyear,
COUNT(*) AS numorders
FROM Sales.Orders
GROUP BY shipperid, YEAR(shippeddate);

shipperid         shippedyear            numorders
———–                ———–                     ———–
1                           2008                         79
3                           2008                         73
1                         NULL                         4
3                         NULL                         6
1                          2006                         36
2                          2007                        143
2                         NULL                        11
3                          2006                         51
1                          2007                        130
2                          2008                        116
2                          2006                        56
3                          2007                        125

Notice that you get a group for each distinct shipper ID and shipped year combination that exists in the data, even when the shipped year is NULL. Remember that a NULL in the shippeddate column represents unshipped orders, so a NULL in the shippedyear column represents the group of unshipped orders for the respective shipper.
If you need to filter entire groups, you need a filtering option that is evaluated at the group level not like the WHERE clause, which is evaluated at the row level. For this, SQL provides the HAVING clause. Like the WHERE clause, the HAVING clause uses a predicate but evaluates the predicate per group as opposed to per row. This means that you can refer to aggregate computations because the data has already been grouped.

For example, suppose that you need to group only shipped orders by shipper ID and shipping year, and filter only groups having fewer than 100 orders. You can use the following query to achieve this task.
SELECT shipperid, YEAR(shippeddate) AS shippedyear,
COUNT(*) AS numorders
FROM Sales.Orders
WHERE shippeddate IS NOT NULL
GROUP BY shipperid, YEAR(shippeddate)
HAVING COUNT(*) < 100;
This query generates the following output

shipperid         shippedyear        numorders
———–              ———–                    ———–
1                       2008                          79
3                       2008                          73
1                       2006                          36
3                       2006                          51
2                       2006                          56

Notice that the query filters only shipped orders in the WHERE clause. This filter is applied at the row level conceptually before the data is grouped. Next the query groups the data by shipper ID and shipped year. Then the HAVING clause filters only groups that have a count of rows (orders) that is less than 100. Finally, the SELECT clause returns the shipper ID, shipped year, and count of orders per each remaining group.

The following query invokes the COUNT(*) function, in addition to a number of general set functions, including COUNT.

SELECT shipperid,
COUNT(*) AS numorders,
COUNT(shippeddate) AS shippedorders,
MIN(shippeddate) AS firstshipdate,
MAX(shippeddate) AS lastshipdate,
SUM(val) AS totalvalue
FROM Sales.OrderValues
GROUP BY shipperid;

This query generates the following output (dates formatted for readability).
shipperid   numorders   shippedorders  firstshipdate lastshipdate       totalvalue
———–          ———-             ————             ————-          ————-           ———–
3                       255                    249                    2006-07-15     2008-05-01       383405.53
1                       249                    245                    2006-07-10    2008-05-04       348840.00
2                       326                    315                    2006-07-11     2008-05-06       533547.69

The difference in count(*) and count(shippeddate) is count(*) does not ignore NULL where count(shippeddate) does.

With general set functions, you can work with distinct occurrences by specifying a DISTINCT clause before the expression, as follows.
SELECT shipperid, COUNT(DISTINCT shippeddate) AS numshippingdates
FROM Sales.Orders
GROUP BY shipperid;

This query generates the following output.
shipperid        numshippingdates
———–               —————–
1                          188
2                          215
3                          198

SELECT shipperid, COUNT(shippeddate) AS numshippingdates
FROM Sales.Orders
GROUP BY shipperid;

This query generates the following output.
shipperid        numshippingdates
———–               —————–
1                          249
2                          245
3                          315

Note that the DISTINCT option is available not only to the COUNT function, but also to other general set functions. However, it’s more common to use it with COUNT.

SELECT S.shipperid, S.companyname,
COUNT(*) AS numorders
FROM Sales.Shippers AS S
INNER JOIN Sales.Orders AS O
ON S.shipperid = O.shipperid
GROUP BY S.shipperid, S.companyname;
This query generates the following output.
shipperid            companyname               numorders
———–                       ————–                        ———–
1                               Shipper GVSUA               249
2                               Shipper ETYNR               326
3                               Shipper ZHISN                 255

 

Thanks