This article will cover the SQL PARTITION BY clause and, in particular, the difference with GROUP BY in a select statement. We will also explore various use cases of SQL PARTITION BY.
Generally, We use SQL PARTITION BY to divide the result set into partitions and perform computation on each subset of partitioned data.
Sample - Order table Like-
Suppose we want to find the following values in the Orders table-
- Minimum order value in a city
- Maximum order value in a city
- Average order value in a city
Execute the following query with GROUP BY clause to calculate these values.
SELECT Customercity,
AVG(Orderamount) AS AvgOrderAmount,
MIN(OrderAmount) AS MinOrderAmount,
SUM(Orderamount) TotalOrderAmount
FROM [dbo].[Orders]
GROUP BY Customercity;
Now, we want to add CustomerName and OrderAmount column as well in the output. Let’s add these columns in the select statement and execute the following code.
SELECT Customercity, CustomerName ,OrderAmount,
AVG(Orderamount) AS AvgOrderAmount,
MIN(OrderAmount) AS MinOrderAmount,
SUM(Orderamount) TotalOrderAmount
FROM [dbo].[Orders]
GROUP BY Customercity;
Once we execute this query, we get an error message. In the SQL GROUP BY clause, we can use a column in the select statement if it is used in Group by clause as well. It does not allow any column in the select clause that is not part of GROUP BY clause.
SQL PARTITION BY
We can use the SQL PARTITION BY clause with the OVER clause to specify the column on which we need to perform aggregation. In the previous example, we used Group By with CustomerCity column and calculated average, minimum and maximum values.
Let us rerun this scenario with the SQL PARTITION BY clause using the following query.
SELECT Customercity,
AVG(Orderamount) OVER(PARTITION BY Customercity) AS AvgOrderAmount,
MIN(OrderAmount) OVER(PARTITION BY Customercity) AS MinOrderAmount,
SUM(Orderamount) OVER(PARTITION BY Customercity) TotalOrderAmount
FROM [dbo].[Orders];
In the previous example, we get an error message if we try to add a column that is not a part of the GROUP BY clause.
We can add required columns in a select statement with the SQL PARTITION BY clause. Let us add CustomerName and OrderAmount columns and execute the following query.
SELECT Customercity,
CustomerName,
OrderAmount,
COUNT(OrderID) OVER(PARTITION BY Customercity) AS CountOfOrders,
AVG(Orderamount) OVER(PARTITION BY Customercity) AS AvgOrderAmount,
MIN(OrderAmount) OVER(PARTITION BY Customercity) AS MinOrderAmount,
SUM(Orderamount) OVER(PARTITION BY Customercity) TotalOrderAmount
FROM [dbo].[Orders];
No comments:
Post a Comment