MsSQL having max(id) = id problem –> row_number partition

One problem I haven’t been able to solve in a satisfying way is to create a query that returns a full row based on an aggregate function. Suppose we have the following database schema:

If you need to write a query that returns the most expensive book for each customer you probably would first write a query like:

SELECT C.customerId, C.Name, B.Title FROM customers C 
INNER JOIN orders O ON C.CustomerId = O.CustomerId 
 
INNER JOIN Books B ON B.BookId = O.BookId 
GROUP BY C.customerId, C.Name, B.Title 
 
HAVING B.price = MAX(B.price)

Which results in this error:

Column ‘Books.Price’ is invalid in the HAVING clause because it is not contained in either an aggregate function or the GROUP BY clause.

The only solution I found was to create a sub-query that returns price of the most expensive book per customer and join that sub query on the price. Since Sql server 2008 there is a very nice solution to this problem. The ROW_NUMBER PARTITION function, that function returns a row number for each record, which in nice. But how is this going to solve your problem? That will happen when you use the “OVER (PARTITION BY …. ORDER BY…) “. This tells Sql server when the row number needs to be reset and how to number. For our database schema this this query looks like this:

SELECT *, ROW_NUMBER() OVER(PARTITION BY C.CustomerId ORDER BY PRICE DESC) rownumber FROM customers C 
INNER JOIN orders O ON C.CustomerId = O.CustomerId 
INNER JOIN Books B ON B.BookId = O.BookId

 

This query returns all books every customer has ordered, with an extra column “rownumber” that is reset per customer, and is ordered from highest to lowest price:

Now, all we need to do is to get the rows where rownumber equals 1. For that we need to create a sub query, but this sub queru doesn’t cause any extra table scans and is very light:

SELECT * FROM ( 
    SELECT C.customerId, C.name, B.Title,B.Author, B.Price, ROW_NUMBER() OVER(PARTITION BY C.CustomerId ORDER BY PRICE DESC) rownumber FROM customers C 
    INNER JOIN orders O ON C.CustomerId = O.CustomerId 
    INNER JOIN Books B ON B.BookId = O.BookId 
) tmp 
WHERE tmp.rownumber = 1

That gives use this result:

And that is what e needed in the first place.

Comments are closed.