Home > Uncategorized > Select top n rows from a table for each group

Select top n rows from a table for each group

We have a retail shop online and with just 2 weeks of our website launch we already have close to 30 orders. Now of course marketing wanted to get some ads up on the site to derive more orders and one for the reports was to get the top two products from each manufacturer which have the best promotional price.

A GROU BY clause immediately comes to mind for the above scenario, but SQL 2005/2008 offer a much better solution.

SELECT    
    ProductId,
    MfgPartNumber,
    DescriptionText,
    MfgListPrice,
    SellPrice,
    SavingsPercentage
FROM    (
    SELECT
        ROW_NUMBER() OVER(PARTITION BY P.MfgCode ORDER BY (P.MfgListPrice - PR.SellPrice) / P.MfgListPrice DESC) AS RowNumber,
        P.ProductId,
        P.MfgPartNumber,
        P.DescriptionText,
        P.MfgListPrice,
        PR.SellPrice,
        (P.MfgListPrice - PR.SellPrice) / P.MfgListPrice AS SavingsPercentage
    FROM    Product P
    INNER JOIN    Price PR
        ON    P.ProductId = PR.ProductId
    INNER JOIN    SpecialPricingXRef SP
        ON    PR.PriceGroupId = SP.PriceGroupId
    WHERE    SP.PricingType = 'Promo'
        AND    P.MfgListPrice > PR.SellPrice
    ) AS InnerTable
WHERE    RowNumber < 3
ORDER BY    SavingsPercentage DESC;

Aside from normal use of the ROW_NUMBER() function to generate sequence numbers for the result set the above is the perfect scenario where the ROW_NUMBER() function is really useful. ROW_NUMBER() function creates a new column in the result set with a unique / incremental number for each row. If the PARTITION BY clause (think of a partition as a category or group) is also specified then the sequence number of a row is reset to 1 for each new partition / category.

Lets dissect the above query; following query returns all the products which have a promo price record with a RowNumber incremented for each row. I’ll get to why we need the RowNumber in our query later.

SELECT
    ROW_NUMBER() OVER(ORDER BY (P.MfgListPrice - PR.SellPrice) / P.MfgListPrice DESC) AS RowNumber,
    P.ProductId,
    P.MfgPartNumber,
    P.DescriptionText,
    P.MfgListPrice,
    PR.SellPrice,
    (P.MfgListPrice - PR.SellPrice) / P.MfgListPrice AS SavingsPercentage
FROM    Product P
INNER JOIN    Price PR
    ON    P.ProductId = PR.ProductId
INNER JOIN    SpecialPricingXRef SP
    ON    PR.PriceGroupId = SP.PriceGroupId
WHERE    SP.PricingType = 'Promo'
    AND    P.MfgListPrice > PR.SellPrice

Adding the PARTITION BY clause to the above changes the output slightly. When the PARTITION BY clause is also specified the sequence number generated by ROW_NUMBER() is reset to 1 for each new partition. In concept the PARTITION BY clause is similar to a GROUP BY except it only applies to the function and not the select as a whole.

SELECT
    ROW_NUMBER() OVER(PARTITION BY P.MfgCode ORDER BY (P.MfgListPrice - PR.SellPrice) / P.MfgListPrice DESC) AS RowNumber,
    P.ProductId,
    P.MfgPartNumber,
    P.DescriptionText,
    P.MfgListPrice,
    PR.SellPrice,
    (P.MfgListPrice - PR.SellPrice) / P.MfgListPrice AS SavingsPercentage
FROM    Product P
INNER JOIN    Price PR
    ON    P.ProductId = PR.ProductId
INNER JOIN    SpecialPricingXRef SP
    ON    PR.PriceGroupId = SP.PriceGroupId
WHERE    SP.PricingType = 'Promo'
    AND    P.MfgListPrice > PR.SellPrice

The where clause in our outer select is just filtering the results so that only those rows with RowNumber less than 3 are returned. Recall from our previous query that RowNumber is reset to 1 for each manufacturer; so to get just the top 2 rows for each manufacturer we need to get rows where RowNumber is 1 and 2 (< 3). The result is the list of 2 rows with highest savings for each manufacturer… simple really.

P.S. If you prefer Common Table Expressions then you can rewrite the query to:

WITH InnerTable AS(
    SELECT
        ROW_NUMBER() OVER(PARTITION BY P.MfgCode ORDER BY (P.MfgListPrice - PR.SellPrice) / P.MfgListPrice DESC) AS RowNumber,
        P.ProductId, 
        P.MfgPartNumber,
        P.DescriptionText,
        P.MfgListPrice,
        PR.SellPrice,
        (P.MfgListPrice - PR.SellPrice) / P.MfgListPrice AS SavingsPercentage
    FROM    Product P
    INNER JOIN    Price PR
        ON    P.ProductId = PR.ProductId
    INNER JOIN    SpecialPricingXRef SP
        ON    PR.PriceGroupId = SP.PriceGroupId
    WHERE    SP.PricingType = 'Promo'
        AND    P.MfgListPrice > PR.SellPrice
) 

SELECT    ProductId,
    MfgPartNumber,
    DescriptionText,
    MfgListPrice,
    SellPrice,
    SavingsPercentage
FROM    InnerTable
WHERE    RowNumber <= 3
ORDER BY    SavingsPercentage DESC;

the query plan for sub query vs CTE is the same so its all a matter of choice and readability.

About these ads
  1. Danish
  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: