Adventures in Machine Learning

Mastering Common Table Expressions and Recursive Queries in SQL

Introduction to Common Table Expressions

Structured Query Language (SQL) is a programming language used to access and manage data in relational databases. Common Table Expressions (CTE) is a feature of SQL that allows you to define a temporary named result set that can be referenced within a main query without the need for repeated code.

Essentially, CTEs act as virtual tables that are available only for the duration of a query.

CTEs are useful for several reasons, including improving query readability, breaking down complex queries into smaller, reusable pieces, and reducing the execution time of queries.

In this article, we will explore the definition and functionality of CTEs, along with examples of SQL queries that utilize CTEs.

Definition and Functionality

A Common Table Expression is a temporary named result set that can be accessed multiple times in a single query. It can be created using the

WITH keyword and can be used to enhance the readability and performance of SQL queries. CTEs can be thought of as virtual tables that are only available for the duration of a query and can be referenced by other queries within the main query.

In more technical terms, a CTE defines a query that can be treated like a table or a view. The CTE query is executed once and the results are stored in a temporary result set.

The main query then references that result set to build the final output.

Syntax and Examples

The syntax of CTEs consists of the

WITH keyword, followed by the name of the CTE query and the columns it returns. The main query references the CTE query by name.

The syntax for CTEs can be summarized as:

WITH CTE_Query_Name (Column1, Column2, ...) AS (
    CTE_Query
)
SELECT ... FROM ...
JOIN CTE_Query_Name ON ...

Here’s an example of how to use a CTE to create a report of employees who have worked at the company for over 10 years:

WITH TenYearEmployees (EmployeeID, FirstName, LastName, Tenure) AS (
    SELECT 
        EmployeeID,
        FirstName,
        LastName,
        DATEDIFF(YEAR, HireDate, GETDATE()) as Tenure
    FROM Employees
    WHERE
        DATEDIFF(YEAR, HireDate, GETDATE()) >= 10
)
SELECT * FROM TenYearEmployees

In this example, we created a CTE named “TenYearEmployees” that contains the EmployeeID, FirstName, LastName, and Tenure (years of employment) for employees who have worked at the company for over 10 years. The main query then selects all columns from the “TenYearEmployees” CTE.

Examples of SQL Queries using Common Table Expressions

Example 1: Obtaining a report with the highest priced item sold

In this example, we will use a CTE to obtain a report showing the highest priced item sold:

WITH MaxSale (ItemName, SaleAmount) AS (
    SELECT TOP 1
        Items.ItemName,
        MAX(Sales.SaleAmount)
    FROM Sales
    JOIN Items ON Sales.ItemID = Items.ItemID
    GROUP BY Items.ItemName
)
SELECT * FROM MaxSale

In this example, we created a CTE named “MaxSale” that retrieves the highest sale amount for each item, then selects the item with the highest sale amount. The main query then selects all columns from the “MaxSale” CTE.

Example 2: Generating a report with the highest daily revenue by branch

In this example, we will use a CTE to generate a report of the highest daily revenue by branch:

WITH BranchRevenue (BranchID, Revenue, Date) AS (
    SELECT
        Sales.BranchID,
        SUM(Sales.SaleAmount),
        CAST(Sales.SaleDate AS Date)
    FROM Sales
    GROUP BY Sales.BranchID, CAST(Sales.SaleDate AS Date)
)
SELECT 
    Branches.BranchName,
    BranchRevenue.Date,
    MAX(BranchRevenue.Revenue) as HighestRevenue
FROM Branches
JOIN BranchRevenue ON Branches.BranchID = BranchRevenue.BranchID
GROUP BY BranchRevenue.Date, Branches.BranchName

In this example, we created a CTE named “BranchRevenue” that calculates the total sales revenue for each branch on each day. The main query then joins the “Branches” and “BranchRevenue” tables, grouping the results by date and branch name, and selecting the date and highest revenue for each date and branch.

Example 3: Organizing a long query using SQL CTEs

In this example, we will use multiple CTEs to organize a long query:

WITH 
OrderedItems (ItemID, ItemName, Price) AS (
    SELECT 
        Items.ItemID,
        Items.ItemName,
        Items.Price
    FROM Items
    ORDER BY Items.ItemName
),
TotalSales (ItemID, TotalSales) AS (
    SELECT 
        Sales.ItemID,
        SUM(Sales.Quantity)
    FROM Sales
    GROUP BY Sales.ItemID
),
Locations (LocationID, LocationName, ClientID) AS (
    SELECT 
        Locations.LocationID,
        Locations.LocationName,
        Clients.ClientID
    FROM Locations
    JOIN Clients ON Clients.ClientID = Locations.ClientID
)
SELECT 
    OrderedItems.ItemName,
    OrderedItems.Price,
    TotalSales.TotalSales,
    Locations.LocationName
FROM OrderedItems
JOIN TotalSales ON OrderedItems.ItemID = TotalSales.ItemID
JOIN Sales ON Sales.ItemID = TotalSales.ItemID
JOIN Locations ON Locations.LocationID = Sales.LocationID
AND Locations.ClientID = 'ABC123'

In this example, we created three CTEs to organize a long query. The “OrderedItems” CTE orders the items by name, the “TotalSales” CTE calculates the total sales for each item, and the “Locations” CTE retrieves location information for a specific client.

The main query then joins the “OrderedItems”, “TotalSales”, “Sales”, and “Locations” tables, selecting the ordered item name, price, total sales, and location name for a specific client.

Example 4: Obtaining a report with the largest ticket sold by branch

In this example, we will use a CTE to obtain a report showing the largest ticket sold by branch:

WITH LargestTicketByBranch (
    BranchID,
    SaleAmount,
    RowNum
) AS (
    SELECT
        Sales.BranchID,
        Sales.SaleAmount,
        ROW_NUMBER() OVER(
            PARTITION BY Sales.BranchID 
            ORDER BY Sales.SaleAmount DESC
        ) AS RowNum
    FROM Sales
)
SELECT *
FROM LargestTicketByBranch
WHERE RowNum = 1

In this example, we created a CTE named “LargestTicketByBranch” that uses the ROW_NUMBER() function to calculate the row number for each sale partitioned by branch, and ordered by sale amount in descending order. The main query then selects all columns from the “LargestTicketByBranch” CTE where the row number is equal to one, thus selecting the largest ticket sold by each branch.

Example 5: Using a nested CTE to obtain items priced over $90 sold by London-2

In this example, we will use a nested CTE to obtain a report of items priced over $90 sold by London-2:

WITH 
LocationSales (LocationID, ItemID) AS (
    SELECT
        LocationSales.LocationID,
        LocationSales.ItemID
    FROM Sales
    INNER JOIN Locations ON Sales.LocationID = Locations.LocationID
    WHERE Locations.LocationName = 'London-2'
),
HighPricedItems (ItemID, ItemName, Price) AS (
    SELECT
        Items.ItemID,
        Items.ItemName,
        Items.Price
    FROM Items
    WHERE Items.Price > 90
)
SELECT
    HighPricedItems.ItemName,
    HighPricedItems.Price
FROM HighPricedItems
INNER JOIN (
    SELECT DISTINCT ItemID 
    FROM LocationSales
) AS SoldItems ON HighPricedItems.ItemID = SoldItems.ItemID

In this example, we created a nested CTE to obtain a report of items priced over $90 sold by London-2. The “LocationSales” CTE retrieves all item sales for London-2, while the “HighPricedItems” CTE retrieves all items with a price over $90.

The main query then joins the “HighPricedItems” and “LocationSales” tables, selecting the item name and price for items sold in London-2.

Conclusion

Common Table Expressions are a useful feature of SQL that allows you to define temporary named result sets that can be reused within a query. CTEs improve query readability and performance and allow you to organize complex queries into smaller, reusable pieces.

The examples provided in this article demonstrate how CTEs can be used to obtain reports with the highest priced item sold, generate reports of the highest daily revenue by branch, organize long queries, obtain reports with the largest ticket sold by branch, and filter items priced over $90 sold by a specific location.

3) Recursive Queries and Common Table Expressions

Recursive queries are a subset of SQL that allows you to work with recursive data structures. It enables you to loop through and traverse through hierarchical data, making it a useful feature for dealing with trees, graphs, organizational data, and many other types of data structures that need to be traversed recursively.

The primary functionality of recursive queries is that they allow you to reference a non-recursive query term as an initial starting point and then join it with a recursive query term. The recursive query term is defined in a Common Table Expression (CTE) and refers back to itself in order to progress through each step of the recursive hierarchy.

Syntax and Examples

The syntax for recursive queries using Common Table Expressions is similar to that of CTEs, with the addition of the RECURSIVE keyword. Here is an example of how to use a recursive query to traverse a simple hierarchical structure:

WITH RECURSIVE EmployeeHierarchy (EmployeeID, EmployeeName, ManagerID, Level) AS (
    SELECT EmployeeID, EmployeeName, ManagerID, 1 as Level
    FROM Employees
    WHERE ManagerID IS NULL
    UNION ALL
    SELECT Employees.EmployeeID, Employees.EmployeeName, Employees.ManagerID, EmployeeHierarchy.Level + 1
    FROM Employees
    JOIN EmployeeHierarchy ON Employees.ManagerID = EmployeeHierarchy.EmployeeID
)
SELECT EmployeeName, Level
FROM EmployeeHierarchy
ORDER BY Level DESC;

In this example, we create a recursive CTE named “EmployeeHierarchy” that selects the top-level employees (i.e. those with no manager), and then joins them with their respective employees at the next level. The second part of the query joins the “Employees” table with the “EmployeeHierarchy” CTE and recursively references itself to traverse the hierarchy until all employees have been included.

The final part of the query selects the employee’s name and level and orders the result set by descending level.

Conclusion and Resources

Using Common Table Expressions allows for more readable SQL queries and presents many benefits, including breaking down complex queries and improving performance. The addition of recursive queries to CTEs makes it easier to manage and traverse hierarchical data structures.

If you want to learn more about recursive queries using CTEs, there are many online resources available. You can start by taking an online course such as the Recursive Queries course offered by Codeacademy, or tackle the Advanced SQL track that offers advanced CTE concepts and recursive data structures.

Moreover, for more comprehensive information on CTEs and recursive queries, you can refer to SQL interview questions, which provide insights into how these concepts are used in a real-world scenario. Additional online resources and forums also offer discussions on recursive SQL tree traversal and other complex CTE examples that can help you grasp and master this feature.

Common Table Expressions (CTEs) are a powerful feature of SQL that allow you to create temporary result sets that can be referenced in a main query. CTEs improve query readability and performance, and can be used to organize and breakdown complex queries.

Recursive queries are a subset of SQL that can work with recursive data structures through the use of CTEs. By using recursive queries with CTEs, you can easily traverse hierarchical data structures and graphs. Taking online courses and accessing online resources can help in mastering these concepts, which are widely used in real-world scenarios, including SQL interview questions and recursive SQL tree traversal.

The ability to understand and use CTEs and recursive queries provides valuable insights into managing and structuring data suited for many complex problems.

Popular Posts