Adventures in Machine Learning

Mastering Common Table Expressions and Recursive Queries in SQL

Introduction to Common Table Expressions

Structured Query Language (SQL) is a programming language used to access and manage data in relational databases. Common Table Expressions (CTE) is a feature of SQL that allows you to define a temporary named result set that can be referenced within a main query without the need for repeated code.

Essentially, CTEs act as virtual tables that are available only for the duration of a query.

CTEs are useful for several reasons, including improving query readability, breaking down complex queries into smaller, reusable pieces, and reducing the execution time of queries.

In this article, we will explore the definition and functionality of CTEs, along with examples of SQL queries that utilize CTEs.

Definition and Functionality

A Common Table Expression is a temporary named result set that can be accessed multiple times in a single query. It can be created using the

WITH keyword and can be used to enhance the readability and performance of SQL queries. CTEs can be thought of as virtual tables that are only available for the duration of a query and can be referenced by other queries within the main query.

In more technical terms, a CTE defines a query that can be treated like a table or a view. The CTE query is executed once and the results are stored in a temporary result set.

The main query then references that result set to build the final output.

Syntax and Examples

The syntax of CTEs consists of the

WITH keyword, followed by the name of the CTE query and the columns it returns. The main query references the CTE query by name.

The syntax for CTEs can be summarized as:

WITH CTE_Query_Name (Column1, Column2, …) AS (

CTE_Query

)

SELECT … FROM …

JOIN CTE_Query_Name ON … Here’s an example of how to use a CTE to create a report of employees who have worked at the company for over 10 years:

WITH TenYearEmployees (EmployeeID, FirstName, LastName, Tenure) AS (

SELECT

EmployeeID,

FirstName,

LastName,

DATEDIFF(YEAR, HireDate, GETDATE()) as Tenure

FROM Employees

WHERE

DATEDIFF(YEAR, HireDate, GETDATE()) >= 10

)

SELECT * FROM TenYearEmployees

In this example, we created a CTE named “TenYearEmployees” that contains the EmployeeID, FirstName, LastName, and Tenure (years of employment) for employees who have worked at the company for over 10 years. The main query then selects all columns from the “TenYearEmployees” CTE.

Examples of SQL Queries using Common Table Expressions

Example 1: Obtaining a report with the highest priced item sold

In this example, we will use a CTE to obtain a report showing the highest priced item sold:

WITH MaxSale (ItemName, SaleAmount) AS (

SELECT TOP 1

Items.ItemName,

MAX(Sales.SaleAmount)

FROM Sales

JOIN Items ON Sales.ItemID = Items.ItemID

GROUP BY Items.ItemName

)

SELECT * FROM MaxSale

In this example, we created a CTE named “MaxSale” that retrieves the highest sale amount for each item, then selects the item with the highest sale amount. The main query then selects all columns from the “MaxSale” CTE.

Example 2: Generating a report with the highest daily revenue by branch

In this example, we will use a CTE to generate a report of the highest daily revenue by branch:

WITH BranchRevenue (BranchID, Revenue, Date) AS (

SELECT

Sales.BranchID,

SUM(Sales.SaleAmount),

CAST(Sales.SaleDate AS Date)

FROM Sales

GROUP BY Sales.BranchID, CAST(Sales.SaleDate AS Date)

)

SELECT

Branches.BranchName,

BranchRevenue.Date,

MAX(BranchRevenue.Revenue) as HighestRevenue

FROM Branches

JOIN BranchRevenue ON Branches.BranchID = BranchRevenue.BranchID

GROUP BY BranchRevenue.Date, Branches.BranchName

In this example, we created a CTE named “BranchRevenue” that calculates the total sales revenue for each branch on each day. The main query then joins the “Branches” and “BranchRevenue” tables, grouping the results by date and branch name, and selecting the date and highest revenue for each date and branch.

Example 3: Organizing a long query using SQL CTEs

In this example, we will use multiple CTEs to organize a long query:

WITH

OrderedItems (ItemID, ItemName, Price) AS (

SELECT

Items.ItemID,

Items.ItemName,

Items.Price

FROM Items

ORDER BY Items.ItemName

),

TotalSales (ItemID, TotalSales) AS (

SELECT

Sales.ItemID,

SUM(Sales.Quantity)

FROM Sales

GROUP BY Sales.ItemID

),

Locations (LocationID, LocationName, ClientID) AS (

SELECT

Locations.LocationID,

Locations.LocationName,

Clients.ClientID

FROM Locations

JOIN Clients ON Clients.ClientID = Locations.ClientID

)

SELECT

OrderedItems.ItemName,

OrderedItems.Price,

TotalSales.TotalSales,

Locations.LocationName

FROM OrderedItems

JOIN TotalSales ON OrderedItems.ItemID = TotalSales.ItemID

JOIN Sales ON Sales.ItemID = TotalSales.ItemID

JOIN Locations ON Locations.LocationID = Sales.LocationID

AND Locations.ClientID = ‘ABC123’

In this example, we created three CTEs to organize a long query. The “OrderedItems” CTE orders the items by name, the “TotalSales” CTE calculates the total sales for each item, and the “Locations” CTE retrieves location information for a specific client.

The main query then joins the “OrderedItems”, “TotalSales”, “Sales”, and “Locations” tables, selecting the ordered item name, price, total sales, and location name for a specific client. Example 4: Obtaining a report with the largest ticket sold by branch

In this example, we will use a CTE to obtain a report showing the largest ticket sold by branch:

WITH LargestTicketByBranch (

BranchID,

SaleAmount,

RowNum

) AS (

SELECT

Sales.BranchID,

Sales.SaleAmount,

ROW_NUMBER() OVER(

PARTITION BY Sales.BranchID

ORDER BY Sales.SaleAmount DESC

) AS RowNum

FROM Sales

)

SELECT *

FROM LargestTicketByBranch

WHERE RowNum = 1

In this example, we created a CTE named “LargestTicketByBranch” that uses the ROW_NUMBER() function to calculate the row number for each sale partitioned by branch, and ordered by sale amount in descending order. The main query then selects all columns from the “LargestTicketByBranch” CTE where the row number is equal to one, thus selecting the largest ticket sold by each branch.

Example 5: Using a nested CTE to obtain items priced over $90 sold by London-2

In this example, we will use a nested CTE to obtain a report of items priced over $90 sold by London-2:

WITH

LocationSales (LocationID, ItemID) AS (

SELECT

LocationSales.LocationID,

LocationSales.ItemID

FROM Sales

INNER JOIN Locations ON Sales.LocationID = Locations.LocationID

WHERE Locations.LocationName = ‘London-2’

),

HighPricedItems (ItemID, ItemName, Price) AS (

SELECT

Items.ItemID,

Items.ItemName,

Items.Price

FROM Items

WHERE Items.Price > 90

)

SELECT

HighPricedItems.ItemName,

HighPricedItems.Price

FROM HighPricedItems

INNER JOIN (

SELECT DISTINCT ItemID

FROM LocationSales

) AS SoldItems ON HighPricedItems.ItemID = SoldItems.ItemID

In this example, we created a nested CTE to obtain a report of items priced over $90 sold by London-2. The “LocationSales” CTE retrieves all item sales for London-2, while the “HighPricedItems” CTE retrieves all items with a price over $90.

The main query then joins the “HighPricedItems” and “LocationSales” tables, selecting the item name and price for items sold in London-2.

Conclusion

Common Table Expressions are a useful feature of SQL that allows you to define temporary named result sets that can be reused within a query. CTEs improve query readability and performance and allow you to organize complex queries into smaller, reusable pieces.

The examples provided in this article demonstrate how CTEs can be used to obtain reports with the highest priced item sold, generate reports of the highest daily revenue by branch, organize long queries, obtain reports with the largest ticket sold by branch, and filter items priced over $90 sold by a specific location.

3) Recursive Queries and Common Table Expressions

Recursive queries are a subset of SQL that allows you to work with recursive data structures. It enables you to loop through and traverse through hierarchical data, making it a useful feature for dealing with trees, graphs, organizational data, and many other types of data structures that need to be traversed recursively.

The primary functionality of recursive queries is that they allow you to reference a non-recursive query term as an initial starting point and then join it with a recursive query term. The recursive query term is defined in a Common Table Expression (CTE) and refers back to itself in order to progress through each step of the recursive hierarchy.

Syntax and Examples

The syntax for recursive queries using Common Table Expressions is similar to that of CTEs, with the addition of the RECURSIVE keyword. Here is an example of how to use a recursive query to traverse a simple hierarchical structure:

WITH RECURSIVE EmployeeHierarchy (EmployeeID, EmployeeName, ManagerID, Level) AS (

SELECT EmployeeID, EmployeeName, ManagerID, 1 as Level

FROM Employees

WHERE ManagerID IS NULL

UNION ALL

SELECT Employees.EmployeeID, Employees.EmployeeName, Employees.ManagerID, EmployeeHierarchy.Level + 1

FROM Employees

JOIN EmployeeHierarchy ON Employees.ManagerID = EmployeeHierarchy.EmployeeID

)

SELECT EmployeeName, Level

FROM EmployeeHierarchy

ORDER BY Level DESC;

In this example, we create a recursive CTE named “EmployeeHierarchy” that selects the top-level employees (i.e. those with no manager), and then joins them with their respective employees at the next level. The second part of the query joins the “Employees” table with the “EmployeeHierarchy” CTE and recursively references itself to traverse the hierarchy until all employees have been included.

The final part of the query selects the employee’s name and level and orders the result set by descending level.

Conclusion and Resources

Using Common Table Expressions allows for more readable SQL queries and presents many benefits, including breaking down complex queries and improving performance. The addition of recursive queries to CTEs makes it easier to manage and traverse hierarchical data structures.

If you want to learn more about recursive queries using CTEs, there are many online resources available. You can start by taking an online course such as the Recursive Queries course offered by Codeacademy, or tackle the Advanced SQL track that offers advanced CTE concepts and recursive data structures.

Moreover, for more comprehensive information on CTEs and recursive queries, you can refer to SQL interview questions, which provide insights into how these concepts are used in a real-world scenario. Additional online resources and forums also offer discussions on recursive SQL tree traversal and other complex CTE examples that can help you grasp and master this feature.

Common Table Expressions (CTEs) are a powerful feature of SQL that allow you to create temporary result sets that can be referenced in a main query. CTEs improve query readability and performance, and can be used to organize and breakdown complex queries.

Recursive queries are a subset of SQL that can work with recursive data structures through the use of CTEs. By using recursive queries with CTEs, you can easily traverse hierarchical data structures and graphs. Taking online courses and accessing online resources can help in mastering these concepts, which are widely used in real-world scenarios, including SQL interview questions and recursive SQL tree traversal.

The ability to understand and use CTEs and recursive queries provides valuable insights into managing and structuring data suited for many complex problems.

Popular Posts