CTEs vs. Subqueries: Which One Should You Use?
If you’re an Oracle or SQL developer, you’ve probably been using subqueries for years. But what about CTEs?
What are they, and how can they be used to write better code? In this article, we explore the differences between CTEs and subqueries, the benefits of using CTEs over subqueries, and when you should make the switch to CTEs and Subqueries.
Before we dive into the benefits of using CTEs, let’s define what they are.
CTEs, or Common Table Expressions, are temporary result sets that are defined within a SELECT, INSERT, UPDATE, or DELETE statement. They are similar to subqueries in that they allow you to break up complex queries into smaller, more manageable pieces.
However, CTEs differ from subqueries in that they are reusable within a query, making them more efficient and easier to read. On the other hand, subqueries are queries that are embedded in a larger query.
They can be used to retrieve data that will be used in the main query’s selection criteria or to calculate aggregate values. Subqueries can also be used to join tables that would not otherwise be joinable.
Differences between CTEs and Subqueries
The primary difference between CTEs and subqueries is their structure. Subqueries are embedded within a larger query, while CTEs are defined separately and then referenced within a query.
This makes CTEs easier to read and understand, especially for complex queries. Another advantage of CTEs over subqueries is that they can be nested.
This means that you can use one CTE within another CTE to break up complex computations into parts. This makes it easier to read and understand the code, and it also makes it easier to modify the query if needed.
Finally, CTEs allow recursion, which is the ability to refer to a query in its own definition. This can be useful if you are trying to retrieve hierarchical data or if you need to perform calculations that require multiple levels of computation.
Benefits of CTEs over Subqueries
Now that we’ve defined CTEs and subqueries and identified their differences, let’s take a closer look at the benefits of CTEs.
Meaningful Names
One of the benefits of using CTEs is that they allow you to use meaningful names. Instead of having a subquery that is buried within a larger query and hard to read, you can define a CTE with a name that describes what the query does.
For example, instead of having a subquery that calculates the average age of customers, you could define a CTE called ‘customer_age’ that holds the calculation. This makes it easier to read and understand the query, and it also makes it easier to modify the query if needed.
Reusability
Another advantage of CTEs over subqueries is that they are reusable within a query. This means that you can define a CTE once and then reference it multiple times within the same query.
For example, if you need to calculate the sales for each quarter, you could define a CTE called ‘quarterly_sales’ that holds the calculation. You could then reference this CTE multiple times within the same query, making it more efficient and easier to read.
Computation Process
CTEs also allow you to divide complex computations into parts. This can be useful if you are trying to perform a calculation that involves multiple steps.
For example, if you need to calculate the average salary of employees who work in different departments, you could define a CTE to hold each department’s average salary calculation. You could then reference these CTEs within another CTE to calculate the overall average salary.
This makes the code easier to read, understand, and modify if needed.
Nested Computations
As mentioned earlier, CTEs can be nested, which makes it easier to break up complex computations into smaller parts. You can define a CTE within another CTE, making it easier to read and understand the code.
You can also reference the inner CTE in the outer CTE, making it more efficient and easier to modify if needed.
Recursion
Finally, CTEs allow recursion, which is the ability to refer to a query in its own definition. This can be useful if you are trying to retrieve hierarchical data or perform calculations that require multiple levels of computation.
For example, you can use recursion to retrieve all the employees who report to a specific manager.
Conclusion
In conclusion, CTEs are a powerful tool for Oracle and SQL developers. They offer several advantages over subqueries, including meaningful names, reusability, division of complex computations, nested computations, and recursion.
While subqueries are still useful in certain situations, CTEs are quickly becoming the preferred choice for developers who want to write cleaner, more efficient code. So, if you haven’t already, consider making the switch to CTEs. Your code (and your coworkers) will thank you!
Examples of CTEs vs. Subqueries
Now that we’ve covered the differences between CTEs and subqueries, let’s take a look at some examples of how you can use both to write better code. In this section, we’ll cover five different scenarios and show you how to use CTEs and subqueries to solve them.
Example 1: Comparing performance with average KPI
Suppose you have a table that contains employee performance data and you want to compare each employee’s performance to the average KPI (Key Performance Indicator) for all employees. You could use a subquery to calculate the average KPI and then join it with the main table to calculate the deviation from the average.
Here’s an example of how you would do that:
SELECT emp_id, KPI - avg_KPI AS deviation
FROM performance
JOIN (SELECT AVG(KPI) AS avg_KPI
FROM performance) AS subq
This query joins the performance table with a subquery that calculates the average KPI of all employees. It then subtracts the average KPI from each employee’s KPI to calculate the deviation.
However, this query can become unwieldy, especially if you need to perform additional calculations or join multiple tables. Alternatively, you could use a CTE to define the calculation of the average KPI and then reference it in the main query.
Here’s how you would do that:
WITH avg_KPI AS (SELECT AVG(KPI) AS avg_KPI
FROM performance)
SELECT emp_id, KPI - avg_KPI AS deviation
FROM performance
CROSS JOIN avg_KPI
This query defines the average KPI calculation in a CTE called ‘avg_KPI’, which is then referenced in the main query using a CROSS JOIN. This makes the query easier to read and understand, especially if you need to perform additional calculations or join multiple tables.
Example 2: Comparing performance of different departments
Suppose you want to compare the performance of different departments within your organization. You could use a subquery to calculate the average KPI for each department and then join it with the main table to calculate the deviation.
Here’s how you would do that:
SELECT dept_id, emp_id, KPI - subq.avg_KPI AS deviation
FROM performance
JOIN (SELECT dept_id, AVG(KPI) AS avg_KPI
FROM performance
GROUP BY dept_id) AS subq
ON performance.dept_id = subq.dept_id
This query joins the performance table with a subquery that calculates the average KPI for each department. It then subtracts the average KPI from each employee’s KPI to calculate the deviation.
While this query works, it can quickly become unreadable as you add more calculations. Alternatively, you could use a CTE to define the average KPI for each department and then reference it in the main query.
Here’s how you would do that:
WITH dept_avg AS (SELECT dept_id, AVG(KPI) as avg_KPI
FROM performance
GROUP BY dept_id)
SELECT performance.dept_id, emp_id, KPI - dept_avg.avg_KPI AS deviation
FROM performance
JOIN dept_avg ON performance.dept_id = dept_avg.dept_id
This query defines the average KPI for each department in a CTE called ‘dept_avg’, which is then referenced in the main query using a JOIN. This makes the query more readable and easier to understand, especially if you need to add more calculations or join multiple tables.
Example 3: Counting junior and experienced employees in each department
Suppose you want to count the number of junior and experienced employees in each department. You could use a subquery to define the criteria for junior and experienced employees and then join it with the main table to calculate the counts.
Here’s how you would do that:
SELECT dept_id,
SUM(CASE WHEN years_of_experience < 5 THEN 1 ELSE 0 END) AS junior_count,
SUM(CASE WHEN years_of_experience >= 5 THEN 1 ELSE 0 END) AS experienced_count
FROM employees
GROUP BY dept_id
This query uses a subquery to define the criteria for junior and experienced employees, which are then used to calculate the counts using a CASE statement. While this query works, it can become difficult to read and understand as you add more criteria.
Alternatively, you could use a CTE to define the criteria for junior and experienced employees and then reference it in the main query. Here’s how you would do that:
WITH employee_type AS (SELECT emp_id,
CASE WHEN years_of_experience < 5 THEN 'Junior' ELSE 'Experienced' END AS type
FROM employees)
SELECT dept_id,
SUM(CASE WHEN type = 'Junior' THEN 1 ELSE 0 END) AS junior_count,
SUM(CASE WHEN type = 'Experienced' THEN 1 ELSE 0 END) AS experienced_count
FROM employee_type
JOIN employees ON employees.emp_id = employee_type.emp_id
GROUP BY dept_id
This query uses a CTE called 'employee_type' to define the criteria for junior and experienced employees, which are then used to calculate the counts using a CASE statement. This makes the query more readable and easier to understand, especially if you need to add more criteria.
Example 4: Calculating number of above-average KPI employees in each department
Suppose you want to calculate the number of employees in each department whose KPI is above the department's average KPI. You could use a subquery to calculate the average KPI for each department and then join it with the main table to calculate the counts.
Here's how you would do that:
SELECT dept_id, COUNT(*) AS above_average_count
FROM performance
JOIN (SELECT dept_id, AVG(KPI) AS avg_KPI
FROM performance
GROUP BY dept_id) AS dept_avg
ON performance.dept_id = dept_avg.dept_id AND performance.KPI > dept_avg.avg_KPI
GROUP BY dept_id
This query joins the performance table with a subquery that calculates the average KPI for each department. It then counts the number of employees whose KPI is above the department's average KPI.
While this query works, it can become hard to read and understand as the queries become more complex. Alternatively, you could use a CTE to define the average KPI for each department and then reference it in the main query.
Here's how you would do that:
WITH dept_avg AS (SELECT dept_id, AVG(KPI) as avg_KPI
FROM performance
GROUP BY dept_id)
SELECT performance.dept_id, COUNT(*) AS above_average_count
FROM performance
JOIN dept_avg ON performance.dept_id = dept_avg.dept_id
WHERE performance.KPI > dept_avg.avg_KPI
GROUP BY performance.dept_id
This query defines the average KPI for each department in a CTE called 'dept_avg', which is then referenced in the main query using a JOIN. It then counts the number of employees whose KPI is above the department's average KPI.
This makes the query more readable and easier to understand, especially if you need to add more criteria. Example 5: Using recursive CTEs to analyze hierarchical structures
Suppose you have a table that contains hierarchical data, such as an organizational chart, and you want to analyze the relationship between employees and their superiors.
You could use a recursive CTE to perform this analysis. Here's an example:
WITH org_chart (emp_id, superior_id, level) AS (
SELECT emp_id, superior_id, 0
FROM employees
WHERE superior_id IS NULL
UNION ALL
SELECT employees.emp_id, employees.superior_id, org_chart.level + 1
FROM employees
JOIN org_chart ON employees.superior_id = org_chart.emp_id
)
SELECT *
FROM org_chart
This query creates a CTE called 'org_chart' that defines a recursive relationship between the employee and their superiors. It starts with the root level where the superior_id is null, and then recursively joins the employees table with the org_chart CTE until all levels have been reached.
This makes it easier to retrieve hierarchical data and perform calculations on it.
Conclusion
In this article, we covered the differences between CTEs and subqueries and the benefits of using CTEs. We also provided five examples of how you can use both to write better code. By using CTEs, you can create more readable and efficient queries that are easier to understand and maintain.
In conclusion, CTEs and subqueries are two essential tools for Oracle and SQL developers that allow the breaking down of complex queries into more manageable parts. While subqueries have been the traditional go-to, there are significant advantages to using CTEs. These include the ability to create reusable queries, meaningful names resulting in better code readability, the division of complex computations, the ability to nest computations seamlessly, and recursion.
By using CTEs instead of subqueries, developers can create efficient queries that are easier to understand, modify and maintain. Therefore, it's essential for developers to know when to use CTEs instead of subqueries to write cleaner code, increase efficiency, and save time.