15 Advanced SQL Concepts With Examples (2024 Edition)

•

April 17, 2024

•

20 min read

Welcome to the 2024 edition of 15 advanced SQL concepts with examples. Whether you're a seasoned SQL developer or just starting your journey into the world of databases, this comprehensive guide will explore advanced SQL topics, elevating your database querying skills. These concepts will equip you with the knowledge to handle complex data challenges confidently. Through clear explanations and practical examples, you'll understand SQL's capabilities better and how to leverage them effectively. Get ready to dive into the intricacies of SQL and unlock the full potential of your data manipulation and analysis abilities.

What is Advanced SQL?

Advanced SQL refers to concepts and techniques beyond the foundational skills of querying data from relational databases. While basic SQL equips you with standard operations like filtering, sorting, and altering tables, advanced SQL delves deeper, offering functionalities like window functions, aggregate functions, pivoting techniques, and many more.

Advanced SQL enables you to manage:

Complex Data Manipulation: You can easily handle intricate data transformations with techniques like pivoting and unpivoting. These techniques allow you to reshape data between row-based and column-based formats for better analysis. ‍
Automation and Reusability: Concepts like stored procedures and triggers let you automate repetitive tasks and promote code reusability. Stored procedures are pre-written SQL code blocks that can be reused with parameters, while triggers are database objects that automatically execute specific actions based on events.

15 Advanced SQL Concepts

Here are the 15 advanced SQL concepts that will help to transform your queries and enhance your database expertise:

Subqueries

Subqueries, known as nested queries, are a powerful feature of advanced SQL that allows you to embed a complete statement within another SQL statement. By breaking down complex data retrieval into smaller subqueries, the logic behind your SQL code becomes much easier to understand.

You can use subqueries within various clauses of the outer query, including SELECT, FROM, WHERE, and HAVING. Below is an example of using subquery to find customers who have made purchases in the last month.

 
SELECT customer_name 
FROM customers 
WHERE customer_id IN (SELECT customer_id 
FROM orders 
WHERE order_date >= DATE_SUB(NOW(), INTERVAL 1 MONTH));

Joins

Joins are a fundamental concept for working with relational databases. They act as the bridge between tables, allowing you to combine data from multiple tables based on defined relationships. There are different types of joins, such as INNER, FULL, RIGHT, and LEFT.

Here,

The LEFT JOIN includes all rows from the left table and matching rows from the right table. The result set will contain NULL for the rows that don’t have a matching row on the right side.
Conversely, RIGHT JOIN consists of all records from the right table rows and matching rows from the left while leaving unmatched entries from the left side as NULL.
‍INNER JOIN is the default join, which only returns the rows where the join condition is met in both tables.
The FULL JOIN combines left and right join results, including all rows from both tables, even if there’s no match in the other table.

 
SELECT table1.column1,table1.column2,table2.column1,....
FROM table1
INNER JOIN table2
ON table1.matching_column = table2.matching_column;
table1: First table.
table2: Second table
matching_column: Column common to both the tables.

Union

The UNION operator combines the results of multiple SELECT statements into a single, unified result set. By default, it removes all duplicate rows based on the values in all selected columns.

In the example mentioned below, the SQL query combines customer names from two subsets of the customers table based on their countries.

 
SELECT customer_name 
FROM customers 
WHERE country = 'USA'
UNION
SELECT customer_name 
FROM customers 
WHERE country = 'Canada';

Aggregate Functions

The aggregate function in advanced SQL summarizes data by performing calculations on groups of values and returning a single result. It is often used with the GROUP BY clause to categorize and analyze data from a broader perspective. The common aggregate functions are COUNT, SUM, AVG, MIN, and MAX.

For example, the aggregate function below calculates each product's total sales amount.

 
SELECT product_category, SUM(order_amount) AS total_sales
FROM orders
GROUP BY product_category;

Window Functions

Window functions are special operations in SQL that allow you to perform calculations or operations on a group of rows at once, considering their relationship within the group. Common window functions include the LAG, RANK, DENSE_RANK, ROW_NUMBER, and many more.

For example, the ROW_NUMBER function can be used to rank customers by their maximum order amount.

 
SELECT customer_name, order_amount, 
       ROW_NUMBER() OVER (ORDER BY order_amount DESC) AS rank
FROM orders;

Common Table Expressions (CTEs)

CTEs are the temporary named result sets defined with a single SQL statement, such as SELECT, DELETE, INSERT, or CREATE VIEW. These CTEs act as virtual tables that can be referenced within the same statement for further processing.

For instance, CTEs can be used to calculate each customer's total order amount.

 
WITH customer_orders AS (
    SELECT customer_id, SUM(order_amount) AS total_amount
    FROM orders
    GROUP BY customer_id
)
SELECT customers.customer_name, customer_orders.total_amount
FROM customers
INNER JOIN customer_orders
ON customers.customer_id = customer_orders.customer_id;

Pivoting

Pivoting is a technique for transforming data from a row-oriented format into a column-oriented format. The example below presents a pivoting query to know the total sales for each product across different months.

 
SELECT product,
  SUM(amount) AS Jan_Sales,
  SUM(amount) AS Feb_Sales,
  SUM(amount) AS Mar_Sales
FROM sales
PIVOT (
  SUM(amount) FOR month IN ('Jan', 'Feb’, 'Mar')
) AS pivoted_data;

Recursive Queries

Recursive queries in SQL are a powerful technique for working with hierarchical data. Unlike traditional SQL queries that process data in a single pass, recursive queries can call themselves repeatedly until a specific condition is met.

Recursive queries are typically implemented using CTEs to achieve the desired outcome. The example below uses a recursive query to retrieve all the employees and their managers, showing the reporting structure.

 

WITH EmployeeHierarchy (id, name, manager_id, level) AS (
  -- Base Case: Get all top-level employees (with no manager)
  SELECT id, name, manager_id, 1 AS level
  FROM employees
  WHERE manager_id IS NULL
UNION ALL
#Recursive Case: Find employees who report to someone in the hierarchy
  SELECT e.id, e.name, e.manager_id, h.level + 1
  FROM employees e
  INNER JOIN EmployeeHierarchy h ON e.manager_id = h.id
)
SELECT * FROM EmployeeHierarchy;

String Manipulation

String manipulation refers to a collection of powerful functions that enable you to modify, extract, and transform text data stored within your database. The functions like CONCAT, SUBSTRING, and REPLACE are commonly used for tasks such as concatenating strings, extracting substrings, and replacing characters.

Here is an example of the CONCAT function to combine first and last names,

 
SELECT CONCAT(first_name, ' ', last_name) AS full_name
FROM employees;

Date and Time Functions

Advanced SQL offers a rich set of functions for manipulating and working with time and dates. Functions like DATEPART are used for extracting parts of a date, DATE_ADD adds a specified number of days to date, and DATEDIFF calculates the difference between two dates in specified units, whereas DATE_FORMAT arranges dates according to a specified format string.

For example,

 
SELECT DATEPART(MONTH, order_date) AS order_month
FROM orders;

Case Statements

Case statements are the expressions for implementing conditional logic within your queries. They allow you to evaluate different conditions and return corresponding values, mimicking an if-then-else structure.

In the example given below, the SQL statement categorizes customers based on their total order amount.

 
SELECT customer_name, 
       CASE 
           WHEN total_order_amount >= 1000 THEN 'High Value'
           WHEN total_order_amount >= 500 THEN 'Medium Value'
           ELSE 'Low Value'
       END AS customer_category
FROM customers;

User-Defined Functions

User-defined functions (UDFs) in SQL allow you to extend the functionality of your database system by creating custom functions. These functions act like reusable modules that accept input (parameters), perform specific operations, and return a result, similar to functions in programming languages.

There are two different types of UDFs—scalar and table-valued functions. Where

Scalar functions are the most common type, returning a single scalar value (number, string, date, etc.) as output.
Table-valued functions return a complete result set like a virtual table, enabling you to perform more complex data manipulations.

In the example listed below, the UDF calculate_discount() calculates the discounted price of a product given its original price and discount rate.

 
CREATE FUNCTION calculate_discount(price DECIMAL(10,2), discount_rate INT)
RETURNS DECIMAL(10,2)
BEGIN
  DECLARE discount DECIMAL(10,2);
  SET discount = price * (discount_rate / 100.0);
  RETURN price - discount;
END;

Temporary Tables

Temporary tables do not permanently store data in the database; instead, they exist temporarily. Unlike permanent tables, temporary tables do not exist after the session ends or when manually deleted. You can also insert, update, and delete records within temporary tables, similar to permanent tables.

For example,

 
CREATE TEMPORARY TABLE temp_customer_orders (
  customer_id INT,
  total_order_amount DECIMAL(10,2),
  PRIMARY KEY (customer_id)
);
INSERT INTO temp_customer_orders
SELECT customer_id, SUM(order_amount) AS total_order_amount
FROM orders
WHERE order_date BETWEEN '2024-01-01' AND '2024-04-03'  -- Current date
GROUP BY customer_id;
SELECT c.customer_name, COALESCE(t.total_order_amount / COUNT(o.order_id), 0.00) AS average_order_value
FROM customers c
LEFT JOIN temp_customer_orders t ON c.customer_id = t.customer_id
LEFT JOIN orders o ON c.customer_id = o.customer_id  -- For calculating order count
WHERE o.order_date BETWEEN '2024-01-01' AND '2024-04-03'  -- Current date
GROUP BY c.customer_id, c.customer_name, t.total_order_amount;

External Query Filter

External query filters, predicate pushdown, or filter pushdown are optimization techniques to improve query performance. They leverage the processing power of external data sources (like another database or application) to filter data before it's transferred to your central SQL server.

In the example listed below, the external query filter (stored procedure in this case) fetches the data from an external system.

 
CREATE VIEW active_customers AS
SELECT customer_id, customer_name
FROM crm_customers
WHERE is_active = 1;
SELECT o.order_id, o.customer_id, c.customer_name
FROM orders o
INNER JOIN active_customers c ON o.customer_id = c.customer_id;
WHERE o.customer_id IN (
	SELECT customer_id
	FROM temp_gold_customers 
);

Query Optimization

Query optimization techniques include methods like indexing, proper join strategy selection, and avoiding unnecessary data retrieval. These strategies improve the performance of SQL queries by analyzing execution plans and choosing efficient execution strategies.

For instance, this code snippet creates an index for potential performance optimization. It then executes a query that might benefit from that index if applied to your application's specific table structure and query patterns.

 
CREATE INDEX IF NOT EXISTS idx_product_category_stock (category_id, quantity); #Assuming you don't already have this index
SELECT *
FROM products
WHERE category_id = 1
AND quantity > 0;

Unlock The Full Potential Of Your Relational Databases With Airbyte

Before performing your advanced SQL operations on the data, you must centralize data collected from different sources leveraging data integration tools like Airbyte.

Airbyte is a data integration platform that supports modern ELT approaches. It simplifies collaboration by enabling you to connect to a wide variety of data sources, including SaaS applications, databases, and flat files. Airbyte’s user-friendly interface streamlines data flow configurations, while the powerful API empowers programmatic control for advanced users.

Key features of Airbyte:

Extensive Connector Library: Airbyte’s 350+ pre-built connectors support various data sources and destinations. This eliminates the need for manual integration work for most common platforms.‍
Customizable Connectors: Even if a specific connector isn't readily available, Airbyte empowers you to build your own using its Connector Development Kit (CDK) within a short time frame. The CDK allows you to leverage various programming languages to build these connectors, as long as they're built into docker images that implement the Airbyte specification.‍
Data Replication: Airbyte offers granular control over data selection. You can choose specific data streams to replicate, catering to your precise needs. It supports various replication methods, including full refresh, incremental updates, and Change Data Capture (CDC) for efficient data movement.

Conclusion

Becoming proficient in advanced SQL queries can significantly improve your capacity to analyze and handle data within a relational database. This article delves into 15 advanced SQL concepts, offering a robust groundwork for addressing intricate data tasks. As you practice more, you'll become proficient at managing and analyzing data. This newfound expertise will empower you to make data-driven decisions with greater confidence, unlock the full potential of your relational databases, and ultimately gain a significant edge.

Limitless data movement with free Alpha and Beta connectors

Introducing: our Free Connector Program

The data movement infrastructure for the modern data teams.

Try a 14-day free trial