Advanced SQL Topics
Master advanced SQL with Mermaid diagrams covering CTEs, recursive queries, PIVOT, dynamic SQL, optimization, and partitioning
Advanced SQL techniques enable complex data manipulation, hierarchical queries, performance optimization, and dynamic query generation for sophisticated database applications.
Advanced SQL Topics Overview
graph TB
A[Advanced SQL] --> B[CTEs]
A --> C[Recursive Queries]
A --> D[PIVOT/UNPIVOT]
A --> E[Dynamic SQL]
A --> F[Optimization]
A --> G[Partitioning]
B --> B1[Readable queries]
B --> B2[Reusable subqueries]
C --> C1[Hierarchical data]
C --> C2[Tree traversal]
D --> D1[Rows to columns]
D --> D2[Columns to rows]
E --> E1[Runtime SQL generation]
E --> E2[Flexible queries]
F --> F1[Execution plans]
F --> F2[Index optimization]
G --> G1[Large table management]
G --> G2[Query performance]
style A fill:#2196F3
style B fill:#4CAF50
style C fill:#FF9800
style D fill:#9C27B0
style E fill:#F44336
style F fill:#00BCD4
style G fill:#E91E63
Key Points:
- CTEs: Common Table Expressions for readable, reusable queries
- Recursive: Handle hierarchical data (org charts, categories)
- PIVOT: Transform rows to columns for reporting
- Dynamic SQL: Generate SQL at runtime for flexibility
- Optimization: Improve query performance with execution plans
- Partitioning: Manage large tables efficiently
CTE (Common Table Expression) Flow
sequenceDiagram
participant Query
participant CTE as CTE Definition
participant Main as Main Query
participant Result
Query->>CTE: WITH cte_name AS (SELECT...)
CTE->>CTE: Execute subquery
CTE->>Main: Make result available
Main->>Main: Reference CTE
Main->>Result: Return final result
Note over CTE: Temporary named result set
Key Points:
- Readability: Named subqueries improve code clarity
- Reusability: Reference CTE multiple times in query
- Scope: CTE exists only for single query execution
- Performance: Similar to subquery, not materialized
- Multiple CTEs: Can define multiple CTEs in one query
Recursive CTE Structure
flowchart TB
A[Recursive CTE] --> B[Anchor Member]
A --> C[UNION ALL]
A --> D[Recursive Member]
B --> E[Initial rows]
B --> F[Base case]
D --> G[References CTE itself]
D --> H[Adds more rows]
C --> I[Combines results]
I --> J{More rows?}
J -->|Yes| D
J -->|No| K[Final Result]
style A fill:#2196F3
style B fill:#4CAF50
style D fill:#FF9800
style K fill:#9C27B0
Key Points:
- Anchor: Initial query, base case (non-recursive)
- Recursive: References CTE itself, adds rows iteratively
- UNION ALL: Combines anchor and recursive results
- Termination: Stops when recursive member returns no rows
- Use Cases: Hierarchies, trees, graphs, sequences
PIVOT Transformation
graph LR
A[Original Table] --> B[PIVOT Operation]
B --> C[Transformed Table]
A --> A1[Rows: Product, Month, Sales]
C --> C1[Columns: Product, Jan, Feb, Mar]
style A fill:#4CAF50
style B fill:#FF9800
style C fill:#2196F3
Key Points:
- Rows to Columns: Transform row values into column headers
- Aggregation: Requires aggregate function (SUM, COUNT, AVG)
- Fixed Columns: Column names must be known in advance
- Dynamic PIVOT: Use dynamic SQL for unknown columns
- Reporting: Ideal for cross-tabulation reports
Dynamic SQL Execution
sequenceDiagram
participant App as Application
participant Builder as SQL Builder
participant DB as Database
participant Result
App->>Builder: Provide parameters
Builder->>Builder: Construct SQL string
Builder->>DB: EXECUTE/PREPARE statement
DB->>DB: Parse and compile
DB->>DB: Execute query
DB->>Result: Return data
Result->>App: Final result
Key Points:
- Runtime Generation: Build SQL statements dynamically
- Flexibility: Adapt queries based on conditions
- Parameterization: Use parameters to prevent SQL injection
- Performance: Compiled each time, no plan caching
- Use Cases: Dynamic filters, column selection, table names
Query Optimization Strategies
flowchart TB
A[Query Optimization] --> B[Index Usage]
A --> C[Query Rewriting]
A --> D[Execution Plan]
A --> E[Statistics]
B --> B1[Create appropriate indexes]
B --> B2[Covering indexes]
C --> C1[Avoid SELECT *]
C --> C2[Use EXISTS vs IN]
C --> C3[Avoid functions on columns]
D --> D1[Analyze with EXPLAIN]
D --> D2[Identify bottlenecks]
E --> E1[Update statistics]
E --> E2[Help optimizer]
style A fill:#2196F3
style B fill:#4CAF50
style C fill:#FF9800
style D fill:#9C27B0
style E fill:#00BCD4
Key Points:
- Indexes: Create on WHERE, JOIN, ORDER BY columns
- Avoid Functions: Don't use functions on indexed columns in WHERE
- EXISTS vs IN: EXISTS faster for large datasets
- SELECT Specific: Avoid SELECT *, specify needed columns
- EXPLAIN: Analyze execution plans to find issues
Execution Plan Analysis
graph TB
A[Query] --> B[Parser]
B --> C[Optimizer]
C --> D[Execution Plan]
D --> E[Table Scan]
D --> F[Index Seek]
D --> G[Index Scan]
D --> H[Nested Loop]
D --> I[Hash Join]
E --> E1[Slow - reads all rows]
F --> F1[Fast - direct lookup]
G --> G1[Moderate - scans index]
H --> H1[Good for small sets]
I --> I1[Good for large sets]
style E fill:#F44336
style F fill:#4CAF50
style G fill:#FF9800
Key Points:
- Table Scan: Reads entire table, slowest
- Index Seek: Direct lookup using index, fastest
- Index Scan: Scans entire index, moderate speed
- Join Types: Nested loop, hash join, merge join
- Cost: Optimizer estimates cost, chooses lowest
Table Partitioning
graph TB
A[Large Table] --> B[Partition by Range]
A --> C[Partition by List]
A --> D[Partition by Hash]
B --> B1[Date ranges]
B --> B2[2020, 2021, 2022]
C --> C1[Discrete values]
C --> C2[Region: US, EU, ASIA]
D --> D1[Hash function]
D --> D2[Even distribution]
style A fill:#2196F3
style B fill:#4CAF50
style C fill:#FF9800
style D fill:#9C27B0
Key Points:
- Range: Partition by value ranges (dates, IDs)
- List: Partition by discrete values (regions, categories)
- Hash: Even distribution using hash function
- Benefits: Faster queries, easier maintenance, parallel processing
- Partition Pruning: Query only relevant partitions
Temporary Tables vs CTEs
graph LR
A[Temporary Storage] --> B[Temporary Tables]
A --> C[CTEs]
B --> B1[Physical storage]
B --> B2[Can have indexes]
B --> B3[Multiple queries]
C --> C1[Logical view]
C --> C2[No indexes]
C --> C3[Single query only]
style B fill:#4CAF50
style C fill:#FF9800
Key Points:
- Temp Tables: Physical storage, persist across statements
- CTEs: Logical, exist only for single query
- Indexes: Temp tables can have indexes, CTEs cannot
- Performance: Temp tables better for large intermediate results
- Simplicity: CTEs better for readability and simple cases
Code Examples
Common Table Expression (CTE)
-- Simple CTE
WITH high_value_customers AS (
SELECT
customer_id,
SUM(total_amount) AS total_spent
FROM orders
GROUP BY customer_id
HAVING SUM(total_amount) > 10000
)
SELECT
c.customer_name,
hvc.total_spent
FROM high_value_customers hvc
JOIN customers c ON hvc.customer_id = c.customer_id;
-- Multiple CTEs
WITH
monthly_sales AS (
SELECT
YEAR(order_date) AS year,
MONTH(order_date) AS month,
SUM(total_amount) AS revenue
FROM orders
GROUP BY YEAR(order_date), MONTH(order_date)
),
avg_monthly AS (
SELECT AVG(revenue) AS avg_revenue
FROM monthly_sales
)
SELECT
ms.year,
ms.month,
ms.revenue,
am.avg_revenue,
ms.revenue - am.avg_revenue AS difference
FROM monthly_sales ms
CROSS JOIN avg_monthly am;
Recursive CTE
-- Employee hierarchy
WITH RECURSIVE employee_hierarchy AS (
-- Anchor: Top-level employees
SELECT
employee_id,
employee_name,
manager_id,
1 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive: Add subordinates
SELECT
e.employee_id,
e.employee_name,
e.manager_id,
eh.level + 1
FROM employees e
INNER JOIN employee_hierarchy eh ON e.manager_id = eh.employee_id
)
SELECT * FROM employee_hierarchy ORDER BY level, employee_name;
-- Number sequence generator
WITH RECURSIVE numbers AS (
SELECT 1 AS n
UNION ALL
SELECT n + 1 FROM numbers WHERE n < 100
)
SELECT * FROM numbers;
PIVOT
-- SQL Server PIVOT
SELECT *
FROM (
SELECT product_name, month, sales_amount
FROM sales
) AS source_data
PIVOT (
SUM(sales_amount)
FOR month IN ([Jan], [Feb], [Mar], [Apr])
) AS pivot_table;
-- MySQL alternative (conditional aggregation)
SELECT
product_name,
SUM(CASE WHEN month = 'Jan' THEN sales_amount ELSE 0 END) AS Jan,
SUM(CASE WHEN month = 'Feb' THEN sales_amount ELSE 0 END) AS Feb,
SUM(CASE WHEN month = 'Mar' THEN sales_amount ELSE 0 END) AS Mar
FROM sales
GROUP BY product_name;
Dynamic SQL
-- SQL Server
DECLARE @sql NVARCHAR(MAX);
DECLARE @table_name NVARCHAR(100) = 'customers';
DECLARE @min_amount DECIMAL(10,2) = 1000;
SET @sql = N'SELECT * FROM ' + QUOTENAME(@table_name) +
N' WHERE total_spent > @amount';
EXEC sp_executesql @sql, N'@amount DECIMAL(10,2)', @amount = @min_amount;
-- MySQL
SET @table_name = 'customers';
SET @sql = CONCAT('SELECT * FROM ', @table_name, ' WHERE total_spent > 1000');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
Query Optimization
-- Use EXISTS instead of IN
SELECT * FROM customers c
WHERE EXISTS (
SELECT 1 FROM orders o WHERE o.customer_id = c.customer_id
);
-- Avoid functions on indexed columns
-- Bad: WHERE YEAR(order_date) = 2024
-- Good:
WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01';
-- Use covering index
CREATE INDEX idx_covering ON orders(customer_id, order_date, total_amount);
SELECT customer_id, order_date, total_amount
FROM orders
WHERE customer_id = 123; -- Uses index only, no table access
Table Partitioning
-- Range partitioning (SQL Server)
CREATE PARTITION FUNCTION pf_orders_by_year(DATE)
AS RANGE RIGHT FOR VALUES ('2021-01-01', '2022-01-01', '2023-01-01');
CREATE PARTITION SCHEME ps_orders_by_year
AS PARTITION pf_orders_by_year ALL TO ([PRIMARY]);
CREATE TABLE orders (
order_id INT,
order_date DATE,
total_amount DECIMAL(10,2)
) ON ps_orders_by_year(order_date);
Best Practices
- Use CTEs for Readability: Break complex queries into logical parts
- Limit Recursive Depth: Add MAXRECURSION hint to prevent infinite loops
- Parameterize Dynamic SQL: Prevent SQL injection attacks
- Analyze Execution Plans: Use EXPLAIN to identify bottlenecks
- Index Strategically: Create indexes on frequently queried columns
- Partition Large Tables: Improve query performance and maintenance
- Update Statistics: Keep optimizer statistics current
- Test Performance: Compare different approaches with real data
Source: SQL Documentation
Last Updated: 2026-06-12
Comments
Share a question, correction, or practical insight about this article.
Checking login status...
Loading approved comments...