Full Stack • Java • System Design • Cloud • AI Engineering

SQL2026-06-12

SQL Aggregate Functions: Complete Guide with Examples

Comprehensive guide to SQL aggregate functions including COUNT, SUM, AVG, MIN, MAX, GROUP BY, HAVING, ROLLUP, CUBE, and GROUPING SETS with visual diagrams

SQL Aggregate Functions

Table of Contents

  1. Introduction to Aggregate Functions
  2. COUNT Function
  3. SUM Function
  4. AVG Function
  5. MIN and MAX Functions
  6. GROUP BY Clause
  7. HAVING Clause
  8. GROUPING SETS
  9. ROLLUP
  10. CUBE
  11. Advanced Aggregation
  12. Best Practices

Introduction to Aggregate Functions

Aggregate functions perform calculations on a set of values and return a single value. They are commonly used with GROUP BY clause to group results.

┌─────────────────────────────────────────────────────────────┐
│              AGGREGATE FUNCTIONS                             │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  COUNT()    →  Count number of rows                         │
│  SUM()      →  Calculate sum of values                      │
│  AVG()      →  Calculate average of values                  │
│  MIN()      →  Find minimum value                           │
│  MAX()      →  Find maximum value                           │
│  GROUP BY   →  Group rows for aggregation                   │
│  HAVING     →  Filter groups (like WHERE for groups)        │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Aggregate Function Flow

┌─────────────────────────────────────────────────────────────┐
│           AGGREGATE FUNCTION EXECUTION                       │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  1. FROM      →  Get data from tables                       │
│  2. WHERE     →  Filter individual rows                     │
│  3. GROUP BY  →  Group rows                                 │
│  4. Aggregate →  Apply aggregate functions                  │
│  5. HAVING    →  Filter groups                              │
│  6. SELECT    →  Select columns                             │
│  7. ORDER BY  →  Sort results                               │
│  8. LIMIT     →  Limit results                              │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Detailed Aggregate Functions Overview:

┌─────────────────────────────────────────────────────────────┐
│         AGGREGATE FUNCTIONS - DETAILED OVERVIEW              │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  How Aggregate Functions Work:                               │
│  ┌──────────────────────────────────────────────────────┐   │
│  │                                                       │   │
│  │  Input: Multiple Rows                                 │   │
│  │  ┌─────────────────────────────────────────────┐     │   │
│  │  │ Row 1: value = 100                          │     │   │
│  │  │ Row 2: value = 200                          │     │   │
│  │  │ Row 3: value = 150                          │     │   │
│  │  │ Row 4: value = 300                          │     │   │
│  │  │ Row 5: value = 250                          │     │   │
│  │  └──────────────┬──────────────────────────────┘     │   │
│  │                 │                                     │   │
│  │                 ▼                                     │   │
│  │  Aggregate Function Processing:                       │   │
│  │  ┌─────────────────────────────────────────────┐     │   │
│  │  │ COUNT(*) → 5 rows                           │     │   │
│  │  │ SUM(value) → 1000                           │     │   │
│  │  │ AVG(value) → 200                            │     │   │
│  │  │ MIN(value) → 100                            │     │   │
│  │  │ MAX(value) → 300                            │     │   │
│  │  └──────────────┬──────────────────────────────┘     │   │
│  │                 │                                     │   │
│  │                 ▼                                     │   │
│  │  Output: Single Value per Group                       │   │
│  │  ┌─────────────────────────────────────────────┐     │   │
│  │  │ Result: Aggregated value                    │     │   │
│  │  └─────────────────────────────────────────────┘     │   │
│  │                                                       │   │
│  └──────────────────────────────────────────────────────┘   │
│                                                              │
│  Function Comparison:                                        │
│  ┌──────────────────────────────────────────────────────┐   │
│  │                                                       │   │
│  │  Function │ Purpose          │ NULL Handling         │   │
│  │  ─────────┼──────────────────┼─────────────────────  │   │
│  │  COUNT(*) │ Count all rows   │ Includes NULLs        │   │
│  │  COUNT(c) │ Count non-NULL   │ Excludes NULLs        │   │
│  │  SUM(c)   │ Sum values       │ Ignores NULLs         │   │
│  │  AVG(c)   │ Average values   │ Ignores NULLs         │   │
│  │  MIN(c)   │ Minimum value    │ Ignores NULLs         │   │
│  │  MAX(c)   │ Maximum value    │ Ignores NULLs         │   │
│  │                                                       │   │
│  └──────────────────────────────────────────────────────┘   │
│                                                              │
│  With GROUP BY:                                              │
│  ┌──────────────────────────────────────────────────────┐   │
│  │                                                       │   │
│  │  Data:                                                │   │
│  │  ┌────────────┬────────┐                             │   │
│  │  │ category   │ amount │                             │   │
│  │  ├────────────┼────────┤                             │   │
│  │  │ Electronics│ 100    │ ┐                           │   │
│  │  │ Electronics│ 200    │ ├─ Group 1                  │   │
│  │  │ Electronics│ 150    │ ┘                           │   │
│  │  │ Books      │ 50     │ ┐                           │   │
│  │  │ Books      │ 75     │ ├─ Group 2                  │   │
│  │  │ Books      │ 60     │ ┘                           │   │
│  │  └────────────┴────────┘                             │   │
│  │           │                                           │   │
│  │           ▼                                           │   │
│  │  GROUP BY category:                                   │   │
│  │  ┌────────────┬───────┬─────┬─────┬─────┐           │   │
│  │  │ category   │ COUNT │ SUM │ AVG │ MAX │           │   │
│  │  ├────────────┼───────┼─────┼─────┼─────┤           │   │
│  │  │ Electronics│   3   │ 450 │ 150 │ 200 │           │   │
│  │  │ Books      │   3   │ 185 │ 62  │ 75  │           │   │
│  │  └────────────┴───────┴─────┴─────┴─────┘           │   │
│  │                                                       │   │
│  └──────────────────────────────────────────────────────┘   │
│                                                              │
│  Common Use Cases:                                           │
│  ┌──────────────────────────────────────────────────────┐   │
│  │                                                       │   │
│  │  📊 Reporting:                                        │   │
│  │     • Total sales per month                          │   │
│  │     • Average order value                            │   │
│  │     • Customer count by region                       │   │
│  │                                                       │   │
│  │  📈 Analytics:                                        │   │
│  │     • Revenue trends                                 │   │
│  │     • Performance metrics                            │   │
│  │     • Statistical analysis                           │   │
│  │                                                       │   │
│  │  🎯 Business Intelligence:                            │   │
│  │     • KPI calculations                               │   │
│  │     • Comparative analysis                           │   │
│  │     • Data summarization                             │   │
│  │                                                       │   │
│  └──────────────────────────────────────────────────────┘   │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Sample Tables

-- Sales table
CREATE TABLE sales (
    sale_id INT PRIMARY KEY,
    product_id INT,
    category VARCHAR(50),
    region VARCHAR(50),
    sale_date DATE,
    quantity INT,
    unit_price DECIMAL(10,2),
    total_amount DECIMAL(10,2)
);

INSERT INTO sales VALUES
(1, 101, 'Electronics', 'North', '2024-01-15', 2, 500.00, 1000.00),
(2, 102, 'Electronics', 'South', '2024-01-16', 1, 300.00, 300.00),
(3, 103, 'Furniture', 'North', '2024-01-17', 3, 200.00, 600.00),
(4, 104, 'Furniture', 'East', '2024-01-18', 1, 450.00, 450.00),
(5, 101, 'Electronics', 'West', '2024-01-19', 2, 500.00, 1000.00),
(6, 105, 'Clothing', 'North', '2024-01-20', 5, 50.00, 250.00),
(7, 106, 'Clothing', 'South', '2024-01-21', 3, 75.00, 225.00),
(8, 103, 'Furniture', 'West', '2024-01-22', 2, 200.00, 400.00),
(9, 102, 'Electronics', 'East', '2024-02-01', 1, 300.00, 300.00),
(10, 107, 'Clothing', 'North', '2024-02-02', 4, 60.00, 240.00);

-- Employees table
CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    name VARCHAR(50),
    department VARCHAR(50),
    salary DECIMAL(10,2),
    hire_date DATE,
    manager_id INT
);

INSERT INTO employees VALUES
(1, 'John CEO', 'Executive', 150000, '2020-01-01', NULL),
(2, 'Jane Manager', 'Sales', 80000, '2020-06-01', 1),
(3, 'Bob Manager', 'IT', 85000, '2020-06-01', 1),
(4, 'Alice Employee', 'Sales', 50000, '2021-01-15', 2),
(5, 'Charlie Employee', 'Sales', 48000, '2021-02-01', 2),
(6, 'David Employee', 'IT', 55000, '2021-03-01', 3),
(7, 'Eve Employee', 'IT', 52000, '2021-03-15', 3),
(8, 'Frank Employee', 'Sales', 51000, '2021-04-01', 2);

COUNT Function

COUNT() returns the number of rows that match a specified criterion.

Syntax Variations

COUNT(*)              -- Count all rows (including NULLs)
COUNT(column)         -- Count non-NULL values in column
COUNT(DISTINCT column) -- Count unique non-NULL values

Example 1: Basic COUNT

-- Count total number of sales
SELECT COUNT(*) AS total_sales
FROM sales;

/*
total_sales
-----------
10
*/

-- Count sales by category
SELECT 
    category,
    COUNT(*) AS sale_count
FROM sales
GROUP BY category
ORDER BY sale_count DESC;

/*
category    | sale_count
------------|------------
Electronics | 4
Furniture   | 3
Clothing    | 3
*/

Example 2: COUNT with DISTINCT

-- Count unique products sold
SELECT COUNT(DISTINCT product_id) AS unique_products
FROM sales;

/*
unique_products
---------------
7
*/

-- Count unique products per category
SELECT 
    category,
    COUNT(DISTINCT product_id) AS unique_products,
    COUNT(*) AS total_sales
FROM sales
GROUP BY category;

/*
category    | unique_products | total_sales
------------|-----------------|------------
Electronics | 2               | 4
Furniture   | 2               | 3
Clothing    | 3               | 3
*/

Example 3: COUNT with Conditions

-- Count sales over $500
SELECT COUNT(*) AS high_value_sales
FROM sales
WHERE total_amount > 500;

/*
high_value_sales
----------------
4
*/

-- Count sales by region with conditions
SELECT 
    region,
    COUNT(*) AS total_sales,
    COUNT(CASE WHEN total_amount > 500 THEN 1 END) AS high_value_sales,
    COUNT(CASE WHEN total_amount <= 500 THEN 1 END) AS low_value_sales
FROM sales
GROUP BY region;

/*
region | total_sales | high_value_sales | low_value_sales
-------|-------------|------------------|----------------
North  | 3           | 1                | 2
South  | 2           | 0                | 2
East   | 2           | 0                | 2
West   | 3           | 2                | 1
*/

Example 4: COUNT with NULL Handling

-- Create table with NULLs
CREATE TABLE orders (
    order_id INT,
    customer_id INT,
    notes TEXT
);

INSERT INTO orders VALUES
(1, 101, 'Express delivery'),
(2, 102, NULL),
(3, 103, 'Gift wrap'),
(4, 104, NULL);

-- COUNT(*) includes NULLs
SELECT COUNT(*) AS total_orders FROM orders;  -- Returns 4

-- COUNT(column) excludes NULLs
SELECT COUNT(notes) AS orders_with_notes FROM orders;  -- Returns 2

-- Count NULLs
SELECT COUNT(*) - COUNT(notes) AS orders_without_notes FROM orders;  -- Returns 2

Example 5: COUNT with Multiple Groups

-- Count sales by category and region
SELECT 
    category,
    region,
    COUNT(*) AS sale_count,
    SUM(quantity) AS total_quantity
FROM sales
GROUP BY category, region
ORDER BY category, region;

/*
category    | region | sale_count | total_quantity
------------|--------|------------|---------------
Clothing    | North  | 2          | 9
Clothing    | South  | 1          | 3
Electronics | East   | 1          | 1
Electronics | North  | 1          | 2
Electronics | South  | 1          | 1
Electronics | West   | 1          | 2
Furniture   | East   | 1          | 1
Furniture   | North  | 1          | 3
Furniture   | West   | 1          | 2
*/

SUM Function

SUM() calculates the total sum of a numeric column.

Example 1: Basic SUM

-- Total sales amount
SELECT SUM(total_amount) AS total_revenue
FROM sales;

/*
total_revenue
-------------
4765.00
*/

-- Sum by category
SELECT 
    category,
    SUM(total_amount) AS category_revenue,
    SUM(quantity) AS total_quantity
FROM sales
GROUP BY category
ORDER BY category_revenue DESC;

/*
category    | category_revenue | total_quantity
------------|------------------|---------------
Electronics | 2600.00          | 6
Furniture   | 1450.00          | 6
Clothing    | 715.00           | 12
*/

Example 2: SUM with Calculations

-- Calculate total revenue and profit
SELECT 
    category,
    SUM(total_amount) AS revenue,
    SUM(quantity * unit_price * 0.3) AS estimated_profit,
    SUM(total_amount) - SUM(quantity * unit_price * 0.7) AS profit_margin
FROM sales
GROUP BY category;

Example 3: SUM with Conditions (CASE)

-- Sum sales by region with conditional totals
SELECT 
    region,
    SUM(total_amount) AS total_sales,
    SUM(CASE WHEN category = 'Electronics' THEN total_amount ELSE 0 END) AS electronics_sales,
    SUM(CASE WHEN category = 'Furniture' THEN total_amount ELSE 0 END) AS furniture_sales,
    SUM(CASE WHEN category = 'Clothing' THEN total_amount ELSE 0 END) AS clothing_sales
FROM sales
GROUP BY region
ORDER BY total_sales DESC;

/*
region | total_sales | electronics_sales | furniture_sales | clothing_sales
-------|-------------|-------------------|-----------------|---------------
North  | 1850.00     | 1000.00           | 600.00          | 250.00
West   | 1400.00     | 1000.00           | 400.00          | 0.00
South  | 525.00      | 300.00            | 0.00            | 225.00
East   | 750.00      | 300.00            | 450.00          | 0.00
*/

Example 4: SUM with Date Grouping

-- Monthly sales totals
SELECT 
    YEAR(sale_date) AS year,
    MONTH(sale_date) AS month,
    COUNT(*) AS sale_count,
    SUM(total_amount) AS monthly_revenue,
    AVG(total_amount) AS avg_sale_amount
FROM sales
GROUP BY YEAR(sale_date), MONTH(sale_date)
ORDER BY year, month;

/*
year | month | sale_count | monthly_revenue | avg_sale_amount
-----|-------|------------|-----------------|----------------
2024 | 1     | 8          | 4225.00         | 528.13
2024 | 2     | 2          | 540.00          | 270.00
*/

Example 5: Running Total (Cumulative SUM)

-- Running total of sales
SELECT 
    sale_date,
    total_amount,
    SUM(total_amount) OVER (ORDER BY sale_date) AS running_total
FROM sales
ORDER BY sale_date;

/*
sale_date  | total_amount | running_total
-----------|--------------|---------------
2024-01-15 | 1000.00      | 1000.00
2024-01-16 | 300.00       | 1300.00
2024-01-17 | 600.00       | 1900.00
2024-01-18 | 450.00       | 2350.00
...
*/

AVG Function

AVG() calculates the average value of a numeric column.

Example 1: Basic AVG

-- Average sale amount
SELECT AVG(total_amount) AS avg_sale_amount
FROM sales;

/*
avg_sale_amount
---------------
476.50
*/

-- Average by category
SELECT 
    category,
    COUNT(*) AS sale_count,
    AVG(total_amount) AS avg_sale_amount,
    MIN(total_amount) AS min_sale,
    MAX(total_amount) AS max_sale
FROM sales
GROUP BY category
ORDER BY avg_sale_amount DESC;

/*
category    | sale_count | avg_sale_amount | min_sale | max_sale
------------|------------|-----------------|----------|----------
Electronics | 4          | 650.00          | 300.00   | 1000.00
Furniture   | 3          | 483.33          | 400.00   | 600.00
Clothing    | 3          | 238.33          | 225.00   | 250.00
*/

Example 2: AVG with ROUND

-- Rounded averages
SELECT 
    category,
    ROUND(AVG(total_amount), 2) AS avg_sale,
    ROUND(AVG(quantity), 2) AS avg_quantity,
    ROUND(AVG(unit_price), 2) AS avg_unit_price
FROM sales
GROUP BY category;

/*
category    | avg_sale | avg_quantity | avg_unit_price
------------|----------|--------------|---------------
Electronics | 650.00   | 1.50         | 400.00
Furniture   | 483.33   | 2.00         | 283.33
Clothing    | 238.33   | 4.00         | 61.67
*/

Example 3: AVG with Filtering

-- Average salary by department (excluding CEO)
SELECT 
    department,
    COUNT(*) AS employee_count,
    ROUND(AVG(salary), 2) AS avg_salary,
    MIN(salary) AS min_salary,
    MAX(salary) AS max_salary
FROM employees
WHERE department != 'Executive'
GROUP BY department
ORDER BY avg_salary DESC;

/*
department | employee_count | avg_salary | min_salary | max_salary
-----------|----------------|------------|------------|------------
IT         | 3              | 64000.00   | 52000.00   | 85000.00
Sales      | 4              | 57250.00   | 48000.00   | 80000.00
*/

Example 4: Weighted Average

-- Weighted average price (by quantity)
SELECT 
    category,
    SUM(quantity * unit_price) / SUM(quantity) AS weighted_avg_price,
    AVG(unit_price) AS simple_avg_price
FROM sales
GROUP BY category;

/*
category    | weighted_avg_price | simple_avg_price
------------|-------------------|------------------
Electronics | 400.00            | 400.00
Furniture   | 216.67            | 283.33
Clothing    | 59.58             | 61.67
*/

Example 5: AVG with NULL Handling

-- AVG ignores NULL values
CREATE TABLE test_scores (
    student_id INT,
    test1 INT,
    test2 INT,
    test3 INT
);

INSERT INTO test_scores VALUES
(1, 85, 90, 88),
(2, 78, NULL, 82),  -- NULL is ignored in AVG
(3, 92, 95, NULL);

SELECT 
    student_id,
    AVG(test1) AS avg_test1,
    AVG(test2) AS avg_test2,
    AVG(test3) AS avg_test3,
    (COALESCE(test1, 0) + COALESCE(test2, 0) + COALESCE(test3, 0)) / 3 AS avg_all
FROM test_scores
GROUP BY student_id, test1, test2, test3;

MIN and MAX Functions

MIN() returns the minimum value, MAX() returns the maximum value.

Example 1: Basic MIN and MAX

-- Find price range
SELECT 
    MIN(unit_price) AS lowest_price,
    MAX(unit_price) AS highest_price,
    MAX(unit_price) - MIN(unit_price) AS price_range
FROM sales;

/*
lowest_price | highest_price | price_range
-------------|---------------|-------------
50.00        | 500.00        | 450.00
*/

Example 2: MIN and MAX by Group

-- Price range by category
SELECT 
    category,
    MIN(unit_price) AS min_price,
    MAX(unit_price) AS max_price,
    AVG(unit_price) AS avg_price,
    MAX(unit_price) - MIN(unit_price) AS price_range
FROM sales
GROUP BY category
ORDER BY price_range DESC;

/*
category    | min_price | max_price | avg_price | price_range
------------|-----------|-----------|-----------|-------------
Electronics | 300.00    | 500.00    | 400.00    | 200.00
Furniture   | 200.00    | 450.00    | 283.33    | 250.00
Clothing    | 50.00     | 75.00     | 61.67     | 25.00
*/

Example 3: MIN and MAX with Dates

-- Find first and last sale dates
SELECT 
    category,
    MIN(sale_date) AS first_sale,
    MAX(sale_date) AS last_sale,
    DATEDIFF(MAX(sale_date), MIN(sale_date)) AS days_between
FROM sales
GROUP BY category;

/*
category    | first_sale | last_sale  | days_between
------------|------------|------------|-------------
Electronics | 2024-01-15 | 2024-02-01 | 17
Furniture   | 2024-01-17 | 2024-01-22 | 5
Clothing    | 2024-01-20 | 2024-02-02 | 13
*/

Example 4: MIN and MAX with Strings

-- Alphabetical range of employee names
SELECT 
    department,
    MIN(name) AS first_alphabetically,
    MAX(name) AS last_alphabetically,
    COUNT(*) AS employee_count
FROM employees
GROUP BY department;

/*
department | first_alphabetically | last_alphabetically | employee_count
-----------|---------------------|---------------------|---------------
Executive  | John CEO            | John CEO            | 1
Sales      | Alice Employee      | Jane Manager        | 4
IT         | Bob Manager         | Eve Employee        | 3
*/

Example 5: Finding Records with MIN/MAX Values

-- Find the highest paid employee in each department
SELECT 
    e1.department,
    e1.name,
    e1.salary
FROM employees e1
WHERE e1.salary = (
    SELECT MAX(e2.salary)
    FROM employees e2
    WHERE e2.department = e1.department
)
ORDER BY e1.salary DESC;

/*
department | name         | salary
-----------|--------------|----------
Executive  | John CEO     | 150000.00
IT         | Bob Manager  | 85000.00
Sales      | Jane Manager | 80000.00
*/

GROUP BY Clause

GROUP BY groups rows that have the same values in specified columns into summary rows.

GROUP BY Execution Flow Diagram:

┌─────────────────────────────────────────────────────────────┐
│              GROUP BY EXECUTION FLOW                         │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  Query Execution Order:                                      │
│  ┌──────────────────────────────────────────────────────┐   │
│  │                                                       │   │
│  │  1. FROM clause                                       │   │
│  │     ┌─────────────────────────────────┐              │   │
│  │     │ Identify source tables          │              │   │
│  │     │ Perform JOINs if needed         │              │   │
│  │     └──────────────┬──────────────────┘              │   │
│  │                    │                                  │   │
│  │                    ▼                                  │   │
│  │  2. WHERE clause                                      │   │
│  │     ┌─────────────────────────────────┐              │   │
│  │     │ Filter individual rows          │              │   │
│  │     │ BEFORE grouping                 │              │   │
│  │     └──────────────┬──────────────────┘              │   │
│  │                    │                                  │   │
│  │                    ▼                                  │   │
│  │  3. GROUP BY clause                                   │   │
│  │     ┌─────────────────────────────────┐              │   │
│  │     │ Group rows by specified columns │              │   │
│  │     │                                 │              │   │
│  │     │ Example Data:                   │              │   │
│  │     │ ┌──────────┬────────┐           │              │   │
│  │     │ │ category │ amount │           │              │   │
│  │     │ ├──────────┼────────┤           │              │   │
│  │     │ │ A        │ 100    │ ┐         │              │   │
│  │     │ │ A        │ 200    │ ├─ Group A│              │   │
│  │     │ │ A        │ 150    │ ┘         │              │   │
│  │     │ │ B        │ 300    │ ┐         │              │   │
│  │     │ │ B        │ 250    │ ├─ Group B│              │   │
│  │     │ │ B        │ 400    │ ┘         │              │   │
│  │     │ └──────────┴────────┘           │              │   │
│  │     └──────────────┬──────────────────┘              │   │
│  │                    │                                  │   │
│  │                    ▼                                  │   │
│  │  4. Aggregate Functions                               │   │
│  │     ┌─────────────────────────────────┐              │   │
│  │     │ Apply aggregations per group    │              │   │
│  │     │                                 │              │   │
│  │     │ Group A: COUNT=3, SUM=450       │              │   │
│  │     │ Group B: COUNT=3, SUM=950       │              │   │
│  │     └──────────────┬──────────────────┘              │   │
│  │                    │                                  │   │
│  │                    ▼                                  │   │
│  │  5. HAVING clause                                     │   │
│  │     ┌─────────────────────────────────┐              │   │
│  │     │ Filter groups                   │              │   │
│  │     │ AFTER aggregation               │              │   │
│  │     │                                 │              │   │
│  │     │ Example: HAVING SUM > 500       │              │   │
│  │     │ Result: Only Group B passes     │              │   │
│  │     └──────────────┬──────────────────┘              │   │
│  │                    │                                  │   │
│  │                    ▼                                  │   │
│  │  6. SELECT clause                                     │   │
│  │     ┌─────────────────────────────────┐              │   │
│  │     │ Select columns and aggregates   │              │   │
│  │     └──────────────┬──────────────────┘              │   │
│  │                    │                                  │   │
│  │                    ▼                                  │   │
│  │  7. ORDER BY clause                                   │   │
│  │     ┌─────────────────────────────────┐              │   │
│  │     │ Sort final results              │              │   │
│  │     └──────────────┬──────────────────┘              │   │
│  │                    │                                  │   │
│  │                    ▼                                  │   │
│  │  8. LIMIT/OFFSET                                      │   │
│  │     ┌─────────────────────────────────┐              │   │
│  │     │ Limit number of results         │              │   │
│  │     └─────────────────────────────────┘              │   │
│  │                                                       │   │
│  └──────────────────────────────────────────────────────┘   │
│                                                              │
│  Visual Example:                                             │
│  ┌──────────────────────────────────────────────────────┐   │
│  │                                                       │   │
│  │  Original Data (10 rows):                             │   │
│  │  ┌──────────┬────────┬────────┐                      │   │
│  │  │ dept     │ emp    │ salary │                      │   │
│  │  ├──────────┼────────┼────────┤                      │   │
│  │  │ Sales    │ John   │ 50000  │ ┐                    │   │
│  │  │ Sales    │ Jane   │ 55000  │ │                    │   │
│  │  │ Sales    │ Bob    │ 52000  │ ├─ Sales Group      │   │
│  │  │ IT       │ Alice  │ 60000  │ │                    │   │
│  │  │ IT       │ Tom    │ 65000  │ ├─ IT Group         │   │
│  │  │ IT       │ Sarah  │ 62000  │ │                    │   │
│  │  │ HR       │ Mike   │ 48000  │ │                    │   │
│  │  │ HR       │ Lisa   │ 50000  │ ├─ HR Group         │   │
│  │  └──────────┴────────┴────────┘ ┘                    │   │
│  │           │                                           │   │
│  │           ▼ GROUP BY dept                             │   │
│  │                                                       │   │
│  │  Grouped Result (3 rows):                             │   │
│  │  ┌──────────┬───────┬────────────┬────────────┐      │   │
│  │  │ dept     │ COUNT │ AVG_SALARY │ MAX_SALARY │      │   │
│  │  ├──────────┼───────┼────────────┼────────────┤      │   │
│  │  │ Sales    │   3   │   52333    │   55000    │      │   │
│  │  │ IT       │   3   │   62333    │   65000    │      │   │
│  │  │ HR       │   2   │   49000    │   50000    │      │   │
│  │  └──────────┴───────┴────────────┴────────────┘      │   │
│  │                                                       │   │
│  └──────────────────────────────────────────────────────┘   │
│                                                              │
│  Multiple Column GROUP BY:                                   │
│  ┌──────────────────────────────────────────────────────┐   │
│  │                                                       │   │
│  │  GROUP BY dept, year                                  │   │
│  │                                                       │   │
│  │  Creates groups for each unique combination:          │   │
│  │  ┌──────────┬──────┬───────┐                         │   │
│  │  │ dept     │ year │ COUNT │                         │   │
│  │  ├──────────┼──────┼───────┤                         │   │
│  │  │ Sales    │ 2023 │   5   │ ← Unique combination   │   │
│  │  │ Sales    │ 2024 │   7   │ ← Unique combination   │   │
│  │  │ IT       │ 2023 │   3   │ ← Unique combination   │   │
│  │  │ IT       │ 2024 │   4   │ ← Unique combination   │   │
│  │  └──────────┴──────┴───────┘                         │   │
│  │                                                       │   │
│  └──────────────────────────────────────────────────────┘   │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Example 1: Single Column GROUP BY

-- Sales by category
SELECT 
    category,
    COUNT(*) AS sale_count,
    SUM(total_amount) AS total_revenue,
    AVG(total_amount) AS avg_sale
FROM sales
GROUP BY category
ORDER BY total_revenue DESC;

Example 2: Multiple Column GROUP BY

-- Sales by category and region
SELECT 
    category,
    region,
    COUNT(*) AS sale_count,
    SUM(total_amount) AS revenue,
    AVG(total_amount) AS avg_sale
FROM sales
GROUP BY category, region
ORDER BY category, revenue DESC;

/*
category    | region | sale_count | revenue | avg_sale
------------|--------|------------|---------|----------
Clothing    | North  | 2          | 490.00  | 245.00
Clothing    | South  | 1          | 225.00  | 225.00
Electronics | North  | 1          | 1000.00 | 1000.00
Electronics | West   | 1          | 1000.00 | 1000.00
Electronics | South  | 1          | 300.00  | 300.00
Electronics | East   | 1          | 300.00  | 300.00
Furniture   | North  | 1          | 600.00  | 600.00
Furniture   | East   | 1          | 450.00  | 450.00
Furniture   | West   | 1          | 400.00  | 400.00
*/

Example 3: GROUP BY with Expressions

-- Sales by year and month
SELECT 
    YEAR(sale_date) AS year,
    MONTH(sale_date) AS month,
    MONTHNAME(sale_date) AS month_name,
    COUNT(*) AS sale_count,
    SUM(total_amount) AS monthly_revenue
FROM sales
GROUP BY YEAR(sale_date), MONTH(sale_date), MONTHNAME(sale_date)
ORDER BY year, month;

/*
year | month | month_name | sale_count | monthly_revenue
-----|-------|------------|------------|----------------
2024 | 1     | January    | 8          | 4225.00
2024 | 2     | February   | 2          | 540.00
*/

Example 4: GROUP BY with CASE

-- Group sales into price ranges
SELECT 
    CASE 
        WHEN total_amount < 300 THEN 'Low (< 300)'
        WHEN total_amount BETWEEN 300 AND 600 THEN 'Medium (300-600)'
        ELSE 'High (> 600)'
    END AS price_range,
    COUNT(*) AS sale_count,
    SUM(total_amount) AS total_revenue,
    AVG(total_amount) AS avg_amount
FROM sales
GROUP BY 
    CASE 
        WHEN total_amount < 300 THEN 'Low (< 300)'
        WHEN total_amount BETWEEN 300 AND 600 THEN 'Medium (300-600)'
        ELSE 'High (> 600)'
    END
ORDER BY avg_amount;

/*
price_range        | sale_count | total_revenue | avg_amount
-------------------|------------|---------------|------------
Low (< 300)        | 4          | 965.00        | 241.25
Medium (300-600)   | 3          | 1350.00       | 450.00
High (> 600)       | 3          | 2450.00       | 816.67
*/

Example 5: GROUP BY ALL Columns

-- Find duplicate sales (if any)
SELECT 
    product_id,
    category,
    region,
    sale_date,
    COUNT(*) AS duplicate_count
FROM sales
GROUP BY product_id, category, region, sale_date
HAVING COUNT(*) > 1;

HAVING Clause

HAVING filters groups after aggregation (WHERE filters rows before aggregation).

WHERE vs HAVING

┌─────────────────────────────────────────────────────────────┐
│                  WHERE vs HAVING                             │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  WHERE                          HAVING                       │
│  ─────────────────────────────  ──────────────────────────  │
│  • Filters individual rows      • Filters groups            │
│  • Before GROUP BY              • After GROUP BY            │
│  • Cannot use aggregates        • Can use aggregates        │
│  • Faster (reduces data early)  • Slower (after grouping)   │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Example 1: Basic HAVING

-- Categories with more than 2 sales
SELECT 
    category,
    COUNT(*) AS sale_count,
    SUM(total_amount) AS total_revenue
FROM sales
GROUP BY category
HAVING COUNT(*) > 2
ORDER BY sale_count DESC;

/*
category    | sale_count | total_revenue
------------|------------|---------------
Electronics | 4          | 2600.00
Furniture   | 3          | 1450.00
Clothing    | 3          | 715.00
*/

Example 2: HAVING with Multiple Conditions

-- High-performing categories
SELECT 
    category,
    COUNT(*) AS sale_count,
    SUM(total_amount) AS total_revenue,
    AVG(total_amount) AS avg_sale
FROM sales
GROUP BY category
HAVING COUNT(*) >= 3 
   AND SUM(total_amount) > 1000
ORDER BY total_revenue DESC;

/*
category    | sale_count | total_revenue | avg_sale
------------|------------|---------------|----------
Electronics | 4          | 2600.00       | 650.00
Furniture   | 3          | 1450.00       | 483.33
*/

Example 3: WHERE and HAVING Together

-- Sales in 2024 January, grouped by category, with revenue > 500
SELECT 
    category,
    COUNT(*) AS sale_count,
    SUM(total_amount) AS total_revenue
FROM sales
WHERE YEAR(sale_date) = 2024 
  AND MONTH(sale_date) = 1  -- WHERE filters rows first
GROUP BY category
HAVING SUM(total_amount) > 500  -- HAVING filters groups
ORDER BY total_revenue DESC;

/*
category    | sale_count | total_revenue
------------|------------|---------------
Electronics | 3          | 2300.00
Furniture   | 3          | 1450.00
*/

Example 4: HAVING with Subquery

-- Categories with above-average revenue
SELECT 
    category,
    SUM(total_amount) AS total_revenue
FROM sales
GROUP BY category
HAVING SUM(total_amount) > (
    SELECT AVG(category_revenue)
    FROM (
        SELECT SUM(total_amount) AS category_revenue
        FROM sales
        GROUP BY category
    ) AS category_totals
)
ORDER BY total_revenue DESC;

Example 5: HAVING with Complex Aggregates

-- Departments with high average salary and multiple employees
SELECT 
    department,
    COUNT(*) AS employee_count,
    ROUND(AVG(salary), 2) AS avg_salary,
    MIN(salary) AS min_salary,
    MAX(salary) AS max_salary
FROM employees
GROUP BY department
HAVING COUNT(*) > 1 
   AND AVG(salary) > 50000
ORDER BY avg_salary DESC;

/*
department | employee_count | avg_salary | min_salary | max_salary
-----------|----------------|------------|------------|------------
IT         | 3              | 64000.00   | 52000.00   | 85000.00
Sales      | 4              | 57250.00   | 48000.00   | 80000.00
*/

GROUPING SETS

GROUPING SETS allows you to define multiple grouping sets in a single query.

Example 1: Basic GROUPING SETS

-- Sales totals by category, region, and overall
SELECT 
    category,
    region,
    SUM(total_amount) AS total_revenue
FROM sales
GROUP BY GROUPING SETS (
    (category, region),  -- Group by both
    (category),          -- Group by category only
    (region),            -- Group by region only
    ()                   -- Grand total
)
ORDER BY category, region;

/*
category    | region | total_revenue
------------|--------|---------------
Clothing    | North  | 490.00
Clothing    | South  | 225.00
Clothing    | NULL   | 715.00        -- Category subtotal
Electronics | East   | 300.00
Electronics | North  | 1000.00
Electronics | South  | 300.00
Electronics | West   | 1000.00
Electronics | NULL   | 2600.00       -- Category subtotal
Furniture   | East   | 450.00
Furniture   | North  | 600.00
Furniture   | West   | 400.00
Furniture   | NULL   | 1450.00       -- Category subtotal
NULL        | East   | 750.00        -- Region subtotal
NULL        | North  | 2090.00       -- Region subtotal
NULL        | South  | 525.00        -- Region subtotal
NULL        | West   | 1400.00       -- Region subtotal
NULL        | NULL   | 4765.00       -- Grand total
*/

Example 2: GROUPING SETS vs UNION

-- GROUPING SETS (efficient)
SELECT category, region, SUM(total_amount) AS revenue
FROM sales
GROUP BY GROUPING SETS ((category), (region), ());

-- Equivalent using UNION (less efficient)
SELECT category, NULL AS region, SUM(total_amount) AS revenue
FROM sales GROUP BY category
UNION ALL
SELECT NULL, region, SUM(total_amount) AS revenue
FROM sales GROUP BY region
UNION ALL
SELECT NULL, NULL, SUM(total_amount) AS revenue
FROM sales;

ROLLUP

ROLLUP creates subtotals and grand totals in a hierarchical manner.

Example 1: Basic ROLLUP

-- Hierarchical totals: category → region → grand total
SELECT 
    category,
    region,
    COUNT(*) AS sale_count,
    SUM(total_amount) AS total_revenue
FROM sales
GROUP BY ROLLUP(category, region)
ORDER BY category, region;

/*
category    | region | sale_count | total_revenue
------------|--------|------------|---------------
Clothing    | North  | 2          | 490.00
Clothing    | South  | 1          | 225.00
Clothing    | NULL   | 3          | 715.00        -- Category subtotal
Electronics | East   | 1          | 300.00
Electronics | North  | 1          | 1000.00
Electronics | South  | 1          | 300.00
Electronics | West   | 1          | 1000.00
Electronics | NULL   | 4          | 2600.00       -- Category subtotal
Furniture   | East   | 1          | 450.00
Furniture   | North  | 1          | 600.00
Furniture   | West   | 1          | 400.00
Furniture   | NULL   | 3          | 1450.00       -- Category subtotal
NULL        | NULL   | 10         | 4765.00       -- Grand total
*/

Example 2: ROLLUP with Three Levels

-- Year → Month → Day hierarchy
SELECT 
    YEAR(sale_date) AS year,
    MONTH(sale_date) AS month,
    DAY(sale_date) AS day,
    SUM(total_amount) AS daily_revenue
FROM sales
GROUP BY ROLLUP(YEAR(sale_date), MONTH(sale_date), DAY(sale_date))
ORDER BY year, month, day;

Example 3: Identifying ROLLUP Rows

-- Use GROUPING() to identify subtotal rows
SELECT 
    category,
    region,
    SUM(total_amount) AS revenue,
    GROUPING(category) AS is_category_total,
    GROUPING(region) AS is_region_total,
    CASE 
        WHEN GROUPING(category) = 1 AND GROUPING(region) = 1 THEN 'Grand Total'
        WHEN GROUPING(region) = 1 THEN 'Category Subtotal'
        ELSE 'Detail'
    END AS row_type
FROM sales
GROUP BY ROLLUP(category, region)
ORDER BY category, region;

CUBE

CUBE creates subtotals for all possible combinations of dimensions.

Example 1: Basic CUBE

-- All possible combinations of category and region
SELECT 
    category,
    region,
    COUNT(*) AS sale_count,
    SUM(total_amount) AS total_revenue
FROM sales
GROUP BY CUBE(category, region)
ORDER BY category, region;

/*
category    | region | sale_count | total_revenue
------------|--------|------------|---------------
NULL        | NULL   | 10         | 4765.00       -- Grand total
NULL        | East   | 2          | 750.00        -- Region total
NULL        | North  | 3          | 2090.00       -- Region total
NULL        | South  | 2          | 525.00        -- Region total
NULL        | West   | 3          | 1400.00       -- Region total
Clothing    | NULL   | 3          | 715.00        -- Category total
Clothing    | North  | 2          | 490.00
Clothing    | South  | 1          | 225.00
Electronics | NULL   | 4          | 2600.00       -- Category total
Electronics | East   | 1          | 300.00
Electronics | North  | 1          | 1000.00
Electronics | South  | 1          | 300.00
Electronics | West   | 1          | 1000.00
Furniture   | NULL   | 3          | 1450.00       -- Category total
Furniture   | East   | 1          | 450.00
Furniture   | North  | 1          | 600.00
Furniture   | West   | 1          | 400.00
*/

Example 2: CUBE vs ROLLUP

-- ROLLUP: Hierarchical (category → region)
-- Creates: (category, region), (category), ()
SELECT category, region, SUM(total_amount) AS revenue
FROM sales
GROUP BY ROLLUP(category, region);

-- CUBE: All combinations
-- Creates: (category, region), (category), (region), ()
SELECT category, region, SUM(total_amount) AS revenue
FROM sales
GROUP BY CUBE(category, region);

Advanced Aggregation

Example 1: Conditional Aggregation

-- Pivot-like aggregation using CASE
SELECT 
    region,
    SUM(CASE WHEN category = 'Electronics' THEN total_amount ELSE 0 END) AS electronics,
    SUM(CASE WHEN category = 'Furniture' THEN total_amount ELSE 0 END) AS furniture,
    SUM(CASE WHEN category = 'Clothing' THEN total_amount ELSE 0 END) AS clothing,
    SUM(total_amount) AS total
FROM sales
GROUP BY region
ORDER BY total DESC;

/*
region | electronics | furniture | clothing | total
-------|-------------|-----------|----------|--------
North  | 1000.00     | 600.00    | 490.00   | 2090.00
West   | 1000.00     | 400.00    | 0.00     | 1400.00
East   | 300.00      | 450.00    | 0.00     | 750.00
South  | 300.00      | 0.00      | 225.00   | 525.00
*/

Example 2: String Aggregation

-- MySQL: GROUP_CONCAT
SELECT 
    category,
    GROUP_CONCAT(DISTINCT region ORDER BY region) AS regions,
    COUNT(DISTINCT region) AS region_count
FROM sales
GROUP BY category;

/*
category    | regions              | region_count
------------|----------------------|-------------
Clothing    | North,South          | 2
Electronics | East,North,South,West| 4
Furniture   | East,North,West      | 3
*/

-- PostgreSQL: STRING_AGG
SELECT 
    category,
    STRING_AGG(DISTINCT region, ', ' ORDER BY region) AS regions
FROM sales
GROUP BY category;

-- SQL Server: STRING_AGG (SQL Server 2017+)
SELECT 
    category,
    STRING_AGG(region, ', ') WITHIN GROUP (ORDER BY region) AS regions
FROM sales
GROUP BY category;

Example 3: Statistical Aggregates

-- Standard deviation and variance
SELECT 
    category,
    COUNT(*) AS sale_count,
    AVG(total_amount) AS avg_amount,
    STDDEV(total_amount) AS std_dev,
    VARIANCE(total_amount) AS variance,
    MIN(total_amount) AS min_amount,
    MAX(total_amount) AS max_amount
FROM sales
GROUP BY category;

Example 4: Percentile Calculations

-- Median and percentiles (MySQL 8.0+)
SELECT 
    category,
    AVG(total_amount) AS mean,
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY total_amount) AS median,
    PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY total_amount) AS q1,
    PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY total_amount) AS q3
FROM sales
GROUP BY category;

Example 5: Nested Aggregation

-- Average of category averages
SELECT 
    AVG(category_avg) AS overall_avg
FROM (
    SELECT 
        category,
        AVG(total_amount) AS category_avg
    FROM sales
    GROUP BY category
) AS category_averages;

Best Practices

1. Use Appropriate Aggregate Functions

-- ✅ GOOD: Use COUNT(*) for row count
SELECT COUNT(*) FROM sales;

-- ❌ AVOID: COUNT(column) when you want all rows
SELECT COUNT(product_id) FROM sales;  -- Excludes NULLs

2. Include All Non-Aggregated Columns in GROUP BY

-- ✅ GOOD: All non-aggregated columns in GROUP BY
SELECT category, region, SUM(total_amount)
FROM sales
GROUP BY category, region;

-- ❌ ERROR: Missing column in GROUP BY
SELECT category, region, SUM(total_amount)
FROM sales
GROUP BY category;  -- Error: region not in GROUP BY

3. Use HAVING for Group Filtering

-- ✅ GOOD: HAVING for aggregate conditions
SELECT category, SUM(total_amount) AS revenue
FROM sales
GROUP BY category
HAVING SUM(total_amount) > 1000;

-- ❌ WRONG: WHERE with aggregate (won't work)
SELECT category, SUM(total_amount) AS revenue
FROM sales
WHERE SUM(total_amount) > 1000  -- Error!
GROUP BY category;

4. Handle NULLs Appropriately

-- ✅ GOOD: Use COALESCE for NULL handling
SELECT 
    category,
    COALESCE(SUM(total_amount), 0) AS total_revenue
FROM sales
GROUP BY category;

5. Use Indexes on GROUP BY Columns

-- ✅ GOOD: Create indexes for frequently grouped columns
CREATE INDEX idx_sales_category ON sales(category);
CREATE INDEX idx_sales_region ON sales(region);
CREATE INDEX idx_sales_date ON sales(sale_date);

6. Consider Performance with Large Datasets

-- ✅ GOOD: Filter before grouping
SELECT category, SUM(total_amount)
FROM sales
WHERE sale_date >= '2024-01-01'  -- Filter first
GROUP BY category;

-- ❌ SLOWER: Group all data then filter
SELECT category, SUM(total_amount)
FROM sales
GROUP BY category
HAVING MIN(sale_date) >= '2024-01-01';

7. Use Meaningful Aliases

-- ✅ GOOD: Clear aliases
SELECT 
    category,
    COUNT(*) AS sale_count,
    SUM(total_amount) AS total_revenue,
    AVG(total_amount) AS avg_sale_amount
FROM sales
GROUP BY category;

-- ❌ BAD: Unclear aliases
SELECT 
    category,
    COUNT(*) AS c,
    SUM(total_amount) AS s,
    AVG(total_amount) AS a
FROM sales
GROUP BY category;

Summary

This section covered:

  • ✅ COUNT function (with DISTINCT and NULL handling)
  • ✅ SUM function (with conditional aggregation)
  • ✅ AVG function (with weighted averages)
  • ✅ MIN and MAX functions
  • ✅ GROUP BY clause (single and multiple columns)
  • ✅ HAVING clause (filtering groups)
  • ✅ GROUPING SETS (multiple groupings)
  • ✅ ROLLUP (hierarchical totals)
  • ✅ CUBE (all combinations)
  • ✅ Advanced aggregation techniques
  • ✅ Best practices