SQL Functions

Master SQL functions with Mermaid diagrams covering scalar, table-valued, built-in, and user-defined functions with practical examples

SQL Functions are reusable code blocks that accept parameters, perform operations, and return values. They encapsulate logic and can be used in queries, stored procedures, and other functions.

Function Types Overview

graph TB
    A[SQL Functions] --> B[Scalar Functions]
    A --> C[Table-Valued Functions]
    A --> D[Built-in Functions]
    
    B --> B1[Return single value]
    B --> B2[Used in expressions]
    
    C --> C1[Inline TVF]
    C --> C2[Multi-Statement TVF]
    
    C1 --> C1A[Single SELECT]
    C1 --> C1B[Better performance]
    
    C2 --> C2A[Multiple statements]
    C2 --> C2B[Complex logic]
    
    D --> D1[String, Math, Date]
    D --> D2[Aggregate, Conditional]
    
    style A fill:#2196F3
    style B fill:#4CAF50
    style C fill:#FF9800
    style D fill:#9C27B0

Key Points:

Scalar: Returns single value (number, string, date)
Table-Valued: Returns table result set
Inline TVF: Single SELECT, better performance
Multi-Statement TVF: Complex logic with variables
Built-in: Database-provided functions (UPPER, SUM, NOW)

Functions vs Stored Procedures

graph LR
    A[Functions] --> A1[Must return value]
    A --> A2[Can be used in SELECT]
    A --> A3[Cannot modify data]
    A --> A4[Deterministic possible]
    
    B[Stored Procedures] --> B1[Optional return]
    B --> B2[Cannot use in SELECT]
    B --> B3[Can modify data]
    B --> B4[Always non-deterministic]
    
    style A fill:#4CAF50
    style B fill:#FF9800

Key Points:

Functions: Return values, used in queries, read-only
Procedures: Execute actions, modify data, transactions
Use Functions: Calculations, transformations, validations
Use Procedures: Business logic, data modifications, workflows
Performance: Functions can be optimized by query planner

Scalar Function Execution

sequenceDiagram
    participant Q as Query
    participant F as Scalar Function
    participant DB as Database
    
    Q->>F: Call with parameters
    F->>DB: Execute logic
    DB->>F: Process data
    F->>F: Calculate result
    F->>Q: Return single value
    
    Note over Q,F: Used in SELECT, WHERE, etc.

Key Points:

Single Value: Returns one value per call
Inline Usage: Can be used anywhere expressions allowed
Parameters: Accept input parameters
Deterministic: Same input always returns same output
Performance: Efficient for simple calculations

Table-Valued Function Flow

flowchart TB
    A[Call TVF with parameters] --> B{Function Type?}
    B -->|Inline TVF| C[Single SELECT statement]
    B -->|Multi-Statement| D[Declare table variable]
    
    C --> E[Return result set]
    
    D --> F[Execute multiple statements]
    F --> G[Populate table variable]
    G --> H[Return table variable]
    
    E --> I[Use in FROM clause]
    H --> I
    
    style C fill:#4CAF50
    style D fill:#FF9800
    style I fill:#2196F3

Key Points:

Inline TVF: Single SELECT, acts like parameterized view
Multi-Statement: Complex logic, uses table variable
Performance: Inline TVF faster, optimized by query planner
Usage: Used in FROM clause like regular tables
Flexibility: Can JOIN with other tables

Deterministic vs Non-Deterministic

graph TB
    A[Function Behavior] --> B[Deterministic]
    A --> C[Non-Deterministic]
    
    B --> B1[Same input = Same output]
    B --> B2[Can be indexed]
    B --> B3[Examples: ABS, UPPER, CONCAT]
    
    C --> C1[Same input = Different output]
    C --> C2[Cannot be indexed]
    C --> C3[Examples: NOW, RAND, NEWID]
    
    style B fill:#4CAF50
    style B2 fill:#4CAF50
    style C fill:#F44336
    style C2 fill:#F44336

Key Points:

Deterministic: Predictable, same result for same input
Non-Deterministic: Varies, depends on time, random, etc.
Indexing: Only deterministic functions can be indexed
Performance: Deterministic functions can be cached
Best Practice: Mark user-defined functions as deterministic when possible

Built-in Function Categories

graph LR
    A[Built-in Functions] --> B[String]
    A --> C[Numeric]
    A --> D[Date/Time]
    A --> E[Aggregate]
    A --> F[Conditional]
    
    B --> B1[UPPER, LOWER, CONCAT]
    C --> C1[ABS, ROUND, CEILING]
    D --> D1[NOW, DATEADD, DATEDIFF]
    E --> E1[SUM, AVG, COUNT]
    F --> F1[CASE, COALESCE, NULLIF]
    
    style A fill:#2196F3
    style B fill:#4CAF50
    style C fill:#FF9800
    style D fill:#9C27B0
    style E fill:#F44336
    style F fill:#00BCD4

Key Points:

String: Text manipulation (UPPER, SUBSTRING, TRIM)
Numeric: Mathematical operations (ROUND, ABS, POWER)
Date/Time: Date calculations (NOW, DATEADD, EXTRACT)
Aggregate: Group calculations (SUM, AVG, COUNT, MAX)
Conditional: Logic operations (CASE, COALESCE, IF)

Function Performance Impact

flowchart LR
    A[Query with Function] --> B{Function in WHERE?}
    B -->|Yes| C[Cannot use index]
    B -->|No| D[Can use index]
    
    C --> E[Full table scan]
    E --> F[Slow performance]
    
    D --> G[Index seek]
    G --> H[Fast performance]
    
    style C fill:#F44336
    style E fill:#F44336
    style F fill:#F44336
    style D fill:#4CAF50
    style G fill:#4CAF50
    style H fill:#4CAF50

Key Points:

WHERE Clause: Functions on columns prevent index usage
Sargable: Avoid functions in WHERE for better performance
Alternative: Rewrite queries to avoid column functions
Computed Columns: Pre-calculate and index results
Best Practice: Apply functions to constants, not columns

User-Defined Function Lifecycle

sequenceDiagram
    participant Dev as Developer
    participant DB as Database
    participant App as Application
    
    Dev->>DB: CREATE FUNCTION
    DB->>DB: Parse and compile
    DB->>DB: Store in system catalog
    
    App->>DB: Call function
    DB->>DB: Execute function
    DB->>App: Return result
    
    Dev->>DB: ALTER FUNCTION (update)
    Dev->>DB: DROP FUNCTION (remove)

Key Points:

CREATE: Define function with parameters and logic
Compilation: Parsed and stored in database
Execution: Called from queries or procedures
Modification: ALTER to update, DROP to remove
Permissions: Control who can create/execute functions

Code Examples

Scalar Function

-- MySQL scalar function
DELIMITER //
CREATE FUNCTION calculate_discount(
    price DECIMAL(10,2),
    discount_percent INT
)
RETURNS DECIMAL(10,2)
DETERMINISTIC
BEGIN
    RETURN price * (1 - discount_percent / 100.0);
END //
DELIMITER ;

-- Usage
SELECT 
    product_name,
    price,
    calculate_discount(price, 10) AS discounted_price
FROM products;

-- SQL Server scalar function
CREATE FUNCTION dbo.GetFullName(
    @FirstName VARCHAR(50),
    @LastName VARCHAR(50)
)
RETURNS VARCHAR(101)
AS
BEGIN
    RETURN @FirstName + ' ' + @LastName;
END;

-- Usage
SELECT dbo.GetFullName(first_name, last_name) AS full_name
FROM employees;

Inline Table-Valued Function

-- SQL Server inline TVF
CREATE FUNCTION dbo.GetCustomerOrders(@CustomerId INT)
RETURNS TABLE
AS
RETURN (
    SELECT 
        order_id,
        order_date,
        total_amount,
        status
    FROM orders
    WHERE customer_id = @CustomerId
);

-- Usage
SELECT * FROM dbo.GetCustomerOrders(123);

-- Join with other tables
SELECT 
    c.customer_name,
    o.order_id,
    o.total_amount
FROM customers c
CROSS APPLY dbo.GetCustomerOrders(c.customer_id) o;

Multi-Statement Table-Valued Function

-- SQL Server multi-statement TVF
CREATE FUNCTION dbo.GetEmployeeHierarchy(@ManagerId INT)
RETURNS @Result TABLE (
    employee_id INT,
    employee_name VARCHAR(100),
    level INT
)
AS
BEGIN
    DECLARE @Level INT = 1;
    
    -- Insert direct reports
    INSERT INTO @Result
    SELECT employee_id, employee_name, @Level
    FROM employees
    WHERE manager_id = @ManagerId;
    
    -- Insert indirect reports (simplified)
    WHILE EXISTS (SELECT 1 FROM @Result WHERE level = @Level)
    BEGIN
        SET @Level = @Level + 1;
        
        INSERT INTO @Result
        SELECT e.employee_id, e.employee_name, @Level
        FROM employees e
        INNER JOIN @Result r ON e.manager_id = r.employee_id
        WHERE r.level = @Level - 1;
    END;
    
    RETURN;
END;

-- Usage
SELECT * FROM dbo.GetEmployeeHierarchy(1);

Built-in Functions

-- String functions
SELECT 
    UPPER(first_name) AS upper_name,
    LOWER(last_name) AS lower_name,
    CONCAT(first_name, ' ', last_name) AS full_name,
    SUBSTRING(email, 1, CHARINDEX('@', email) - 1) AS username,
    LENGTH(description) AS desc_length
FROM users;

-- Numeric functions
SELECT 
    ABS(-15.5) AS absolute_value,
    ROUND(123.456, 2) AS rounded,
    CEILING(123.45) AS ceiling_value,
    FLOOR(123.99) AS floor_value,
    POWER(2, 3) AS power_result;

-- Date functions
SELECT 
    NOW() AS current_datetime,
    CURDATE() AS current_date,
    DATE_ADD(order_date, INTERVAL 30 DAY) AS due_date,
    DATEDIFF(NOW(), created_at) AS days_old,
    YEAR(order_date) AS order_year
FROM orders;

-- Conditional functions
SELECT 
    product_name,
    CASE 
        WHEN price < 10 THEN 'Cheap'
        WHEN price < 50 THEN 'Moderate'
        ELSE 'Expensive'
    END AS price_category,
    COALESCE(discount_price, price) AS final_price,
    NULLIF(quantity, 0) AS safe_quantity
FROM products;

Best Practices

Use Inline TVF: Prefer inline over multi-statement for better performance
Mark Deterministic: Declare functions as DETERMINISTIC when applicable
Avoid in WHERE: Don't use functions on indexed columns in WHERE clause
Handle NULLs: Always handle NULL input values properly
Keep Simple: Complex logic belongs in stored procedures
Document Well: Comment parameters, return values, and logic
Test Performance: Compare function vs inline query performance
Use Built-in: Prefer built-in functions over custom when possible

Loading likes...

Comments

Share a question, correction, or practical insight about this article.

Loading approved comments...