Full Stack • Java • System Design • Cloud • AI Engineering

SQL Functions

Master SQL functions with Mermaid diagrams covering scalar, table-valued, built-in, and user-defined functions with practical examples

SQL Functions are reusable code blocks that accept parameters, perform operations, and return values. They encapsulate logic and can be used in queries, stored procedures, and other functions.

Function Types Overview

graph TB
    A[SQL Functions] --> B[Scalar Functions]
    A --> C[Table-Valued Functions]
    A --> D[Built-in Functions]
    
    B --> B1[Return single value]
    B --> B2[Used in expressions]
    
    C --> C1[Inline TVF]
    C --> C2[Multi-Statement TVF]
    
    C1 --> C1A[Single SELECT]
    C1 --> C1B[Better performance]
    
    C2 --> C2A[Multiple statements]
    C2 --> C2B[Complex logic]
    
    D --> D1[String, Math, Date]
    D --> D2[Aggregate, Conditional]
    
    style A fill:#2196F3
    style B fill:#4CAF50
    style C fill:#FF9800
    style D fill:#9C27B0

Key Points:

  • Scalar: Returns single value (number, string, date)
  • Table-Valued: Returns table result set
  • Inline TVF: Single SELECT, better performance
  • Multi-Statement TVF: Complex logic with variables
  • Built-in: Database-provided functions (UPPER, SUM, NOW)

Functions vs Stored Procedures

graph LR
    A[Functions] --> A1[Must return value]
    A --> A2[Can be used in SELECT]
    A --> A3[Cannot modify data]
    A --> A4[Deterministic possible]
    
    B[Stored Procedures] --> B1[Optional return]
    B --> B2[Cannot use in SELECT]
    B --> B3[Can modify data]
    B --> B4[Always non-deterministic]
    
    style A fill:#4CAF50
    style B fill:#FF9800

Key Points:

  • Functions: Return values, used in queries, read-only
  • Procedures: Execute actions, modify data, transactions
  • Use Functions: Calculations, transformations, validations
  • Use Procedures: Business logic, data modifications, workflows
  • Performance: Functions can be optimized by query planner

Scalar Function Execution

sequenceDiagram
    participant Q as Query
    participant F as Scalar Function
    participant DB as Database
    
    Q->>F: Call with parameters
    F->>DB: Execute logic
    DB->>F: Process data
    F->>F: Calculate result
    F->>Q: Return single value
    
    Note over Q,F: Used in SELECT, WHERE, etc.

Key Points:

  • Single Value: Returns one value per call
  • Inline Usage: Can be used anywhere expressions allowed
  • Parameters: Accept input parameters
  • Deterministic: Same input always returns same output
  • Performance: Efficient for simple calculations

Table-Valued Function Flow

flowchart TB
    A[Call TVF with parameters] --> B{Function Type?}
    B -->|Inline TVF| C[Single SELECT statement]
    B -->|Multi-Statement| D[Declare table variable]
    
    C --> E[Return result set]
    
    D --> F[Execute multiple statements]
    F --> G[Populate table variable]
    G --> H[Return table variable]
    
    E --> I[Use in FROM clause]
    H --> I
    
    style C fill:#4CAF50
    style D fill:#FF9800
    style I fill:#2196F3

Key Points:

  • Inline TVF: Single SELECT, acts like parameterized view
  • Multi-Statement: Complex logic, uses table variable
  • Performance: Inline TVF faster, optimized by query planner
  • Usage: Used in FROM clause like regular tables
  • Flexibility: Can JOIN with other tables

Deterministic vs Non-Deterministic

graph TB
    A[Function Behavior] --> B[Deterministic]
    A --> C[Non-Deterministic]
    
    B --> B1[Same input = Same output]
    B --> B2[Can be indexed]
    B --> B3[Examples: ABS, UPPER, CONCAT]
    
    C --> C1[Same input = Different output]
    C --> C2[Cannot be indexed]
    C --> C3[Examples: NOW, RAND, NEWID]
    
    style B fill:#4CAF50
    style B2 fill:#4CAF50
    style C fill:#F44336
    style C2 fill:#F44336

Key Points:

  • Deterministic: Predictable, same result for same input
  • Non-Deterministic: Varies, depends on time, random, etc.
  • Indexing: Only deterministic functions can be indexed
  • Performance: Deterministic functions can be cached
  • Best Practice: Mark user-defined functions as deterministic when possible

Built-in Function Categories

graph LR
    A[Built-in Functions] --> B[String]
    A --> C[Numeric]
    A --> D[Date/Time]
    A --> E[Aggregate]
    A --> F[Conditional]
    
    B --> B1[UPPER, LOWER, CONCAT]
    C --> C1[ABS, ROUND, CEILING]
    D --> D1[NOW, DATEADD, DATEDIFF]
    E --> E1[SUM, AVG, COUNT]
    F --> F1[CASE, COALESCE, NULLIF]
    
    style A fill:#2196F3
    style B fill:#4CAF50
    style C fill:#FF9800
    style D fill:#9C27B0
    style E fill:#F44336
    style F fill:#00BCD4

Key Points:

  • String: Text manipulation (UPPER, SUBSTRING, TRIM)
  • Numeric: Mathematical operations (ROUND, ABS, POWER)
  • Date/Time: Date calculations (NOW, DATEADD, EXTRACT)
  • Aggregate: Group calculations (SUM, AVG, COUNT, MAX)
  • Conditional: Logic operations (CASE, COALESCE, IF)

Function Performance Impact

flowchart LR
    A[Query with Function] --> B{Function in WHERE?}
    B -->|Yes| C[Cannot use index]
    B -->|No| D[Can use index]
    
    C --> E[Full table scan]
    E --> F[Slow performance]
    
    D --> G[Index seek]
    G --> H[Fast performance]
    
    style C fill:#F44336
    style E fill:#F44336
    style F fill:#F44336
    style D fill:#4CAF50
    style G fill:#4CAF50
    style H fill:#4CAF50

Key Points:

  • WHERE Clause: Functions on columns prevent index usage
  • Sargable: Avoid functions in WHERE for better performance
  • Alternative: Rewrite queries to avoid column functions
  • Computed Columns: Pre-calculate and index results
  • Best Practice: Apply functions to constants, not columns

User-Defined Function Lifecycle

sequenceDiagram
    participant Dev as Developer
    participant DB as Database
    participant App as Application
    
    Dev->>DB: CREATE FUNCTION
    DB->>DB: Parse and compile
    DB->>DB: Store in system catalog
    
    App->>DB: Call function
    DB->>DB: Execute function
    DB->>App: Return result
    
    Dev->>DB: ALTER FUNCTION (update)
    Dev->>DB: DROP FUNCTION (remove)

Key Points:

  • CREATE: Define function with parameters and logic
  • Compilation: Parsed and stored in database
  • Execution: Called from queries or procedures
  • Modification: ALTER to update, DROP to remove
  • Permissions: Control who can create/execute functions

Code Examples

Scalar Function

-- MySQL scalar function
DELIMITER //
CREATE FUNCTION calculate_discount(
    price DECIMAL(10,2),
    discount_percent INT
)
RETURNS DECIMAL(10,2)
DETERMINISTIC
BEGIN
    RETURN price * (1 - discount_percent / 100.0);
END //
DELIMITER ;

-- Usage
SELECT 
    product_name,
    price,
    calculate_discount(price, 10) AS discounted_price
FROM products;

-- SQL Server scalar function
CREATE FUNCTION dbo.GetFullName(
    @FirstName VARCHAR(50),
    @LastName VARCHAR(50)
)
RETURNS VARCHAR(101)
AS
BEGIN
    RETURN @FirstName + ' ' + @LastName;
END;

-- Usage
SELECT dbo.GetFullName(first_name, last_name) AS full_name
FROM employees;

Inline Table-Valued Function

-- SQL Server inline TVF
CREATE FUNCTION dbo.GetCustomerOrders(@CustomerId INT)
RETURNS TABLE
AS
RETURN (
    SELECT 
        order_id,
        order_date,
        total_amount,
        status
    FROM orders
    WHERE customer_id = @CustomerId
);

-- Usage
SELECT * FROM dbo.GetCustomerOrders(123);

-- Join with other tables
SELECT 
    c.customer_name,
    o.order_id,
    o.total_amount
FROM customers c
CROSS APPLY dbo.GetCustomerOrders(c.customer_id) o;

Multi-Statement Table-Valued Function

-- SQL Server multi-statement TVF
CREATE FUNCTION dbo.GetEmployeeHierarchy(@ManagerId INT)
RETURNS @Result TABLE (
    employee_id INT,
    employee_name VARCHAR(100),
    level INT
)
AS
BEGIN
    DECLARE @Level INT = 1;
    
    -- Insert direct reports
    INSERT INTO @Result
    SELECT employee_id, employee_name, @Level
    FROM employees
    WHERE manager_id = @ManagerId;
    
    -- Insert indirect reports (simplified)
    WHILE EXISTS (SELECT 1 FROM @Result WHERE level = @Level)
    BEGIN
        SET @Level = @Level + 1;
        
        INSERT INTO @Result
        SELECT e.employee_id, e.employee_name, @Level
        FROM employees e
        INNER JOIN @Result r ON e.manager_id = r.employee_id
        WHERE r.level = @Level - 1;
    END;
    
    RETURN;
END;

-- Usage
SELECT * FROM dbo.GetEmployeeHierarchy(1);

Built-in Functions

-- String functions
SELECT 
    UPPER(first_name) AS upper_name,
    LOWER(last_name) AS lower_name,
    CONCAT(first_name, ' ', last_name) AS full_name,
    SUBSTRING(email, 1, CHARINDEX('@', email) - 1) AS username,
    LENGTH(description) AS desc_length
FROM users;

-- Numeric functions
SELECT 
    ABS(-15.5) AS absolute_value,
    ROUND(123.456, 2) AS rounded,
    CEILING(123.45) AS ceiling_value,
    FLOOR(123.99) AS floor_value,
    POWER(2, 3) AS power_result;

-- Date functions
SELECT 
    NOW() AS current_datetime,
    CURDATE() AS current_date,
    DATE_ADD(order_date, INTERVAL 30 DAY) AS due_date,
    DATEDIFF(NOW(), created_at) AS days_old,
    YEAR(order_date) AS order_year
FROM orders;

-- Conditional functions
SELECT 
    product_name,
    CASE 
        WHEN price < 10 THEN 'Cheap'
        WHEN price < 50 THEN 'Moderate'
        ELSE 'Expensive'
    END AS price_category,
    COALESCE(discount_price, price) AS final_price,
    NULLIF(quantity, 0) AS safe_quantity
FROM products;

Best Practices

  1. Use Inline TVF: Prefer inline over multi-statement for better performance
  2. Mark Deterministic: Declare functions as DETERMINISTIC when applicable
  3. Avoid in WHERE: Don't use functions on indexed columns in WHERE clause
  4. Handle NULLs: Always handle NULL input values properly
  5. Keep Simple: Complex logic belongs in stored procedures
  6. Document Well: Comment parameters, return values, and logic
  7. Test Performance: Compare function vs inline query performance
  8. Use Built-in: Prefer built-in functions over custom when possible

Loading likes...

Comments

Share a question, correction, or practical insight about this article.

Loading approved comments...