Full Stack • Java • System Design • Cloud • AI Engineering

SQL2026-06-08

SQL Basics - Complete Guide

Comprehensive guide to SQL fundamentals including databases, tables, CRUD operations, filtering, sorting, joins, grouping, constraints, transactions, indexes, and real-world examples with best practices.

SQL Basics - Complete Guide

SQL (Structured Query Language) is the standard language for managing and manipulating relational databases. It's essential for backend developers, data engineers, and anyone working with data.

Why Learn SQL?

  • Universal: Works across all major database systems
  • Powerful: Handle millions of records efficiently
  • In-Demand: Required skill for most tech jobs
  • Declarative: Focus on what you want, not how to get it
  • Standardized: Supported by most major database systems
  • Easy to Learn: Simple syntax and concepts

What is a Database?

A database is an organized collection of structured data stored electronically. It allows efficient storage, retrieval, and management of information.

Database Hierarchy

Database Management System (DBMS)
    |
    +-- Database (e.g., company_db)
         |
         +-- Schema (e.g., sales, hr)
              |
              +-- Tables (e.g., employees, orders)
                   |
                   +-- Rows (Records)
                   +-- Columns (Fields)

Real-World Example

E-Commerce Database
    |
    +-- customers (user accounts)
    +-- products (inventory)
    +-- orders (purchases)
    +-- order_items (order details)
    +-- payments (transactions)
    +-- reviews (customer feedback)

What is a Table?

A table is a collection of related data organized in rows and columns. Each table represents a specific entity in your system.

Table Structure

Table: employees
+----+----------+-------------+--------+------------+
| id | name     | department  | salary | hire_date  |
+----+----------+-------------+--------+------------+
| 1  | Venu     | Engineering | 150000 | 2023-01-15 |
| 2  | John     | Finance     | 120000 | 2023-02-20 |
| 3  | Sarah    | Engineering | 140000 | 2023-03-10 |
| 4  | Mike     | Marketing   | 110000 | 2023-04-05 |
+----+----------+-------------+--------+------------+

Components:

  • Rows (Records): Individual entries (e.g., one employee)
  • Columns (Fields): Attributes (e.g., name, salary)
  • Primary Key: Unique identifier (e.g., id)

SQL Categories

SQL commands are divided into five categories:

1. DDL (Data Definition Language)

Purpose: Define and modify database structure

CREATE    -- Create new database objects
ALTER     -- Modify existing objects
DROP      -- Delete objects
TRUNCATE  -- Remove all data from table
RENAME    -- Rename objects

2. DML (Data Manipulation Language)

Purpose: Manipulate data within tables

INSERT    -- Add new records
UPDATE    -- Modify existing records
DELETE    -- Remove records
MERGE     -- Insert or update (upsert)

3. DQL (Data Query Language)

Purpose: Retrieve data from database

SELECT    -- Query and retrieve data

4. DCL (Data Control Language)

Purpose: Control access and permissions

GRANT     -- Give privileges
REVOKE    -- Remove privileges

5. TCL (Transaction Control Language)

Purpose: Manage database transactions

COMMIT    -- Save changes permanently
ROLLBACK  -- Undo changes
SAVEPOINT -- Set transaction checkpoint

Data Types

Understanding data types is crucial for efficient database design.

Numeric Types

-- Integer types
INT           -- Standard integer (-2B to 2B)
BIGINT        -- Large integer (-9 quintillion to 9 quintillion)
SMALLINT      -- Small integer (-32K to 32K)
TINYINT       -- Tiny integer (0 to 255)

-- Decimal types
DECIMAL(10,2) -- Fixed precision (e.g., 12345678.90)
NUMERIC(10,2) -- Same as DECIMAL
FLOAT         -- Approximate floating point
DOUBLE        -- Double precision floating point

String Types

CHAR(10)      -- Fixed length (always 10 chars)
VARCHAR(100)  -- Variable length (up to 100 chars)
TEXT          -- Large text (up to 65,535 chars)
LONGTEXT      -- Very large text (up to 4GB)

Date and Time Types

DATE          -- Date only (YYYY-MM-DD)
TIME          -- Time only (HH:MM:SS)
DATETIME      -- Date and time
TIMESTAMP     -- Date, time with timezone
YEAR          -- Year only (YYYY)

Other Types

BOOLEAN       -- TRUE or FALSE
BLOB          -- Binary large object
JSON          -- JSON data (PostgreSQL, MySQL 5.7+)
ENUM          -- Enumerated list of values

DDL Commands in Detail

CREATE DATABASE

-- Create a new database
CREATE DATABASE company_db;

-- Create with character set
CREATE DATABASE company_db
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci;

-- Use the database
USE company_db;

CREATE TABLE

-- Basic table creation
CREATE TABLE employees (
    id INT PRIMARY KEY AUTO_INCREMENT,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) UNIQUE,
    department VARCHAR(50),
    salary DECIMAL(10,2),
    hire_date DATE,
    is_active BOOLEAN DEFAULT TRUE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Table with multiple constraints
CREATE TABLE orders (
    order_id INT PRIMARY KEY AUTO_INCREMENT,
    customer_id INT NOT NULL,
    order_date DATE NOT NULL,
    total_amount DECIMAL(10,2) CHECK (total_amount >= 0),
    status VARCHAR(20) DEFAULT 'pending',
    FOREIGN KEY (customer_id) REFERENCES customers(id)
        ON DELETE CASCADE
        ON UPDATE CASCADE
);

ALTER TABLE

-- Add a new column
ALTER TABLE employees
ADD COLUMN phone VARCHAR(15);

-- Modify column data type
ALTER TABLE employees
MODIFY COLUMN salary DECIMAL(12,2);

-- Rename column
ALTER TABLE employees
RENAME COLUMN phone TO mobile_number;

-- Drop column
ALTER TABLE employees
DROP COLUMN mobile_number;

-- Add constraint
ALTER TABLE employees
ADD CONSTRAINT chk_salary CHECK (salary > 0);

-- Add foreign key
ALTER TABLE orders
ADD CONSTRAINT fk_customer
FOREIGN KEY (customer_id) REFERENCES customers(id);

DROP and TRUNCATE

-- Drop table (removes structure and data)
DROP TABLE IF EXISTS temp_table;

-- Truncate table (removes all data, keeps structure)
TRUNCATE TABLE logs;

-- Difference:
-- DROP: Removes everything, cannot rollback
-- TRUNCATE: Faster than DELETE, resets auto-increment
-- DELETE: Slower, can rollback, doesn't reset auto-increment

DML Commands in Detail

INSERT

-- Insert single row
INSERT INTO employees (first_name, last_name, email, department, salary)
VALUES ('Venu', 'Gopal', '[email protected]', 'Engineering', 150000);

-- Insert multiple rows
INSERT INTO employees (first_name, last_name, department, salary)
VALUES 
    ('John', 'Doe', 'Finance', 120000),
    ('Sarah', 'Smith', 'Engineering', 140000),
    ('Mike', 'Johnson', 'Marketing', 110000);

-- Insert from another table
INSERT INTO employees_backup
SELECT * FROM employees
WHERE hire_date < '2023-01-01';

-- Insert with default values
INSERT INTO employees (first_name, last_name)
VALUES ('Jane', 'Doe');  -- Other columns get default values

UPDATE

-- Update single record
UPDATE employees
SET salary = 160000
WHERE id = 1;

-- Update multiple columns
UPDATE employees
SET 
    salary = salary * 1.10,
    department = 'Senior Engineering'
WHERE id = 1;

-- Update with condition
UPDATE employees
SET salary = salary * 1.05
WHERE department = 'Engineering'
  AND hire_date < '2023-01-01';

-- Update using subquery
UPDATE employees
SET salary = (
    SELECT AVG(salary) * 1.2
    FROM employees
    WHERE department = 'Engineering'
)
WHERE id = 1;

-- ⚠️ DANGER: Update without WHERE updates ALL rows!
-- UPDATE employees SET salary = 50000;  -- DON'T DO THIS!

DELETE

-- Delete specific record
DELETE FROM employees
WHERE id = 1;

-- Delete with condition
DELETE FROM employees
WHERE department = 'Marketing'
  AND hire_date < '2020-01-01';

-- Delete using subquery
DELETE FROM employees
WHERE salary < (
    SELECT AVG(salary)
    FROM employees
);

-- ⚠️ DANGER: Delete without WHERE deletes ALL rows!
-- DELETE FROM employees;  -- DON'T DO THIS!

-- Safe practice: Always test with SELECT first
SELECT * FROM employees WHERE id = 1;  -- Test
DELETE FROM employees WHERE id = 1;    -- Then delete

DQL - SELECT Queries

Basic SELECT

-- Select all columns
SELECT * FROM employees;

-- Select specific columns
SELECT first_name, last_name, salary
FROM employees;

-- Select with alias
SELECT 
    first_name AS "First Name",
    last_name AS "Last Name",
    salary AS "Annual Salary"
FROM employees;

-- Select with calculations
SELECT 
    first_name,
    salary,
    salary * 12 AS annual_salary,
    salary * 0.3 AS tax_amount
FROM employees;

WHERE Clause - Filtering

-- Comparison operators
SELECT * FROM employees WHERE salary > 100000;
SELECT * FROM employees WHERE department = 'Engineering';
SELECT * FROM employees WHERE hire_date >= '2023-01-01';

-- Logical operators
SELECT * FROM employees
WHERE department = 'Engineering'
  AND salary > 120000;

SELECT * FROM employees
WHERE department = 'Engineering'
   OR department = 'Finance';

SELECT * FROM employees
WHERE NOT department = 'Marketing';

-- BETWEEN operator
SELECT * FROM employees
WHERE salary BETWEEN 100000 AND 150000;

-- IN operator
SELECT * FROM employees
WHERE department IN ('Engineering', 'Finance', 'Marketing');

-- LIKE operator (pattern matching)
SELECT * FROM employees
WHERE first_name LIKE 'J%';        -- Starts with J

SELECT * FROM employees
WHERE email LIKE '%@gmail.com';    -- Ends with @gmail.com

SELECT * FROM employees
WHERE first_name LIKE '_ohn';      -- Second char is 'o', then 'hn'

-- IS NULL / IS NOT NULL
SELECT * FROM employees
WHERE email IS NULL;

SELECT * FROM employees
WHERE email IS NOT NULL;

ORDER BY - Sorting

-- Sort ascending (default)
SELECT * FROM employees
ORDER BY salary ASC;

-- Sort descending
SELECT * FROM employees
ORDER BY salary DESC;

-- Sort by multiple columns
SELECT * FROM employees
ORDER BY department ASC, salary DESC;

-- Sort with NULL handling
SELECT * FROM employees
ORDER BY email NULLS FIRST;  -- PostgreSQL

SELECT * FROM employees
ORDER BY email NULLS LAST;   -- PostgreSQL

LIMIT and OFFSET - Pagination

-- Get first 10 records
SELECT * FROM employees
LIMIT 10;

-- Get records 11-20 (pagination)
SELECT * FROM employees
LIMIT 10 OFFSET 10;

-- Top 5 highest paid employees
SELECT * FROM employees
ORDER BY salary DESC
LIMIT 5;

-- MySQL alternative syntax
SELECT * FROM employees
LIMIT 10, 10;  -- OFFSET 10, LIMIT 10

DISTINCT - Remove Duplicates

-- Get unique departments
SELECT DISTINCT department
FROM employees;

-- Get unique combinations
SELECT DISTINCT department, hire_date
FROM employees;

-- Count unique values
SELECT COUNT(DISTINCT department)
FROM employees;

Aggregate Functions

Aggregate functions perform calculations on multiple rows and return a single value.

COUNT

-- Count all rows
SELECT COUNT(*) FROM employees;

-- Count non-null values
SELECT COUNT(email) FROM employees;

-- Count distinct values
SELECT COUNT(DISTINCT department) FROM employees;

-- Count with condition
SELECT COUNT(*) FROM employees
WHERE salary > 100000;

SUM

-- Total salary expense
SELECT SUM(salary) AS total_salary
FROM employees;

-- Sum with condition
SELECT SUM(salary) AS engineering_salary
FROM employees
WHERE department = 'Engineering';

AVG

-- Average salary
SELECT AVG(salary) AS average_salary
FROM employees;

-- Average by department
SELECT 
    department,
    AVG(salary) AS avg_salary
FROM employees
GROUP BY department;

-- Round average
SELECT ROUND(AVG(salary), 2) AS avg_salary
FROM employees;

MIN and MAX

-- Minimum and maximum salary
SELECT 
    MIN(salary) AS lowest_salary,
    MAX(salary) AS highest_salary
FROM employees;

-- Earliest and latest hire date
SELECT 
    MIN(hire_date) AS first_hire,
    MAX(hire_date) AS latest_hire
FROM employees;

GROUP BY and HAVING

GROUP BY

-- Count employees by department
SELECT 
    department,
    COUNT(*) AS employee_count
FROM employees
GROUP BY department;

-- Average salary by department
SELECT 
    department,
    AVG(salary) AS avg_salary,
    MIN(salary) AS min_salary,
    MAX(salary) AS max_salary
FROM employees
GROUP BY department;

-- Multiple grouping columns
SELECT 
    department,
    YEAR(hire_date) AS hire_year,
    COUNT(*) AS count
FROM employees
GROUP BY department, YEAR(hire_date);

HAVING

HAVING filters grouped data (use after GROUP BY). WHERE filters before grouping.

-- Departments with more than 5 employees
SELECT 
    department,
    COUNT(*) AS employee_count
FROM employees
GROUP BY department
HAVING COUNT(*) > 5;

-- Departments with average salary > 120000
SELECT 
    department,
    AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 120000;

-- Combined WHERE and HAVING
SELECT 
    department,
    AVG(salary) AS avg_salary
FROM employees
WHERE hire_date >= '2023-01-01'  -- Filter before grouping
GROUP BY department
HAVING AVG(salary) > 100000;     -- Filter after grouping

JOINs - Combining Tables

JOINs are used to combine rows from two or more tables based on related columns.

Sample Tables for JOIN Examples

-- Employees table
CREATE TABLE employees (
    emp_id INT PRIMARY KEY,
    name VARCHAR(50),
    dept_id INT,
    salary DECIMAL(10,2)
);

-- Departments table
CREATE TABLE departments (
    dept_id INT PRIMARY KEY,
    dept_name VARCHAR(50),
    location VARCHAR(50)
);

-- Sample data
INSERT INTO employees VALUES
(1, 'Venu', 1, 150000),
(2, 'John', 2, 120000),
(3, 'Sarah', 1, 140000),
(4, 'Mike', NULL, 110000);  -- No department

INSERT INTO departments VALUES
(1, 'Engineering', 'New York'),
(2, 'Finance', 'Chicago'),
(3, 'Marketing', 'San Francisco');  -- No employees

INNER JOIN

Returns only matching records from both tables.

SELECT 
    e.name,
    e.salary,
    d.dept_name,
    d.location
FROM employees e
INNER JOIN departments d ON e.dept_id = d.dept_id;

-- Result: Only employees with departments (Venu, John, Sarah)
-- Mike (no dept) and Marketing dept (no employees) are excluded

Visual Representation:

Employees ∩ Departments
(Only matching records)

LEFT JOIN (LEFT OUTER JOIN)

Returns all records from left table and matching records from right table.

SELECT 
    e.name,
    e.salary,
    d.dept_name,
    d.location
FROM employees e
LEFT JOIN departments d ON e.dept_id = d.dept_id;

-- Result: All employees including Mike (dept_name will be NULL)
-- Marketing department is excluded

Visual Representation:

All Employees + Matching Departments
(Mike shows with NULL department)

RIGHT JOIN (RIGHT OUTER JOIN)

Returns all records from right table and matching records from left table.

SELECT 
    e.name,
    e.salary,
    d.dept_name,
    d.location
FROM employees e
RIGHT JOIN departments d ON e.dept_id = d.dept_id;

-- Result: All departments including Marketing (name will be NULL)
-- Mike is excluded

FULL OUTER JOIN

Returns all records from both tables, with NULLs where no match exists.

SELECT 
    e.name,
    e.salary,
    d.dept_name,
    d.location
FROM employees e
FULL OUTER JOIN departments d ON e.dept_id = d.dept_id;

-- Result: All employees AND all departments
-- Mike shows with NULL department
-- Marketing shows with NULL employee

CROSS JOIN

Returns Cartesian product (all possible combinations).

SELECT 
    e.name,
    d.dept_name
FROM employees e
CROSS JOIN departments d;

-- Result: 4 employees × 3 departments = 12 rows
-- Every employee paired with every department

SELF JOIN

Join a table with itself.

-- Find employees with same salary
SELECT 
    e1.name AS employee1,
    e2.name AS employee2,
    e1.salary
FROM employees e1
JOIN employees e2 ON e1.salary = e2.salary
WHERE e1.emp_id < e2.emp_id;

-- Find employee hierarchy (if manager_id exists)
SELECT 
    e.name AS employee,
    m.name AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.emp_id;

Multiple JOINs

-- Join three tables
SELECT 
    e.name,
    d.dept_name,
    p.project_name
FROM employees e
INNER JOIN departments d ON e.dept_id = d.dept_id
INNER JOIN projects p ON e.emp_id = p.emp_id;

Subqueries

A subquery is a query nested inside another query.

Subquery in WHERE Clause

-- Employees earning above average
SELECT name, salary
FROM employees
WHERE salary > (
    SELECT AVG(salary)
    FROM employees
);

-- Employees in Engineering department
SELECT name, salary
FROM employees
WHERE dept_id = (
    SELECT dept_id
    FROM departments
    WHERE dept_name = 'Engineering'
);

Subquery with IN

-- Employees in specific departments
SELECT name, salary
FROM employees
WHERE dept_id IN (
    SELECT dept_id
    FROM departments
    WHERE location IN ('New York', 'Chicago')
);

Subquery in SELECT

-- Show salary difference from average
SELECT 
    name,
    salary,
    (SELECT AVG(salary) FROM employees) AS avg_salary,
    salary - (SELECT AVG(salary) FROM employees) AS difference
FROM employees;

Subquery in FROM (Derived Table)

-- Average salary by department, then filter
SELECT dept_name, avg_salary
FROM (
    SELECT 
        d.dept_name,
        AVG(e.salary) AS avg_salary
    FROM employees e
    JOIN departments d ON e.dept_id = d.dept_id
    GROUP BY d.dept_name
) AS dept_avg
WHERE avg_salary > 120000;

Correlated Subquery

A subquery that references columns from the outer query.

-- Employees earning above their department average
SELECT e1.name, e1.salary, e1.dept_id
FROM employees e1
WHERE salary > (
    SELECT AVG(e2.salary)
    FROM employees e2
    WHERE e2.dept_id = e1.dept_id
);

EXISTS

Check if subquery returns any rows.

-- Customers who have placed orders
SELECT c.name
FROM customers c
WHERE EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.id
);

-- Departments with employees
SELECT d.dept_name
FROM departments d
WHERE EXISTS (
    SELECT 1
    FROM employees e
    WHERE e.dept_id = d.dept_id
);

Constraints

Constraints enforce rules on data in tables.

PRIMARY KEY

-- Single column primary key
CREATE TABLE employees (
    id INT PRIMARY KEY,
    name VARCHAR(50)
);

-- Composite primary key
CREATE TABLE order_items (
    order_id INT,
    product_id INT,
    quantity INT,
    PRIMARY KEY (order_id, product_id)
);

-- Add primary key to existing table
ALTER TABLE employees
ADD PRIMARY KEY (id);

FOREIGN KEY

-- Create with foreign key
CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    customer_id INT,
    FOREIGN KEY (customer_id) 
        REFERENCES customers(id)
        ON DELETE CASCADE
        ON UPDATE CASCADE
);

-- ON DELETE options:
-- CASCADE: Delete child records when parent is deleted
-- SET NULL: Set foreign key to NULL
-- RESTRICT: Prevent deletion if child records exist
-- NO ACTION: Same as RESTRICT

UNIQUE

-- Unique constraint
CREATE TABLE users (
    id INT PRIMARY KEY,
    email VARCHAR(100) UNIQUE,
    username VARCHAR(50) UNIQUE
);

-- Add unique constraint
ALTER TABLE users
ADD CONSTRAINT uk_email UNIQUE (email);

NOT NULL

CREATE TABLE employees (
    id INT PRIMARY KEY,
    name VARCHAR(50) NOT NULL,
    email VARCHAR(100) NOT NULL
);

CHECK

-- Check constraint
CREATE TABLE employees (
    id INT PRIMARY KEY,
    name VARCHAR(50),
    age INT CHECK (age >= 18),
    salary DECIMAL(10,2) CHECK (salary > 0)
);

-- Add check constraint
ALTER TABLE employees
ADD CONSTRAINT chk_salary CHECK (salary BETWEEN 30000 AND 500000);

DEFAULT

CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    order_date DATE DEFAULT CURRENT_DATE,
    status VARCHAR(20) DEFAULT 'pending',
    is_active BOOLEAN DEFAULT TRUE
);

Transactions

A transaction is a sequence of operations performed as a single logical unit of work.

ACID Properties

Atomicity: All operations succeed or all fail (no partial completion) Consistency: Database remains in valid state before and after transaction Isolation: Concurrent transactions don't interfere with each other Durability: Committed changes are permanent

Transaction Commands

-- Start transaction
BEGIN TRANSACTION;  -- or START TRANSACTION; or BEGIN;

-- Perform operations
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

-- Commit (save changes)
COMMIT;

-- Or rollback (undo changes)
ROLLBACK;

Real-World Example: Money Transfer

BEGIN TRANSACTION;

-- Deduct from sender
UPDATE accounts 
SET balance = balance - 500 
WHERE account_id = 'A123';

-- Add to receiver
UPDATE accounts 
SET balance = balance + 500 
WHERE account_id = 'B456';

-- Check if both succeeded
IF @@ERROR = 0
    COMMIT;
ELSE
    ROLLBACK;

Savepoints

BEGIN TRANSACTION;

INSERT INTO orders VALUES (1, 'Order 1');
SAVEPOINT sp1;

INSERT INTO orders VALUES (2, 'Order 2');
SAVEPOINT sp2;

INSERT INTO orders VALUES (3, 'Order 3');

-- Rollback to sp2 (keeps Order 1 and 2, removes Order 3)
ROLLBACK TO SAVEPOINT sp2;

COMMIT;

Indexes

Indexes improve query performance by creating a data structure for fast lookups.

Create Index

-- Single column index
CREATE INDEX idx_employee_name 
ON employees(name);

-- Composite index
CREATE INDEX idx_dept_salary 
ON employees(dept_id, salary);

-- Unique index
CREATE UNIQUE INDEX idx_email 
ON employees(email);

When to Use Indexes

✅ Use indexes for:

  • Columns in WHERE clauses
  • Columns in JOIN conditions
  • Columns in ORDER BY
  • Foreign key columns
  • Columns with high selectivity

❌ Avoid indexes for:

  • Small tables
  • Columns with frequent updates
  • Columns with low selectivity (e.g., boolean)
  • Tables with heavy INSERT/UPDATE operations

Index Performance

-- Without index (Table Scan)
SELECT * FROM employees WHERE name = 'Venu';
-- Scans all rows: O(n)

-- With index (Index Seek)
CREATE INDEX idx_name ON employees(name);
SELECT * FROM employees WHERE name = 'Venu';
-- Uses index: O(log n)

Drop Index

DROP INDEX idx_employee_name ON employees;

Views

A view is a virtual table based on a SQL query.

Create View

-- Simple view
CREATE VIEW active_employees AS
SELECT id, name, department, salary
FROM employees
WHERE is_active = TRUE;

-- Complex view with joins
CREATE VIEW employee_details AS
SELECT 
    e.name,
    e.salary,
    d.dept_name,
    d.location
FROM employees e
JOIN departments d ON e.dept_id = d.dept_id;

-- Use view like a table
SELECT * FROM active_employees;
SELECT * FROM employee_details WHERE dept_name = 'Engineering';

Benefits of Views

  • Security: Hide sensitive columns
  • Simplicity: Simplify complex queries
  • Consistency: Ensure consistent data access
  • Abstraction: Hide underlying table structure

Update View

-- Replace view
CREATE OR REPLACE VIEW active_employees AS
SELECT id, name, department, salary, hire_date
FROM employees
WHERE is_active = TRUE;

Drop View

DROP VIEW IF EXISTS active_employees;

String Functions

-- Concatenation
SELECT CONCAT(first_name, ' ', last_name) AS full_name
FROM employees;

-- Uppercase/Lowercase
SELECT UPPER(name), LOWER(name)
FROM employees;

-- Substring
SELECT SUBSTRING(name, 1, 3) AS short_name
FROM employees;

-- Length
SELECT name, LENGTH(name) AS name_length
FROM employees;

-- Trim whitespace
SELECT TRIM('  Hello  ') AS trimmed;

-- Replace
SELECT REPLACE(email, '@old.com', '@new.com')
FROM employees;

Date Functions

-- Current date and time
SELECT 
    CURRENT_DATE,
    CURRENT_TIME,
    CURRENT_TIMESTAMP,
    NOW();

-- Extract parts
SELECT 
    YEAR(hire_date) AS year,
    MONTH(hire_date) AS month,
    DAY(hire_date) AS day
FROM employees;

-- Date arithmetic
SELECT 
    hire_date,
    DATE_ADD(hire_date, INTERVAL 1 YEAR) AS anniversary,
    DATEDIFF(CURRENT_DATE, hire_date) AS days_employed
FROM employees;

-- Format date
SELECT DATE_FORMAT(hire_date, '%Y-%m-%d') AS formatted_date
FROM employees;

CASE Statement

-- Simple CASE
SELECT 
    name,
    salary,
    CASE 
        WHEN salary >= 150000 THEN 'High'
        WHEN salary >= 100000 THEN 'Medium'
        ELSE 'Low'
    END AS salary_grade
FROM employees;

-- CASE in aggregation
SELECT 
    department,
    COUNT(CASE WHEN salary > 120000 THEN 1 END) AS high_earners,
    COUNT(CASE WHEN salary <= 120000 THEN 1 END) AS regular_earners
FROM employees
GROUP BY department;

Common Table Expressions (CTE)

-- Simple CTE
WITH high_earners AS (
    SELECT name, salary, department
    FROM employees
    WHERE salary > 120000
)
SELECT * FROM high_earners
WHERE department = 'Engineering';

-- Multiple CTEs
WITH 
dept_avg AS (
    SELECT dept_id, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY dept_id
),
high_depts AS (
    SELECT dept_id
    FROM dept_avg
    WHERE avg_salary > 120000
)
SELECT e.name, e.salary
FROM employees e
JOIN high_depts h ON e.dept_id = h.dept_id;

Window Functions

Window functions perform calculations across rows related to the current row.

-- ROW_NUMBER: Assign unique row numbers
SELECT 
    name,
    salary,
    ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num
FROM employees;

-- RANK: Rank with gaps for ties
SELECT 
    name,
    salary,
    RANK() OVER (ORDER BY salary DESC) AS rank
FROM employees;

-- DENSE_RANK: Rank without gaps
SELECT 
    name,
    salary,
    DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank
FROM employees;

-- PARTITION BY: Window per group
SELECT 
    name,
    department,
    salary,
    RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank
FROM employees;

-- LAG and LEAD: Access previous/next rows
SELECT 
    name,
    salary,
    LAG(salary) OVER (ORDER BY salary) AS prev_salary,
    LEAD(salary) OVER (ORDER BY salary) AS next_salary
FROM employees;

SQL Query Execution Order

Understanding execution order helps write better queries:

1. FROM       -- Get tables
2. JOIN       -- Combine tables
3. WHERE      -- Filter rows
4. GROUP BY   -- Group rows
5. HAVING     -- Filter groups
6. SELECT     -- Choose columns
7. DISTINCT   -- Remove duplicates
8. ORDER BY   -- Sort results
9. LIMIT      -- Restrict count

Example:

SELECT department, AVG(salary) AS avg_salary  -- 6
FROM employees                                 -- 1
WHERE hire_date >= '2023-01-01'               -- 3
GROUP BY department                            -- 4
HAVING AVG(salary) > 100000                   -- 5
ORDER BY avg_salary DESC                       -- 8
LIMIT 5;                                       -- 9

Real-World Examples

Example 1: E-Commerce Order Analysis

-- Find top 5 customers by total spending
SELECT 
    c.name,
    COUNT(o.order_id) AS total_orders,
    SUM(o.total_amount) AS total_spent
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE o.order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)
GROUP BY c.id, c.name
HAVING SUM(o.total_amount) > 1000
ORDER BY total_spent DESC
LIMIT 5;

Example 2: Employee Salary Analysis

-- Find employees earning above department average
SELECT 
    e.name,
    e.salary,
    d.dept_name,
    dept_avg.avg_salary
FROM employees e
JOIN departments d ON e.dept_id = d.dept_id
JOIN (
    SELECT dept_id, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY dept_id
) dept_avg ON e.dept_id = dept_avg.dept_id
WHERE e.salary > dept_avg.avg_salary
ORDER BY e.salary DESC;

Example 3: Product Inventory Management

-- Find products that need reordering
SELECT 
    p.product_name,
    p.current_stock,
    p.reorder_level,
    p.reorder_level - p.current_stock AS units_needed
FROM products p
WHERE p.current_stock < p.reorder_level
  AND p.is_active = TRUE
ORDER BY units_needed DESC;

Best Practices

Query Optimization

  1. **Use specific columns instead of SELECT ***
-- ❌ Bad
SELECT * FROM employees;

-- ✅ Good
SELECT id, name, salary FROM employees;
  1. Use indexes on frequently queried columns
CREATE INDEX idx_email ON employees(email);
  1. Avoid functions in WHERE clause
-- ❌ Bad (prevents index usage)
SELECT * FROM employees WHERE YEAR(hire_date) = 2023;

-- ✅ Good
SELECT * FROM employees 
WHERE hire_date BETWEEN '2023-01-01' AND '2023-12-31';
  1. Use EXISTS instead of IN for large datasets
-- ✅ Better performance
SELECT * FROM customers c
WHERE EXISTS (
    SELECT 1 FROM orders o WHERE o.customer_id = c.id
);

Data Integrity

  1. Always use WHERE in UPDATE/DELETE
-- ⚠️ Test first
SELECT * FROM employees WHERE id = 1;

-- Then execute
DELETE FROM employees WHERE id = 1;
  1. Use transactions for critical operations
BEGIN TRANSACTION;
-- Your operations
COMMIT;
  1. Implement proper constraints
CREATE TABLE orders (
    id INT PRIMARY KEY,
    customer_id INT NOT NULL,
    total DECIMAL(10,2) CHECK (total >= 0),
    FOREIGN KEY (customer_id) REFERENCES customers(id)
);

Code Quality

  1. Use meaningful aliases
-- ✅ Good
SELECT 
    e.name AS employee_name,
    d.dept_name AS department
FROM employees e
JOIN departments d ON e.dept_id = d.dept_id;
  1. Format queries for readability
SELECT 
    e.name,
    e.salary,
    d.dept_name
FROM employees e
INNER JOIN departments d 
    ON e.dept_id = d.dept_id
WHERE e.salary > 100000
ORDER BY e.salary DESC;
  1. Comment complex queries
-- Calculate year-over-year revenue growth
SELECT 
    YEAR(order_date) AS year,
    SUM(total_amount) AS revenue,
    LAG(SUM(total_amount)) OVER (ORDER BY YEAR(order_date)) AS prev_year_revenue
FROM orders
GROUP BY YEAR(order_date);

Common Interview Questions

1. Difference Between WHERE and HAVING?

WHERE: Filters rows before grouping HAVING: Filters groups after aggregation

-- WHERE: Filter before grouping
SELECT department, COUNT(*)
FROM employees
WHERE salary > 100000  -- Filter individual rows
GROUP BY department;

-- HAVING: Filter after grouping
SELECT department, AVG(salary)
FROM employees
GROUP BY department
HAVING AVG(salary) > 120000;  -- Filter aggregated results

2. Difference Between DELETE, TRUNCATE, and DROP?

Command Purpose Speed Rollback Reset Auto-increment
DELETE Remove rows Slow Yes No
TRUNCATE Remove all rows Fast No Yes
DROP Remove table Fast No N/A
DELETE FROM employees WHERE id = 1;  -- Remove specific rows
TRUNCATE TABLE employees;            -- Remove all rows, keep structure
DROP TABLE employees;                -- Remove table completely

3. Difference Between INNER JOIN and LEFT JOIN?

-- INNER JOIN: Only matching records
SELECT * FROM employees e
INNER JOIN departments d ON e.dept_id = d.dept_id;
-- Result: Only employees with departments

-- LEFT JOIN: All left table records
SELECT * FROM employees e
LEFT JOIN departments d ON e.dept_id = d.dept_id;
-- Result: All employees, even without departments (NULL for dept fields)

4. What is Normalization?

Normalization is organizing data to reduce redundancy and improve integrity.

1NF: Atomic values, no repeating groups 2NF: 1NF + No partial dependencies 3NF: 2NF + No transitive dependencies

5. Explain ACID Properties

  • Atomicity: All or nothing
  • Consistency: Valid state always
  • Isolation: Transactions don't interfere
  • Durability: Changes are permanent

6. What is an Index and When to Use It?

An index is a data structure that improves query speed.

Use when:

  • Frequent WHERE/JOIN columns
  • Large tables
  • Read-heavy operations

Avoid when:

  • Small tables
  • Write-heavy operations
  • Low selectivity columns

7. Difference Between Primary Key and Unique Key?

Feature Primary Key Unique Key
NULL values Not allowed Allowed (one NULL)
Per table One only Multiple allowed
Clustered index Yes (default) No

8. What is a Subquery?

A query nested inside another query.

-- Scalar subquery (returns single value)
SELECT * FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);

-- Correlated subquery (references outer query)
SELECT e1.name, e1.salary
FROM employees e1
WHERE salary > (
    SELECT AVG(e2.salary)
    FROM employees e2
    WHERE e2.dept_id = e1.dept_id
);

9. Explain Self Join

Joining a table with itself.

-- Find employees with same salary
SELECT 
    e1.name AS employee1,
    e2.name AS employee2,
    e1.salary
FROM employees e1
JOIN employees e2 ON e1.salary = e2.salary
WHERE e1.id < e2.id;

10. What is a View?

A virtual table based on a query result.

CREATE VIEW high_earners AS
SELECT name, salary, department
FROM employees
WHERE salary > 120000;

-- Use like a table
SELECT * FROM high_earners;

Performance Tuning Tips

1. Use EXPLAIN to Analyze Queries

EXPLAIN SELECT * FROM employees WHERE salary > 100000;
-- Shows execution plan, index usage, row scans

2. Avoid N+1 Query Problem

-- ❌ Bad: N+1 queries
SELECT * FROM orders;  -- 1 query
-- Then for each order:
SELECT * FROM customers WHERE id = ?;  -- N queries

-- ✅ Good: Single query with JOIN
SELECT o.*, c.*
FROM orders o
JOIN customers c ON o.customer_id = c.id;

3. Use Connection Pooling

Reuse database connections instead of creating new ones.

4. Batch Operations

-- ❌ Bad: Multiple inserts
INSERT INTO logs VALUES (1, 'Log 1');
INSERT INTO logs VALUES (2, 'Log 2');
INSERT INTO logs VALUES (3, 'Log 3');

-- ✅ Good: Batch insert
INSERT INTO logs VALUES 
    (1, 'Log 1'),
    (2, 'Log 2'),
    (3, 'Log 3');

5. Use Appropriate Data Types

-- ❌ Bad: VARCHAR for numbers
CREATE TABLE products (price VARCHAR(20));

-- ✅ Good: DECIMAL for money
CREATE TABLE products (price DECIMAL(10,2));

SQL Cheat Sheet

╔════════════════════════════════════════════════════════╗
║                    SQL QUICK REFERENCE                  ║
╠════════════════════════════════════════════════════════╣
║ DDL (Structure)                                         ║
║   CREATE, ALTER, DROP, TRUNCATE                        ║
║                                                         ║
║ DML (Data)                                             ║
║   INSERT, UPDATE, DELETE                               ║
║                                                         ║
║ DQL (Query)                                            ║
║   SELECT, WHERE, ORDER BY, GROUP BY, HAVING            ║
║                                                         ║
║ Joins                                                   ║
║   INNER, LEFT, RIGHT, FULL, CROSS, SELF                ║
║                                                         ║
║ Aggregates                                             ║
║   COUNT, SUM, AVG, MIN, MAX                            ║
║                                                         ║
║ Constraints                                            ║
║   PRIMARY KEY, FOREIGN KEY, UNIQUE, NOT NULL, CHECK    ║
║                                                         ║
║ Transactions                                           ║
║   BEGIN, COMMIT, ROLLBACK, SAVEPOINT                   ║
║                                                         ║
║ Performance                                            ║
║   INDEX, EXPLAIN, VIEW                                 ║
╚════════════════════════════════════════════════════════╝

Summary

SQL is the foundation of data management in modern applications. This guide covered:

Database Fundamentals: Tables, relationships, data types ✅ DDL Commands: CREATE, ALTER, DROP, TRUNCATE ✅ DML Commands: INSERT, UPDATE, DELETE ✅ DQL Queries: SELECT, WHERE, ORDER BY, GROUP BY ✅ Joins: INNER, LEFT, RIGHT, FULL, SELF ✅ Advanced Concepts: Subqueries, CTEs, Window Functions ✅ Constraints: PRIMARY KEY, FOREIGN KEY, UNIQUE ✅ Transactions: ACID properties, COMMIT, ROLLBACK ✅ Performance: Indexes, query optimization ✅ Best Practices: Code quality, data integrity

Next Steps

  1. Practice: Use online SQL playgrounds (SQLFiddle, DB Fiddle)
  2. Real Projects: Build applications with databases
  3. Advanced Topics: Stored procedures, triggers, partitioning
  4. Specific DBMS: Learn PostgreSQL, MySQL, or SQL Server specifics
  5. Interview Prep: Practice LeetCode SQL problems

Resources


Remember: SQL is a skill that improves with practice. Start with simple queries and gradually tackle more complex scenarios. Happy querying! 🚀