What is SQL?
SQL (Structured Query Language) is the standard declarative language for relational databases. You specify what data you want, not how to retrieve it — the query optimizer figures out the execution plan.
SQL sub-languages
DDL (Data Definition Language) — defines schema:
CREATE TABLE users (id INT PRIMARY KEY, name VARCHAR(100), email VARCHAR(255) UNIQUE)ALTER TABLE users ADD COLUMN age INT DEFAULT 0DROP TABLE usersCREATE INDEX idx_email ON users (email)
DML (Data Manipulation Language) — reads and modifies data:
SELECT— query dataINSERT INTO users (name, email) VALUES ('Alice', 'a@b.com')UPDATE users SET email = 'new@b.com' WHERE id = 1DELETE FROM users WHERE id = 1MERGE/INSERT ... ON CONFLICT(upsert) — insert or update depending on existence
DCL (Data Control Language) — manages permissions:
GRANT SELECT, INSERT ON users TO analyst_roleREVOKE DELETE ON users FROM intern_role
TCL (Transaction Control Language) — manages transactions:
BEGIN— start a transactionCOMMIT— make changes permanentROLLBACK— undo all changes since BEGINSAVEPOINT sp1/ROLLBACK TO sp1— partial rollback
SELECT anatomy
The logical order of a SELECT statement differs from the written order:
Written order:
SELECT department, AVG(salary) AS avg_sal
FROM employees
WHERE hire_date > '2020-01-01'
GROUP BY department
HAVING AVG(salary) > 80000
ORDER BY avg_sal DESC
LIMIT 10;
Logical execution order:
FROM employees— identify the source tableWHERE hire_date > '2020-01-01'— filter rowsGROUP BY department— form groupsHAVING AVG(salary) > 80000— filter groupsSELECT department, AVG(salary)— compute output columnsORDER BY avg_sal DESC— sort resultsLIMIT 10— truncate output
Understanding this order explains why you cannot use a column alias defined in SELECT within the WHERE clause — WHERE executes before SELECT.
NULL semantics (three-valued logic)
SQL uses three-valued logic: TRUE, FALSE, and UNKNOWN. Any comparison involving NULL yields UNKNOWN:
NULL = NULL→ UNKNOWN (not TRUE!)NULL > 5→ UNKNOWNNULL AND TRUE→ UNKNOWNNULL OR TRUE→ TRUE (short-circuit)WHERE x = NULLmatches nothing — useWHERE x IS NULL
COALESCE(a, b, c) returns the first non-NULL argument. It is the standard way to provide default values: COALESCE(nickname, first_name, 'Anonymous').
Declarative nature
SQL is declarative: you describe the result set, not the algorithm. The same query might be executed as:
- A sequential scan with a filter
- An index lookup followed by a heap fetch
- A merge join of two sorted inputs
- A hash join with parallel workers
The query optimizer chooses the best plan based on table statistics, index availability, and cost estimates. This is why the same SQL query can perform very differently depending on the database engine, indexes, and data distribution.
Real-Life: The Power of Declarative Queries
Consider a query to find the top 5 customers by total spending in the last 30 days:
SELECT c.name, SUM(o.amount) AS total
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE o.created_at > CURRENT_DATE - INTERVAL '30 days'
GROUP BY c.id, c.name
ORDER BY total DESC
LIMIT 5;
What the optimizer might decide:
- Use a range index scan on
orders.created_atto find recent orders (instead of scanning all orders). - For each matching order, use a nested loop join with an index lookup on
customers.id. - Or, if most orders are recent, do a hash join: build a hash table on customers, scan orders sequentially.
The developer wrote 7 lines of SQL. The optimizer generated a plan that would take 200+ lines of imperative code. And it automatically adapts: if you add an index on created_at, the plan changes without modifying the query.
NULL gotcha in practice:
-- This finds nothing even if some users have no email!
SELECT * FROM users WHERE email != 'test@example.com';
-- NULL != 'test@example.com' is UNKNOWN, filtered out
-- Fix:
SELECT * FROM users WHERE email != 'test@example.com' OR email IS NULL;
-- Or: WHERE COALESCE(email, '') != 'test@example.com'
Savepoints in practice:
BEGIN;
INSERT INTO orders (...) VALUES (...);
SAVEPOINT before_payment;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
-- Oops, insufficient funds:
ROLLBACK TO before_payment;
-- Order insert is still active, try a different payment method
COMMIT;