Introduction to MySQL

MySQL is an open-source relational database management system (RDBMS) that uses Structured Query Language (SQL) for managing and manipulating databases. It's one of the most popular database systems in the world.

Why MySQL?

MySQL provides several advantages for database management:

  • Open Source: Free to use with a large community support
  • Cross-Platform: Runs on various operating systems
  • High Performance: Optimized for speed and efficiency
  • Scalability: Handles large amounts of data
  • Security: Robust data protection and user management

Key Features of MySQL

MySQL offers a comprehensive set of features for database management:

1. Relational Database

MySQL organizes data into tables with rows and columns, allowing relationships between different data entities.

2. SQL Support

Full support for SQL standards, making it easy to learn and use with existing SQL knowledge.

3. ACID Compliance

Ensures database transactions are processed reliably (Atomicity, Consistency, Isolation, Durability).

4. Client/Server Architecture

Supports multiple clients connecting to a central database server simultaneously.

Note: This tutorial assumes you have basic knowledge of databases and SQL concepts. We'll start with the fundamentals and progress to advanced topics.

Basic MySQL Terminology

  • Database: A collection of related tables
  • Table: A structure that organizes data into rows and columns
  • Column: A specific field in a table that holds a particular type of data
  • Row: A single record in a table containing values for each column
  • Primary Key: A unique identifier for each row in a table
  • Foreign Key: A field that links to the primary key of another table

Select a topic from the left sidebar to dive deeper into specific MySQL concepts.

MySQL Installation & Setup

MySQL Installation: The process of setting up MySQL server and client tools on your system to create and manage databases. Proper installation ensures optimal performance and security.

Prerequisites

Before installing MySQL, ensure your system meets these requirements:

  • Operating System: Windows 10/11, macOS 10.14+, or Linux (Ubuntu 18.04+, CentOS 7+)
  • RAM: Minimum 2GB (4GB recommended for development)
  • Storage: At least 2GB free space
  • Administrator/root access for installation
  • Internet connection for downloading packages

Installation Methods by Platform

Choose the installation method that best fits your operating system:

Platform Recommended Method Alternative Methods
Windows MySQL Installer ZIP Archive, Docker
macOS DMG Package Homebrew, Docker
Linux (Ubuntu/Debian) APT Repository Docker, Tarball
Linux (CentOS/RHEL) YUM Repository Docker, Tarball

Step-by-Step Installation Guide

1. Windows Installation

Follow these detailed steps for Windows installation:

Windows Installation Steps
# Step 1: Download MySQL Installer
• Visit https://dev.mysql.com/downloads/installer/
• Download the MySQL Installer (recommended: web community version)

# Step 2: Run the Installer
• Double-click the downloaded .msi file
• Choose "Developer Default" for full development setup
• Click "Execute" to install required products

# Step 3: Product Configuration
• After installation, click "Next" to configure products
• For MySQL Server, choose "Standalone MySQL Server"
• Select "Config Type": Development Computer
• Use default TCP/IP port: 3306

# Step 4: Authentication Method
Recommended: Use Strong Password Encryption
• Set root password (minimum 8 characters, mix of letters, numbers, symbols)
Important: Remember this password!

# Step 5: Windows Service
• Configure MySQL as a Windows Service
• Service Name: MySQL80 (default)
• Start MySQL Server at System Startup

# Step 6: Apply Configuration
• Click "Execute" to apply all configurations
• Wait for all operations to complete successfully
• Click "Finish" to complete installation

2. macOS Installation

For macOS users, here's the installation process:

macOS Installation Steps
# Method 1: DMG Package (Recommended)
1. Download macOS DMG from MySQL website
2. Double-click the .dmg file to mount it
3. Run the .pkg installer and follow prompts
4. Remember the temporary root password shown at the end
5. Open Terminal and secure installation:
  mysql_secure_installation

# Method 2: Homebrew (Alternative)
1. Install Homebrew if not already installed:
  /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
2. Install MySQL:
  brew install mysql
3. Start MySQL service:
  brew services start mysql
4. Secure installation:
  mysql_secure_installation

3. Linux Installation (Ubuntu)

For Ubuntu and other Debian-based systems:

Ubuntu Installation Steps
# Update package index
sudo apt update

# Install MySQL Server
sudo apt install mysql-server

# Run security script
sudo mysql_secure_installation

# Secure installation prompts:
• Validate Password Plugin: Press y (recommended)
• Password Strength: Choose 2 (STRONG)
• Set root password and confirm
• Remove anonymous users: Press y
• Disallow root login remotely: Press y
• Remove test database: Press y
• Reload privilege tables: Press y

Post-Installation Verification

After installation, verify that MySQL is running correctly:

Verification Commands
# Check MySQL service status (Linux/macOS)
sudo systemctl status mysql
# Or on older systems:
sudo service mysql status

# Connect to MySQL server
mysql -u root -p

# Once connected, run these verification commands:
SELECT VERSION();
SHOW DATABASES;
STATUS;
# Expected output for version check:
+-----------+
| VERSION() |
+-----------+
| 8.0.33 |
+-----------+

# Expected databases:
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| sys |
+--------------------+

Initial Configuration

Configure MySQL for optimal development use:

Basic Configuration Steps
# Create a configuration file (my.cnf or my.ini)
# Location depends on OS:
# Windows: C:\ProgramData\MySQL\MySQL Server 8.0\my.ini
# Linux: /etc/mysql/my.cnf
# macOS: /usr/local/mysql/my.cnf

# Add these basic settings for development:
[mysqld]
default_authentication_plugin=mysql_native_password
character-set-server=utf8mb4
collation-server=utf8mb4_unicode_ci
max_connections=100

[client]
default-character-set=utf8mb4

Installing MySQL Workbench (GUI Tool)

For a graphical interface, install MySQL Workbench:

MySQL Workbench Installation
# Download from: https://dev.mysql.com/downloads/workbench/

# Windows:
• Download .msi installer and run it
• Choose complete installation
• Launch Workbench after installation

# macOS:
• Download .dmg file and drag to Applications
• Launch from Applications folder

# Linux (Ubuntu):
sudo apt install mysql-workbench

Troubleshooting Common Installation Issues

Common Issues and Solutions:
  • Port 3306 already in use: Stop other MySQL instances or change port in configuration
  • Access denied for root user: Use sudo on Linux/macOS or reset root password
  • Service won't start: Check error logs in data directory
  • Forgot root password: Use --skip-grant-tables option to reset
Security Best Practices:
  • Always run mysql_secure_installation after fresh install
  • Use strong passwords for all user accounts
  • Regularly update MySQL to latest version
  • Configure firewall to restrict database access
  • Create separate user accounts for different applications

Next Steps

After successful installation, you're ready to:

  • Create your first database
  • Learn basic SQL commands
  • Set up user accounts and permissions
  • Begin developing database-driven applications

Creating Databases in MySQL

Database: A structured collection of data organized into tables, views, and other objects. Think of it as a digital filing cabinet where you can store and organize related information.

Understanding Databases

Before creating databases, it's important to understand what they are:

  • Container: A database is a container that holds multiple tables
  • Organization: Helps organize related data together
  • Security: Provides a security boundary - you can control access per database
  • Backup Unit: Makes it easy to backup related data together

Prerequisites

Before creating databases, ensure you have:

  • MySQL server installed and running
  • Access to MySQL command line or GUI tool
  • Appropriate privileges (usually as root user initially)
  • Basic understanding of SQL syntax

Step 1: Connect to MySQL Server

First, you need to connect to your MySQL server:

Command Line Connection
# Connect as root user (you'll be prompted for password)
mysql -u root -p

# Alternative: Connect with password in command (less secure)
mysql -u root -pYourPassword

# If successful, you'll see MySQL prompt:
mysql>
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 15
Server version: 8.0.33 MySQL Community Server

mysql>

Step 2: View Existing Databases

Before creating new databases, let's see what already exists:

SQL Command
SHOW DATABASES;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| sys |
+--------------------+
4 rows in set (0.00 sec)
Understanding System Databases:
information_schema: Contains metadata about all databases
mysql: Stores user privileges and system settings
performance_schema: Performance monitoring data
sys: Provides readable views of performance data
Don't modify these system databases directly!

Step 3: Create Your First Database

Now let's create a simple database. We'll use a school management system as an example:

Basic Database Creation
CREATE DATABASE school_db;
Query OK, 1 row affected (0.02 sec)

Understanding the CREATE DATABASE Syntax

Let's break down the command:

  • CREATE DATABASE - The SQL command to create a new database
  • school_db - The name we're giving to our database
  • ; - Required semicolon to end the SQL statement

Step 4: Verify Database Creation

Let's confirm our database was created successfully:

SQL Command
SHOW DATABASES;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| school_db |
| sys |
+--------------------+
5 rows in set (0.00 sec)

Step 5: Select the Database

Before working with tables, you need to tell MySQL which database to use:

SQL Command
USE school_db;
Database changed
Tip: The USE command tells MySQL that all subsequent commands should apply to the specified database. You'll need to do this every time you connect to MySQL and want to work with a specific database.

Advanced Database Creation Options

CREATE DATABASE has several optional parameters for more control:

Advanced Database Creation
-- Create database with specific character set
CREATE DATABASE company_db
  CHARACTER SET utf8mb4
  COLLATE utf8mb4_unicode_ci;

-- Create database only if it doesn't exist (prevents errors)
CREATE DATABASE IF NOT EXISTS inventory_db;

Understanding Character Sets and Collations

  • Character Set: Defines what characters can be stored (utf8mb4 supports all Unicode characters)
  • Collation: Defines how characters are sorted and compared
  • utf8mb4_unicode_ci is recommended for most applications

More Practical Examples

Let's create databases for different types of applications:

Multiple Database Examples
-- E-commerce website database
CREATE DATABASE ecommerce_db
  CHARACTER SET utf8mb4
  COLLATE utf8mb4_unicode_ci;

-- Blog application database
CREATE DATABASE blog_db;

-- Employee management system
CREATE DATABASE employee_management;

-- Check all our databases
SHOW DATABASES;
+------------------------+
| Database |
+------------------------+
| information_schema |
| blog_db |
| ecommerce_db |
| employee_management |
| mysql |
| performance_schema |
| school_db |
| sys |
+------------------------+
8 rows in set (0.00 sec)

Viewing Database Information

You can get detailed information about a specific database:

Database Information Commands
-- Show create statement for a database
SHOW CREATE DATABASE school_db;

-- Check which database is currently selected
SELECT DATABASE();
+----------+-------------------------------------------------------------------+
| Database | Create Database |
+----------+-------------------------------------------------------------------+
| school_db| CREATE DATABASE `school_db` /*!40100 DEFAULT CHARACTER SET utf8mb4 */ |
+----------+-------------------------------------------------------------------+
1 row in set (0.00 sec)

+------------+
| DATABASE() |
+------------+
| school_db |
+------------+
1 row in set (0.00 sec)

Deleting Databases

To remove a database (use with extreme caution!):

Database Deletion
-- Delete a database (PERMANENT - cannot be undone!)
DROP DATABASE blog_db;

-- Safer: Only delete if it exists
DROP DATABASE IF EXISTS temp_db;
⚠️ CRITICAL WARNING:
The DROP DATABASE command permanently deletes the entire database and ALL its contents (tables, data, everything). There is NO UNDO for this operation. Always double-check the database name before executing this command!

Best Practices for Database Creation

Naming Conventions:
  • Use descriptive, meaningful names (school_db, not db1)
  • Use lowercase letters and underscores (my_database, not MyDatabase)
  • Avoid spaces and special characters
  • Keep names reasonably short but descriptive
Configuration Best Practices:
  • Always specify CHARACTER SET and COLLATION for consistency
  • Use utf8mb4 for international applications
  • Plan your database structure before creation
  • Create separate databases for different applications

Common Mistakes to Avoid

  • Forgetting the semicolon (;) at the end of SQL statements
  • Using reserved keywords as database names (like SELECT, TABLE, etc.)
  • Not selecting a database with USE before creating tables
  • Using DROP DATABASE without double-checking the name
  • Creating too many databases when one would suffice

Practice Exercise

Try these exercises to practice what you've learned:

  1. Create a database named "library_system"
  2. Create a database named "hospital_management" with utf8mb4 character set
  3. Use SHOW DATABASES to verify both were created
  4. Select the "library_system" database
  5. Use SELECT DATABASE() to confirm it's selected

Creating Tables in MySQL

Table: A database object that stores data in rows and columns. Think of it as a spreadsheet where each row represents a record and each column represents a specific piece of information about that record.

Understanding Table Structure

Before creating tables, let's understand the components:

  • Table Name: Unique identifier for the table within a database
  • Columns/Fields: Define the type of data stored (name, age, email, etc.)
  • Rows/Records: Individual entries in the table
  • Data Types: Specify what kind of data each column can hold
  • Constraints: Rules that enforce data integrity

Prerequisites

Before creating tables, ensure you have:

  • A database created (we'll use "school_db" from previous lesson)
  • Selected the database using USE command
  • Basic understanding of data types
  • MySQL command line or GUI tool access

Step 1: Select Your Database

First, make sure you're working in the right database:

Select Database
USE school_db;
Database changed

Step 2: Basic Table Creation Syntax

The basic CREATE TABLE statement follows this pattern:

Basic CREATE TABLE Syntax
CREATE TABLE table_name (
  column1_name data_type constraints,
  column2_name data_type constraints,
  column3_name data_type constraints,
  ...
);

Step 3: Create Your First Table - Students

Let's create a simple students table to track basic information:

Creating Students Table
CREATE TABLE students (
  student_id INT PRIMARY KEY AUTO_INCREMENT,
  first_name VARCHAR(50) NOT NULL,
  last_name VARCHAR(50) NOT NULL,
  email VARCHAR(100) UNIQUE,
  date_of_birth DATE,
  enrollment_date DATE DEFAULT (CURRENT_DATE)
);
Query OK, 0 rows affected (0.05 sec)

Understanding Our Students Table Structure

Let's break down each column in a proper table format:

Column Name Data Type Constraints Description
student_id INT PRIMARY KEY AUTO_INCREMENT Unique ID that automatically increases
first_name VARCHAR(50) NOT NULL Student's first name (required)
last_name VARCHAR(50) NOT NULL Student's last name (required)
email VARCHAR(100) UNIQUE Email address (must be unique)
date_of_birth DATE None Birth date (optional)
enrollment_date DATE DEFAULT (CURRENT_DATE) Auto-fills with today's date

Step 4: Verify Table Creation

Let's confirm our table was created successfully:

View Table Structure
-- Show all tables in current database
SHOW TABLES;

-- View detailed structure of students table
DESCRIBE students;
+--------------------+
| Tables_in_school_db |
+--------------------+
| students |
+--------------------+
1 row in set (0.00 sec)

+----------------+--------------+------+-----+------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+--------------+------+-----+------------+----------------+
| student_id | int | NO | PRI | NULL | auto_increment |
| first_name | varchar(50) | NO | | NULL | |
| last_name | varchar(50) | NO | | NULL | |
| email | varchar(100) | YES | UNI | NULL | |
| date_of_birth | date | YES | | NULL | |
| enrollment_date| date | YES | | curdate() | |
+----------------+--------------+------+-----+------------+----------------+

More Practical Table Examples

Let's create additional tables for our school database:

Example 1: Courses Table

Creating Courses Table
CREATE TABLE courses (
  course_id INT PRIMARY KEY AUTO_INCREMENT,
  course_name VARCHAR(100) NOT NULL,
  course_code VARCHAR(10) UNIQUE NOT NULL,
  credits INT DEFAULT 3,
  description TEXT
);

Example 2: Teachers Table

Creating Teachers Table
CREATE TABLE teachers (
  teacher_id INT PRIMARY KEY AUTO_INCREMENT,
  first_name VARCHAR(50) NOT NULL,
  last_name VARCHAR(50) NOT NULL,
  email VARCHAR(100) UNIQUE,
  phone VARCHAR(20),
  hire_date DATE NOT NULL,
  salary DECIMAL(10, 2)
);

Understanding Common Data Types

Data Type Description Example
INT Whole numbers 25, -10, 1000
VARCHAR(n) Variable-length text (n = max characters) "John", "Computer Science"
DECIMAL(p,s) Exact decimal numbers (p=precision, s=scale) 99999.99 (DECIMAL(7,2))
TEXT Large text data Long descriptions, notes
BOOLEAN True/False values TRUE, FALSE

Understanding Common Constraints

Constraint Description Example
PRIMARY KEY Uniquely identifies each row student_id INT PRIMARY KEY
AUTO_INCREMENT Automatically generates sequential numbers id INT AUTO_INCREMENT
NOT NULL Column must have a value name VARCHAR(50) NOT NULL
UNIQUE All values must be different email VARCHAR(100) UNIQUE
DEFAULT Sets default value if none provided status VARCHAR(20) DEFAULT 'active'
FOREIGN KEY Links to primary key in another table FOREIGN KEY (course_id) REFERENCES courses(course_id)

Creating Tables with Foreign Keys

Let's create a table that relates students to courses:

Creating Enrollments Table with Foreign Keys
CREATE TABLE enrollments (
  enrollment_id INT PRIMARY KEY AUTO_INCREMENT,
  student_id INT NOT NULL,
  course_id INT NOT NULL,
  enrollment_date DATE DEFAULT (CURRENT_DATE),
  grade CHAR(2),
  
  FOREIGN KEY (student_id) REFERENCES students(student_id),
  FOREIGN KEY (course_id) REFERENCES courses(course_id),
  
  UNIQUE (student_id, course_id)
);
Foreign Key Explanation:
Foreign keys create relationships between tables. In this example:
• student_id in enrollments table references student_id in students table
• course_id in enrollments table references course_id in courses table
• This ensures we only enroll existing students in existing courses

Modifying Existing Tables

You can modify tables after creation using ALTER TABLE:

ALTER TABLE Examples
-- Add a new column
ALTER TABLE students
  ADD COLUMN phone_number VARCHAR(20);

-- Modify an existing column
ALTER TABLE students
  MODIFY COLUMN email VARCHAR(150) NOT NULL;

-- Drop a column
ALTER TABLE students
  DROP COLUMN phone_number;

-- Add a constraint
ALTER TABLE students
  ADD CONSTRAINT chk_email CHECK (email LIKE '%@%');

Viewing Table Information

Various ways to inspect your tables:

Table Inspection Commands
-- Show all tables
SHOW TABLES;

-- Show table structure
DESCRIBE students;
-- OR
SHOW COLUMNS FROM students;

-- Show how table was created
SHOW CREATE TABLE students;

Deleting Tables

To remove a table (use with caution!):

Dropping Tables
-- Delete a table and all its data
DROP TABLE enrollments;

-- Safer: Only drop if table exists
DROP TABLE IF EXISTS temp_table;
⚠️ IMPORTANT WARNING:
DROP TABLE permanently deletes the table and ALL its data. This action cannot be undone. Always backup important data before dropping tables.

Best Practices for Table Creation

Naming Conventions:
  • Use plural nouns for table names (students, courses)
  • Use singular nouns for column names (first_name, not first_names)
  • Use lowercase and underscores (student_id, not StudentID)
  • Be consistent across your database
Design Tips:
  • Always define a primary key for each table
  • Use appropriate data types for each column
  • Add NOT NULL constraint for required fields
  • Use VARCHAR instead of CHAR for variable-length text
  • Plan relationships between tables before creating them

Common Mistakes to Avoid

  • Forgetting to select a database before creating tables
  • Using reserved keywords as table or column names
  • Not specifying primary keys for tables
  • Using wrong data types (VARCHAR for dates, etc.)
  • Creating tables without planning relationships
  • Forgetting commas between column definitions

Practice Exercise

Create a complete library management system with these tables:

  1. books table with: book_id, title, author, isbn, publication_year
  2. members table with: member_id, name, email, join_date, membership_type
  3. borrowings table with: borrowing_id, book_id, member_id, borrow_date, return_date
  4. Add appropriate data types and constraints
  5. Create foreign key relationships between borrowings and the other tables
Next Steps:
Now that you can create tables, the next step is learning how to insert data into them. You'll learn how to add students, courses, and enrollment records to make your database useful.

MySQL Data Types

Data Types: Define the kind of data that can be stored in a column. Choosing the right data type is crucial for database performance, storage efficiency, and data integrity.

Why Data Types Matter

Selecting appropriate data types provides several benefits:

  • Data Integrity: Prevents invalid data from being stored
  • Storage Efficiency: Optimizes disk space usage
  • Performance: Improves query speed and efficiency
  • Validation: Automatically validates data format
  • Memory Usage: Reduces memory consumption

Numeric Data Types

Used for storing numbers - integers, decimals, and floating-point values:

Data Type Storage Range Description Example
TINYINT 1 byte -128 to 127 Very small integer age, status flags
SMALLINT 2 bytes -32,768 to 32,767 Small integer year, small counts
INT 4 bytes -2.1B to 2.1B Standard integer id, quantity, price
BIGINT 8 bytes ±9.2 quintillion Large integer population, big counts
DECIMAL(p,s) Varies Depends on precision Exact decimal numbers money, measurements
FLOAT 4 bytes ±3.4E+38 Single-precision floating point scientific data
DOUBLE 8 bytes ±1.8E+308 Double-precision floating point scientific calculations
DECIMAL Precision Tips:
• DECIMAL(5,2) = 999.99 (5 total digits, 2 after decimal)
• DECIMAL(10,2) = 99,999,999.99 (good for currency)
• Use DECIMAL for money to avoid floating-point errors

String Data Types

Used for storing text and character data:

Data Type Max Size Description When to Use Example
CHAR(n) 255 chars Fixed-length string Fixed-size codes country codes, status
VARCHAR(n) 65,535 chars Variable-length string Most text data names, addresses
TINYTEXT 255 bytes Small text data Short descriptions short notes
TEXT 65,535 bytes Standard text data Long descriptions articles, comments
MEDIUMTEXT 16MB Medium text data Large documents books, manuals
LONGTEXT 4GB Very large text data Huge documents encyclopedias
CHAR vs VARCHAR:
CHAR(10) always uses 10 characters storage
VARCHAR(10) uses only needed characters + 1 byte
• Use CHAR for fixed-length data (like "US", "CA" for country codes)
• Use VARCHAR for variable-length data (like names, addresses)

Date and Time Data Types

Used for storing dates, times, and timestamps:

Data Type Format Range Description Example
DATE YYYY-MM-DD 1000-01-01 to 9999-12-31 Date only birth_date, order_date
TIME HH:MM:SS -838:59:59 to 838:59:59 Time only duration, time_of_day
DATETIME YYYY-MM-DD HH:MM:SS 1000-01-01 00:00:00 to 9999-12-31 23:59:59 Date and time created_at, updated_at
TIMESTAMP YYYY-MM-DD HH:MM:SS 1970-01-01 00:00:01 to 2038-01-19 03:14:07 Automatic timestamp last_login, modified_date
YEAR YYYY 1901 to 2155 Year value graduation_year, founding_year
DATETIME vs TIMESTAMP:
DATETIME - Larger range, no timezone conversion
TIMESTAMP - Smaller range, automatic timezone conversion
• Use TIMESTAMP for auto-updating fields (like last_modified)
• Use DATETIME for future dates beyond 2038

Binary Data Types

Used for storing binary data like files, images, or encrypted data:

Data Type Max Size Description When to Use
BINARY(n) 255 bytes Fixed-length binary Encrypted fixed data
VARBINARY(n) 65,535 bytes Variable-length binary Encrypted variable data
TINYBLOB 255 bytes Small binary object Very small files
BLOB 65,535 bytes Binary large object Small images, documents
MEDIUMBLOB 16MB Medium binary object Medium files, images
LONGBLOB 4GB Large binary object Large files, videos
⚠️ BLOB Storage Warning:
Storing large files in databases is generally not recommended. Consider storing files on disk and saving file paths in the database instead. This improves performance and makes backups easier.

Other Special Data Types

MySQL also provides specialized data types for specific use cases:

Data Type Description Example Usage
ENUM Predefined list of values ENUM('small','medium','large') Size categories, status
SET Multiple choices from predefined list SET('red','green','blue') Tags, multiple categories
BOOLEAN True/False values (alias for TINYINT(1)) TRUE, FALSE Flags, yes/no fields
JSON JSON formatted data {"name": "John", "age": 30} Flexible schema data

Practical Examples with Different Data Types

Let's create tables using various data types:

Products Table with Multiple Data Types
CREATE TABLE products (
  product_id INT PRIMARY KEY AUTO_INCREMENT,
  product_name VARCHAR(100) NOT NULL,
  description TEXT,
  price DECIMAL(10, 2) NOT NULL,
  quantity INT DEFAULT 0,
  size ENUM('small', 'medium', 'large'),
  is_available BOOLEAN DEFAULT TRUE,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  metadata JSON
);
User Profiles Table
CREATE TABLE user_profiles (
  user_id INT PRIMARY KEY,
  username VARCHAR(50) UNIQUE NOT NULL,
  email VARCHAR(100) UNIQUE NOT NULL,
  birth_date DATE,
  phone VARCHAR(20),
  bio TEXT,
  last_login DATETIME,
  login_count INT DEFAULT 0,
  preferences JSON
);

Data Type Selection Guidelines

Choosing the Right Data Type:
  • For IDs: INT or BIGINT with AUTO_INCREMENT
  • For names/titles: VARCHAR with appropriate length
  • For money: DECIMAL to avoid floating-point errors
  • For dates: DATE for dates only, DATETIME for date+time
  • For true/false: BOOLEAN or TINYINT(1)
  • For large text: TEXT or LONGTEXT
  • For categories: ENUM for fixed choices

Common Data Type Mistakes

  • Using VARCHAR for all text: Use CHAR for fixed-length codes
  • Using FLOAT for money: Use DECIMAL to avoid rounding errors
  • Oversizing VARCHAR: Choose appropriate max length
  • Using TEXT for short strings: Use VARCHAR for better performance
  • Storing dates as strings: Use DATE/DATETIME for date operations
  • Using INT for small numbers: Use TINYINT or SMALLINT when appropriate

Practice Exercise

Create a table for a blog system with appropriate data types:

  1. posts table with: post_id, title, content, author_id, created_date, published_status, view_count
  2. categories table with: category_id, name, description, created_date
  3. tags table with: tag_id, name, color_code
  4. Choose the most appropriate data types for each column
  5. Consider storage efficiency and performance
Remember: Proper data type selection is one of the most important database design decisions. It affects performance, storage, and data integrity throughout your application's lifetime.

MySQL Constraints

Constraints: Rules enforced on data columns to maintain data integrity, accuracy, and reliability in the database. Constraints ensure that only valid data is stored in your tables.

Why Constraints Matter

Using constraints provides several important benefits:

  • Data Integrity: Prevents invalid or inconsistent data
  • Data Accuracy: Ensures data follows business rules
  • Relationship Management: Maintains links between related tables
  • Application Reliability: Reduces application-level validation code
  • Database Consistency: Ensures data follows defined rules

Types of Constraints in MySQL

Constraint Description Usage Example When to Use
PRIMARY KEY Uniquely identifies each row in a table student_id INT PRIMARY KEY Every table should have one
FOREIGN KEY Links to primary key in another table FOREIGN KEY (course_id) REFERENCES courses(course_id) When relating tables
UNIQUE Ensures all values in a column are different email VARCHAR(100) UNIQUE Email, username, product codes
NOT NULL Ensures a column cannot have NULL values first_name VARCHAR(50) NOT NULL Required fields
CHECK Ensures values meet specific conditions CHECK (age >= 18) Business rules validation
DEFAULT Sets a default value when no value is specified status VARCHAR(20) DEFAULT 'active' Common default values
AUTO_INCREMENT Automatically generates sequential numbers id INT AUTO_INCREMENT Primary key columns

PRIMARY KEY Constraint

Uniquely identifies each record in a table. A table can have only one primary key.

Primary Key Examples
-- Single column primary key
CREATE TABLE students (
  student_id INT PRIMARY KEY AUTO_INCREMENT,
  first_name VARCHAR(50) NOT NULL
);

-- Composite primary key (multiple columns)
CREATE TABLE enrollments (
  student_id INT,
  course_id INT,
  enrollment_date DATE,
  PRIMARY KEY (student_id, course_id)
);
Primary Key Best Practices:
  • Every table should have a primary key
  • Use AUTO_INCREMENT for surrogate keys
  • Keep primary keys simple and stable
  • Use composite keys only when necessary
  • Primary key values cannot be NULL

FOREIGN KEY Constraint

Creates a relationship between two tables and maintains referential integrity.

Foreign Key Examples
-- Basic foreign key
CREATE TABLE orders (
  order_id INT PRIMARY KEY AUTO_INCREMENT,
  customer_id INT NOT NULL,
  order_date DATE,
  FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);

-- Foreign key with actions
CREATE TABLE order_items (
  order_id INT,
  product_id INT,
  quantity INT,
  FOREIGN KEY (order_id) REFERENCES orders(order_id)
    ON DELETE CASCADE
    ON UPDATE CASCADE
);

Foreign Key Actions

Action Description When to Use
RESTRICT Prevents deletion/update of parent record Default behavior, safe option
CASCADE Deletes/updates child records when parent changes When child records depend completely on parent
SET NULL Sets foreign key to NULL when parent is deleted When relationship is optional
NO ACTION Similar to RESTRICT (standard SQL) Standard compliance

UNIQUE Constraint

Ensures all values in a column are different from each other.

Unique Constraint Examples
-- Single column unique constraint
CREATE TABLE users (
  user_id INT PRIMARY KEY AUTO_INCREMENT,
  username VARCHAR(50) UNIQUE NOT NULL,
  email VARCHAR(100) UNIQUE NOT NULL
);

-- Composite unique constraint (multiple columns)
CREATE TABLE class_rosters (
  class_id INT,
  student_id INT,
  semester VARCHAR(10),
  UNIQUE (class_id, student_id, semester)
);

NOT NULL Constraint

Ensures a column cannot contain NULL values.

NOT NULL Examples
CREATE TABLE employees (
  employee_id INT PRIMARY KEY AUTO_INCREMENT,
  first_name VARCHAR(50) NOT NULL,
  last_name VARCHAR(50) NOT NULL,
  email VARCHAR(100) NOT NULL,
  phone VARCHAR(20), -- This can be NULL
  hire_date DATE NOT NULL
);

CHECK Constraint

Ensures that values in a column satisfy a specific condition (MySQL 8.0+).

CHECK Constraint Examples
CREATE TABLE products (
  product_id INT PRIMARY KEY AUTO_INCREMENT,
  product_name VARCHAR(100) NOT NULL,
  price DECIMAL(10, 2) CHECK (price > 0),
  quantity INT CHECK (quantity >= 0),
  category VARCHAR(50) CHECK (category IN ('electronics', 'clothing', 'books'))
);

-- Named CHECK constraint
CREATE TABLE employees (
  employee_id INT PRIMARY KEY,
  salary DECIMAL(10, 2),
  CONSTRAINT chk_salary CHECK (salary >= 0)
);

DEFAULT Constraint

Sets a default value for a column when no value is specified.

DEFAULT Constraint Examples
CREATE TABLE orders (
  order_id INT PRIMARY KEY AUTO_INCREMENT,
  order_date DATE DEFAULT (CURRENT_DATE),
  status VARCHAR(20) DEFAULT 'pending',
  priority INT DEFAULT 1,
  is_active BOOLEAN DEFAULT TRUE
);

Complete Example: E-commerce Database

Let's create a complete example using multiple constraints:

E-commerce Database with Constraints
CREATE TABLE customers (
  customer_id INT PRIMARY KEY AUTO_INCREMENT,
  email VARCHAR(100) UNIQUE NOT NULL,
  first_name VARCHAR(50) NOT NULL,
  last_name VARCHAR(50) NOT NULL,
  phone VARCHAR(20),
  registration_date DATE DEFAULT (CURRENT_DATE)
);

CREATE TABLE products (
  product_id INT PRIMARY KEY AUTO_INCREMENT,
  product_name VARCHAR(100) NOT NULL,
  price DECIMAL(10, 2) CHECK (price > 0),
  stock_quantity INT DEFAULT 0 CHECK (stock_quantity >= 0),
  is_available BOOLEAN DEFAULT TRUE
);

CREATE TABLE orders (
  order_id INT PRIMARY KEY AUTO_INCREMENT,
  customer_id INT NOT NULL,
  order_date DATETIME DEFAULT CURRENT_TIMESTAMP,
  status ENUM('pending', 'confirmed', 'shipped', 'delivered') DEFAULT 'pending',
  total_amount DECIMAL(10, 2) CHECK (total_amount >= 0),
  FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
    ON DELETE RESTRICT
    ON UPDATE CASCADE
);

Adding Constraints to Existing Tables

You can add constraints to existing tables using ALTER TABLE:

Adding Constraints with ALTER TABLE
-- Add primary key
ALTER TABLE students
  ADD PRIMARY KEY (student_id);

-- Add foreign key
ALTER TABLE orders
  ADD FOREIGN KEY (customer_id) REFERENCES customers(customer_id);

-- Add unique constraint
ALTER TABLE users
  ADD CONSTRAINT uc_email UNIQUE (email);

-- Add check constraint
ALTER TABLE products
  ADD CONSTRAINT chk_price CHECK (price > 0);

-- Add not null constraint
ALTER TABLE employees
  MODIFY COLUMN email VARCHAR(100) NOT NULL;

Removing Constraints

Constraints can be removed when needed:

Removing Constraints
-- Drop primary key
ALTER TABLE students
  DROP PRIMARY KEY;

-- Drop foreign key
ALTER TABLE orders
  DROP FOREIGN KEY orders_ibfk_1;

-- Drop unique constraint
ALTER TABLE users
  DROP INDEX uc_email;

-- Drop check constraint
ALTER TABLE products
  DROP CONSTRAINT chk_price;

Constraint Best Practices

Constraint Guidelines:
  • Always define primary keys for all tables
  • Use foreign keys to maintain referential integrity
  • Add NOT NULL constraints for required fields
  • Use UNIQUE constraints for columns that should be unique
  • Use CHECK constraints for business rule validation
  • Set meaningful DEFAULT values where appropriate
  • Name your constraints for easier management

Common Constraint Mistakes

  • Not using primary keys: Every table needs a primary key
  • Forgetting foreign keys: Leads to orphaned records
  • Overusing CASCADE: Can accidentally delete important data
  • Not using NOT NULL: Allows incomplete data
  • Ignoring unique constraints: Allows duplicate data where it shouldn't be
  • Complex check constraints: Can impact performance

Practice Exercise

Create a library management system with proper constraints:

  1. books table with: book_id (PK), isbn (UNIQUE), title (NOT NULL), publication_year (CHECK > 1450)
  2. members table with: member_id (PK), email (UNIQUE, NOT NULL), join_date (DEFAULT CURRENT_DATE)
  3. borrowings table with: borrowing_id (PK), book_id (FK), member_id (FK), borrow_date (DEFAULT), return_date (CHECK > borrow_date)
  4. Add appropriate foreign key relationships with CASCADE actions
  5. Add CHECK constraints for business rules
Remember: Constraints are your first line of defense for data quality. They ensure data integrity at the database level, which is more reliable than application-level validation alone.

Inserting Data in MySQL

INSERT Statement: Used to add new records (rows) to a table. This is how you populate your database with actual data after creating tables.

Prerequisites

Before inserting data, ensure you have:

  • Tables created with proper structure
  • Appropriate permissions to insert data
  • Understanding of your table's columns and data types
  • Selected the correct database using USE statement

Basic INSERT Syntax

The basic INSERT statement has several forms:

Syntax Type Description When to Use
INSERT INTO table VALUES (...) Insert values for all columns in order When providing values for all columns
INSERT INTO table (columns) VALUES (...) Insert values for specific columns When providing values for some columns
INSERT INTO table SET column=value Insert using SET syntax Alternative syntax, less common
INSERT INTO table SELECT ... Insert data from another table When copying data between tables

Step 1: Basic Single Row Insertion

Let's start with inserting a single record into our students table:

Basic INSERT Statement
-- Insert a single student
INSERT INTO students (
  first_name, last_name, email, date_of_birth
) VALUES (
  'John', 'Doe', 'john.doe@email.com', '2000-05-15'
);
Query OK, 1 row affected (0.02 sec)
Note: We didn't include student_id because it's AUTO_INCREMENT, and we didn't include enrollment_date because it has a DEFAULT value. MySQL automatically handles these.

Step 2: Insert Multiple Rows

You can insert multiple rows with a single INSERT statement:

Multiple Row Insertion
INSERT INTO students (
  first_name, last_name, email, date_of_birth
) VALUES
  ('Jane', 'Smith', 'jane.smith@email.com', '1999-08-22'),
  ('Bob', 'Johnson', 'bob.johnson@email.com', '2001-03-10'),
  ('Alice', 'Brown', 'alice.brown@email.com', '2000-11-30');
Query OK, 3 rows affected (0.01 sec)
Records: 3 Duplicates: 0 Warnings: 0

Step 3: Verify Inserted Data

Let's check our inserted data using SELECT statement:

Verify Inserted Data
SELECT * FROM students;
+------------+------------+-----------+-------------------------+---------------+----------------+
| student_id | first_name | last_name | email | date_of_birth | enrollment_date |
+------------+------------+-----------+-------------------------+---------------+----------------+
| 1 | John | Doe | john.doe@email.com | 2000-05-15 | 2023-10-25 |
| 2 | Jane | Smith | jane.smith@email.com | 1999-08-22 | 2023-10-25 |
| 3 | Bob | Johnson | bob.johnson@email.com | 2001-03-10 | 2023-10-25 |
| 4 | Alice | Brown | alice.brown@email.com | 2000-11-30 | 2023-10-25 |
+------------+------------+-----------+-------------------------+---------------+----------------+
4 rows in set (0.00 sec)

Different INSERT Syntax Methods

Method 1: Column List (Recommended)

Using Column List
INSERT INTO products (
  product_name, price, category
) VALUES (
  'Laptop', 999.99, 'Electronics'
);

Method 2: Without Column List

Without Column List
INSERT INTO products VALUES (
  NULL, 'Smartphone', 699.99, 'Electronics', 50, TRUE
);

Method 3: SET Syntax

Using SET Syntax
INSERT INTO products
  SET product_name = 'Tablet',
    price = 299.99,
    category = 'Electronics';
Best Practice: Always use the column list method (Method 1). It's clearer, safer, and won't break if you add new columns to the table later.

Handling Different Data Types

Data Type Insert Format Example Notes
VARCHAR/CHAR Single quotes 'John Doe' Use single quotes for text
INT/DECIMAL No quotes 25, 99.99 Numbers without quotes
DATE Single quotes '2023-10-25' YYYY-MM-DD format
DATETIME Single quotes '2023-10-25 14:30:00' YYYY-MM-DD HH:MM:SS
BOOLEAN TRUE/FALSE or 1/0 TRUE, 1, FALSE, 0 Both formats work
NULL NULL keyword NULL No quotes for NULL

Working with AUTO_INCREMENT Columns

When a column has AUTO_INCREMENT, MySQL automatically generates the value:

AUTO_INCREMENT Handling
-- Let MySQL generate the ID (recommended)
INSERT INTO students (first_name, last_name)
  VALUES ('Charlie', 'Wilson');

-- Explicitly set the ID (not recommended)
INSERT INTO students (student_id, first_name, last_name)
  VALUES (100, 'David', 'Lee');

-- Get the last auto-generated ID
SELECT LAST_INSERT_ID();

Working with DEFAULT Values

Columns with DEFAULT values can be omitted from INSERT statements:

Using DEFAULT Values
-- Let MySQL use the default value
INSERT INTO orders (customer_id, total_amount)
  VALUES (1, 99.99);
-- order_date will use DEFAULT CURRENT_TIMESTAMP

-- Explicitly use DEFAULT keyword
INSERT INTO products (product_name, price, stock_quantity)
  VALUES ('Mouse', 29.99, DEFAULT);

-- Override the default value
INSERT INTO orders (customer_id, order_date, total_amount)
  VALUES (2, '2023-10-20 10:00:00', 149.99);

Inserting Data with Foreign Keys

When inserting into tables with foreign keys, reference existing records:

Inserting with Foreign Keys
-- First, insert into parent table
INSERT INTO courses (course_name, course_code, credits)
  VALUES
  ('Mathematics', 'MATH101', 3),
  ('Computer Science', 'CS101', 4);

-- Then, insert into child table with valid foreign keys
INSERT INTO enrollments (student_id, course_id, grade)
  VALUES
  (1, 1, 'A'),
  (1, 2, 'B+'),
  (2, 1, 'A-');

INSERT with SELECT (Copying Data)

You can insert data from one table into another:

Copying Data Between Tables
-- Create a backup table
CREATE TABLE students_backup (
  student_id INT PRIMARY KEY,
  first_name VARCHAR(50),
  last_name VARCHAR(50)
);

-- Copy data from students to backup
INSERT INTO students_backup (student_id, first_name, last_name)
  SELECT student_id, first_name, last_name FROM students;

Handling Errors and Constraints

Constraint Potential Error Solution
PRIMARY KEY Duplicate entry error Use AUTO_INCREMENT or ensure unique values
FOREIGN KEY Cannot add or update child row Ensure referenced record exists in parent table
UNIQUE Duplicate entry for key Use unique values or update existing record
NOT NULL Column cannot be null Provide value for required columns
CHECK Check constraint violation Ensure value meets condition

INSERT IGNORE and ON DUPLICATE KEY UPDATE

Advanced techniques for handling duplicate entries:

Handling Duplicates
-- Ignore duplicate key errors
INSERT IGNORE INTO students (student_id, first_name, last_name)
  VALUES (1, 'John', 'Updated');
-- If student_id 1 exists, this insert is silently ignored

-- Update on duplicate key
INSERT INTO students (student_id, first_name, last_name)
  VALUES (1, 'John', 'Updated')
  ON DUPLICATE KEY UPDATE
  first_name = 'John', last_name = 'Updated';

Best Practices for Inserting Data

Insertion Guidelines:
  • Always specify column names for clarity and safety
  • Use multiple row inserts for better performance with large datasets
  • Validate data before insertion when possible
  • Use transactions for multiple related inserts
  • Handle AUTO_INCREMENT columns properly
  • Be mindful of foreign key constraints
  • Use INSERT IGNORE or ON DUPLICATE KEY UPDATE for duplicate handling

Common INSERT Mistakes

  • Forgetting column lists: Can cause errors if table structure changes
  • Wrong data types: Inserting strings without quotes, etc.
  • Violating constraints: Not checking foreign key relationships
  • Date format errors: Using wrong date formats
  • Missing required fields: Forgetting NOT NULL columns
  • Single quotes in text: Not escaping special characters

Practice Exercise

Populate the library management system with sample data:

  1. Insert 5 books into the books table with different categories
  2. Insert 3 library members into the members table
  3. Insert borrowing records linking books to members
  4. Use multiple row insertion for books and members
  5. Verify all inserts worked by selecting the data
  6. Try inserting a duplicate ISBN and handle it with INSERT IGNORE
Next Steps: Now that you can insert data, the next step is learning how to retrieve and query that data using SELECT statements. You'll learn how to filter, sort, and analyze your data.

SELECT Queries in MySQL

SELECT Statement: Used to retrieve data from one or more tables. This is the most commonly used SQL command for querying and analyzing data in your database.

Why SELECT Queries Matter

SELECT queries are essential for:

  • Data Retrieval: Extracting specific information from databases
  • Data Analysis: Generating reports and insights
  • Application Development: Displaying data in applications
  • Decision Making: Supporting business decisions with data
  • Data Validation: Checking data quality and integrity

Basic SELECT Syntax

The SELECT statement has several components that can be combined:

Clause Purpose Required
SELECT Specifies columns to retrieve Yes
FROM Specifies table(s) to query Yes (in most cases)
WHERE Filters rows based on conditions No
ORDER BY Sorts the result set No
GROUP BY Groups rows with same values No
HAVING Filters grouped rows No
LIMIT Restricts number of rows returned No

Step 1: Basic SELECT Queries

Let's start with simple queries to retrieve data from our students table:

Basic SELECT Examples
-- Select all columns from students table
SELECT * FROM students;

-- Select specific columns
SELECT first_name, last_name, email FROM students;

-- Select with column aliases
SELECT
  first_name AS 'First Name',
  last_name AS 'Last Name',
  email AS 'Email Address'
FROM students;
+------------+-----------+-------------------------+
| First Name | Last Name | Email Address |
+------------+-----------+-------------------------+
| John | Doe | john.doe@email.com |
| Jane | Smith | jane.smith@email.com |
| Bob | Johnson | bob.johnson@email.com |
| Alice | Brown | alice.brown@email.com |
+------------+-----------+-------------------------+
4 rows in set (0.00 sec)

Step 2: Using DISTINCT for Unique Values

Get unique values from a column:

DISTINCT Examples
-- Get unique last names
SELECT DISTINCT last_name FROM students;

-- Get unique combinations of first and last names
SELECT DISTINCT first_name, last_name FROM students;

Step 3: Basic Calculations and Expressions

Perform calculations in SELECT statements:

Calculations in SELECT
-- Calculate age from date of birth
SELECT
  first_name,
  last_name,
  date_of_birth,
  YEAR(CURRENT_DATE) - YEAR(date_of_birth) AS age
FROM students;

-- Mathematical operations
SELECT
  product_name,
  price,
  price * 0.9 AS discounted_price,
  price * quantity AS total_value
FROM products;

Common SELECT Patterns

Pattern Example Use Case
Select all columns SELECT * FROM table Quick data inspection
Select specific columns SELECT col1, col2 FROM table Performance optimization
Column aliases SELECT col AS alias FROM table Better column names
Calculated columns SELECT price * quantity AS total Dynamic calculations
Distinct values SELECT DISTINCT col FROM table Unique value analysis

Working with Multiple Tables

Query data from multiple tables using JOINs:

Basic JOIN Examples
-- Inner join students with enrollments
SELECT
  s.first_name,
  s.last_name,
  c.course_name,
  e.grade
FROM students s
  INNER JOIN enrollments e ON s.student_id = e.student_id
  INNER JOIN courses c ON e.course_id = c.course_id;

-- Left join to include all students (even without enrollments)
SELECT
  s.first_name,
  s.last_name,
  c.course_name
FROM students s
  LEFT JOIN enrollments e ON s.student_id = e.student_id
  LEFT JOIN courses c ON e.course_id = c.course_id;

JOIN Types Overview

JOIN Type Description Result Use Case
INNER JOIN Returns matching rows from both tables Intersection Find related records
LEFT JOIN Returns all rows from left table, matched rows from right All left + matching right Find records with optional relationships
RIGHT JOIN Returns all rows from right table, matched rows from left All right + matching left Less common, similar to LEFT JOIN
FULL OUTER JOIN Returns all rows when there's a match in either table Union of both tables MySQL doesn't support directly
CROSS JOIN Returns Cartesian product of both tables All combinations Generate test data, combinations

Using WHERE Clause for Filtering

Filter rows based on specific conditions:

WHERE Clause Examples
-- Basic equality filter
SELECT * FROM students
  WHERE last_name = 'Doe';

-- Multiple conditions with AND/OR
SELECT * FROM students
  WHERE last_name = 'Doe' AND first_name = 'John';

-- Pattern matching with LIKE
SELECT * FROM students
  WHERE email LIKE '%@gmail.com';

-- Range filtering
SELECT * FROM students
  WHERE date_of_birth BETWEEN '2000-01-01' AND '2000-12-31';

-- IN operator for multiple values
SELECT * FROM students
  WHERE last_name IN ('Doe', 'Smith', 'Johnson');

Common WHERE Clause Operators

Operator Description Example
= Equal to WHERE age = 25
!= or <> Not equal to WHERE status != 'inactive'
>, <, >=, <= Comparison operators WHERE price > 100
BETWEEN Range inclusive WHERE age BETWEEN 18 AND 65
LIKE Pattern matching WHERE name LIKE 'J%'
IN Multiple possible values WHERE status IN ('active', 'pending')
IS NULL Null check WHERE email IS NULL
IS NOT NULL Not null check WHERE phone IS NOT NULL

Sorting Results with ORDER BY

Control the order of your results:

ORDER BY Examples
-- Sort by single column ascending (default)
SELECT * FROM students
  ORDER BY last_name;

-- Sort by single column descending
SELECT * FROM students
  ORDER BY enrollment_date DESC;

-- Sort by multiple columns
SELECT * FROM students
  ORDER BY last_name ASC, first_name ASC;

-- Sort by calculated column
SELECT
  first_name, last_name, date_of_birth,
  YEAR(CURRENT_DATE) - YEAR(date_of_birth) AS age
FROM students
  ORDER BY age DESC;

Limiting Results with LIMIT

Restrict the number of rows returned:

LIMIT Examples
-- Get first 5 rows
SELECT * FROM students
  LIMIT 5;

-- Get 5 rows starting from row 3 (offset)
SELECT * FROM students
  LIMIT 3, 5;

-- Alternative syntax (MySQL 8.0+)
SELECT * FROM students
  LIMIT 5 OFFSET 3;

-- Top 3 oldest students
SELECT * FROM students
  ORDER BY date_of_birth ASC
  LIMIT 3;

Aggregate Functions

Perform calculations on multiple rows and return single values:

Aggregate Function Examples
-- Count total students
SELECT COUNT(*) AS total_students FROM students;

-- Count students with email
SELECT COUNT(email) AS students_with_email FROM students;

-- Average, minimum, maximum prices
SELECT
  AVG(price) AS average_price,
  MIN(price) AS min_price,
  MAX(price) AS max_price,
  SUM(price) AS total_value
FROM products;

Common Aggregate Functions

Function Description Example
COUNT() Counts number of rows COUNT(*), COUNT(column)
SUM() Calculates sum of values SUM(price), SUM(quantity)
AVG() Calculates average value AVG(salary), AVG(score)
MIN() Finds minimum value MIN(price), MIN(age)
MAX() Finds maximum value MAX(price), MAX(age)
GROUP_CONCAT() Concatenates values from group GROUP_CONCAT(name)

Best Practices for SELECT Queries

Query Optimization Tips:
  • Use specific column names instead of * for better performance
  • Add indexes on columns used in WHERE and JOIN conditions
  • Use LIMIT to restrict large result sets
  • Avoid functions on indexed columns in WHERE clauses
  • Use EXISTS instead of IN for subqueries when possible
  • Test queries with EXPLAIN to understand performance
  • Use appropriate data types for better indexing

Common SELECT Mistakes

  • Using SELECT * in production: Can cause performance issues
  • Missing JOIN conditions: Results in Cartesian products
  • Confusing AND/OR logic: Use parentheses for complex conditions
  • Forgetting NULL handling: NULL values behave differently
  • Inefficient LIKE patterns: Leading wildcards can't use indexes
  • Overusing subqueries: Consider JOINs for better performance
  • Ignoring data types: Can cause implicit conversions

Practice Exercise

Practice these SELECT query scenarios:

  1. Retrieve all students born after 2000
  2. Find students with Gmail email addresses
  3. List students in alphabetical order by last name, then first name
  4. Count how many students enrolled each year
  5. Find the average age of students
  6. Get the top 5 most recently enrolled students
  7. Join students with their course enrollments and grades
  8. Find students who are not enrolled in any courses
Next Steps: Now that you can retrieve data with SELECT queries, the next step is learning how to filter data more precisely using WHERE clauses with various operators and conditions.

WHERE Clause in MySQL

WHERE Clause: Used to filter records and return only those that meet specified conditions. It's one of the most powerful features of SQL for data analysis and retrieval.

Why WHERE Clause Matters

The WHERE clause is essential for:

  • Data Filtering: Extracting specific subsets of data
  • Performance: Reducing data transfer by filtering at database level
  • Data Analysis: Focusing on relevant data for reports
  • Application Logic: Implementing business rules in queries
  • Security: Restricting data access based on conditions

Basic WHERE Clause Syntax

The WHERE clause follows the FROM clause in a SELECT statement:

Basic WHERE Syntax
SELECT column1, column2, ...
FROM table_name
WHERE condition;

Comparison Operators

Used to compare values in WHERE conditions:

Operator Description Example Result
= Equal to WHERE age = 25 Rows where age is exactly 25
<> or != Not equal to WHERE status <> 'inactive' Rows where status is not 'inactive'
> Greater than WHERE price > 100 Rows where price exceeds 100
< Less than WHERE age < 18 Rows where age is under 18
>= Greater than or equal to WHERE score >= 80 Rows where score is 80 or higher
<= Less than or equal to WHERE quantity <= 10 Rows where quantity is 10 or less

Step 1: Basic Comparison Examples

Let's start with simple comparison operations on our students table:

Comparison Operator Examples
-- Find students with specific last name
SELECT * FROM students
  WHERE last_name = 'Doe';

-- Find students born before 2000
SELECT * FROM students
  WHERE date_of_birth < '2000-01-01';

-- Find students with specific student_id
SELECT * FROM students
  WHERE student_id >= 5;
+------------+------------+-----------+-----------------------+---------------+----------------+
| student_id | first_name | last_name | email | date_of_birth | enrollment_date |
+------------+------------+-----------+-----------------------+---------------+----------------+
| 1 | John | Doe | john.doe@email.com | 2000-05-15 | 2023-10-25 |
+------------+------------+-----------+-----------------------+---------------+----------------+
1 row in set (0.00 sec)

Logical Operators (AND, OR, NOT)

Combine multiple conditions in WHERE clauses:

Operator Description Example Result
AND All conditions must be true WHERE age > 18 AND status = 'active' Rows where both conditions are true
OR At least one condition must be true WHERE city = 'NYC' OR city = 'LA' Rows where either condition is true
NOT Negates a condition WHERE NOT status = 'inactive' Rows where status is not 'inactive'

Step 2: Combining Conditions with AND/OR

Use logical operators to create complex filters:

Logical Operator Examples
-- Students with specific first AND last name
SELECT * FROM students
  WHERE first_name = 'John' AND last_name = 'Doe';

-- Students with specific first name OR last name
SELECT * FROM students
  WHERE first_name = 'John' OR last_name = 'Smith';

-- Complex combination with parentheses
SELECT * FROM students
  WHERE (first_name = 'John' OR first_name = 'Jane')
    AND date_of_birth > '1999-01-01';
Operator Precedence: AND has higher precedence than OR. Always use parentheses to make complex conditions clear and avoid unexpected results.

BETWEEN Operator

Filter values within a specified range (inclusive):

BETWEEN Examples
-- Students born between 1999 and 2000
SELECT * FROM students
  WHERE date_of_birth BETWEEN '1999-01-01' AND '2000-12-31';

-- Equivalent using comparison operators
SELECT * FROM students
  WHERE date_of_birth >= '1999-01-01'
    AND date_of_birth <= '2000-12-31';

-- NOT BETWEEN (outside the range)
SELECT * FROM students
  WHERE date_of_birth NOT BETWEEN '1999-01-01' AND '2000-12-31';

IN Operator

Check if a value matches any value in a list:

IN Operator Examples
-- Students with specific last names
SELECT * FROM students
  WHERE last_name IN ('Doe', 'Smith', 'Johnson');

-- Equivalent using multiple OR conditions
SELECT * FROM students
  WHERE last_name = 'Doe'
    OR last_name = 'Smith'
    OR last_name = 'Johnson';

-- NOT IN (exclude specific values)
SELECT * FROM students
  WHERE last_name NOT IN ('Doe', 'Smith');

LIKE Operator for Pattern Matching

Search for patterns in text data using wildcards:

Wildcard Description Example Matches
% Matches any sequence of characters (0 or more) WHERE name LIKE 'J%' John, Jane, James, etc.
_ Matches exactly one character WHERE name LIKE 'J_n' Jen, Jon, Jan (but not John)
[charlist] Matches any single character in charlist WHERE name LIKE '[JS]%' John, Sarah (starts with J or S)
[^charlist] Matches any single character not in charlist WHERE name LIKE '[^JS]%' Mark, David (doesn't start with J or S)
LIKE Operator Examples
-- Students with first names starting with 'J'
SELECT * FROM students
  WHERE first_name LIKE 'J%';

-- Students with email containing 'gmail'
SELECT * FROM students
  WHERE email LIKE '%gmail%';

-- Students with 3-letter first names
SELECT * FROM students
  WHERE first_name LIKE '___';

-- NOT LIKE (exclude pattern)
SELECT * FROM students
  WHERE email NOT LIKE '%@gmail.com';

NULL Value Handling

Special handling for NULL values in WHERE clauses:

NULL Handling Examples
-- Students with no email address (NULL)
SELECT * FROM students
  WHERE email IS NULL;

-- Students with email address (NOT NULL)
SELECT * FROM students
  WHERE email IS NOT NULL;

-- ❌ WRONG: This won't work with NULL values
SELECT * FROM students
  WHERE email = NULL; -- Always returns no rows

-- ❌ WRONG: This also won't work
SELECT * FROM students
  WHERE email != NULL; -- Always returns no rows
⚠️ IMPORTANT: Always use IS NULL or IS NOT NULL to check for NULL values. Using = NULL or != NULL will not work as expected because NULL represents unknown value.

Complex WHERE Clause Examples

Combine multiple operators for sophisticated filtering:

Complex WHERE Examples
-- Students born in 2000 with Gmail addresses
SELECT * FROM students
  WHERE YEAR(date_of_birth) = 2000
    AND email LIKE '%@gmail.com';

-- Students with specific names OR born after 2001
SELECT * FROM students
  WHERE last_name IN ('Doe', 'Smith')
    OR date_of_birth > '2001-12-31';

-- Students not named John or Jane, born before 2000
SELECT * FROM students
  WHERE first_name NOT IN ('John', 'Jane')
    AND date_of_birth < '2000-01-01'
    AND email IS NOT NULL;

WHERE Clause with Functions

Use functions in WHERE conditions for dynamic filtering:

Function-based WHERE Conditions
-- Students enrolled this year
SELECT * FROM students
  WHERE YEAR(enrollment_date) = YEAR(CURRENT_DATE);

-- Students with email domains of specific length
SELECT * FROM students
  WHERE LENGTH(SUBSTRING_INDEX(email, '@', -1)) > 10;

-- Students born on weekend
SELECT * FROM students
  WHERE DAYOFWEEK(date_of_birth) IN (1, 7);

Performance Considerations

Practice Impact Recommendation
Using indexed columns High performance Filter on indexed columns when possible
Functions on columns Prevents index usage Avoid WHERE YEAR(date) = 2023
Leading wildcards in LIKE Full table scan Avoid WHERE name LIKE '%son'
OR conditions Can be slow Consider UNION for better performance
NOT conditions Often slow Use positive conditions when possible

Best Practices for WHERE Clauses

WHERE Clause Guidelines:
  • Use parentheses to clarify complex AND/OR logic
  • Always use IS NULL/IS NOT NULL for NULL checks
  • Use IN instead of multiple OR conditions for better readability
  • Avoid functions on indexed columns in WHERE clauses
  • Use BETWEEN for range queries instead of separate comparisons
  • Be careful with LIKE patterns - leading wildcards can't use indexes
  • Test your conditions with sample data to ensure correct logic

Common WHERE Clause Mistakes

  • Using = NULL instead of IS NULL
  • Forgetting parentheses in complex AND/OR conditions
  • Case sensitivity in string comparisons
  • Date format mismatches
  • Using functions that prevent index usage
  • Overly complex conditions that are hard to read
  • Not considering NULL values in logic

Practice Exercise

Practice these WHERE clause scenarios:

  1. Find all students born in the 1990s
  2. Find students with first names starting with 'A' or 'B'
  3. Find students without email addresses
  4. Find students enrolled in the last 30 days
  5. Find students with specific last names but exclude certain first names
  6. Find students with email addresses from educational institutions (.edu)
  7. Find students born on weekdays (Monday-Friday)
  8. Find students with odd-numbered student IDs
Remember: The WHERE clause is your primary tool for data filtering. Mastering its operators and understanding performance implications will make you much more effective at database querying and analysis.

MySQL Operators

Operators: Special symbols or keywords used to perform operations on operands (values, columns, or expressions). Operators are the building blocks of SQL conditions and calculations.

Why Operators Matter

Operators are essential for:

  • Data Filtering: Creating conditions in WHERE clauses
  • Data Calculation: Performing mathematical operations
  • Data Comparison: Comparing values and expressions
  • Logical Operations: Combining multiple conditions
  • Pattern Matching: Searching for text patterns

Operator Categories Overview

Category Purpose Common Operators
Arithmetic Mathematical calculations +, -, *, /, %, DIV
Comparison Compare values and expressions =, !=, <, >, <=, >=
Logical Combine boolean expressions AND, OR, NOT, &&, ||, !
Bitwise Manipulate bits in numbers &, |, ^, ~, <<, >>
Assignment Assign values to variables :=, =
Special Unique SQL operations BETWEEN, IN, LIKE, IS NULL

Arithmetic Operators

Used for mathematical calculations on numeric values:

Operator Description Example Result
+ Addition SELECT 10 + 5; 15
- Subtraction SELECT 10 - 5; 5
* Multiplication SELECT 10 * 5; 50
/ Division SELECT 10 / 3; 3.3333
DIV Integer division SELECT 10 DIV 3; 3
% or MOD Modulo (remainder) SELECT 10 % 3; 1
Arithmetic Operator Examples
-- Basic arithmetic operations
SELECT
  10 + 5 AS addition,
  10 - 5 AS subtraction,
  10 * 5 AS multiplication,
  10 / 3 AS division,
  10 DIV 3 AS integer_division,
  10 % 3 AS modulo;

-- Using arithmetic with table data
SELECT
  product_name,
  price,
  price * 0.9 AS discounted_price,
  price * quantity AS total_value
FROM products;
+-------+-------------+----------------+--------------+-------------------+--------+
| addition | subtraction | multiplication | division | integer_division | modulo |
+-------+-------------+----------------+--------------+-------------------+--------+
| 15 | 5 | 50 | 3.3333 | 3 | 1 |
+-------+-------------+----------------+--------------+-------------------+--------+

Comparison Operators

Used to compare values and return TRUE, FALSE, or NULL:

Operator Description Example Result
= Equal to SELECT 10 = 10; 1 (TRUE)
!= or <> Not equal to SELECT 10 != 5; 1 (TRUE)
> Greater than SELECT 10 > 5; 1 (TRUE)
< Less than SELECT 10 < 5; 0 (FALSE)
>= Greater than or equal to SELECT 10 >= 10; 1 (TRUE)
<= Less than or equal to SELECT 10 <= 5; 0 (FALSE)
<=> NULL-safe equal to SELECT NULL <=> NULL; 1 (TRUE)
Comparison Operator Examples
-- Basic comparisons
SELECT
  10 = 10 AS equal,
  10 != 5 AS not_equal,
  10 > 5 AS greater_than,
  10 < 5 AS less_than,
  10 >= 10 AS greater_equal,
  10 <= 5 AS less_equal;

-- NULL-safe comparison
SELECT
  NULL = NULL AS null_equals_null, -- Returns NULL
  NULL <=> NULL AS null_safe_equals; -- Returns 1 (TRUE)
NULL-Safe Comparison: Use <=> when you need to compare values that might be NULL. Regular comparison operators return NULL when either operand is NULL, but <=> returns TRUE for NULL = NULL.

Logical Operators

Combine boolean expressions and return TRUE, FALSE, or NULL:

Operator Description Example Result
AND or && Logical AND SELECT 1 AND 1; 1 (TRUE)
OR or || Logical OR SELECT 1 OR 0; 1 (TRUE)
NOT or ! Logical NOT SELECT NOT 0; 1 (TRUE)
XOR Logical XOR (exclusive OR) SELECT 1 XOR 1; 0 (FALSE)
Logical Operator Examples
-- Basic logical operations
SELECT
  1 AND 1 AS and_true,
  1 AND 0 AS and_false,
  1 OR 0 AS or_true,
  0 OR 0 AS or_false,
  NOT 1 AS not_true,
  NOT 0 AS not_false,
  1 XOR 0 AS xor_true,
  1 XOR 1 AS xor_false;

-- Using logical operators in WHERE clauses
SELECT * FROM students
  WHERE (age > 18 AND status = 'active')
    OR (grade >= 90 AND NOT suspended);

Bitwise Operators

Perform operations on individual bits of integer values:

Operator Description Example Result (Binary)
& Bitwise AND SELECT 5 & 3; 1 (101 & 011 = 001)
| Bitwise OR SELECT 5 | 3; 7 (101 | 011 = 111)
^ Bitwise XOR SELECT 5 ^ 3; 6 (101 ^ 011 = 110)
~ Bitwise NOT SELECT ~1; 4294967294 (~0001 = 1110)
<< Left shift SELECT 1 << 2; 4 (0001 << 2 = 0100)
>> Right shift SELECT 8 >> 2; 2 (1000 >> 2 = 0010)
Bitwise Operator Examples
SELECT
  5 & 3 AS bitwise_and, -- 101 AND 011 = 001 (1)
  5 | 3 AS bitwise_or, -- 101 OR 011 = 111 (7)
  5 ^ 3 AS bitwise_xor, -- 101 XOR 011 = 110 (6)
  ~1 AS bitwise_not, -- NOT 0001 = 1110 (large number)
  1 << 2 AS left_shift, -- 0001 << 2 = 0100 (4)
  8 >> 2 AS right_shift; -- 1000 >> 2 = 0010 (2)

Special SQL Operators

Unique operators for specific SQL operations:

Operator Description Syntax Example
BETWEEN Range check (inclusive) value BETWEEN min AND max WHERE age BETWEEN 18 AND 65
IN Check value in list value IN (val1, val2, ...) WHERE status IN ('active', 'pending')
LIKE Pattern matching string LIKE pattern WHERE name LIKE 'J%'
IS NULL Check for NULL value column IS NULL WHERE email IS NULL
IS NOT NULL Check for non-NULL value column IS NOT NULL WHERE phone IS NOT NULL
REGEXP Regular expression match string REGEXP pattern WHERE email REGEXP '^[A-Za-z]'
Special Operator Examples
-- BETWEEN example
SELECT * FROM students
  WHERE age BETWEEN 18 AND 25;

-- IN example
SELECT * FROM students
  WHERE department IN ('CS', 'MATH', 'PHYSICS');

-- LIKE example
SELECT * FROM students
  WHERE email LIKE '%@gmail.com';

-- IS NULL example
SELECT * FROM students
  WHERE middle_name IS NULL;

-- REGEXP example
SELECT * FROM students
  WHERE name REGEXP '^J.*n$'; -- Names starting with J and ending with n

Operator Precedence

Order in which operators are evaluated (highest to lowest):

Precedence Operators Description
1 ! Logical NOT
2 - (unary), ~ Unary minus, bitwise NOT
3 ^ Bitwise XOR
4 *, /, DIV, %, MOD Multiplication, division
5 +, - Addition, subtraction
6 <<, >> Bitwise shifts
7 & Bitwise AND
8 | Bitwise OR
9 =, <=>, >=, >, <=, <, <>, !=
IS, LIKE, REGEXP, IN
Comparison, pattern matching
10 BETWEEN, CASE, WHEN, THEN, ELSE Special operators
11 NOT Logical NOT
12 AND, && Logical AND
13 XOR Logical XOR
14 OR, || Logical OR
15 := Assignment
Operator Precedence Examples
-- Multiplication before addition
SELECT 2 + 3 * 4; -- Result: 14 (not 20)

-- AND has higher precedence than OR
SELECT 1 OR 0 AND 0; -- Result: 1 (not 0)
-- Equivalent to: 1 OR (0 AND 0) = 1 OR 0 = 1

-- Use parentheses to control evaluation
SELECT (1 OR 0) AND 0; -- Result: 0
Best Practice: Always use parentheses to make operator precedence explicit, especially with complex expressions involving AND/OR combinations. This improves readability and prevents unexpected results.

Best Practices for Using Operators

Operator Guidelines:
  • Use parentheses to clarify complex expressions
  • Use IS NULL/IS NOT NULL instead of = NULL/!= NULL
  • Prefer IN over multiple OR conditions for better readability
  • Use BETWEEN for range checks instead of separate comparisons
  • Be mindful of operator precedence in complex expressions
  • Use appropriate operators for the data type (string vs numeric)
  • Test your expressions with sample data to ensure correct logic

Common Operator Mistakes

  • Using = NULL instead of IS NULL
  • Forgetting operator precedence in complex expressions
  • Using bitwise operators instead of logical operators
  • Mixing data types in comparisons
  • Using AND when OR is needed (and vice versa)
  • Forgetting that NULL values affect logical operations
  • Using arithmetic operators on non-numeric data

Practice Exercise

Practice using different operators with these scenarios:

  1. Calculate the total price including 8% tax for products
  2. Find students with ages between 20 and 25 using BETWEEN
  3. Use IN operator to find students in specific departments
  4. Combine AND/OR to find active students with high GPA or honors status
  5. Use bitwise operators to check if a number is even or odd
  6. Create complex WHERE conditions using proper parentheses
  7. Use REGEXP to find email addresses with specific patterns
  8. Test operator precedence with mixed arithmetic and logical operations
Remember: Mastering MySQL operators is essential for writing effective queries. Understanding how different operators work and their precedence will help you create more efficient and accurate database operations.

UPDATE & DELETE Statements in MySQL

UPDATE Statement: Modifies existing records in a table by changing column values while keeping the record structure intact.
DELETE Statement: Removes one or more records from a table entirely. Use with extreme caution as this operation is irreversible.

Why UPDATE & DELETE Matter

These statements are essential for database maintenance:

  • Data Correction: Fix errors in existing records
  • Data Maintenance: Keep information current and accurate
  • Data Archiving: Remove obsolete or temporary data
  • Business Logic: Implement status changes and updates
  • Database Cleanup: Remove unwanted or duplicate records

UPDATE Statement Overview

The UPDATE statement modifies existing records:

Clause Purpose Required
UPDATE Specifies table to update Yes
SET Specifies columns and new values Yes
WHERE Specifies which records to update No (but highly recommended)
ORDER BY Specifies update order No
LIMIT Limits number of records updated No

Basic UPDATE Syntax

Let's start with basic UPDATE operations on our students table:

Basic UPDATE Examples
-- Update a single student's email
UPDATE students
  SET email = 'john.new@email.com'
  WHERE student_id = 1;

-- Update multiple columns for a student
UPDATE students
  SET
    first_name = 'Jonathan',
    last_name = 'Doerson',
    email = 'jonathan.doerson@email.com'
  WHERE student_id = 1;

-- Update with expression
UPDATE products
  SET price = price * 1.1 -- Increase price by 10%
  WHERE category = 'Electronics';
Query OK, 1 row affected (0.02 sec)
Rows matched: 1 Changed: 1 Warnings: 0
⚠️ CRITICAL WARNING: Always use a WHERE clause with UPDATE statements unless you intentionally want to update every record in the table. Omitting WHERE can accidentally update all records!

Advanced UPDATE Operations

More complex UPDATE scenarios with multiple conditions and joins:

Advanced UPDATE Examples
-- Update with multiple conditions
UPDATE students
  SET status = 'alumni'
  WHERE graduation_year < 2020
    AND status = 'active';

-- Update with LIMIT (safe testing)
UPDATE students
  SET email = CONCAT(username, '@university.edu')
  WHERE email IS NULL
  LIMIT 10; -- Update only 10 records

-- Update using subquery
UPDATE students
  SET scholarship_amount = (
    SELECT AVG(scholarship_amount)
    FROM students
    WHERE department = 'Computer Science'
  )
  WHERE student_id = 5;

UPDATE with JOIN

Update records based on values from related tables:

UPDATE with JOIN Examples
-- Update student status based on enrollment count
UPDATE students s
  JOIN (
    SELECT student_id, COUNT(*) AS course_count
    FROM enrollments
    GROUP BY student_id
  ) e ON s.student_id = e.student_id
  SET s.status = 'full_time'
  WHERE e.course_count >= 4;

-- Update product prices based on supplier costs
UPDATE products p
  JOIN suppliers s ON p.supplier_id = s.supplier_id
  SET p.price = s.cost_price * 1.3 -- 30% markup
  WHERE p.category = 'Electronics';

DELETE Statement Overview

The DELETE statement removes records from a table:

Clause Purpose Required
DELETE FROM Specifies table to delete from Yes
WHERE Specifies which records to delete No (but CRITICAL to include)
ORDER BY Specifies deletion order No
LIMIT Limits number of records deleted No

Basic DELETE Syntax

Let's explore DELETE operations with proper safety measures:

Basic DELETE Examples
-- Delete a specific student
DELETE FROM students
  WHERE student_id = 10;

-- Delete students with specific criteria
DELETE FROM students
  WHERE graduation_year < 2010
    AND status = 'inactive';

-- Safe DELETE with LIMIT
DELETE FROM temp_logs
  WHERE created_date < '2023-01-01'
  LIMIT 1000; -- Delete in batches
Query OK, 1 row affected (0.01 sec)
⚠️ EXTREME CAUTION: DELETE operations are PERMANENT and cannot be undone. Always backup your data and test your WHERE clause with a SELECT statement first to verify which records will be affected.

DELETE Safety Practices

Practice Description Example
Test with SELECT first Verify records to be deleted SELECT * FROM table WHERE condition
Use transactions Allow rollback if needed START TRANSACTION; DELETE...; ROLLBACK;
Use LIMIT Delete in controlled batches DELETE ... LIMIT 1000
Backup data Always have recent backups mysqldump database table
Use soft deletes Mark as deleted instead of removing UPDATE SET is_deleted = 1

Safe DELETE Workflow

Always follow this safety procedure before deleting:

Safe DELETE Procedure
-- Step 1: Test with SELECT first
SELECT * FROM students
  WHERE graduation_year < 2010
    AND status = 'inactive';

-- Step 2: Use transaction for safety
START TRANSACTION;

-- Step 3: Perform DELETE within transaction
DELETE FROM students
  WHERE graduation_year < 2010
    AND status = 'inactive';

-- Step 4: Verify the results
SELECT ROW_COUNT(); -- Check how many rows were affected

-- Step 5: COMMIT if correct, ROLLBACK if mistake
COMMIT; -- OR ROLLBACK; to undo

DELETE with JOIN

Delete records based on relationships with other tables:

DELETE with JOIN Examples
-- Delete enrollments for students who graduated
DELETE e FROM enrollments e
  JOIN students s ON e.student_id = s.student_id
  WHERE s.graduation_year < 2020;

-- Delete using subquery (alternative approach)
DELETE FROM enrollments
  WHERE student_id IN (
    SELECT student_id
    FROM students
    WHERE graduation_year < 2020
  );

-- Delete orphaned records (no foreign key relationship)
DELETE e FROM enrollments e
  LEFT JOIN students s ON e.student_id = s.student_id
  WHERE s.student_id IS NULL;

TRUNCATE vs DELETE

Understanding the difference between TRUNCATE and DELETE:

Aspect DELETE TRUNCATE
Operation Type DML (Data Manipulation Language) DDL (Data Definition Language)
WHERE Clause Supports WHERE for conditional deletion No WHERE clause - removes all rows
Performance Slower for large tables (row-by-row) Faster (drops and recreates table)
Auto-increment Does not reset auto-increment counter Resets auto-increment counter to 1
Triggers Fires DELETE triggers Does not fire triggers
Rollback Can be rolled back in transactions Cannot be rolled back in some databases
Use Case Selective deletion of specific records Complete table cleanup
TRUNCATE Example
-- Remove all data from a temporary table
TRUNCATE TABLE temp_logs;

-- ❌ TRUNCATE cannot use WHERE clause
-- This will cause an error:
-- TRUNCATE TABLE students WHERE status = 'inactive';

Handling Foreign Key Constraints

Special considerations when updating or deleting related records:

Foreign Key Constraint Examples
-- This will fail if enrollments reference the student
DELETE FROM students
  WHERE student_id = 1;
-- Error: Cannot delete or update a parent row: a foreign key constraint fails

-- Solution 1: Delete child records first
DELETE FROM enrollments
  WHERE student_id = 1;
DELETE FROM students
  WHERE student_id = 1;

-- Solution 2: Use CASCADE delete (if defined)
-- When foreign key is defined with ON DELETE CASCADE
DELETE FROM students
  WHERE student_id = 1;
-- Automatically deletes related enrollments

Best Practices for UPDATE & DELETE

Safety Guidelines:
  • Always backup data before major UPDATE/DELETE operations
  • Use transactions for multi-step operations
  • Test WHERE clauses with SELECT first
  • Use LIMIT for large operations to avoid locking
  • Consider using soft deletes (is_deleted flag) instead of physical deletes
  • Be mindful of foreign key constraints and CASCADE rules
  • Keep audit logs of significant data changes

Common UPDATE & DELETE Mistakes

  • Omitting WHERE clause: Updates/deletes all records
  • Incorrect WHERE conditions: Affects wrong records
  • Not testing with SELECT first: No verification
  • Ignoring foreign key constraints: Causes errors
  • Not using transactions: No rollback capability
  • Forgetting about triggers: Unexpected side effects
  • Not considering performance: Locks large tables

Practice Exercise

Practice these UPDATE and DELETE scenarios safely:

  1. Update all product prices by applying a 15% discount for clearance items
  2. Change the email domain for all students from @old-school.edu to @new-school.edu
  3. Delete temporary log records older than 30 days (use LIMIT for safety)
  4. Update student status to 'graduated' for those with graduation_year in the past
  5. Delete duplicate student records (keep the one with the latest enrollment_date)
  6. Use a transaction to safely update multiple related tables
  7. Practice the safe DELETE workflow with SELECT verification
  8. Update records using a JOIN with another table
Remember: With great power comes great responsibility. UPDATE and DELETE are powerful operations that can significantly impact your data. Always prioritize safety, verification, and backups when working with these statements.

JOIN Operations in MySQL

JOIN Operations: Used to combine rows from two or more tables based on related columns between them. JOINs are fundamental for working with relational databases and retrieving related data.

Why JOIN Operations Matter

JOIN operations are essential for:

  • Data Relationships: Combining related data from multiple tables
  • Normalized Databases: Working with properly structured databases
  • Complex Queries: Answering business questions across multiple entities
  • Reporting: Creating comprehensive reports from related data
  • Data Analysis: Analyzing relationships between different data entities

Sample Data for JOIN Examples

Let's set up sample tables to demonstrate different JOIN types:

Sample Tables Setup
-- Students table
CREATE TABLE students (
  student_id INT PRIMARY KEY AUTO_INCREMENT,
  first_name VARCHAR(50),
  last_name VARCHAR(50)
);

-- Courses table
CREATE TABLE courses (
  course_id INT PRIMARY KEY AUTO_INCREMENT,
  course_name VARCHAR(100)
);

-- Enrollments table (junction table)
CREATE TABLE enrollments (
  enrollment_id INT PRIMARY KEY AUTO_INCREMENT,
  student_id INT,
  course_id INT,
  grade CHAR(2),
  FOREIGN KEY (student_id) REFERENCES students(student_id),
  FOREIGN KEY (course_id) REFERENCES courses(course_id)
);

JOIN Types Overview

JOIN Type Description Venn Diagram Use Case
INNER JOIN Returns matching rows from both tables Intersection Find related records that exist in both tables
LEFT JOIN Returns all rows from left table + matching rows from right Full left circle + intersection Find all records from left table with optional related data
RIGHT JOIN Returns all rows from right table + matching rows from left Full right circle + intersection Find all records from right table with optional related data
FULL OUTER JOIN Returns all rows when there's a match in either table Union of both circles Find all records from both tables (MySQL workaround needed)
CROSS JOIN Returns Cartesian product of both tables All combinations Generate all possible combinations
SELF JOIN Join a table with itself Same table Find relationships within the same table

INNER JOIN

Returns only the rows that have matching values in both tables:

INNER JOIN Examples
-- Basic INNER JOIN syntax
SELECT
  s.first_name, s.last_name,
  c.course_name, e.grade
FROM students s
  INNER JOIN enrollments e ON s.student_id = e.student_id
  INNER JOIN courses c ON e.course_id = c.course_id;

-- INNER JOIN with WHERE clause
SELECT
  s.first_name, s.last_name, c.course_name
FROM students s
  INNER JOIN enrollments e ON s.student_id = e.student_id
  INNER JOIN courses c ON e.course_id = c.course_id
WHERE c.course_name = 'Mathematics';

-- Traditional syntax (older style, not recommended)
SELECT s.first_name, s.last_name, c.course_name
FROM students s, enrollments e, courses c
WHERE s.student_id = e.student_id
  AND e.course_id = c.course_id;
+------------+-----------+-------------+-------+
| first_name | last_name | course_name | grade |
+------------+-----------+-------------+-------+
| John | Doe | Mathematics | A |
| Jane | Smith | Science | B+ |
| Bob | Johnson | Mathematics | A- |
+------------+-----------+-------------+-------+
3 rows in set (0.00 sec)

LEFT JOIN

Returns all rows from the left table and matched rows from the right table. Unmatched rows from the right table will contain NULLs:

LEFT JOIN Examples
-- Find all students and their enrollments (if any)
SELECT
  s.first_name, s.last_name,
  c.course_name, e.grade
FROM students s
  LEFT JOIN enrollments e ON s.student_id = e.student_id
  LEFT JOIN courses c ON e.course_id = c.course_id;

-- Find students with no enrollments
SELECT
  s.first_name, s.last_name
FROM students s
  LEFT JOIN enrollments e ON s.student_id = e.student_id
WHERE e.enrollment_id IS NULL;

-- LEFT JOIN with multiple conditions
SELECT
  s.first_name, s.last_name,
  c.course_name, e.grade
FROM students s
  LEFT JOIN enrollments e ON s.student_id = e.student_id
    AND e.grade > 'B'
  LEFT JOIN courses c ON e.course_id = c.course_id;

RIGHT JOIN

Returns all rows from the right table and matched rows from the left table. Unmatched rows from the left table will contain NULLs:

RIGHT JOIN Examples
-- Find all courses and students enrolled in them
SELECT
  c.course_name,
  s.first_name, s.last_name
FROM students s
  RIGHT JOIN enrollments e ON s.student_id = e.student_id
  RIGHT JOIN courses c ON e.course_id = c.course_id;

-- Find courses with no students enrolled
SELECT
  c.course_name
FROM students s
  RIGHT JOIN enrollments e ON s.student_id = e.student_id
  RIGHT JOIN courses c ON e.course_id = c.course_id
WHERE s.student_id IS NULL;
Note: RIGHT JOIN is less commonly used than LEFT JOIN. Most developers prefer to structure their queries using LEFT JOIN for better readability and consistency.

FULL OUTER JOIN (MySQL Workaround)

MySQL doesn't support FULL OUTER JOIN directly, but we can simulate it using UNION:

FULL OUTER JOIN Simulation
-- Simulate FULL OUTER JOIN using UNION
SELECT
  s.first_name, s.last_name,
  c.course_name, e.grade
FROM students s
  LEFT JOIN enrollments e ON s.student_id = e.student_id
  LEFT JOIN courses c ON e.course_id = c.course_id

UNION

SELECT
  s.first_name, s.last_name,
  c.course_name, e.grade
FROM students s
  RIGHT JOIN enrollments e ON s.student_id = e.student_id
  RIGHT JOIN courses c ON e.course_id = c.course_id
WHERE s.student_id IS NULL;

CROSS JOIN

Returns the Cartesian product of both tables (every row from first table combined with every row from second table):

CROSS JOIN Examples
-- Generate all possible student-course combinations
SELECT
  s.first_name, s.last_name,
  c.course_name
FROM students s
  CROSS JOIN courses c;

-- CROSS JOIN with WHERE (similar to INNER JOIN)
SELECT
  s.first_name, s.last_name,
  c.course_name
FROM students s
  CROSS JOIN courses c
WHERE s.department = c.department;

SELF JOIN

Join a table with itself to find relationships within the same table:

SELF JOIN Examples
-- Find students who share the same last name
SELECT
  a.first_name, a.last_name,
  b.first_name AS same_last_name_student
FROM students a
  INNER JOIN students b ON a.last_name = b.last_name
WHERE a.student_id != b.student_id;

-- Find hierarchical relationships (manager-employee)
SELECT
  e.first_name AS employee_name,
  m.first_name AS manager_name
FROM employees e
  LEFT JOIN employees m ON e.manager_id = m.employee_id;

JOIN Conditions and Multiple Tables

Scenario JOIN Approach Example
Two tables Single JOIN condition FROM A JOIN B ON A.id = B.a_id
Three tables Two JOIN conditions FROM A JOIN B ON A.id = B.a_id JOIN C ON B.id = C.b_id
Multiple conditions AND in ON clause FROM A JOIN B ON A.id = B.a_id AND A.type = B.type
Complex relationships Multiple JOIN types FROM A LEFT JOIN B ON... INNER JOIN C ON...
Self-referencing SELF JOIN with aliases FROM A a1 JOIN A a2 ON a1.parent = a2.id

JOIN Performance Considerations

Factor Impact Recommendation
Indexes High impact Index columns used in JOIN conditions
JOIN Order Moderate impact Put smaller tables first in JOIN sequence
JOIN Type High impact Use INNER JOIN when possible, avoid CROSS JOIN
WHERE Clause High impact Filter early with WHERE before JOINs
Number of Tables High impact Avoid joining too many tables in one query
Data Types Moderate impact Ensure JOIN columns have same data types

Common JOIN Patterns

Real-world scenarios where different JOIN types are useful:

Common JOIN Patterns
-- Pattern 1: Get orders with customer information
SELECT o.order_id, o.order_date, c.customer_name
FROM orders o
  INNER JOIN customers c ON o.customer_id = c.customer_id;

-- Pattern 2: Find products never ordered
SELECT p.product_name
FROM products p
  LEFT JOIN order_items oi ON p.product_id = oi.product_id
WHERE oi.order_id IS NULL;

-- Pattern 3: Many-to-many relationship
SELECT s.first_name, s.last_name, c.course_name
FROM students s
  INNER JOIN enrollments e ON s.student_id = e.student_id
  INNER JOIN courses c ON e.course_id = c.course_id;

-- Pattern 4: Hierarchical data
SELECT e.employee_name, m.employee_name AS manager_name
FROM employees e
  LEFT JOIN employees m ON e.manager_id = m.employee_id;

JOIN Best Practices

JOIN Guidelines:
  • Always use explicit JOIN syntax instead of comma-separated tables
  • Use table aliases for better readability with multiple tables
  • Put filter conditions in WHERE clause rather than ON clause when possible
  • Use INNER JOIN when you only want matching records
  • Use LEFT JOIN when you want all records from the left table
  • Index columns used in JOIN conditions for better performance
  • Be consistent with your JOIN syntax throughout the application
  • Test JOIN queries with EXPLAIN to understand performance

Common JOIN Mistakes

  • Cartesian products: Forgetting JOIN conditions
  • Wrong JOIN type: Using INNER JOIN when LEFT JOIN is needed
  • Performance issues: Joining too many tables without indexes
  • Ambiguous columns: Not qualifying column names with table aliases
  • NULL handling: Not considering NULL values in JOIN conditions
  • Condition placement: Putting filter conditions in wrong clause (ON vs WHERE)
  • Data type mismatches: Joining columns with different data types

Practice Exercise

Practice these JOIN scenarios:

  1. Create INNER JOIN to find all students enrolled in 'Computer Science' courses
  2. Use LEFT JOIN to find all courses and count how many students are enrolled in each
  3. Write a query to find students who are not enrolled in any courses
  4. Create a SELF JOIN to find employees and their managers
  5. Use multiple JOINs to display student names, course names, and instructor names
  6. Write a query using RIGHT JOIN to find all departments and their employees
  7. Create a CROSS JOIN to generate all possible product-size combinations
  8. Use JOIN with aggregate functions to find the average grade per course
Remember: Mastering JOIN operations is crucial for working with relational databases. Understanding the different JOIN types and when to use each will enable you to extract meaningful insights from your data relationships.

Aggregate Functions in MySQL

Aggregate Functions: Functions that perform calculations on multiple rows and return a single summary value. They are essential for data analysis, reporting, and statistical operations.

Why Aggregate Functions Matter

Aggregate functions are crucial for:

  • Data Summarization: Creating summary reports from detailed data
  • Statistical Analysis: Calculating averages, totals, and distributions
  • Business Intelligence: Supporting decision-making with summarized data
  • Performance Monitoring: Tracking metrics and KPIs
  • Data Quality: Identifying data patterns and anomalies

Common Aggregate Functions Overview

Function Description Data Types NULL Handling
COUNT() Counts the number of rows All types Ignores NULL in COUNT(column)
SUM() Calculates the sum of values Numeric Ignores NULL values
AVG() Calculates the average value Numeric Ignores NULL values
MIN() Finds the minimum value All comparable types Ignores NULL values
MAX() Finds the maximum value All comparable types Ignores NULL values
GROUP_CONCAT() Concatenates values from a group String Ignores NULL values
STDDEV() Calculates standard deviation Numeric Ignores NULL values
VARIANCE() Calculates statistical variance Numeric Ignores NULL values

Sample Data for Examples

Let's use a sample sales data table for our examples:

Sample Sales Data
CREATE TABLE sales (
  sale_id INT PRIMARY KEY AUTO_INCREMENT,
  product_name VARCHAR(100),
  category VARCHAR(50),
  sale_date DATE,
  quantity INT,
  unit_price DECIMAL(10, 2),
  region VARCHAR(50)
);

COUNT() Function

Counts the number of rows or non-NULL values:

COUNT() Examples
-- Count all rows in the table
SELECT COUNT(*) AS total_sales FROM sales;

-- Count non-NULL values in a specific column
SELECT COUNT(region) AS sales_with_region FROM sales;

-- Count distinct values
SELECT COUNT(DISTINCT category) AS unique_categories FROM sales;

-- Count with condition
SELECT COUNT(*) AS high_value_sales
FROM sales
WHERE unit_price > 100;
+-------------+
| total_sales |
+-------------+
| 150 |
+-------------+
1 row in set (0.00 sec)

SUM() Function

Calculates the sum of numeric values:

SUM() Examples
-- Total quantity sold
SELECT SUM(quantity) AS total_quantity FROM sales;

-- Total revenue (quantity * unit_price)
SELECT SUM(quantity * unit_price) AS total_revenue FROM sales;

-- Sum with condition
SELECT SUM(unit_price) AS electronics_revenue
FROM sales
WHERE category = 'Electronics';

-- Sum distinct values (rarely used)
SELECT SUM(DISTINCT unit_price) AS sum_unique_prices FROM sales;

AVG() Function

Calculates the average value of numeric columns:

AVG() Examples
-- Average unit price
SELECT AVG(unit_price) AS average_price FROM sales;

-- Average with ROUND for readability
SELECT ROUND(AVG(unit_price), 2) AS average_price FROM sales;

-- Average quantity per sale
SELECT AVG(quantity) AS avg_quantity FROM sales;

-- Average with condition
SELECT AVG(unit_price) AS avg_electronics_price
FROM sales
WHERE category = 'Electronics';

MIN() and MAX() Functions

Find the minimum and maximum values in a column:

MIN() and MAX() Examples
-- Price range analysis
SELECT
  MIN(unit_price) AS min_price,
  MAX(unit_price) AS max_price,
  MAX(unit_price) - MIN(unit_price) AS price_range
FROM sales;

-- Date range of sales
SELECT
  MIN(sale_date) AS first_sale,
  MAX(sale_date) AS last_sale
FROM sales;

-- Minimum and maximum by category
SELECT
  category,
  MIN(unit_price) AS min_price,
  MAX(unit_price) AS max_price
FROM sales
GROUP BY category;

GROUP_CONCAT() Function

Concatenates values from multiple rows into a single string:

GROUP_CONCAT() Examples
-- List all product names in each category
SELECT
  category,
  GROUP_CONCAT(product_name) AS products
FROM sales
GROUP BY category;

-- With custom separator
SELECT
  category,
  GROUP_CONCAT(product_name SEPARATOR ' | ') AS products
FROM sales
GROUP BY category;

-- With DISTINCT and ORDER BY
SELECT
  region,
  GROUP_CONCAT(DISTINCT category ORDER BY category) AS categories
FROM sales
GROUP BY region;

-- Limited number of values
SELECT
  category,
  GROUP_CONCAT(product_name ORDER BY unit_price DESC LIMIT 3) AS top_products
FROM sales
GROUP BY category;

Statistical Aggregate Functions

Advanced statistical calculations:

Statistical Functions
-- Standard deviation and variance
SELECT
  STDDEV(unit_price) AS price_std_dev,
  VARIANCE(unit_price) AS price_variance
FROM sales;

-- Population vs sample standard deviation
SELECT
  STDDEV_POP(unit_price) AS population_stddev,
  STDDEV_SAMP(unit_price) AS sample_stddev
FROM sales;

Using Aggregate Functions with GROUP BY

Group data and calculate aggregates for each group:

GROUP BY with Aggregates
-- Sales summary by category
SELECT
  category,
  COUNT(*) AS total_sales,
  SUM(quantity) AS total_quantity,
  SUM(quantity * unit_price) AS total_revenue,
  AVG(unit_price) AS avg_price
FROM sales
GROUP BY category;

-- Monthly sales summary
SELECT
  YEAR(sale_date) AS sale_year,
  MONTH(sale_date) AS sale_month,
  COUNT(*) AS total_sales,
  SUM(quantity * unit_price) AS monthly_revenue
FROM sales
GROUP BY YEAR(sale_date), MONTH(sale_date)
ORDER BY sale_year, sale_month;

-- Multiple grouping columns
SELECT
  region,
  category,
  COUNT(*) AS sales_count,
  SUM(quantity * unit_price) AS region_category_revenue
FROM sales
GROUP BY region, category
ORDER BY region, category;

HAVING Clause with Aggregate Functions

Filter groups based on aggregate results:

HAVING Clause Examples
-- Categories with more than 10 sales
SELECT
  category,
  COUNT(*) AS sales_count
FROM sales
  GROUP BY category
  HAVING COUNT(*) > 10;

-- Categories with average price over $50
SELECT
  category,
  AVG(unit_price) AS avg_price
FROM sales
  GROUP BY category
  HAVING AVG(unit_price) > 50;

-- Multiple HAVING conditions
SELECT
  region,
  SUM(quantity * unit_price) AS total_revenue,
  COUNT(*) AS sales_count
FROM sales
  GROUP BY region
  HAVING SUM(quantity * unit_price) > 10000
    AND COUNT(*) > 5;

Aggregate Function Behavior

Scenario Behavior Example Result
Empty table with COUNT(*) Returns 0 COUNT(*) → 0
All NULL values with SUM/AVG Returns NULL SUM(NULL_column) → NULL
Mixed NULL and values with SUM Ignores NULL, sums values SUM(10, NULL, 20) → 30
MIN/MAX with all NULLs Returns NULL MAX(NULL_column) → NULL
COUNT(column) with all NULLs Returns 0 COUNT(NULL_column) → 0
GROUP BY with no rows Returns empty result set No rows returned

Advanced Aggregate Techniques

Combining multiple aggregate functions and techniques:

Advanced Aggregate Examples
-- Percentage of total calculation
SELECT
  category,
  SUM(quantity * unit_price) AS category_revenue,
  (SUM(quantity * unit_price) / (SELECT SUM(quantity * unit_price) FROM sales)) * 100 AS revenue_percentage
FROM sales
GROUP BY category;

-- Running totals using window functions (MySQL 8.0+)
SELECT
  sale_date,
  quantity * unit_price AS daily_revenue,
  SUM(quantity * unit_price) OVER (ORDER BY sale_date) AS running_total
FROM sales
ORDER BY sale_date;

-- Conditional aggregates
SELECT
  region,
  SUM(CASE WHEN category = 'Electronics' THEN quantity * unit_price ELSE 0 END) AS electronics_revenue,
  SUM(CASE WHEN category = 'Clothing' THEN quantity * unit_price ELSE 0 END) AS clothing_revenue
FROM sales
GROUP BY region;

Best Practices for Aggregate Functions

Aggregate Guidelines:
  • Use COUNT(*) when you need to count all rows, including NULLs
  • Use COUNT(column) when you want to count non-NULL values only
  • Be aware of NULL handling in aggregate functions
  • Use GROUP BY appropriately to avoid unexpected results
  • Use HAVING for filtering groups, WHERE for filtering rows
  • Consider indexing columns used in GROUP BY for better performance
  • Use ROUND() with AVG() for more readable results
  • Test your aggregate queries with sample data to verify calculations

Common Aggregate Function Mistakes

  • Mixing aggregated and non-aggregated columns without GROUP BY
  • Using WHERE instead of HAVING for aggregate conditions
  • Forgetting NULL handling in calculations
  • Incorrect GROUP BY columns leading to wrong groupings
  • Performance issues with large datasets and no indexes
  • Misunderstanding COUNT(*) vs COUNT(column) behavior
  • Floating point precision issues with AVG()

Practice Exercise

Practice these aggregate function scenarios:

  1. Calculate total revenue, average sale amount, and number of sales
  2. Find the top 3 selling products by quantity
  3. Calculate monthly sales trends with revenue and average order value
  4. Use GROUP_CONCAT to list all customers who purchased each product
  5. Find categories with average price above overall average price
  6. Calculate running totals of sales using window functions
  7. Use conditional aggregates to compare sales across different regions
  8. Find products that have never been sold (using COUNT and LEFT JOIN)
Remember: Aggregate functions are powerful tools for data analysis and reporting. Understanding how each function handles NULL values, works with GROUP BY, and interacts with other SQL clauses will help you create accurate and efficient summary queries.

GROUP BY & HAVING in MySQL

GROUP BY Clause: Groups rows that have the same values in specified columns into summary rows. It's used with aggregate functions to perform operations on each group of rows.
HAVING Clause: Filters groups based on aggregate function results. It's similar to WHERE but works on grouped data rather than individual rows.

Why GROUP BY & HAVING Matter

These clauses are essential for:

  • Data Summarization: Creating meaningful summaries from detailed data
  • Category Analysis: Analyzing data by different categories or segments
  • Statistical Reporting: Generating reports with totals, averages, and counts
  • Data Segmentation: Breaking down data into meaningful groups
  • Business Intelligence: Supporting decision-making with grouped insights

GROUP BY vs HAVING Comparison

Aspect GROUP BY HAVING
Purpose Groups rows into summary rows Filters groups based on aggregate conditions
Operation Level Works on individual rows before grouping Works on grouped results after aggregation
Used With Aggregate functions (COUNT, SUM, AVG, etc.) Aggregate functions in conditions
Position in Query After WHERE, before HAVING After GROUP BY, before ORDER BY
Filter Type Cannot use aggregate functions Must use aggregate functions
Performance Benefits from indexed columns Applied after grouping is complete

Sample Data for Examples

Let's use an enhanced sales dataset for our examples:

Sample Sales Data
CREATE TABLE sales (
  sale_id INT PRIMARY KEY AUTO_INCREMENT,
  product_name VARCHAR(100),
  category VARCHAR(50),
  sale_date DATE,
  quantity INT,
  unit_price DECIMAL(10, 2),
  region VARCHAR(50),
  salesperson VARCHAR(50)
);

Basic GROUP BY Syntax

The fundamental structure of GROUP BY queries:

Basic GROUP BY Structure
SELECT
  column1, column2, ..., aggregate_function(column)
FROM table_name
WHERE condition
GROUP BY column1, column2, ...
HAVING aggregate_condition
ORDER BY column;

Single Column GROUP BY

Grouping by a single column with various aggregate functions:

Single Column Grouping Examples
-- Count sales by category
SELECT
  category,
  COUNT(*) AS sales_count
FROM sales
GROUP BY category
ORDER BY sales_count DESC;

-- Total revenue by region
SELECT
  region,
  SUM(quantity * unit_price) AS total_revenue
FROM sales
GROUP BY region
ORDER BY total_revenue DESC;

-- Average price by category
SELECT
  category,
  ROUND(AVG(unit_price), 2) AS average_price,
  MIN(unit_price) AS min_price,
  MAX(unit_price) AS max_price
FROM sales
GROUP BY category
ORDER BY average_price DESC;
+-------------+-------------+
| category | sales_count |
+-------------+-------------+
| Electronics | 45 |
| Clothing | 38 |
| Books | 27 |
+-------------+-------------+
3 rows in set (0.00 sec)

Multiple Column GROUP BY

Grouping by multiple columns for more detailed analysis:

Multiple Column Grouping Examples
-- Sales by region and category
SELECT
  region,
  category,
  COUNT(*) AS sales_count,
  SUM(quantity * unit_price) AS total_revenue
FROM sales
GROUP BY region, category
ORDER BY region, total_revenue DESC;

-- Monthly sales by category
SELECT
  YEAR(sale_date) AS sale_year,
  MONTH(sale_date) AS sale_month,
  category,
  SUM(quantity) AS total_quantity,
  SUM(quantity * unit_price) AS monthly_revenue
FROM sales
GROUP BY YEAR(sale_date), MONTH(sale_date), category
ORDER BY sale_year, sale_month, category;

-- Salesperson performance by region and category
SELECT
  salesperson,
  region,
  category,
  COUNT(*) AS deals_closed,
  SUM(quantity * unit_price) AS total_sales
FROM sales
GROUP BY salesperson, region, category
ORDER BY total_sales DESC;

GROUP BY with Expressions

Using expressions and functions in GROUP BY clauses:

GROUP BY with Expressions
-- Group by year and month
SELECT
  YEAR(sale_date) AS sale_year,
  MONTH(sale_date) AS sale_month,
  COUNT(*) AS sales_count
FROM sales
GROUP BY YEAR(sale_date), MONTH(sale_date)
ORDER BY sale_year, sale_month;

-- Group by price ranges
SELECT
  CASE
    WHEN unit_price < 50 THEN 'Budget'
    WHEN unit_price < 100 THEN 'Mid-range'
    ELSE 'Premium'
  END AS price_category,
  COUNT(*) AS product_count,
  AVG(unit_price) AS avg_price
FROM sales
GROUP BY price_category
ORDER BY avg_price;

-- Group by quarter
SELECT
  YEAR(sale_date) AS sale_year,
  QUARTER(sale_date) AS sale_quarter,
  SUM(quantity * unit_price) AS quarterly_revenue
FROM sales
GROUP BY YEAR(sale_date), QUARTER(sale_date)
ORDER BY sale_year, sale_quarter;

HAVING Clause Fundamentals

Using HAVING to filter groups based on aggregate conditions:

Basic HAVING Examples
-- Categories with more than 10 sales
SELECT
  category,
  COUNT(*) AS sales_count
FROM sales
GROUP BY category
HAVING COUNT(*) > 10
ORDER BY sales_count DESC;

-- Regions with total revenue over $10,000
SELECT
  region,
  SUM(quantity * unit_price) AS total_revenue
FROM sales
GROUP BY region
HAVING SUM(quantity * unit_price) > 10000
ORDER BY total_revenue DESC;

-- Categories with average price above $50
SELECT
  category,
  AVG(unit_price) AS average_price
FROM sales
GROUP BY category
HAVING AVG(unit_price) > 50
ORDER BY average_price DESC;

Complex HAVING Conditions

Using multiple conditions and different aggregate functions in HAVING:

Complex HAVING Conditions
-- Multiple HAVING conditions with AND
SELECT
  region,
  COUNT(*) AS sales_count,
  SUM(quantity * unit_price) AS total_revenue
FROM sales
GROUP BY region
  HAVING COUNT(*) > 5
    AND SUM(quantity * unit_price) > 5000
ORDER BY total_revenue DESC;

-- HAVING with OR conditions
SELECT
  category,
  COUNT(*) AS sales_count,
  AVG(unit_price) AS avg_price
FROM sales
GROUP BY category
  HAVING COUNT(*) > 20
    OR AVG(unit_price) > 100
ORDER BY sales_count DESC;

-- HAVING with different aggregate functions
SELECT
  salesperson,
  COUNT(*) AS deals,
  AVG(unit_price) AS avg_deal_size,
  SUM(quantity * unit_price) AS total_sales
FROM sales
GROUP BY salesperson
  HAVING COUNT(*) >= 10
    AND AVG(unit_price) > 50
    AND SUM(quantity * unit_price) > 1000
ORDER BY total_sales DESC;

WHERE vs HAVING Comparison

Scenario Use WHERE Use HAVING
Filter individual rows ✓ BEFORE grouping ✗ Not applicable
Filter based on aggregate results ✗ Cannot use aggregates ✓ AFTER grouping
Filter on non-aggregated columns ✓ More efficient ✓ Possible but less efficient
Performance impact Reduces rows before grouping Filters after grouping is done
Column aliases ✗ Cannot use SELECT aliases ✓ Can use SELECT aliases
WHERE vs HAVING Examples
-- Efficient: WHERE filters rows before grouping
SELECT
  category,
  COUNT(*) AS sales_count
FROM sales
WHERE sale_date >= '2023-01-01'
GROUP BY category
HAVING COUNT(*) > 5;

-- Inefficient: HAVING filters after grouping (same result)
SELECT
  category,
  COUNT(*) AS sales_count
FROM sales
GROUP BY category
HAVING COUNT(*) > 5
    AND MIN(sale_date) >= '2023-01-01';

GROUP BY with JOINs

Combining GROUP BY with table joins for complex analysis:

GROUP BY with JOIN Examples
-- Create related tables
CREATE TABLE products (
  product_id INT PRIMARY KEY,
  product_name VARCHAR(100),
  supplier_id INT
);
CREATE TABLE suppliers (
  supplier_id INT PRIMARY KEY,
  supplier_name VARCHAR(100)
);

-- Sales by supplier
SELECT
  s.supplier_name,
  COUNT(*) AS sales_count,
  SUM(sa.quantity * sa.unit_price) AS total_revenue
FROM sales sa
  JOIN products p ON sa.product_name = p.product_name
  JOIN suppliers s ON p.supplier_id = s.supplier_id
GROUP BY s.supplier_name
HAVING SUM(sa.quantity * sa.unit_price) > 5000
ORDER BY total_revenue DESC;

Advanced GROUP BY Techniques

Rollup, grouping sets, and other advanced grouping features:

Advanced Grouping Examples
-- WITH ROLLUP for subtotals (MySQL extension)
SELECT
  region,
  category,
  SUM(quantity * unit_price) AS total_revenue
FROM sales
GROUP BY region, category WITH ROLLUP;

-- Filtering ROLLUP results
SELECT
  COALESCE(region, 'All Regions') AS region,
  COALESCE(category, 'All Categories') AS category,
  SUM(quantity * unit_price) AS total_revenue
FROM sales
GROUP BY region, category WITH ROLLUP
HAVING total_revenue > 1000;

GROUP BY Best Practices

Grouping Guidelines:
  • Include all non-aggregated columns from SELECT in GROUP BY
  • Use WHERE to filter rows before grouping for better performance
  • Use HAVING only for conditions involving aggregate functions
  • Index columns used in GROUP BY for large datasets
  • Be mindful of NULL values in grouping columns
  • Use meaningful column aliases for aggregate results
  • Test your GROUP BY queries with sample data to verify groupings
  • Consider the order of grouping columns for logical results

Common GROUP BY & HAVING Mistakes

  • Omitting non-aggregated columns from GROUP BY clause
  • Using HAVING instead of WHERE for non-aggregate conditions
  • Confusing column order in multiple column GROUP BY
  • Forgetting about NULL groups in results
  • Performance issues with unindexed GROUP BY columns
  • Incorrect aggregate calculations due to wrong groupings
  • Misunderstanding WHERE vs HAVING execution order

Practice Exercise

Practice these GROUP BY and HAVING scenarios:

  1. Find the top 5 best-selling products by total quantity sold
  2. Calculate monthly revenue trends and identify months with revenue over $10,000
  3. Find salespeople who have closed more than 15 deals with average deal size over $75
  4. Identify regions where Electronics category sales exceed Clothing sales
  5. Calculate the percentage contribution of each category to total revenue
  6. Use WITH ROLLUP to create a sales summary with regional and category subtotals
  7. Find products that have been sold in all regions
  8. Identify seasonal patterns by grouping sales by quarter and year
Remember: GROUP BY and HAVING are powerful tools for data analysis and reporting. Understanding the difference between filtering rows (WHERE) and filtering groups (HAVING) is crucial for writing efficient and accurate queries. Always consider the execution order and performance implications when working with grouped data.

Subqueries in MySQL

Subquery: A SQL query nested inside another query. Also known as inner query or nested query, it's used to return data that will be used in the main query as a condition to further restrict the data to be retrieved.

Why Subqueries Matter

Subqueries are essential for:

  • Complex Filtering: Creating dynamic conditions based on other query results
  • Data Comparison: Comparing values against calculated results
  • Multi-step Analysis: Breaking complex problems into manageable steps
  • Flexible Queries: Writing queries that adapt to changing data
  • Advanced Reporting: Creating sophisticated business intelligence queries

Subquery Types Overview

Type Description Common Use Example
Scalar Subquery Returns single value Comparisons, calculations WHERE price > (SELECT AVG(price) FROM products)
Row Subquery Returns single row Multiple column comparisons WHERE (id, name) = (SELECT id, name FROM...)
Column Subquery Returns single column IN, ANY, ALL operators WHERE id IN (SELECT product_id FROM orders)
Table Subquery Returns multiple rows/columns FROM clause, JOINs SELECT * FROM (SELECT ...) AS subquery
Correlated Subquery References outer query Row-by-row processing WHERE salary > (SELECT AVG(salary) FROM employees e2 WHERE e1.dept = e2.dept)

Sample Data for Examples

Let's use employee and department data for our examples:

Sample Tables Setup
CREATE TABLE employees (
  employee_id INT PRIMARY KEY AUTO_INCREMENT,
  first_name VARCHAR(50),
  last_name VARCHAR(50),
  department VARCHAR(50),
  salary DECIMAL(10, 2),
  hire_date DATE
);

CREATE TABLE departments (
  department_id INT PRIMARY KEY,
  department_name VARCHAR(50),
  budget DECIMAL(12, 2)
);

CREATE TABLE projects (
  project_id INT PRIMARY KEY,
  project_name VARCHAR(100),
  department VARCHAR(50),
  budget DECIMAL(12, 2)
);

Subqueries in WHERE Clause

Using subqueries to create dynamic conditions:

WHERE Clause Subqueries
-- Employees with above-average salary
SELECT first_name, last_name, salary
FROM employees
WHERE salary > (
  SELECT AVG(salary) FROM employees
);

-- Employees in departments with high budget
SELECT first_name, last_name, department
FROM employees
WHERE department IN (
  SELECT department_name FROM departments
  WHERE budget > 1000000
);

-- Employees hired after the last hire in IT department
SELECT first_name, last_name, hire_date
FROM employees
WHERE hire_date > (
  SELECT MAX(hire_date) FROM employees
  WHERE department = 'IT'
);
+------------+-----------+----------+
| first_name | last_name | salary |
+------------+-----------+----------+
| John | Doe | 85000.00 |
| Jane | Smith | 92000.00 |
| Bob | Johnson | 78000.00 |
+------------+-----------+----------+
3 rows in set (0.00 sec)

Subqueries with Comparison Operators

Operator Description Subquery Must Return Example
=, !=, >, <, >=, <= Single value comparison Single value WHERE salary > (SELECT AVG(salary) FROM...)
IN / NOT IN Check list membership Single column WHERE department IN (SELECT name FROM depts)
ANY / SOME Compare to any value in list Single column WHERE salary > ANY (SELECT salary FROM...)
ALL Compare to all values in list Single column WHERE salary > ALL (SELECT salary FROM...)
EXISTS / NOT EXISTS Check if subquery returns rows Any result set WHERE EXISTS (SELECT 1 FROM projects WHERE...)
Comparison Operator Examples
-- IN: Employees in specific departments
SELECT first_name, last_name, department
FROM employees
WHERE department IN (
  SELECT department_name FROM departments
  WHERE budget > 500000
);

-- ANY: Employees earning more than any IT employee
SELECT first_name, last_name, salary
FROM employees
WHERE salary > ANY (
  SELECT salary FROM employees
  WHERE department = 'IT'
);

-- ALL: Employees earning more than all Marketing employees
SELECT first_name, last_name, salary
FROM employees
WHERE salary > ALL (
  SELECT salary FROM employees
  WHERE department = 'Marketing'
);

EXISTS and NOT EXISTS

Checking for existence of related records:

EXISTS Examples
-- Departments that have employees
SELECT department_name
FROM departments d
WHERE EXISTS (
  SELECT 1 FROM employees e
  WHERE e.department = d.department_name
);

-- Employees without projects
SELECT first_name, last_name
FROM employees e
WHERE NOT EXISTS (
  SELECT 1 FROM projects p
  WHERE p.department = e.department
);

-- Departments with high-budget projects
SELECT department_name
FROM departments d
WHERE EXISTS (
  SELECT 1 FROM projects p
  WHERE p.department = d.department_name
    AND p.budget > 100000
);

Correlated Subqueries

Subqueries that reference columns from the outer query:

Correlated Subquery Examples
-- Employees earning more than their department average
SELECT first_name, last_name, department, salary
FROM employees e1
WHERE salary > (
  SELECT AVG(salary) FROM employees e2
  WHERE e2.department = e1.department
);

-- Latest hire in each department
SELECT first_name, last_name, department, hire_date
FROM employees e1
WHERE hire_date = (
  SELECT MAX(hire_date) FROM employees e2
  WHERE e2.department = e1.department
);

-- Employees with salary in top 3 of their department
SELECT first_name, last_name, department, salary
FROM employees e1
WHERE (
  SELECT COUNT(DISTINCT e2.salary)
  FROM employees e2
  WHERE e2.department = e1.department
    AND e2.salary > e1.salary
) < 3;

Subqueries in SELECT Clause

Using subqueries to calculate additional columns:

SELECT Clause Subqueries
-- Employee salary compared to department average
SELECT
  first_name, last_name, department, salary,
  (
    SELECT AVG(salary) FROM employees e2
    WHERE e2.department = e1.department
  ) AS department_avg_salary,
  salary - (
    SELECT AVG(salary) FROM employees e2
    WHERE e2.department = e1.department
  ) AS difference_from_avg
FROM employees e1
ORDER BY department, salary DESC;

-- Department info with employee count
SELECT
  department_name,
  budget,
  (
    SELECT COUNT(*) FROM employees e
    WHERE e.department = d.department_name
  ) AS employee_count
FROM departments d;

Subqueries in FROM Clause (Derived Tables)

Using subqueries as temporary tables in the FROM clause:

FROM Clause Subqueries
-- Department summary using derived table
SELECT
  department,
  avg_salary,
  employee_count
FROM (
  SELECT
    department,
    AVG(salary) AS avg_salary,
    COUNT(*) AS employee_count
  FROM employees
  GROUP BY department
  ) AS dept_summary
WHERE avg_salary > 60000
ORDER BY avg_salary DESC;

-- Complex analysis with multiple derived tables
SELECT
  d.department_name,
  d.budget,
  es.avg_salary,
  es.employee_count
FROM departments d
  JOIN (
    SELECT
      department,
      AVG(salary) AS avg_salary,
      COUNT(*) AS employee_count
    FROM employees
    GROUP BY department
  ) es ON d.department_name = es.department;

Subqueries in HAVING Clause

Using subqueries to filter grouped results:

HAVING Clause Subqueries
-- Departments with average salary above company average
SELECT
  department,
  AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > (
  SELECT AVG(salary) FROM employees
);

-- Departments with more employees than the smallest department
SELECT
  department,
  COUNT(*) AS employee_count
FROM employees
GROUP BY department
HAVING COUNT(*) > (
  SELECT MIN(emp_count) FROM (
    SELECT COUNT(*) AS emp_count
    FROM employees
    GROUP BY department
  ) AS dept_counts
);

Subquery Performance Considerations

Subquery Type Performance Impact Optimization Tips
Scalar Subquery Low to Moderate Ensure subquery returns single value quickly
Correlated Subquery High (runs for each row) Consider rewriting as JOIN where possible
IN Subquery Moderate to High Use EXISTS instead for better performance
Derived Table (FROM) Moderate Limit rows in subquery, add indexes
EXISTS Subquery Good (stops at first match) Use indexed columns in correlation

Subquery vs JOIN Comparison

Aspect Subquery JOIN
Readability Often more intuitive for complex conditions Can be cleaner for simple relationships
Performance Correlated subqueries can be slow Generally faster for large datasets
Flexibility Better for complex calculations and conditions Better for combining multiple tables
NULL Handling EXISTS handles NULLs well JOINs may exclude NULL relationships
Use Case When you need to check conditions or calculate values When you need to combine related data
Subquery vs JOIN Examples
-- Using SUBQUERY: Employees in high-budget departments
SELECT first_name, last_name
FROM employees
WHERE department IN (
  SELECT department_name FROM departments
  WHERE budget > 1000000
);

-- Using JOIN: Same result
SELECT DISTINCT e.first_name, e.last_name
FROM employees e
  JOIN departments d ON e.department = d.department_name
WHERE d.budget > 1000000;

Common Subquery Patterns

Real-world scenarios where subqueries are particularly useful:

Common Subquery Patterns
-- Pattern 1: Finding maximum/minimum per group
SELECT product_name, category, price
FROM products p1
WHERE price = (
  SELECT MAX(price) FROM products p2
  WHERE p2.category = p1.category
);

-- Pattern 2: Checking for existence of related records
SELECT customer_name
FROM customers c
WHERE EXISTS (
  SELECT 1 FROM orders o
  WHERE o.customer_id = c.customer_id
    AND o.order_date >= '2023-01-01'
);

-- Pattern 3: Comparing to aggregate values
SELECT employee_name, salary
FROM employees
WHERE salary > (
  SELECT AVG(salary) FROM employees
);

Subquery Best Practices

Subquery Guidelines:
  • Use EXISTS instead of IN when checking for existence (better performance)
  • Avoid correlated subqueries in large datasets when possible
  • Use derived tables (FROM subqueries) for complex calculations
  • Always test subqueries independently first
  • Use meaningful aliases for subquery tables
  • Consider rewriting correlated subqueries as JOINs for better performance
  • Use appropriate indexes on columns used in subquery conditions
  • Limit the number of rows returned by subqueries when possible

Common Subquery Mistakes

  • Returning multiple rows when single value expected
  • Using correlated subqueries unnecessarily on large tables
  • Forgetting table aliases in correlated subqueries
  • Performance issues with nested subqueries
  • Incorrect operator usage (e.g., = instead of IN)
  • NULL handling issues in comparison subqueries
  • Circular references in complex subquery structures

Practice Exercise

Practice these subquery scenarios:

  1. Find employees who earn more than the average salary of their department
  2. List departments that have no employees using NOT EXISTS
  3. Find products that have never been ordered using a subquery
  4. Calculate each employee's salary as a percentage of their department's total salary
  5. Find the second highest salary in the company using a subquery
  6. List customers who have placed orders in the last 30 days using EXISTS
  7. Find departments where the average salary is greater than the company average
  8. Use a derived table to find monthly sales trends and compare to previous months
Remember: Subqueries are powerful tools for solving complex data problems, but they should be used judiciously. Always consider performance implications and test your subqueries with realistic data volumes. When in doubt, try rewriting correlated subqueries as JOINs to see if performance improves.

Indexes in MySQL

Database Index: A data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and storage space. Indexes are used to quickly locate data without having to search every row in a database table.

Why Indexes Matter

Indexes are crucial for:

  • Query Performance: Dramatically speed up SELECT queries
  • Data Retrieval: Enable fast lookups and range scans
  • Sorting Efficiency: Improve ORDER BY and GROUP BY performance
  • Join Operations: Accelerate table joins on indexed columns
  • Constraint Enforcement: Support UNIQUE and PRIMARY KEY constraints

Index Types in MySQL

Index Type Description Use Case Syntax Example
PRIMARY KEY Unique identifier for each row, automatically indexed Every table should have one PRIMARY KEY (id)
UNIQUE Index Ensures all values are unique, allows NULL Email, username, product codes UNIQUE (email)
INDEX / KEY Standard non-unique index for faster searches Foreign keys, frequently searched columns INDEX (last_name)
FULLTEXT Specialized index for text searching Content search, document retrieval FULLTEXT (content)
SPATIAL For spatial data types (GIS data) Geographic data, mapping applications SPATIAL (location)
Composite Index Index on multiple columns Multi-column searches, covering indexes INDEX (last_name, first_name)

How Indexes Work

Understanding the underlying mechanism:

Index Analogy
-- Without index: Full table scan (like reading entire book)
SELECT * FROM employees WHERE last_name = 'Smith';
-- MySQL must check every row in the table

-- With index: Index lookup (like using book index)
CREATE INDEX idx_last_name ON employees(last_name);
SELECT * FROM employees WHERE last_name = 'Smith';
-- MySQL uses index to find matching rows quickly

Creating Indexes

Different ways to create indexes in MySQL:

Creating Indexes Examples
-- Create table with indexes
CREATE TABLE employees (
  employee_id INT PRIMARY KEY AUTO_INCREMENT,
  first_name VARCHAR(50),
  last_name VARCHAR(50),
  email VARCHAR(100) UNIQUE,
  department VARCHAR(50),
  salary DECIMAL(10, 2),
  hire_date DATE,
  INDEX idx_last_name (last_name),
  INDEX idx_department (department),
  INDEX idx_hire_date (hire_date)
);

-- Add index to existing table
CREATE INDEX idx_salary ON employees(salary);

-- Create composite index
CREATE INDEX idx_name_department ON employees(last_name, department);

-- Create unique index
CREATE UNIQUE INDEX idx_unique_email ON employees(email);

-- Create fulltext index
CREATE FULLTEXT INDEX idx_ft_content ON articles(content);

When to Create Indexes

Scenario Recommendation Example
Primary Key Columns Always (automatically created) PRIMARY KEY (id)
Foreign Key Columns Highly recommended INDEX (customer_id)
Frequently searched columns Recommended WHERE last_name = 'Smith'
Columns in WHERE clauses Consider if selective enough WHERE status = 'active'
JOIN conditions Highly recommended ON users.id = orders.user_id
ORDER BY / GROUP BY columns Recommended for large tables ORDER BY created_date DESC
Columns with high selectivity Good candidate Email, username, product_code
Columns with low selectivity Avoid (e.g., gender, status flags) WHERE gender = 'M'

Composite Indexes

Indexes on multiple columns and their optimal usage:

Composite Index Examples
-- Create composite index
CREATE INDEX idx_last_first ON employees(last_name, first_name);

-- These queries can use the composite index:
SELECT * FROM employees WHERE last_name = 'Smith';
SELECT * FROM employees WHERE last_name = 'Smith' AND first_name = 'John';
SELECT * FROM employees WHERE last_name LIKE 'S%';

-- These queries CANNOT use the composite index effectively:
SELECT * FROM employees WHERE first_name = 'John'; -- Index skip scan in MySQL 8.0+
SELECT * FROM employees WHERE last_name = 'Smith' OR first_name = 'John';

-- Composite index with included columns (MySQL 8.0+)
CREATE INDEX idx_customer_orders ON orders(customer_id, order_date) INCLUDE (total_amount);

Index Cardinality and Selectivity

Metric Definition Impact on Index Effectiveness Example
Cardinality Number of unique values in column Higher cardinality = better index Email (high), Gender (low)
Selectivity Ratio of unique values to total rows Higher selectivity = better index UserID: 100%, Status: 5%
Data Distribution How values are spread across rows Even distribution = better index Dates (even), Status (skewed)

Viewing Index Information

How to examine existing indexes and their usage:

Index Inspection Commands
-- Show all indexes on a table
SHOW INDEX FROM employees;

-- Show index information from information_schema
SELECT
  TABLE_NAME,
  INDEX_NAME,
  SEQ_IN_INDEX,
  COLUMN_NAME,
  CARDINALITY
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = 'your_database'
  AND TABLE_NAME = 'employees';

-- Check if index is used in a query
EXPLAIN SELECT * FROM employees WHERE last_name = 'Smith';

-- Show index size and statistics
SELECT
  table_name,
  index_name,
  stat_value * @@innodb_page_size AS index_size_bytes
FROM mysql.innodb_index_stats
WHERE database_name = 'your_database'
  AND table_name = 'employees';

EXPLAIN Command for Index Analysis

Using EXPLAIN to understand query execution and index usage:

EXPLAIN Examples
-- Basic EXPLAIN usage
EXPLAIN SELECT * FROM employees WHERE last_name = 'Smith';

-- EXPLAIN output analysis
EXPLAIN FORMAT=JSON SELECT * FROM employees WHERE department = 'IT';

-- Check index usage on JOIN queries
EXPLAIN
SELECT e.first_name, e.last_name, d.department_name
FROM employees e
  JOIN departments d ON e.department_id = d.department_id
WHERE e.salary > 50000;
+----+-------------+-----------+------------+------+---------------+-------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+---------------+-------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | employees | NULL | ref | idx_last_name | idx_last_name| 203 | const | 1 | 100.00 | NULL |
+----+-------------+-----------+------------+------+---------------+-------------+---------+-------+------+----------+-------+

Index Maintenance and Optimization

Keeping indexes efficient and up-to-date:

Index Maintenance
-- Rebuild index to optimize performance
ALTER TABLE employees ENGINE = InnoDB; -- Rebuilds all indexes

-- Update index statistics
ANALYZE TABLE employees;

-- Check for unused indexes
SELECT
  OBJECT_SCHEMA,
  OBJECT_NAME,
  INDEX_NAME
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE INDEX_NAME IS NOT NULL
  AND COUNT_STAR = 0
  AND OBJECT_SCHEMA = 'your_database';

-- Monitor index usage
SELECT
  TABLE_NAME,
  INDEX_NAME,
  ROWS_READ,
  ROWS_INSERTED,
  ROWS_UPDATED,
  ROWS_DELETED
FROM performance_schema.table_io_waits_summary_by_index_usage;

Dropping and Modifying Indexes

Managing existing indexes:

Index Modification
-- Drop an index
DROP INDEX idx_last_name ON employees;

-- Rename an index (MySQL 8.0+)
ALTER TABLE employees RENAME INDEX idx_old_name TO idx_new_name;

-- Disable an index (MyISAM only)
ALTER TABLE employees DISABLE KEYS;
-- Bulk insert operations here
ALTER TABLE employees ENABLE KEYS;

-- Invisible indexes (MySQL 8.0+)
ALTER TABLE employees ALTER INDEX idx_test INVISIBLE; -- Query optimizer ignores
ALTER TABLE employees ALTER INDEX idx_test VISIBLE; -- Query optimizer uses

Index Costs and Trade-offs

Benefit Cost Consideration
Faster SELECT queries Slower INSERT/UPDATE/DELETE Each index must be updated on write operations
Improved JOIN performance Increased storage space Indexes can be 10-30% of table size
Better sorting performance Maintenance overhead Regular ANALYZE TABLE needed for statistics
Efficient data retrieval Query optimizer complexity More indexes = more choices for optimizer

Index Best Practices

Indexing Guidelines:
  • Index columns used in WHERE, JOIN, ORDER BY, and GROUP BY clauses
  • Use composite indexes for multi-column searches
  • Place the most selective columns first in composite indexes
  • Avoid over-indexing - each index has maintenance costs
  • Regularly monitor and remove unused indexes
  • Use covering indexes when possible (include all needed columns)
  • Consider index selectivity before creating indexes
  • Use EXPLAIN to verify index usage in queries
  • Rebuild indexes periodically for fragmented tables

Common Indexing Mistakes

  • Over-indexing: Creating too many indexes on frequently updated tables
  • Under-indexing: Missing indexes on foreign keys and search columns
  • Wrong column order: Poor ordering in composite indexes
  • Indexing low-selectivity columns: Gender, status flags, etc.
  • Not monitoring index usage: Keeping unused indexes
  • Ignoring index fragmentation: Not maintaining indexes over time
  • Wrong index type: Using standard index where unique is needed
  • Not testing with real data: Creating indexes without performance testing

Practice Exercise

Practice these indexing scenarios:

  1. Create appropriate indexes for a customers table with name, email, and phone searches
  2. Design composite indexes for an orders table with customer_id, order_date, and status queries
  3. Use EXPLAIN to analyze query performance before and after adding indexes
  4. Identify and remove unused indexes from an existing database
  5. Create covering indexes for frequently used reporting queries
  6. Monitor index usage statistics to identify performance bottlenecks
  7. Design indexes for a products table with category, price range, and brand searches
  8. Implement fulltext indexes for product descriptions and search functionality
Remember: Indexes are a powerful tool for optimizing database performance, but they require careful planning and ongoing maintenance. The key is to find the right balance between read performance and write overhead. Always test index changes with realistic data volumes and query patterns.

Views in MySQL

Database View: A virtual table based on the result set of a SQL query. Views don't store data themselves but provide a way to present data from one or more tables in a specific format, acting as a saved query that can be treated like a table.

Why Views Matter

Views are essential for:

  • Data Security: Restrict access to specific columns or rows
  • Query Simplification: Hide complex queries behind simple interfaces
  • Data Abstraction: Provide consistent interface despite schema changes
  • Logical Data Organization: Present data in business-friendly formats
  • Code Reusability: Avoid repeating complex query logic

View Types in MySQL

View Type Description Use Case Limitations
Simple View Based on a single table, no aggregates or GROUP BY Data security, column restrictions Can usually be updated
Complex View Based on multiple tables, uses JOINs, aggregates Reporting, data summarization Often read-only
Updatable View Allows INSERT, UPDATE, DELETE operations Data modification through abstraction Specific conditions required
Materialized View Stores physical copy of data (MySQL doesn't natively support) Performance optimization Requires manual refresh
Inline View Subquery in FROM clause (derived table) Complex query organization Temporary, not stored

Sample Data for Examples

Let's use employee and department data for our view examples:

Sample Tables Setup
CREATE TABLE employees (
  employee_id INT PRIMARY KEY AUTO_INCREMENT,
  first_name VARCHAR(50) NOT NULL,
  last_name VARCHAR(50) NOT NULL,
  email VARCHAR(100) UNIQUE,
  phone VARCHAR(20),
  salary DECIMAL(10, 2),
  department_id INT,
  hire_date DATE,
  is_active BOOLEAN DEFAULT TRUE
);

CREATE TABLE departments (
  department_id INT PRIMARY KEY,
  department_name VARCHAR(50) NOT NULL,
  manager_id INT,
  budget DECIMAL(12, 2)
);

CREATE TABLE projects (
  project_id INT PRIMARY KEY,
  project_name VARCHAR(100),
  department_id INT,
  start_date DATE,
  end_date DATE,
  budget DECIMAL(12, 2)
);

Creating Basic Views

Creating simple and complex views:

Basic View Creation
-- Simple view: Active employees only
CREATE VIEW active_employees AS
SELECT employee_id, first_name, last_name, email, department_id
FROM employees
WHERE is_active = TRUE;

-- Complex view: Employee details with department info
CREATE VIEW employee_directory AS
SELECT
  e.employee_id,
  e.first_name,
  e.last_name,
  e.email,
  e.phone,
  d.department_name,
  e.hire_date
FROM employees e
  JOIN departments d ON e.department_id = d.department_id
WHERE e.is_active = TRUE;

-- View with calculated columns
CREATE VIEW employee_salary_summary AS
SELECT
  department_id,
  COUNT(*) AS employee_count,
  AVG(salary) AS average_salary,
  MAX(salary) AS max_salary,
  MIN(salary) AS min_salary
FROM employees
WHERE is_active = TRUE
GROUP BY department_id;

Using Views

Querying views just like regular tables:

View Usage Examples
-- Query the view like a table
SELECT * FROM employee_directory;

-- Filter and sort view data
SELECT first_name, last_name, department_name
FROM employee_directory
WHERE department_name = 'Engineering'
ORDER BY last_name, first_name;

-- Join views with other tables or views
SELECT
  ed.first_name,
  ed.last_name,
  ed.department_name,
  p.project_name
FROM employee_directory ed
  JOIN projects p ON ed.department_id = p.department_id;

-- Use aggregate functions on views
SELECT
  department_name,
  COUNT(*) AS employee_count
FROM employee_directory
GROUP BY department_name
ORDER BY employee_count DESC;
+------------+-----------+-----------------+---------------------+
| first_name | last_name | department_name | email |
+------------+-----------+-----------------+---------------------+
| John | Doe | Engineering | john.doe@company.com|
| Jane | Smith | Marketing | jane.smith@company.com |
| Bob | Johnson | Engineering | bob.johnson@company.com |
+------------+-----------+-----------------+---------------------+
3 rows in set (0.00 sec)

View Options and Features

Option Description Syntax Use Case
WITH CHECK OPTION Ensures data modifications through view satisfy view definition CREATE VIEW ... WITH CHECK OPTION Data integrity in updatable views
ALGORITHM Specifies how MySQL processes the view ALGORITHM = MERGE | TEMPTABLE Performance optimization
DEFINER Specifies the user who defined the view DEFINER = user@host Security and access control
SQL SECURITY Specifies security context for view execution SQL SECURITY DEFINER | INVOKER Access control management
Advanced View Options
-- View with CHECK OPTION
CREATE VIEW high_salary_employees AS
SELECT employee_id, first_name, last_name, salary
FROM employees
WHERE salary > 80000
WITH CHECK OPTION;

-- View with specific algorithm
CREATE ALGORITHM = MERGE VIEW department_summary AS
SELECT d.department_name, COUNT(e.employee_id) AS employee_count
FROM departments d
  LEFT JOIN employees e ON d.department_id = e.department_id
GROUP BY d.department_name;

-- View with security options
CREATE SQL SECURITY INVOKER VIEW employee_contacts AS
SELECT first_name, last_name, email, phone
FROM employees
WHERE is_active = TRUE;

Updatable Views

Views that allow data modification operations:

Updatable View Examples
-- Create an updatable view
CREATE VIEW employee_basic_info AS
SELECT employee_id, first_name, last_name, email, department_id
FROM employees
WHERE is_active = TRUE;

-- Update through view
UPDATE employee_basic_info
SET email = 'new.email@company.com'
WHERE employee_id = 1;

-- Insert through view
INSERT INTO employee_basic_info
(first_name, last_name, email, department_id)
VALUES ('Sarah', 'Wilson', 'sarah.wilson@company.com', 2);

-- Delete through view
DELETE FROM employee_basic_info
WHERE employee_id = 5;

View Modification and Management

Altering, replacing, and dropping views:

View Management
-- Replace an existing view
CREATE OR REPLACE VIEW employee_directory AS
SELECT
  e.employee_id,
  e.first_name,
  e.last_name,
  e.email,
  d.department_name,
  e.hire_date,
  e.salary -- Added salary column
FROM employees e
  JOIN departments d ON e.department_id = d.department_id
WHERE e.is_active = TRUE;

-- Alter view (MySQL doesn't support direct ALTER VIEW for definition)
-- Use CREATE OR REPLACE instead

-- Drop a view
DROP VIEW IF EXISTS employee_basic_info;

-- Show all views in database
SHOW FULL TABLES WHERE TABLE_TYPE = 'VIEW';

-- Show view definition
SHOW CREATE VIEW employee_directory;

View Security and Access Control

Security Aspect Description Implementation Benefit
Column-level Security Hide sensitive columns from users Exclude columns from view SELECT Protect salary, SSN, personal data
Row-level Security Restrict rows based on conditions Use WHERE clause in view definition Department-specific data access
SQL SECURITY DEFINER View runs with definer's privileges CREATE SQL SECURITY DEFINER VIEW Grant access through view without table permissions
SQL SECURITY INVOKER View runs with invoker's privileges CREATE SQL SECURITY INVOKER VIEW Stricter security, user must have underlying permissions
Security-focused Views
-- Department-specific view for managers
CREATE VIEW engineering_employees AS
SELECT employee_id, first_name, last_name, email, phone
FROM employees
WHERE department_id = 1 -- Engineering department
  AND is_active = TRUE;

-- Public employee directory (hide sensitive info)
CREATE VIEW public_employee_directory AS
SELECT first_name, last_name, email, department_id
FROM employees
WHERE is_active = TRUE;

-- View with definer security
CREATE SQL SECURITY DEFINER VIEW sensitive_employee_data AS
SELECT employee_id, first_name, last_name, salary, hire_date
FROM employees
WHERE is_active = TRUE;

Performance Considerations

Factor Impact Recommendation
View Complexity Complex views with multiple JOINs and aggregates can be slow Use simple views when possible, consider materialized patterns
Algorithm Choice MERGE vs TEMPTABLE affects performance Let MySQL choose automatically in most cases
Underlying Indexes Views benefit from indexes on base tables Ensure proper indexing on tables used in views
Nested Views Views based on other views can compound performance issues Avoid deep nesting of views
Data Volume Large datasets in views can be slow Use WHERE clauses to limit data in views

View Limitations and Restrictions

Understanding what views cannot do:

View Limitations
-- Cannot use subqueries in FROM clause in updatable views
-- This view is NOT updatable:
CREATE VIEW department_stats AS
SELECT d.department_name, stats.avg_salary
FROM departments d
  JOIN (
    SELECT department_id, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department_id
  ) stats ON d.department_id = stats.department_id;

-- Cannot contain TEMPORARY tables
-- Cannot be created from temporary tables

-- Some aggregate views are not updatable
CREATE VIEW employee_count_by_dept AS
SELECT department_id, COUNT(*) AS employee_count
FROM employees
GROUP BY department_id;
-- This view is read-only

Best Practices for Using Views

View Guidelines:
  • Use views for security to restrict column and row access
  • Simplify complex queries by encapsulating them in views
  • Use meaningful, descriptive names for views
  • Document view purposes and usage in comments
  • Avoid nesting views too deeply (max 2-3 levels)
  • Consider performance implications of complex views
  • Use WITH CHECK OPTION for updatable views to maintain integrity
  • Test views thoroughly with different user permissions
  • Monitor view performance and optimize underlying queries

Common View Mistakes

  • Overusing views: Creating views for simple queries that don't need abstraction
  • Performance issues: Not considering the performance impact of complex views
  • Security misconfiguration: Incorrect use of SQL SECURITY options
  • Nested view complexity: Creating views that depend on other views excessively
  • Assuming updatability: Not checking if a view is actually updatable
  • Missing documentation: Not documenting view purposes and limitations
  • Ignoring base table changes: Not updating views when underlying tables change
  • Wrong algorithm choice: Manually specifying algorithm when not needed

Practice Exercise

Practice these view scenarios:

  1. Create a view that shows employee names and departments without revealing salaries
  2. Design a view for managers that only shows employees in their department
  3. Create a reporting view that summarizes project budgets by department
  4. Implement an updatable view for HR to modify basic employee information
  5. Create a view with WITH CHECK OPTION to ensure data integrity
  6. Design a view that combines employee, department, and project information
  7. Create a view that shows monthly hiring trends using date functions
  8. Implement security-focused views with SQL SECURITY DEFINER option
Remember: Views are powerful tools for data abstraction, security, and query simplification. However, they should be used judiciously with consideration for performance and maintenance. Well-designed views can make your database more secure and easier to use, while poorly designed views can create performance bottlenecks and maintenance challenges.

Stored Procedures in MySQL

Stored Procedure: A prepared SQL code that you can save and reuse. Stored procedures are stored in the database server and can be called by applications, providing a way to encapsulate complex business logic and database operations.

Why Stored Procedures Matter

Stored procedures are essential for:

  • Code Reusability: Write once, use multiple times
  • Performance: Reduced network traffic and pre-compiled execution
  • Security: Control data access through procedure permissions
  • Maintainability: Centralize business logic in database
  • Transaction Management: Complex operations in single transaction

Stored Procedure Advantages

Advantage Description Benefit
Performance Pre-compiled and stored on server Faster execution than dynamic SQL
Security Grant execute permission without table access Better access control and data protection
Maintenance Business logic centralized in database Easier updates and bug fixes
Network Traffic Single call instead of multiple queries Reduced network overhead
Transaction Control Complex operations in single transaction Data integrity and consistency

Basic Stored Procedure Syntax

Fundamental structure of stored procedures:

Basic Stored Procedure Structure
DELIMITER $$

CREATE PROCEDURE procedure_name(IN param1 datatype, OUT param2 datatype)
BEGIN
  -- SQL statements here
  SELECT * FROM table_name;
END $$

DELIMITER ;

Sample Data for Examples

Let's use a banking system example for our procedures:

Sample Banking Tables
CREATE TABLE customers (
  customer_id INT PRIMARY KEY AUTO_INCREMENT,
  first_name VARCHAR(50) NOT NULL,
  last_name VARCHAR(50) NOT NULL,
  email VARCHAR(100) UNIQUE,
  phone VARCHAR(20),
  created_date DATE DEFAULT (CURRENT_DATE)
);

CREATE TABLE accounts (
  account_id INT PRIMARY KEY AUTO_INCREMENT,
  customer_id INT,
  account_type VARCHAR(20),
  balance DECIMAL(15, 2) DEFAULT 0.00,
  created_date DATE DEFAULT (CURRENT_DATE),
  FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);

CREATE TABLE transactions (
  transaction_id INT PRIMARY KEY AUTO_INCREMENT,
  account_id INT,
  transaction_type ENUM('DEPOSIT', 'WITHDRAWAL', 'TRANSFER'),
  amount DECIMAL(15, 2),
  transaction_date DATETIME DEFAULT CURRENT_TIMESTAMP,
  description VARCHAR(255),
  FOREIGN KEY (account_id) REFERENCES accounts(account_id)
);

Creating Simple Stored Procedures

Basic procedures without parameters:

Simple Stored Procedures
DELIMITER $$

-- Procedure to get all customers
CREATE PROCEDURE GetAllCustomers()
BEGIN
  SELECT customer_id, first_name, last_name, email, phone
  FROM customers
  ORDER BY last_name, first_name;
END $$

-- Procedure to get customer count
CREATE PROCEDURE GetCustomerCount()
BEGIN
  SELECT COUNT(*) AS total_customers FROM customers;
END $$

-- Procedure with complex logic
CREATE PROCEDURE GetAccountSummary()
BEGIN
  SELECT
    c.first_name,
    c.last_name,
    a.account_type,
    a.balance,
    COUNT(t.transaction_id) AS transaction_count
  FROM customers c
    JOIN accounts a ON c.customer_id = a.customer_id
    LEFT JOIN transactions t ON a.account_id = t.account_id
  GROUP BY c.customer_id, a.account_id
  ORDER BY a.balance DESC;
END $$

DELIMITER ;

Parameter Types in Stored Procedures

Parameter Type Description Syntax Use Case
IN Input parameter (default if not specified) IN param_name datatype Pass values to procedure
OUT Output parameter returns value to caller OUT param_name datatype Return calculated results
INOUT Both input and output parameter INOUT param_name datatype Modify and return values

Procedures with Parameters

Using different parameter types:

Parameterized Procedures
DELIMITER $$

-- IN parameter: Get customer by ID
CREATE PROCEDURE GetCustomerById(IN customerId INT)
BEGIN
  SELECT customer_id, first_name, last_name, email, phone
  FROM customers
  WHERE customer_id = customerId;
END $$

-- OUT parameter: Get total balance
CREATE PROCEDURE GetTotalBalance(OUT total DECIMAL(15, 2))
BEGIN
  SELECT SUM(balance) INTO total FROM accounts;
END $$

-- INOUT parameter: Increment counter
CREATE PROCEDURE IncrementCounter(INOUT counter INT)
BEGIN
  SET counter = counter + 1;
END $$

-- Multiple parameters
CREATE PROCEDURE GetCustomersByDateRange(
  IN startDate DATE,
  IN endDate DATE
)
BEGIN
  SELECT customer_id, first_name, last_name, created_date
  FROM customers
  WHERE created_date BETWEEN startDate AND endDate
  ORDER BY created_date DESC;
END $$

DELIMITER ;

Calling Stored Procedures

How to execute procedures with different parameter types:

Calling Procedures
-- Call procedure without parameters
CALL GetAllCustomers();

-- Call procedure with IN parameters
CALL GetCustomerById(1);
CALL GetCustomersByDateRange('2023-01-01', '2023-12-31');

-- Call procedure with OUT parameters
SET @total = 0;
CALL GetTotalBalance(@total);
SELECT @total AS total_balance;

-- Call procedure with INOUT parameters
SET @counter = 5;
CALL IncrementCounter(@counter);
SELECT @counter AS updated_counter;
+------------+-----------+-----------+---------------------+
| customer_id| first_name| last_name | email |
+------------+-----------+-----------+---------------------+
| 1 | John | Doe | john.doe@email.com |
+------------+-----------+-----------+---------------------+
1 row in set (0.00 sec)

Variables and Control Structures

Using variables, conditionals, and loops in procedures:

Variables and Control Flow
DELIMITER $$

CREATE PROCEDURE CalculateAccountStats(IN accountId INT)
BEGIN
  -- Variable declarations
  DECLARE accountBalance DECIMAL(15, 2);
  DECLARE transactionCount INT DEFAULT 0;
  DECLARE avgTransaction DECIMAL(15, 2);
  DECLARE accountStatus VARCHAR(20);

  -- Get account balance
  SELECT balance INTO accountBalance
  FROM accounts WHERE account_id = accountId;

  -- Get transaction count
  SELECT COUNT(*) INTO transactionCount
  FROM transactions WHERE account_id = accountId;

  -- Calculate average transaction
  IF transactionCount > 0 THEN
    SELECT AVG(amount) INTO avgTransaction
    FROM transactions WHERE account_id = accountId;
  ELSE
    SET avgTransaction = 0;
  END IF;

  -- Determine account status
  CASE
    WHEN accountBalance > 10000 THEN SET accountStatus = 'PREMIUM';
    WHEN accountBalance > 1000 THEN SET accountStatus = 'STANDARD';
    ELSE SET accountStatus = 'BASIC';
  END CASE;

  -- Return results
  SELECT
    accountBalance AS balance,
    transactionCount AS transactions,
    avgTransaction AS avg_transaction_amount,
    accountStatus AS status;
END $$

DELIMITER ;

Error Handling and Transactions

Implementing robust error handling and transaction management:

Error Handling and Transactions
DELIMITER $$

CREATE PROCEDURE TransferFunds(
  IN fromAccount INT,
  IN toAccount INT,
  IN transferAmount DECIMAL(15, 2),
  OUT success BOOLEAN,
  OUT message VARCHAR(255)
)
BEGIN
  DECLARE fromBalance DECIMAL(15, 2);
  DECLARE exit handler FOR sqlexception
  BEGIN
    ROLLBACK;
    SET success = FALSE;
    SET message = 'Transaction failed due to system error';
  END;

  -- Initialize variables
  SET success = FALSE;
  SET message = '';

  -- Start transaction
  START TRANSACTION;

  -- Check if from account exists and has sufficient funds
  SELECT balance INTO fromBalance
  FROM accounts WHERE account_id = fromAccount FOR UPDATE;

  IF fromBalance IS NULL THEN
    SET message = 'Source account not found';
    ROLLBACK;
    LEAVE proc_exit;
  END IF;

  IF fromBalance < transferAmount THEN
    SET message = 'Insufficient funds';
    ROLLBACK;
    LEAVE proc_exit;
  END IF;

  -- Perform the transfer
  UPDATE accounts SET balance = balance - transferAmount
  WHERE account_id = fromAccount;

  UPDATE accounts SET balance = balance + transferAmount
  WHERE account_id = toAccount;

  -- Record transactions
  INSERT INTO transactions (account_id, transaction_type, amount, description)
  VALUES (fromAccount, 'WITHDRAWAL', transferAmount, 'Transfer to account ' + toAccount);

  INSERT INTO transactions (account_id, transaction_type, amount, description)
  VALUES (toAccount, 'DEPOSIT', transferAmount, 'Transfer from account ' + fromAccount);

  -- Commit transaction
  COMMIT;
  SET success = TRUE;
  SET message = 'Transfer completed successfully';
END $$

DELIMITER ;

Managing Stored Procedures

Viewing, modifying, and dropping procedures:

Procedure Management
-- Show all stored procedures
SHOW PROCEDURE STATUS WHERE Db = 'your_database';

-- Show procedure definition
SHOW CREATE PROCEDURE GetCustomerById;

-- Drop a procedure
DROP PROCEDURE IF EXISTS GetCustomerById;

-- Alter procedure (MySQL doesn't support direct ALTER PROCEDURE)
-- Use DROP and CREATE instead
DROP PROCEDURE IF EXISTS GetCustomerById;
CREATE PROCEDURE GetCustomerById(...) -- New definition

Stored Procedure Best Practices

Procedure Guidelines:
  • Use meaningful, descriptive names for procedures
  • Include comprehensive error handling
  • Use transactions for multi-step operations
  • Document procedures with comments
  • Validate input parameters
  • Avoid using SELECT * - specify columns explicitly
  • Consider performance implications of procedure logic
  • Test procedures with various input scenarios
  • Use consistent naming conventions

Common Stored Procedure Mistakes

  • No error handling: Procedures fail without proper error messages
  • Missing transactions: Data inconsistency in multi-step operations
  • Poor naming conventions: Unclear procedure purposes
  • No input validation: Procedures fail with invalid parameters
  • Overly complex procedures: Difficult to maintain and debug
  • No documentation: Hard for other developers to understand
  • Performance issues: Not optimizing queries within procedures
  • Security vulnerabilities: SQL injection risks with dynamic SQL

Practice Exercise

Practice these stored procedure scenarios:

  1. Create a procedure to open a new bank account for an existing customer
  2. Write a procedure that calculates interest for all accounts
  3. Create a procedure to generate monthly account statements
  4. Implement a procedure with transaction handling for account closures
  5. Write a procedure that finds customers with low account activity
  6. Create a procedure with error handling for fund withdrawals
  7. Implement a procedure that generates financial reports with parameters
  8. Write a procedure that updates customer information with validation
Remember: Stored procedures are powerful tools for encapsulating business logic, improving performance, and enhancing security. However, they should be designed carefully with proper error handling, transactions, and documentation. Well-designed procedures can significantly improve application performance and maintainability.

MySQL Triggers

Triggers: Database objects that automatically execute in response to specific events (INSERT, UPDATE, DELETE) on a particular table. Triggers are stored programs associated with a table that activate when the table is modified.

What are Triggers?

Triggers are powerful database features that allow you to automatically execute SQL code when certain database events occur. They help maintain data integrity, enforce business rules, and automate repetitive tasks without requiring application-level code changes.

Trigger Timing and Events

Triggers can be defined with different timing and event combinations:

Timing Event Description
BEFORE INSERT Executes before new data is inserted
AFTER INSERT Executes after new data is inserted
BEFORE UPDATE Executes before data is updated
AFTER UPDATE Executes after data is updated
BEFORE DELETE Executes before data is deleted
AFTER DELETE Executes after data is deleted

Basic Trigger Syntax

SQL Statement
CREATE TRIGGER trigger_name
    trigger_time trigger_event
    ON table_name FOR EACH ROW
    BEGIN
        -- trigger body
        -- SQL statements
    END;

Example 1: BEFORE INSERT Trigger

This trigger automatically sets the creation timestamp before inserting a new record:

SQL Statement
CREATE TRIGGER before_employee_insert
    BEFORE INSERT ON employees FOR EACH ROW
    BEGIN
        SET NEW.created_at = NOW();
        SET NEW.updated_at = NOW();
    END;

Example 2: AFTER INSERT Trigger

This trigger maintains an audit log after new employees are added:

SQL Statement
-- First create an audit table
CREATE TABLE employee_audit (
  id INT PRIMARY KEY AUTO_INCREMENT,
  employee_id INT,
  action VARCHAR(50),
  action_date DATETIME,
  details TEXT
);

-- Create the trigger
CREATE TRIGGER after_employee_insert
    AFTER INSERT ON employees FOR EACH ROW
    BEGIN
        INSERT INTO employee_audit (employee_id, action, action_date, details)
        VALUES (NEW.id, 'INSERT', NOW(), CONCAT('New employee: ', NEW.first_name, ' ', NEW.last_name));
    END;

Example 3: BEFORE UPDATE Trigger

This trigger automatically updates the modification timestamp and validates salary changes:

SQL Statement
CREATE TRIGGER before_employee_update
    BEFORE UPDATE ON employees FOR EACH ROW
    BEGIN
        -- Update timestamp
        SET NEW.updated_at = NOW();
        
        -- Validate salary increase doesn't exceed 50%
        IF NEW.salary > OLD.salary * 1.5 THEN
            SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'Salary increase cannot exceed 50%';
        END IF;
    END;

Using OLD and NEW Keywords

In triggers, you can reference the old and new values of rows using OLD and NEW prefixes:

Trigger Type OLD Reference NEW Reference
INSERT NULL (no old values) New row being inserted
UPDATE Original row before update New row after update
DELETE Row being deleted NULL (no new values)

Managing Triggers

View, modify, and delete triggers using these commands:

SQL Statements
-- View all triggers in the database
SHOW TRIGGERS;

-- View triggers for a specific table
SHOW TRIGGERS LIKE 'employees';

-- Drop a trigger
DROP TRIGGER IF EXISTS before_employee_insert;

-- Create or replace a trigger
CREATE OR REPLACE TRIGGER trigger_name
    -- trigger definition

Real-World Use Cases

  • Audit Trail: Track all changes to sensitive data
  • Data Validation: Enforce complex business rules
  • Automatic Timestamps: Set created_at and updated_at fields
  • Derived Columns: Calculate and store computed values
  • Referential Integrity: Maintain consistency across related tables
  • Notification Systems: Send alerts when specific conditions are met
Note: Triggers execute within the same transaction as the statement that invoked them. If a trigger fails, the entire transaction will be rolled back.
Warning: Be cautious with recursive triggers and avoid creating infinite loops. Also, triggers can impact performance, so use them judiciously and test thoroughly.
Best Practices:
  • Keep trigger logic simple and efficient
  • Document triggers thoroughly
  • Avoid complex business logic in triggers
  • Test triggers with various data scenarios
  • Monitor trigger performance in production

MySQL Transactions

Transactions: A sequence of database operations that are treated as a single logical unit of work. Transactions follow the ACID properties (Atomicity, Consistency, Isolation, Durability) to ensure database reliability and data integrity.

What are Transactions?

Transactions allow you to execute multiple SQL statements as a single unit. Either all operations within the transaction complete successfully, or if any operation fails, all changes are rolled back to maintain database consistency. This is crucial for maintaining data integrity in multi-step operations.

ACID Properties

Transactions in MySQL adhere to the ACID principles:

Property Description MySQL Implementation
Atomicity All operations in a transaction succeed or all fail COMMIT and ROLLBACK statements
Consistency Database remains in a consistent state before and after transaction Constraints, triggers, and business rules
Isolation Concurrent transactions don't interfere with each other Transaction isolation levels
Durability Committed transactions persist even after system failure Write-ahead logging and crash recovery

Transaction Control Statements

SQL Statements
-- Start a transaction explicitly
START TRANSACTION;
-- Or use BEGIN
BEGIN;

-- Execute your SQL operations
INSERT INTO accounts (account_id, balance) VALUES (1, 1000.00);
UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;

-- Make changes permanent
COMMIT;

-- Or undo all changes in the transaction
ROLLBACK;

Example 1: Bank Transfer Transaction

This example demonstrates a secure bank transfer between two accounts:

SQL Statement
START TRANSACTION;

-- Deduct amount from sender
UPDATE accounts
  SET balance = balance - 500.00
  WHERE account_id = 123
  AND balance >= 500.00;

-- Check if deduction was successful
IF ROW_COUNT() = 0 THEN
    ROLLBACK;
    SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'Insufficient funds';
  END IF;

-- Add amount to receiver
UPDATE accounts
  SET balance = balance + 500.00
  WHERE account_id = 456;

-- Record the transaction
INSERT INTO transactions (from_account, to_account, amount, transaction_date)
  VALUES (123, 456, 500.00, NOW());

-- Commit if all operations succeed
COMMIT;

Transaction Isolation Levels

MySQL supports different isolation levels to control transaction visibility:

Isolation Level Dirty Read Non-Repeatable Read Phantom Read Performance
READ UNCOMMITTED Possible Possible Possible Highest
READ COMMITTED Not Possible Possible Possible High
REPEATABLE READ Not Possible Not Possible Possible Medium
SERIALIZABLE Not Possible Not Possible Not Possible Lowest

Setting Isolation Levels

SQL Statements
-- View current isolation level
SELECT @@transaction_isolation;

-- Set isolation level for current session
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- Set isolation level for next transaction only
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

-- Set global isolation level
SET GLOBAL TRANSACTION ISOLATION LEVEL REPEATABLE READ;

Savepoints

Savepoints allow you to roll back to specific points within a transaction:

SQL Statement
START TRANSACTION;

-- First operation
INSERT INTO orders (customer_id, total) VALUES (1, 100.00);
SET @order_id = LAST_INSERT_ID();

-- Create a savepoint
SAVEPOINT after_order_created;

-- Second operation that might fail
INSERT INTO order_items (order_id, product_id, quantity)
  VALUES (@order_id, 101, 2);

-- If item insertion fails, rollback to savepoint
IF ROW_COUNT() = 0 THEN
    ROLLBACK TO after_order_created;
    -- Continue with alternative logic
  END IF;

COMMIT;

-- Release a savepoint
RELEASE SAVEPOINT after_order_created;

Auto-Commit Mode

MySQL has an auto-commit mode that automatically commits each statement:

SQL Statements
-- Check auto-commit status
SELECT @@autocommit;

-- Disable auto-commit (0 = OFF, 1 = ON)
SET autocommit = 0;

-- With auto-commit OFF, you must explicitly commit
INSERT INTO table1 VALUES (1);
INSERT INTO table2 VALUES (2);
COMMIT;

-- Re-enable auto-commit
SET autocommit = 1;

Common Transaction Patterns

  • Financial Transactions: Ensure money transfers are atomic
  • Inventory Management: Prevent overselling with concurrent updates
  • Order Processing: Create orders and update inventory atomically
  • Data Migration: Move data between tables reliably
  • Batch Operations: Process multiple records as a single unit
Note: InnoDB storage engine supports transactions in MySQL. MyISAM and other storage engines do not support transactions. Always use InnoDB for transactional operations.
Warning: Long-running transactions can lock resources and impact database performance. Keep transactions as short as possible and avoid user interaction within transactions.
Best Practices:
  • Keep transactions short and focused
  • Handle exceptions and implement proper rollback logic
  • Choose appropriate isolation levels for your use case
  • Test transactions with concurrent access scenarios
  • Monitor transaction logs and performance metrics
  • Use savepoints for complex multi-step transactions

MySQL Backup & Recovery

Backup & Recovery: The process of creating copies of database data and structures to protect against data loss, and the procedures to restore data in case of failures, disasters, or accidental deletions.

Why Backup is Essential

Regular backups are crucial for database management to protect against various risks including hardware failures, human errors, software bugs, security breaches, and natural disasters. A proper backup strategy ensures business continuity and data integrity.

Types of MySQL Backups

Backup Type Description Advantages Disadvantages
Logical Backup SQL statements that recreate database objects and data Portable, version-independent, readable Slower for large databases, requires processing power
Physical Backup Copy of actual database files and directories Faster, more efficient for large databases Not portable across versions, binary format
Hot Backup Backup while database is running and accessible No downtime required More complex to implement
Cold Backup Backup while database is shut down Simple, consistent backup Requires downtime
Incremental Backup Only changed data since last backup Faster, less storage required More complex recovery process
Full Backup Complete copy of entire database Simple recovery process Time-consuming, requires more storage

Logical Backup with mysqldump

mysqldump is the most common tool for creating logical backups in MySQL:

Command Line Examples
# Backup a single database
mysqldump -u username -p database_name > backup_file.sql

# Backup all databases
mysqldump -u username -p --all-databases > full_backup.sql

# Backup specific tables
mysqldump -u username -p database_name table1 table2 > tables_backup.sql

# Backup with compression
mysqldump -u username -p database_name | gzip > backup_file.sql.gz

# Backup with consistent data (recommended for InnoDB)
mysqldump -u username -p --single-transaction database_name > backup.sql

# Backup structure only (no data)
mysqldump -u username -p --no-data database_name > structure.sql

# Backup data only (no structure)
mysqldump -u username -p --no-create-info database_name > data.sql

Restoring from mysqldump Backups

Command Line Examples
# Restore a database
mysql -u username -p database_name < backup_file.sql

# Restore all databases
mysql -u username -p < full_backup.sql

# Restore compressed backup
gunzip < backup_file.sql.gz | mysql -u username -p database_name

# Restore to different database name
mysql -u username -p new_database_name < backup_file.sql

Physical Backup Methods

Physical backups copy actual database files for faster backup and recovery:

Command Line Examples
# Cold backup - stop MySQL, copy files, restart
systemctl stop mysql
cp -r /var/lib/mysql /backup/mysql_backup
systemctl start mysql

# Using MySQL Enterprise Backup (commercial)
mysqlbackup --backup-dir=/backup/mysql backup

# Using Percona XtraBackup (open-source alternative)
xtrabackup --backup --target-dir=/backup/mysql
xtrabackup --prepare --target-dir=/backup/mysql

Binary Log Backup and Point-in-Time Recovery

Binary logs record all data changes, enabling point-in-time recovery:

SQL and Command Line Examples
-- Enable binary logging in my.cnf
[mysqld]
log-bin=mysql-bin
server-id=1

# View binary logs
mysql -u username -p -e "SHOW BINARY LOGS;"

# Backup binary logs
cp /var/lib/mysql/mysql-bin.* /backup/binary_logs/

# Point-in-time recovery process
# 1. Restore last full backup
mysql -u root -p < full_backup.sql
# 2. Apply binary logs up to specific time
mysqlbinlog --stop-datetime="2024-01-15 14:30:00" \
  mysql-bin.000001 mysql-bin.000002 | mysql -u root -p

Automated Backup Strategies

Create automated backup scripts for consistent protection:

Shell Script Example
#!/bin/bash
# Daily backup script

BACKUP_DIR="/backup/mysql"
DATE=$(date +%Y%m%d_%H%M%S)
DB_USER="backup_user"
DB_PASSWORD="secure_password"

# Create backup directory
mkdir -p $BACKUP_DIR/$DATE

# Backup each database separately
DATABASES=$(mysql -u $DB_USER -p$DB_PASSWORD -e "SHOW DATABASES;" | grep -Ev "(Database|information_schema|performance_schema)")

for DB in $DATABASES; do
  echo "Backing up $DB..."
  mysqldump -u $DB_USER -p$DB_PASSWORD --single-transaction --routines --triggers $DB | gzip > $BACKUP_DIR/$DATE/$DB.sql.gz
done

# Backup binary logs
mysql -u $DB_USER -p$DB_PASSWORD -e "FLUSH BINARY LOGS;"
cp /var/lib/mysql/mysql-bin.* $BACKUP_DIR/$DATE/

# Clean up old backups (keep 30 days)
find $BACKUP_DIR -type d -mtime +30 -exec rm -rf {} \;

echo "Backup completed: $BACKUP_DIR/$DATE"

Recovery Scenarios

Scenario Recovery Method Steps
Accidental DELETE Point-in-Time Recovery Restore backup, apply binary logs up to before DELETE
Table Corruption Table-level Recovery Restore specific table from backup
Full Server Failure Complete Restore Restore all databases and binary logs
Single Database Loss Database-level Recovery Restore specific database only
Data Center Disaster Disaster Recovery Restore to alternate location from offsite backups

MySQL Enterprise Backup

For enterprise environments, consider MySQL Enterprise Backup:

  • Online hot backups with minimal performance impact
  • Incremental and compressed backups
  • Point-in-time recovery capabilities
  • Backup encryption and verification
  • Integration with MySQL Enterprise Monitor

Best Practices for Backup Strategy

3-2-1 Backup Rule:
  • 3 copies of your data
  • 2 different storage media
  • 1 copy stored offsite
Backup Testing: Regularly test your backup recovery process to ensure it works when needed. An untested backup is as good as no backup.
Security: Protect your backup files with proper permissions and encryption, especially when storing offsite or in the cloud. Backup files contain sensitive data.

Monitoring and Maintenance

SQL Statements for Monitoring
-- Check binary log status
SHOW MASTER STATUS;

-- Monitor backup completion
SELECT TABLE_SCHEMA, TABLE_NAME, UPDATE_TIME
FROM information_schema.tables
ORDER BY UPDATE_TIME DESC;

-- Check database size for backup planning
SELECT TABLE_SCHEMA AS 'Database',
     ROUND(SUM(data_length + index_length) / 1024 / 1024, 2) AS 'Size (MB)'
FROM information_schema.tables
GROUP BY TABLE_SCHEMA;

MySQL Security & Users

MySQL Security: The comprehensive set of features and practices that protect database systems from unauthorized access, data breaches, and malicious activities through user management, privileges, authentication, and encryption.

MySQL Security Architecture

MySQL implements a multi-layered security model that includes user authentication, privilege verification, network security, and data encryption. Understanding this architecture is essential for implementing robust database security.

User Management

Creating and managing database users is the foundation of MySQL security:

SQL Statements - User Management
-- Create a new user
CREATE USER 'username'@'hostname' IDENTIFIED BY 'strong_password';

-- Create user with specific authentication plugin
CREATE USER 'admin'@'localhost' IDENTIFIED WITH mysql_native_password BY 'password';

-- View all users
SELECT user, host, authentication_string FROM mysql.user;

-- Rename a user
RENAME USER 'old_user'@'localhost' TO 'new_user'@'localhost';

-- Drop a user
DROP USER 'username'@'hostname';

-- Set password for user
ALTER USER 'username'@'localhost' IDENTIFIED BY 'new_password';

-- Lock and unlock user accounts
ALTER USER 'username'@'localhost' ACCOUNT LOCK;
ALTER USER 'username'@'localhost' ACCOUNT UNLOCK;

Privilege System

MySQL uses a granular privilege system to control user access:

Privilege Level Description Common Use Cases
Global Privileges Apply to all databases on server Administrative users, DBAs
Database Privileges Apply to all objects in a database Application users, developers
Table Privileges Apply to all columns in a table Restricted application access
Column Privileges Apply to specific columns Compliance, sensitive data
Routine Privileges Apply to stored procedures/functions API-like access patterns

Granting and Revoking Privileges

SQL Statements - Privileges
-- Grant specific privileges on a database
GRANT SELECT, INSERT, UPDATE, DELETE ON company_db.* TO 'app_user'@'localhost';

-- Grant all privileges on a database
GRANT ALL PRIVILEGES ON company_db.* TO 'dba'@'localhost';

-- Grant privileges on specific table
GRANT SELECT ON company_db.employees TO 'report_user'@'%';

-- Grant column-level privileges
GRANT SELECT (id, name, department) ON company_db.employees TO 'hr_user'@'localhost';

-- Grant execute privilege on stored procedures
GRANT EXECUTE ON PROCEDURE company_db.calculate_bonus TO 'app_user'@'localhost';

-- Revoke privileges
REVOKE DELETE ON company_db.* FROM 'app_user'@'localhost';

-- Show granted privileges for a user
SHOW GRANTS FOR 'username'@'hostname';

-- Apply privilege changes immediately
FLUSH PRIVILEGES;

Common Privilege Types

Privilege Description Risk Level
SELECT Read data from tables Low
INSERT Add new rows to tables Medium
UPDATE Modify existing data Medium
DELETE Remove rows from tables High
CREATE Create new databases/tables High
DROP Delete databases/tables Critical
GRANT OPTION Grant privileges to other users Critical
PROCESS View active queries Medium
FILE Read/write files on server Critical

Role-Based Access Control (RBAC)

MySQL 8.0+ supports roles for easier privilege management:

SQL Statements - Roles
-- Create roles
CREATE ROLE 'read_only', 'data_entry', 'db_admin';

-- Grant privileges to roles
GRANT SELECT ON company_db.* TO 'read_only';
GRANT SELECT, INSERT, UPDATE ON company_db.* TO 'data_entry';
GRANT ALL PRIVILEGES ON company_db.* TO 'db_admin';

-- Assign roles to users
GRANT 'read_only' TO 'report_user'@'localhost';
GRANT 'data_entry' TO 'clerk_user'@'localhost';

-- Set default roles
SET DEFAULT ROLE 'read_only' TO 'report_user'@'localhost';

-- Activate roles for current session
SET ROLE 'read_only';

-- View granted roles
SHOW GRANTS USING 'read_only';

Password Management and Policies

SQL Statements - Password Policies
-- Set password expiration policy (MySQL 8.0+)
ALTER USER 'app_user'@'localhost' PASSWORD EXPIRE INTERVAL 90 DAY;

-- Require password change on first login
ALTER USER 'new_user'@'localhost' PASSWORD EXPIRE;

-- Set account with no password (insecure - avoid)
ALTER USER 'test_user'@'localhost' IDENTIFIED BY '';

-- Configure password validation policy
SET GLOBAL validate_password.policy = MEDIUM;
SET GLOBAL validate_password.length = 8;
SET GLOBAL validate_password.number_count = 1;
SET GLOBAL validate_password.special_char_count = 1;

-- Install password validation component (MySQL 8.0+)
INSTALL COMPONENT 'file://component_validate_password';

Network Security and Remote Access

Configuration and SQL Examples
# In my.cnf - Bind to specific interface
bind-address = 127.0.0.1
# Or bind to specific IP
bind-address = 192.168.1.100

-- Create user with specific host restrictions
CREATE USER 'remote_user'@'192.168.1.%' IDENTIFIED BY 'password';
CREATE USER 'vpn_user'@'10.0.0.50' IDENTIFIED BY 'password';

-- Use SSL for secure connections
CREATE USER 'secure_user'@'%' REQUIRE SSL;
CREATE USER 'x509_user'@'%' REQUIRE X509;

-- Check connection information
SELECT user, host, db, command FROM information_schema.processlist;

Security Best Practices Example

Complete Security Setup Example
-- Create application-specific users with least privilege
CREATE USER 'web_app'@'192.168.1.100' IDENTIFIED BY 'StrongPass123!' PASSWORD EXPIRE INTERVAL 90 DAY;
CREATE USER 'backup_user'@'localhost' IDENTIFIED BY 'BackupPass456!' REQUIRE SSL;

-- Grant minimal required privileges
GRANT SELECT, INSERT, UPDATE ON ecommerce.* TO 'web_app'@'192.168.1.100';
GRANT SELECT, RELOAD, PROCESS, LOCK TABLES ON *.* TO 'backup_user'@'localhost';

-- Explicitly deny dangerous privileges
REVOKE DROP, CREATE, ALTER, GRANT OPTION ON ecommerce.* FROM 'web_app'@'192.168.1.100';

-- Create read-only user for reporting
CREATE ROLE 'reporting_role';
GRANT SELECT ON ecommerce.products, ecommerce.categories TO 'reporting_role';
GRANT 'reporting_role' TO 'analyst'@'%';
SET DEFAULT ROLE 'reporting_role' TO 'analyst'@'%';

FLUSH PRIVILEGES;

Security Auditing and Monitoring

SQL Statements for Security Auditing
-- View current user sessions
SELECT * FROM information_schema.processlist;

-- Check user privileges
SELECT * FROM information_schema.user_privileges;

-- Monitor failed login attempts (requires audit plugin)
SELECT * FROM mysql.error_log WHERE error_code = 1045;

-- Check for users without passwords
SELECT user, host FROM mysql.user WHERE authentication_string = '';

-- Find users with administrative privileges
SELECT user, host FROM mysql.user WHERE Super_priv = 'Y' OR Grant_priv = 'Y';

Common Security Risks and Mitigations

Risk Description Mitigation
Weak Passwords Easy-to-guess or default passwords Implement password policies, use strong passwords
Excessive Privileges Users have more access than needed Apply principle of least privilege
Network Exposure Database accessible from untrusted networks Use firewalls, bind to specific interfaces
SQL Injection Malicious SQL execution through applications Use prepared statements, input validation
Default Configurations Using insecure default settings Harden MySQL configuration, remove test databases
Principle of Least Privilege: Always grant users the minimum privileges necessary to perform their required tasks. Regularly review and audit user privileges.
Security Warning: Never use the root user for application connections. Always create dedicated users with restricted privileges for each application or service.
Security Checklist:
  • Change default root password immediately after installation
  • Remove anonymous users and test databases
  • Use strong, unique passwords for all users
  • Implement network security and firewalls
  • Regularly audit user privileges and access patterns
  • Enable logging and monitoring for suspicious activities
  • Keep MySQL updated with security patches

MySQL Performance Optimization

Performance Optimization: The process of improving database response times, throughput, and resource utilization through query tuning, indexing, configuration adjustments, and architectural improvements.

Performance Optimization Layers

MySQL performance optimization occurs at multiple levels, from query-level tuning to server-wide configuration. A systematic approach ensures comprehensive improvements across all layers.

Query Performance Analysis

Use EXPLAIN to analyze query execution plans and identify bottlenecks:

SQL Statements - Query Analysis
-- Basic EXPLAIN
EXPLAIN SELECT * FROM employees WHERE department = 'Sales';

-- EXPLAIN with format options
EXPLAIN FORMAT=JSON SELECT * FROM orders WHERE total_amount > 1000;

-- ANALYZE TABLE to update statistics
ANALYZE TABLE employees, departments;

-- Check index usage
SHOW INDEX FROM employees;

-- Find slow queries (requires slow query log)
SELECT * FROM mysql.slow_log ORDER BY start_time DESC LIMIT 10;

EXPLAIN Output Interpretation

Column Description Optimal Values
type Join type const, eq_ref, ref, range
possible_keys Possible indexes Should list relevant indexes
key Actual index used Should match query needs
rows Estimated rows examined Lower is better
Extra Additional information Avoid "Using filesort", "Using temporary"

Index Optimization Strategies

Proper indexing is crucial for query performance:

SQL Statements - Index Management
-- Create single-column index
CREATE INDEX idx_department ON employees(department);

-- Create composite index
CREATE INDEX idx_dept_salary ON employees(department, salary);

-- Create unique index
CREATE UNIQUE INDEX idx_email ON employees(email);

-- Create full-text index
CREATE FULLTEXT INDEX idx_description ON products(description);

-- Drop index
DROP INDEX idx_department ON employees;

-- Check index usage statistics
SELECT * FROM sys.schema_index_statistics WHERE table_schema = 'company_db';

Query Optimization Techniques

SQL Examples - Query Optimization
-- Instead of SELECT *, specify columns
SELECT id, name, email FROM employees; -- Good
SELECT * FROM employees; -- Avoid

-- Use WHERE instead of HAVING for filtering
SELECT department, AVG(salary) FROM employees WHERE department = 'Sales'; -- Good
SELECT department, AVG(salary) FROM employees GROUP BY department HAVING department = 'Sales'; -- Avoid

-- Use EXISTS instead of IN for large datasets
SELECT name FROM departments d WHERE EXISTS (SELECT 1 FROM employees e WHERE e.department_id = d.id); -- Good
SELECT name FROM departments WHERE id IN (SELECT department_id FROM employees); -- Less efficient

-- Use LIMIT to restrict result sets
SELECT * FROM orders ORDER BY order_date DESC LIMIT 100;

Server Configuration Optimization

Key my.cnf configuration parameters for performance:

Configuration File (my.cnf)
# Buffer Pool Settings (60-80% of available RAM for dedicated DB server)
innodb_buffer_pool_size = 16G
innodb_buffer_pool_instances = 8

# Log Settings
innodb_log_file_size = 2G
innodb_log_buffer_size = 256M

# Connection Settings
max_connections = 200
thread_cache_size = 16
table_open_cache = 4000

# Query Cache (MySQL 5.7) - Disable in MySQL 8.0+
query_cache_type = 0
query_cache_size = 0

# Temporary Tables
tmp_table_size = 256M
max_heap_table_size = 256M

Performance Schema and Sys Schema

Use built-in monitoring to identify performance issues:

SQL Statements - Performance Monitoring
-- Enable Performance Schema (usually enabled by default)
UPDATE performance_schema.setup_instruments SET ENABLED = 'YES', TIMED = 'YES';

-- Top 10 most time-consuming queries
SELECT query, db, exec_count, total_latency, avg_latency
FROM sys.statement_analysis
ORDER BY total_latency DESC LIMIT 10;

-- Check index usage
SELECT * FROM sys.schema_unused_indexes;

-- Monitor InnoDB metrics
SELECT * FROM sys.innodb_buffer_stats_by_schema;

-- Check table IO statistics
SELECT table_name, count_read, count_write
FROM sys.io_global_by_file_by_bytes
WHERE file LIKE '%ibd' ORDER BY total DESC LIMIT 10;

Connection and Thread Optimization

SQL Statements - Connection Management
-- Show current connections
SHOW PROCESSLIST;
SELECT * FROM information_schema.processlist;

-- Kill long-running queries
KILL 1234;

-- Monitor connection statistics
SHOW STATUS LIKE 'Threads_%';
SHOW STATUS LIKE 'Aborted_%';

-- Check max used connections
SHOW STATUS LIKE 'Max_used_connections';

-- Analyze connection errors
SHOW STATUS LIKE 'Connection_errors_%';

Table Optimization and Maintenance

SQL Statements - Table Maintenance
-- Optimize table (defragment)
OPTIMIZE TABLE employees, orders;

-- Check table fragmentation
SELECT TABLE_NAME, DATA_FREE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'company_db' AND DATA_FREE > 0;

-- Check table statistics
SHOW TABLE STATUS LIKE 'employees';

-- Repair table (if corrupted)
REPAIR TABLE corrupted_table;

-- Convert table to different storage engine
ALTER TABLE my_table ENGINE = InnoDB;

Partitioning for Large Tables

Partitioning improves performance for large datasets:

SQL Statements - Table Partitioning
-- Create range-partitioned table
CREATE TABLE sales (
  sale_id INT,
  sale_date DATE,
  amount DECIMAL(10,2)
) PARTITION BY RANGE (YEAR(sale_date)) (
  PARTITION p2020 VALUES LESS THAN (2021),
  PARTITION p2021 VALUES LESS THAN (2022),
  PARTITION p2022 VALUES LESS THAN (2023),
  PARTITION p2023 VALUES LESS THAN (2024),
  PARTITION pfuture VALUES LESS THAN MAXVALUE
);

-- Query specific partition
SELECT * FROM sales PARTITION (p2023);

-- Add new partition
ALTER TABLE sales REORGANIZE PARTITION pfuture INTO (
  PARTITION p2024 VALUES LESS THAN (2025),
  PARTITION pfuture VALUES LESS THAN MAXVALUE
);

Caching Strategies

Cache Type Purpose Configuration
InnoDB Buffer Pool Cache data and indexes innodb_buffer_pool_size
Query Cache Cache query results (deprecated) query_cache_type, query_cache_size
Table Cache Cache table descriptors table_open_cache
Thread Cache Cache worker threads thread_cache_size
Application Cache Application-level caching Redis, Memcached

Performance Monitoring Script

Shell Script - Performance Check
#!/bin/bash
# MySQL Performance Health Check

echo "=== MySQL Performance Report ==="
echo "Generated: $(date)"
echo

# Check key metrics
mysql -e "SHOW GLOBAL STATUS LIKE 'Questions';"
mysql -e "SHOW GLOBAL STATUS LIKE 'Threads_connected';"
mysql -e "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_hit_rate';"
mysql -e "SHOW GLOBAL STATUS LIKE 'Slow_queries';"
mysql -e "SHOW GLOBAL STATUS LIKE 'Table_locks_waited';"

# Check current processes
echo "=== Top 10 Long Running Queries ==="
mysql -e "SELECT id, user, host, db, command, time, state, info
FROM information_schema.processlist
WHERE command != 'Sleep'
ORDER BY time DESC LIMIT 10\G"
Performance Tuning Approach: Always measure before and after changes. Use a systematic approach: identify bottlenecks, implement one change at a time, and measure the impact.
Configuration Warning: Avoid copying configuration settings from other servers without understanding your workload and hardware. Oversized buffers can cause swapping and degrade performance.
Quick Wins for Performance:
  • Add indexes on frequently queried columns
  • Optimize slow queries identified in slow query log
  • Increase InnoDB buffer pool size (if RAM available)
  • Use connection pooling in applications
  • Regularly maintain tables with OPTIMIZE TABLE
  • Monitor and kill long-running queries
  • Use appropriate data types to reduce storage

Advanced MySQL Features

Advanced MySQL Features: Powerful capabilities beyond basic SQL operations including window functions, Common Table Expressions (CTEs), JSON support, full-text search, geographic data handling, and enterprise-grade features for complex data processing and analysis.

Window Functions

Window functions perform calculations across a set of table rows that are somehow related to the current row, without collapsing rows like GROUP BY does.

SQL Statements - Window Functions
-- ROW_NUMBER() - Assign unique numbers to rows
SELECT
  employee_id,
  name,
  salary,
  ROW_NUMBER() OVER (ORDER BY salary DESC) AS salary_rank
FROM employees;

-- RANK() and DENSE_RANK() - Handle ties differently
SELECT
  employee_id,
  department,
  salary,
  RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank,
  DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_dense_rank
FROM employees;

-- Running totals with SUM() window function
SELECT
  order_date,
  daily_revenue,
  SUM(daily_revenue) OVER (ORDER BY order_date) AS running_total
FROM daily_sales;

-- LAG and LEAD - Access previous/next row values
SELECT
  month,
  revenue,
  LAG(revenue) OVER (ORDER BY month) AS prev_month_revenue,
  LEAD(revenue) OVER (ORDER BY month) AS next_month_revenue
FROM monthly_revenue;

Common Table Expressions (CTEs)

CTEs provide better readability and enable recursive queries:

SQL Statements - Common Table Expressions
-- Simple CTE for better query organization
WITH department_stats AS (
  SELECT
    department,
    AVG(salary) AS avg_salary,
    COUNT(*) AS employee_count
  FROM employees
  GROUP BY department
)
SELECT * FROM department_stats WHERE avg_salary > 50000;

-- Multiple CTEs in single query
WITH
  high_earners AS (
    SELECT employee_id, name, salary FROM employees WHERE salary > 80000
  ),
  department_high_earners AS (
    SELECT e.department, COUNT(*) AS high_earner_count
    FROM employees e
    JOIN high_earners h ON e.employee_id = h.employee_id
    GROUP BY e.department
  )
SELECT * FROM department_high_earners ORDER BY high_earner_count DESC;

-- Recursive CTE for hierarchical data
WITH RECURSIVE employee_hierarchy AS (
  -- Anchor member: top-level managers
  SELECT employee_id, name, manager_id, 1 AS level
  FROM employees
  WHERE manager_id IS NULL
  UNION ALL
  -- Recursive member: subordinates
  SELECT e.employee_id, e.name, e.manager_id, eh.level + 1
  FROM employees e
  JOIN employee_hierarchy eh ON e.manager_id = eh.employee_id
)
SELECT * FROM employee_hierarchy ORDER BY level, employee_id;

JSON Support

MySQL provides extensive JSON data type support and functions:

SQL Statements - JSON Operations
-- Create table with JSON column
CREATE TABLE products (
  product_id INT PRIMARY KEY AUTO_INCREMENT,
  name VARCHAR(100) NOT NULL,
  details JSON,
  attributes JSON
);

-- Insert JSON data
INSERT INTO products (name, details, attributes) VALUES (
  'Laptop',
  '{"brand": "Dell", "specs": {"ram": "16GB", "storage": "512GB"}, "price": 999.99}',
  '["portable", "gaming", "business"]'
);

-- Extract JSON values
SELECT
  name,
  details->'$.brand' AS brand,
  details->'$.specs.ram' AS ram,
  JSON_EXTRACT(details, '$.price') AS price
FROM products;

-- Query JSON data
SELECT name, details
FROM products
WHERE details->'$.specs.ram' = '"16GB"';

-- Modify JSON data
UPDATE products
  SET details = JSON_SET(details, '$.price', 899.99)
  WHERE product_id = 1;

-- JSON aggregation
SELECT
  department,
  JSON_ARRAYAGG(JSON_OBJECT('id', employee_id, 'name', name)) AS employees
FROM employees
GROUP BY department;

Full-Text Search

Advanced text searching capabilities in MySQL:

SQL Statements - Full-Text Search
-- Create full-text index
CREATE TABLE articles (
  article_id INT PRIMARY KEY AUTO_INCREMENT,
  title VARCHAR(255),
  content TEXT,
  FULLTEXT(title, content)
);

-- Natural language full-text search
SELECT title, MATCH(title, content) AGAINST('database optimization') AS relevance
FROM articles
WHERE MATCH(title, content) AGAINST('database optimization')
ORDER BY relevance DESC;

-- Boolean mode full-text search
SELECT title
FROM articles
WHERE MATCH(title, content) AGAINST('+MySQL -Oracle' IN BOOLEAN MODE);

-- Query expansion (MySQL 8.0+)
SELECT title
FROM articles
WHERE MATCH(title, content) AGAINST('database' WITH QUERY EXPANSION);

Geographic Data and Spatial Features

MySQL supports geographic data types and spatial functions:

SQL Statements - Spatial Data
-- Create table with spatial data
CREATE TABLE locations (
  location_id INT PRIMARY KEY AUTO_INCREMENT,
  name VARCHAR(100),
  coordinates POINT NOT NULL,
  SPATIAL INDEX(coordinates)
);

-- Insert spatial data
INSERT INTO locations (name, coordinates) VALUES
  ('Office', ST_GeomFromText('POINT(40.7128 -74.0060)')),
  ('Store', ST_GeomFromText('POINT(40.7589 -73.9851)'));

-- Calculate distance between points
SELECT
  name,
  ST_Distance_Sphere(
    ST_GeomFromText('POINT(40.7128 -74.0060)'),
    coordinates
  ) AS distance_meters
FROM locations
ORDER BY distance_meters;

-- Find points within radius
SELECT name
FROM locations
WHERE ST_Distance_Sphere(
  ST_GeomFromText('POINT(40.7128 -74.0060)'),
  coordinates
) <= 5000; -- 5km radius

Generated Columns

Automatically computed columns based on other columns:

SQL Statements - Generated Columns
-- Virtual generated column (computed on read)
CREATE TABLE products (
  product_id INT PRIMARY KEY,
  price DECIMAL(10,2),
  quantity INT,
  total_value DECIMAL(12,2) GENERATED ALWAYS AS (price * quantity) VIRTUAL
);

-- Stored generated column (computed on write)
CREATE TABLE employees (
  employee_id INT PRIMARY KEY,
  first_name VARCHAR(50),
  last_name VARCHAR(50),
  full_name VARCHAR(101) GENERATED ALWAYS AS (CONCAT(first_name, ' ', last_name)) STORED
);

-- JSON extracted to generated column
CREATE TABLE products_json (
  product_id INT PRIMARY KEY,
  details JSON,
  brand VARCHAR(50) GENERATED ALWAYS AS (details->'$.brand') VIRTUAL
);

Advanced Indexing Features

SQL Statements - Advanced Indexing
-- Descending indexes (MySQL 8.0+)
CREATE INDEX idx_salary_desc ON employees (salary DESC);

-- Invisible indexes (for testing index removal)
CREATE INDEX idx_test ON employees(department) INVISIBLE;
ALTER INDEX idx_test VISIBLE;

-- Functional indexes (MySQL 8.0+)
CREATE INDEX idx_name_lower ON employees ((LOWER(name)));

-- Multi-valued indexes for JSON arrays
CREATE INDEX idx_attributes ON products ( (CAST(attributes AS CHAR(50) ARRAY)) );

Enterprise Features

Feature Description Use Case
Data Masking Hide sensitive data from unauthorized users GDPR compliance, development environments
Audit Logging Comprehensive logging of database activities Security compliance, troubleshooting
Thread Pool Efficient connection handling for high concurrency Web applications with many concurrent users
Enterprise Backup Hot backups with minimal performance impact Large production databases
Enterprise Monitor Comprehensive monitoring and alerting Production database management

Performance Schema Advanced Usage

SQL Statements - Advanced Performance Monitoring
-- Monitor memory usage
SELECT * FROM performance_schema.memory_summary_global_by_event_name
ORDER BY CURRENT_NUMBER_OF_BYTES_USED DESC LIMIT 10;

-- Analyze table IO
SELECT OBJECT_NAME, COUNT_READ, COUNT_WRITE, SUM_NUMBER_OF_BYTES_READ
FROM performance_schema.table_io_waits_summary_by_table
ORDER BY SUM_TIMER_WAIT DESC LIMIT 10;

-- Monitor replication lag
SELECT CHANNEL_NAME, SERVICE_STATE, LAST_ERROR_NUMBER, LAST_ERROR_MESSAGE
FROM performance_schema.replication_connection_status;
Version Compatibility: Many advanced features like window functions, CTEs, and descending indexes are available in MySQL 8.0+. Always check your MySQL version before using these features.
Performance Considerations: Advanced features like JSON operations and full-text search can be resource-intensive. Test performance implications before deploying to production.
When to Use Advanced Features:
  • Use window functions for complex analytics and rankings
  • Use CTEs for better query organization and recursive queries
  • Use JSON for flexible schema and semi-structured data
  • Use full-text search for text-heavy applications
  • Use spatial features for location-based applications
  • Use generated columns for computed values and data integrity