Week 2: SQL Fundamentals & Database Basics - Complete Guide | EDUSHARK TRAINING

Week 2: SQL Fundamentals & Database Basics

Master the Language of Data - From Basic Queries to Complex JOINs

Duration: 7 Days | Level: Beginner to Intermediate | Prerequisite: Week 1

Welcome to Week 2! Now that you've built a solid foundation in data concepts and Excel, it's time to learn the most essential skill for any data analyst - SQL (Structured Query Language).

SQL is the universal language for communicating with databases. Whether you're working at a startup or a Fortune 500 company, SQL will be your primary tool for extracting, analyzing, and manipulating data. By the end of this week, you'll be writing queries that would have taken hours in Excel - but take just seconds in SQL.

Let's unlock the power of databases!

Day 1

Introduction to Databases

Understanding How Data is Stored and Organized

What is a Database?

Database: A database is an organized collection of structured data stored electronically. It allows you to store, retrieve, modify, and manage large amounts of information efficiently.

Think of a database like a super-powered Excel workbook. While Excel can handle thousands of rows, databases can handle millions or even billions of records while still being fast and efficient.

Real-World Example: Your Favorite E-commerce Site

When you shop online, the website uses multiple databases to store:

  • Product catalog: All products with names, descriptions, prices, images
  • Customer information: Your name, address, email, order history
  • Inventory: Stock levels for each product
  • Orders: Every transaction, payment status, shipping details
  • Reviews: Customer ratings and feedback

Types of Databases

Type Description Examples Best For
Relational (SQL) Data stored in tables with rows and columns, linked by relationships MySQL, PostgreSQL, SQL Server, Oracle Structured data, business applications
NoSQL Flexible schema, stores data in documents, key-value pairs, or graphs MongoDB, Cassandra, Redis Unstructured data, real-time apps
Data Warehouse Optimized for analytics and reporting on large datasets Snowflake, BigQuery, Redshift Business intelligence, analytics

Focus: Relational Databases

In this course, we focus on relational databases because they're the most common in business environments and essential for data analysts. Over 70% of enterprise data is stored in relational databases!

Relational Database Concepts

Tables

A table is like a spreadsheet - it has rows and columns. Each table stores data about one type of entity (customers, products, orders, etc.).

Example: Customers Table

customer_id first_name last_name email city
1 Rahul Sharma rahul@email.com Mumbai
2 Priya Patel priya@email.com Delhi
3 Amit Kumar amit@email.com Bangalore

Key Terminology

  • Row (Record): A single entry in a table (e.g., one customer)
  • Column (Field): A single attribute (e.g., email address)
  • Primary Key: A unique identifier for each row (e.g., customer_id)
  • Foreign Key: A column that links to another table's primary key
  • Schema: The structure of the database (tables, columns, relationships)

Primary Keys and Foreign Keys

These are crucial for understanding how tables relate to each other:

Customers Table

Primary Key: customer_id uniquely identifies each customer

customer_id (PK)name
1Rahul
2Priya

Orders Table

Foreign Key: customer_id links to the customers table

order_id (PK)customer_id (FK)amount
10115000
10213000
10327500

Notice how customer_id in Orders references customer_id in Customers - this is the relationship!

Entity-Relationship (ER) Diagrams

ER diagrams visually show how tables relate to each other. Understanding these helps you write better queries.

Common Relationship Types

  • One-to-One (1:1): Each record in Table A relates to exactly one record in Table B (e.g., User and User_Profile)
  • One-to-Many (1:N): One record in Table A relates to many records in Table B (e.g., Customer to Orders)
  • Many-to-Many (M:N): Multiple records in Table A relate to multiple records in Table B (e.g., Students and Courses)

Introduction to SQL

SQL (Structured Query Language): A standardized programming language used to manage and manipulate relational databases. Pronounced "S-Q-L" or "sequel."

SQL allows you to:

  • Query data: Retrieve information from tables
  • Insert data: Add new records
  • Update data: Modify existing records
  • Delete data: Remove records
  • Create structures: Design tables and databases

SQL Command Categories

Category Purpose Commands
DQL (Data Query Language) Retrieve data SELECT
DML (Data Manipulation Language) Modify data INSERT, UPDATE, DELETE
DDL (Data Definition Language) Define structure CREATE, ALTER, DROP
DCL (Data Control Language) Control access GRANT, REVOKE

For Data Analysts

As a data analyst, you'll primarily use SELECT statements (DQL) to query and analyze data. You typically won't need to create tables or modify data directly - that's usually handled by database administrators or data engineers.

Setting Up Your SQL Environment

To practice SQL, you'll need a database system. Here are the most popular options:

  • MySQL + MySQL Workbench: Free, popular, great for beginners
  • PostgreSQL + pgAdmin: Free, powerful, industry standard
  • SQLite + DB Browser: Lightweight, no server needed, perfect for learning
  • Online platforms: SQLZoo, Mode Analytics, W3Schools SQL Tryit

Recommended for Beginners

Start with MySQL Workbench or use online platforms like SQLZoo to practice without any installation.

Day 1 Key Takeaways

  • A database is an organized collection of structured data
  • Relational databases store data in tables linked by relationships
  • Primary keys uniquely identify records; foreign keys link tables
  • SQL is the standard language for querying relational databases
  • As data analysts, we focus primarily on SELECT queries
Day 2

Basic SELECT Queries

Your First Steps in Retrieving Data

Today you'll write your first SQL queries! The SELECT statement is the foundation of data analysis in SQL - it's how you retrieve data from tables.

Practice Database Setup

For this week's exercises, we'll use a sample e-commerce database with these tables:

  • customers - Customer information
  • products - Product catalog
  • orders - Order transactions
  • order_items - Items in each order
  • categories - Product categories

Download Practice Database Schema

The Basic SELECT Statement

Syntax:
SELECT column1, column2, ... FROM table_name;

Select All Columns

Use the asterisk (*) to select all columns from a table:

SELECT * FROM customers;

Result: Returns ALL columns and ALL rows from the customers table

customer_idfirst_namelast_nameemailcity
1RahulSharmarahul@email.comMumbai
2PriyaPatelpriya@email.comDelhi

Avoid SELECT * in Production

While SELECT * is convenient for exploring data, avoid it in production queries. It retrieves unnecessary columns, slows down queries, and can break if table structure changes. Always specify the columns you need.

Select Specific Columns

List only the columns you need:

SELECT first_name, last_name, email FROM customers;

Result: Only returns these three columns

first_namelast_nameemail
RahulSharmarahul@email.com
PriyaPatelpriya@email.com

Column Aliases with AS

Rename columns in your output for clarity using the AS keyword:

SELECT first_name AS 'First Name', last_name AS 'Last Name', email AS 'Email Address' FROM customers;

Result:

First NameLast NameEmail Address
RahulSharmarahul@email.com

Pro Tip: Alias Syntax

The AS keyword is optional. You can also write: first_name 'First Name'. However, using AS makes your code more readable.

DISTINCT - Removing Duplicates

Use DISTINCT to get unique values only:

-- Get all unique cities where customers live SELECT DISTINCT city FROM customers;

Result:

city
Mumbai
Delhi
Bangalore
-- DISTINCT on multiple columns SELECT DISTINCT city, state FROM customers; -- Returns unique combinations of city AND state

LIMIT and OFFSET

Control how many rows are returned:

-- Get only the first 10 rows SELECT * FROM products LIMIT 10; -- Skip the first 5 rows, then get 10 rows (pagination) SELECT * FROM products LIMIT 10 OFFSET 5; -- MySQL shorthand: LIMIT offset, count SELECT * FROM products LIMIT 5, 10;

Pagination Example

To show results page by page (10 items per page):

  • Page 1: LIMIT 10 OFFSET 0
  • Page 2: LIMIT 10 OFFSET 10
  • Page 3: LIMIT 10 OFFSET 20

Comments in SQL

Good practice: Always comment your code!

-- This is a single-line comment /* This is a multi-line comment for longer explanations */ SELECT customer_id, first_name, -- Customer's first name last_name -- Customer's last name FROM customers;

Calculated Columns

You can perform calculations and create new columns in your SELECT:

SELECT product_name, price, quantity, price * quantity AS total_value FROM products; -- Another example: Calculate with a discount SELECT product_name, price AS original_price, price * 0.9 AS discounted_price -- 10% discount FROM products;

Try It Yourself - Day 2 Exercises

Practice these queries on your sample database:

  1. Select all columns from the products table
  2. Select only product_name and price from products
  3. Get a list of all unique categories
  4. Select the first 5 products with aliases for column names
  5. Create a calculated column showing price with 18% GST added

Day 2 Key Takeaways

  • SELECT retrieves data from tables; specify columns or use * for all
  • Use AS to create meaningful column aliases
  • DISTINCT removes duplicate values from results
  • LIMIT controls how many rows to return; OFFSET skips rows
  • You can perform calculations and create new columns in SELECT
Day 3

Filtering Data with WHERE

Finding Exactly What You Need

Selecting all data is rarely useful. The real power of SQL comes from filtering - retrieving only the records that match specific criteria. The WHERE clause is your filtering tool.

The WHERE Clause

Syntax:
SELECT columns FROM table_name WHERE condition;

Comparison Operators

Operator Description Example
= Equal to WHERE city = 'Mumbai'
<> or != Not equal to WHERE status <> 'Cancelled'
> Greater than WHERE price > 1000
< Less than WHERE quantity < 10
>= Greater than or equal WHERE rating >= 4
<= Less than or equal WHERE age <= 30
-- Find all products priced above 5000 SELECT product_name, price FROM products WHERE price > 5000; -- Find customers from Mumbai SELECT * FROM customers WHERE city = 'Mumbai'; -- Find orders that are not delivered yet SELECT * FROM orders WHERE status <> 'Delivered';

Logical Operators: AND, OR, NOT

Combine multiple conditions using logical operators:

-- AND: Both conditions must be true SELECT * FROM products WHERE category = 'Electronics' AND price > 10000; -- OR: Either condition can be true SELECT * FROM customers WHERE city = 'Mumbai' OR city = 'Delhi'; -- NOT: Negates a condition SELECT * FROM orders WHERE NOT status = 'Cancelled'; -- Complex conditions with parentheses SELECT * FROM products WHERE (category = 'Electronics' OR category = 'Appliances') AND price < 5000;

Watch Your Parentheses!

AND has higher precedence than OR. Use parentheses to ensure conditions are evaluated in the correct order. A OR B AND C is interpreted as A OR (B AND C), not (A OR B) AND C.

BETWEEN - Range Filtering

Use BETWEEN to filter values within a range (inclusive):

-- Products priced between 1000 and 5000 SELECT * FROM products WHERE price BETWEEN 1000 AND 5000; -- Equivalent to: SELECT * FROM products WHERE price >= 1000 AND price <= 5000; -- Orders placed in January 2024 SELECT * FROM orders WHERE order_date BETWEEN '2024-01-01' AND '2024-01-31';

IN - Multiple Value Matching

Use IN to match against multiple values (cleaner than multiple ORs):

-- Customers from specific cities SELECT * FROM customers WHERE city IN ('Mumbai', 'Delhi', 'Bangalore', 'Chennai'); -- Much cleaner than: SELECT * FROM customers WHERE city = 'Mumbai' OR city = 'Delhi' OR city = 'Bangalore' OR city = 'Chennai'; -- NOT IN: Exclude specific values SELECT * FROM orders WHERE status NOT IN ('Cancelled', 'Returned');

LIKE - Pattern Matching

Use LIKE for text pattern matching with wildcards:

Wildcard Meaning Example
% Any sequence of characters (including none) 'A%' matches 'Apple', 'Air', 'A'
_ Exactly one character '_at' matches 'Cat', 'Bat', 'Hat'
-- Products starting with 'Samsung' SELECT * FROM products WHERE product_name LIKE 'Samsung%'; -- Products containing 'Phone' anywhere SELECT * FROM products WHERE product_name LIKE '%Phone%'; -- Products ending with 'Pro' SELECT * FROM products WHERE product_name LIKE '%Pro'; -- Emails from gmail SELECT * FROM customers WHERE email LIKE '%@gmail.com'; -- Names with exactly 5 characters SELECT * FROM customers WHERE first_name LIKE '_____';

IS NULL and IS NOT NULL

Handle missing (NULL) values carefully:

-- Find customers without phone numbers SELECT * FROM customers WHERE phone IS NULL; -- Find orders that have been shipped (shipping_date is NOT empty) SELECT * FROM orders WHERE shipping_date IS NOT NULL; -- WRONG: NULL cannot be compared with = -- SELECT * FROM customers WHERE phone = NULL; -- Won't work!

Understanding NULL

NULL means "unknown" or "no value" - it's not the same as 0 or an empty string. You cannot use = or <> with NULL; you must use IS NULL or IS NOT NULL.

Try It Yourself - Day 3 Exercises

  1. Find all products priced between Rs. 500 and Rs. 2000
  2. Find customers from Mumbai, Delhi, or Bangalore
  3. Find products whose name contains "Wireless"
  4. Find orders placed in January 2024 that are still pending
  5. Find customers who haven't provided their phone number

Day 3 Key Takeaways

  • WHERE filters rows based on specified conditions
  • Use comparison operators (=, <>, >, <, >=, <=) for basic filtering
  • Combine conditions with AND, OR, NOT; use parentheses for clarity
  • BETWEEN filters ranges; IN matches multiple values; LIKE handles patterns
  • Always use IS NULL / IS NOT NULL for NULL comparisons
Day 4

Sorting & Aggregations

Organizing and Summarizing Data

Today we learn two powerful capabilities: sorting data with ORDER BY, and summarizing data with aggregate functions and GROUP BY. These are essential for creating meaningful reports and analytics.

ORDER BY - Sorting Results

Syntax:
SELECT columns FROM table ORDER BY column [ASC|DESC];
-- Sort products by price (ascending - default) SELECT product_name, price FROM products ORDER BY price; -- Sort by price descending (highest first) SELECT product_name, price FROM products ORDER BY price DESC; -- Sort alphabetically by name SELECT * FROM customers ORDER BY last_name ASC; -- Sort by multiple columns SELECT * FROM products ORDER BY category ASC, price DESC; -- First by category A-Z, then within each category by price highest-first -- Sort by column position (not recommended but useful to know) SELECT product_name, price, quantity FROM products ORDER BY 2 DESC; -- Sorts by 2nd column (price)

Aggregate Functions

Aggregate functions perform calculations across multiple rows and return a single value:

Function Description Example
COUNT() Counts number of rows COUNT(*), COUNT(column)
SUM() Adds up values SUM(amount)
AVG() Calculates average AVG(price)
MIN() Finds minimum value MIN(price)
MAX() Finds maximum value MAX(price)
-- Count total number of customers SELECT COUNT(*) AS total_customers FROM customers; -- Count non-null phone numbers SELECT COUNT(phone) AS customers_with_phone FROM customers; -- Calculate total revenue SELECT SUM(amount) AS total_revenue FROM orders; -- Find average product price SELECT AVG(price) AS average_price FROM products; -- Find cheapest and most expensive products SELECT MIN(price) AS cheapest, MAX(price) AS most_expensive FROM products; -- Multiple aggregations in one query SELECT COUNT(*) AS total_orders, SUM(amount) AS total_revenue, AVG(amount) AS avg_order_value, MIN(amount) AS smallest_order, MAX(amount) AS largest_order FROM orders;

GROUP BY - Grouping Data

GROUP BY divides data into groups and applies aggregate functions to each group:

Syntax:
SELECT column, AGGREGATE() FROM table GROUP BY column;
-- Count customers by city SELECT city, COUNT(*) AS customer_count FROM customers GROUP BY city; -- Total sales by product category SELECT category, SUM(price * quantity) AS total_sales FROM products GROUP BY category; -- Average order value by customer SELECT customer_id, COUNT(*) AS order_count, AVG(amount) AS avg_order_value FROM orders GROUP BY customer_id; -- Monthly sales summary SELECT YEAR(order_date) AS year, MONTH(order_date) AS month, COUNT(*) AS orders, SUM(amount) AS revenue FROM orders GROUP BY YEAR(order_date), MONTH(order_date) ORDER BY year, month;

GROUP BY Rule

Every column in your SELECT that is NOT inside an aggregate function MUST appear in the GROUP BY clause. Otherwise, you'll get an error.

HAVING - Filtering Groups

HAVING filters groups after aggregation (WHERE filters rows before aggregation):

-- Cities with more than 100 customers SELECT city, COUNT(*) AS customer_count FROM customers GROUP BY city HAVING COUNT(*) > 100; -- Categories with total sales over 1 lakh SELECT category, SUM(amount) AS total_sales FROM orders GROUP BY category HAVING SUM(amount) > 100000; -- Customers with average order value above 5000 SELECT customer_id, AVG(amount) AS avg_order FROM orders GROUP BY customer_id HAVING AVG(amount) > 5000;

WHERE vs HAVING

WHERE HAVING
Filters individual rows Filters grouped results
Applied BEFORE grouping Applied AFTER grouping
Cannot use aggregate functions Can use aggregate functions
WHERE price > 100 HAVING SUM(price) > 100
-- Combining WHERE and HAVING SELECT category, SUM(amount) AS total_sales FROM orders WHERE order_date >= '2024-01-01' -- Filter rows first GROUP BY category HAVING SUM(amount) > 50000 -- Filter groups after ORDER BY total_sales DESC;

Query Execution Order

Understanding the order helps you write correct queries:

  1. FROM - Identify the table(s)
  2. WHERE - Filter individual rows
  3. GROUP BY - Group the remaining rows
  4. HAVING - Filter the groups
  5. SELECT - Choose columns and calculate aggregates
  6. ORDER BY - Sort the results
  7. LIMIT - Limit the output

Try It Yourself - Day 4 Exercises

  1. Find the top 10 most expensive products (sorted by price descending)
  2. Calculate the total number of orders and total revenue
  3. Find total sales by region, sorted highest to lowest
  4. Count orders per customer, show only those with more than 5 orders
  5. Calculate monthly revenue for 2024, ordered chronologically

Day 4 Key Takeaways

  • ORDER BY sorts results; use ASC (default) or DESC
  • Aggregate functions (COUNT, SUM, AVG, MIN, MAX) summarize data
  • GROUP BY divides data into groups for aggregation
  • HAVING filters groups (after aggregation); WHERE filters rows (before)
  • Query execution order: FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY
Day 5

SQL Functions

Transform and Manipulate Your Data

SQL provides built-in functions to transform data within your queries. Today we'll cover string functions, numeric functions, date functions, and conditional logic.

String Functions

Manipulate text data:

-- UPPER and LOWER: Change case SELECT UPPER(first_name) AS upper_name, LOWER(email) AS lower_email FROM customers; -- CONCAT: Join strings together SELECT CONCAT(first_name, ' ', last_name) AS full_name FROM customers; -- LENGTH: Count characters SELECT product_name, LENGTH(product_name) AS name_length FROM products; -- SUBSTRING: Extract part of a string SELECT SUBSTRING(phone, 1, 3) AS area_code -- First 3 characters FROM customers; -- TRIM: Remove leading/trailing spaces SELECT TRIM(product_name) FROM products; SELECT LTRIM(product_name) FROM products; -- Left trim only SELECT RTRIM(product_name) FROM products; -- Right trim only -- REPLACE: Replace text SELECT REPLACE(phone, '-', '') AS phone_no_dashes FROM customers; -- LEFT and RIGHT: Extract from start or end SELECT LEFT(product_code, 3) AS category_prefix, RIGHT(product_code, 4) AS item_number FROM products;

Numeric Functions

Work with numbers:

-- ROUND: Round to specified decimal places SELECT price, ROUND(price, 0) AS rounded_price, ROUND(price, 2) AS two_decimals FROM products; -- CEIL and FLOOR: Round up or down SELECT CEIL(4.2) AS ceiling, -- Returns 5 FLOOR(4.8) AS floor_val; -- Returns 4 -- ABS: Absolute value SELECT ABS(-100) AS absolute; -- Returns 100 -- MOD: Modulo (remainder) SELECT MOD(17, 5) AS remainder; -- Returns 2 -- Calculate discount price SELECT product_name, price, ROUND(price * 0.9, 2) AS discounted_price FROM products;

Date Functions

Work with dates and times:

-- Current date and time SELECT NOW() AS current_datetime, CURDATE() AS current_date, CURTIME() AS current_time; -- Extract parts of a date SELECT order_date, YEAR(order_date) AS year, MONTH(order_date) AS month, DAY(order_date) AS day, DAYNAME(order_date) AS day_name, MONTHNAME(order_date) AS month_name FROM orders; -- Format dates SELECT DATE_FORMAT(order_date, '%d-%m-%Y') AS formatted_date, DATE_FORMAT(order_date, '%M %d, %Y') AS long_format FROM orders; -- Date differences SELECT order_date, shipping_date, DATEDIFF(shipping_date, order_date) AS days_to_ship FROM orders; -- Add or subtract dates SELECT order_date, DATE_ADD(order_date, INTERVAL 7 DAY) AS expected_delivery, DATE_SUB(order_date, INTERVAL 1 MONTH) AS one_month_ago FROM orders;

CASE WHEN - Conditional Logic

Create conditional columns (like IF in Excel):

Syntax:
CASE
  WHEN condition1 THEN result1
  WHEN condition2 THEN result2
  ELSE default_result
END
-- Categorize products by price SELECT product_name, price, CASE WHEN price < 1000 THEN 'Budget' WHEN price < 5000 THEN 'Mid-Range' WHEN price < 10000 THEN 'Premium' ELSE 'Luxury' END AS price_category FROM products; -- Categorize customers by order count SELECT customer_id, COUNT(*) AS order_count, CASE WHEN COUNT(*) = 1 THEN 'New' WHEN COUNT(*) <= 5 THEN 'Regular' ELSE 'VIP' END AS customer_type FROM orders GROUP BY customer_id; -- Calculate bonus based on performance SELECT employee_name, sales_amount, CASE WHEN sales_amount >= 100000 THEN sales_amount * 0.10 WHEN sales_amount >= 50000 THEN sales_amount * 0.05 ELSE 0 END AS bonus FROM employees;

COALESCE and IFNULL

Handle NULL values gracefully:

-- COALESCE: Return first non-NULL value SELECT customer_name, COALESCE(phone, email, 'No Contact') AS contact_info FROM customers; -- IFNULL (MySQL): Replace NULL with a default SELECT product_name, IFNULL(discount, 0) AS discount FROM products; -- Calculate with NULL handling SELECT product_name, price, COALESCE(discount, 0) AS discount, price - COALESCE(discount, 0) AS final_price FROM products;

Try It Yourself - Day 5 Exercises

  1. Create full customer names by concatenating first and last name
  2. Extract the year and month from order dates
  3. Calculate days between order date and delivery date
  4. Create a price category column (Budget/Mid/Premium) using CASE
  5. Handle NULL phone numbers by replacing with 'Not Provided'

Day 5 Key Takeaways

  • String functions: UPPER, LOWER, CONCAT, SUBSTRING, TRIM, REPLACE
  • Numeric functions: ROUND, CEIL, FLOOR, ABS, MOD
  • Date functions: YEAR, MONTH, DAY, DATEDIFF, DATE_ADD, DATE_FORMAT
  • CASE WHEN creates conditional logic in queries
  • COALESCE and IFNULL handle NULL values gracefully
Day 6

Introduction to JOINs

Combining Data from Multiple Tables

Real-world databases have data spread across multiple related tables. JOINs allow you to combine data from these tables based on related columns. This is one of the most important SQL skills!

Why Do We Need JOINs?

Consider this scenario: You want to see order details including customer names. But:

  • The orders table has order_id, customer_id, amount, date
  • The customers table has customer_id, name, email, city

To get customer names with their orders, you need to JOIN these tables!

Types of JOINs

JOIN Type Description Returns
INNER JOIN Only matching records from both tables Intersection
LEFT JOIN All from left table + matches from right Left + matching right
RIGHT JOIN All from right table + matches from left Right + matching left
FULL OUTER JOIN All records from both tables Everything (Union)

INNER JOIN

Returns only rows that have matching values in both tables:

SELECT orders.order_id, orders.order_date, orders.amount, customers.first_name, customers.last_name, customers.city FROM orders INNER JOIN customers ON orders.customer_id = customers.customer_id; -- Using table aliases for cleaner code SELECT o.order_id, o.order_date, o.amount, c.first_name, c.last_name, c.city FROM orders o INNER JOIN customers c ON o.customer_id = c.customer_id;

INNER JOIN Visual

If Customer A has orders but Customer B has no orders:

  • Customer A's orders: Included
  • Customer B: NOT included (no matching orders)

Only rows with matches in BOTH tables appear in results.

LEFT JOIN (LEFT OUTER JOIN)

Returns ALL rows from the left table, plus matching rows from the right table:

-- Find all customers and their orders (including customers with no orders) SELECT c.customer_id, c.first_name, c.last_name, o.order_id, o.amount FROM customers c LEFT JOIN orders o ON c.customer_id = o.customer_id; -- Find customers who have NEVER placed an order SELECT c.customer_id, c.first_name, c.last_name FROM customers c LEFT JOIN orders o ON c.customer_id = o.customer_id WHERE o.order_id IS NULL;

LEFT JOIN Use Cases

  • Find customers who haven't ordered (NULL check)
  • Find products that haven't sold
  • Find employees without assigned tasks
  • Any "find records without matches" scenario

RIGHT JOIN (RIGHT OUTER JOIN)

Returns ALL rows from the right table, plus matching rows from the left:

-- All orders with customer info (including orphaned orders) SELECT c.first_name, c.last_name, o.order_id, o.amount FROM customers c RIGHT JOIN orders o ON c.customer_id = o.customer_id; -- Note: RIGHT JOIN can always be rewritten as LEFT JOIN -- by swapping the table order

Joining Multiple Tables

You can join more than two tables:

-- Orders with customer names and product details SELECT o.order_id, o.order_date, c.first_name, c.last_name, p.product_name, oi.quantity, oi.unit_price FROM orders o INNER JOIN customers c ON o.customer_id = c.customer_id INNER JOIN order_items oi ON o.order_id = oi.order_id INNER JOIN products p ON oi.product_id = p.product_id;

JOIN with Aggregations

Combine JOINs with GROUP BY for powerful analysis:

-- Total spending by customer SELECT c.customer_id, c.first_name, c.last_name, COUNT(o.order_id) AS total_orders, SUM(o.amount) AS total_spent FROM customers c LEFT JOIN orders o ON c.customer_id = o.customer_id GROUP BY c.customer_id, c.first_name, c.last_name ORDER BY total_spent DESC; -- Sales by product category SELECT cat.category_name, COUNT(DISTINCT o.order_id) AS orders, SUM(oi.quantity) AS units_sold, SUM(oi.quantity * oi.unit_price) AS revenue FROM categories cat INNER JOIN products p ON cat.category_id = p.category_id INNER JOIN order_items oi ON p.product_id = oi.product_id INNER JOIN orders o ON oi.order_id = o.order_id GROUP BY cat.category_name ORDER BY revenue DESC;

Try It Yourself - Day 6 Exercises

  1. Join orders with customers to show customer names with each order
  2. Find all customers who have never placed an order
  3. Join orders, order_items, and products to see product names in orders
  4. Calculate total revenue by customer using JOINs and GROUP BY
  5. Find the top 5 selling products by total quantity

Day 6 Key Takeaways

  • JOINs combine data from multiple related tables
  • INNER JOIN returns only matching rows from both tables
  • LEFT JOIN returns all from left table + matches from right
  • Use table aliases (a, b, o, c) for cleaner, readable queries
  • Combine JOINs with GROUP BY for powerful multi-table analysis
Day 7

Week 2 Project & Assessment

E-commerce Database Analysis

Congratulations on reaching the final day of Week 2! Today, you'll apply everything you've learned to analyze a complete e-commerce database using SQL queries.

Project: E-commerce Database Analysis

Project Brief

You've been hired as a data analyst for an online retail company. The management team needs insights from the database to make strategic decisions. Write SQL queries to answer their business questions.

Download Practice Database

Use this SQL script to set up your practice database:

Download E-commerce Database (SQL)

5 tables | 1000+ records | Complete e-commerce schema

Database Schema

The database contains these tables:

  • customers (customer_id, first_name, last_name, email, phone, city, state, join_date)
  • products (product_id, product_name, category_id, price, stock_quantity)
  • categories (category_id, category_name)
  • orders (order_id, customer_id, order_date, status, total_amount)
  • order_items (item_id, order_id, product_id, quantity, unit_price)

Project Requirements

Part 1: Basic Queries (20 points)

  1. List all unique cities where customers are located
  2. Find the 10 most expensive products
  3. Count total number of orders by status
  4. Find products with stock quantity less than 10

Part 2: Filtering & Aggregation (30 points)

  1. Calculate total revenue for each month in 2024
  2. Find the top 5 customers by total spending
  3. Calculate average order value by customer city
  4. List categories with more than Rs. 1,00,000 in total sales
  5. Find products that have never been ordered

Part 3: JOINs & Complex Queries (50 points)

  1. Create a sales report showing: order_id, customer_name, product_name, quantity, total_price
  2. Find the best-selling product in each category
  3. Calculate customer lifetime value (total spending per customer)
  4. Create a monthly revenue trend report with month-over-month growth
  5. Identify customers who made purchases in every month of 2024

Assessment Criteria

Component Weight What's Evaluated
Query Correctness 50% Queries return accurate results
Query Efficiency 20% Using appropriate JOINs, avoiding redundancy
Code Quality 15% Formatting, comments, meaningful aliases
Quiz: SQL Concepts 15% Understanding of SQL fundamentals

Tips for Success

  • Start simple: Build complex queries step by step
  • Test incrementally: Run parts of your query to verify each step
  • Use aliases: Make your queries readable with meaningful table and column aliases
  • Comment your code: Explain what each query does
  • Check for NULLs: Consider edge cases with missing data

Week 2 Complete!

You've completed an intensive week covering:

  • Database concepts and relational data models
  • SELECT queries with filtering using WHERE
  • Sorting with ORDER BY and aggregations with GROUP BY
  • SQL functions for strings, numbers, dates, and conditional logic
  • JOINs to combine data from multiple tables

You can now query any relational database!

Next week, you'll learn Advanced SQL techniques including subqueries, CTEs, and window functions!

Quiz

Week 2 Self-Assessment Quiz

Test Your SQL Knowledge

Answer these questions to check your understanding of Week 2 SQL concepts.

Question 1: SELECT Basics

What does SELECT DISTINCT city FROM customers; return?

  1. All cities including duplicates
  2. Only unique city values (no duplicates)
  3. The first city in the table
  4. An error - DISTINCT is not valid
Show Answer

B. Only unique city values - DISTINCT removes duplicate values from the result set.

Question 2: WHERE Clause

Which operator finds values within a range (inclusive)?

  1. IN
  2. LIKE
  3. BETWEEN
  4. WITHIN
Show Answer

C. BETWEEN - BETWEEN x AND y returns values from x to y inclusive.

Question 3: NULL Values

How do you find rows where the phone column is empty (NULL)?

  1. WHERE phone = NULL
  2. WHERE phone IS NULL
  3. WHERE phone == NULL
  4. WHERE phone = ''
Show Answer

B. WHERE phone IS NULL - NULL cannot be compared using =; you must use IS NULL or IS NOT NULL.

Question 4: Aggregation

What's the difference between WHERE and HAVING?

  1. They are the same thing
  2. WHERE filters groups, HAVING filters rows
  3. WHERE filters rows before grouping, HAVING filters groups after
  4. WHERE is for numbers, HAVING is for text
Show Answer

C. WHERE filters rows before grouping, HAVING filters groups after - Use WHERE to filter individual rows, HAVING to filter aggregated results.

Question 5: GROUP BY

What does this query return? SELECT city, COUNT(*) FROM customers GROUP BY city;

  1. All customers with their cities
  2. The number of unique cities
  3. Each city with the count of customers in that city
  4. An error
Show Answer

C. Each city with the count of customers in that city - GROUP BY groups rows by city, and COUNT(*) counts rows in each group.

Question 6: JOINs

Which JOIN returns all rows from the left table, even if there's no match in the right table?

  1. INNER JOIN
  2. LEFT JOIN
  3. RIGHT JOIN
  4. CROSS JOIN
Show Answer

B. LEFT JOIN - LEFT JOIN returns all rows from the left table plus matching rows from the right. Non-matching rows show NULL for right table columns.

Question 7: CASE WHEN

What does CASE WHEN do in SQL?

  1. Changes the case of text (uppercase/lowercase)
  2. Creates conditional logic to return different values
  3. Sorts data by case sensitivity
  4. Creates a new table
Show Answer

B. Creates conditional logic to return different values - Similar to IF/ELSE, CASE WHEN returns different values based on conditions.

Question 8: Query Order

What is the correct order of SQL clauses?

  1. SELECT, FROM, WHERE, ORDER BY, GROUP BY
  2. FROM, SELECT, WHERE, GROUP BY, ORDER BY
  3. SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY
  4. SELECT, WHERE, FROM, GROUP BY, ORDER BY
Show Answer

C. SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY - This is the standard clause order in SQL queries.

Quiz Complete!

How did you do?

  • 8/8: Excellent! You've mastered SQL basics
  • 6-7/8: Good understanding - review the missed topics
  • 4-5/8: Fair - revisit Days 3-6 material
  • Below 4: Consider re-reading the week's content

About the Instructor

Pawan Rama Mali

Data Analytics professional and educator with extensive experience in SQL, database management, and teaching complex analytical concepts to beginners. Passionate about making data skills accessible to everyone and helping students build practical, job-ready expertise.

Specializations:

  • SQL & Database Design
  • Data Analytics & Business Intelligence
  • R Programming & R Shiny Dashboards
  • Python for Data Analysis

Visit Instructor's Portfolio