Common SQL Mistakes and How to Avoid Them

Stop making these errors and write cleaner, more efficient queries

12 min read | Level: Intermediate | Category: SQL

Introduction

Even experienced data analysts make SQL mistakes. Some errors cause wrong results, while others can crash your database or slow down performance dramatically. In this guide, we'll walk through the 10 most common SQL mistakes and show you exactly how to avoid them.

Why This Matters

Frustration of SQL Errors

A single SQL mistake in production can lead to incorrect reports, data corruption, or hours of debugging. Learning to recognize these patterns will save you time and protect your data integrity.

1

Using SELECT * in Production

The Mistake

Using SELECT * to fetch all columns from a table might seem convenient, but it's a performance killer and maintenance nightmare.

-- BAD: Fetches all columns, even ones you don't need SELECT * FROM customers WHERE country = 'India';

Why It's Bad

  • Performance: Transfers unnecessary data over the network
  • Maintenance: If table structure changes, your code may break
  • Clarity: Other developers don't know which columns you actually need

The Fix

-- GOOD: Only fetch columns you need SELECT customer_id, customer_name, email, city FROM customers WHERE country = 'India';

Best Practice

Always explicitly list the columns you need. The only exception is during quick exploratory analysis in development.

2

Forgetting NULL Handling

The Mistake

NULL values in SQL behave differently than you might expect. Comparing anything to NULL returns NULL (unknown), not TRUE or FALSE.

-- BAD: This will NOT return rows where discount is NULL SELECT * FROM orders WHERE discount != 10; -- BAD: This will also NOT work SELECT * FROM orders WHERE discount = NULL;

Why It's Bad

You might miss important data or get incorrect aggregations. NULL means "unknown" - it's not zero, empty string, or false.

The Fix

-- GOOD: Use IS NULL or IS NOT NULL SELECT * FROM orders WHERE discount != 10 OR discount IS NULL; -- GOOD: Use COALESCE to handle NULLs SELECT order_id, COALESCE(discount, 0) AS discount FROM orders; -- GOOD: Use NULLIF to create NULLs safely SELECT total_sales / NULLIF(total_orders, 0) AS avg_order_value FROM sales_summary;

Watch Out!

Aggregation functions like SUM(), AVG(), COUNT() ignore NULL values. COUNT(*) counts all rows, but COUNT(column_name) only counts non-NULL values.

3

Not Using Aliases Properly

The Mistake

Writing queries without table aliases makes code hard to read and maintain, especially with JOINs.

-- BAD: Confusing without aliases SELECT customers.customer_name, orders.order_id, orders.order_date, order_items.quantity, products.product_name FROM customers JOIN orders ON customers.customer_id = orders.customer_id JOIN order_items ON orders.order_id = order_items.order_id JOIN products ON order_items.product_id = products.product_id;

The Fix

-- GOOD: Clear and readable with aliases SELECT c.customer_name, o.order_id, o.order_date, oi.quantity, p.product_name FROM customers c JOIN orders o ON c.customer_id = o.customer_id JOIN order_items oi ON o.order_id = oi.order_id JOIN products p ON oi.product_id = p.product_id;

Alias Tips

  • Use meaningful abbreviations (c for customers, o for orders)
  • Be consistent across your queries
  • Always alias calculated columns with AS
4

Incorrect JOIN Usage

The Mistake

Using the wrong JOIN type or not understanding how JOINs work leads to missing or duplicate data.

-- MISTAKE: Using INNER JOIN when you need all customers SELECT c.customer_name, o.order_id FROM customers c INNER JOIN orders o ON c.customer_id = o.customer_id; -- This EXCLUDES customers who never placed an order!

Understanding JOINs

JOIN Type Returns Use When
INNER JOIN Only matching rows from both tables You need records that exist in both tables
LEFT JOIN All rows from left + matching from right You need all records from the main table
RIGHT JOIN All rows from right + matching from left Rarely used; use LEFT JOIN instead
FULL OUTER JOIN All rows from both tables You need to see unmatched records from both sides

The Fix

-- CORRECT: Use LEFT JOIN to include all customers SELECT c.customer_name, COUNT(o.order_id) AS total_orders FROM customers c LEFT JOIN orders o ON c.customer_id = o.customer_id GROUP BY c.customer_id, c.customer_name; -- Now customers with 0 orders will show up!
5

Missing WHERE in UPDATE/DELETE

The Mistake

This is the most dangerous SQL mistake. Forgetting the WHERE clause in UPDATE or DELETE affects ALL rows in the table.

-- DISASTER: Updates ALL customers to inactive! UPDATE customers SET status = 'inactive'; -- CATASTROPHE: Deletes ALL orders! DELETE FROM orders;

This Can End Careers!

Running an UPDATE or DELETE without WHERE in production has caused data loss incidents at major companies. Always double-check before executing.

The Fix: Safety Checklist

-- STEP 1: First, run a SELECT to verify your WHERE clause SELECT * FROM customers WHERE last_login < '2025-01-01'; -- STEP 2: Check the count SELECT COUNT(*) FROM customers WHERE last_login < '2025-01-01'; -- STEP 3: Only then run the UPDATE/DELETE UPDATE customers SET status = 'inactive' WHERE last_login < '2025-01-01'; -- BONUS: Use transactions for safety BEGIN TRANSACTION; UPDATE customers SET status = 'inactive' WHERE last_login < '2025-01-01'; -- Verify the changes, then: COMMIT; -- or ROLLBACK if something is wrong
6

Not Using Indexes

The Mistake

Querying large tables without proper indexes makes queries run slowly - sometimes taking minutes instead of milliseconds.

-- SLOW: Full table scan on 10 million rows SELECT * FROM orders WHERE customer_email = 'john@example.com';

The Fix

-- Create an index on frequently queried columns CREATE INDEX idx_orders_email ON orders(customer_email); -- For composite queries, create composite indexes CREATE INDEX idx_orders_date_status ON orders(order_date, status); -- Check if your query uses indexes EXPLAIN SELECT * FROM orders WHERE customer_email = 'john@example.com';

When to Create Indexes

  • Columns used in WHERE clauses frequently
  • Columns used in JOIN conditions
  • Columns used in ORDER BY
  • Foreign key columns
7

String vs Number Comparison

The Mistake

Comparing strings that contain numbers doesn't work as expected because strings are sorted alphabetically.

-- BAD: If order_id is VARCHAR, this gives wrong results SELECT * FROM orders WHERE order_id > '100'; -- Returns: '11', '2', '200' (alphabetical, not numerical) -- BAD: Implicit conversion can cause issues SELECT * FROM orders WHERE order_id = 100; -- order_id is VARCHAR

The Fix

-- GOOD: Cast to proper data type SELECT * FROM orders WHERE CAST(order_id AS INT) > 100; -- BETTER: Fix the table design ALTER TABLE orders MODIFY COLUMN order_id INT;

Best Practice

Always use the correct data types when designing tables. Numbers should be INT/BIGINT, not VARCHAR.

8

GROUP BY Mistakes

The Mistake

Including non-aggregated columns in SELECT that aren't in GROUP BY, or using WHERE instead of HAVING for aggregates.

-- ERROR: customer_name not in GROUP BY or aggregate SELECT customer_id, customer_name, -- This will cause an error in strict mode! SUM(order_total) FROM orders GROUP BY customer_id; -- WRONG: WHERE can't filter aggregated results SELECT customer_id, SUM(order_total) AS total FROM orders WHERE SUM(order_total) > 1000 -- ERROR! GROUP BY customer_id;

The Fix

-- GOOD: Include all non-aggregated columns in GROUP BY SELECT customer_id, customer_name, SUM(order_total) AS total_spent FROM orders GROUP BY customer_id, customer_name; -- GOOD: Use HAVING for aggregate conditions SELECT customer_id, SUM(order_total) AS total FROM orders GROUP BY customer_id HAVING SUM(order_total) > 1000;
WHERE HAVING
Filters rows BEFORE grouping Filters groups AFTER grouping
Cannot use aggregate functions Used with aggregate functions
WHERE status = 'active' HAVING COUNT(*) > 5
9

Subquery Performance Issues

The Mistake

Using correlated subqueries that run once for each row in the outer query, causing severe performance issues.

-- SLOW: Correlated subquery runs for EVERY row SELECT o.order_id, o.order_total, (SELECT AVG(order_total) FROM orders o2 WHERE o2.customer_id = o.customer_id) AS customer_avg FROM orders o;

The Fix

-- FAST: Use JOIN with a derived table instead SELECT o.order_id, o.order_total, ca.customer_avg FROM orders o JOIN ( SELECT customer_id, AVG(order_total) AS customer_avg FROM orders GROUP BY customer_id ) ca ON o.customer_id = ca.customer_id; -- EVEN BETTER: Use window functions SELECT order_id, order_total, AVG(order_total) OVER (PARTITION BY customer_id) AS customer_avg FROM orders;
10

Not Testing on Sample Data

The Mistake

Running untested queries directly on production data without verifying the logic first.

The Fix: Safe Testing Workflow

-- STEP 1: Add LIMIT to test your query logic SELECT * FROM large_table WHERE complex_condition LIMIT 10; -- STEP 2: Use COUNT to understand scale SELECT COUNT(*) FROM large_table WHERE complex_condition; -- STEP 3: For UPDATE/DELETE, test in a transaction BEGIN TRANSACTION; UPDATE large_table SET column = value WHERE condition; SELECT * FROM large_table WHERE condition; -- Verify ROLLBACK; -- Undo if testing, COMMIT if correct

Pro Tips

  • Create a development/staging database with sample data
  • Use EXPLAIN to understand query execution plans
  • Always backup before running bulk UPDATE/DELETE
  • Have a peer review queries before production runs

Quick Reference Checklist

SQL Best Practices Checklist
  • Avoid SELECT * - list specific columns
  • Handle NULLs explicitly with IS NULL, COALESCE
  • Use meaningful table aliases
  • Choose the right JOIN type
  • ALWAYS use WHERE with UPDATE/DELETE
  • Create indexes for frequently queried columns
  • Use correct data types
  • Use HAVING (not WHERE) for aggregate filters
  • Prefer JOINs over correlated subqueries
  • Test queries before running on production

Master SQL with EDUSHARK TRAINING

Want to become proficient in SQL and avoid these common mistakes? Our Data Analytics program includes comprehensive SQL training from basics to advanced queries.

View Our Data Analytics Program