Sci-fi Hall of Famer Ursula LeGuin (Also, in my imagination, a practitioner of SQL) once said:
“There are no right answers to wrong questions.”
In working one-on-one with many who are new to SQL, I see a lot of confusion on how to get started. In this post, I’ll give you a repeatable set of steps you can use to start writing basic SELECT queries.
Let’s start with some word-nerd stuff:
A query is simply… a question!
We’re asking the database to tell us something:
—“How many customers made orders last month?”
—“Who’s the top-selling salesperson?”
—“What product SKUs are available in the Beverage category for our Eastern US region?”
Ready? Here’s the 4 steps to writing effective SQL Select queries:
1 – State the question you’re answering in plain (non-technical) language.
If you don’t know what you’re trying to get, you’ll never get it.
2 – Build a list of the tables you’ll need to “answer the question”.
—Identify the table(s) that contain the data you need. Not just for the SELECT list, but for other common clauses. You can use the question you wrote in Step 1 to being to understand what these other common clauses might be.
(*) Add any tables that contain even one field of data you need to your list here.
—Is there anything in your question about sorting? If so, you’ll need an ORDER BY clause and a field or fields to sort on, in either ascending order (A-to-Z alphabetically, smaller numbers up to larger numbers, or earlier dates to later dates, for example) or descending order (The opposite of that).
(*) Add a table to your list here if you need to sort using those values (It doesn’t have to be in the eventual SELECT list)
—Does the question ask about certain specific “pieces” of data out of a larger “pie” of data? For example, do we want to see information about all our customers, or just ones who’ve bought over a certain amount of goods? Or ones whose companies are in certain sales territories? If so, we will be using a WHERE clause with a yes/no expression to filter in the rows we want.
(*) Add any tables you’ll need to build those expressions.
Notice that not all the fields we’re identifying will appear in the eventual SELECT list. For example, if the question is “Display the product name and price of each product that’s the most expensive in its category,” our SELECT list will only contain 2 columns, product name and price, but will have aggregates that work to figure out which product is indeed the most expensive by category.
3 – Look at all the tables you’ve identified and find out how they’re related.
With simple SELECT queries, we are most often joining data that exists in multiple relational database tables.
If the tables are directly related, we will be joining them on the Primary Key / Foreign Key columns. If not, we are often incorporating other tables into the query simply for “plumbing” purposes”: By joining these tables that we don’t “need”, we’re able to use tables that aren’t directly related to each other in one query.
4 – Build the query!
—If you’re using a query editor that highlights syntax errors, all the better for this step. (MySQL Workbench, SQL Server Management Studio, SQL Workshop on (Oracle) APEX could all work, depending on your specific SQL flavor. We want to make sure we’re on track as we build our query.
For a complete walkthrough on how to build a query like this from a business request, check this post.
In future posts, we’ll talk about more advanced versions of this, where we add aggregate functions and grouping. Until then….