Understanding the Basics SQL

After you set up your database design, it's time to learn queries. Queries are how you retrieve data from the server. Your applications run queries to insert new records, change information, delete records, and retrieve data to display to a user. The MySQL database uses SQL. SQL is slightly different between platforms, but once you understand how to write SQL for MySQL, you'll understand other database servers as well.

The SELECT Query: Retrieving Data from Your Tables

The MySQL SELECT statement is probably the most commonly used statement. You retrieve data much more than you change it, so SELECT is on the top of the list when you want to program SQL for your database.

Let's first take the very basic format of a SELECT statement.

SELECT expression;

SELECT is required, and "expression" is the value you want to return. The expression component can be a number of variables. It can contain aggregate functions, mathematical expressions, and table columns. If you specify a column, you need to also specify a table.

MySQL also requires a semicolon at the end of statements. The semicolon indicates that the end of the statement has been reached. Some MySQL statements are long and span multiple lines. You can use multiple lines, and MySQL will consider it a part of one statement until it finds a semicolon.

For instance, you can add two values.

SELECT 2 + 2;

When you run this query, it returns 4. Notice that no table is specified in the expression. Once you add a column name, you need to specify a table. Take, for instance, the following statement.

SELECT first_name;

Because first_name is a column, the MySQL statement will fail. There is nothing to calculate or retrieve.

Let's add a table to our MySQL statement to make it complete.

SELECT first_name from Customer;

It's important to note that first_name must be a valid column in the Customer table. If you attempt to query an invalid column in the table, MySQL will give you an error.

Let's say we have three customer records in the Customer table. The output would look like this:

first_name

------------------------------

Jake

John

Jane

In the above example, the SELECT statement retrieves all records in the Customer table. Because we only specified one column, only one column is shown. You can view all columns using the asterisk.

Take a look at the following query.

SELECT * from Customer;

The result would be the following:

first_name, last_name, address, birthdate

--------------------------------------------------------------

Jake Smith 999 Somewhere 11/11/1995

John Smith 888 Somewhere 12/22/1995

Jane Johnson 777 Somewhere 8/24/1975

Although using an asterisk is valid SQL syntax, it's generally frowned upon in the database maintenance world. In most cases, if you attempt to pass code reviews with asterisks in your SELECT statements, a database administrator will ask you to specify column names. Standard SQL practice is to always specify columns rather than using an asterisk, because it causes degradation in the database server's performance. It's still useful in one-off queries, and small table queries won't harm most databases. However, you shouldn't use asterisks in your SELECT queries for permanent statements such as stored procedures.

Normally, you query more than one table when you create SELECT statements. SELECT statements use an operator called a JOIN. The JOIN statement links two tables. The first table has the primary key, and the second table contains the foreign key.

Let's look at an example.

SELECT first_name, order_id FROM Customer

INNER JOIN Order ON Customer.customer_id = Order.customer_id;

Let's dissect this statement one-by-one. The first SQL line uses the SELECT statement and defines two columns to return: first_name and order_id. The first table included is the Customer table. However, we also want to see a list of orders for each customer. This is where the INNER JOIN statement is used.

INNER JOIN tells the MySQL engine to join the Order table to the Customer table where the order record has a matching customer_id. Notice that customer_id is located in the Customer table and the Order table. The customer_id in the Customer table is the primary key. The customer_id in the Order table is the foreign key.

With MySQL, you can set up constraints, which means that any Order record must contain a customer_id. Since no order can be made without a customer, this constraint would stop any Order record from being orphaned.

With the above INNER JOIN statement, any customer without an Order will not show in the results. Let's say that Jane is the only customer with an order. Jane has 2 orders. The result is the following:

first_name, order_id

---------------------------------------

Jane 1

Jane 2

In the above result, Jane has two orders with an ID of 1 and 2. The first name shows up twice, because there are two orders in the Order table. Notice that John and Jake do not display at all, because there is no joined Order record.

What if you want to see all customers even if they have no order? You can overcome the INNER JOIN limitation using the OUTER JOIN statement. The OUTER JOIN statement returns all records from the Customer table, and then displays a corresponding Order record. If the customer has no order, the data in the Order part of the query displays NULL.

Let's use the same SELECT query with an OUTER JOIN instead of an INNER JOIN.

SELECT first_name, order_id FROM Customer

LEFT [OUTER] JOIN Order ON Customer.customer_id = Order.customer_id;

The data result is the following:

first_name, order_id

---------------------------------------

Jake NULL

John NULL

Jane 1

Jane 2

Notice that since Jake and John have no orders, the order_id for their customer accounts is NULL.

One common mistake with new database programmers is thinking that NULL means nothing. NULL is a value that can be assigned to database fields. NULL means "a missing unknown value." Database administrators use NULL values when they don't know what to store in a specific field. For instance, suppose you know the customer's first name but not the last name. You can store NULL in the last name field until the customer updates the last_name column.

In our SELECT statement, we specified the table name in the JOIN syntax. Since you can use several JOINs in a SELECT statement, typing table names with every statement can be tedious. MySQL offers a way to alias tables to create shortcuts for table names.

Let's look at the same SELECT statement for customers and replace full table names with aliases.

SELECT first_name, order_id FROM Customer AS c

LEFT [OUTER] JOIN Order AS o ON c.customer_id = o.customer_id;

In the above statement, the Customer table has the c alias and the Order table has the o alias. You can then use these aliases throughout your SELECT statement to make them simpler to read and more condensed. Aliases are common in everyday MySQL coding.

Now that you know how to query your data, you need to filter and order it. Most business queries need a subset of data. You don't normally need to see every customer and every order. You need to see specific orders or customers. For instance, you might want to see a list of orders made in a particular month.

SQL uses the WHERE clause to filter records and return only a subset of data that you want to see. Let's add a WHERE clause to the SELECT statement.

SELECT first_name, order_id FROM Customer AS c

LEFT [OUTER] JOIN Order AS o ON c.customer_id = o.customer_id

WHERE o.order_date >= �8/1/2015' and o.order_date <= �8/31/2015';

If any records in the Order table have an order_date between 8/1/2015 and 8/31/2015, MySQL returns the data set. Notice that we specified the table alias before the column name in the WHERE clause. Using the alias eliminates the possibility of having ambiguous columns in the WHERE clause. For instance, suppose you have create_date in the Customer table and create_date in the Order table. When you query without specifying the table in the WHERE clause, MySQL does not know which table column to use. With the o alias specified, it knows to compare records in the Order table.

Finally, you need to order records. Ordering records help you better review the data. Using a list of orders, you might want the records ordered by the order_date column. You can do this using the ORDER BY clause. Let's add ordering to the SQL statement.

SELECT first_name, order_id FROM Customer AS c

LEFT [OUTER] JOIN Order AS o ON c.customer_id = o.customer_id

WHERE o.order_date >= �8/1/2015' and o.order_date <= �8/31/2015'

ORDER BY order_date;

Now, MySQL will list records in chronological order by order_date. You can also list records in descending order using the DESC keyword. The following code orders records in descending order.

SELECT first_name, order_id FROM Customer AS c

LEFT [OUTER] JOIN Order AS o ON c.customer_id = o.customer_id

WHERE o.order_date >= �8/1/2015' and o.order_date <= �8/31/2015'

ORDER BY order_date DESC;

Let's take the SELECT statement to one more level. We've joined two tables, but let's add another table to the data set. Suppose you also want to see the product that the customer ordered, and you have a list of products in a Product table. You store the product Id in the Order table, which is the foreign key that matches the product's primary key.

Take a look at the following SQL statement:

SELECT first_name, order_id, product_name FROM Customer AS c

LEFT [OUTER] JOIN Order AS o ON c.customer_id = o.customer_id

INNER JOIN Product As p ON o.product_id = p.product_id

WHERE o.order_date >= �8/1/2015' and o.order_date <= �8/31/2015'

ORDER BY order_date DESC;

Assuming only two orders were made in August, the result is the following data set:

first_name, order_id, product_name

---------------------------------------

Jane 1 Service1

Jane 2 Service2

In the above record set, Jane ordered in August. She ordered Service1 for her first order and Service2 for her second order. This data is the result of the OUTER and INNER JOIN combined in the SQL statement. When you create complex SQL statements, they can sometimes have several JOIN statements. Large databases often have several tables that span the entire application, which keeps the data normalized and data integrity stays intact.

Using INSERT to Add Records

When you first create a table, it has no data. You need the INSERT statement to add records to the table. Usually, records are added from the application, but there are times when the database administrator adds records manually using the INSERT SQL statement.

INSERT statements use fewer resources than SELECT statements, so you don't need to be as cautious with syntax. Simple INSERT statements take very little memory and CPU usage.

First, let's take a look at the INSERT template and how the statement is formatted.

INSERT INTO table () VALUES ();

The table expression is replaced with your table name. We'll use the Customer table for our SQL examples.

Let's add a new record to the Customer table.

INSERT INTO Customer (first_name, last_name, address) VALUES (�Philip', �Johnson', �1333 Somewhere Dr');

Notice all string values are contained within ticks. These tick marks indicate to MySQL that it should store data as a string.

The first set of parenthesis is the list of column names that have immediate values. In this example, we're adding values to the first_name, last_name, and address columns. The values in the VALUES statement must be in the same order as the column names. When you have several columns, it can get confusing when you need to ensure that values are ordered with column names. If you accidentally miss the order, your data will be wrong. If you attempt to store the wrong data type in a specific column, MySQL returns an error.

You can use NULL in place of values provided your table column is set up to allow NULL values. Suppose we don't know the customer's last name, but we know the first name and address. We can use NULL as a replacement value.

Take a look at this example.

INSERT INTO Customer (first_name, last_name, address) VALUES (�Philip', NULL, �1333 Somewhere Dr');

Notice that although the last_name column should contain a string, the NULL value is not enclosed with tick marks. If you add tick marks for the NULL value, the database will take the value as a literal and place NULL as a string in the last_name field.

In MySQL, you can set a default value for table values. Instead of manually using NULL for values you don't know, you can just eliminate the column in the INSERT statement. When you work with this technique, the MySQL database engine automatically inserts the default value. This could be NULL or some other value you automatically set for the column. For instance, you could have a "paid" column in an Order table. The paid column could contain the amount a customer actually paid. The default value would be set to 0 until the customer pays for the product.

Let's say you have the last_name column set to automatically insert a NULL value. If you don't know the value when you create the INSERT statement, you can now eliminate it from the list of columns and let MySQL insert NULL as a default. The INSERT query would then change to the following example:

INSERT INTO Customer (first_name, address) VALUES (�Philip', �1333 Somewhere Dr');

The INSERT statement lets you add multiple records. Suppose you want to add two records in one INSERT statement, you would use the following syntax:

INSERT INTO Customer (first_name, address)

VALUES (�Philip', �1333 Somewhere Dr'), (�Jacob', �1222 Somewhere Dr');

The above example adds Philip and Jacob to your Customer table.

DELETE Records from Your Table

While most database administrators prefer that you don't delete records, you still must know the DELETE SQL statement to remove records from a table.

The biggest mistake new database administrators make is using the DELETE statement with no WHERE clause. If you use the DELETE statement with no WHERE clause, you inadvertently delete all records from your table and you're left with restoring records from backups.

You also can't delete records associated with foreign key constraints. Using the Customer and Order example, what if a customer record was deleted and the Order record remains. You now have an order record with no customer, and your data is corrupted. Relational databases are set up to link records and avoid orphaned records. To delete a customer record, you would first need to delete all orders associated with the customer. For this reason, we'll use DELETE examples in the Order table.

Let's first look at a record set from a SELECT statement.

SELECT first_name, order_id FROM Customer

INNER JOIN Order ON Customer.customer_id = Order.customer_id;

This query returns the following data:

first_name, order_id

---------------------------------------

Jane 1

Jane 2

You notice that the second order is a mistake, and you want to delete it. You can remove the record with the DELETE statement.

The following is an example of a DELETE statement that deletes 1 Order record.

DELETE FROM Order WHERE order_id = 2;

The statement above only deletes the order with the order_id of 2. Suppose you accidentally ran the following DELETE statement:

DELETE FROM Order;

This statement deletes all records in the Order table. This mistake can be devastating to a running business database.

In some cases, you might want to delete all records from a table. The DELETE statement takes a heavy toll on database resources, especially if there are several thousands of records. If you decide to remove all data from a table, you want to use the TRUNCATE statement. TRUNCATE deletes every record without damaging performance.

The following statement is an example of the TRUNCATE statement:

TRUNCATE Order;

This statement removes all orders from the Order table. In most cases, you'll want to delete the entire table object when you remove records. The DROP SQL statement deletes the table object from the database.

DROP Order;

As you can see, using the DELETE statement can have dangerous consequences. It's unlikely that you want to delete order records, but knowing how to remove duplicates from the system or delete malformed records is still useful for administrators.

When using a DELETE statement, if you make a mistake you can restore records from backup logs. This is still damaging to a business as the application usually produces errors until backups are restored. For this reason, most administrators prefer to add an "active" column field in each table. The active field is set to 1 or 0. If the record has an active value of 1, then it displays in the application. If the value is 0, then the record is considered deleted and hidden from view. Setting records as active and inactive eliminates many of the mistakes using the DELETE statement, and it keeps data in case you ever need to review older content.

Edit Database Records Using the UPDATE Statement

The final important, common statement in the SQL language is the UPDATE command. This command edits values in your database. This statement should also be used with caution, because you can accidentally update all of your records with the wrong data in the same way you can accidentally delete all of your records.

Just like the DELETE statement, you need the WHERE clause to filter only the records you mean to change.

Let's first take a look at the list of customers in the Customer table.

SELECT customer_id, first_name, last_name, address, birthdate from Customer;

The result would be the following:

customer_id, first_name, last_name, address, birthdate

--------------------------------------------------------------

1 Jake Smith NULL 11/11/1995

2 John Smith 888 Somewhere 12/22/1995

3 Jane Johnson 777 Somewhere 8/24/1975

You notice that the address for Jake is NULL. At the time the record was created, you didn't know the address, so the database stored NULL in its place. You now know the customer's address, so you want to update the record with the new value. You change record data using the UPDATE SQL statement.

Typically, the ID field is the unique field that identifies the customer. We can use this unique identification number to update data for only Jake and no other customers.

Let's update Jake with a new address value.

UPDATE Customer SET address = �345 Somewhere St'

WHERE customer_id = 1;

Now run your SELECT statement again. The result is that Jake has an updated address in his record.

Suppose you accidentally forgot to use a WHERE clause, and you run an UPDATE statement that looks like the following:

UPDATE Customer SET address = �345 Somewhere St';

The above SQL statement is valid, and it will run on your MySQL server. The problem is that MySQL updates every record in your Customer table with the new �345 Somewhere St" address, so every customer has the same address value. As you can see, this can be devastating for businesses with several thousand customers. You'll have to restore data from your transaction logs and backups, which can take some time.

Another common update technique is to update values from one table to another. Suppose we had a list of customers with birthdates and IDs in a second table named TempData. You want to import birthdates into the Customer table from this TempData table.

First, let's run a SELECT statement on the TempData table to view its data.

SELECT customer_id, birthdate FROM TempData;

The result is the following data set:

customer_id, birthdate

-------------------------------------

1 8/24/1975

2 11/11/1995

3 11/12/1995

You can use the customer_id field in both tables to update your main Customer table. Take a look at the following query.

UPDATE Customer, TempData SET Customer.birthdate = TempData.birthdate

WHERE Customer.customer_id = TempData.customer_id

The above statement sets the customer's birthdate in the Customer table to the value stored in the TempData table. Although you don't see an INNER JOIN statement in this SQL query, the MySQL database performs an INNER JOIN first based on matching customer_id records. Once the records are matched, the MySQL engine then edits the data in the birthdate column. This technique is common for database administrators responsible for importing data from external sources. They set up a temporary table, and then use a SQL statement to update records.

One column type that you should not update is a primary key or an auto-incrementing column. Changing a primary key can be devastating to data integrity. Think of our Customer and Order records. The Order table has a foreign key with the customer's ID. This key is used to link the right order to the right customer. What happens if you were to change a customer ID? Just like deleting a customer with a relational constraint on Order, the order records for that particular customer would be orphaned.

Wrapping It Up

This article explained the basics of all four main SQL statements: UPDATE, INSERT, SELECT, and DELETE. As a database administrator or programmer, these four statements are what you'll work with in daily tasks. The SELECT statement is the most common since you'll be asked to create reports and interfaces based on the data in several tables.

Remember to always use the UPDATE and DELETE statements with a WHERE clause to avoid destroying your data. Use these statements with caution, and perform them on test databases before you commit to them.

These four statements are the foundation for MySQL as well as other database platforms. You'll need to know these basics to understand future complex queries.