Querying Data using MySQL Select Statement

SELECT Statement

In order to retrieve data from MySQL database server you use SQL SELECT statement. Here is the most typical form of it.

SELECT column_name1,column_name2…
FROM tables
[WHERE conditions] [GROUP BY group
[HAVING group_conditions]] [ORDER BY sort_columns] [LIMIT limits];

The SQL SELECT statement has many optional parts which you can use. FROM, WHERE, GROUP BY, HAVING, ORDER BY and LIMIT has to be in the order above.

To select all columns in a table you can use (*) notation instead of listing all column name. For example, if you need to query all the columns in departments table, just execute this query:

SELECT * FROM departments;

The SQL SELECT statement also allows you to view some partial data of a table by listing columns’ name after the SELECT keyword. For example if you need to view only first_name and last_name of all students in the student table, you can use the following query:

SELECT first_name, last_name FROM student;

WHERE Clause

The ‘WHERE’ clause of the ‘SELECT’ statement is used to retrieve particular rows from the table by specifying a match condition or search criteria. You use WHERE clause to filter the records you want to see. For example, we can find the students from department id 100 by using the following query:

SELECT * FROM student WHERE dept_id=100;

Various relational operators can be used for comparison like <, <=, >, >=, !=, <>. Similarly, multiple conditions can be combined in WHERE clause using operators like &&, ||, !, AND, OR, NOT, and XOR.

The complete list can be found here – http://dev.mysql.com/doc/refman/5.5/en/expressions.html

GROUP BY Clause

The MySQL GROUP BY clause is used with SQL SELECT statement to group selected records into a set of summary records by one or more column’s value or expression.

The MySQL GROUP BY clause is an optional part of the SQL SELECT statement. The MySQL GROUP BY clause must appear after the WHERE clause or FROM clause if WHERE clause is omitted of the SQL SELECT statement.

The MySQL GROUP BY clause consists of GROUP BY keyword followed by a list of expressions separated by commas. The following illustrates the MySQL GROUP BY clause:

SELECT col1_,col_2,… col_n, aggregate_function(expression)
FROM table
WHERE where_conditions
GROUP BY col_1, col_2, … col_n

Aggregate functions are used with MySQL GROUP BY clause to perform calculation on each group of records on return a single value for each row.

Let’s say if we want to know how many employees are working in each department. In our example, the table ‘dept_emp’ contains all the information about which employee is working in which department. We can find out the result by grouping the departments and counting the no. of employees in each group (department) by using the following query:

SELECT dept_no, COUNT(emp_no) FROM dept_emp GROUP BY dept_no;

We can also combine WHERE clauses as well. For eg. If we want the no. of employees working in a particular department, we can use WHERE clause as follows.

SELECT dept_no, COUNT(emp_no) FROM dept_emp WHERE dept_no='d002' GROUP BY dept_no;

The WHERE clause should be before the GROUP BY.

HAVING Clause

The MySQL HAVING clause is an optional part of and used only with the SQL SELECT statement. The MySQL HAVING clause specifies a filter condition for a group of record or an aggregate. The MySQL HAVING is often used with MySQL GROUP BY clause.

When using with MYSQL GROUP BY clause, you can apply filter condition of the HAVING clause only to the columns appear in the GROUP BY clause. If the MySQL GROUP BY clause is omitted, the MySQL HAVING clause will behave like a WHERE clause.

In the above employee-department example let us say, we want to find only the departments where there are more than 50,000 employees. This cannot be achieved using the WHERE clause because we want to apply the condition ( > 50,000) on the group and not on individual record. Here is where HAVING is useful. The query for the same is as follows.

SELECT dept_no, COUNT(emp_no) FROM dept_emp GROUP BY dept_no HAVING COUNT(emp_no) > 50000;

or use aliases for aggregate function column

SELECT dept_no, COUNT(emp_no) as emp_count FROM dept_emp GROUP BY dept_no HAVING emp_count > 50000;

Note that the MySQL HAVING clause applies to groups as a whole while the WHERE clause applies to individual rows.

Sorting results with ORDER BY

The ORDER BY clause allows you to sort the result set on one or more columns in ascending or descending order. To sort the result set in ascending order you use ASC and in descending order you use DESC keywords. By default, the ORDER BY will sort the result set in ascending order.

For example, if we want to find only the departments where there are more than 50,000 employees with employee count in descending order, you can execute the following query:

SELECT dept_no, COUNT(emp_no) as emp_count FROM dept_emp GROUP BY dept_no HAVING emp_count > 50000 ORDER BY emp_count DESC;

LIMIT Clause

Most of the time when you work with master data tables which contain thousands to millions of records you don’t want to write a query to get all the data from those tables because of application’s performance and high traffic between database server and application server. MySQL supports a cool feature called LIMIT to allow you to constrain the returned records with SELECT statement.

Let’s say you have a database table with thousands of records and you want to get just first N records, you can use the following query:

SELECT * FROM table_name LIMIT N

The MySQL LIMIT also allows you to get a range of records where you decide starting record number and how many records you want to retrieve. The syntax of MySQL LIMIT to select a range of records:

SELECT coln1, coln2,… FROM table_name LIMIT START, N

where, START – starting record/row index, N – number of records/rows to retrieve

Example:
In our ‘employees’ table there are close to 300 thousand records. Let’s say we want only the first 10 records. The query is as follows.

SELECT * FROM employees LIMIT 10;

As another example let us say we would like to retrieve 5 rows starting from 4th row. The query is as follows.

SELECT * FROM employees LIMIT 3, 5;

We have given the starting index as 3 even though we want retrieve from 4th record because start index starts from 0. So, starting index 0 is 1st record, 1 is 2nd record and so on.

DISTINCT

With DISTINCT keyword, you can eliminate the duplicated result from the SELECT statement.

SELECT DISTINCT column_name FROM table_name;

Example:
In our ‘titles’ table of ‘employees’ database, there are more than 400 thousand records containing information about the various titles of all the employees. If we would like to know the titles/roles available in the company, we can use the DISTINCT keyword on ‘title’ column to avoid duplicate titles.