Skip to content

DBAzine.com

Sections
Personal tools
You are here: Home » Oracle » Oracle Articles Archive » Retrieval: Multiple Tables and Aggregation - Part 4
Seeking new owner for this high-traffic DBAzine.com site.
Tap into the potential of this DBA community to expand your business! Interested? Contact us today.
Who Are You?
I am a:
Mainframe True Believer
Distributed Fast-tracker

[ Results | Polls ]
Votes : 4610
 

Retrieval: Multiple Tables and Aggregation - Part 4

by Lex de Haan
From the book, Mastering Oracle SQL and SQL *Plus, Chapter 8, Apress, January 2005.

Part 1  |  Part 2  |  Part 3  |  Part 4  |  Part 5

8.7 The HAVING Clause

If you aggregate rows into groups with GROUP BY, you might also want to filter your query result further by allowing only certain groups into the final query result. You can achieve this with the HAVING clause. Normally, you use the HAVING clause only following a GROUP BY clause. For example, Listing 8-26 shows information about departments with more than four employees.

SQL> select deptno, count(empno)
2 from employees
3 group by deptno
4 having count(*) >= 4;

DEPTNO COUNT(EMPNO)
-------- ------------
20 5
30 6

SQL>

Listing 8-26: HAVING clause example.

However, the SQL language allows you to write queries with a HAVING clause without a preceding GROUP BY clause. In that case, Oracle assumes an implicit GROUP BY on a constant expression, just as when you use group functions in the SELECT clause without specifying a GROUP BY clause; that is, the full table is treated as a single group.

The Difference Between WHERE and HAVING

It is important to distinguish the WHERE clause from the HAVING clause. To illustrate this difference, Listing 8-27 shows a WHERE clause added to the previous query.

SQL> select   deptno, count(empno)
2 from employees
3 where bdate > date '1960-01-01'
4 group by deptno
5 having count(*) >= 4;

DEPTNO COUNT(EMPNO)
-------- ------------
30 5

SQL>

Listing 8-27: HAVING vs.WHERE.

The WHERE condition regarding the day of birth (line 3) can be checked against individual rows of the EMPLOYEES table. On the other hand, the COUNT(*) condition (line 5) makes sense only at the group level. That’s why group functions should never occur in a WHERE clause. They typically result in the following Oracle error message:

ORA-00934: group function is not allowed here.

You’ll see this error message in Listing 8-29, caused by a classic SQL mistake, as discussed shortly.

HAVING Clauses Without Group Functions

On the other hand, valid HAVING clauses without group functions are very rare, and they should be rewritten. In Listing 8-28, the second query is much more efficient than the first one.

SQL> select deptno, count(*)
2 from employees
3 group by deptno
4 having deptno <= 20;

DEPTNO COUNT(*)
-------- --------
10 3
20 5

SQL> select deptno, count(*)
2 from employees
3 where deptno <= 20
4 group by deptno;

DEPTNO COUNT(*)
-------- --------
10 3
20 5

SQL>

Listing 8-28: HAVING cause without a group function.

A Classic SQL Mistake

Take a look at the query in Listing 8-29. It looks very logical, doesn’t it? Who earns more than the average salary?

SQL> select empno
2 from employees
3 where msal > avg(msal);
where msal > avg(msal)
*
ERROR at line 3:
ORA-00934: group function is not allowed here

SQL>

Listing 8-29: Error message: Group function is not allowed here

However, if you think in terms of tuple variables, the problem becomes obvious: the WHERE clause has only a single row as its context, turning the AVG function into something impossible to derive.

You can solve this problem in many ways. Listings 8-30 and 8-31 show two suggestions.

SQL> select e.empno
2 from employees e
3 where e.msal > (select avg(x.msal)
4 from employees x); EMPNO -------- 7566 7698 7782 7788 7839 7902 SQL>

Listing 8-30: One way to find who earns more than the average salary.

SQL> select   e1.empno
2 from employees e1
3 , employees e2
4 group by e1.empno
5 , e1.msal
6 having e1.msal > avg(e2.msal);

MNR
--------
7566
7698
7782
7788
7839
7902

SQL>

Listing 8-31: Another way to find who earns more than the average salary.

The solution in Listing 8-31 would probably not win an SQL beauty contest, but it is certainly worth further examination. This solution is based on the Cartesian product of the EMPLOYEES table with itself. Notice that it doesn’t have a WHERE clause. Notice also that you group on e1.EMPNO and e1.MSAL, which allows you to refer to this column in the HAVING clause.

Grouping on Additional Columns

You sometimes need this (apparently) superfluous grouping on additional columns. For example, suppose you want to see the employee number and the employee name, followed by the total number of course registrations. The query in Listing 8-32, which could be a first attempt to solve this problem, produces an Oracle error message.

SQL> select   e.empno, e.ename, count(*)
2 from employees e
3 join
4 registrations r
5 on (e.empno = r.attendee)
6 group by e.empno;
select e.empno, e.ename, count(*)
*
ERROR at line 1:
ORA-00979: not a GROUP BY expression

SQL>

Listing 8-32: Error message: Not a GROUP BY expression

The pseudo-intermediate result in Listing 8-33 explains what went wrong here, and why you must also group on e.ENAME.

GROUP BY e.EMPNO               GROUP BY e.EMPNO,e.ENAME

e.EMPNO e.ENAME e.INIT ... e.EMPNO e.ENAME e.INIT ...
======= ========= ====== ======= ======== ======
7369 {'SMITH'} {'N'} 7369 'SMITH' {'N'}
7499 {'ALLEN'} {'JAM'} 7499 'ALLEN' {'JAM'}
7521 {'WARD' } ... 7521 ... ...
7566 ... ...

Listing 8-33: Pseudo-Intermediate GROUP BY result.

The two results look similar; however, there is an important difference between sets consisting of a single element, such as {'SMITH'}, and a literal value, such as 'SMITH'. In mathematics, sets with a single element are commonly referred to as singleton sets, or just singletons.

Listing 8-34 shows another instructive mistake.

SQL> select deptno
2 , sum(msal)
3 from employees;
select deptno
*
ERROR at line 1:
ORA-00937: not a single-group group function

SQL>

Listing 8-34: Error message: Not a single-group group function.

In the absence of a GROUP BY clause, the SUM function would return a single row, while DEPTNO would produce 14 department numbers. Two columns with different row counts cannot be presented side-by-side in a single result. After the correction in Listing 8-35, the error message disappears, and you get the desired results.

SQL> select   deptno
2 , sum(msal)
3 from employees
4 group by deptno;

DEPTNO SUM(MSAL)
-------- -------------
10 8750
20 10875
30 9250

SQL>

Listing 8-35: Correction of the error message in Listing 8-34.

In summary, if your query contains a GROUP BY clause, the SELECT clause is allowed to contain only group expressions. A group expression is a column name that is part of the GROUP BY clause, or a group function applied to any other column expression. See also Table 8-3 at the end of Section 8.6.

8.8 Advanced GROUP BY Features

The previous sections showed examples of using “standard” GROUP BY clauses. You can also use some more advanced features of the GROUP BY clause. Here, we will look at GROUP BY CUBE and GROUP BY ROLLUP.

Let’s start with a regular GROUP BY example, shown in Listing 8-36.

SQL> select   deptno, job
2 , count(empno) headcount
3 from employees
4 group by deptno, job;

DEPTNO JOB HEADCOUNT
-------- ---------- ---------
10 MANAGER 1
10 DIRECTOR 1
10 ADMIN 1
20 MANAGER 1
20 TRAINER 4
30 MANAGER 1
30 SALESREP 4
30 ADMIN 1

8 rows selected.

SQL>

Listing 8-36: Regular GROUP BY example.

You get an overview with the number of employees per department, and within each department per job. To keep things simple, let’s forget about department 40, the department without employees.

GROUP BY ROLLUP

Notice what happens if you change the GROUP BY clause and add the keyword ROLLUP, as shown in Listing 8-37.

SQL> select   deptno, job
2 , count(empno) headcount
3 from employees
4 group by ROLLUP(deptno, job);

DEPTNO JOB HEADCOUNT
-------- -------- ---------
10 ADMIN 1
10 MANAGER 1
10 DIRECTOR 1
>>> 10 3 <<<
20 MANAGER 1
20 TRAINER 4
>>> 20 5 <<<
30 ADMIN 1
30 MANAGER 1
30 SALESREP 4
>>> 30 6 <<<
>>> 14 <<<

12 rows selected.

SQL>

Listing 8-37: GROUP BY ROLLUP example.

The ROLLUP addition results in four additional rows, marked with >>> and <<< in Listing 8-37 for readability. Three of these four additional rows show the head count per department over all jobs, and the last row shows the total number of employees.

GROUP BY CUBE

You can also use the CUBE keyword in the GROUP BY clause. Listing 8-38 shows an example.

SQL> select   deptno, job
2 , count(empno) headcount
3 from employees
4 group by CUBE(deptno, job);

DEPTNO JOB HEADCOUNT
-------- -------- ---------
14
>>> ADMIN 2 <<<
>>> MANAGER 3 <<<
>>> TRAINER 4 <<<
>>> DIRECTOR 1 <<<
>>> SALESREP 4 <<<
10 3
10 MANAGER 1
10 DIRECTOR 1
10 ADMIN 1
20 5
20 MANAGER 1
20 TRAINER 4
30 6
30 MANAGER 1
30 SALESREP 4
30 ADMIN 1

17 rows selected.

SQL>

Listing 8-38: GROUP BY CUBE example.

This time, you get five more rows in the query result, marked in the same way with >>> and <<<, showing the number of employees per job, regardless of which department employs them.

Tip: Both GROUP BY CUBE and GROUP BY ROLLUP are two special cases of the GROUP BY GROUPING SETS syntax, offering more flexibility. You can also merge the results of different grouping operations into a single GROUP BY clause by specifying them in a comma-separated list. For more details, see Oracle SQL Reference.

CUBE, ROLLUP, and Null Values

The CUBE and ROLLUP keywords generate many null values in query results, as you can see in Listings 8-37 and 8-38. You can distinguish these system-generated null values from other null values; for example, to replace them with some explanatory text. You can use the GROUPING and GROUPING_ID functions for that purpose.

The GROUPING Function

Listing 8-39 shows an example of the GROUPING function.

SQL> select   deptno
2 , case GROUPING(job)
3 when 0 then job
4 when 1 then '**total**'
5 end job
6 , count(empno) headcount
7 from employees
8 group by rollup(deptno, job);

DEPTNO JOB HEADCOUNT
-------- --------- ---------
10 ADMIN 1
10 MANAGER 1
10 DIRECTOR 1
10 **total** 3
20 MANAGER 1
20 TRAINER 4
20 **total** 5
30 ADMIN 1
30 MANAGER 1
30 SALESREP 4
30 **total** 6
**total** 14

12 rows selected.

SQL>

Listing 8-39
GROUPING Function example.

Unfortunately, the GROUPING function can return only two results: 0 or 1. That’s why the last two lines both show '**total**'.

The GROUPING_ID Function

The GROUPING_ID function is more flexible that the GROUPING function, because it can return several different results, as you can see in Listing 8-40.

SQL> select   deptno
2 , case GROUPING_ID(deptno, job)
3 when 0 then job
4 when 1 then '**dept **'
5 when 3 then '**total**'
6 end job
7 , count(empno) headcount
8 from employees
9 group by rollup(deptno, job);

DEPTNO JOB HEADCOUNT
-------- --------- ---------
10 ADMIN 1
10 MANAGER 1
10 DIRECTOR 1
10 **dept ** 3
20 MANAGER 1
20 TRAINER 4
20 **dept ** 5
30 ADMIN 1
30 MANAGER 1
30 SALESREP 4
30 **dept ** 6
**total** 14

12 rows selected.

SQL>

Listing 8-40: GROUPING_ID Function example with ROLLUP.

You may be puzzled by the value 3 being used on the fifth line in Listing 8-40. Things become clear when you convert 3 to a binary representation, which results in the binary number 11. The two ones in this number act as a flag to trap the situation in which both columns contain a null value. GROUP BY ROLLUP can produce only 1 (binary 01) and 3 (binary 11), but GROUP BY CUBE can also generate 2 (binary 10). Look at the results in Listing 8-41. Obviously, GROUPING_ID produces a 0 (zero) for all “regular” rows in the result.

SQL> select   deptno, job
2 , GROUPING_ID(deptno, job) gid
3 from employees
4 group by cube(deptno, job);

DEPTNO JOB GID
-------- -------- --------
3
ADMIN 2
MANAGER 2
TRAINER 2
DIRECTOR 2
SALESREP 2
10 1
10 ADMIN 0
10 MANAGER 0
10 DIRECTOR 0
20 1

20 MANAGER 0
20 TRAINER 0
30 1
30 ADMIN 0
30 MANAGER 0
30 SALESREP 0

17 rows selected.

SQL>

Listing 8-41: GROUPING_ID Function example with CUBE.


Contributors : Lex de Haan
Last modified 2006-01-06 11:08 AM
Transaction Management
Reduce downtime and increase repeat sales by improving end-user experience.
Free White Paper
Database Recovery
Feeling the increased demands on data protection and storage requirements?
Download Free Report!
 
 

Powered by Plone