Retrieval: Multiple Tables and Aggregation - Part 4

by Lex de Haan

From the book, Mastering Oracle SQL and SQL Plus*, Chapter 8, Apress, January 2005.

Part 1 | Part 2 | Part 3 | Part 4 | Part 5

8.7 The HAVING Clause

If you aggregate rows into groups with GROUP BY, you might also want to filter your query result further by allowing only certain groups into the final query result. You can achieve this with the HAVING clause. Normally, you use the HAVING clause only following a GROUP BY clause. For example, Listing 8-26 shows information about departments with more than four employees.

SQL> select deptno, count(empno)
  2  from   employees
  3  group  by deptno
  4  having count(*) >= 4;

  DEPTNO COUNT(EMPNO)
-------- ------------
      20            5
      30            6

SQL>

Listing 8-26: HAVING clause example.

However, the SQL language allows you to write queries with a HAVING clause without a preceding GROUP BY clause. In that case, Oracle assumes an implicit GROUP BY on a constant expression, just as when you use group functions in the SELECT clause without specifying a GROUP BY clause; that is, the full table is treated as a single group.

The Difference Between WHERE and HAVING

It is important to distinguish the WHERE clause from the HAVING clause. To illustrate this difference, Listing 8-27 shows a WHERE clause added to the previous query.

SQL> select   deptno, count(empno)
  2  from     employees
  3  where    bdate > date '1960-01-01'
  4  group by deptno
  5  having   count(*) >= 4;

  DEPTNO COUNT(EMPNO)
-------- ------------
      30            5

SQL>

Listing 8-27: HAVING vs.WHERE.

The WHERE condition regarding the day of birth (line 3) can be checked against individual rows of the EMPLOYEES table. On the other hand, the COUNT(*) condition (line 5) makes sense only at the group level. That’s why group functions should never occur in a WHERE clause. They typically result in the following Oracle error message:

ORA-00934: group function is not allowed here.

You’ll see this error message in Listing 8-29, caused by a classic SQL mistake, as discussed shortly.

HAVING Clauses Without Group Functions

On the other hand, valid HAVING clauses without group functions are very rare, and they should be rewritten. In Listing 8-28, the second query is much more efficient than the first one.

SQL> select deptno, count(*)
  2  from   employees
  3  group  by deptno
  4  having deptno <= 20;

  DEPTNO COUNT(*)
-------- --------
      10        3
      20        5

SQL> select deptno, count(*)
  2  from   employees
  3  where  deptno <= 20
  4  group  by deptno;

  DEPTNO COUNT(*)
-------- --------
      10        3
      20        5

SQL>

Listing 8-28: HAVING cause without a group function.

A Classic SQL Mistake

Take a look at the query in Listing 8-29. It looks very logical, doesn’t it? Who earns more than the average salary?

SQL> select empno
  2  from   employees
  3  where  msal > avg(msal);
where msal >  avg(msal)
              *
ERROR at line 3:
ORA-00934: group function is not allowed here

SQL>

Listing 8-29: Error message: Group function is not allowed here

However, if you think in terms of tuple variables, the problem becomes obvious: the WHERE clause has only a single row as its context, turning the AVG function into something impossible to derive.

You can solve this problem in many ways. Listings 8-30 and 8-31 show two suggestions.

SQL> select e.empno
  2  from   employees e
  3  where  e.msal > (select avg(x.msal)
  4                   from   employees x);

   EMPNO
--------
    7566
    7698
    7782
    7788
    7839
    7902

SQL>

Listing 8-30: One way to find who earns more than the average salary.

SQL> select   e1.empno
  2  from     employees e1
  3  ,        employees e2
  4  group by e1.empno
  5  ,        e1.msal
  6  having   e1.msal > avg(e2.msal);

     MNR
--------
    7566
    7698
    7782
    7788
    7839
    7902

SQL>

Listing 8-31: Another way to find who earns more than the average salary.

The solution in Listing 8-31 would probably not win an SQL beauty contest, but it is certainly worth further examination. This solution is based on the Cartesian product of the EMPLOYEES table with itself. Notice that it doesn’t have a WHERE clause. Notice also that you group on e1.EMPNO and e1.MSAL, which allows you to refer to this column in the HAVING clause.

Grouping on Additional Columns

You sometimes need this (apparently) superfluous grouping on additional columns. For example, suppose you want to see the employee number and the employee name, followed by the total number of course registrations. The query in Listing 8-32, which could be a first attempt to solve this problem, produces an Oracle error message.

SQL> select   e.empno, e.ename, count(*)
  2  from     employees e
  3           join
  4           registrations r
  5           on (e.empno = r.attendee)
  6  group by e.empno;
select   e.empno, e.ename, count(*)
                  *
ERROR at line 1:
ORA-00979: not a GROUP BY expression

SQL>

Listing 8-32: Error message: Not a GROUP BY expression

The pseudo-intermediate result in Listing 8-33 explains what went wrong here, and why you must also group on e.ENAME.

GROUP BY e.EMPNO               GROUP BY e.EMPNO,e.ENAME

e.EMPNO    e.ENAME  e.INIT ... e.EMPNO   e.ENAME  e.INIT ...
=======  =========  ======     =======  ========  ======
   7369  {'SMITH'}  {'N'}         7369   'SMITH'  {'N'}
   7499  {'ALLEN'}  {'JAM'}       7499   'ALLEN'  {'JAM'}
   7521  {'WARD' }    ...         7521    ...      ...
   7566   ...                    ...

Listing 8-33: Pseudo-Intermediate GROUP BY result.

The two results look similar; however, there is an important difference between sets consisting of a single element, such as {'SMITH'}, and a literal value, such as 'SMITH'. In mathematics, sets with a single element are commonly referred to as singleton sets, or just singletons.

Listing 8-34 shows another instructive mistake.

SQL> select deptno
  2  ,      sum(msal)
  3  from   employees;
select deptno
       *
ERROR at line 1:
ORA-00937: not a single-group group function

SQL>

Listing 8-34: Error message: Not a single-group group function.

In the absence of a GROUP BY clause, the SUM function would return a single row, while DEPTNO would produce 14 department numbers. Two columns with different row counts cannot be presented side-by-side in a single result. After the correction in Listing 8-35, the error message disappears, and you get the desired results.

SQL> select   deptno
  2  ,        sum(msal)
  3  from     employees
  4  group by deptno;

  DEPTNO     SUM(MSAL)
-------- -------------
      10          8750
      20         10875
      30          9250

SQL>

Listing 8-35: Correction of the error message in Listing 8-34.

In summary, if your query contains a GROUP BY clause, the SELECT clause is allowed to contain only group expressions. A group expression is a column name that is part of the GROUP BY clause, or a group function applied to any other column expression. See also Table 8-3 at the end of Section 8.6.

8.8 Advanced GROUP BY Features

The previous sections showed examples of using “standard” GROUP BY clauses. You can also use some more advanced features of the GROUP BY clause. Here, we will look at GROUP BY CUBE and GROUP BY ROLLUP.

Let’s start with a regular GROUP BY example, shown in Listing 8-36.

SQL> select   deptno, job
  2  ,        count(empno) headcount
  3  from     employees
  4  group by deptno, job;

  DEPTNO JOB        HEADCOUNT
-------- ---------- ---------
      10 MANAGER            1
      10 DIRECTOR           1
      10 ADMIN              1
      20 MANAGER            1
      20 TRAINER            4
      30 MANAGER            1
      30 SALESREP           4
      30 ADMIN              1

8 rows selected.

SQL>

Listing 8-36: Regular GROUP BY example.

You get an overview with the number of employees per department, and within each department per job. To keep things simple, let’s forget about department 40, the department without employees.

GROUP BY ROLLUP

Notice what happens if you change the GROUP BY clause and add the keyword ROLLUP, as shown in Listing 8-37.

SQL> select   deptno, job
  2  ,        count(empno) headcount
  3  from     employees
  4  group by ROLLUP(deptno, job);

  DEPTNO JOB      HEADCOUNT
-------- -------- ---------
      10 ADMIN            1
      10 MANAGER          1
      10 DIRECTOR         1
>>>   10                  3  <<<
      20 MANAGER          1
      20 TRAINER          4
>>>   20                  5  <<<
      30 ADMIN            1
      30 MANAGER          1
      30 SALESREP         4
>>>   30                  6  <<<
>>>                      14  <<<

12 rows selected.

SQL>

Listing 8-37: GROUP BY ROLLUP example.

The ROLLUP addition results in four additional rows, marked with >>> and <<< in Listing 8-37 for readability. Three of these four additional rows show the head count per department over all jobs, and the last row shows the total number of employees.

GROUP BY CUBE

You can also use the CUBE keyword in the GROUP BY clause. Listing 8-38 shows an example.

SQL> select   deptno, job
  2  ,        count(empno) headcount
  3  from     employees
  4  group by CUBE(deptno, job);

  DEPTNO JOB      HEADCOUNT
-------- -------- ---------
                         14
>>>      ADMIN            2 <<<
>>>      MANAGER          3 <<<
>>>      TRAINER          4 <<<
>>>      DIRECTOR         1 <<<
>>>      SALESREP         4 <<<
      10                  3
      10 MANAGER          1
      10 DIRECTOR         1
      10 ADMIN            1
      20                  5
      20 MANAGER          1
      20 TRAINER          4
      30                  6
      30 MANAGER          1
      30 SALESREP         4
      30 ADMIN            1

17 rows selected.

SQL>

Listing 8-38: GROUP BY CUBE example.

This time, you get five more rows in the query result, marked in the same way with >>> and <<<, showing the number of employees per job, regardless of which department employs them.

Tip: Both GROUP BY CUBE and GROUP BY ROLLUP are two special cases of the GROUP BY GROUPING SETS syntax, offering more flexibility. You can also merge the results of different grouping operations into a single GROUP BY clause by specifying them in a comma-separated list. For more details, see Oracle SQL Reference.

CUBE, ROLLUP, and Null Values

The CUBE and ROLLUP keywords generate many null values in query results, as you can see in Listings 8-37 and 8-38. You can distinguish these system-generated null values from other null values; for example, to replace them with some explanatory text. You can use the GROUPING and GROUPING_ID functions for that purpose.

The GROUPING Function

Listing 8-39 shows an example of the GROUPING function.

SQL> select   deptno
  2  ,        case GROUPING(job)
  3                when 0 then job
  4                when 1 then '**total**'
  5           end job
  6  ,        count(empno) headcount
  7  from     employees
  8  group by rollup(deptno, job);

  DEPTNO JOB       HEADCOUNT
-------- --------- ---------
      10 ADMIN             1
      10 MANAGER           1
      10 DIRECTOR          1
      10 **total**         3
      20 MANAGER           1
      20 TRAINER           4
      20 **total**         5
      30 ADMIN             1
      30 MANAGER           1
      30 SALESREP          4
      30 **total**         6
         **total**        14

12 rows selected.

SQL>

Listing 8-39
GROUPING Function example.

Unfortunately, the GROUPING function can return only two results: 0 or 1. That’s why the last two lines both show '**total**'.

The GROUPING_ID Function

The GROUPING_ID function is more flexible that the GROUPING function, because it can return several different results, as you can see in Listing 8-40.

SQL> select   deptno
  2  ,        case GROUPING_ID(deptno, job)
  3                when 0 then job
  4                when 1 then '**dept **'
  5                when 3 then '**total**'
  6           end job
  7  ,        count(empno) headcount
  8  from     employees
  9  group by rollup(deptno, job);

  DEPTNO JOB       HEADCOUNT
-------- --------- ---------
      10 ADMIN             1
      10 MANAGER           1
      10 DIRECTOR          1
      10 **dept **         3
      20 MANAGER           1
      20 TRAINER           4
      20 **dept **         5
      30 ADMIN             1
      30 MANAGER           1
      30 SALESREP          4
      30 **dept **         6
         **total**        14

12 rows selected.

SQL>

Listing 8-40: GROUPING_ID Function example with ROLLUP.

You may be puzzled by the value 3 being used on the fifth line in Listing 8-40. Things become clear when you convert 3 to a binary representation, which results in the binary number 11. The two ones in this number act as a flag to trap the situation in which both columns contain a null value. GROUP BY ROLLUP can produce only 1 (binary 01) and 3 (binary 11), but GROUP BY CUBE can also generate 2 (binary 10). Look at the results in Listing 8-41. Obviously, GROUPING_ID produces a 0 (zero) for all “regular” rows in the result.

SQL> select   deptno, job
  2  ,        GROUPING_ID(deptno, job) gid
  3  from     employees
  4  group by cube(deptno, job);

  DEPTNO JOB           GID
-------- -------- --------
                         3
         ADMIN           2
         MANAGER         2
         TRAINER         2
         DIRECTOR        2
         SALESREP        2
      10                 1
      10 ADMIN           0
      10 MANAGER         0
      10 DIRECTOR        0
      20                 1

      20 MANAGER         0
      20 TRAINER         0
      30                 1
      30 ADMIN           0
      30 MANAGER         0
      30 SALESREP        0

17 rows selected.

SQL>

Listing 8-41: GROUPING_ID Function example with CUBE.

Contributors : Lex de Haan
Last modified 2006-01-06 11:08 AM

DBAzine.com

Sections

Personal tools

Menu

Who Are You?

Retrieval: Multiple Tables and Aggregation - Part 4

From the book, Mastering Oracle SQL and SQL Plus*, Chapter 8, Apress, January 2005.

8.7 The HAVING Clause

The Difference Between WHERE and HAVING

HAVING Clauses Without Group Functions

A Classic SQL Mistake

Grouping on Additional Columns

8.8 Advanced GROUP BY Features

GROUP BY ROLLUP

GROUP BY CUBE

CUBE, ROLLUP, and Null Values

The GROUPING Function

The GROUPING_ID Function

DBAzine.com

Sections

Personal tools

Menu

Who Are You?

Retrieval: Multiple Tables and Aggregation - Part 4

From the book, Mastering Oracle SQL and SQL *Plus, Chapter 8, Apress, January 2005.

8.7 The HAVING Clause

The Difference Between WHERE and HAVING

HAVING Clauses Without Group Functions

A Classic SQL Mistake

Grouping on Additional Columns

8.8 Advanced GROUP BY Features

GROUP BY ROLLUP

GROUP BY CUBE

CUBE, ROLLUP, and Null Values

The GROUPING Function

The GROUPING_ID Function

From the book, Mastering Oracle SQL and SQL Plus*, Chapter 8, Apress, January 2005.