# DBAzine.com

##### Personal tools
You are here: Home » Matrix Transposition in SQL
Seeking new owner for this high-traffic DBAzine.com site.
Tap into the potential of this DBA community to expand your business! Interested? Contact us today.
##### Who Are You?
I am a:
Mainframe True Believer
Distributed Fast-tracker

[ Results | Polls ]

# Matrix Transposition in SQL

“God made the integers; all the rest is the work of man.”

-- Leopold Kronecker

Matrix transposition is among Frequently Asked Questions. Given a single-column table ORIGINAL,

```ENAME
SMITH
ALLEN
WARD
JONES
MARTIN
BLAKE
CLARK
SCOTT
KING
TURNER
JAMES
FORD
MILLER```

we’ll explore how to transform it into TRANSPOSED:

```Column
S A W J M B C S K T A J F M
M L A O A L L C I U D A O I
I L R N R A A O N R A M R L
T E D E T K R T G N M E D L
H N S I E K T E S S E
N R R```

This problem has been discussed in the Usenet thread Matrix transpose in SQL, and general agreement was that it can’t be done in standard SQL-92. One of the latest Oracle magazine code tips --  -- suggests a rather lengthy procedural solution. This article describes a SQL solution with a minimal amount of procedural code. As a bonus, we’ll learn several ways how to program user-defined aggregates in Oracle 9i.

## Nesting and Unnesting

Consider the table

 EMPNO POS LET 7369 1 S 7369 2 M 7369 3 I 7369 4 T 7369 5 H 7499 1 A 7499 2 L 7499 3 L 7499 4 E 7499 5 N … … …

that we call UNNESTED. Both tables in the beginning of the article could be viewed as aggregates of UNNESTED. The ORIGINAL table could be specified as:

`    select concat(LET) from UNNESTED group by EMPNO`

while the TRANSPOSED table is just a grouping by a different column:

`    select concat(LET||’ ‘) from UNNESTED group by POS`

(I also added a space padding to make the query result set more readable). So the problem of transposing the ORIGINAL table into TRANSPOSED can be solved by just implementing two steps:

Unnesting ORIGINAL --> UNNESTED

Nesting UNNESTED --> TRANSPOSED

The first step involves an integer enumeration relation, introduced in myprevious article. Reader feedback, and other articles about integer enumeration convinced me to further expand on this topic.

## Integer Enumeration for Aggregate Dismembering

Again, I prefer producing arbitrary, large list of integers with a Table Function

```CREATE TYPE IntSet AS TABLE OF Integer;

/

CREATE or replace FUNCTION UNSAFE

RETURN  IntSet PIPELINED IS

BEGIN

loop

PIPE ROW(1);

end loop;

END;

/

select rownum from TABLE(UNSAFE) where rownum < 1000000

select rownum from TABLE(UNSAFE) where rownum < 1000000
```

In my previous article, I reserved the possibility of using an upper-bound integer range argument that would make the function safe. In other words, the function would never spin out of control whenever a user forgot the stop predicate rownum < 1000000. On the other hand, using the function argument is inferior for two reasons:

• predicates are more self-documenting than function arguments, and
• we can use subqueries instead of hardcoded limits.

The runtime expense of using the table function is minimal: unlike forced materialization into a real table, logical I/O associated with table function calls is virtually zero.

In DB2, a list of integers can be constructed with recursive SQL:

```with iota (n) as

( values(1)

union all

select n+1 from iota

where n<100000

)

select * from iota;
```

It is slightly inconvenient, however, that the predicate, which limits the list of numbers, must be specified within the recursion subquery, while it naturally belongs to the main query. The problem of pushing the predicate inside an inner query is somewhat similar to the one we saw for UNSAFE table function.

With the list of integers at our disposal, writing an unnesting query is easy:

```SQL>select empno, pos, substr(ENAME,i,1) from emp,

2 (select rownum pos from table(unsafe)

3  where rownum < (select max(length(ename)) from emp));

EMPNO        POS S

---------- ---------- -

7369          1 S

7369          2 M

7369          3 I

7369          4 T

7369          5 H

7499          1 A

7499          2 L

7499          3 L

7499          4 E

7499          5 N

7521          1 W
```

Looking at strings "SMITH," "ALLEN," etc., as aggregates of letters might seem odd at first, but that is what they really are. We’ll assemble those letters back into aggregates of (different) words.

## User Defined Aggregate Functions

There is no aggregate function that would concatenate strings in standard SQL. However, there are multiple ways defining it in Oracle:

```CREATE or replace FUNCTION CONCAT_LIST( cur SYS_REFCURSOR )

RETURN  VARCHAR2 IS

ret VARCHAR2(32000);

tmp VARCHAR2(4000);

BEGIN

loop

fetch cur into tmp;

exit when cur%NOTFOUND;

ret := ret || tmp;

end loop;

RETURN ret;

END;

/

select distinct

deptno,

CONCAT_LIST(CURSOR(

select ename ||',' from emp ee where e.deptno = ee.deptno

) employees

from emp e;
```

Syntactically, neither of these solutions looks like a group by. However, scalar subquery in the select list is actually more powerful than group by. This idea is emphasized in the article by C.J. Date: “A discussion of some redundancies in SQL.” If you prefer, however, the traditional group by syntax, then there is yet another way to program user-defined aggregates:

• Oracle 9i user-defined aggregates:
```create or replace type string_agg_type as object (

total varchar2(4000),

static function

ODCIAggregateInitialize(sctx IN OUT string_agg_type )

return number,

member function

ODCIAggregateIterate(self IN OUT string_agg_type ,

value IN varchar2 )

return number,

member function

ODCIAggregateTerminate(self IN string_agg_type,

returnValue OUT  varchar2,

flags IN number)

return number,

member function

ODCIAggregateMerge(self IN OUT string_agg_type,

ctx2 IN string_agg_type)

return number

);

/

create or replace type body string_agg_type is

static function ODCIAggregateInitialize(sctx IN OUT string_agg_type)

return number

is

begin

sctx := string_agg_type( null );

return ODCIConst.Success;

end;

member function ODCIAggregateIterate(self IN OUT string_agg_type,

value IN varchar2 )

return number

is

begin

self.total := self.total || value;

return ODCIConst.Success;

end;

member function ODCIAggregateTerminate(self IN string_agg_type,

returnValue OUT varchar2,

flags IN number)

return number

is

begin

returnValue := self.total;

return ODCIConst.Success;

end;

member function ODCIAggregateMerge(self IN OUT string_agg_type,

ctx2 IN string_agg_type)

return number

is

begin

self.total := self.total || ctx2.total;

return ODCIConst.Success;

end;

end;

/

CREATE or replace

FUNCTION stragg(input varchar2 )

RETURN varchar2

PARALLEL_ENABLE AGGREGATE USING string_agg_type;

/

select deptno, stragg(ename)

from emp

group by deptno;
```

This last solution is probably the best from a performance perspective, since the query with the user-defined aggregate looks exactly like the traditional group by, the usual optimizations can be employed. Compare this to the second method -- pipelining the user-defined aggregate function. In that case the optimizer would certainly not be able to unnest a scalar subquery within a table function (yet).

Now we have all the ingredients necessary for writing the transposition query. According to our program, we need to apply aggregation to the unnested view. I skipped that step, however, and rewrote the query directly to

```select CONCAT_LIST(CURSOR(

SELECT substr(ENAME,i,1)|| '  '  from emp

))

from (select rownum i from table(unsafe)

where rownum < (select max(length(ename))+1 from emp))
```

Here I used solution #2 for user defined aggregates, and the reader is advised writing a transposition query with the other aggregate solutions as well. The last query needs the final touch: taking care of those employee names, which are shorter than the maximum length. It can be easily accommodated with a switch:

```select CONCAT_LIST(CURSOR(

select case when length(ename)<i

then ' '

else substr(ENAME,i,1)|| '  '

end

from emp

))

from (select rownum i from table(unsafe)

where rownum < (select max(length(ename))+1 from emp))

CONCAT_LIST(CURSOR(SELECTCASEWHENLENGTH(ENAME

---------------------------------------------

S  A  W  J  M  B  C  S  K  T  A  J  F  M

M  L  A  O  A  L  L  C  I  U  D  A  O  I

I  L  R  N  R  A  A  O  N  R  A  M  R  L

T  E  D  E  T  K  R  T  G  N  M  E  D  L

H  N     S  I  E  K  T     E  S  S     E

N              R           R
```

--

Vadim Tropashko works for Real World Performance group at Oracle Corp. Prior to that he was application programmer and translated "The C++ Programming Language" by B. Stroustrup, 2nd edition into Russian. His current interests include SQL Optimization, Constraint Databases, and Computer Algebra Systems.

Contributors : Vadim Tropashko