Managing Projects
I love getting email, even if not all of it is pornographic. That is one reason I always ask for a better answer to any SQL programming puzzle I put in my columns. One of the people who consistently beats my answer is Richard Romley at Salomon Smith Barney, so I love to get something from him.
The problem deals with two tables. The first table has a list of managers and the projects they can manage. The second table has a list of employees, their departments and the project to which they are assigned. Each employee is assigned to one and only one department and each employee works on one and only one project at a time. A department can have several different projects at the same time and a project can span several departments.
CREATE TABLE MgrProjects (mgr CHAR(2) NOT NULL, project CHAR(2) NOT NULL, PRIMARY KEY(mgr, project)); INSERT INTO Mgr_Project VALUES ('M1', 'P1'); INSERT INTO Mgr_Project VALUES ('M1', 'P3'); INSERT INTO Mgr_Project VALUES ('M2', 'P2'); INSERT INTO Mgr_Project VALUES ('M2', 'P3'); INSERT INTO Mgr_Project VALUES ('M3', 'P2'); INSERT INTO Mgr_Project VALUES ('M4', 'P1'); INSERT INTO Mgr_Project VALUES ('M4', 'P2'); INSERT INTO Mgr_Project VALUES ('M4', 'P3'); CREATE TABLE Employees (emp CHAR(2) NOT NULL, dept CHAR(2) NOT NULL, project CHAR(2) NOT NULL, UNIQUE (emp, project), UNIQUE (emp, dept), PRIMARY KEY (emp, dept, project)); INSERT INTO Employees VALUES ('Al', 'D1', 'P1'); INSERT INTO Employees VALUES ('Bob', 'D1', 'P1'); INSERT INTO Employees VALUES ('Carl', 'D1', 'P1'); INSERT INTO Employees VALUES ('Don', 'D1', 'P2'); INSERT INTO Employees VALUES ('Ed', 'D1', 'P2'); INSERT INTO Employees VALUES ('Frank', 'D1', 'P2'); INSERT INTO Employees VALUES ('George', 'D1', 'P2'); INSERT INTO Employees VALUES ('Harry', 'D2', 'P2'); INSERT INTO Employees VALUES ('Jack', 'D2', 'P2'); INSERT INTO Employees VALUES ('Larry', 'D2', 'P2'); INSERT INTO Employees VALUES ('Mike', 'D2', 'P2'); INSERT INTO Employees VALUES ('Nat', 'D2', 'P2'); INSERT INTO Employees VALUES ('Oscar', 'D3', 'P2'); INSERT INTO Employees VALUES ('Pat', 'D3', 'P2'); INSERT INTO Employees VALUES ('Rich', 'D3', 'P3');
As a side note, look at the use of UNIQUE constraints to enforce the data integrity rules of the problem. That is a very useful trick.
The problem was to generate an exception report showing for each manager any department he was qualified to manage some but not all of the projects being worked on within the department. Richard sent over two initial answers:
SELECT DISTINCT M1.mgr, E1.dept FROM MgrProjects AS M1 INNER JOIN Employees AS E1 ON M1.project = E1.project WHERE EXISTS (SELECT * FROM Employees AS E2 WHERE E2.dept = E1.dept AND NOT EXISTS (SELECT * FROM MgrProjects AS m2 WHERE m2.project = E2.project AND m2.mgr = M1.mgr));
This first query is not like Richard; he hates correlated subqueries and he really hates nested correlated subqueries. Optimizers in most SQL products do not do a job with them. His second effort was more like what he usually writes -- lots of JOINs
SELECT DISTINCT M1.mgr, E1.dept FROM ((MgrProjects AS M1 INNER JOIN Employees AS E1 ON M1.project = E1.project) INNER JOIN Employees AS E2 ON E1.dept = E2.dept) LEFT OUTER JOIN MgrProjects AS m2 ON m2.project = E2.project AND m2.mgr = M1.mgr WHERE m2.mgr IS NULL;
He liked this answer, but thought it was a little too confusing to understand easily. I agreed and thought of the LEFT OUTER JOIN and IS NULL combination as yet another way of doing a [NOT] EXISTS () predicate.
But when I looked at the first query, I immediately thought of Chris Date's version of relational division.This made me think of using the version of relational division associated with my books.
SELECT DISTINCT M1.mgr, E1.dept FROM ((MgrProjects AS M1 INNER JOIN Employees AS E1 ON M1.project = E1.project) INNER JOIN Employees AS E2 ON E1.dept = E2.dept) LEFT OUTER JOIN MgrProjects AS m2 ON m2.project = E2.project AND m2.mgr = M1.mgr WHERE m2.mgr IS NULL;
The query is simply a relational division with a <> instead of an = in the HAVING clause. I felt that I was finally one up on Richard. Then he got obsessive.
He came back with a modification of my answer which uses a characteristic function inside a single aggregate function.
SELECT M1.mgr, E1.dept FROM MgrProjects AS M1 CROSS JOIN Employees AS E1 WHERE M1.project = E1.project GROUP BY M1.mgr, E1.dept HAVING COUNT(*) <> (SELECT COUNT(emp) FROM Employees AS E2 WHERE E2.dept = E1.dept);
This query uses a characteristic function while my original version compares a count of employees under each manager to a count of employees under each project. The use of "GROUP BY M1.mgr, E1.dept, E2.project" with the "SELECT DISTINCT M1.mgr, E1.dept" is really the tricky part in this new queryb. What we have is a three dimensional space with the (x, y, z) axis representing (mgr, dept, project) and then we reduce it to two dimensions (mgr, dept) by seeing if employees on shared projects cover the department or not.
That observation lead to the next changes. We can build a table which shows each combination of manager, department and the level of authority they have over the projects they have in common. That is the derived table T1 in the following query; authority = 1 means the manager is not on the project and authority = 2 means that he is on the project
SELECT DISTINCT M1.mgr, E1.dept FROM (MgrProjects AS M1 INNER JOIN Employees AS E1 ON M1.project = E1.project) INNER JOIN Employees AS E2 ON E1.dept = E2.dept GROUP BY M1.mgr, E1.dept, E2.project HAVING MAX (CASE WHEN M1.project = E2.project THEN 1 ELSE 0 END) = 0;
We can now sum the authority numbers for all the projects within a department to determine the power this manager has over the department as a whole. If he had a total of one, he has no authority over employees on any project in the department. If he had a total of two, he has power over all employees on all projects in the department. If he had a total of three, he has both a 1 and a 2 authority total on some projects within the department. Here is the final answer.
Results mgr dept power --- ---- ---- M1 D1 Some M1 D2 None M1 D3 Some M2 D1 Some M2 D2 All M2 D3 All M3 D1 Some M3 D2 All M3 D3 Some M4 D1 All M4 D2 All M4 D3 All
While this is not exactly the original exception report that was requested, it is quick to execute and gives a lot more information. I have observed it is a good idea to answer the unasked next question when you can do it easily.
Now, can you do better?
--
Joe Celko was a member of the ANSI X3H2 Database Standards Committee and helped write the SQL-92 standards. He is the author of over 450 magazine columns and four books, the best known of which is SQL for Smarties (Morgan-Kaufmann Publishers, 1999). He is the Vice President of RDBMS at Northface University in Salt Lake City.
Contributors : Joe Celko
Last modified 2005-04-20 10:34 AM