How to emulate relational algebra division operator in SQL

I found this recently among my files while preparing a lecture on SQL for my Database Systems course. I wrote this on October 2010 and have not checked it since then, but I believe in the correctness of the information.
To my Spanish readers, I ask for your forgiveness, I need to write a few things in english from time to time to avoid getting clogged (Is it “clogged” the right word?).


This document answer a very common question related to the emulation of division operation from relational algebra using SQL. Despite how commonly it’s used, most of the RDBMs don’t support division operation out of the box, so it’s necessary to find a way to emulate it with the help of other supported relational algebra operators. The present document describes a way to do that.

If you want to know what does a relational algebra’s division operator do or want to know how to emulate a division from other relational algebra operators check here.

First, we create the following tables. The meaning of the information contained in the tables is not important, only their structure matters.

CREATE TABLE foo (
  id   INTEGER PRIMARY KEY,
  name VARCHAR(10) 
);

CREATE TABLE faa (
  id   INTEGER PRIMARY KEY,
  name VARCHAR(10) 
);

CREATE TABLE foofaa (
  id_foo INTEGER REFERENCES foo,
  id_faa INTEGER REFERENCES faa
);

This is basically a many to many relation between foo and faa:

+-----+ *    +-----+
| foo |------| faa |
+-----+    * +-----+

or if you like:

+-----+ 1  * +--------+ *  1 +-----+
| foo |------| foofaa |------| faa |
+-----+      +--------+      +-----+

For testing purposes, we’ll fill the tables with some data. This way, the next queries will return the following results:

SELECT * FROM foo;

 id | name 
----+------
  1 | X
  2 | Y
  3 | Z
(3 rows)
SELECT * FROM faa;

 id | name 
----+------
  1 | A
  2 | B
  3 | C
(3 rows)
SELECT * FROM foofaa;

 id_foo | id_faa 
--------+--------
      1 |      1
      1 |      2
      1 |      3
      2 |      1
      2 |      2
      3 |      1
(6 rows)

Then, we want to perform a division between (R):

SELECT foo.id, foo.name, faa.name FROM foo, foofaa, faa
  WHERE foo.id = foofaa.id_foo AND faa.id = foofaa.id_faa;

 id | name | name 
----+------+------
  1 | X    | A
  1 | X    | B
  1 | X    | C
  2 | Y    | A
  2 | Y    | B
  3 | Z    | A
(6 rows)

…and (S):

SELECT name FROM faa;

 name 
------
 A
 B
 C
(3 rows)

So, the operation would be R/S.

The expected result is a relation containing the row (1, X), which is the only one that is related to each and every “name” in (S) (see division).

To emulate this with other relational algebra operators (but in SQL), we have to cross product the attributes in R that are not in S with S. So basically, from the setup we have created we need to cross the content of table faa with the content of S. We’ll name the result of this operation T:

SELECT foo.id, foo.name, faa.name FROM foo, faa;

 id | name | name 
----+------+------
  1 | X    | A
  1 | X    | B
  1 | X    | C
  2 | Y    | A
  2 | Y    | B
  2 | Y    | C
  3 | Z    | A
  3 | Z    | B
  3 | Z    | C
(9 rows)

Then we need to make T-R to have all the elements NOT present in the original R. We’ll call this result U:

(SELECT foo.id, foo.name, faa.name FROM foo, faa)
  EXCEPT
(SELECT foo.id, foo.name, faa.name FROM foo, foofaa, faa
    WHERE foo.id = foofaa.id_foo AND faa.id = foofaa.id_faa);

 id | name | name 
----+------+------
  2 | Y    | C
  3 | Z    | B
  3 | Z    | C
(3 rows)

This looks right, in the sense that the row originally named Y has only related A and B, so in U we have only C, which is not related to Y. Z is related only to A so it’s not related to B and C, and finally, X is related to all entries in faa so it does not appear on the results of U.

Now we project only the attributes from U that belong to R and name the result as V (some aliases tricks where needed to do this):

SELECT x.id, x.foo_name FROM (                                    
  (SELECT foo.id, foo.name AS foo_name, faa.name AS faa_name FROM foo, faa)
    EXCEPT
  (SELECT foo.id, foo.name, faa.name FROM foo, foofaa, faa
      WHERE foo.id = foofaa.id_foo AND faa.id = foofaa.id_faa)) AS x;

 id | foo_name 
----+----------
  2 | Y
  3 | Z
  3 | Z
(3 rows)

And finally we do R-V to get the desired R/S result:

(SELECT * FROM foo)
  EXCEPT
(SELECT x.id, x.foo_name FROM (
  (SELECT foo.id, foo.name AS foo_name, faa.name AS faa_name FROM foo, faa)
    EXCEPT
  (SELECT foo.id, foo.name, faa.name FROM foo, foofaa, faa
      WHERE foo.id = foofaa.id_foo AND faa.id = foofaa.id_faa)) AS x);

 id | name 
----+------
  1 | X
(1 row)

That’s it, the result is R/S!

Leave a Reply

Your email address will not be published. Required fields are marked *