Graphs? In Databases? Yes, you can!

To do graphs in SQL, we need nodes and edges. One common mistake

we'll avoid is to lump the nodes and edges into a single table. I'll

get into just why this is a mistake later, but let's just take it as a

given for now.

We'll start by modeling a forest. In graph theory, as in reality, a

forest is a collection of trees.

Graph theoretically, we can think of a tree as a collection of nodes

connected by "arrows," which is to say edges with a defined head and

tail. As with real arrows, which end you're dealing with matters a

good deal. The part you sharpen is very important to distinguish from

the part you touch to your bowstring.

Intuitively, we can think of an edge as a pair of nodes: a tail node,

and a head node. It must have exactly one of each.

Picture a tree. It consists of a trunk, branches, and leaves. The

trunk can branch, and the branches can branch in turn, or sprout

leaves, but the branches don't grow back together again. Leaves can't

have branches. We'll need to describe all that somehow.

So conceptually, we have a table of nodes that looks like this:

CREATE TABLE node (

node_id SERIAL PRIMARY KEY

/* and maybe some other stuff describing the node.

* We will leave this out for now.

*/

);

Next, let's define the table of edges:

CREATE TABLE edge (

tail INTEGER NOT NULL REFERENCES node(node_id),

head INTEGER NOT NULL REFERENCES node(node_id),

CHECK(tail <> head), /* Heads and tails can't be the same! */

PRIMARY KEY(tail, head) /* Just one edge between any two nodes! */

);

If branches

**are** growing back together, you'll

have at least one place where a head has more than one tail, and we

want to prevent that. How? It turns out SQL has a built-in,

non-clever feature for that: UNIQUE constraints. We just

say:

ALTER TABLE edge ADD UNIQUE(head)

Well, that was easy!

We didn't really need a UNIQUE constraint on (tail, head), so let's

drop that:

ALTER TABLE edge DROP CONSTRAINT edge_pkey

That pretty much does it. Edge can only describe a forest because it guarantees that anything in it must be a tree.

We haven't guaranteed that there's just one tree in the edge table.

That's for another time.

Until the next time, happy SQLing!

http://techportal.ibuildings.com/2009/09/07/graphs-in-the-database-sql-meets-social-networks/

http://www.alberton.info/talks/show/id/2

HTH