Ullman subgraph isomorphism.html

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <link href="https://adriann.github.io/feed.rss" rel="alternate" type="application/rss+xml" title="What's new on adriann.github.io" />
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <meta http-equiv="Content-Style-Type" content="text/css" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  <meta name="generator" content="pandoc" />
  <meta name="author" content="Adrian Neumann (adrian_neumann@gmx.de)" />
  <title>Ullman’s Subgraph Isomorphism Algorithm</title>
  <style type="text/css">
  .displayequation{margin-left:auto;margin-right:auto;}
  </style>
  <style>
  .caption{font-size:66%;text-align:right;}
  .figure{float:right;padding-bottom:1em;padding-left:1em;}
  .figure>img{display:block;margin:0 auto;}
  .footnotes{font-size:80%;}
  .block{border-left:1ex solid gray;padding-left:2em;}
  li{padding:0.25em;}
  a:hover{text-shadow: 0 0 5px;}
  body{font-family:sans-serif;max-width:100ex;padding-left:3em;padding-right:2em;}
  code{font-family:Consolas, Inconsolata, Monaco, monospace;}
  p{text-align:justify;}
  </style>
</head>
<body>
<div id="header">
<h1 class="title">Ullman’s Subgraph Isomorphism Algorithm</h1>
</div>
<div class="figure">
<img src="pictures/puzzle_by_aice83.jpg" title="puzzle by deviant art user ~aice83" />

</div>
<p>The subgraph isomorphism problem asks whether a graph <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>G</mi><annotation encoding="application/x-tex">G</annotation></semantics></math> has a subgraph <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>G</mi><mi>′</mi><mo>⊂</mo><mi>G</mi></mrow><annotation encoding="application/x-tex">G&#39;\subset G</annotation></semantics></math> that is isomorphmic to a graph <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>P</mi><annotation encoding="application/x-tex">P</annotation></semantics></math>. So basically you have the picture on the box of a puzzle (<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>G</mi><annotation encoding="application/x-tex">G</annotation></semantics></math>) and want to know where a particular piece (<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>P</mi><annotation encoding="application/x-tex">P</annotation></semantics></math>) fits, if at all. It is NP-complete because Hamiltonian cycle is a special case.</p>
<p>In 1976 Ullman proposed a backtracking algorithm for this problem. The writeup in the <a href="http://dx.doi.org/10.1145%2F321921.321925">original paper</a> is hard to follow because the pseudo-code doesn’t use functions or even loops. Maybe those weren’t invented back then.<a href="#fn1" class="footnoteRef" id="fnref1"><sup>1</sup></a></p>
<!--more-->
<h2 id="the-basic-algorithm">The basic algorithm</h2>
<p>It is possible to encode a subgraph isomorphism as a <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false" form="prefix">|</mo><msub><mi>V</mi><mi>P</mi></msub><mo stretchy="false" form="prefix">|</mo><mo>×</mo><mo stretchy="false" form="prefix">|</mo><msub><mi>V</mi><mi>G</mi></msub><mo stretchy="false" form="prefix">|</mo></mrow><annotation encoding="application/x-tex">|V_P|\times |V_G|</annotation></semantics></math> matrix <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>M</mi><annotation encoding="application/x-tex">M</annotation></semantics></math> in which each row contains exactly one 1 and each column contains at most one 1. We set <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>m</mi><mrow><mi>i</mi><mi>j</mi></mrow></msub><annotation encoding="application/x-tex">m_{ij}</annotation></semantics></math> to 1 iff <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>v</mi><mi>j</mi></msub><mo>∈</mo><mi>G</mi></mrow><annotation encoding="application/x-tex">v_j\in G</annotation></semantics></math> corresponds to <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>v</mi><mi>i</mi></msub><mo>∈</mo><mi>P</mi></mrow><annotation encoding="application/x-tex">v_i\in P</annotation></semantics></math> in the isomorphism. Then <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mo>=</mo><mi>M</mi><mo stretchy="false" form="prefix">(</mo><mi>M</mi><mi>G</mi><msup><mo stretchy="false" form="postfix">)</mo><mi>T</mi></msup></mrow><annotation encoding="application/x-tex">P=M(MG)^T</annotation></semantics></math>, where I use <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>P</mi><annotation encoding="application/x-tex">P</annotation></semantics></math> and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>G</mi><annotation encoding="application/x-tex">G</annotation></semantics></math> to stand for the adjacency matrices. If we aren’t looking for induced subgraphs, <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mo>≤</mo><mi>M</mi><mo stretchy="false" form="prefix">(</mo><mi>M</mi><mi>G</mi><msup><mo stretchy="false" form="postfix">)</mo><mi>T</mi></msup></mrow><annotation encoding="application/x-tex">P\leq M(MG)^T</annotation></semantics></math> (componentwise), i.e. the subgraph of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>G</mi><annotation encoding="application/x-tex">G</annotation></semantics></math> selected by <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>M</mi><annotation encoding="application/x-tex">M</annotation></semantics></math> might contain additional edges not present in <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>P</mi><annotation encoding="application/x-tex">P</annotation></semantics></math>.</p>
<p>The algorithm works by systematically enumerating possible matrices <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>M</mi><annotation encoding="application/x-tex">M</annotation></semantics></math> and checking whether they actually encode an isomorphism.</p>
<p>We start by setting up a <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false" form="prefix">|</mo><msub><mi>V</mi><mi>P</mi></msub><mo stretchy="false" form="prefix">|</mo><mo>×</mo><mo stretchy="false" form="prefix">|</mo><msub><mi>V</mi><mi>G</mi></msub><mo stretchy="false" form="prefix">|</mo></mrow><annotation encoding="application/x-tex">|V_P|\times |V_G|</annotation></semantics></math> matrix <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>M</mi><mn>0</mn></msup><annotation encoding="application/x-tex">M^0</annotation></semantics></math> that contains a 1 at <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false" form="prefix">(</mo><mi>i</mi><mo>,</mo><mi>j</mi><mo stretchy="false" form="postfix">)</mo></mrow><annotation encoding="application/x-tex">(i,j)</annotation></semantics></math> if it is possible that <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>v</mi><mi>i</mi></msub><mo>∼</mo><msub><mi>v</mi><mi>j</mi></msub></mrow><annotation encoding="application/x-tex">v_i \sim v_j</annotation></semantics></math> in some subgraph isomorphism. For now, we only use the degree as a criterion, i.e. we can map <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>v</mi><mi>i</mi></msub><annotation encoding="application/x-tex">v_i</annotation></semantics></math> to <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>v</mi><mi>j</mi></msub><annotation encoding="application/x-tex">v_j</annotation></semantics></math> if the latter has enough neighbors: <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msubsup><mi>m</mi><mrow><mi>i</mi><mi>j</mi></mrow><mn>0</mn></msubsup><mo>=</mo><mn>1</mn><mo accent="false">⇔</mo><mtext mathvariant="normal">deg</mtext><mo stretchy="false" form="prefix">(</mo><msub><mi>v</mi><mi>i</mi></msub><mo stretchy="false" form="postfix">)</mo><mo>≤</mo><mtext mathvariant="normal">deg</mtext><mo stretchy="false" form="prefix">(</mo><msub><mi>v</mi><mi>j</mi></msub><mo stretchy="false" form="postfix">)</mo></mrow><annotation encoding="application/x-tex">m_{ij}^0 = 1 \Leftrightarrow \mbox{deg}(v_i) \leq \mbox{deg}(v_j)</annotation></semantics></math>. If we were cleverer we could remove more 1’s and reduce our runtime.</p>
<p>Now all we need to do is try all matrices <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>M</mi><annotation encoding="application/x-tex">M</annotation></semantics></math> that can be obtained from <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>M</mi><mn>0</mn></msup><annotation encoding="application/x-tex">M^0</annotation></semantics></math> by removing all but one 1 from each row while having at most one 1 in each column. We do that recursively.</p>
<pre><code>recurse(used_columns, cur_row, G, P, M)
    if cur_row = num_rows(M)
        if  M is an isomorphism:
            output yes and end the algorithm

    M&#39; = M
    prune(M&#39;)

    for all unused columns c
        set column c in M&#39; to 1 and other columns to 0
        mark c as used
        recurse(used_column, cur_row+1, G, P, M&#39;)
        mark c as unused

    output no</code></pre>
<p>Making the <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>M</mi><mi>′</mi></mrow><annotation encoding="application/x-tex">M&#39;</annotation></semantics></math> copy of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>M</mi><annotation encoding="application/x-tex">M</annotation></semantics></math> isn’t really necessary, but it will become so once we implement our pruning procedure.</p>
<h2 id="pruning">Pruning</h2>
<p>We want to (safely) change at least some of the 1’s in our matrix to 0’s to reduce the computation time. For that, we use a simple observation. If some <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>p</mi><mo>∈</mo><msub><mi>V</mi><mi>P</mi></msub></mrow><annotation encoding="application/x-tex">p\in V_P</annotation></semantics></math> has neighbours <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>p</mi><mn>1</mn></msub><mo>,</mo><mi>…</mi><mo>,</mo><msub><mi>p</mi><mi>l</mi></msub><mo>∈</mo><mi>P</mi></mrow><annotation encoding="application/x-tex">p_1,\ldots, p_l \in P</annotation></semantics></math>, and we map it to some <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>g</mi><mo>∈</mo><msub><mi>V</mi><mi>G</mi></msub></mrow><annotation encoding="application/x-tex">g \in V_G</annotation></semantics></math>, then we’d better also map <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>p</mi><mn>1</mn></msub><mo>,</mo><mi>…</mi><mo>,</mo><msub><mi>p</mi><mi>l</mi></msub></mrow><annotation encoding="application/x-tex">p_1,\ldots,p_l</annotation></semantics></math> to neigbors of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>g</mi><annotation encoding="application/x-tex">g</annotation></semantics></math>.</p>
<p>Remember that a 1 at <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false" form="prefix">(</mo><mi>i</mi><mo>,</mo><mi>j</mi><mo stretchy="false" form="postfix">)</mo></mrow><annotation encoding="application/x-tex">(i,j)</annotation></semantics></math> in <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>M</mi><annotation encoding="application/x-tex">M</annotation></semantics></math> means that we still think that <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>v</mi><mi>i</mi></msub><mo>∈</mo><mi>P</mi></mrow><annotation encoding="application/x-tex">v_i\in P</annotation></semantics></math> can correspond to <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>v</mi><mi>j</mi></msub><mo>∈</mo><mi>G</mi></mrow><annotation encoding="application/x-tex">v_j\in G</annotation></semantics></math>. But if we already found out that there is a neighbour of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>v</mi><mi>i</mi></msub><mo>∈</mo><mi>P</mi></mrow><annotation encoding="application/x-tex">v_i\in P</annotation></semantics></math> can’t be mapped to any neigbour of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>v</mi><mi>j</mi></msub><mo>∈</mo><mi>G</mi></mrow><annotation encoding="application/x-tex">v_j\in G</annotation></semantics></math> clearly the 1 at <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false" form="prefix">(</mo><mi>i</mi><mo>,</mo><mi>j</mi><mo stretchy="false" form="postfix">)</mo></mrow><annotation encoding="application/x-tex">(i,j)</annotation></semantics></math> is wrong and we can change it safely to a 0. This change might make more mappings impossible, so we iterate this check until nothing can be changed. If we remove all 1’s from a row during this refinement, we can stop the whole process, since <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>M</mi><annotation encoding="application/x-tex">M</annotation></semantics></math> can’t be completed to an isomorphism anymore.</p>
<pre><code>do
    for all (i,j) where M is 1
        for all neighbors x of vi in P
            if there is no neighbor y of vj s.t. M(x,y)=1
                M(i,j)=0
while M was changed</code></pre>
<p>Now the effectiveness of our pruning procedure depends on the order of the vertices. The earlier in the recursion we find a 1 that can be changed to a 0, the better. Thus it is a good a idea to order the vertices such that high degree vertices of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>P</mi><annotation encoding="application/x-tex">P</annotation></semantics></math> are first.</p>
<h2 id="clever-implementation">Clever Implementation</h2>
<p>This algorithm seems rather costly, since we do a matrix multplication for every leaf in the recursion tree and manipulate the matrix quite a lot in between. However, note that we’re dealing with boolean matrices here. It’s a good idea to encode them as bit vectors. Then multiplication can be done efficiently using bit-twiddling, even if we still use the naive O(<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>n</mi><mn>3</mn></msup><annotation encoding="application/x-tex">n^3</annotation></semantics></math>) algorithm. Similarly setting a column to 1 and the other columns to 0 can be done using bit-twiddling and finding viable neighbors during the pruning step is fast too. Using these implementation tricks we can speed up the naive algorithm by some largish constant factor, depending on the word size of the CPU and whether it supports SSE or similar vector operations.</p>
<div class="footnotes">
<hr />
<ol>
<li id="fn1"><p><a href="http://www.blogger.com/profile/03287960526248203121">Clifford Wolf</a> says: I think those were invented in the late 60’s by Dijkstra (“Structured Programming” by E.W.Dijkstra in Nato Science Committee - Software Engineering Techniques in April 1970 and “GO TO Statement Considered Harmful” by E.W.Dijkstra in CACM V 11 No. 3 in March 1968). But afaik the concept of programming using block structures was unknown to most people in computer science before Kernighan and Plauger’s 1974 book “The Elements of Programming Style” and probably Nassi and Shneiderman’s 1973 paper “Flowchart techniques for structured programming”. So at least it was a rather new idea in 1976.<a href="#fnref1">↩</a></p></li>
</ol>
</div>
<hr/>
<div style="display:inline-flex;flex-wrap:wrap;justify-content:space-between;font-size:80%">
<p style="margin-right:2ex">CC-BY-SA <a href="mailto:adrian_neumann@gmx.de">Adrian Neumann</a> (PGP Key <a href="https://adriann.github.io/ressources/pub.asc">A0A8BC98</a>)</p>
<p style="margin-left:1ex;margin-right:1ex"><a href="http://adriann.github.io">adriann.github.io</a></p>
<p style="margin-left:2ex"><a href="https://adriann.github.io/feed.rss">RSS</a></p>
</div>
</body>
</html>