
Exact Marginal Estimation via Junction Trees

A short introduction to the mathematical perspective on exact marginal inference on graphs via the Junction Tree Algorithm

Author
  • Dr. Dmitrij Sitenko, Mathematical Image Processing at Ruprecht Karls University Heidelberg

Junction trees, also known as clique trees, are a particular data structure used in probabilistic graphical models for the inference of exact marginals of a given joint probability distribution.

We start with a simple graphical model consisting of four nodes that form a tree.

A four node Tree Graph

Knowing the joint distribution $p(x_1,x_2,x_3,x_4)$, our goal is to obtain the marginals $p(x_1),p(x_2),p(x_3),p(x_4)$. A simple approach is marginalization over the complementary set, i.e. for $x_3$

\begin{aligned} \sum_{x_1,x_2,x_4} p(x_1,\dots,x_4) = p(x_3) \end{aligned}

which requires $10^3$ summations if we assume each $x_i$ to be a categorical random variable taking values in the finite set $\{1,\dots,10\}$. However, this procedure scales exponentially with the number of nodes $|V| = n$ of the graph: a graph with $n = 91$ nodes already requires $10^{90}$ summations for each marginal, which is more than the number of atoms in the universe!
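
To make the cost concrete, here is a minimal C++ sketch of the brute-force approach, under the toy assumptions of this section (four variables with $10$ states each, the joint stored as a flat table; the uniform joint in main is just a placeholder):

#include <array>
#include <cstdio>
#include <vector>

// Number of states per variable (10, as in the text).
constexpr int K = 10;

// Brute-force marginal p(x3): sum the joint table over x1, x2 and x4.
// The joint is stored flat, indexed as joint[((x1*K + x2)*K + x3)*K + x4].
std::array<double, K> marginalX3(const std::vector<double>& joint) {
    std::array<double, K> m{};  // zero-initialized accumulator
    for (int x1 = 0; x1 < K; ++x1)
        for (int x2 = 0; x2 < K; ++x2)
            for (int x3 = 0; x3 < K; ++x3)
                for (int x4 = 0; x4 < K; ++x4)
                    m[x3] += joint[((x1 * K + x2) * K + x3) * K + x4];
    return m;  // K^3 = 1000 additions per entry of m
}

int main() {
    // Placeholder joint distribution: uniform over all 10^4 configurations.
    std::vector<double> joint(K * K * K * K, 1.0 / (K * K * K * K));
    std::array<double, K> m = marginalX3(joint);
    std::printf("p(x3=0) = %f\n", m[0]);  // 0.1 for the uniform joint
    return 0;
}

A way out of this hopeless approach is given by the underlying tree structure of the graph, with respect to which the distribution can be rewritten as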

\begin{aligned} p(x_1,x_2,x_3,x_4) &= \frac{p(x_1,x_2,x_3,x_4)\,p(x_1,x_2,x_3)}{p(x_1,x_2,x_3)}\\ &= p(x_4|x_3,x_2,x_1)\,p(x_3,x_2,x_1)\\ &= p(x_4|x_3,x_2,x_1)\,p(x_2|x_3,x_1)\,p(x_1,x_3)\\ &= p(x_4|x_3,x_2,x_1)\,p(x_2|x_3,x_1)\,p(x_1|x_3)\,p(x_3) \end{aligned}

Since the node $x_3$ separates the nodes $x_1, x_2$ and $x_4$ from each other, conditional independence gives $p(x_4|x_3,x_2,x_1) = p(x_4|x_3)$ and $p(x_2|x_3,x_1) = p(x_2|x_3)$, and the joint probability distribution factorizes into a product of conditional distributions

\begin{aligned} p(x_1,x_2,x_3,x_4) &= p(x_1,x_2,x_4|x_3)\,p(x_3) \\ &= p(x_1|x_3)\,p(x_2|x_3)\,p(x_4|x_3)\,p(x_3)\\ &= \frac{p(x_1,x_3)\,p(x_2,x_3)\,p(x_4,x_3)}{p(x_3)^2} \end{aligned}

So far nothing is won: starting from the right hand side we still have to sum over the nodes $\{x_1,x_2,x_4\}$. However, if we had knowledge of the edge marginals $p(x_i,x_3)$, we could resolve this problem by just taking the sum over $x_i$. This is the essence of why estimating the marginals of a large joint probability distribution receives so much attention in the scientific community.

Following these steps we have now reduced the marginalization problem to the inference of edge marginals. But how do we arrive at the right hand side for more complicated graphs? Before answering this question we state an alternative factorization of $p(\bold{x})$

\begin{aligned} \frac{p(x_1,x_3)p(x_2,x_3)p(x_4,x_3)}{p(x_3)^2} = \frac{p(x_1,x_3)}{\textcolor{red}{p(x_1)}p(x_3)}\cdot \frac{p(x_2,x_3)}{p(x_3)\textcolor{red}{p(x_2)}}\cdot\frac{p(x_4,x_3)}{\textcolor{red}{p(x_4)p(x_3)}}\cdot\textcolor{red}{p(x_3)p(x_1)p(x_2)p(x_4)} \end{aligned}

Now even if we don't have access to the true marginals $p(x_i,x_k)$, we still have the global distribution on the left and the factorization on the right, which we can use to model our distribution

\begin{aligned} p(\bold{x}) = \psi_1(x_1)\psi_2(x_2)\psi_3(x_3)\psi_4(x_4)\psi_{13}(x_1,x_3)\psi_{23}(x_2,x_3)\psi_{34}(x_3,x_4) \end{aligned}

which agrees with the previous factorization for the particular case $\psi_i(x_i) = p(x_i)$ and $\psi_{ij}(x_i,x_j) = \frac{p(x_i,x_j)}{p(x_i)p(x_j)}$. Now, given any factorization with respect to potential functions $\psi_{ij}(x_i,x_j)$, how can we efficiently access the marginal $p(x_1)$? To get an intuition we consider the simple scenario of a chain consisting of the three nodes $x_1, x_3$ and $x_4$

Four node tree-graph message path

\begin{aligned} p(x_1,x_3,x_4) \propto \tilde{\psi}^{(0)}_1(x_1)\tilde{\psi}^{(0)}_3(x_3)\tilde{\psi}^{(0)}_4(x_4)\tilde{\psi}^{(0)}_{13}(x_1,x_3)\tilde{\psi}^{(0)}_{34}(x_3,x_4) = \frac{\psi^{(0)}_{13}(x_1,x_3)\psi^{(0)}_{34}(x_3,x_4)}{\psi_3^{(0)}(x_3)}, \qquad \psi_3^{(0)}(x_3) = 1 \end{aligned}

We want updates of the local potentials $\psi^{(n)}_{13}(x_1,x_3)$, $\psi^{(n)}_{34}(x_3,x_4)$ and $\psi^{(n)}_{3}(x_3)$ that in the limit satisfy the constraint

\begin{aligned} \sum_{x_1}\frac{p(x_1,x_3)}{p(x_3)} = \sum_{x_4}\frac{p(x_3,x_4)}{p(x_3)} = 1 \end{aligned}

Now assume that this local consistency condition is not satisfied, that is

\begin{aligned} \sum_{x_1}\psi^{(0)}_{13}(x_1,x_3) \neq \sum_{x_4}\psi^{(0)}_{34}(x_3,x_4) \end{aligned}

At the next iteration we fix this by first updating the separator node potential $\psi^{(1)}_{3}(x_3) = \sum_{x_1} \psi^{(0)}_{13}(x_1,x_3)$

Four node tree-graph message path

and then rescaling the edge potential

\psi^{(1)}_{34}(x_3,\cdot) = \frac{\psi^{(1)}_3(x_3)\psi^{(0)}_{34}(x_3,\cdot)}{\psi_3^{(0)}(x_3)}

For the above steps to be admissible, two aspects must hold.

$(i)$: the factorization still represents the joint probability, i.e.

\begin{aligned} \frac{\psi^{(0)}_{13}(x_1,x_3)\psi^{(1)}_{34}(x_3,x_4)}{\psi_3^{(1)}(x_3)} = \frac{\psi^{(0)}_{13}(x_1,x_3)\psi_3^{(1)}(x_3)\psi^{(0)}_{34}(x_3,x_4)}{\psi_3^{(0)}(x_3)\psi_3^{(1)}(x_3)} = p(x_1,x_3,x_4) \end{aligned}

$(ii)$: the local consistency condition at $x_3$:

\begin{aligned} \sum_{x_1}\psi^{(0)}_{13}(x_1,x_3) = \psi^{(1)}_{3}(x_3) \neq \sum_{x_4}\psi^{(1)}_{34}(x_3,x_4) = \sum_{x_4}\frac{\psi^{(1)}_{3}(x_3)\psi^{(0)}_{34}(x_3,x_4)}{\psi^{(0)}_3(x_3)} \end{aligned}

which is still violated, since $\psi^{(0)}_3(x_3) = 1 \neq \sum_{x_4}\psi^{(0)}_{34}(x_3,x_4)$ by assumption. However, what we have achieved is $(iii)$: the marginalization condition at the edge $(34)$, as

\begin{aligned} \psi_{34}^{(1)}(x_3,x_4) = \sum_{x_1}\underbrace{\frac{\psi^{(0)}_{13}(x_1,x_3)\psi^{(0)}_{34}(x_3,x_4)}{\psi_3^{(0)}(x_3)}}_{= p(x_1,x_3,x_4)} = p(x_3,x_4) \end{aligned}

Four node tree-graph message path

This process of propagating the marginal from leaf node $x_1$ over the separator node $x_3$ to leaf node $x_4$ is commonly denoted as the forward pass, which yields the true edge marginal $\psi^{\ast}_{34}(x_3,x_4) = p(x_3,x_4)$. By construction we can do the same in the reverse direction by repeating the steps

  • $\psi_3^{(2)}(x_3) = \sum_{x_4}\psi^{\ast}_{34}(x_3,x_4)$
Four node tree-graph message path

and then updating the edge potential

\begin{aligned} \psi^{\ast}_{13}(x_1,x_3) = \frac{\psi_3^{(2)}(x_3)\psi_{13}^{(0)}(x_1,x_3)}{\psi_3^{(1)}(x_3)} \end{aligned}

which defines the backward pass

Four node tree-graph message path

and we check the three conditions again:

• Joint probability distribution:

\begin{aligned} \frac{\textcolor{green}{\psi^{\ast}_{13}(x_1,x_3)}\textcolor{red}{\psi^{\ast}_{34}(x_3,x_4)}}{\psi^{(2)}_3(x_3)} &= \frac{\textcolor{green}{\psi_3^{(2)}(x_3)\psi^{(0)}_{13}(x_1,x_3)}\textcolor{red}{\psi^{(1)}_{34}(x_3,x_4)}}{\textcolor{green}{\psi_3^{(1)}(x_3)}\psi_3^{(2)}(x_3)}\\ &=\frac{\textcolor{green}{\psi^{(0)}_{13}(x_1,x_3)}\textcolor{red}{\psi^{(1)}_{3}(x_3)\psi^{(0)}_{34}(x_3,x_4)}}{\textcolor{red}{\psi^{(0)}_3(x_3)}\psi_3^{(1)}(x_3)}\\ &=\frac{\textcolor{green}{\psi^{(0)}_{13}(x_1,x_3)}\textcolor{red}{\psi^{(0)}_{34}(x_3,x_4)}}{\psi_3^{(0)}(x_3)}\\ &=p(x_1,x_3,x_4) \end{aligned}

• Local consistency:

\begin{aligned} \sum_{x_1}\psi_{13}^{\ast}(x_1,x_3) &= \frac{\psi^{(2)}_3(x_3)}{\psi_3^{(1)}(x_3)}\underbrace{\sum_{x_1}\psi^{(0)}_{13}(x_1,x_3)}_{\psi_3^{(1)}(x_3)}\\ &= \psi^{(2)}_3(x_3) = \sum_{x_4}\psi^{\ast}_{34}(x_3,x_4) \end{aligned}

• Correct edge marginals:

\begin{aligned} \psi^{\ast}_{13}(x_1,x_3) &= \frac{\psi^{(2)}_{3}(x_3)\psi^{(0)}_{13}(x_1,x_3)}{\psi^{(1)}_{3}(x_3)} = \frac{\sum_{x_4}\psi_{34}^{\ast}(x_3,x_4)\,\psi^{(0)}_{13}(x_1,x_3)}{\psi^{(1)}_{3}(x_3)}\\ &=\sum_{x_4} \underbrace{\frac{\psi_3^{(1)}(x_3)\psi_{34}^{(0)}(x_3,x_4)\psi^{(0)}_{13}(x_1,x_3)}{\psi^{(1)}_{3}(x_3)\psi_3^{(0)}(x_3)}}_{= p(x_1,x_3,x_4)} = p(x_1,x_3) \end{aligned}

and consequently we obtain a new factorization of the joint probability with

\begin{aligned}\psi^{\ast}_{13}(x_1,x_3) = p(x_1,x_3),\quad \psi^{\ast}_{34}(x_3,x_4) = p(x_3,x_4), \quad \psi^{\ast}_3(x_3) = p(x_3)\end{aligned}

We can generalize the above procedure from the chain graph to the simple tree graph by including node $x_2$: adding the missing components to the factorization of $\psi_{34}^{(1)}(x_3,x_4)$ in the forward pass again yields the correct edge potential $p(x_3,x_4)$, via an additional marginalization over the new node $x_2$.

\begin{aligned} \psi_{34}^{(1)}(x_3,x_4) &= \sum_{x_1}\textcolor{red}{\sum_{x_2}} \frac{\psi^{(0)}_{13}(x_1,x_3)\psi^{(0)}_{34}(x_3,x_4)\textcolor{red}{\psi^{(0)}_{23}(x_2,x_3)}}{\psi_3^{(0)}(x_3)\textcolor{red}{\psi_3^{(0)}(x_3)}}\\ &= \frac{\textcolor{red}{\psi^{(1)}_{1\to 3}(x_3)}\,\textcolor{red}{\psi^{(1)}_{2 \to 3}(x_3)}\,\psi^{(0)}_{34}(x_3,x_4)}{\psi_3^{(0)}(x_3)^{2}}\\ &= \sum_{\textcolor{red}{x_2},x_1} p(x_1,\textcolor{red}{x_2},x_3,x_4) = p(x_3,x_4) \end{aligned}

Simply put, the above update corresponds to a forward pass where we now have a set of messages $\psi^{(1)}_{i \to 3}(x_3)$ arriving at the separator node $x_3$ from the edges connected to it. The name message passing comes from the fact that we can interpret the current marginals at $x_3$ as messages sent from the marginalized nodes through $x_3$ to the leaf node $x_4$. Repeating the steps of the backward pass, now also updating the additional edge potential $\psi_{23}(x_2,x_3)$, it follows

\begin{aligned} \psi^{\ast}_{23}(x_2,x_3) &= \frac{\psi^{(2)}_{4 \to 3}(x_3)\psi^{(0)}_{23}(x_2,x_3)}{\psi^{(1)}_{2 \to 3}(x_3)}\\ &= \frac{\sum_{x_4}\psi_{34}^{\ast}(x_3,x_4)\,\psi^{(0)}_{23}(x_2,x_3)}{\psi^{(1)}_{2 \to 3}(x_3)} \\ &=\sum_{x_4}\sum_{x_1} \frac{\psi_{2 \to 3}^{(1)}(x_3)\psi_{13}^{(0)}(x_1,x_3)\psi_{34}^{(0)}(x_3,x_4)\psi^{(0)}_{23}(x_2,x_3)}{\psi^{(1)}_{2 \to 3}(x_3)(\psi_3^{(0)}(x_3))^2}\\ &= \sum_{x_1,x_4}p(x_1,x_2,x_3,x_4) = p(x_2,x_3) \end{aligned}

It can easily be checked that all conditions hold and we obtain the correct marginals from the backward pass. The following graphic illustrates the steps

Four node tree-graph message path
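
The whole two-pass schedule fits into a few lines of code. Below is a minimal C++ sketch of the updates from this section, under illustrative assumptions: binary variables for brevity, hypothetical initial potential values, and the edge cliques $\{1,3\},\{2,3\},\{3,4\}$ arranged as a chain with separator $\{3\}$ (one valid clique-tree layout of this tree). After the forward and backward pass each edge potential is proportional to the corresponding edge marginal.

#include <array>
#include <cstdio>
#include <vector>

constexpr int K = 2;  // states per variable (binary, for brevity)
using Table = std::array<std::array<double, K>, K>;  // phi[other][x3]
using Sep = std::array<double, K>;                   // separator potential psi_3(x3)

// Marginalize an edge potential onto the separator variable x3.
Sep project(const Table& phi) {
    Sep s{};
    for (int a = 0; a < K; ++a)
        for (int x3 = 0; x3 < K; ++x3) s[x3] += phi[a][x3];
    return s;
}

// Absorb update: phi <- phi * newSep / oldSep,
// cf. psi_34^(1) = psi_3^(1) * psi_34^(0) / psi_3^(0) in the text.
void absorb(Table& phi, const Sep& newSep, const Sep& oldSep) {
    for (int a = 0; a < K; ++a)
        for (int x3 = 0; x3 < K; ++x3) phi[a][x3] *= newSep[x3] / oldSep[x3];
}

int main() {
    // Edge potentials for the cliques {1,3}, {2,3}, {3,4} along the chain.
    // The numbers are hypothetical; the joint is p ~ phi0 * phi1 * phi2 with psi_3^(0) = 1.
    std::vector<Table> phi = {
        {{{0.9, 0.1}, {0.4, 0.6}}},   // phi0(x1, x3)
        {{{0.7, 0.3}, {0.2, 0.8}}},   // phi1(x2, x3)
        {{{0.5, 0.5}, {0.3, 0.7}}},   // phi2(x4, x3)
    };
    std::vector<Sep> sep(2, Sep{1.0, 1.0});  // separators between consecutive cliques

    // Forward pass: {1,3} -> {2,3} -> {3,4}.
    for (int c = 0; c < 2; ++c) {
        Sep s = project(phi[c]);
        absorb(phi[c + 1], s, sep[c]);
        sep[c] = s;
    }
    // Backward pass: {3,4} -> {2,3} -> {1,3}.
    for (int c = 1; c >= 0; --c) {
        Sep s = project(phi[c + 1]);
        absorb(phi[c], s, sep[c]);
        sep[c] = s;
    }

    // After calibration phi2 is proportional to p(x3, x4); normalize and print.
    double z = 0.0;
    for (int x4 = 0; x4 < K; ++x4)
        for (int x3 = 0; x3 < K; ++x3) z += phi[2][x4][x3];
    for (int x3 = 0; x3 < K; ++x3)
        for (int x4 = 0; x4 < K; ++x4)
            std::printf("p(x3=%d, x4=%d) = %.4f\n", x3, x4, phi[2][x4][x3] / z);
    return 0;
}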

More generally, on a tree graph with $|V| = n$ nodes every tree-structured distribution factorizes via local marginals

\begin{aligned} p(\bold{x}) = \prod_{i \in V}p(x_i)\prod_{(i,j) \in E}\frac{p(x_i,x_j)}{p(x_i)p(x_j)}, \qquad \bold{x} = (x_1,\dots,x_n) \end{aligned}

where:

  • $p(x_i)$ are the marginal probabilities at the nodes $i \in V$ of a tree graph $\mathcal{G}^T = (V,E)$,
  • $p(x_i, x_j)$ are the pairwise joint probabilities on the edges $(i, j) \in E$,
  • the fraction $\frac{p(x_i, x_j)}{p(x_i) p(x_j)}$ is the desired representation of $\psi_{ij}(x_i,x_j)$ in terms of marginals.
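
For the four-node tree from the beginning, this general formula instantiates to

\begin{aligned} p(x_1,x_2,x_3,x_4) = p(x_1)p(x_2)p(x_3)p(x_4)\cdot\frac{p(x_1,x_3)}{p(x_1)p(x_3)}\cdot\frac{p(x_2,x_3)}{p(x_2)p(x_3)}\cdot\frac{p(x_3,x_4)}{p(x_3)p(x_4)} = \frac{p(x_1,x_3)\,p(x_2,x_3)\,p(x_3,x_4)}{p(x_3)^2} \end{aligned}

which is precisely the factorization we derived by hand above.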

That the factorization holds on any tree can be seen as follows.

First, each node $x_i$ in a tree is conditionally independent of all other nodes given its neighbors, i.e. the joint distribution satisfies the local Markov property:

p(x_i \mid x_{V \setminus \{i\}}) = p(x_i \mid x_{\text{neigh}(i)})

where $\text{neigh}(i)$ is the set of neighboring nodes of $i$. The joint distribution can be expanded using the chain rule of probability, ordering the nodes so that each parent precedes its children:

p(x_1, x_2, \dots, x_n) = \prod_{i \in V} p(x_i \mid x_{j < i}) = \prod_{i \in V} p(x_i \mid x_{\text{parents}(i)})

where $\text{parents}(i)$ refers to the parent of vertex $i$ with respect to some arbitrarily chosen root of the tree (the root itself has no parent, so its factor is simply $p(x_{\text{root}})$).

Second, for each vertex $i \in V$ the conditional distribution $p(x_i \mid x_{\text{parents}(i)})$ can be expressed through the joint marginal $p(x_i, x_{\text{parents}(i)})$ and the marginal $p(x_{\text{parents}(i)})$ via

p(x_i \mid x_{\text{parents}(i)}) = \frac{p(x_i, x_{\text{parents}(i)})}{p(x_{\text{parents}(i)})}

which after insertion into the joint probability yields the desired factorization

p(x_1, x_2, \dots, x_n) = p(x_r)\prod_{(i, j) \in E} \frac{p(x_i, x_j)}{p(x_i)} = \prod_{i \in V}p(x_i) \prod_{(i,j)\in E}\frac{p(x_i, x_j)}{p(x_i)p(x_j)}

where each edge $(i,j) \in E$ is oriented so that $i$ is the parent of $j$, $r$ denotes the root, and the last equality can be shown by induction over $n = |V|$.

As a next, more sophisticated example we consider the following graphical model with cycles on the $3 \times 3$ integer grid

Grid Graph with cycles

with an underlying distribution $p(x_1,\dots,x_9)$. We are after the marginals $p(x_1),p(x_3),p(x_7),p(x_9)$. A naive approach, say for accessing $p(x_7)$, is to take the sum over all nodes $x_k$ except $x_7$, which results in

\begin{aligned} \sum_{x_k,\, k \neq 7}p(\bold{x}) = p(x_7) \end{aligned}

and scales exponentially as $\mathcal{O}(r^{8})$ with the number of states $r$ per node. We would like to follow the same idea as before and determine $p(x_7)$ by performing message passing on a tree. However, due to the presence of cycles we need a slight generalization of the tree graph, the so-called junction tree.

A junction tree consists of the maximal cliques (subsets of fully connected nodes) of a triangulated graph, i.e. a graph in which every cycle of length greater than three has a chord. The edges are defined in terms of separator sets, i.e. the intersections of adjacent cliques. In addition, the tree must satisfy the running intersection property: for any two maximal cliques $\mathcal{C}_1,\mathcal{C}_2$ and a node $x_j \in \mathcal{C}_1 \cap \mathcal{C}_2$, all cliques $\mathcal{C}_k$ and separator sets $S_k$ on the path between $\mathcal{C}_1$ and $\mathcal{C}_2$ contain $x_j$. The following figure illustrates a valid junction tree for the above grid graph.

A valid junction tree of the grid graph
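
As a rough starting point for experimentation, the following C++ sketch builds candidate cliques by greedily grouping each unprocessed node with its one- and two-hop neighborhood. It is only a heuristic sketch, not a full junction-tree construction: a complete implementation would first triangulate the graph (e.g. by node elimination with a minimum fill-in heuristic), extract the maximal cliques of the triangulation, and connect them by a maximum-weight spanning tree over the separator sizes so that the running intersection property holds.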

#include <iostream>
#include <vector>
#include <unordered_set>

using namespace std;

struct Node {
    int id;
    vector<int> neighbors;
};

struct Clique {
    unordered_set<int> nodes;
};

class Graph {
    int V;
    vector<Node> nodes;

public:
    Graph(int V);
    void addEdge(int v, int w);
    vector<Clique> triangulate();
    void printJunctionTree(const vector<Clique>& cliques);
};

Graph::Graph(int V) {
    this->V = V;
    nodes.resize(V);
    for (int i = 0; i < V; ++i) {
        nodes[i].id = i;
    }
}

void Graph::addEdge(int v, int w) {
    nodes[v].neighbors.push_back(w);
    nodes[w].neighbors.push_back(v);
}

// Greedy clique construction: group each unprocessed node with its
// one- and two-hop neighborhood. Note: this is only a rough heuristic,
// not a proper triangulation; see the remarks above.
vector<Clique> Graph::triangulate() {
    vector<Clique> cliques;

    // Tracks which nodes are already covered by some clique
    vector<bool> processed(V, false);

    for (int i = 0; i < V; ++i) {
        if (!processed[i]) {
            // Collect the node, its neighbors and their neighbors into one clique
            Clique clique;
            clique.nodes.insert(i);
            for (int neighbor : nodes[i].neighbors) {
                clique.nodes.insert(neighbor);
                for (int n : nodes[neighbor].neighbors) {
                    if (n != i)
                        clique.nodes.insert(n);
                }
            }

            // Add the new clique to the collection
            cliques.push_back(clique);

            // Mark all nodes in the new clique as processed
            for (int node : clique.nodes)
                processed[node] = true;
        }
    }

    return cliques;
}

void Graph::printJunctionTree(const vector<Clique>& cliques) {
    cout << "Junction Tree:" << endl;
    for (size_t i = 0; i < cliques.size(); ++i) {
        cout << "Clique " << i << ": ";
        for (int node : cliques[i].nodes) {
            cout << node << " ";
        }
        cout << endl;
    }
}

int main() {
    Graph graph(6);
    graph.addEdge(0, 1);
    graph.addEdge(0, 2);
    graph.addEdge(1, 2);
    graph.addEdge(1, 3);
    graph.addEdge(2, 4);
    graph.addEdge(3, 4);
    graph.addEdge(3, 5);
    graph.addEdge(4, 5);

    vector<Clique> cliques = graph.triangulate();
    graph.printJunctionTree(cliques);

    return 0;
}
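
Once a valid junction tree has been constructed, the marginals are obtained exactly as in the tree case: the clique potentials play the role of the edge potentials, the separator potentials play the role of $\psi_3(x_3)$, and a forward pass from the leaf cliques to a root clique followed by a backward pass calibrates the tree. Afterwards every clique potential equals the marginal over its variables, from which each node marginal, e.g. $p(x_7)$, is recovered by one final summation over the remaining clique variables.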