Local Features for Medical Applications

Lifting raw data to a higher dimensional feature space while maintaining locality.

Authors
  • Dr. Dmitrij Sitenko
    Mathematical Image Processing at Ruprecht Karls University Heidelberg

The Manifold of Symmetric Positive Definite Matrices $\mathcal{P}_{d}$

For a given positive integer $d \in \mathbb{N}_+$ we next carry over the differential geometric objects to the specific open set of symmetric positive definite $d \times d$ matrices, which plays an important role in mathematics, physics, numerical analysis and imaging, and which forms an open subset of the linear subspace of symmetric matrices in $\mathbb{R}^{d\times d}$:

$$\mathcal{P}_{d} = \{S\in\mathbb{R}^{d\times d}\colon S=S^{T},\; S\ \text{is positive definite}\}.$$

The tangent space at $S\in\mathcal{P}_d$ consists of symmetric matrices

$$T_{S}\mathcal{P}_{d} = \{U \in \mathbb{R}^{d\times d}\colon U^{T}=U\}.$$

As the set $\mathcal{P}_d$ is embedded in the Euclidean space $\mathbb{R}^{d \times d}$, the simplest metric on $T_{S}\mathcal{P}_{d}$ is naturally given in terms of the Frobenius norm

$$d_F(U,V) = \|U-V\|_F, \qquad \|V\|_F = \text{tr}(V^{T}V)^{\frac{1}{2}}, \qquad U,V \in T_{S}\mathcal{P}_{d}.$$
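In code, this flat distance is a one-liner; a minimal numpy sketch (the function name is ours):

```python
import numpy as np

def frobenius_distance(U, V):
    """Flat distance d_F(U, V) = ||U - V||_F between tangent vectors."""
    return np.linalg.norm(U - V, ord='fro')
```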

We will use $\mathcal{P}_d$ as a backbone for prototype extraction in connection with the labeling task. More specifically, we adopt various geometries on $\mathcal{P}_d$ by utilizing alternative metrics and clarify their influence on the properties of the resulting means, defined through an optimization problem, along with the main computational tools for mean approximation.

Affine Invariant Riemannian Metric

Geometrically, the set of symmetric positive definite matrices comprises the interior of a convex cone in the Euclidean space $\mathbb{R}^{d\times d}$ with zero curvature. Therefore, one major limitation of the flat metric is the finite distance to points on the boundary, i.e. matrices with zero eigenvalues. Instead, equipping the tangent space $T_S\mathcal{P}_d$ with the Riemannian metric

$$g_{S}(U,V) = \text{tr}(S^{-1} U S^{-1} V),\qquad U,V\in T_{S}\mathcal{P}_{d},$$

commonly known as the affine invariant Riemannian metric, alleviates the problem and turns the set $\mathcal{P}_d$ into a negatively curved Riemannian manifold with a natural Riemannian distance

$$d_{\mathcal{P}_{d}}(S,T) = \Big(\sum_{i \in [d]}\big(\log\lambda_{i}(S,T)\big)^{2}\Big)^{1/2},$$

where $\lambda_i(S,T)$ are the generalized eigenvalues of the pencil $(S,T)$, that is,

$$Tv = \lambda_i(S,T)\, S v, \quad \text{for some} \quad v \in \mathbb{R}^d \quad \text{with} \quad \|v\| = 1.$$
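Numerically, the distance reduces to one generalized symmetric eigenvalue solve; a minimal scipy sketch (the function name is ours):

```python
import numpy as np
from scipy.linalg import eigh

def airm_distance(S, T):
    """Affine invariant Riemannian distance on P_d, computed from the
    generalized eigenvalues of the pencil (S, T)."""
    # eigh(T, S) solves T v = lam * S v for symmetric T and SPD S
    lam = eigh(T, S, eigvals_only=True)
    return np.sqrt(np.sum(np.log(lam) ** 2))
```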

For this particular case the exponential map reads

$$\exp_{S}(U) = S^{\frac{1}{2}}\,\text{expm}\big(S^{-\frac{1}{2}}US^{-\frac{1}{2}}\big)\,S^{\frac{1}{2}}, \qquad \text{expm}(V) = \sum_{i = 0}^{\infty}\frac{V^i}{i!}, \quad V \in \mathbb{R}^{d\times d}.$$

Here $\text{expm}(\cdot)$ denotes the matrix exponential, whose inverse is the matrix logarithm $\text{logm}=\text{expm}^{-1}$,

$$\text{logm}(S) = \sum_{i = 1}^{\infty}\frac{(-1)^{i-1}}{i}(S-I)^i.$$
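A direct transcription of the exponential map and its inverse in numpy/scipy might read as follows (a sketch; `np.real` discards the numerically tiny imaginary parts the dense matrix functions can return):

```python
import numpy as np
from scipy.linalg import expm, logm, sqrtm

def exp_map(S, U):
    """Riemannian exponential exp_S(U) = S^(1/2) expm(S^(-1/2) U S^(-1/2)) S^(1/2)."""
    R = np.real(sqrtm(S))        # S^(1/2)
    R_inv = np.linalg.inv(R)     # S^(-1/2)
    return R @ expm(R_inv @ U @ R_inv) @ R

def log_map(S, P):
    """Inverse map: the tangent vector U at S with exp_S(U) = P."""
    R = np.real(sqrtm(S))
    R_inv = np.linalg.inv(R)
    return R @ np.real(logm(R_inv @ P @ R_inv)) @ R
```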

Finally, given a smooth objective function $J \colon \mathcal{P}_{d} \to \mathbb{R}$, the Riemannian gradient is given by

$$\nabla J(S) = S\big(\partial J(S)\big)S \in T_{S}\mathcal{P}_{d},$$

where the symmetric matrix $\partial J(S)$ denotes the Euclidean gradient of $J$ at $S$. Since $\mathcal{P}_{d}$ is a simply connected, complete and nonpositively curved Riemannian manifold, the exponential map is globally defined and bijective, and the Riemannian mean always exists and is uniquely defined as the minimizer of the objective function below, after substituting the Riemannian distance.

Given a set of positive definite matrices

$$\mathcal{S}_{N} = \{(S_1,\omega_{1}),\dots,(S_{N},\omega_{N})\} \subset \mathcal{P}_d \times \mathbb{R}_{>0}$$

together with positive weights $\omega_{i}$, we next focus on the solution of the mean problem for each specific geometry,

$$\overline{S} = \arg\min_{S \in \mathcal{P}_{d}} J(S;\mathcal{S}_{N}),\qquad J(S;\mathcal{S}_{N}) = \sum_{i \in [N]} \omega_{i}\, d_{\mathcal{P}_{d}}^{2}(S,S_{i}),$$

with the distance $d_{\mathcal{P}_{d}}$ given above. Due to invertibility of the exponential map, each tangent vector $U \in T_S \mathcal{P}_d$ is represented at the base point $S$ by

$$U = \exp_{S}^{-1} \circ \exp_{S}(U) = S^{\frac{1}{2}}\, \text{logm}\big(S^{-\frac{1}{2}}\exp_{S}(U)\,S^{-\frac{1}{2}}\big)\, S^{\frac{1}{2}}.$$

As a result, the optimality condition reads

$$\sum_{i \in [N]} \omega_{i}\, \overline{S}^{\frac{1}{2}}\, \text{logm}\big(\overline{S}^{-\frac{1}{2}}S_{i}\, \overline{S}^{-\frac{1}{2}}\big)\, \overline{S}^{\frac{1}{2}} = 0.$$

Apart from the specific case $N = 2$, where the Riemannian mean is directly given by $\overline{S} = S_1(S_1^{-1}S_2)^{\frac{1}{2}}$, the solution to this nonlinear equation is not explicitly known. A remedy is the corresponding basic fixed point iteration initialized at $S_0 \in \mathcal{P}_d$,

$$S_{t+1} = S_t\,\text{expm}\Big(-h_t\sum_{i \in [N]}\omega_i\,\text{logm}(S_i^{-1}S_t)\Big),$$

where the step size $h_t > 0$ can be defined heuristically or by following more sophisticated line search strategies. However, this iteration is limited in the sense that convergence is not theoretically guaranteed, and if the iteration converges, then only at a linear rate. This can be remedied by restricting the matrices $\{S_1,\dots,S_N\}$ to pairwise commute, resulting in the following variant of Riemannian mean recovery that is guaranteed to converge at a quadratic rate. Using the parametrization corresponding to the Cholesky decomposition

$$S = L L^{T},$$

and replacing the map of the fixed point iteration with its linearization leads to the fixed point map

$$F_{\tau}(L; \mathcal{S}_{N}) = L L^{T} - \tau \sum_{i\in [N]} \omega_{i}\, L\, \text{logm}\big(L^{T}S_i^{-1}L\big)\, L^{T},$$

with damping parameter $\tau > 0$. Comparing with the gradient iteration above shows that the basic idea is to compute the Riemannian mean $\overline{S}$ as a fixed point of the iteration

$$\overline{S} = \lim_{t \to \infty} S_{(t)},\qquad S_{(t+1)} = F_{\tau}(S_{(t)};\mathcal{S}_{N}).$$

The algorithm below provides a refined variant of this iteration, including adaptive step size selection.

Fixed Point Iteration for Computing the Riemannian Matrix Mean

Initialization

  • $\epsilon$ (termination threshold)

  • $t = 0$, $S_{(0)} = L L^{T}$, with $S_{(0)}$ a suitable initial guess

  • $c_0 = \frac{\lambda_{\max}(S_{(0)})}{\lambda_{\min}(S_{(0)})}$ (condition number)

  • $\{\alpha_0, \beta_0\} = \big\{\frac{\log(c_0)}{c_0-1},\; c_0 \frac{\log(c_0)}{c_0-1}\big\}$ (step size selection parameters)

  • $\tau_0 = \frac{2}{\alpha_0 + \beta_0}$

  • $S_{(1)} = F_{\tau_0}(L; \mathcal{S}_{N})$ (first iteration step)

  • $\epsilon_{1} = \Big\|\sum_{i \in [N]} \omega_i\, \text{logm}\big(S_{(1)}^{\frac{1}{2}} S^{-1}_i S_{(1)}^{\frac{1}{2}}\big)\Big\|_{F}$, $t = 1$

While $\epsilon_{t} > \epsilon$:

  1. $S_{(t)} = L L^{T}$
  2. $c_{t} = \frac{\lambda_{\max}(S_{(t)})}{\lambda_{\min}(S_{(t)})}$
  3. If $c_{t} = 1$, stop
  4. $\{\alpha_{t}, \beta_{t}\} = \big\{\frac{\log(c_t)}{c_t-1},\; c_t \frac{\log(c_t)}{c_t-1}\big\}$
  5. $\tau_{t} = \frac{2}{\alpha_{t} + \beta_{t}}$
  6. $S_{(t+1)} = F_{\tau_{t}}(L; \mathcal{S}_{N})$
  7. $\epsilon_{t+1} := \Big\|\sum_{i \in [N]} \omega_i\, \text{logm}\big(S_{(t+1)}^{\frac{1}{2}} S^{-1}_i S_{(t+1)}^{\frac{1}{2}}\big)\Big\|_{F}$, $t \leftarrow t + 1$
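A compact Python transcription of this scheme might look as follows; it is a sketch under our reading of the algorithm (we initialize with the arithmetic mean as a generic guess, and evaluate the residual via $\text{logm}(L^{T}S_i^{-1}L)$, which is orthogonally similar to the expression above and hence has the same Frobenius norm):

```python
import numpy as np
from scipy.linalg import cholesky, logm

def riemannian_mean(S_list, w, eps=1e-10, max_iter=100):
    """Fixed point iteration for the Riemannian matrix mean with the
    adaptive step size tau_t = 2 / (alpha_t + beta_t)."""
    d = S_list[0].shape[0]
    S = sum(wi * Si for wi, Si in zip(w, S_list))   # generic initial guess
    S_inv_list = [np.linalg.inv(Si) for Si in S_list]
    for _ in range(max_iter):
        lam = np.linalg.eigvalsh(S)
        c = lam[-1] / lam[0]                        # condition number of S_(t)
        if np.isclose(c, 1.0):
            break
        alpha = np.log(c) / (c - 1.0)
        tau = 2.0 / (alpha + c * alpha)             # beta_t = c_t * alpha_t
        L = cholesky(S, lower=True)                 # S_(t) = L L^T
        # residual; logm(L^T S_i^{-1} L) is orthogonally similar to
        # logm(S^(1/2) S_i^{-1} S^(1/2)), so the Frobenius norms agree
        G = sum(wi * np.real(logm(L.T @ Si_inv @ L))
                for wi, Si_inv in zip(w, S_inv_list))
        if np.linalg.norm(G, 'fro') < eps:
            break
        S = L @ (np.eye(d) - tau * G) @ L.T         # linearized fixed point step
    return S
```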

Log-Euclidean Metric

Leveraging the specific properties of the mappings $\text{expm}$ and $\text{logm}$, an alternative metric was proposed (among several other ones). More precisely, introducing the operations

$$S_1 \odot S_2 = \text{expm}\big(\text{logm}(S_1) + \text{logm}(S_2)\big), \qquad \lambda \cdot S = \text{expm}\big(\lambda\, \text{logm}(S)\big)$$

prescribes on $\mathcal{P}_d$ a Lie group structure, where $(\mathcal{P}_{d},\odot,\cdot)$ is isomorphic to a vector space with $\odot$ playing the role of addition. The log-Euclidean metric for two symmetric positive definite matrices $S_1,S_2 \in \mathcal{P}_d$ is then defined by the Euclidean inner product on the flat space of zero curvature after applying the mapping $\text{logm}\colon\mathcal{P}_d\to T_S \mathcal{P}_d$, and provides a lower bound to the affine invariant Riemannian distance:

$$d_{le}(S_1,S_2) = \|\text{logm}(S_1)-\text{logm}(S_2)\|_F, \qquad S_1,S_2 \in \mathcal{P}_d.$$

The corresponding Riemannian mean of the set $\mathcal{S}_{N}$ has the closed form expression

$$\overline{S} = \text{expm}\Big(\sum_{i\in [N]} w_{i}\,\text{logm}(S_{i})\Big),$$

which is analogous in form to the arithmetic mean. While computing this mean is considerably cheaper than approximating the Riemannian mean with the fixed point algorithm above, the critical drawback of the log-Euclidean metric is that it does not take into account the structure (curvature) of the manifold $\mathcal{P}_{d}$. Therefore, in the next section, we additionally consider another approximation of the Riemannian mean that better fits the underlying geometry and avoids the expensive computation of generalized eigenvalues, resulting in a more efficient mean evaluation than for the mean associated with the affine invariant metric.
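In code, this closed form is only a few lines; a minimal numpy/scipy sketch (the function name is ours):

```python
import numpy as np
from scipy.linalg import expm, logm

def log_euclidean_mean(S_list, w):
    """Closed-form log-Euclidean mean expm(sum_i w_i logm(S_i))."""
    return expm(sum(wi * np.real(logm(Si)) for wi, Si in zip(w, S_list)))
```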

S-Divergence

In this section we make use of divergence functions and present a general approach to approximating the objective function, replacing the intractable, problem specific squared Riemannian distance by a divergence function

$$D(p,q) \approx \frac{1}{2} d_{g}^{2}(p,q).$$

A basic property of such divergences is that, for any feasible $p$, the Hessian with respect to the first argument is positive definite. In fact, suitable divergence functions $D$ locally recover in this way the metric tensor of the underlying manifold $\mathcal{M}$, which qualifies them as surrogates for the squared Riemannian distance.

| Properties of mean $\mathcal{G}(\mathcal{S}_N)$ | Riemann. | Log-Eucl. | Stein div. |
| --- | --- | --- | --- |
| Invariance under reordering of $\mathcal{S}_N$ | ✔ | ✔ | ✔ |
| Congruence invariance: $\mathcal{S}_N \to A\mathcal{S}_N A^{T}$ for $A \in GL(d)$ | ✔ | $A \in SO(d)$ | ✔ |
| Self-duality: $(\mathcal{G}(\mathcal{S}_N))^{-1} = \mathcal{G}(\mathcal{S}_N^{-1})$ | ✔ | ✔ | ✔ |
| Joint homogeneity: $\mathcal{G}(\alpha_1S_1,\dots,\alpha_NS_N) = (\prod_{i \in [N]}\alpha_i)^{\frac{1}{N}}\,\mathcal{G}(\mathcal{S}_N)$ | ✔ | ✔ | ✔ |
| Determinant identity: $\det \mathcal{G}(\mathcal{S}_N) = (\prod_{i \in [N]} \det S_i)^{\frac{1}{N}}$ | ✔ | ✔ | ✔ |
| Commuting case: if the $S_i$ commute pairwise, $\mathcal{G}(\mathcal{S}_N) = (\prod_{i \in [N]} S_i)^{\frac{1}{N}}$ | ✔ | ✔ | ✔ |

For the present case $\mathcal{M}=\mathcal{P}_{d}$ of interest, a suitable choice is the divergence function called the Stein divergence. For $S_{1},S_{2} \in \mathcal{P}_{d}$ it is defined as

$$D_{s}(S_{1},S_{2}) = \log\det\Big(\frac{S_{1}+S_{2}}{2}\Big) - \frac{1}{2}\log\det(S_{1} S_{2}),$$

which is not a metric on $\mathcal{P}_d$ due to the lack of the triangle inequality. However, taking the square root of $D_s$ yields a metric $d_{S}(\cdot,\cdot)$ on $\mathcal{P}_d$. Thus, replacing the squared Riemannian distance in the mean problem by $D_{s}$ provides the approximate objective

$$\overline{S} = \arg\min_{S\in\mathcal{P}_{d}} J_{s}(S;\mathcal{S}_{N}),\qquad J_{s}(S;\mathcal{S}_{N}) = \sum_{i\in [N]} w_{i}\, D_{s}(S,S_{i}),$$

which allows mean recovery while avoiding the numerically involved generalized eigenvalue problem. The resulting Riemannian gradient flow reads

$$\dot{S} = -\nabla J_{s}(S;\mathcal{S}_{N}) = \frac{1}{2}\, S\big(S^{-1} - R(S;\mathcal{S}_{N})\big)S,$$

with

$$R(S;\mathcal{S}_{N}) = \sum_{i\in [N]} w_{i}\Big(\frac{S+S_{i}}{2}\Big)^{-1}.$$

Discretizing the flow using the geometric explicit Euler scheme with step size $h$, and using the log-Euclidean mean as the initial point $S_{(0)}$, defines the algorithm listed below.

Initialization

$\epsilon$ (termination threshold)


$t = 0$, $S_{(0)}$ the log-Euclidean mean (initial point)


$\epsilon_{0} > \epsilon$ (any such value $\epsilon_{0}$)


While $\epsilon_{t} > \epsilon$:

  1. $L L^{T} = S_{(t)}$

  2. $L_{i} L_{i}^{T} = \frac{S_{(t)} + S_{i}}{2}$ for $i \in [N]$

  3. $U = I - S_{(t)}^{\frac{1}{2}}\Big(\sum_{i\in [N]} \omega_{i}(L_{i}L_{i}^{T})^{-1}\Big) S_{(t)}^{\frac{1}{2}}$

  4. $S_{(t+1)} = S_{(t)}^{\frac{1}{2}}\,\text{expm}\big(\frac{h}{2} U\big)\, S_{(t)}^{\frac{1}{2}}$

  5. $\epsilon_{t+1} := \|U\|_{F}$, $t \leftarrow t + 1$

Computing the Geometric Matrix Mean Based on the $S$-divergence.
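A direct numpy/scipy sketch of the divergence and of the Euler scheme above (function names are ours; `np.linalg.slogdet` is used for numerically stable log-determinants):

```python
import numpy as np
from scipy.linalg import expm, logm, sqrtm

def stein_divergence(S1, S2):
    """Stein divergence D_s(S1, S2) = logdet((S1+S2)/2) - (1/2) logdet(S1 S2)."""
    ld_mid = np.linalg.slogdet(0.5 * (S1 + S2))[1]
    return ld_mid - 0.5 * (np.linalg.slogdet(S1)[1] + np.linalg.slogdet(S2)[1])

def stein_mean(S_list, w, h=1.0, eps=1e-10, max_iter=100):
    """Geometric explicit Euler scheme for the S-divergence mean,
    initialized at the log-Euclidean mean."""
    d = S_list[0].shape[0]
    # log-Euclidean mean as initial point S_(0)
    S = expm(sum(wi * np.real(logm(Si)) for wi, Si in zip(w, S_list)))
    for _ in range(max_iter):
        R = sum(wi * np.linalg.inv(0.5 * (S + Si)) for wi, Si in zip(w, S_list))
        S_half = np.real(sqrtm(S))
        U = np.eye(d) - S_half @ R @ S_half
        S = S_half @ expm(0.5 * h * U) @ S_half
        if np.linalg.norm(U, 'fro') < eps:   # epsilon_{t+1} = ||U||_F
            break
    return S
```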

The key advantage of the $S$-divergence based algorithm over the fixed point iteration for the Riemannian mean is that it respects the geometry of $\mathcal{P}_d$ while remaining numerically efficient. We will provide an experimental evaluation and a qualitative comparison of the mean properties with respect to the aforementioned metrics while extracting prototypes for the labeling task; application details are reported in a later chapter. We conclude by listing the main properties of the resulting means when relying on the Riemannian metric, the log-Euclidean metric and the Stein divergence, summarized in the table above.