Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with...

45
Rational Krylov for Stieltjes matrix functions: convergence and pole selection Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0

Transcript of Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with...

Page 1: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Rational Krylov for Stieltjes matrix functions: convergenceand pole selection

Leonardo Robol, UniPI (joint work with S. Massei, EPFL)13 Febbraio 2020, Montecatini, Convegno GNCS

0

Page 2: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Progetto GNCS Giovani ricercatori

• Progetto GNCS “Giovani ricercatori” 2018/2019.• Titolo: “Metodi di proiezione per equazioni di matrici e sistemi lineari con operatori

definiti tramite somme di prodotti di Kronecker, e soluzioni con struttura di rango.”• Supporto per la partecipazione alle conferenze ILAS2019 (Rio de Janeiro) e ICIAM2019

(Valencia).

1

Page 3: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Motivation: Fractional diffusion equations

We are concerned with the “fractional equivalent” of the 1D/2D Laplacian. That is, instead ofconsidering

∂2u∂x2 = f (x), ∆u = f (x , y)

we deal with∂αu∂yα = f (x), ∆αu = f (x , y)

for 1 < α < 2.

Fractional diffusion allow to model nonlocal behavior.

• Useful in describing anomalous diffusion phenomena (plasma physics, financial markets,. . . )

• Several (non-equivalent) definitions in the literature: Riemann-Liouville, Caputo,Grunwald-Letkinov, Matrix Transform Method

2

Page 4: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Motivation: Matrix Transform Method

Let A be the usual discretization of the Laplacian operator.

One way to define the discrete operator M representing the derivative of order α is to imposethat

M 2α = A ⇐⇒ M = Aα

2 (1 < α < 2)

• This follows by “composition” of derivatives• The problem is recast as computing x := f (A)v , with f (z) = z−α2 .

3

Page 5: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Evaluation of matrix functions

Let f (z) : Ω→ R be an analytic function on Ω ⊆ R+.

Problem 1. Given a symmetric positive definite matrix A ∈ Rn×n with eigenvalues in[λmin, λmax] ⊂ Ω and v ∈ Rn compute

x := f (A)v .

Applications: system of ODEs, exponential integrators, fractional diffusion problems, networkcommunicability measures, control theory, ... .

4

Page 6: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Subspace projection methods

Let U ⊂ Rn be a `-dimensional subspace (` n) with an orthogonal basis U = [u1| . . . |u`] andA` = U∗AU, v` = U∗v be the projections of A and v on U .

• Linear systemsx = A−1v ≈ x` := UA−1

` v`.

• Matrix functionsx = f (A)v ≈ x` := Uf (A`)v`.

If we use the Krylov subspace U = K`(A, v) := spanv ,Av , . . . ,A`−1v then,

‖x − x`‖2 ≤ C · minp(z)∈P`−1

maxz∈[λmin,λmax]

|p(z)− f (z)|,

where P` := poly of degree ≤ ` and C is independent on A and f .

5

Page 7: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Subspace projection methods

Let U ⊂ Rn be a `-dimensional subspace (` n) with an orthogonal basis U = [u1| . . . |u`] andA` = U∗AU, v` = U∗v be the projections of A and v on U .

• Linear systemsx = A−1v ≈ x` := UA−1

` v`.

• Matrix functionsx = f (A)v ≈ x` := Uf (A`)v`.

If we use the Krylov subspace U = K`(A, v) := spanv ,Av , . . . ,A`−1v then,

‖x − x`‖2 ≤ C · minp(z)∈P`−1

maxz∈[λmin,λmax]

|p(z)− f (z)|,

where P` := poly of degree ≤ ` and C is independent on A and f .

5

Page 8: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Polynomial approximation does not always work...

Sometimes the convergence of polynomial approximation struggles, e.g. x = A− 12 v ,

[λmin, λmax] ≈ [10−5, 4], n = 1000.

0 50 100 150 200

10−3

10−1

101

`

‖x − x`‖2

6

Page 9: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Rational Krylov method

Rational Krylov subspace. Given Σ` := σ1, . . . , σ` ⊂ C it is defined as1

RK`(A, v ,Σ`) : = q`(A)−1K`+1(A, v) =

p(A)q`(A) v : p(z) ∈ P`

= spanv , (σ1I − A)−1v , . . . , (σ`I − A)−1v

where q`(z) :=∏

j(z − σj)−1.

If U = RK`(A, v ,Σ`) we get a problem of rational approximation with fixed poles

‖x − x`‖2 ≤ C · minr(z)∈ P`q`(z)

maxz∈[λmin,λmax]

|r(z)− f (z)|.

1last equality is valid only for distinct poles

7

Page 10: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Rational approximation does not always work...

• Uniform approximations of highly oscillatory functions on the whole positive real line arenot possible for polynomials or rational functions

• In general a number of steps dependent on the norm of the operator is needed forconvergence2.

• E.g. x = cos(A)v , where ‖A‖2 ≈ 105, n = 1000

0 20 40 60 80 100

10−3

10−1

`

‖x−

x `‖ 2

Ext. KrylovRat. Krylov

2V. Grimm, M. Hochbruck. Rational approximation to trigonometric operators, BIT 2006.

8

Page 11: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Stieltjes functions

A favourable case is when f (z) : R+ → R if defined via a Stieltjes integral:

1. f (z) =∫ ∞

0

1z + t dµ(t) Cauchy-Stieltjes/Markov function

2. f (z) =∫ ∞

0e−ztdµ(t) Laplace-Stieltjes function

where dµ(t) is a non negative measure on R+.

Examples of functions in these classes are

1. z−α = sin(απ)π

∫ ∞0

t−αz + t dt α ∈ (0, 1), log(1 + z)

z ,e−t√

z − 1z .

2. e−z ,1− e−z

z , ϕj(z) :=∫ ∞

0e−tz [max1− t, 0]j−1

(j − 1)! dt.

9

Page 12: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Stieltjes functions

A favourable case is when f (z) : R+ → R if defined via a Stieltjes integral:

1. f (z) =∫ ∞

0

1z + t dµ(t) Cauchy-Stieltjes/Markov function

2. f (z) =∫ ∞

0e−ztdµ(t) Laplace-Stieltjes function

where dµ(t) is a non negative measure on R+.

Examples of functions in these classes are

1. z−α = sin(απ)π

∫ ∞0

t−αz + t dt α ∈ (0, 1), log(1 + z)

z ,e−t√

z − 1z .

2. e−z ,1− e−z

z , ϕj(z) :=∫ ∞

0e−tz [max1− t, 0]j−1

(j − 1)! dt.

9

Page 13: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Stieltjes functions

• Laplace Stieltjes functions on (0,∞) are also known as completely monotone functions,i.e. those such that (−1)j f (j)(z) > 0 ∀z > 0,

• By considering dµ(t) to be a finite sum of Dirac deltas we have that rational functions ofthe form

f (z) =h∑

j=1

αjz − βj

, αj > 0, βj < 0,

are Cauchy-Stieltjes.• It holds (Bernstein’s theorem3)

z−1 ∈ Cauchy − Stieltjes ⊂ Laplace − Stieltjes.

3S. Bernstein. Sur les fonctions absolument monotones, Acta Mathematica 1929

10

Page 14: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Evaluation of Stieltjes matrix functions

Let f (z) : R+ → R be an analytic function defined via a Stieltjes integral:

f (z) =∫ ∞

0g(t, z)dµ(t) g(t, z) ∈

1

z + t , e−zt.

Problem 1. Given a symmetric positive definite matrix A ∈ Rn×n with eigenvalues in[λmin, λmax] ⊂ Ω and v ∈ Rn compute

x := f (A)v .

Goal. Provide selection strategies for Σ` and estimates of the error ‖x − x`‖2.

11

Page 15: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Outline

• Problem 1:x = f (A)v

• Cauchy-Stieltjes case• Laplace-Stieltjes case

• Problem 2:x = f (I ⊗ A + B ⊗ I)vec(low-rank matrix)

• Laplace-Stieltjes case• Cauchy-Stieltjes case

12

Page 16: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Cauchy-Stieltjes functions: Sketching the idea

• By writing the integral formulation of f we get:

f (A)v =∫ ∞

0(A + tI)−1v dµ(t) ≈

∫ ∞0

U(A` + tI)−1v` dµ(t).

• So we need a space RK(A, v ,Σ`) that approximates simultaneously well (A + tI)−1v forany t > 0.

• Problem: Krylov subspaces are not shift invariant (apart from polynomial Krylov).

13

Page 17: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Cauchy-Stieltjes functions: Sketching the idea

• Approximating (A + tI)−1 ∀t ≥ 0, is linked to approximate 1λ+t in the strip

[λmin, λmax]× [0,∞].• We consider a Skeleton approximation of the form

1λ+ t ≈ fskel (λ, t) :=

[1

λ+ t1, . . . ,

1λ+ t`

]M−1

1

λ1+t...1

λ`+t

, Mij =(

1λi + tj

).

14

Page 18: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Cauchy-Stieltjes functions: Sketching the idea

fskel (λ, t) :=[

1λ+ t1

, . . . ,1

λ+ t`

]M−1

1

λ1+t...1

λ`+t

Skeleton approximations have 2 crucial properties:

1 Explicit expression of the residual error4:

1λ+ t − fskel (λ, t) = 1

λ+ t ·r(λ)

r(−t) , r(z) :=∏

j

z − λjz + tj

.

2 fskel (λ, t) is a rational function in λ with poles −t1, . . . ,−t`.If we set Σ` = −t1, . . . ,−t`, then

fskel (A, t)v ∈ RK`(A, v ,Σ`) ∀t ∈ [0,∞].

4Oseledets. Lower bounds for separable approximations of the Hilbert kernel, Sbornik 2007.

15

Page 19: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Cauchy-Stieltjes functions: Sketching the idea

fskel (λ, t) :=[

1λ+ t1

, . . . ,1

λ+ t`

]M−1

1

λ1+t...1

λ`+t

Skeleton approximations have 2 crucial properties:

1 Explicit expression of the residual error4:

1λ+ t − fskel (λ, t) = 1

λ+ t ·r(λ)

r(−t) , r(z) :=∏

j

z − λjz + tj

.

2 fskel (λ, t) is a rational function in λ with poles −t1, . . . ,−t`.If we set Σ` = −t1, . . . ,−t`, then

fskel (A, t)v ∈ RK`(A, v ,Σ`) ∀t ∈ [0,∞].4Oseledets. Lower bounds for separable approximations of the Hilbert kernel, Sbornik 2007.

15

Page 20: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Cauchy-Stieltjes functions: Sketching the idea

• We have the following point-wise estimate of the error 5

‖(tI + A)−1v − U(tI + A`)−1v`‖2 ≤2‖v‖2λmin + t min

r(z)∈P`Σ`

maxz∈[λmin,λmax] |r(z)|minz∈[−∞,0] |r(z)| .

• Minimizing the right hand side over Σ` means solving

minr(z)∈R`,`

maxz∈[λmin,λmax] |r(z)|minz∈[−∞,0] |r(z)| ,

where R`,` is the set of (`, `) rational functions.

5An estimate of the L2-norm in:Druskin, Knizhnerman, Zaslavsky. Solution of large scale evolutionary problems using rational Krylov subspaceswith optimized shifts, SISC 2009.

16

Page 21: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Cauchy-Stieltjes functions: Sketching the idea

• We have the following point-wise estimate of the error 5

‖(tI + A)−1v − U(tI + A`)−1v`‖2 ≤2‖v‖2λmin + t min

r(z)∈P`Σ`

maxz∈[λmin,λmax] |r(z)|minz∈[−∞,0] |r(z)| .

• Minimizing the right hand side over Σ` means solving

minr(z)∈R`,`

maxz∈[λmin,λmax] |r(z)|minz∈[−∞,0] |r(z)| ,

where R`,` is the set of (`, `) rational functions.

5An estimate of the L2-norm in:Druskin, Knizhnerman, Zaslavsky. Solution of large scale evolutionary problems using rational Krylov subspaceswith optimized shifts, SISC 2009.

16

Page 22: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Cauchy-Stieltjes functions: Sketching the idea

• This problem can be transformed via a Moebius map T (z) = αz+βγz+δ into

minr(z)∈R`,`

maxz∈[a,b] |r(z)|minz∈[−b,−a] |r(z)| . (1)

that Zolotarev solved ≈ 140 years ago.

• In particular, we know explicitly the optimal poles Σ` of the rational function that solves(1).So the optimal poles Σ∗` for our starting problem are given by

Σ∗` := T−1(Σ`).

17

Page 23: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Theoretical result

TheoremLet f (z) be a Cauchy-Stieltjes function, U be an orthogonal basis of RK`(A, v ,Σ∗` ) andx` = Uf (A`)v`. Then

‖f (A)v − x`‖2 ≤ 8f (λmin)‖v‖2ρ`,

where ρ := exp(− π2

log(

16λmaxλmin

)).

Similar results by Beckermann and Reichel6.

6Beckerman, Reichel. Error estimate and evaluation of matrix functions via the Faber transform, SINUM 2009

18

Page 24: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Numerical results

x = A− 12 v , A = trid(−1, 2,−1) ∈ R1000×1000

0 10 20 30 40 5010−11

10−6

10−1

104

Iterations (`)

‖x−

x `‖ 2

BoundExt. KrylovPol. KrylovRat. Krylov

markovfunmv7

7Guttel, Knizhnerman. A black-box rational Arnoldi variant for Cauchy-Stieltjes matrix functions, BIT 2013

19

Page 25: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Laplace-Stieltjes functions

f (λ) =∫ ∞

0e−λtdµ(t)

• The core idea is to exploit the relation:

e−λt = 12πi

∫iR

est

λ+ s ds

to link (parameter dependent) resolvents with exponentials.

• Then, using the approximation

e−λt ≈∫

iRfskel (λ, s)est ds

yields‖e−tAv − Ue−tA`v`‖2 ≤ 4γ` max

z∈[λmin,λmax]|r(z)|, r(z) :=

∏j

z − λjz + λj

with U orthogonal basis of RK`(A, v , −λ1, . . . ,−λ`).

20

Page 26: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Laplace-Stieltjes functions

f (λ) =∫ ∞

0e−λtdµ(t)

• The core idea is to exploit the relation:

e−λt = 12πi

∫iR

est

λ+ s ds

to link (parameter dependent) resolvents with exponentials.

• Then, using the approximation

e−λt ≈∫

iRfskel (λ, s)est ds

yields‖e−tAv − Ue−tA`v`‖2 ≤ 4γ` max

z∈[λmin,λmax]|r(z)|, r(z) :=

∏j

z − λjz + λj

with U orthogonal basis of RK`(A, v , −λ1, . . . ,−λ`).

20

Page 27: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Laplace-Stieltjes functions

TheoremLet f (z) be a Laplace-Stieltjes function, U be an orthogonal basis of RK`(A, v ,Σ`) andx` = Uf (A`)v`. Then there exists a choice of Σ` such that

‖f (A)v − x`‖2 ≤ 8γ`f (0)‖v‖2ρ`2 ρ := exp

− π2

log(

4λmaxλmin

) ,

where γ` := 2.23 + 2π log(4`λmax

λmin).

Conjecture: The result holds with γ` = 1.

21

Page 28: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

2D problems with tensor structure

• When considering discretization (say, finite differences) of

∂2u∂x2 + ∂2u

∂y 2 = f (x , y)

on a rectangular domain one get a linear system of the form

(A⊗ I + I ⊗ A)︸ ︷︷ ︸M

x = v .

• Moreover, if f (x , y) is a regular function or has a small support (e.g. a point source) then

v ≈ vec(C)

where C is a low-rank matrix, i.e. C = WZ T for some tall and skinny matrices W ,Z .• Applying the matrix transform method to the fractional analogous of this problem requires

to compute f (M)vec(C).

22

Page 29: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Function evaluation and Kronecker structure

Problem 2. Given two symmetric positive definite matrices A,B ∈ Rn×n with eigenvalues in[λmin, λmax] ⊂ Ω and v = vec(C) = vec(low-rank matrix) ∈ Rn2 , compute

x := f (I ⊗ A + B ⊗ I)v , or equivalently X := vec−1(f (I ⊗ A + B ⊗ I)v).

• If f (z) = z−1, then this is equivalent to solve the Sylvester equation: we know thelow-rank structure is transferred from C to X .

• Does this hold in more general cases? Yes, it does.• Once again, we focus on the case where f (z) is a Stieltjes function:

f (z) =∫ ∞

0g(t, z)dµ(t), g(t, z) ∈

1

z + t , e−zt.

23

Page 30: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Function evaluation and Kronecker structure

Problem 2. Given two symmetric positive definite matrices A,B ∈ Rn×n with eigenvalues in[λmin, λmax] ⊂ Ω and v = vec(C) = vec(low-rank matrix) ∈ Rn2 , compute

x := f (I ⊗ A + B ⊗ I)v , or equivalently X := vec−1(f (I ⊗ A + B ⊗ I)v).

• If f (z) = z−1, then this is equivalent to solve the Sylvester equation: we know thelow-rank structure is transferred from C to X .

• Does this hold in more general cases? Yes, it does.• Once again, we focus on the case where f (z) is a Stieltjes function:

f (z) =∫ ∞

0g(t, z)dµ(t), g(t, z) ∈

1

z + t , e−zt.

23

Page 31: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Function evaluation and Kronecker structure

Problem 2. Given two symmetric positive definite matrices A,B ∈ Rn×n with eigenvalues in[λmin, λmax] ⊂ Ω and v = vec(C) = vec(low-rank matrix) ∈ Rn2 , compute

x := f (I ⊗ A + B ⊗ I)v , or equivalently X := vec−1(f (I ⊗ A + B ⊗ I)v).

• If f (z) = z−1, then this is equivalent to solve the Sylvester equation: we know thelow-rank structure is transferred from C to X .

• Does this hold in more general cases? Yes, it does.

• Once again, we focus on the case where f (z) is a Stieltjes function:

f (z) =∫ ∞

0g(t, z)dµ(t), g(t, z) ∈

1

z + t , e−zt.

23

Page 32: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Function evaluation and Kronecker structure

Problem 2. Given two symmetric positive definite matrices A,B ∈ Rn×n with eigenvalues in[λmin, λmax] ⊂ Ω and v = vec(C) = vec(low-rank matrix) ∈ Rn2 , compute

x := f (I ⊗ A + B ⊗ I)v , or equivalently X := vec−1(f (I ⊗ A + B ⊗ I)v).

• If f (z) = z−1, then this is equivalent to solve the Sylvester equation: we know thelow-rank structure is transferred from C to X .

• Does this hold in more general cases? Yes, it does.• Once again, we focus on the case where f (z) is a Stieltjes function:

f (z) =∫ ∞

0g(t, z)dµ(t), g(t, z) ∈

1

z + t , e−zt.

23

Page 33: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Galerkin projection for evaluating f (M)vec(C)

Inspired by the Galerkin projection for Sylvester equations, we may consider the followingalgorithm8:

• Choose subspaces of Rn spanned by orthogonal matrices U,V .• Evaluate the projected matrix function

vec(Y ) = f ((V ⊗ U)∗M(V ⊗ U))vec(U∗CV ) = f (M)vec(U∗CV ),

where M := V ∗BV ⊗ I + I ⊗ U∗AU.• Use UYV ∗ as approximation for X .

8Benzi, Simoncini. Approximation of functions of large matrices with Kronecker structure, NumerischeMathematik, 2017.

24

Page 34: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Galerkin projection for evaluating f (M)vec(C)

Inspired by the Galerkin projection for Sylvester equations, we may consider the followingalgorithm8:

• Choose subspaces of Rn spanned by orthogonal matrices U,V .

• Evaluate the projected matrix function

vec(Y ) = f ((V ⊗ U)∗M(V ⊗ U))vec(U∗CV ) = f (M)vec(U∗CV ),

where M := V ∗BV ⊗ I + I ⊗ U∗AU.• Use UYV ∗ as approximation for X .

8Benzi, Simoncini. Approximation of functions of large matrices with Kronecker structure, NumerischeMathematik, 2017.

24

Page 35: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Galerkin projection for evaluating f (M)vec(C)

Inspired by the Galerkin projection for Sylvester equations, we may consider the followingalgorithm8:

• Choose subspaces of Rn spanned by orthogonal matrices U,V .• Evaluate the projected matrix function

vec(Y ) = f ((V ⊗ U)∗M(V ⊗ U))vec(U∗CV ) = f (M)vec(U∗CV ),

where M := V ∗BV ⊗ I + I ⊗ U∗AU.

• Use UYV ∗ as approximation for X .

8Benzi, Simoncini. Approximation of functions of large matrices with Kronecker structure, NumerischeMathematik, 2017.

24

Page 36: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Galerkin projection for evaluating f (M)vec(C)

Inspired by the Galerkin projection for Sylvester equations, we may consider the followingalgorithm8:

• Choose subspaces of Rn spanned by orthogonal matrices U,V .• Evaluate the projected matrix function

vec(Y ) = f ((V ⊗ U)∗M(V ⊗ U))vec(U∗CV ) = f (M)vec(U∗CV ),

where M := V ∗BV ⊗ I + I ⊗ U∗AU.• Use UYV ∗ as approximation for X .

8Benzi, Simoncini. Approximation of functions of large matrices with Kronecker structure, NumerischeMathematik, 2017.

24

Page 37: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Evaluating the small scale matrix function

For f (z) = z−1, evaluating f (M)vec(C) can be done in O(n3) using Bartels-Stewart.

For a more general f (z) and M = I ⊗ A + B ⊗ I:

• The projected matrix retains the same structure.• M can be diagonalized in O(n3) by diagonalizing A and B.

If Q∗AAQA = DA, Q∗BBQB = DB , then

vec(X ) = f (M)vec(C) = (QA ⊗ QB)f (I ⊗ DA + DB ⊗ I)vec(Q∗ACQB)

which can be written as

X = QA(F (Q∗ACQB))Q∗B , Fij := f (λA,i + λB,j).

Again, total cost is about O(n3).

25

Page 38: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Galerkin projection for evaluating f (M)vec(C)

• Algorithm reduces to Galerkin for Sylvester eqs. if f (z) = z−1.• Proposed by Benzi & Simoncini for more general f (z) using polynomial Krylov subspaces.• Using rational Krylov subspaces yields improved convergence in ill-conditioned cases;

major difficulty: how to choose the poles?

26

Page 39: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Determining the poles: Laplace-Stieltjes case

Here the situation is straightforward because e−tM = eI⊗(−tA)e(−tB)⊗I , so that

vec(X ) = f (M)vec(C)

=∫ ∞

0e−tMvec(C)dµ(t)

= vec(∫ ∞

0e−tACe−tBdµ(t)

)= vec

(∫ ∞0

(e−tAW

) (e−tBZ

)T dµ(t)).

Hence we can choose the same set of poles used in 1D case, for both Krylov subspaces.

27

Page 40: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Convergence result: Laplace Stieltjes case

TheoremThere exists a choice of poles that, for any Laplace-Stieltjes function f applied toM = I ⊗ A + B ⊗ I, where A,B are symmetric positive definite with spectrum contained in[λmin, λmax], yields the convergence

‖X − X`‖2 ≤ 16γ`,κf (0) · ‖C‖2 · ρ`2 , ρ = exp

(π2

log(4λmaxλmin

)

).

28

Page 41: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Numerical results

0 10 20 30 40 5010−15

10−9

10−3

103

Iterations (`)

‖X−

X `‖ 2

Evaluation of vec(X ) = ϕ1(I ⊗ A + A⊗ I)vec(C)

BoundExt. Kryl.Pol. Kryl.Rat. Kryl.

29

Page 42: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Determining the poles: Cauchy Stieltjes case

If f (z) =∫∞

0 dµ(t)/(z + t) then, f (M) =∫∞

0 dµ(t)(tI +M)−1.

Since tI +M = A⊗ I + I ⊗ (B + tI) we have

vec(X ) = f (M)vec(C) ⇐⇒ X =∫ ∞

0Xtdµ(t), AXt + Xt(B + tI) = C .

• Can we design a projection space that is good for all values of t?• This can be related to the Zolotarev problem:

max z ∈ [λmin, λmax]|r(z)|minz∈(−∞,−λmin] |r(z)|

• Using a Mobius transform, this can be mapped (approximately) into a Zolotarev problemon [−2λmax,−λmin] ∪ [λmin, 2λmax].

• We can compute optimal poles there and go back.

30

Page 43: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Convergence result: Cauchy Stieltjes case

TheoremThere exists a choice of poles that, for any Cauchy-Stieltjes function f applied toM = I ⊗ A + B ⊗ A, where A,B are symmetric positive definite with spectrum contained in[λmin, λmax], yields the convergence

‖X − X`‖2 ≤ 4 · f (2λmin) ·(

1 + λmaxλmin

)· ‖C‖2 · ρ−`, ρ = exp

(π2

log(8λmaxλmin

)

).

Note: the exponent for f (z) = z−1 is exp(

π2

log(4λmaxλmin

)

).

31

Page 44: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Decaying singular values

The theory predicts the decay in the singular values as well.

0 10 20 30 40 5010−17

10−12

10−7

10−2

`

Evaluation of vec(X ) = (I ⊗ A + A⊗ I)−0.7vec(C)

Bound‖X` − X‖2σ`(X )

32

Page 45: Rational Krylov for Stieltjes matrix functions ...€¦ · Leonardo Robol, UniPI (joint work with S. Massei, EPFL) 13 Febbraio 2020, Montecatini, Convegno GNCS 0. Progetto GNCS Giovani

Conclusions and outlook

• Practically, nested sequences of poles with the same asymptotic behavior can be used.

Possible extensions:

• Similar result for more general spectra configuration (and normal matrices) — implicitlydepends on the Zolotarev rational approximation problem.

• For non-normal cases, one can resort to using the field-of-values instead of the spectrum.• Divide and conquer methods for right hand sides obtained as vectorizations of banded or

hierarchical matrices.• Higher dimensional Laplace-like operators.

Full story:

• Rational Krylov for Stieltjes matrix functions: convergence and pole selection. S. Massei.,L.R., arXiv, 2019.

33