API references (qmm module)

All functionality is provided by the single qmm module, described below.

Optimization algorithms

Three algorithms are implemented.

  1. mmcg(), which uses the Majorize-Minimize Conjugate Gradient (MM-CG),

  2. mmmg(), which uses the Majorize-Minimize Memory Gradient (3MG), and

  3. lcg(), which uses the Linear Conjugate Gradient (CG), with exact optimal steps and conjugacy parameters, for quadratic objectives (QuadObjective) only.

The 3MG algorithm is usually faster but uses more memory; the MM-CG uses less memory and can be faster in some cases.
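
For illustration, here is a minimal sketch of a call to mmmg() on a small synthetic problem. The forward matrix, data, and hyperparameter values are arbitrary placeholders, not part of the API; QuadObjective, Objective, Huber, and mmmg are documented below.

import numpy as np
import qmm

# Arbitrary toy forward model V and data ω (placeholder values).
rng = np.random.default_rng(0)
V = rng.standard_normal((50, 100))
omega = rng.standard_normal(50)

def diff_adjoint(u):
    # Adjoint of np.diff for 1D arrays: (Dᵀu)ⱼ = u_{j-1} - u_j at interior points.
    out = np.empty(u.size + 1)
    out[0], out[-1] = -u[0], u[-1]
    out[1:-1] = u[:-1] - u[1:]
    return out

# Data term ½·||Vx - ω||² plus an edge-preserving Huber prior on first differences.
data_term = qmm.QuadObjective(lambda x: V @ x, lambda u: V.T @ u, data=omega)
prior = qmm.Objective(np.diff, diff_adjoint, qmm.Huber(1.0), hyper=0.1)

res = qmm.mmmg([data_term, prior], x0=np.zeros(100), max_iter=200)
print(res.x.shape, res.nit)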

Majorize-Minimize Conjugate Gradient

mmcg(objv_list, x0, tol=0.0001, max_iter=500, min_iter=0, precond=None, callback=None, calc_objv=False)

The Majorize-Minimize Conjugate Gradient (MM-CG) algorithm.

The MM-CG is a nonlinear conjugate gradient (NL-CG) optimization algorithm with an explicit step formula based on the Majorize-Minimize Quadratic approach [1].

Parameters
  • objv_list (list of BaseObjective) – A list of BaseObjective objects, each representing a term μ ψ(Vx - ω). The objectives are summed.

  • x0 (ndarray) – The initial point.

  • tol (float, optional) – The stopping tolerance. The algorithm stops when the gradient norm falls below x0.size * tol.

  • max_iter (int, optional) – The maximum number of iterations.

  • min_iter (int, optional) – The minimum number of iterations.

  • precond (callable, optional) – A callable implementing a preconditioner, that is, computing Px. It must take a single x-like input parameter and return a single x-like output.

  • callback (callable, optional) – A function that receives the OptimizeResult at the end of each iteration.

  • calc_objv (boolean, optional) – If True, the objective function value is computed at each iteration with low overhead (False by default). It is not used by the algorithm itself.

Returns

result

Return type

OptimizeResult

References

[1] C. Labat and J. Idier, “Convergence of Conjugate Gradient Methods with a Closed-Form Stepsize Formula,” J. Optim. Theory Appl., p. 18, 2008.

Majorize-Minimize Memory Gradient

mmmg(objv_list, x0, tol=0.0001, max_iter=500, min_iter=0, precond=None, callback=None, calc_objv=False)

The Majorize-Minimize Memory Gradient (3MG) algorithm.

The mmmg (3MG) algorithm is a subspace memory-gradient optimization algorithm with an explicit step formula based on the Majorize-Minimize Quadratic approach [2].

Parameters
  • objv_list (list of BaseObjective) – A list of BaseObjective objects, each representing a term μ ψ(Vx - ω). The objectives are summed.

  • x0 (ndarray) – The initial point.

  • tol (float, optional) – The stopping tolerance. The algorithm stops when the gradient norm falls below x0.size * tol.

  • max_iter (int, optional) – The maximum number of iterations.

  • min_iter (int, optional) – The minimum number of iterations.

  • precond (callable, optional) – A callable implementing a preconditioner, that is, computing Px. It must take a single x-like input parameter and return a single x-like output.

  • callback (callable, optional) – A function that receives the OptimizeResult at the end of each iteration.

  • calc_objv (boolean, optional) – If True, the objective function value is computed at each iteration with low overhead (False by default). It is not used by the algorithm itself.

Returns

result

Return type

OptimizeResult

References

[2] E. Chouzenoux, J. Idier, and S. Moussaoui, “A Majorize-Minimize Strategy for Subspace Optimization Applied to Image Restoration,” IEEE Trans. Image Process., vol. 20, no. 6, pp. 1517–1528, Jun. 2011, doi: 10.1109/TIP.2010.2103083.

Linear Conjugate Gradient

lcg(objv_list, x0, tol=0.0001, max_iter=500, min_iter=0, precond=None, callback=None, calc_objv=False)

Linear Conjugate Gradient (CG) algorithm.

The Linear Conjugate Gradient optimization algorithm for quadratic objectives, with exact optimal steps and conjugacy parameters.

Parameters
  • objv_list (list of QuadObjective) – A list of QuadObjective objects, each representing a term ½ μ ||Vx - ω||²_B. The objectives are summed.

  • x0 (ndarray) – The initial point.

  • tol (float, optional) – The stopping tolerance. The algorithm stops when the gradient norm falls below x0.size * tol.

  • max_iter (int, optional) – The maximum number of iterations.

  • min_iter (int, optional) – The minimum number of iterations.

  • precond (callable, optional) – A callable implementing a preconditioner, that is, computing Px. It must take a single x-like input parameter and return a single x-like output.

  • callback (callable, optional) – A function that receives the OptimizeResult at the end of each iteration.

  • calc_objv (boolean, optional) – If True, the objective function value is computed at each iteration with low overhead (False by default). It is not used by the algorithm itself.

Returns

result

Return type

OptimizeResult
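
A minimal sketch of an lcg() call, with a placeholder matrix and data and an arbitrary Jacobi (diagonal) preconditioner:

import numpy as np
import qmm

rng = np.random.default_rng(1)
V = rng.standard_normal((80, 40))
omega = rng.standard_normal(80)

# The single quadratic objective ½·||Vx - ω||².
quad = qmm.QuadObjective(lambda x: V @ x, lambda u: V.T @ u, data=omega)

# Arbitrary diagonal preconditioner built from diag(VᵀV).
d = np.sum(V**2, axis=0)
res = qmm.lcg([quad], x0=np.zeros(40), precond=lambda x: x / d)
print(res.success, res.nit)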

Optimization results

The outputs are instances of OptimizeResult, which mimics scipy's OptimizeResult. It behaves like a Python dictionary and is reimplemented here to avoid a dependency on scipy.

class OptimizeResult(*args, **kwargs)

Represents the optimization result.

x: array

The solution of the optimization, with the same shape as x0.

success: bool

Whether or not the optimizer exited successfully.

status: int

Termination status of the optimizer. Its value depends on the underlying solver. Refer to message for details.

message: str

Description of the cause of the termination.

nit: int

Number of iterations performed by the optimizer.

diff: list of float

The value of ||x_{k+1} - x_k||² at each iteration.

time: list of float

The time at each iteration, starting at 0, in seconds.

fun: float

The value of the objective function.

objv_val: list of float

The objective value at each iteration.

jac: array

The gradient of the objective function.

grad_norm: list of float

The gradient norm at each iteration.

Notes

OptimizeResult mimics scipy's OptimizeResult for compatibility.

__init__(*args, **kwargs)

Initialize self. See help(type(self)) for accurate signature.
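
Since OptimizeResult behaves like a dictionary, its fields can be read as attributes or as keys. A short sketch (the trivial identity problem is only there to produce a result object):

import numpy as np
import qmm

# A trivial quadratic problem, just to obtain an OptimizeResult.
quad = qmm.QuadObjective(lambda x: x, lambda u: u, data=np.ones(10))
res = qmm.mmmg([quad], x0=np.zeros(10))

print(res.x)              # attribute access
print(res["grad_norm"])   # equivalent dictionary-style access
print(res.nit, res.message)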

Objective classes

Objective functions are defined from the abstract class BaseObjective, which has four abstract methods that must be implemented by subclasses. Users who want to implement their own objective are encouraged to subclass BaseObjective.

Four generic concrete subclasses of BaseObjective are provided. The Objective class is the most general and the preferred choice; QuadObjective is a specialized subclass that allows simplifications and slightly faster computation. Vmin and Vmax implement bound penalties.

Note

The property lastgv is used by the algorithms to compute the objective function value at each iteration with low overhead, if the flag calc_objv is set to True (False by default). It is not required by the algorithms.

class BaseObjective(hyper=1, name='')

An abstract base class for objective functions

\[J(x) = \mu \Psi \left(V x - \omega \right)\]

with \(\Psi(u) = \sum_i \varphi(u_i)\).

calc_objv

If True, compute the objective value whenever the gradient is computed and store it in the lastgv attribute (False by default).

Type

boolean

name

The name of the objective.

Type

str

hyper

The hyperparameter value μ.

Type

float

lastv

The last evaluated value of the objective (0 by default).

Type

float

lastgv

The value of the objective obtained during the gradient computation (0 by default).

Type

float

__init__(hyper=1, name='')

Initialize self. See help(type(self)) for accurate signature.

abstract operator(point)

Compute the output Vx.

abstract value(point)

Compute the value at the given point.

abstract gradient(point)

Compute the gradient at the given point.

abstract norm_mat_major(vecs, point)

Return the normal matrix of the quadratic major function.

Given vectors W = V·S, return Wᵀ·diag(b)·W

where S holds the vectors spanning the subspace and b holds the Geman & Reynolds coefficients at the given point.

Parameters
  • vecs (array) – The W vectors.

  • point (array) – The point at which to compute the Geman & Reynolds coefficients b.

Returns

out – The normal matrix

Return type

array
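
As a sketch of this interface, here is a hypothetical subclass implementing the plain penalty J(x) = ½ μ ||x - ω||², so V is the identity and φ is the square loss, for which the Geman & Reynolds coefficients are all 1. Only the four abstract method names come from the interface above; the rest is illustrative, and where the hyperparameter μ enters norm_mat_major is an assumption to check against the library's conventions.

import numpy as np
import qmm

class SquaredDistance(qmm.BaseObjective):
    # Hypothetical example: J(x) = ½·μ·||x - ω||², with V = identity and φ = square.

    def __init__(self, data, hyper=1):
        super().__init__(hyper=hyper, name="SquaredDistance")
        self.data = data

    def operator(self, point):
        return point  # V = identity

    def value(self, point):
        return 0.5 * self.hyper * np.sum((point - self.data) ** 2)

    def gradient(self, point):
        return self.hyper * (point - self.data)

    def norm_mat_major(self, vecs, point):
        # Square loss: the Geman & Reynolds coefficients b are all 1, so
        # Wᵀ·diag(b)·W = WᵀW (μ is factored in here -- an assumption).
        return self.hyper * (vecs.T @ vecs)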

Main objective

class Objective(operator, adjoint, loss, data=None, hyper=1, name='')

An objective function defined as

\[J(x) = \mu \Psi \left(V x - \omega \right)\]

with \(\Psi(u) = \sum_i \varphi(u_i)\).

The instance attributes are:

data: array

The data array, or the vectorized list of arrays given at init.

hyper: float

The hyperparameter value μ.

loss: Loss

The loss φ.

__init__(operator, adjoint, loss, data=None, hyper=1, name='')

An objective function μ ψ(Vx - ω).

Parameters
  • operator (callable) – A callable that computes the output Vx.

  • adjoint (callable) – A callable that computes Vᵀe.

  • loss (Loss) – The loss φ.

  • data (array or list of array, optional) – The data vector ω.

  • hyper (float, optional) – The hyperparameter μ.

  • name (str, optional) – The name of the objective.

Notes

If data is a list of arrays, operator must return a similar list with arrays of the same shapes, and adjoint must accept a similar list as well.

In that case, however, and for the algorithms' internal purposes, everything is stacked as a column vector by a Stacked object, so values are copied. This is flexible but not efficient. Users are encouraged to do the vectorization themselves rather than rely on this “list of array” feature.

operator(point)

Compute the output Vx.

value(point)

The value of the objective function at the given point.

Return μ ψ(Vx - ω).

gradient(point)

The gradient (and value) at the given point.

Return μ Vᵀφ’(Vx - ω).

norm_mat_major(vecs, point)

Return the normal matrix of the quadratic major function.

Given vectors W = V·S, return Wᵀ·diag(b)·W

where S holds the vectors spanning the subspace and b holds the Geman & Reynolds coefficients at the given point.

Parameters
  • vecs (array) – The W vectors.

  • point (array) – The point at which to compute the Geman & Reynolds coefficients b.

Returns

out – The normal matrix

Return type

array

gr_coeffs(point)

The Geman & Reynolds coefficients at the given point.

Given x, return φ’(Vx - ω) / (Vx - ω).

gy_coeffs(point)

The Geman & Yang coefficients at the given point.

Given x, return Vx - φ’(Vx - ω).

Note

The Objective class implements the __call__ interface, allowing instances to behave like callables (functions) and return the objective value:

import numpy as np
import qmm

identity = lambda x: x  # V is the identity operator

objv = qmm.Objective(identity, identity, qmm.Square())
x = np.random.standard_normal((100,))
objv(x) == objv.value(x)  # True: __call__ returns the objective value

Quadratic objective

This class implements specific properties and methods associated with quadratic objective functions.

class QuadObjective(operator, adjoint, hessp=None, data=None, hyper=1, invcovp=None, name='')

A quadratic objective function

\[J(x) = \frac{1}{2} \mu \|V x - \omega\|_B^2 = \frac{1}{2} \mu (V x - \omega)^T B (V x - \omega)\]

The instance attributes are:

hyper: float

The hyperparameter value μ.

ht_data: array

The backprojected data μ VᵀBω.

constant: float

The constant value μ ωᵀBω.

__init__(operator, adjoint, hessp=None, data=None, hyper=1, invcovp=None, name='')

A quadratic objective ½ μ ||Vx - ω||²_B.

Parameters
  • operator (callable) – A callable that computes the output Vx.

  • adjoint (callable) – A callable that computes Vᵀe.

  • hessp (callable, optional) – A callable that computes Qx = VᵀBVx. It must take an input like operator does and return an output like adjoint does (x-like in both cases).

  • data (array or list of array, optional) – The data vector ω.

  • hyper (float, optional) – The hyperparameter μ.

  • invcovp (callable, optional) – A callable that applies the inverse covariance, or metric, B = Σ⁻¹. It must take an input like adjoint does and return an output like operator does (ω-like in both cases). Equivalent to the identity if None.

  • name (str, optional) – The name of the objective.

Notes

The hessp (Q) callable is used for the gradient computation, as ∇ = μ (Qx - b) where b = VᵀBω, instead of ∇ = μ VᵀB(Vx - ω). This is optional and can be more efficient in some cases; use it only then.

The vector b = VᵀBω is computed at object initialization.
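
A sketch of the hessp shortcut with a plain matrix standing in for V and Q = VᵀV precomputed (all values are placeholders; B is the identity here):

import numpy as np
import qmm

rng = np.random.default_rng(2)
V = rng.standard_normal((60, 30))
omega = rng.standard_normal(60)

# Precompute Q = VᵀV once so the gradient is evaluated as ∇ = μ(Qx - b).
Q = V.T @ V
quad = qmm.QuadObjective(
    lambda x: V @ x,        # operator: Vx
    lambda u: V.T @ u,      # adjoint: Vᵀu
    hessp=lambda x: Q @ x,  # Qx, reusing the precomputed matrix
    data=omega,
)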

property VtB_data

The second term b = μ ∑ᵢ VᵢᵀBᵢωᵢ.

property constant

The constant c = μ ∑ᵢ ωᵢᵀBᵢωᵢ.

operator(point)

Compute the output Vx.

value(point)

The value of the objective function at the given point.

Return ½ μ ||Vx - ω||²_B.

gradient(point)

The gradient (and value) at the given point.

Return ∇ = μ (Qx - b) = μ VᵀB(Vx - ω).

Notes

The objective value is always computed with low overhead thanks to the relation

J(x) = ½ (xᵀ∇ - xᵀb + μ ωᵀBω).

value_hessp(point, hessp)

Return the value J(x) at low cost, given x and q = Qx,

thanks to the relation

J(x) = ½ (xᵀq - 2 xᵀb + μ ωᵀBω).

value_residual(point, residual)

Return the value J(x) at low cost, given x and r = b - Qx,

thanks to the relation

J(x) = ½ (xᵀ(-b - r) + μ ωᵀBω).
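
A quick numerical check of these relations on a small placeholder problem, with B the identity and μ = 1:

import numpy as np
import qmm

rng = np.random.default_rng(3)
V = rng.standard_normal((20, 10))
omega = rng.standard_normal(20)
quad = qmm.QuadObjective(lambda x: V @ x, lambda u: V.T @ u, data=omega)

x = rng.standard_normal(10)
q = V.T @ (V @ x)  # q = Qx = VᵀVx since B = I
print(np.isclose(quad.value(x), quad.value_hessp(x, q)))  # expected: True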

norm_mat_major(vecs, point)

Return the normal matrix of the quadratic major function.

Given vectors W = V·S, return Wᵀ·diag(b)·W

where S holds the vectors spanning the subspace and b holds the Geman & Reynolds coefficients at the given point.

Parameters
  • vecs (array) – The W vectors.

  • point (array) – The point at which to compute the Geman & Reynolds coefficients b.

Returns

out – The normal matrix

Return type

array

Note

The operator argument for Objective and QuadObjective must be a callable that accepts an array as input. It can return an array, but it can also return a list of arrays (for data fusion, for instance). However, the optimization algorithms internally require plain arrays. In the case of a list of arrays, the arrays are handled by a Stacked class and internally vectorized, so the data are memory-copied at each iteration.

If operator returns a list of arrays, the adjoint must accept a list of arrays as well. Again, everything is vectorized, and Objective rebuilds the list of arrays internally.

QuadObjective handles this list of arrays more efficiently since the data \(\omega\) are not stored internally by the class; only \(\mu V^T B \omega\), which is an array like x, is stored.

If given, the hessp callable argument for QuadObjective must accept an array and return an array.

Specific objective classes

class Vmin(vmin, hyper, name='')

A minimum value objective function

\[J(x) = \frac{1}{2} \mu \|P_{]-\infty, m]}(x) - m\|_2^2.\]

vmin: float

The minimum value m.

hyper: float

The hyperparameter value μ.

__init__(vmin, hyper, name='')

A minimum value objective function

J(x) = ½ μ ||P_{]-∞, m]}(x) - m||².

Parameters
  • vmin (float) – The minimum value m.

  • hyper (float) – The hyperparameter value μ.

  • name (str) – The name of the objective.

operator(point)

Compute the output Vx.

value(point)

Return the value at the given point.

gradient(point)

Compute the gradient at the given point.

norm_mat_major(vecs, point)

Return the normal matrix of the quadratic major function.

Given vectors W = V·S, return Wᵀ·diag(b)·W

where S holds the vectors spanning the subspace and b holds the Geman & Reynolds coefficients at the given point.

Parameters
  • vecs (array) – The W vectors.

  • point (array) – The point at which to compute the Geman & Reynolds coefficients b.

Returns

out – The normal matrix

Return type

array

class Vmax(vmax, hyper, name='')

A maximum value objective function

\[J(x) = \frac{1}{2} \mu \|P_{[M, +\infty[}(x) - M\|_2^2.\]

vmax: float

The maximum value M.

hyper: float

The hyperparameter value μ.

__init__(vmax, hyper, name='')

A maximum value objective function

J(x) = ½ μ ||P_{[M, +∞[}(x) - M||².

Parameters
  • vmax (float) – The maximum value M.

  • hyper (float) – The hyperparameter value μ.

  • name (str) – The name of the objective.

operator(point)

Compute the output Vx.

value(point)

Return the value at the given point.

gradient(point)

Compute the gradient at the given point.

norm_mat_major(vecs, point)

Return the normal matrix of the quadratic major function.

Given vectors W = V·S, return Wᵀ·diag(b)·W

where S holds the vectors spanning the subspace and b holds the Geman & Reynolds coefficients at the given point.

Parameters
  • vecs (array) – The W vectors.

  • point (array) – The point at which to compute the Geman & Reynolds coefficients b.

Returns

out – The normal matrix

Return type

array
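
Vmin and Vmax are typically added to a main objective to softly enforce bounds. A sketch with arbitrary bound values and weights:

import numpy as np
import qmm

# Softly keep the solution inside [0, 1]; bounds and weights are arbitrary.
lower = qmm.Vmin(0.0, 10.0)
upper = qmm.Vmax(1.0, 10.0)

# Identity data term pulling the solution toward 2.
data_term = qmm.QuadObjective(lambda x: x, lambda u: u, data=np.full(5, 2.0))

res = qmm.mmmg([data_term, lower, upper], x0=np.zeros(5))
print(res.x)  # pulled toward 2 by the data term, pushed back near 1 by Vmax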

Sum of objectives

MixedObjective is a convenient (but not required) list-like class that represents a sum of BaseObjective instances. BaseObjective and MixedObjective support the “+” operator: adding two BaseObjective instances returns a new MixedObjective, and adding to an existing MixedObjective updates it in place. Since MixedObjective is list-like, it can be passed directly to the optimization algorithms.

likelihood = QuadObjective(...)
prior1 = Objective(...)
prior2 = Objective(...)

# Equivalent to objective = MixedObjective([likelihood, prior1])
objective = likelihood + prior1

# Equivalent to objective.append(prior2)
objective = objective + prior2

# Equivalent to res = mmmg([likelihood, prior1, prior2], ...)
res = mmmg(objective, ...)

class MixedObjective(objv_list)

Represents a mixed objective function

\[J(x) = \sum_k \mu_k \Psi_k \left(V_k x - \omega_k \right)\]

This is a Sequence (list-like), and instances of this class can be used directly in the optimization algorithms.

__init__(objv_list)

A mixed objective function

J(x) = ∑ₖ μₖ ψₖ(Vₖ·x - ωₖ).

Parameters

objv_list (list of BaseObjective) – The list of objectives to sum.

property lastv

Return the values of the objectives obtained during the gradient computation.

insert(index, value)

S.insert(index, value) – insert value before index

value(point)

The value J(x).

gradient(point)

The gradient ∇J(x).

Losses classes

The Loss class is an abstract base class that serves as the parent class for all losses. Currently, the provided concrete losses are Square, Huber, Hyperbolic, HebertLeahy, GemanMcClure, and TruncSquareApprox.

Note

The Loss class implements the __call__ interface, allowing instances to behave like callables (functions) and return the loss value:

import numpy as np
import matplotlib.pyplot as plt
import qmm

u = np.linspace(-5, 5, 1000)
pot = qmm.Huber(1)
plt.plot(u, pot(u))

class Loss(inf, convex=False, coercive=False)

An abstract base class for loss φ.

The class has the following attributes.

inf: float

The value of lim_{u→0} φ’(u) / u.

convex: boolean

A flag indicating if the loss is convex (not used).

coercive: boolean

A flag indicating if the loss is coercive (not used).

__init__(inf, convex=False, coercive=False)

The loss φ.

Parameters
  • inf (float) – The value of lim_{u→0} φ’(u) / u.

  • convex (boolean) – A flag indicating if the loss is convex.

  • coercive (boolean) – A flag indicating if the loss is coercive.

abstract value(point)

The value φ(·) at the given point.

abstract gradient(point)

The gradient φ’(·) at the given point.

gr_coeffs(point)

The Geman & Reynolds coefficients φ’(·)/· at the given point.

gy_coeffs(point)

The Geman & Yang coefficients · - φ’(·) at the given point.
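
For instance, the Huber loss gives Geman & Reynolds coefficients equal to 1 in its quadratic region and δ/|u| beyond it; a quick check, assuming gr_coeffs behaves as documented:

import numpy as np
import qmm

pot = qmm.Huber(1.0)  # δ = 1
u = np.array([0.5, 2.0])
print(pot.gr_coeffs(u))  # expected roughly [1.0, 0.5]: 1 inside, δ/|u| outside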

Square

class Square

The Square loss

\[\varphi(u) = \frac{1}{2} u^2.\]

__init__()

The Square loss φ(u) = ½ u².

value(point)

The value φ(·) at the given point.

gradient(point)

The gradient φ’(·) at the given point.

Huber

class Huber(delta)

The convex coercive Huber loss

\[\begin{split}\varphi(u) = \begin{cases} \frac{1}{2} u^2 & \text{if } |u| \leq \delta, \\ \delta |u| - \frac{\delta^2}{2} & \text{otherwise.} \end{cases}\end{split}\]

__init__(delta)

The Huber loss.

value(point)

The value φ(·) at the given point.

gradient(point)

The gradient φ’(·) at the given point.

Hyperbolic or Pseudo-Huber

class Hyperbolic(delta)

The convex coercive hyperbolic loss

\[\varphi(u) = \delta^2 \left( \sqrt{1 + \frac{u^2}{\delta^2}} -1 \right)\]

This is sometimes called Pseudo-Huber.

__init__(delta)

The hyperbolic loss.

value(point)

The value φ(·) at the given point.

gradient(point)

The gradient φ’(·) at the given point.

Hebert & Leahy

class HebertLeahy(delta)

The non-convex coercive Hebert & Leahy loss

\[\varphi(u) = \log \left(1 + \frac{u^2}{\delta^2} \right)\]

__init__(delta)

The Hebert & Leahy loss.

value(point)

The value φ(·) at the given point.

gradient(point)

The gradient φ’(·) at the given point.

Geman & Mc Clure

class GemanMcClure(delta)

The non-convex non-coercive Geman & Mc Clure loss

\[\varphi(u) = \frac{u^2}{2\delta^2 + u^2}\]

__init__(delta)

The Geman & Mc Clure loss.

value(point)

The value φ(·) at the given point.

gradient(point)

The gradient φ’(·) at the given point.

Truncated Square approximation

class TruncSquareApprox(delta)

The non-convex non-coercive truncated square approximation

\[\varphi(u) = 1 - \exp \left(- \frac{u^2}{2\delta^2} \right)\]

__init__(delta)

The truncated square approximation.

value(point)

The value φ(·) at the given point.

gradient(point)

The gradient φ’(·) at the given point.