API references (qmm
module)¶
All the functionalities are provided by the unique qmm
module described
below.
Optimization algorithms¶
Three algorithms are implemented.
mmcg()
that use the Majorize-Minimize Conjugate Gradient (MM-CG),mmmg()
that use the Majorize-Minimize Memory Gradient (3MG), andlcg()
that use the Linear Conjugate Gradient (CG) for quadratic objectiveQuadObjective
only, with exact optimal step and conjugacy parameters.
The 3MG algorithm is usually faster but use more memory. The MM-CG can be faster and use less memory.
Majorize-Minimize Conjugate Gradient¶
-
mmcg
(objv_list, x0, tol=0.0001, max_iter=500, min_iter=0, precond=None, callback=None, calc_objv=False)¶ The Majorize-Minimize Conjugate Gradient (MM-CG) algorithm.
The MM-CG is a nonlinear conjugate gradient (NL-CG) optimization algorithm with an explicit step formula based on Majorize-Minimize Quadratic approach 1.
- Parameters
objv_list (list of BaseObjective) – A list of
BaseObjective
objects that each represents a μ ψ(Vx - ω). The objectives are summed.x0 (ndarray) – The initial point.
tol (float, optional) – The stopping tolerance. The algorithm is stopped when the gradient norm is inferior to x0.size * tol.
max_iter (int, optional) – The maximum number of iterations.
min_iter (int, optional) – The minimum number of iterations.
precond (callable, optional) – A callable that must implement a preconditioner, that is Px. Must be a callable with a unique input parameter x and unique output like x.
callback (callable, optional) – A function that receive the OptimizeResult at the end of each iteration.
calc_objv (boolean, optional) – If True, objective function is computed at each iteration with low overhead. False by default. Not used by the algorithm.
- Returns
result
- Return type
References
- 1
C. Labat and J. Idier, “Convergence of Conjugate Gradient Methods with a Closed-Form Stepsize Formula,” J Optim Theory Appl, p. 18, 2008.
Majorize-Minimize Memory Gradient¶
-
mmmg
(objv_list, x0, tol=0.0001, max_iter=500, min_iter=0, precond=None, callback=None, calc_objv=False)¶ The Majorize-Minimize Memory Gradient (3MG) algorithm.
The mmmg (3MG) algorithm is a subspace memory-gradient optimization algorithm with an explicit step formula based on Majorize-Minimize Quadratic approach 2.
- Parameters
objv_list (list of BaseObjective) – A list of
BaseObjective
objects that each represents a μ ψ(Vx - ω). The objectives are summed.x0 (array) – The initial point.
tol (float, optional) – The stopping tolerance. The algorithm is stopped when the gradient norm is inferior to x0.size * tol.
max_iter (int, optional) – The maximum number of iterations.
min_iter (int, optional) – The minimum number of iterations.
precond (callable, optional) – A callable that must implement a preconditioner, that is Px. Must be a callable with a unique input parameter x and unique output like x.
callback (callable, optional) – A function that receive the OptimizeResult at the end of each iteration.
calc_objv (boolean, optional) – If True, objective function is computed at each iteration with low overhead. False by default. Not used by the algorithm.
- Returns
result
- Return type
References
- 2
E. Chouzenoux, J. Idier, and S. Moussaoui, “A Majorize-Minimize Strategy for Subspace Optimization Applied to Image Restoration,” IEEE Trans. on Image Process., vol. 20, no. 6, pp. 1517–1528, Jun. 2011, doi: 10.1109/TIP.2010.2103083.
Linear Conjugate Gradient¶
-
lcg
(objv_list, x0, tol=0.0001, max_iter=500, min_iter=0, precond=None, callback=None, calc_objv=False)¶ Linear Conjugate Gradient (CG) algorithm.
Linear Conjugate Gradient optimization algorithm for quadratic objective.
- Parameters
objv_list (list of QuadObjective) – A list of
QuadObjective
objects that each represents a ½ μ ||Vx - ω||²_B. The objectives are summed.x0 (ndarray) – The initial point.
precond (callable, optional) – A callable that must implement a preconditioner, that is Px. Must be a callable with a unique input parameter x and unique output like x.
tol (float, optional) – The stopping tolerance. The algorithm is stopped when the gradient norm is inferior to x0.size * tol.
max_iter (int, optional) – The maximum number of iterations.
min_iter (int, optional) – The minimum number of iterations.
callback (callable, optional) – A function that receive the OptimizeResult at the end of each iteration.
- Returns
result
- Return type
Optimization results¶
The output are instance of OptimizeResult
that behave like
OptimizeResult
of scipy. They behave like Python dictionary and are
implemented to avoid dependency to scipy.
-
class
OptimizeResult
(*args, **kwargs)¶ Represents the optimization result.
- x: array
The solution of the optimization, with same shape than x0.
- success: bool
Whether or not the optimizer exited successfully.
- status: int
Termination status of the optimizer. Its value depends on the underlying solver. Refer to message for details.
- message: str
Description of the cause of the termination.
- nit: int
Number of iterations performed by the optimizer.
- diff: list of float
The value of ||x_{k+1} - x_{k}||² at each iteration
- time: list of float
The time at each iteration, starting at 0, in seconds.
- fun: float
The value of the objective function.
- objv_val: list of float
The objective value at each iteration
- jac: array
The gradient of the objective function.
- grad_norm: list of float
The gradient norm at each iteration
Notes
OptimizeResult
mimes OptimizeResult of scipy for compatibility.-
__init__
(*args, **kwargs)¶ Initialize self. See help(type(self)) for accurate signature.
Objective classes¶
Objective functions are defined from the abstract class BaseObjective
that have three abstract methods that must be implemented by the subclass. If
users want to implements it’s own objective, he is encouraged to subclass
BaseObjective
.
Four generic concrete classes of BaseObjective
can be used. The
Objective
class is the more general and prefered way, and
QuadObjective
is a specialized subclass that allows simplification and
slightly faster computation. Vmax
and Vmin
are for bound
penalties.
Note
The property lastgv
is used by algorithms to compute the objective
function value at each iteration with low overhead, if the flag calc_fun
is set to True
(False
by default). It is not required by the
algorithms.
-
class
BaseObjective
(hyper=1, name='')¶ An abstract base class for objective function
\[J(x) = \mu \Psi \left(V x - \omega \right)\]with \(\Psi(u) = \sum_i \varphi(u_i)\).
-
calc_objv
¶ If true, compute the objective value when gradient is computed and store in lastgv attribute (False by default).
- Type
boolean
-
name
¶ The name of the objective.
- Type
str
-
hyper
¶ The hyperparameter value μ.
- Type
float
-
lastv
¶ The last evaluated value of the objective (0 by default).
- Type
float
-
lastgv
¶ The value of objective obtained during gradient computation (0 by default).
- Type
float
-
__init__
(hyper=1, name='')¶ Initialize self. See help(type(self)) for accurate signature.
-
abstract
operator
(point)¶ Compute the output of Vx.
-
abstract
value
(point)¶ Compute the value at current point.
-
abstract
gradient
(point)¶ Compute the gradient at current point.
-
abstract
norm_mat_major
(vecs, point)¶ Return the normal matrix of the quadratic major function.
Given vectors W = V·S, return Wᵀ·diag(b)·W
where S are the vectors defining a subspace and b are Geman & Reynolds coefficients at given point.
- Parameters
vecs (array) – The W vectors.
point (array) – The given point where to compute Geman & Reynolds coefficients b.
- Returns
out – The normal matrix
- Return type
array
-
Main objective¶
-
class
Objective
(operator, adjoint, loss, data=None, hyper=1, name='')¶ An objective function defined as
\[J(x) = \mu \Psi \left(V x - \omega \right)\]with \(\Psi(u) = \sum_i \varphi(u_i)\).
The instance attributs are:
- dataarray
The data array, or the vectorized list of array given at init.
- hyperfloat
The hyperparameter value μ.
- lossLoss
The loss φ.
-
__init__
(operator, adjoint, loss, data=None, hyper=1, name='')¶ A objective function μ ψ(Vx - ω).
- Parameters
operator (callable) – A callable that compute the output Vx.
adjoint (callable) – A callable that compute Vᵀe.
loss (Loss) – The loss φ.
data (array or list of array, optional) – The data vector ω.
hyper (float, optional) – The hyperparameter μ.
name (str, optional) – The name of the objective.
Notes
If data is a list of array, operator must return a similar list with arrays of same shape, and adjoint must accept a similar list also.
In that case, however, and for algorithm purpose, everything is internally stacked as a column vector and values are therefore copied, by using a Stacked object. This is not efficient but flexible. Users are encouraged to do the vectorization themselves and use this “list of array” feature.
-
operator
(point)¶ Compute the output of Vx.
-
value
(point)¶ The value of the objective function at given point
Return μ ψ(Vx - ω).
-
gradient
(point)¶ The gradient and value at given point
Return μ Vᵀφ’(Vx - ω).
-
norm_mat_major
(vecs, point)¶ Return the normal matrix of the quadratic major function.
Given vectors W = V·S, return Wᵀ·diag(b)·W
where S are the vectors defining a subspace and b are Geman & Reynolds coefficients at given point.
- Parameters
vecs (array) – The W vectors.
point (array) – The given point where to compute Geman & Reynolds coefficients b.
- Returns
out – The normal matrix
- Return type
array
-
gr_coeffs
(point)¶ The Geman & Reynolds coefficients at given point
Given x return φ’(Vx - ω) / (Vx - ω).
-
gy_coeffs
(point)¶ The Geman & Yang coefficients at given point
Given x return Vx - φ’(Vx - ω).
Note
The Objective
class implements __call__
interface allowing
objects to behave like callable (function), returning the objective value
identity = lambda x: x
objv = qmm.Objective(identity, identity, qmm.Square())
x = np.random.standard_normal((100, ))
objv(x) == objv.value(x)
Quadratic objective¶
This class implements specific properties or methods associated to quadratic objective function.
-
class
QuadObjective
(operator, adjoint, hessp=None, data=None, hyper=1, invcovp=None, name='')¶ A quadratic objective function
\begin{equation} \begin{aligned} J(x) & = \frac{1}{2} \mu \|V x - \omega\|_B^2 \\ & = \frac{1}{2} \mu (V x - \omega)^tB(V x - \omega) \end{aligned} \end{equation}The instance attributes are:
- hyperfloat
The hyperparameter value μ.
- ht_dataarray
The retroprojected data μ VᵀBω.
- constantfloat
The constant value μ ωᵀBω.
-
__init__
(operator, adjoint, hessp=None, data=None, hyper=1, invcovp=None, name='')¶ A quadratic objective ½ μ ||Vx - ω||²_B
- Parameters
operator (callable) – A callable that compute the output Vx.
adjoint (callable) – A callable that compute Vᵀe.
hessp (callable, optional) – A callable that compute Qx as Qx = VᵀBVx. Must take a parameter like operator and return a parameter like adjoint (x-like in both case).
data (array or list of array, optional) – The data vector ω.
hyper (float, optional) – The hyperparameter μ.
invcovp (callable, optional) – A callable, that take a parameter like adjoint and return like operator (ω-like in both case), that apply the inverse covariance, or metric, B=Σ⁻¹. Equivalent to Identity if None.
name (str, optional) – The name of the objective.
Notes
The hessp (Q) callable is used for gradient computation as ∇ = μ (Qx - b) where b = VᵀBω instead of ∇ = μ VᵀB(Vx - ω). This is optional and in some case can be more efficient. Use it only in that case.
The variable b = VᵀBω is computed at object initialisation.
-
property
VtB_data
¶ The second term b = μ ∑ᵢ VᵢᵀBᵢωᵢ
-
property
constant
¶ The constant c = μ ∑_i ωᵢᵀBᵢωᵢ
-
operator
(point)¶ Compute the output of Vx.
-
value
(point)¶ The value of the objective function at given point
Return ½ μ ||Vx - ω||²_B.
-
gradient
(point)¶ The gradient and value at given point
Return ∇ = μ (Qx - b) = μ VᵀB(Vx - ω).
Notes
Objective value is always computed with low overhead thanks to the relation
J(x) = ½ (xᵀ∇ - xᵀb + μ ωᵀBω).
-
value_hessp
(point, hessp)¶ Return J(x) value at low cost given x and q = Qx
thanks to the relation
J(x) = ½ (xᵀq - 2 xᵀb + μ ωᵀBω).
-
value_residual
(point, residual)¶ Return J(x) value at low cost given x and r = b - Qx
thanks to the relation
J(x) = ½ (xᵀ(-b - r) + μ ωᵀBω).
-
norm_mat_major
(vecs, point)¶ Return the normal matrix of the quadratic major function.
Given vectors W = V·S, return Wᵀ·diag(b)·W
where S are the vectors defining a subspace and b are Geman & Reynolds coefficients at given point.
- Parameters
vecs (array) – The W vectors.
point (array) – The given point where to compute Geman & Reynolds coefficients b.
- Returns
out – The normal matrix
- Return type
array
Note
The operator
argument for Objective
and QuadObjective
must be a callable that accept an array
as input. The operator can return
an array as output but can also return a list
of array (for data
fusion for instance). However, for needs of optimization algorithm
implementation, everything must be an array internally. In case of list
or arrays, all these arrays are handled by a Stacked class, internally
vectorized and the data are therefore memory copied, at each iteration.
If operator
returns a list of array, the adjoint
must accept a
list of array also. Again, everything is vectorized and Objective rebuild
the list of array internally.
QuadObjective handle this list
of array more efficiently since data
\(\omagea\) is not stored internally by the class but only \(\mu V^T
B \omega\), that is an array like x.
If given, the hessp
callable argument for QuadObjective
must
accept an array and returns an array.
Specific objective classes¶
-
class
Vmin
(vmin, hyper, name='')¶ A minimum value objective function
\[J(x) = \frac{1}{2} \mu \|P_{]-\infty, m]}(x) - m\|_2^2.\]- vminfloat
The minimum value m.
- hyperfloat
The hyperparameter value μ.
-
__init__
(vmin, hyper, name='')¶ A minimum value objective function
J(x) = ½ μ ||P_[m, +∞[(x) - m||².
- Parameters
vmin (float) – The minimum value m.
hyper (float) – The hyperparameter value μ.
name (str) – The name of the objective.
-
operator
(point)¶ Compute the output of Vx.
-
value
(point)¶ Return the value at current point.
-
gradient
(point)¶ Compute the gradient at current point.
-
norm_mat_major
(vecs, point)¶ Return the normal matrix of the quadratic major function.
Given vectors W = V·S, return Wᵀ·diag(b)·W
where S are the vectors defining a subspace and b are Geman & Reynolds coefficients at given point.
- Parameters
vecs (array) – The W vectors.
point (array) – The given point where to compute Geman & Reynolds coefficients b.
- Returns
out – The normal matrix
- Return type
array
-
class
Vmax
(vmax, hyper, name='')¶ A maximum value objective function
\[J(x) = \frac{1}{2} \mu \|P_{[M, +\infty[}(x) - M\|_2^2.\]- vmaxfloat
The maximum value M.
- hyperfloat
The hyperparameter value μ.
-
__init__
(vmax, hyper, name='')¶ A maximum value objective function
Return J(x) = ½ μ ||P_[M, +∞[(x) - M||².
- Parameters
vmax (float) – The maximum value M.
hyper (float) – The hyperparameter value μ.
name (str) – The name of the objective.
-
operator
(point)¶ Compute the output of Vx.
-
value
(point)¶ Return the value at current point.
-
gradient
(point)¶ Compute the gradient at current point.
-
norm_mat_major
(vecs, point)¶ Return the normal matrix of the quadratic major function.
Given vectors W = V·S, return Wᵀ·diag(b)·W
where S are the vectors defining a subspace and b are Geman & Reynolds coefficients at given point.
- Parameters
vecs (array) – The W vectors.
point (array) – The given point where to compute Geman & Reynolds coefficients b.
- Returns
out – The normal matrix
- Return type
array
Sum of objectives¶
The MixedObjective
is a convenient (not required) list-like class that
represent the sum of BaseObjective
. Moreover, BaseObjective
and MixedObjective
support the “+” operator and returns a
MixedObjective
instance, or update the instance, respectively. Since
MixedObjective
is a list, it can be used with optimization
algorithms.
likelihood = QuadObjective(...)
prior1 = Objective(...)
prior2 = Objective(...)
# Equivalent to objective = MixedObjective([likelihood, prior1])
objective = likelihood + prior1
# Equivalent to objective.append(prior2)
objective = objective + prior2
# Equivalent to res = mmmg([likelihood, prior1, prior2], ...)
res = mmmg(objective, ...)
-
class
MixedObjective
(objv_list)¶ Represents a mixed objective function
\[J(x) = \sum_k \mu_k \Psi_k \left(V_k x - \omega_k \right)\]This is a Sequence (or list-like) and instance of this class can be used in optimization algorithms.
-
__init__
(objv_list)¶ A mixed objective function
J(x) = ∑ₖ μₖ ψₖ(Vₖ·x - ωₖ).
- Parameters
objv_list (list of BaseObjective) –
-
property
lastv
¶ Return the value of objectives obtained during gradient computation.
-
insert
(index, value)¶ S.insert(index, value) – insert value before index
-
value
(point)¶ The value J(x)
-
gradient
(point)¶ The gradient ∇J(x)
-
Losses classes¶
The class Loss
is an abstract base class and serve as parent class for
all losses. At that time, the provided concrete loss functions are
Square
, Huber
, Hyperbolic
, HebertLeahy
,
GemanMcClure
, and TruncSquareApprox
.
Note
The Loss
class implements __call__
interface allowing objects to
behave like callable (function), returning the function value
u = np.linspace(-5, 5, 1000)
pot = qmm.Huber(1)
plt.plot(u, pot(u))
-
class
Loss
(inf, convex=False, coercive=False)¶ An abstract base class for loss φ.
The class has the following attributes.
- inffloat
The value of lim_{u→0} φ’(u) / u.
- convexboolean
A flag indicating if the loss is convex (not used).
- coerciveboolean
A flag indicating if the loss is coercive (not used).
-
__init__
(inf, convex=False, coercive=False)¶ The loss φ
- Parameters
inf (float) – The value of lim_{u→0} φ’(u) / u.
convex (boolean) – A flag indicating if the loss is convex.
coercive (boolean) – A flag indicating if the loss is coercive.
-
abstract
value
(point)¶ The value φ(·) at given point.
-
abstract
gradient
(point)¶ The gradient φ’(·) at given point.
-
gr_coeffs
(point)¶ The Geman & Reynolds φ’(·)/· coefficients at given point.
-
gy_coeffs
(point)¶ The Geman & Yang · - φ’(·) coefficients at given point.
Square¶
Huber¶
-
class
Huber
(delta)¶ The convex coercive Huber loss
\[\begin{split}\varphi(u) = \begin{cases} \frac{1}{2} u^2 & \text{, if } u \leq \delta, \\ \delta |u| - \frac{\delta^2}{2} & \text{, otherwise.} \end{cases}\end{split}\]-
__init__
(delta)¶ The Huber loss.
-
value
(point)¶ The value φ(·) at given point.
-
gradient
(point)¶ The gradient φ’(·) at given point.
-
Hyperbolic or Pseudo-Huber¶
-
class
Hyperbolic
(delta)¶ The convex coercive hyperbolic loss
\[\varphi(u) = \delta^2 \left( \sqrt{1 + \frac{u^2}{\delta^2}} -1 \right)\]This is sometimes called Pseudo-Huber.
-
__init__
(delta)¶ The hyperbolic loss.
-
value
(point)¶ The value φ(·) at given point.
-
gradient
(point)¶ The gradient φ’(·) at given point.
-
Hebert & Leahy¶
Geman & Mc Clure¶
Truncated Square approximation¶
-
class
TruncSquareApprox
(delta)¶ The non-convex non-coercive truncated square approximation
\[\varphi(u) = 1 - \exp \left(- \frac{u^2}{2\delta^2} \right)\]-
__init__
(delta)¶ The truncated square approximation.
-
value
(point)¶ The value φ(·) at given point.
-
gradient
(point)¶ The gradient φ’(·) at given point.
-