Group Sparsity Residual Constraint Image Denoising Model with 𝒍1/𝒍2 Regularization

Di WU; Tao ZHANG; Xutao MO

doi:10.1051/wujns/2023281053

All issues

Volume 28 / No 1 (February 2023)

Wuhan Univ. J. Nat. Sci., 28 1 (2023) 53-60

Full HTML

Open Access

Issue		Wuhan Univ. J. Nat. Sci. Volume 28, Number 1, February 2023


Page(s)		53 - 60
DOI		https://doi.org/10.1051/wujns/2023281053
Published online		17 March 2023

Wuhan University Journal of Natural Sciences, 2023, Vol.28 No.1, 53-60

Computer Science

CLC number: TP 751

Group Sparsity Residual Constraint Image Denoising Model with 𝒍₁/𝒍₂ Regularization

Di WU¹, Tao ZHANG²^† and Xutao MO²

¹ Key Laboratory of Multidisciplinary Management and Control of Complex Systems of Anhui Higher Education Institutes, Anhui University of Technology, Maanshan 243002, Anhui, China
² School of Mathematics and Physics, Anhui University of Technology, Maanshan 243032, Anhui, China

^† To whom correspondence should be addressed. E-mail: zt9877@163.com

Received: 18 October 2022

Abstract

Group sparse residual constraint with non-local priors (GSRC) has achieved great success in image restoration producing state-of-the-art performance. In the GSRC model, the $l_{1}$ norm minimization is employed to reduce the group sparse residual. In recent years, non-convex regularization terms have been widely used in image denoising problems, which have achieved better results in denoising than convex regularization terms. In this paper, we use the ratio of the $l_{1}$ and $l_{2}$ norm instead of the $l_{1}$ norm to propose a new image denoising model, i.e., a group sparse residual constraint model with $l_{1} / l_{2}$ minimization (GSRC- $l_{1} / l_{2}$ ). Due to the computational difficulties arisen from the non-convexity and non-linearity, we focus on a constrained optimization problem that can be solved by alternative direction method of multipliers (ADMM). Experimental results of image denoising show that the pro-posed model outperforms several state-of-the-art image denoising methods both visually and quantitatively.

Key words: image denoising / l₁/l₂ minimization / group sparse representation

Biography: WU Di, male, Master candidate, research direction: image processing. E-mail: wudi2020wudi@163.com

Fundation item: Supported by the Open Fund of Key Laboratory of Anhui Higher Education Institutes (CS2021-07) , the National Natural Science Foundation of China (61701004) , the Outstanding Young Talents Support Program of Anhui Province (GXYQ 2021178 ) and University Natural Science Research Project of Anhui Province of China ( KJ2020A0238)

© Wuhan University 2023

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

0 Introduction

The goal of image denoising is to estimate the latent image $x \in R^{m \times n}$ from its degraded observation $y \in R^{m \times n}$ , which is typically an ill-posed inverse problem. To tackle the ill-posed nature of this problem, regularization methods based on various image prior information have to be incorporated into the image denoising process, which is usually realized by minimizing the objective function as

$\hat{x} = \underset{x}{a r g m i n} {‖ y - x ‖}_{2}^{2} + λ R (x)$ (1)

where $x$ and $y$ represent the original image and the degraded image, respectively. The first term is the data fidelity and the second term is the regularization term. $λ$ is a regularization parameter that balances these two terms. During the past decades, a variety of image denoising methods have been proposed, such as total variation^[1,2], sparse representation^[3-5], non-local self-similarity^[6,7], and neural network models^[8-10].

Nowadays, sparse representation has become a mainstream direction in image processing. Aharon et al^[11]used singular value decomposition to update the basic atoms and corresponding coefficients in a minimum error criterion, and pioneered the K-SVD (singular value decomposition) algorithm with a K-time optimization process in 2006. In dictionary learning and sparse coding, each block is treated as an independent individual, while blocks are generally correlated with each other, which is usually ignored. Mairal et al^[12]assumed that the sparse coefficients of similar blocks had the same support mechanism and combined the image redundancy information to improve the K-SVD denoising performance by learning simultaneous sparse coding (LSSC). Dong et al^[13] trained principal component analysis (PCA) dictionaries by clustering all image blocks, and for each patch group, the most appropriate local PCA dictionary was selected to encode it, thus called the non-local centralized sparse representation (NCSR) model. This NCSR method introduces the concept of "coding residuals". Zhang et al^[14]formally proposed the concept of "group" as the basic processing unit of sparse coding, and proposed a group-based sparse representation (GSR) model to extend the mutually independent patch sparse to group sparse which is applicable to both non-local similarity and local sparsity of images. Zha et al^[15] proposed a group sparse residual constraint model with non-local priors (GSRC-NLP) and used soft thresholding to achieve image denoising, which essentially combines the NCSR method and the GSR model to upgrade the patch-based sparse coding residuals to the group-based sparse residuals.

Other than the popular $l_{1}$ norm, there are many regularization functions to promote sparsity. Bai et al^[16]proposed an adaptive correction term to alleviate the over constraint of the $l_{1}$ norm . The $l_{1} / l_{2}$ regularization term is also used to improve sparsity and can alleviate the over-constraint of $l_{1}$ norm to some extent ^[17]. In addition to this, many non-convex regularization terms have been proposed, such as $l_{1 / 2}$ norm ^[18], $l_{p}$ norm ^[19], and Logarithmic regularization term^[20].

In this paper, we study the ratio of the $l_{1}$ and $l_{2}$ norms, denoted as $l_{1} / l_{2}$ , to promote the sparsity of group sparse residual. The contributions of the paper are summarized as follows:

First, the $l_{1} / l_{2}$ regularization term is used to reduce the group sparse residual, and then a group sparse residual constraint model with $l_{1} / l_{2}$ minimization is proposed. The proposed model is successfully solved by the alternative direction method of multipliers (ADMM) algorithm.

Second, experimental results on image denoising show that the proposed scheme is feasible and outperforms many state-of-the-art methods in both objective and perceptual quality.

1 Background and Related Work

1.1 Group Sparse Representation

In the framework of GSR method^[14], an image $x$ with size $N$ is divided into $n$ overlapped patches $x_{i}$ of size $\sqrt[]{b} \times \sqrt[]{b}$ , and each patch is denoted by a vector $x_{i} \in R^{b \times m}$ . Then for each patch $x_{i}$ , $m$ similar matched patches are selected from a $W \times W$ sized searching window by K-nearest neighbor (KNN) algorithm^[21]. All patches in set $S_{i}$ are stacked into a matrix $X_{i} \in R^{b \times m}$ , i.e., $X_{i} = {x_{i, 1}, x_{i, 2}, \dots, x_{i, m}}$ . This matrix $X_{i}$ consisting of patches with similar structures is called a patch group, where ${x_{i, j}}$ denotes the $j - t h$ patch in the $i - t h$ patch group.

Give a dictionary $D_{i}$ , which is often learned from each group, such as PCA dictionary^[22]. Each group $X_{i}$ can be sparsely represented by solving the following $l_{1}$ norm minimization problem:

${\hat{A}}_{i} = \underset{A_{i}}{a r g m i n} {\frac{1}{2} {‖ Y_{i} - D_{i} A_{i} ‖}_{F}^{2} + λ {‖ A_{i} ‖}_{1}}$ (2)

where $A_{i}$ represents the group sparse coefficient of each group $X_{i}$ , i.e., $X_{i} = D_{i} A_{i}$ . $D_{i}$ is an adaptive PCA dictionary learned from each patch group.

1.2 Construction of Group Sparse Coefficient Residuals $A_{i} - B_{i}$

In Ref.[18], the authors pointed out that traditional GSR models fail to faithfully estimate the sparsity of each group due to the degradation of the observed image, and they showed that the quality of image restoration largely depends on the group sparsity residual, which is defined as the difference between each degraded group sparse coefficient $A_{i}$ and the corresponding original group sparse coefficient $B_{i}$ :

$R_{i} = A_{i} - B_{i}$ (3)

So, they proposed the group sparsity residual constraint model to boost up the accuracy of group sparse coefficient $A_{i}$ , which can be written as

${\hat{A}}_{i} = \underset{A_{i}}{a r g m i n} {\frac{1}{2} {‖ Y_{i} - D_{i} A_{i} ‖}_{F}^{2} + λ {‖ A_{i} - B_{i} ‖}_{1}}$ (4)

where $B_{i} = {β_{1}, β_{2}, \dots, β_{m}}$ is the group sparse coefficient of each group corresponding to the original image. As the original image $x$ is not available in image denoising, the non-local self-similarity prior is used to estimate $B_{i}$ . For each group including $m$ non-local similar patches, a good estimation of $β_{k}$ can be computed by the weighted average of each vector $α_{j}$ in $A_{i}$ . We have

$β_{k} = \sum_{j = 1}^{m} w_{j} α_{j}$ (5)

where $A_{i} = D_{i}^{T} Y_{i}$ , $β_{k}$ and $α_{j}$ represent the $k - t h$ and $j - t h$ vector of $B_{i}$ and $A_{i}$ , respectively. After this, $β_{k}$ is copied by $m$ times to estimate $B_{i}$ . We can obtain $B_{i} = {β_{k}, β_{k}, \dots, β_{k}}$ . In Eq. (5), $w_{j}$ is set to be inversely proportional to the distance between the target patch $y_{i}$ and its similar patch $y_{i, j}$ , i.e.,

$w_{j} = \frac{1}{L} e x p (- {‖ y_{i} - y_{i, j} ‖}_{2}^{2} / h)$

where $h$ is a predefined constant and $L$ is a normalization factor ^[23].

1.3 $l_{1} / l_{2}$ Minimization

In Ref.[17], the authors employed the $l_{1} / l_{2}$ norm to promote the sparsity and proposed the following model for sparse recovery:

$\underset{x \in R^{n}}{m i n} \frac{{‖ x ‖}_{1}}{{‖ x ‖}_{2}}, s . t . A x = b$ (6)

Two main advantages of $l_{1} / l_{2}$ minimization are scale invariant and parameter free. As the ratio of $l_{1}$ and $l_{2}$ is non-convex and non-linear, solving model (6) and analyzing the convergence become theoretically difficult. The authors of Ref.[17] applied ADMM for minimizing model (11), and verified the convergence empirically by plotting residual errors and objective functions. In Ref.[17], the authors applied the $l_{1} / l_{2}$ norm on the image gradient for MRI reconstruction problem.

2 The GSRC- $l_{1} / l_{2}$ Model for Image Denoising

In this subsection, we employ the ratio of $l_{1}$ and $l_{2}$ norm as the regularizer to promote the sparsity of the group sparse residual mentioned before. The proposed model is

${\hat{A}}_{i} = \underset{A_{i}}{a r g m i n} {\frac{1}{2 λ_{0}} {‖ y - D α ‖}_{2}^{2} + \sum_{i = 1}^{n} \frac{{‖ A_{i} - B_{i} ‖}_{1}}{{‖ A_{i} - B_{i} ‖}_{2}}}$ (7)

where $D$ represents the dictionary and $α$ is the sparse coefficient.

As we mentioned before, patch group is the unite of the proposed sparse representation. Supposing $x, y \in R^{n}$ , the corresponding $n$ patch group obtained by block matching is denoted by $X_{i}, Y_{i} \in R^{b \times m}$ . Let $x = D α$ , then Eq. (7) can be rewritten as

${\hat{A}}_{i} = \underset{A_{i}}{a r g m i n} {\frac{1}{2 λ_{0}} {‖ y - x ‖}_{2}^{2} + \sum_{i = 1}^{n} \frac{{‖ A_{i} - B_{i} ‖}_{1}}{{‖ A_{i} - B_{i} ‖}_{2}}}$ (8)

By means of Theorem I in Ref.[15], the following equation holds with a very large probability at each iteration.

$\frac{1}{N} {‖ x - y ‖}_{2}^{2} = \frac{1}{K} \sum_{i = 1}^{n} {‖ X_{i} - Y_{i} ‖}_{F}^{2}$ (9)

Based on Eq. (9), minimization problem (8) is equivalent to

${\hat{A}}_{i} = \underset{A_{i}}{a r g m i n} \sum_{i = 1}^{n} {\frac{1}{2 λ} {‖ Y_{i} - X_{i} ‖}_{F}^{2} + \frac{{‖ A_{i} - B_{i} ‖}_{1}}{{‖ A_{i} - B_{i} ‖}_{2}}}$ (10)

where $λ = \frac{λ_{0} K}{N}$ , $K = b \times m \times n$ . Let $X_{i} = D_{i} A_{i}$ and $Y_{i} = D_{i} S_{i}$ , $D_{i}$ is a PCA based dictionary learned from each group $Y_{i}$ , so it is orthogonal. By utilizing ADMM and introducing two auxiliary variables $P_{i} = A_{i} - B_{i}$ and $Q_{i} = A_{i} - B_{i}$ , Eq. (10) can be transformed into another equivalent constraint form:

${\begin{matrix} \underset{A_{i}, P_{i}, Q_{i}}{a r g m i n} \sum_{i = 1}^{n} (\frac{1}{2 λ} {‖ A_{i} - S_{i} ‖}_{F}^{2} + \frac{{‖ P_{i} ‖}_{1}}{{‖ Q_{i} ‖}_{2}}) \\ s . t . P_{i} = A_{i} - B_{i}, Q_{i} = A_{i} - B_{i} \end{matrix}$ (11)

The augmented Lagrange function plays a central role in the ADMM method. For the problem (11), the augmented Lagrange function can be written as

$\begin{array}{l} L_{ρ_{1}, ρ_{2}} (A_{i}, P_{i}, Q_{i}, μ_{1}, μ_{2}) \\ = \frac{{‖ P_{i} ‖}_{1}}{{‖ Q_{i} ‖}_{2}} + \frac{1}{2 λ} {‖ A_{i} - S_{i} ‖}_{F}^{2} + 〈 μ_{1}, P_{i} - A_{i} + B_{i} 〉 \\ + \frac{ρ_{1}}{2} {‖ P_{i} - A_{i} + B_{i} ‖}_{F}^{2} \\ + 〈 μ_{2}, Q_{i} - A_{i} + B_{i} 〉 + \frac{ρ_{2}}{2} {‖ Q_{i} - A_{i} + B_{i} ‖}_{F}^{2} \end{array}$ (12)

Then, the ADMM iteration goes as follows

$\begin{array}{l} A_{i}^{k + 1} = \underset{A_{i}}{a r g m i n} \frac{1}{2 λ} {‖ A_{i} - S_{i} ‖}_{F}^{2} \\ + \frac{ρ_{1}}{2} {‖ A_{i} - (P_{i}^{k} + B_{i} + \frac{μ_{1}}{ρ_{1}}) ‖}_{F}^{2} + \frac{ρ_{2}}{2} {‖ A_{i} - (Q_{i}^{k} + B_{i} + \frac{μ_{2}}{ρ_{2}}) ‖}_{F}^{2} \end{array}$ (13)

$P_{i}^{k + 1} = \underset{P_{i}}{a r g m i n} \frac{{‖ P_{i} ‖}_{1}}{{‖ Q_{i}^{k} ‖}_{2}} + \frac{ρ_{1}}{2} {‖ A_{i}^{k + 1} - (P_{i} + B_{i} + \frac{μ_{1}}{ρ_{1}}) ‖}_{F}^{2}$ (14)

$Q_{i}^{k + 1} = \underset{Q_{i}}{a r g m i n} \frac{{‖ P_{i}^{k + 1} ‖}_{1}}{{‖ Q_{i} ‖}_{2}} + \frac{ρ_{2}}{2} {‖ A_{i}^{k + 1} - (Q_{i} + B_{i} + \frac{μ_{2}}{ρ_{2}}) ‖}_{F}^{2}$ (15)

$μ_{1}^{k + 1} = μ_{1}^{k} + ρ_{1} (P_{i}^{k + 1} - A_{i}^{k + 1} + B_{i})$ (16)

$μ_{2}^{k + 1} = μ_{2}^{k} + ρ_{2} (Q_{i}^{k + 1} - A_{i}^{k + 1} + B_{i})$ (17)

Next, we will discuss the optimization strategies for solving each sub-problem one by one.

2.1 $A_{i}$ Sub-Problem

The sub-problem of $A_{i}$ is a least square problem. From the optimal conditions, we can obtain

$\begin{array}{l} \frac{1}{λ} (A_{i} - S_{i}) + ρ_{1} (A_{i} - (P_{i}^{k} + B_{i} + \frac{μ_{1}}{ρ_{1}})) + ρ_{2} (A_{i} - (Q_{i}^{k} + B_{i} \\ + \frac{μ_{2}}{ρ_{2}})) = 0 \end{array}$

The solution of $A_{i}$ is

$A_{i}^{k + 1} = \frac{S_{i} + λ ρ_{1} (P_{i}^{k} + B_{i} + \frac{μ_{1}}{ρ_{1}}) + λ ρ_{2} (Q_{i}^{k} + B_{i} + \frac{μ_{2}}{ρ_{2}})}{1 + λ ρ_{1} + λ ρ_{2}}$ (18)

2.2 $P_{i}$ Sub-Problem

The $P_{i}$ sub-problem is equivalent to

$P_{i}^{k + 1} = \underset{P_{i}}{a r g m i n} {‖ P_{i} ‖}_{1} + \frac{ρ_{1} {‖ Q_{i}^{k} ‖}_{2}}{2} {‖ P_{i} - (A_{i}^{k + 1} - B_{i} - \frac{μ_{1}}{ρ_{1}}) ‖}_{F}^{2}$

It has a closed form solution via soft shrinkage, i.e.,

$\begin{array}{l} P_{i}^{k + 1} = m a x {| A_{i}^{k + 1} - B_{i} - \frac{μ_{1}}{ρ_{1}} | - \frac{1}{{‖ Q_{i}^{k} ‖}_{2} ρ_{1}}, 0} \circ s g n (A_{i}^{k + 1} \\ - B_{i} - \frac{μ_{1}}{ρ_{1}}) \end{array}$ (19)

where $\circ$ represents point-wise product.

2.3 $Q_{i}$ Sub-Problem

As for the $Q_{i}$ sub-problem, let $c^{k} = {‖ P_{i}^{k + 1} ‖}_{1}$ , $d^{k} = A_{i}^{k + 1} - B_{i} - \frac{μ_{2}}{ρ_{2}}$ and the minimization problem reduces to

$Q_{i}^{k + 1} = \underset{Q_{i}}{a r g m i n} \frac{c^{k}}{{‖ Q_{i} ‖}_{2}} + \frac{ρ_{2}}{2} {‖ Q_{i} - d^{k} ‖}_{F}^{2}$ (20)

If $d^{k} = 0$ , then any vector $Q_{i}$ with ${‖ Q_{i} ‖}_{2} = \sqrt[3]{\frac{c^{k}}{ρ_{2}}}$ is a solution to the minimization problem. If $c^{k} = 0$ , then $Q_{i} = d^{k}$ is the solution. If $c^{k} \neq 0, d^{k} \neq 0$ , by taking derivative of the objective function with respect to $Q_{i}$ , we obtain

$(- \frac{c^{k}}{{‖ Q_{i} ‖}_{2}^{3}} + ρ_{2}) Q_{i} = ρ_{2} d^{k}$ (21)

As a result, there exists a positive number $τ^{k}$ such that $Q_{i} = τ^{k} d^{k}$ . Thus, Eq. (21) reduces to

$(- \frac{c^{k}}{(τ^{k})^{3} {‖ d^{k} ‖}_{2}^{3}} + ρ_{2}) τ^{k} d^{k} = ρ_{2} d^{k}$

Given $d^{k}$ , we denote

$η^{k} = {‖ d^{k} ‖}_{2}, M^{k} = \frac{c^{k}}{ρ_{2} (η^{k})^{3}}$

Then $τ^{k}$ is the root of

$τ^{3} - τ^{2} - M^{k} = 0$

Let $F (τ) = τ^{3} - τ^{2} - M^{k}$ , we can find $F (τ) = 0$ has only one real root, which has the following closed-form:

$τ = \frac{1}{3} + \frac{1}{3} (C^{k} + \frac{1}{C^{k}})$

where $C^{k} = \sqrt[3]{\frac{27 M^{k} + 2 + \sqrt[]{(27 M^{k} {+ 2)}^{2} - 4}}{2}}$ .

In summary, we have

$Q_{i}^{k + 1} = {\begin{matrix} e^{k}, i f d^{k} = 0 \\ τ^{k} d^{k}, i f d^{k} \neq 0 \end{matrix}$ (22)

where $e^{k}$ is a random vector with the $l_{2}$ norm to be $\sqrt[3]{\frac{c^{k}}{ρ_{2}}}$ .

Finally, the final estimated group patch $X_{i}$ is computed by ${\hat{X}}_{i} = D_{i} {\hat{A}}_{i}$ . We obtain the complete image $\hat{x}$ by putting the group back to its original position and averaging the overlapping pixels.

2.4 Settings of Parameters and Iterations

In addition, to obtain better results, we can perform the following several iterations of the denoising process. In the $t - t h$ iteration, an iterative regularization strategy is used to update the estimation of the noise variance, and therefore updating $y^{t}$ . The standard deviation of the noise $σ$ in the $t - t h$ iteration is adjusted to

$σ_{n}^{t} = ρ \sqrt[]{(σ_{n}^{2} - {‖ y - {\hat{x}}^{t} ‖}_{2}^{2})}$

where $ρ > 0$ is a constant.

Since the noise variance changes after each regularization process, the parameter $λ$ , which is used to balance the fidelity and regularization terms, needs to change along with it. After each regularization, the parameter $λ$ of group $Y_{i}$ is set to

$λ = \frac{\bar{c} 2 \sqrt[]{2} σ_{n}^{2}}{δ_{i} + κ}$ (23)

where $δ_{i}$ denotes the estimated standard deviation of $A_{i} - B_{i}$ and $\bar{c}, κ$ are constants. The complete description of the proposed GSRC- $l_{1} / l_{2}$ model for image denoising employing ADMM framework is presented in Algorithm 1.

Algorithm 1 The proposed GSRC-l₁/l₂ model for image denoising

1. Require: Noisy image

y

.
2. Initialization: Input

{\hat{x}}^{0} = y

,

y^{0} = y

,

P_{i}^{0}

,

Q_{i}^{0}

,

μ_{1}

,

μ_{2}

,

δ_{i}

,

ρ_{1}

,

ρ_{2}

,

σ_{n}

,

c

,

m

,

L

,

h

,

ρ

,

μ

and

ε

.
3. For

t = 1

to Max-Iter
4. Iterative Regularization

y^{t} = {\hat{x}}^{t - 1} + γ (y - y^{t - 1})

5. For each patch

y_{i}

in

y^{t}

do
6. Find non-local similar patches from group

Y_{i}

.
7. Obtain dictionary

D_{i}

by

Y_{i}

using PCA.
8. Update

Α_{i}

by computing

A_{i} = D_{i}^{T} Y_{i}

.
9. Update

B_{i}

by computing Eq. (5).
10. Update

λ

by computing Eq. (23).
11. While

k

< ADMM-Iter or

\frac{{‖ A_{i}^{k} - A_{i}^{k - 1} ‖}_{2}}{{‖ A_{i}^{k} ‖}_{2}} > ε

12. Update

A_{i}^{k + 1}

by computing Eq. (18).
13. Update

P_{i}^{k + 1}

by computing Eq. (19).
14. Compute

Q_{i}^{k + 1}

by computing Eq. (22).
15. Update

μ_{1}^{k + 1}

via (16).
16. Update

μ_{2}^{k + 1}

via (17).
17. Get the estimation:

{\hat{X}}_{i} = D_{i} {\hat{A}}_{i}

.
18. End while
19. End for
20. Aggregate

{\hat{X}}_{i}

to the denoised image

{\hat{x}}^{t}

.
21. End for
22. Output: The final denoised image

\hat{x}

.

3 Experiments

In this section, we will present extensive experimental results to evaluate the denoising performance of the proposed GSRC- $l_{1} / l_{2}$ model. To evaluate the quality of the recovered images, the peak signal to noise ratio (PSNR) and structural similarity (SSIM) metrics are used. The source codes of all comparison methods were obtained from the original authors, and we used the default parameter settings. For color images, image restoration operators are only performed for the luminance component in this paper.

3.1 Setting Adaptive Parameters of the GSRC- $l_{1} / l_{2}$ Model

In this subsection, we report the performance of the proposed GSRC- $l_{1} / l_{2}$ for image denoising and compare them with some leading denoising methods, including BM3D^[24], NCSR^[13], PGPD^[25], SAIST^[26]and GSRC-NLP ^[15]. The parameter setting of the proposed GSRC- $l_{1} / l_{2}$ model for image denoising is as follows. The size of searching window for block matching is set to $25 \times 25$ . All experiments are conducted under the MATLAB 2018a environment on a computer with Intel (R) Core (TM) i7-10750H with 2.60 GHz CPU and NVIDIA GeForce GTX 1650 Ti and Windows 10 platform. In the GSRC- $l_{1} / l_{2}$ model, the size of patch $\sqrt[]{b} \times \sqrt[]{b}$ is set to $7 \times 7$ for $20 < σ \leq 50$ . In addition, the parameter settings of the model are different. The parameters $(c, γ, ρ_{1}, ρ_{2})$ are (0.6,0.1,0.023,0.001), (0.7,0.1,0.003 4,0.001) and (0.7,0.1,0.0034,0.001) set to for $σ_{n} = 30$ , $σ_{n} = 40$ and $σ_{n} = 50$ .

3.2 Analysis of Results

Note that BM3D, PGPD and SAIST are the GSR-based or NSS-based image denoising methods. NCSR and GSRC-NLP are group sparsity residual based image denoising methods. We test the denoising performance on the presented test images and the highest PSNR is recorded in this paper. The PSNR results for all methods are shown in Table 1.

The proposed GSRC- $l_{1} / l_{2}$ has an average PSNR increase of 0.26dB, 0.32dB, 0.27dB, 0.07dB and 0.03dB compared with BM3D, NCSR, PGPD, SAIST and GSRC-NLP, respectively. In addition, the proposed GSRC- $l_{1} / l_{2}$ has an average SSIM increase of 0.013 3, 0.012 9, 0.014 4, 0.007 2 and 0.003 3 compared with BM3D, NCSR, PGPD, SAIST and GSRC-NLP, respectively. We can observe that the proposed GSRC- $l_{1} / l_{2}$ can obtain better results than other competing methods.

Figures 1　and 2 show the visual comparison of the images Barbara and Mural, respectively. To better compare the details of the reconstructed images, we zoom in the same parts of each image to twice the original size. It can be seen that BM3D, NCSR, SAIST and GSRC-NLP are prone to over smooth the images. PGPD does not protect image details. In contrast, the proposed GSRC- $l_{1} / l_{2}$ is able to preserve local structure of the image more effectively than other competing approaches.

Fig. 1 Denoising results on image Barbaraby different methods (noise level

σ_{n} = 30

)

Fig. 2 Denoising results on image Mural by different methods (noise level

σ_{n} = 40

)

Table 1

PSNR and SSIM result by denoising methods

3.3 Empirical Convergence Analysis of GSRC- $l_{1} / l_{2}$ Model

The existing literature on the ADMM convergence requires the existence of one separable function in the objective function, whose gradient is Lipschitz continuous. However, the proposed GSRC- $l_{1} / l_{2}$ model does not satisfy this assumption. Therefore, it is difficult to analyze the convergence theoretically. Instead, in this subsection we will show the convergence empirically by plotting residual errors and objective functions, which provides strong supports for the convergence validation. In Fig. 3, we empirically demonstrate the convergence of the proposed GSRC- $l_{1} / l_{2}$ algorithm. There are two auxiliary variables $P_{i}$ and $Q_{i}$ in $l_{1} / l_{2}$ , i.e., $P_{i} = A_{i} - B_{i}$ and $Q_{i} = A_{i} - B_{i}$ .Figure 3(a) shows the values of $\sum_{i = 1}^{n} {‖ P_{i}^{k} - A_{i}^{k} + B_{i}^{k} ‖}_{2}$ and $\sum_{i = 1}^{n} {‖ Q_{i}^{k} - A_{i}^{k} + B_{i}^{k} ‖}_{2}$ with respect to iteration counter $k$ . Figure 3(b) shows the values of the objective function (11) versus iteration counter $k$ . The constraints and objective functions in Fig. 3 decrease rapidly with respect to the iteration counters, which is the heuristic evidence of the convergence of the Algorithm 1.

Fig. 3 Plots of residual errors and objective function for empirically demonstrating the convergence

4 Conclusion

To improve the performance of group sparse based image restoration approaches, we introduce an effective sparsity promoting strategies, i.e., $l_{1} / l_{2}$ minimization to promote the sparsity of the group sparse residual. We use the non-local self-similar prior of images alone with a self-supervised learning scheme to obtain good estimates of the group sparsity coefficients for each original group. Then we adapt the $l_{1} / l_{2}$ minimization to reduce the group sparse residual. Compared with $l_{1}$ minimization in GSRC-NLP method, the proposed GSRC- $l_{1} / l_{2}$ can lead to better performance in image denoising. We apply the ADMM algorithm to solve the models and explore the convergence of the proposed algorithm. Experimental results show that the model is comparable with the tested denoising methods, and outperforms many state-of-the-art image denoising methods in terms of both objective and perceptual quality metrics.

References

Blomgren P, Chan T F. Color TV: Total variation methods for restoration of vector-valued images[J]. IEEE Transactions on Image Processing, 1998, 7(3):304-309. [NASA ADS] [CrossRef] [PubMed] [Google Scholar]
Bertalmio M, Caselles V, Rouge B, et al. A TV based restoration model with local constraints. [J]. Journal of Scientific Computing, 2003, 19(1):95-122. [Google Scholar]
Li H, Liu F. Image denoising via sparse and redundant representations over learned dictionaries in wavelet domain[C]// 2009 Fifth International Conference on Image and Graphics. Washington D C: IEEE, 2009: 754-758. [Google Scholar]
Mairal J, Elad M, Sapiro G. Sparse representation for color image restoration[J]. IEEE Transactions on Image Processing, 2007, 17(1):53-69. [Google Scholar]
Aharon M, Elad M, Bruckstein A M. On the uniqueness of overcomplete dictionaries, and a practical way to retrieve them[J]. Linear Algebra & Its Applications, 2006, 416(1):48-67. [CrossRef] [MathSciNet] [Google Scholar]
Zha Z, Yuan X, Zhu C, et al. Image restoration via simultaneous nonlocal selfsimilarity priors[J]. IEEE Transactions on Image Processing, 2020, 29: 8561-8576. [NASA ADS] [CrossRef] [Google Scholar]
Gu S H, Zhang L, Zuo W M, et al. Weighted nuclear norm minimization with application to image denoising[C]// 2014 IEEE Conference on Computer Vision and Pattern Recognition. Washington D C : IEEE, 2014: 2862-2869. [Google Scholar]
Zhang K, Zuo W M, Chen Y J, et al. Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising[J]. IEEE Transactions on Image Processing, 2016, 26(7):3142-3155. [Google Scholar]
Dong W S, Wang P Y, Yin W T, et al. Denoising prior driven deep neural network for image restoration[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 41(10): 2305-2318. [CrossRef] [PubMed] [Google Scholar]
Zhang K, Zuo W M, Gu S H, et al. Learning deep CNN denoiser prior for image restoration[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Washington D C : IEEE, 2017: 2808-2817. [Google Scholar]
Aharon M, Elad M, Bruckstein. A. K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation[J]. IEEE Transactions on Signal Processing, 2006, 54(11): 4311-4322. [CrossRef] [Google Scholar]
Mairal J, Bach F, Ponce J, et al. Non-local sparse models for image restoration[C]// 2009 IEEE 12th International Conference on Computer Vision. Washington D C : IEEE, 2009: 2272-2279 [Google Scholar]
Dong W S, Zhang L, Shi G M, et al. Nonlocally centralized sparse representation for image restoration[J]. IEEE Transcation on Image Processing, 2013, 22(4):1620-1630. [NASA ADS] [CrossRef] [PubMed] [Google Scholar]
Zhang J, Zhao D B, Gao W. Group-based sparse representation for image restoration[J]. IEEE Transactions on Image Processing, 2014, 23(8): 3336-3351. [NASA ADS] [CrossRef] [MathSciNet] [PubMed] [Google Scholar]
Zha Z Y, Yuan X, Wen B H, et al. Group sparsity residual constraint with non-local priors for image restoration[J]. IEEE Transactions on Image Processing, 2020, 29: 8960-8975. [NASA ADS] [CrossRef] [MathSciNet] [Google Scholar]
Bai M, Zhang X, Shao Q. Adaptive correction procedure for TVL1 image deblurring under impulse noise[J]. Inverse Problems, 2016, 32(8): 289-338. [Google Scholar]
Rahimi Y, Wang C, Dong H B, et al. A scale-invariant approach for sparse signal recovery[J]. SIAM Journal on Scientific Computing, 2019, 41(6):A3649-A3672. [NASA ADS] [CrossRef] [MathSciNet] [Google Scholar]
Xu Z B, Chang X Y, Xu F M, et al. L-1/2 regularization: A thresholding representation theory and a fast solver[J]. IEEE Transactions on Neural Networks and Learning Systems, 2012, 23(7):1013-1027. [CrossRef] [PubMed] [Google Scholar]
Zuo W M, Meng D Y, Zhang L, et al. A generalized iterated shrinkage algorithm for non-convex sparse coding[C]// 2013 IEEE International Conference on Computer Vision. Washington D C: IEEE, 2013: 217-224. [Google Scholar]
Jia X, Feng X. Bayesian inference for adaptive low rank and sparse matrix estimation[J]. Neurocomputing, 2018(5):71-83. [CrossRef] [Google Scholar]
Keller J M, Gray M R, Givens J A. A fuzzy k-nearest neighbor algorithm[J]. IEEE Transactions on Systems Man & Cybernetics,1985, SMC-15(4): 580-585. [CrossRef] [Google Scholar]
Dong W S, Lei Z, Shi G M, et al. Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization[J]. IEEE Transactions on Image Processing, 2011, 20(7):1838-1857. [CrossRef] [MathSciNet] [PubMed] [Google Scholar]
Buades A, Coll B, Morel J M. A non-local algorithm for image denoising[C]// 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05). Washington D C: IEEE, 2005, 2: 60-65. [Google Scholar]
Dabov K, Foi A, Katkovnik V, et al. Image denoising by sparse 3-D transform-domain collaborative filtering[J]. IEEE Transactions on Image Processing, 2007, 16(8):2080-2095. [CrossRef] [Google Scholar]
Xu J, Zhang L , Zuo W M, et al. Patch group based nonlocal self-similarity prior learning for image denoising[C]// 2015 IEEE International Conference on Computer Vision (ICCV). Washington D C: IEEE, 2016: 244-252. [Google Scholar]
Dong W S, Shi G M, Li X. Nonlocal image restoration with bilateral variance estimation: A low-rank approach[J]. IEEE Transactions on Image Processing, 2013, 22(2):700-711. [NASA ADS] [CrossRef] [MathSciNet] [PubMed] [Google Scholar]

All Tables

Table 1

PSNR and SSIM result by denoising methods

In the text

All Figures

	Fig. 1 Denoising results on image Barbaraby different methods (noise level $σ_{n} = 30$ )
In the text

	Fig. 2 Denoising results on image Mural by different methods (noise level $σ_{n} = 40$ )
In the text

	Fig. 3 Plots of residual errors and objective function for empirically demonstrating the convergence
In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.

[1] Blomgren P, Chan T F. Color TV: Total variation methods for restoration of vector-valued images[J]. IEEE Transactions on Image Processing, 1998, 7(3):304-309. [NASA ADS] [CrossRef] [PubMed] [Google Scholar]

[2] Bertalmio M, Caselles V, Rouge B, et al. A TV based restoration model with local constraints. [J]. Journal of Scientific Computing, 2003, 19(1):95-122. [Google Scholar]

[3] Li H, Liu F. Image denoising via sparse and redundant representations over learned dictionaries in wavelet domain[C]// 2009 Fifth International Conference on Image and Graphics. Washington D C: IEEE, 2009: 754-758. [Google Scholar]

[4] Mairal J, Elad M, Sapiro G. Sparse representation for color image restoration[J]. IEEE Transactions on Image Processing, 2007, 17(1):53-69. [Google Scholar]

[5] Aharon M, Elad M, Bruckstein A M. On the uniqueness of overcomplete dictionaries, and a practical way to retrieve them[J]. Linear Algebra & Its Applications, 2006, 416(1):48-67. [CrossRef] [MathSciNet] [Google Scholar]

[6] Zha Z, Yuan X, Zhu C, et al. Image restoration via simultaneous nonlocal selfsimilarity priors[J]. IEEE Transactions on Image Processing, 2020, 29: 8561-8576. [NASA ADS] [CrossRef] [Google Scholar]

[7] Gu S H, Zhang L, Zuo W M, et al. Weighted nuclear norm minimization with application to image denoising[C]// 2014 IEEE Conference on Computer Vision and Pattern Recognition. Washington D C : IEEE, 2014: 2862-2869. [Google Scholar]

[8] Zhang K, Zuo W M, Chen Y J, et al. Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising[J]. IEEE Transactions on Image Processing, 2016, 26(7):3142-3155. [Google Scholar]

[9] Dong W S, Wang P Y, Yin W T, et al. Denoising prior driven deep neural network for image restoration[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 41(10): 2305-2318. [CrossRef] [PubMed] [Google Scholar]

[10] Zhang K, Zuo W M, Gu S H, et al. Learning deep CNN denoiser prior for image restoration[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Washington D C : IEEE, 2017: 2808-2817. [Google Scholar]

[11] Aharon M, Elad M, Bruckstein. A. K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation[J]. IEEE Transactions on Signal Processing, 2006, 54(11): 4311-4322. [CrossRef] [Google Scholar]

[12] Mairal J, Bach F, Ponce J, et al. Non-local sparse models for image restoration[C]// 2009 IEEE 12th International Conference on Computer Vision. Washington D C : IEEE, 2009: 2272-2279 [Google Scholar]

[13] Dong W S, Zhang L, Shi G M, et al. Nonlocally centralized sparse representation for image restoration[J]. IEEE Transcation on Image Processing, 2013, 22(4):1620-1630. [NASA ADS] [CrossRef] [PubMed] [Google Scholar]

[14] Zhang J, Zhao D B, Gao W. Group-based sparse representation for image restoration[J]. IEEE Transactions on Image Processing, 2014, 23(8): 3336-3351. [NASA ADS] [CrossRef] [MathSciNet] [PubMed] [Google Scholar]

[15] Zha Z Y, Yuan X, Wen B H, et al. Group sparsity residual constraint with non-local priors for image restoration[J]. IEEE Transactions on Image Processing, 2020, 29: 8960-8975. [NASA ADS] [CrossRef] [MathSciNet] [Google Scholar]

[16] Bai M, Zhang X, Shao Q. Adaptive correction procedure for TVL1 image deblurring under impulse noise[J]. Inverse Problems, 2016, 32(8): 289-338. [Google Scholar]

[17] Rahimi Y, Wang C, Dong H B, et al. A scale-invariant approach for sparse signal recovery[J]. SIAM Journal on Scientific Computing, 2019, 41(6):A3649-A3672. [NASA ADS] [CrossRef] [MathSciNet] [Google Scholar]

[18] Xu Z B, Chang X Y, Xu F M, et al. L-1/2 regularization: A thresholding representation theory and a fast solver[J]. IEEE Transactions on Neural Networks and Learning Systems, 2012, 23(7):1013-1027. [CrossRef] [PubMed] [Google Scholar]

[19] Zuo W M, Meng D Y, Zhang L, et al. A generalized iterated shrinkage algorithm for non-convex sparse coding[C]// 2013 IEEE International Conference on Computer Vision. Washington D C: IEEE, 2013: 217-224. [Google Scholar]

[20] Jia X, Feng X. Bayesian inference for adaptive low rank and sparse matrix estimation[J]. Neurocomputing, 2018(5):71-83. [CrossRef] [Google Scholar]

[21] Keller J M, Gray M R, Givens J A. A fuzzy k-nearest neighbor algorithm[J]. IEEE Transactions on Systems Man & Cybernetics,1985, SMC-15(4): 580-585. [CrossRef] [Google Scholar]

[22] Dong W S, Lei Z, Shi G M, et al. Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization[J]. IEEE Transactions on Image Processing, 2011, 20(7):1838-1857. [CrossRef] [MathSciNet] [PubMed] [Google Scholar]

[23] Buades A, Coll B, Morel J M. A non-local algorithm for image denoising[C]// 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05). Washington D C: IEEE, 2005, 2: 60-65. [Google Scholar]