Open Access
Wuhan Univ. J. Nat. Sci.
Volume 30, Number 2, April 2025
Page(s) 169 - 183
DOI https://doi.org/10.1051/wujns/2025302169
Published online 16 May 2025

© Wuhan University 2025

Licence: Creative Commons. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

0 Introduction

Subgroups are usually defined based on biomarkers such as blood sugar, blood pressure, gene or protein expression levels, etc[1]. When the efficacy of a new therapy is evaluated in clinical studies, differences in these characteristics may lead different patients to respond differently to the same therapy[2]. Therefore, when physicians assess therapeutic effect, they should consider not only the average effect across the overall population, but also the heterogeneous effects within subgroups. Many methods for subgroup identification have been proposed in the research literature, such as methods based on tree structures[3], global models[4], and clustering analysis[5]. In clinical practice, a straightforward and easily interpretable approach is often preferred: dividing patients into subgroups according to whether a single continuous biomarker exceeds a certain threshold. This is the threshold model we focus on.

However, in the threshold model described above, a single variable is used to classify subgroups, which may give a limited representation of subgroup characteristics and can be compromised by an inappropriate choice of threshold variable. Numerous studies suggest that subgroups may arise from the combined influence of multiple variables. For instance, Jiang et al[6] demonstrated that multiple single nucleotide polymorphisms (SNPs) can jointly alter the risk of developing specific diseases, whereas any single SNP alone lacks this capability; Fan et al[7] showed that a risk score defined as a function of multiple predictors is useful for identifying subgroups of AIDS patients who benefit more from treatment. Similarly, Vander Weele et al[8] discussed the importance of using multiple variables to identify optimal treatment subgroups. He et al[9] and Wei et al[10] extended this model to survival and longitudinal data, respectively. While they utilized penalization methods to identify significant biomarkers, they did not perform variable selection for the included covariates.

Moreover, the majority of current studies on threshold models focus on traditional mean regression. However, we sometimes place greater emphasis on other quantiles of the data; for example, we may prefer to conduct heterogeneity analysis on patients in worse condition and provide targeted medical solutions. Consequently, we consider introducing quantile regression to analyze the threshold model. Threshold quantile regression was initially introduced by Cai et al[11] and applied to quantile self-exciting threshold autoregressive time series models. Galvao et al[12] studied threshold quantile autoregressive processes. Furthermore, Lee et al[13] developed general tests for threshold effects in regression models, and Su and Xu[14] presented a systematic estimation and inference procedure for threshold quantile regression. On this basis, Zhang et al[15] proposed a more general single-index threshold quantile regression model. However, no existing literature addresses variable selection for covariates and biomarkers in single-index threshold quantile regression.

In practical applications, incorporating an excessive number of biomarkers for subgroup division may diminish the interpretability of the subgroups, and including an excessive number of covariates may also introduce multicollinearity. Consequently, we prefer to select the variables most pertinent to the response from the model's explanatory variables. Common variable selection methods include those based on penalties. This approach, initially proposed by Tibshirani in 1996[16], is known as Lasso estimation, which uses an L1 norm penalty. Subsequent work includes the Fused Lasso penalty[17], Adaptive Lasso[18], Graphical Lasso[19], and so on. Furthermore, because the Lasso shrinks all coefficients equally, the resulting estimator is often biased. To this end, Fan et al[20] introduced the smoothly clipped absolute deviation (SCAD) penalty and proved that the resulting estimator satisfies the oracle property.

Our paper extends the single-index threshold quantile regression[15] to longitudinal data and introduces the SCAD penalty for variable selection of covariates and biomarkers. To this end, we propose an efficient algorithm similar to that of Zhang et al[21], which transforms the estimation of the covariate regression coefficients and threshold parameters into a penalized linear quantile regression. Furthermore, through pseudo observations the problem reduces to an unpenalized linear quantile regression. In addition, the covariates and the biomarkers are penalized separately. We use a two-step grid search algorithm[22] to select the shrinkage parameters that minimize the Bayesian information criterion (BIC), and then substitute the selected shrinkage parameters into the iterative algorithm for parameter estimation. Finally, random simulations and a real data analysis illustrate the practicality of our method.

1 Models and Estimation

We assume that $y_{ij}$ is the response variable of the $i$-th individual at the $j$-th observation, $i=1,\ldots,n$, $j=1,\ldots,m$, with total number of observations $N=nm$; $z_{ij}=(z_{1,ij},\ldots,z_{p,ij})^{\mathrm T}$ is the $p$-dimensional covariate vector with $z_{1,ij}=1$; $\tilde z_{ij}$ is a subset of $z_{ij}$; and $x_i=(x_{i0},\ldots,x_{id})^{\mathrm T}$ is the baseline biomarker vector including an intercept term. We consider the following single-index threshold quantile regression with $\tau\in(0,1)$:

$$y_{ij}=z_{ij}^{\mathrm T}\beta(\tau)+\tilde z_{ij}^{\mathrm T}\delta(\tau)\, I(x_i^{\mathrm T}\gamma(\tau)>0)+\varepsilon_{ij},\qquad i=1,\ldots,n,\; j=1,\ldots,m.$$

The regression coefficients $\beta(\tau)$ characterize the baseline effect of $z_{ij}$ at the $\tau$-th quantile of $y_{ij}$; the regression coefficients $\delta(\tau)$ describe the heterogeneity of $\tilde z_{ij}$ within the subgroup; and the threshold coefficients $\gamma(\tau)$ determine the subgroup according to whether the linear combination $x_i^{\mathrm T}\gamma(\tau)$ exceeds 0. We consider time-invariant biomarkers so that time-varying biomarkers, influenced by treatment, do not split repeated observations of the same subject into different subgroups. In addition, the error vectors $(\varepsilon_{i1},\ldots,\varepsilon_{im})$ and $(\varepsilon_{k1},\ldots,\varepsilon_{km})$ are assumed to be independent for any $i\neq k$ in the index set $\{1,\ldots,n\}$, but the components of $(\varepsilon_{i1},\ldots,\varepsilon_{im})$ may be correlated for each fixed $i\in\{1,\ldots,n\}$. Assume $P(\varepsilon_{ij}<0\,|\,z_{ij},x_i)=\tau$.

To ensure identifiability of $\gamma(\tau)$, following Zhang et al[15] and Horowitz[23], we assume that at least one threshold variable has a nonzero coefficient and that, conditional on the other threshold variables, its distribution is absolutely continuous with respect to the Lebesgue measure. The variables in $x_i$ can be rearranged so that $x_{1i}$ meets this condition, and the model can then be rewritten as:

$$Q_{y_{ij}}(\tau\,|\,z_{ij},x_i)=z_{ij}^{\mathrm T}\beta(\tau)+\tilde z_{ij}^{\mathrm T}\delta(\tau)\, I(x_{1i}+x_{2i}^{\mathrm T}\psi(\tau)>0)$$

where $x_i=(x_{1i},x_{2i}^{\mathrm T})^{\mathrm T}$ and $\psi(\tau)=\gamma_{-1}(\tau)/\gamma_1(\tau)$, with $\gamma_1(\tau)$ and $\gamma_{-1}(\tau)$ denoting the coefficients of $x_{1i}$ and $x_{2i}$, respectively. If $\gamma_1(\tau)$ is positive, this normalization does not alter the coefficients $\beta(\tau)$ and $\delta(\tau)$; if $\gamma_1(\tau)$ is negative, $\beta(\tau)$ and $\delta(\tau)$ should be redefined to keep the model consistent[15]. For ease of notation, we drop $(\tau)$ from the parameters below.

In order to estimate all parameters $\theta=(\beta^{\mathrm T},\delta^{\mathrm T},\psi^{\mathrm T})^{\mathrm T}$, a computationally efficient approach is to ignore the possible correlation between repeated observations within the longitudinal data, that is, to minimize the following objective function under the working independence framework:

$$l_n(\theta)=\frac{1}{nm}\sum_{i=1}^{n}\sum_{j=1}^{m}\rho_\tau\left\{y_{ij}-z_{ij}^{\mathrm T}\beta-\tilde z_{ij}^{\mathrm T}\delta\, I(x_{1i}+x_{2i}^{\mathrm T}\psi>0)\right\}$$

where $\rho_\tau(u)=\{\tau-I(u<0)\}u$ is the quantile loss function. To address the discontinuity of the objective function, similar to Seo and Linton[24], we smooth the indicator function with an integrated kernel, for example the cumulative distribution function of the standard normal distribution $\Phi(\cdot)$, which yields the following smoothed objective function:

$$S_n(\theta;h_n)=\frac{1}{nm}\sum_{i=1}^{n}\sum_{j=1}^{m}\rho_\tau\left\{y_{ij}-z_{ij}^{\mathrm T}\beta-\tilde z_{ij}^{\mathrm T}\delta\,\Phi\!\left(\frac{x_{1i}+x_{2i}^{\mathrm T}\psi}{h_n}\right)\right\}$$

Following Zhang et al[15], this article selects the bandwidth $h_n=0.5\log(N)/N$.
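To make the smoothing step concrete, the following is a minimal sketch (not the authors' code) of $S_n(\theta;h_n)$, assuming the longitudinal data are stacked into numpy arrays of length $N=nm$, with the time-invariant threshold variables repeated across the $m$ visits.

```python
import numpy as np
from scipy.stats import norm

def quantile_loss(u, tau):
    """rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (u < 0))

def smoothed_objective(beta, delta, psi, y, Z, Zt, X1, X2, tau, h_n):
    """S_n(theta; h_n): the indicator I(x1 + x2' psi > 0) replaced by Phi((x1 + x2' psi) / h_n)."""
    subgroup = norm.cdf((X1 + X2 @ psi) / h_n)        # smoothed subgroup membership, length N
    resid = y - Z @ beta - (Zt @ delta) * subgroup    # residuals of the threshold model
    return quantile_loss(resid, tau).mean()           # (1/nm) * sum of check losses

# Bandwidth used in the paper: h_n = 0.5 * log(N) / N with N = n * m.
```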

Furthermore, we wish to select the important covariates in $z_{ij}$ and the important threshold variables in $x_{2i}$. In practice, only a few exposure variables of interest, such as treatment indicators or drug doses, are included in $\tilde z_{ij}$, so variable selection for the interaction terms is not considered. Consequently, the objective function with the corresponding penalty terms is formulated as:

$$Q_n(\theta;h_n)=S_n(\theta;h_n)+\sum_{l=1}^{p}p_{\lambda_1}(|\beta_l|)+\sum_{l=1}^{d}p_{\lambda_2}(|\psi_l|)$$

where $p_\lambda(\cdot)$ is the penalty function, and $\lambda_1$ and $\lambda_2$ are the shrinkage tuning parameters controlling the partially linear regression parameters and the threshold parameters, respectively. We consider the following non-convex SCAD penalty function:

$$p_\lambda(|\alpha|)=\lambda|\alpha|\, I(0<|\alpha|\le\lambda)+\frac{(a^2-1)\lambda^2-(|\alpha|-a\lambda)^2}{2(a-1)}\, I(\lambda<|\alpha|\le a\lambda)+\frac{(a+1)\lambda^2}{2}\, I(|\alpha|>a\lambda),$$

where $\lambda>0$ and $a=3.7$, the value at which the SCAD penalty approximately minimizes the Bayes risk[20]; therefore, the only nuisance parameters involved in the model are $\lambda_1>0$ and $\lambda_2>0$.
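For reference, a hedged sketch of the SCAD penalty and its derivative follows (the derivative is the weight used by the local linear approximation below); the function names are illustrative only.

```python
import numpy as np

def scad_penalty(t, lam, a=3.7):
    """SCAD penalty p_lambda(|alpha|) evaluated at t = |alpha| >= 0 (Fan and Li, 2001)."""
    t = np.abs(t)
    small = lam * t
    middle = ((a ** 2 - 1) * lam ** 2 - (t - a * lam) ** 2) / (2 * (a - 1))
    large = (a + 1) * lam ** 2 / 2
    return np.where(t <= lam, small, np.where(t <= a * lam, middle, large))

def scad_derivative(t, lam, a=3.7):
    """p'_lambda(t) for t >= 0; used below as the LLA weight on |beta_l| and |psi_l|."""
    t = np.abs(t)
    return np.where(t <= lam, lam, np.maximum(a * lam - t, 0.0) / (a - 1))
```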

Since the objective function $Q_n(\theta;h_n)$ is non-convex and not everywhere differentiable, we use the local linear approximation (LLA) algorithm of Zou and Li[25] to replace the non-convex SCAD penalty with a convex approximation. Specifically,

$$p_{\lambda_1}(|\beta_l|)\approx p_{\lambda_1}(|\hat\beta_l^{(0)}|)+p'_{\lambda_1}(|\hat\beta_l^{(0)}|)\left(|\beta_l|-|\hat\beta_l^{(0)}|\right),\qquad \text{for } \beta_l\approx\hat\beta_l^{(0)}$$

$$p_{\lambda_2}(|\psi_l|)\approx p_{\lambda_2}(|\hat\psi_l^{(0)}|)+p'_{\lambda_2}(|\hat\psi_l^{(0)}|)\left(|\psi_l|-|\hat\psi_l^{(0)}|\right),\qquad \text{for } \psi_l\approx\hat\psi_l^{(0)}$$

Moreover, even if we adopt a profile estimation similar to Zhang et al[15], estimating the slopes $(\beta^{\mathrm T},\delta^{\mathrm T})^{\mathrm T}$ and the threshold parameters $\psi$ separately, the objective function remains non-convex in $\psi$ when $(\beta^{\mathrm T},\delta^{\mathrm T})^{\mathrm T}$ is fixed. Owing to the slow computation of genetic algorithms and the sensitivity of the Nelder-Mead method to initial values, we employ a first-order Taylor expansion of $\Phi$,

$$\Phi\!\left(\frac{x_{1i}+x_{2i}^{\mathrm T}\psi}{h_n}\right)\approx\Phi\!\left(\frac{x_{1i}+x_{2i}^{\mathrm T}\hat\psi^{(0)}}{h_n}\right)+\Phi'\!\left(\frac{x_{1i}+x_{2i}^{\mathrm T}\hat\psi^{(0)}}{h_n}\right)\frac{x_{2i}^{\mathrm T}}{h_n}\left(\psi-\hat\psi^{(0)}\right),$$

where $\Phi'(\cdot)$ denotes the probability density function of the standard normal distribution. Accordingly, with initial estimates $\hat\delta$ and $(\hat\beta,\hat\psi)$, we can derive a new objective function $\tilde Q(\beta,\psi)$ with respect to $\beta$ and $\psi$:

$$\tilde Q(\beta,\psi)=\sum_{i=1}^{n}\sum_{j=1}^{m}\rho_\tau\left(\tilde y_{ij}-z_{ij}^{\mathrm T}\beta-\tilde x_i^{\mathrm T}\psi\right)+N\sum_{l=1}^{p}p'_{\lambda_1}(|\hat\beta_l|)|\beta_l|+N\sum_{l=1}^{d}p'_{\lambda_2}(|\hat\psi_l|)|\psi_l|$$

where

$$\tilde y_{ij}=y_{ij}-\tilde z_{ij}^{\mathrm T}\hat\delta\,\Phi\!\left(\frac{x_{1i}+x_{2i}^{\mathrm T}\hat\psi}{h_n}\right)+\tilde z_{ij}^{\mathrm T}\hat\delta\,\Phi'\!\left(\frac{x_{1i}+x_{2i}^{\mathrm T}\hat\psi}{h_n}\right)\frac{x_{2i}^{\mathrm T}\hat\psi}{h_n},$$

$$\tilde x_i=\tilde z_{ij}^{\mathrm T}\hat\delta\,\Phi'\!\left(\frac{x_{1i}+x_{2i}^{\mathrm T}\hat\psi}{h_n}\right)\frac{x_{2i}}{h_n}.$$

Now the estimation of $\beta$ and $\psi$ is essentially a penalized linear quantile regression problem; the specific iterative algorithm is as follows:

Step 0   Initialize $(\hat\beta^{(0)},\hat\psi^{(0)})$.

Step 1   Given $(\hat\beta^{(k-1)},\hat\psi^{(k-1)})$, we have:

$$\hat\delta^{(k)}=\arg\min_{\delta}\sum_{i=1}^{n}\sum_{j=1}^{m}\rho_\tau\left\{y_{ij}-z_{ij}^{\mathrm T}\hat\beta^{(k-1)}-\tilde z_{ij}^{\mathrm T}\delta\,\Phi\!\left(\frac{x_{1i}+x_{2i}^{\mathrm T}\hat\psi^{(k-1)}}{h_n}\right)\right\}$$

Step 2   Given $\hat\delta^{(k)}$, the estimates $(\hat\beta^{(k)},\hat\psi^{(k)})$ are derived by minimizing the following objective function:

$$\tilde Q(\beta,\psi)=\sum_{i=1}^{n}\sum_{j=1}^{m}\rho_\tau\left(\tilde y_{ij}-z_{ij}^{\mathrm T}\beta-\tilde x_i^{\mathrm T}\psi\right)+N\sum_{l=1}^{p}p'_{\lambda_1}(|\hat\beta_l^{(k-1)}|)|\beta_l|+N\sum_{l=1}^{d}p'_{\lambda_2}(|\hat\psi_l^{(k-1)}|)|\psi_l|,$$

where

$$\tilde y_{ij}=y_{ij}-\tilde z_{ij}^{\mathrm T}\hat\delta^{(k)}\Phi\!\left(\frac{x_{1i}+x_{2i}^{\mathrm T}\hat\psi^{(k-1)}}{h_n}\right)+\tilde z_{ij}^{\mathrm T}\hat\delta^{(k)}\Phi'\!\left(\frac{x_{1i}+x_{2i}^{\mathrm T}\hat\psi^{(k-1)}}{h_n}\right)\frac{x_{2i}^{\mathrm T}\hat\psi^{(k-1)}}{h_n},$$

and $\tilde x_i=\tilde z_{ij}^{\mathrm T}\hat\delta^{(k)}\Phi'\!\left(\dfrac{x_{1i}+x_{2i}^{\mathrm T}\hat\psi^{(k-1)}}{h_n}\right)\dfrac{x_{2i}}{h_n}$. Repeat Step 1 and Step 2 until the $L_2$ norm of the difference between successive estimates of all parameters falls below the tolerance.
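The two steps can be sketched as follows, assuming the same stacked numpy arrays as before and using statsmodels' QuantReg as a generic linear quantile regression solver; this is an illustrative implementation, not the authors' code. Step 2 itself is solved through the pseudo-observation device described next.

```python
import numpy as np
from scipy.stats import norm
from statsmodels.regression.quantile_regression import QuantReg

def step1_delta(y, Z, Zt, X1, X2, beta, psi, tau, h_n):
    """Step 1: given (beta, psi), regress y - Z beta on Zt scaled by Phi((x1 + x2' psi) / h_n)."""
    w = norm.cdf((X1 + X2 @ psi) / h_n)
    return QuantReg(y - Z @ beta, Zt * w[:, None]).fit(q=tau).params

def pseudo_responses(y, Zt, X1, X2, delta, psi, h_n):
    """Form the Step 2 pseudo responses y~ and pseudo design x~ from the Taylor expansion of Phi."""
    u = (X1 + X2 @ psi) / h_n
    s = Zt @ delta                                    # z~' delta, one value per observation
    y_tilde = y - s * norm.cdf(u) + s * norm.pdf(u) * (X2 @ psi) / h_n
    X_tilde = (s * norm.pdf(u) / h_n)[:, None] * X2
    return y_tilde, X_tilde
```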

Finally, using the facts $\rho_\tau(cv)=c\rho_\tau(v)$ for $c>0$ and $|\beta_l|=\rho_\tau(\beta_l)+\rho_\tau(-\beta_l)$, and setting $c_l=Np'_{\lambda_1}(|\hat\beta_l^{(k-1)}|)$ and $d_l=Np'_{\lambda_2}(|\hat\psi_l^{(k-1)}|)$, we can rewrite the first and second penalty terms as follows:

$$\sum_{l=1}^{p}\left\{\rho_\tau(c_l\beta_l)+\rho_\tau(-c_l\beta_l)\right\},\qquad \sum_{l=1}^{d}\left\{\rho_\tau(d_l\psi_l)+\rho_\tau(-d_l\psi_l)\right\}.$$

Furthermore, we construct an "unpenalized" linear quantile regression by incorporating additional pseudo observations. The augmented variables are defined as follows:

$$\tilde y_l^{+}=\begin{cases}\tilde y_l, & l=1,\ldots,N\\ 0, & l=N+1,\ldots,N+2p+2d\end{cases}$$

$$\tilde z_l^{+}=\begin{cases} z_l, & l=1,\ldots,N,\\ (0,\ldots,0,c_l,0,\ldots,0), & l=N+1,\ldots,N+p,\\ (0,\ldots,0,-c_l,0,\ldots,0), & l=N+p+1,\ldots,N+2p,\\ (0,\ldots,0), & l=N+2p+1,\ldots,N+2p+2d,\end{cases}$$

$$\tilde x_l^{+}=\begin{cases} x_l, & l=1,\ldots,N,\\ (0,\ldots,0), & l=N+1,\ldots,N+2p,\\ (0,\ldots,0,d_l,0,\ldots,0), & l=N+2p+1,\ldots,N+2p+d,\\ (0,\ldots,0,-d_l,0,\ldots,0), & l=N+2p+d+1,\ldots,N+2p+2d,\end{cases}$$

then the objective function in Step 2 degenerates into the linear quantile regression
$$\sum_{l=1}^{N+2p+2d}\rho_\tau\left(\tilde y_l^{+}-\tilde z_l^{+\mathrm T}\beta-\tilde x_l^{+\mathrm T}\psi\right),$$
which can then be solved directly.
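A minimal sketch of this pseudo-observation device, under the same stacked-array assumptions: the SCAD-weighted $\ell_1$ terms become extra rows with zero responses, so Step 2 reduces to a single ordinary (unpenalized) linear quantile regression.

```python
import numpy as np
from statsmodels.regression.quantile_regression import QuantReg

def penalized_qr_via_augmentation(y_tilde, Z, X_tilde, c, d, tau):
    """Step 2: c_l = N p'_{lambda1}(|beta_l|) and d_l = N p'_{lambda2}(|psi_l|) from the previous iterate."""
    N, p = Z.shape
    q_dim = X_tilde.shape[1]
    # pseudo rows: (+/- c_l) in the beta block, (+/- d_l) in the psi block, responses all zero
    Z_aug = np.vstack([Z, np.diag(c), -np.diag(c), np.zeros((2 * q_dim, p))])
    X_aug = np.vstack([X_tilde, np.zeros((2 * p, q_dim)), np.diag(d), -np.diag(d)])
    y_aug = np.concatenate([y_tilde, np.zeros(2 * p + 2 * q_dim)])
    fit = QuantReg(y_aug, np.hstack([Z_aug, X_aug])).fit(q=tau)
    return fit.params[:p], fit.params[p:]             # (beta_hat, psi_hat)
```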

In practice, BIC is frequently employed for tuning parameter selection in variable selection problems,

$$\mathrm{BIC}(\lambda_1,\lambda_2)=\log\left(\sum_{i=1}^{n}\sum_{j=1}^{m}\rho_\tau\left\{y_{ij}-z_{ij}^{\mathrm T}\hat\beta_{\lambda_1}-\tilde z_{ij}^{\mathrm T}\hat\delta\,\Phi\!\left(\frac{x_{1i}+x_{2i}^{\mathrm T}\hat\psi_{\lambda_2}}{h_n}\right)\right\}\right)+\frac{\log(N)}{2N}d_\lambda,$$

where $d_\lambda$ represents the total count of non-zero coefficients in both the slope and threshold components, and $N$ denotes the total number of observations. Following Ruppert and Carroll[22], we employ a two-step grid search algorithm to derive the optimal tuning parameters $(\lambda_1,\lambda_2)$.
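An illustrative sketch of the BIC criterion, assuming the fitted $(\hat\beta,\hat\delta,\hat\psi)$ come from the iterative algorithm above for a given $(\lambda_1,\lambda_2)$; the two-step grid search then scans a coarse grid of $(\lambda_1,\lambda_2)$ pairs and refines around the coarse BIC minimizer.

```python
import numpy as np
from scipy.stats import norm

def bic(y, Z, Zt, X1, X2, beta_hat, delta_hat, psi_hat, tau, h_n):
    """BIC(lambda1, lambda2) evaluated at the fit obtained with those tuning parameters."""
    u = norm.cdf((X1 + X2 @ psi_hat) / h_n)
    resid = y - Z @ beta_hat - (Zt @ delta_hat) * u
    check_loss = np.sum(resid * (tau - (resid < 0)))                    # summed quantile loss
    N = len(y)
    d_lambda = np.count_nonzero(beta_hat) + np.count_nonzero(psi_hat)   # nonzero slope + threshold coefs
    return np.log(check_loss) + np.log(N) / (2 * N) * d_lambda
```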

2 Simulation

In order to assess the performance of our proposed estimation in finite samples, we consider the following data generation model:

$$y_{ij}=z_{ij}^{\mathrm T}\beta_0+\tilde z_{ij}^{\mathrm T}\delta_0\, I\{x_{1i}+x_{2i}^{\mathrm T}\psi_0+0.1\Phi^{-1}(U_i)>0\}+(1+0.1z_{2,ij})\varepsilon_{ij},$$

where the covariates are $z_{ij}=(1,z_{1,ij},\ldots,z_{p-1,ij})^{\mathrm T}$ with $z_{1,ij}\sim \mathrm{Exp}(1)$, $z_{2,ij}\sim N(-1,1)$, and the remaining covariates $z_{3,ij},\ldots,z_{p-1,ij}$ independent and uniform $U(0,1)$; the interaction variables are $\tilde z_{ij}=(1,z_{1,ij})^{\mathrm T}$; the threshold variables are $x_{1i}\sim N(0,1)$ and $x_{2i}=(1,x_{2i}^{*},\ldots,x_{di}^{*})^{\mathrm T}$ with $x_{2i}^{*}\sim N(1,1)$ and the remaining components $x_{3i}^{*},\ldots,x_{di}^{*}$ independent and uniform $U(0,1)$ as well. We set $\beta_0=(1,1,1,\underbrace{0,\ldots,0}_{p-3})^{\mathrm T}$, $\delta_0=(1,1)^{\mathrm T}$, and $\psi_0=(-1,1,\underbrace{0,\ldots,0}_{d-2})^{\mathrm T}$. In our data generation model, however, the true value of $(\beta_{01},\beta_{03},\psi_{01})$ varies with the quantile. The random error term is $\varepsilon_{ij}=\xi+e_{ij}$, where $\xi$ ensures $P(\varepsilon_{ij}<0\,|\,z_{ij},x_i)=\tau$. We consider two error distributions:

Case 1   The random error vector follows a multivariate normal distribution, that is, $\varepsilon_i=(\varepsilon_{i1},\ldots,\varepsilon_{im})^{\mathrm T}\sim \mathrm{MVN}(0,\Sigma)$, and the covariance matrix $\Sigma$ has an AR(1) correlation structure with correlation coefficient $\rho=0.3$, where part of the model coefficients are set to

$$(\beta_{01},\beta_{03},\psi_{01})=\left(1+\Phi^{-1}(\tau),\; 1+0.1\Phi^{-1}(\tau),\; 0.1\Phi^{-1}(\tau)-1\right).$$

Case 2   The random error vector follows a multivariate $t$ distribution, that is, $\varepsilon_i=(\varepsilon_{i1},\ldots,\varepsilon_{im})^{\mathrm T}\sim \mathrm{MVT}_3(0,\Sigma)$, and the covariance matrix shares the same structure, where $(\beta_{01},\beta_{03},\psi_{01})=\left(1+q_t(\tau,\mathrm{df}=3),\; 1+0.1\,q_t(\tau,\mathrm{df}=3),\; 0.1\Phi^{-1}(\tau)-1\right)$ and $q_t(\tau,\mathrm{df}=3)$ is the $\tau$-quantile of the $t$ distribution with 3 degrees of freedom (df).
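The Case 1 data-generating process can be sketched as follows. This is our reading of the description above: in particular, we take $U_i$ to be uniform on $(0,1)$, $\mathrm{Exp}(1)$ to be the rate-1 exponential, and the shift $\xi=-\Phi^{-1}(\tau)$ to center the errors so that $P(\varepsilon_{ij}<0)=\tau$.

```python
import numpy as np
from scipy.stats import norm

def simulate_case1(n=400, m=4, p=10, d=10, tau=0.5, rho=0.3, seed=0):
    rng = np.random.default_rng(seed)
    # covariates z_ij = (1, z1, ..., z_{p-1}): z1 ~ Exp(1), z2 ~ N(-1, 1), rest ~ U(0, 1)
    Z = np.ones((n, m, p))
    Z[..., 1] = rng.exponential(1.0, (n, m))
    Z[..., 2] = rng.normal(-1.0, 1.0, (n, m))
    Z[..., 3:] = rng.uniform(0.0, 1.0, (n, m, p - 3))
    # threshold variables: x1 ~ N(0, 1); x2 = (1, x2*, U(0,1), ...), x2* ~ N(1, 1)
    x1 = rng.normal(0.0, 1.0, n)
    X2 = np.ones((n, d))
    X2[:, 1] = rng.normal(1.0, 1.0, n)
    X2[:, 2:] = rng.uniform(0.0, 1.0, (n, d - 2))
    # true parameters at quantile tau (Case 1)
    beta0 = np.zeros(p); beta0[:3] = [1 + norm.ppf(tau), 1.0, 1 + 0.1 * norm.ppf(tau)]
    delta0 = np.array([1.0, 1.0])
    psi0 = np.zeros(d); psi0[:2] = [0.1 * norm.ppf(tau) - 1, 1.0]
    # AR(1)-correlated normal errors, shifted so that P(eps_ij < 0 | z, x) = tau
    Sigma = rho ** np.abs(np.subtract.outer(np.arange(m), np.arange(m)))
    eps = rng.multivariate_normal(np.zeros(m), Sigma, n) - norm.ppf(tau)
    subgroup = (x1 + X2 @ psi0 + 0.1 * norm.ppf(rng.uniform(size=n)) > 0).astype(float)
    Zt = Z[..., :2]                                    # interaction covariates z~ = (1, z1)
    y = (np.einsum('nmp,p->nm', Z, beta0)
         + np.einsum('nmq,q->nm', Zt, delta0) * subgroup[:, None]
         + (1 + 0.1 * Z[..., 2]) * eps)
    return y, Z, Zt, x1, X2
```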

We assess the performance of our estimation across varying numbers of parameters ($p=d=10$ or $p=d=20$) and different quantiles ($\tau=0.25,0.5,0.75$); each replication generates $n=400$ or $800$ subjects, each observed $m=4$ times, and the simulation is repeated 500 times. We evaluate the model using four criteria. 1) True Negative (TN): the average number of zero parameters correctly estimated as zero; a higher TN indicates better performance. 2) False Negative (FN): the average number of non-zero parameters erroneously estimated as zero; a lower FN indicates better performance. 3) Correct (%): the percentage of replications in which the non-zero parameters of the true model are correctly identified; the closer this percentage is to 100, the more accurate the estimation. 4) MSE: the mean squared error of the parameter estimates; a smaller MSE means the estimate $\hat\theta$ is closer to the true value $\theta_0$ and the estimation is better. We also compare the mean squared error of the oracle estimator (O.MSE) with that of our method (P.MSE); the closer P.MSE is to O.MSE, the better the performance of our approach.
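As a rough illustration of these criteria (with "Correct" read as exact recovery of the zero/non-zero pattern, which is our interpretation, not a definition from the paper):

```python
import numpy as np

def selection_metrics(est, truth, tol=1e-8):
    """est: (R, q) estimated coefficients over R replications; truth: (q,) true coefficients."""
    zero = np.abs(truth) < tol
    est_zero = np.abs(est) < tol
    tn = est_zero[:, zero].sum(axis=1).mean()           # TN: average # of true zeros set to zero
    fn = est_zero[:, ~zero].sum(axis=1).mean()          # FN: average # of true nonzeros set to zero
    correct = 100.0 * np.mean(np.all(est_zero == zero, axis=1))   # exact support recovery rate
    mse = np.mean(np.sum((est - truth) ** 2, axis=1))   # MSE of the parameter estimates
    return {"TN": tn, "FN": fn, "Correct(%)": correct, "MSE": mse}
```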

Tables 1 and 2 present the variable selection results of the single-index threshold quantile regression when the error terms follow a multivariate normal distribution or a multivariate $t(3)$ distribution, respectively. In the ideal scenario, the TN component for the linear part is $p-3$ and for the threshold part is $d-2$, so the oracle TN is 15 when $p=d=10$ and 35 when $p=d=20$. In all settings, our method compresses the true zero coefficients to zero and correctly identifies all relevant covariates (FN=0), and P.MSE is very close to O.MSE, which indicates that the model estimation is very effective.

Tables 3 and 4 show the estimates of the identified nonzero parameters obtained by our method when the error terms follow a multivariate normal distribution or a multivariate $t(3)$ distribution, respectively. The biases of all nonzero parameters remain small across the various parameter counts ($p=d=10$ or $p=d=20$) and quantiles ($\tau=0.25,0.5,0.75$), and all metrics (biases, SDs, and RMSEs) decrease as the sample size increases.

We further conduct a sensitivity comparison for two bandwidth rates, $r=\log(N)/N$ and $r_{\mathrm{adj}}=1/N^{5}$, under $p=d=10$ and $\tau=0.5$, keeping the remaining settings the same as before. The RMSEs of the proposed estimators in Fig. 1 show that these two bandwidths do not exert significant effects on the performance of the proposed method.

Fig. 1   The sensitivity diagram of parameter estimation under the bandwidth rates $r$ and $r_{\mathrm{adj}}$ with $p=d=10$ and $\tau=0.5$

Table 1

Variable selection in the case where $\varepsilon_i\sim\mathrm{MVN}(0,\Sigma)$ and $\Sigma$ follows an AR(1) structure

Table 2

Variable selection in the case where $\varepsilon_i\sim\mathrm{MVT}_3(0,\Sigma)$ and $\Sigma$ follows an AR(1) structure

Table 3

Estimation results of parameters in the case where $\varepsilon_i\sim\mathrm{MVN}(0,\Sigma)$ and $\Sigma$ follows an AR(1) structure

Table 4

Estimation results of parameters in the case where $\varepsilon_i\sim\mathrm{MVT}_3(0,\Sigma)$ and $\Sigma$ follows an AR(1) structure

3 Real Data Analysis

We apply the proposed method to analyze the PA.3 trial dataset of the NCIC Clinical Trials Group. In this trial, patients with locally advanced or metastatic pancreatic cancer were randomly assigned to either erlotinib plus gemcitabine or gemcitabine monotherapy. In the primary analysis, survival in the erlotinib plus gemcitabine group was significantly improved compared with the gemcitabine monotherapy group[26]. Shultz et al[27] found that, when each biomarker was evaluated individually, none was significantly associated with erlotinib. However, He et al[9] selected CA19-9 and AXL from 47 biomarkers to define a treatment-sensitive subgroup and found that erlotinib plus gemcitabine prolonged patients' survival time compared with gemcitabine monotherapy. Similarly, Wei et al[10] selected CA19-9 and AXL from 6 biomarkers to define a treatment-sensitive subgroup and found that erlotinib plus gemcitabine improved patients' quality of life (QoL) scores compared with gemcitabine monotherapy. However, these studies focused only on selecting biomarkers for defining subgroups, not covariates. Moreover, none of them investigated whether distinct subgroup effects exist for patients at various QoL levels. Therefore, we use our method to select relevant covariates and biomarkers, and assess the differential therapeutic effect of erlotinib in subgroups defined by linear combinations of these biomarkers, as well as the differences in effects at different quantile levels of the outcome.

A total of 377 patients were included in the analysis, of whom 196 received erlotinib, and their quality of life was assessed using the global QoL score. For each patient in the trial, QoL scores were recorded at baseline and at 4, 8, 12, and 16 weeks after treatment. We focus on the change in a patient's QoL score relative to baseline (cQoL), and missing values were imputed using multiple imputation. The Q-Q plot of the imputed variable $y=\mathrm{cQoL}$ and the Shapiro-Wilk normality test indicate that the data do not follow a normal distribution. The calculated skewness is 0.139 3, indicating a right-skewed distribution, so quantile regression is considered more suitable than mean regression. The covariates included in the analysis are: baseline QoL score (bQoL), time point (Time), baseline age (Age), treatment indicator B (B=1 if the patient received erlotinib plus gemcitabine; B=0 if the patient received gemcitabine alone), and pain level (Painmeas). The twelve biomarkers included in the analysis are proteins whose baseline expression levels are associated with lower overall survival or with metastatic disease, specifically: AXL: AXL receptor tyrosine kinase; CA19-9: Carbohydrate antigen 19-9; IL8: Interleukin-8; CEA: Carcinoembryonic antigen; MUC-1: Mucin 1, cell surface associated; PDGFRA: Platelet-derived growth factor receptor alpha; EGFR: Epidermal growth factor receptor; PDK1: Pyruvate dehydrogenase kinase 1; BMP-2: Bone morphogenetic protein 2; HER2: Erb-b2 receptor tyrosine kinase 2; PF4: Platelet factor 4; GAS6: Growth arrest-specific protein 6. We standardize the 12 biomarkers and impute missing values using the median.

It is worth noting that our model requires specifying a significant biomarker $x_1$ and assumes its coefficient to be nonzero. This choice should primarily be guided by prior knowledge, such as expert opinion or existing literature. For the PA.3 dataset, existing studies have selected CA19-9 and AXL as significant biomarkers in mean regression[10] and Cox regression[9] analyses based on threshold models for $y$. Hence, we may consider using CA19-9 or AXL as the significant threshold variable for defining subgroups, with one designated as $x_1$ and the rest stored in $x_2$; we can also compare the differences in subgroup analysis when a different biomarker is designated as $x_1$. We construct the following single-index threshold quantile regression to fit the data:

$$\begin{aligned} \mathrm{cQoL}_{ij}= {} & \beta_1+\beta_2\,\mathrm{bQoL}_i+\beta_3\,\mathrm{Time}_{ij}+\beta_4\,\mathrm{Age}_i+\beta_5 B_i+\beta_6\,\mathrm{Painmeas}_i\\ & +\delta_1 I(x_{1i}+x_{2i}^{\mathrm T}\psi>0)+\delta_2 B_i\, I(x_{1i}+x_{2i}^{\mathrm T}\psi>0)+\varepsilon_{ij},\qquad i=1,\ldots,377,\; j=1,\ldots,4.\end{aligned}$$

We set $y=\mathrm{cQoL}$, $z=(1,\mathrm{bQoL},\mathrm{Time},\mathrm{Age},B,\mathrm{Painmeas})$, $\tilde z=(1,B)$, $x_1=$ CA19-9 or AXL, $x=(1,$ CA19-9, AXL, IL8, CEA, MUC-1, PDGFRA, EGFR, PDK1, IP-10, HER2, PF4, GAS6$)$, and $x_2$ denotes the part of $x$ remaining after removing $x_1$, assuming that the error term satisfies $P(\varepsilon_{ij}<0\,|\,z_{ij},x_i)=\tau$.
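As an illustration of how the design matrices for this model might be assembled, the sketch below uses a hypothetical long-format DataFrame; the DataFrame name "df" and its column names (e.g. "cQoL", "Painmeas", and the biomarker columns) are placeholders, not the trial's actual variable names.

```python
import numpy as np
import pandas as pd

def build_design(df: pd.DataFrame, x1_name: str = "AXL"):
    """Assemble y, z, z~, x1, x2 for the single-index threshold quantile regression above."""
    n_rows = len(df)
    y = df["cQoL"].to_numpy()
    Z = np.column_stack([np.ones(n_rows), df[["bQoL", "Time", "Age", "B", "Painmeas"]]])
    Zt = np.column_stack([np.ones(n_rows), df["B"]])           # interaction block (1, B)
    biomarkers = ["CA19_9", "AXL", "IL8", "CEA", "MUC1", "PDGFRA", "EGFR",
                  "PDK1", "IP10", "HER2", "PF4", "GAS6"]
    bm = (df[biomarkers] - df[biomarkers].mean()) / df[biomarkers].std()   # standardize
    X1 = bm[x1_name].to_numpy()                                 # fixed threshold variable
    rest = [b for b in biomarkers if b != x1_name]
    X2 = np.column_stack([np.ones(n_rows), bm[rest]])           # (1, remaining biomarkers)
    return y, Z, Zt, X1, X2
```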

To conduct subgroup analysis of the outcome at different levels as comprehensively as possible, we consider seven quantile levels $\tau=0.2,0.3,\ldots,0.8$, and the same two-step grid search method as in the simulation study is used to select the tuning parameters $\lambda_1$ and $\lambda_2$. Table 5 presents the point estimate, bootstrap standard error (BS-SE), 95% confidence interval (95%-CI), and p-value of each parameter at the above quantile levels with $x_1=\mathrm{AXL}$ fixed.

From Table 5, in terms of variable selection our method sets the coefficients of observation time (Time) and the treatment indicator (B) to 0 across all quantiles from 0.2 to 0.8; it also sets the coefficient of age (Age) to 0 at quantiles 0.4 to 0.8, while its effect at quantiles 0.2 to 0.3 is not statistically significant. Thus, we conclude that the effects of observation time (Time) and age (Age) on patients' QoL scores are not significant. However, the selection of the treatment effect (B) as an unrelated variable may be due to masking by the interaction between treatment and subgroup, rendering the main effect insignificant; this requires further confirmation in the stratified analysis. In addition, CA19-9 is selected as an important biomarker at all quantiles, and the coefficients of the remaining 10 biomarkers are compressed to 0. The biomarkers we select are consistent with the results of He et al[9] and Wei et al[10].

In terms of subgroup analysis, within the middle and lower quantiles (0.2-0.6), the p-value of the interaction coefficient $\delta_2$ between treatment and subgroup is less than 0.05, demonstrating a significant difference in therapeutic effect between the two subgroups at the 0.05 significance level. However, within the higher quantiles (0.7-0.8), the corresponding p-value exceeds 0.05, indicating that the difference in therapeutic effect between the two subgroups is not significant at the 0.05 level; a possible reason is that no treatment-sensitive subgroup exists at these levels, which requires further exploration through stratified analysis.

Next, we define the biomarker-positive and biomarker-negative subgroups according to the linear combination of the fixed biomarker AXL, the intercept, and the selected biomarker CA19-9, with the corresponding coefficients $(\psi_0,\psi_1)$ as shown in Table 5. Linear quantile regression is then conducted on all covariates separately for the samples in the two subgroups; the estimated therapeutic effects and their 95% confidence intervals at different quantiles are presented in Fig. 2, with the confidence intervals derived from rank tests. Figure 2 reveals significant heterogeneity in therapeutic effect between the two subgroups. Specifically, at all quantiles within the biomarker-positive subgroup, the estimated therapeutic effects are positive and the 95% confidence intervals mostly lie above the zero line, indicating that erlotinib has a positive impact on improving QoL scores. Therefore, we believe that with $x_1=\mathrm{AXL}$ fixed, the identified biomarker-positive subgroup is the treatment-sensitive subgroup, in which patients receiving erlotinib plus gemcitabine have significantly improved quality of life scores compared with gemcitabine alone. However, within the biomarker-negative subgroup, the 95% confidence intervals for the therapeutic effect contain zero, indicating that the results are not significant. Therefore, we believe that in this subgroup patients do not benefit more from erlotinib plus gemcitabine than from gemcitabine alone.

Fig. 2   The 95% confidence intervals of the therapeutic effect when fixing $x_1=\mathrm{AXL}$

Moreover, Table 6 presents the same four quantities at the corresponding quantile levels with $x_1=\text{CA19-9}$ fixed. As can be seen from Table 6, the main results of variable selection and subgroup analysis are similar to those in Table 5 with $x_1=\mathrm{AXL}$ fixed.

Similarly, we also define the biomarker-positive and biomarker-negative subgroups according to the linear combination with the fixed biomarker CA19-9. The corresponding estimated therapeutic effects and their 95% confidence intervals at different quantiles are presented in Fig. 3. Figure 3 shows that there is still significant heterogeneity in the therapeutic effect between the two subgroups. Unlike the case with $x_1=\mathrm{AXL}$ fixed, however, when $x_1=\text{CA19-9}$ is fixed the treatment-sensitive subgroup with a significant therapeutic effect is the identified biomarker-negative subgroup: patients in this subgroup who receive erlotinib plus gemcitabine significantly improve their overall quality of life scores, whereas within the biomarker-positive subgroup patients do not benefit more from the combination of erlotinib and gemcitabine. This indicates that the 0-1 values of the subgroup indicator are reversed in the two scenarios, that is, the interaction coefficient $\delta_2$ between treatment and subgroup is negative when $x_1=\text{CA19-9}$ and positive when $x_1=\mathrm{AXL}$. It also explains why the treatment coefficient $\beta_5$ is selected as a significantly related variable when $x_1=\text{CA19-9}$, while it is compressed to 0 when $x_1=\mathrm{AXL}$.

Fig. 3   The 95% confidence intervals of the therapeutic effect when fixing $x_1=\text{CA19-9}$

In summary, combining the results of the subgroup analyses with $x_1=\mathrm{AXL}$ and $x_1=\text{CA19-9}$, we can see that our method identifies treatment-sensitive subgroups. In these subgroups, patients receiving erlotinib plus gemcitabine benefit more in terms of improved QoL scores than those receiving gemcitabine alone, and erlotinib is effective for patients at different QoL levels.

Finally, when fixing $x_1=\mathrm{AXL}$ or $x_1=\text{CA19-9}$ separately, we also want to compare whether the defined subgroups are consistent. Figure 4 visualizes the overlap of the subgroups obtained in the two cases. When $x_1=\mathrm{AXL}$ is fixed, the treatment-sensitive (biomarker-positive) subgroup is defined as group 1 and the treatment-insensitive (biomarker-negative) subgroup as group 2; conversely, when $x_1=\text{CA19-9}$ is fixed, the treatment-sensitive (biomarker-negative) subgroup is defined as group 1 and the treatment-insensitive (biomarker-positive) subgroup as group 2. Figure 4 shows the intersections of group 1 and group 2 across the two scenarios: the orange bars labeled 11 indicate patients assigned to group 1 in both scenarios, the blue bars labeled 22 indicate patients assigned to group 2 in both scenarios, and the labels 12 and 21 represent patients assigned to different subgroups in the two scenarios, uniformly displayed with gray bars.

Fig. 4   Schematic diagram of subgroup overlap for the two different settings

To display clearly the proportion of patients assigned to different subgroups in the two scenarios, we set the maximum count of each subplot to 100 and compare the heights of the orange and gray bars. The overlap of the treatment-sensitive subgroups identified in the two scenarios is very high, indicating that our method of subgroup identification is robust to the choice of $x_1$.

Table 5

Estimation results of parameters when fixing $x_1=\mathrm{AXL}$

Table 6

Estimation results of parameters when fixing $x_1=\text{CA19-9}$

4 Conclusion

This article considers a single-index threshold quantile regression for subgroup analysis and selects both the covariates and the biomarkers that define the subgroups. An efficient algorithm is proposed that smooths the subgroup index function and locally linearly approximates the non-convex penalty. Based on pseudo observations, the estimation problem degenerates into a linear quantile regression, and all unknown parameters are solved iteratively through a two-step estimation process. Numerical simulations demonstrate that the proposed algorithm performs well with a moderate number of variables, and the analysis of real data from the PA.3 trial shows that it can distinguish between treatment-sensitive and treatment-insensitive patient subgroups.

However, our model considers only a single threshold, so patients can only be divided into two subgroups. Multiple thresholds could be introduced to divide subjects into multiple subgroups with different covariate effects; for related studies, please refer to Li et al[28-29]. In addition, the identifiability constraint in our model requires specifying a threshold variable $x_1$ with a non-zero coefficient. In practical applications, this variable can be determined from professional knowledge; when prior knowledge is not available, it may be chosen by extending the score-type specification test of Zhang et al[30] for selecting non-zero threshold variables. Finally, accounting for the within-subject correlation of longitudinal data and conducting variable selection and parameter estimation when $p>n$ are directions for future research.

References

1. Alosh M, Huque M F, Bretz F, et al. Tutorial on statistical considerations on subgroup analysis in confirmatory clinical trials[J]. Statistics in Medicine, 2017, 36(8): 1334-1360.
2. Sachdev J C, Sandoval A C, Jahanzeb M. Update on precision medicine in breast cancer[J]. Cancer Treatment and Research, 2019, 178: 45-80.
3. Su X, Tsai C L, Wang H, et al. Subgroup analysis via recursive partitioning[J]. Journal of Machine Learning Research, 2009, 10(2): 141-158.
4. Lipkovich I, Dmitrienko A, Denne J, et al. Subgroup identification based on differential effect search: A recursive partitioning method for establishing response to treatment in patient subpopulations[J]. Statistics in Medicine, 2011, 30(21): 2601-2621.
5. Zhang Z H, Seibold H, Vettore M V, et al. Subgroup identification in clinical trials: An overview of available methods and their implementations with R[J]. Annals of Translational Medicine, 2018, 6(7): 122.
6. Jiang Z Y, Du C G, Jablensky A, et al. Analysis of schizophrenia data using a nonlinear threshold index logistic model[J]. PLoS One, 2014, 9(10): e109454.
7. Fan A L, Song R, Lu W B. Change-plane analysis for subgroup detection and sample size calculation[J]. Journal of the American Statistical Association, 2017, 112(518): 769-778.
8. Vander Weele T J, Luedtke A R, Vander Laan M J, et al. Selecting optimal subgroups for treatment using many covariates[J]. Epidemiology, 2019, 30(3): 334-341.
9. He Y, Lin H Z, Tu D S. A single-index threshold Cox proportional hazard model for identifying a treatment-sensitive subset based on multiple biomarkers[J]. Statistics in Medicine, 2018, 37(23): 3267-3279.
10. Wei K C, Zhu H C, Qin G Y, et al. Multiply robust subgroup analysis based on a single-index threshold linear marginal model for longitudinal data with dropouts[J]. Statistics in Medicine, 2022, 41(15): 2822-2839.
11. Cai Y Z, Stander J. Quantile self-exciting threshold autoregressive time series models[J]. Journal of Time Series Analysis, 2008, 29(1): 186-202.
12. Galvao Jr A F, Montes-Rojas G, Olmo J. Threshold quantile autoregressive models[J]. Journal of Time Series Analysis, 2011, 32(3): 253-267.
13. Lee S, Seo M H, Shin Y. Testing for threshold effects in regression models[J]. Journal of the American Statistical Association, 2011, 106(493): 220-231.
14. Su L J, Xu P. Common threshold in quantile regressions with an application to pricing for reputation[J]. Econometric Reviews, 2019, 38(4): 417-450.
15. Zhang Y Y, Wang H J, Zhu Z Y. Single-index thresholding in quantile regression[J]. Journal of the American Statistical Association, 2022, 117(540): 2222-2237.
16. Tibshirani R. Regression shrinkage and selection via the lasso[J]. Journal of the Royal Statistical Society Series B: Statistical Methodology, 1996, 58(1): 267-288.
17. Tibshirani R, Saunders M, Rosset S, et al. Sparsity and smoothness via the fused lasso[J]. Journal of the Royal Statistical Society Series B: Statistical Methodology, 2005, 67(1): 91-108.
18. Zou H. The adaptive lasso and its oracle properties[J]. Journal of the American Statistical Association, 2006, 101(476): 1418-1429.
19. Friedman J, Hastie T, Tibshirani R. Sparse inverse covariance estimation with the graphical lasso[J]. Biostatistics, 2008, 9(3): 432-441.
20. Fan J Q, Li R Z. Variable selection via nonconcave penalized likelihood and its oracle properties[J]. Journal of the American Statistical Association, 2001, 96(456): 1348-1360.
21. Zhang Y, Lian H, Yu Y. Ultra-high dimensional single-index quantile regression[J]. Journal of Machine Learning Research, 2020, 21(224): 1-25.
22. Ruppert D, Carroll R J. Theory & methods: Spatially-adaptive penalties for spline fitting[J]. Australian & New Zealand Journal of Statistics, 2000, 42(2): 205-223.
23. Horowitz J L. A smoothed maximum score estimator for the binary response model[J]. Econometrica, 1992, 60(3): 505.
24. Seo M H, Linton O. A smoothed least squares estimator for threshold regression models[J]. Journal of Econometrics, 2007, 141(2): 704-735.
25. Zou H, Li R Z. One-step sparse estimates in nonconcave penalized likelihood models[J]. Annals of Statistics, 2008, 36(4): 1509-1533.
26. Moore M J, Goldstein D, Hamm J, et al. Erlotinib plus gemcitabine compared with gemcitabine alone in patients with advanced pancreatic cancer: A phase III trial of the National Cancer Institute of Canada Clinical Trials Group[J]. Journal of Clinical Oncology, 2007, 25(15): 1960-1966.
27. Shultz D B, Pai J, Chiu W, et al. A novel biomarker panel examining response to gemcitabine with or without erlotinib for pancreatic cancer therapy in NCIC clinical trials group PA.3[J]. PLoS One, 2016, 11(1): e0147995.
28. Li J L, Jin B S. Multi-threshold accelerated failure time model[J]. The Annals of Statistics, 2018, 46(6A): 2657-2682.
29. Li J L, Li Y G, Jin B S, et al. Multithreshold change plane model: Estimation theory and applications in subgroup identification[J]. Statistics in Medicine, 2021, 40(15): 3440-3459.
30. Zhang L W, Wang H J, Zhu Z Y. Testing for change points due to a covariate threshold in quantile regression[J]. Statistica Sinica, 2014, 24(4): 1859-1877.
