Issue 
Wuhan Univ. J. Nat. Sci.
Volume 27, Number 2, April 2022



Page(s)  161 - 168
DOI  https://doi.org/10.1051/wujns/2022272161  
Published online  20 May 2022 
Physics
CLC number: O572.21+3
Using Deep Learning Algorithms to Improve Energy Resolution in the Semileptonic Decays
School of Physics and Technology, Wuhan University, Wuhan 430072, Hubei, China
^{†} To whom correspondence should be addressed. hcai@whu.edu.cn; sunl@whu.edu.cn
Received: 10 February 2022
The neutrino closure method can be used to obtain the decay kinematics of semileptonic decays with one missing final-state particle (ν). Its solution gives the square of the invariant mass of the ℓν system (q^{2}) and the momentum (P) of the decaying parent particle. However, the resolution obtained when existing algorithms are used to resolve the two-solution ambiguity is limited. We propose a new method based on deep learning to improve the resolution of these two key physical quantities when processing Large Hadron Collider beauty (LHCb) experimental data. Compared with the linear regression algorithm, the resolution of q^{2} (P) is improved on average by 1.7% (8.2%) with the regression algorithm and by 2.7% (9.6%) with the classification algorithm. These resolution improvements will benefit studies of semileptonic decays in hadron collider experiments. Moreover, the new method can be applied to other decays with a missing particle in the final state.
Key words: Large Hadron Collider beauty (LHCb) experiment / semileptonic decay / improving resolution / deep neural network
Biography: WANG Yang, male, Master candidate, research direction: particle physics experiment. Email: wangya@whu.edu.cn
Foundation item: Supported by the National Natural Science Foundation of China (11735010, U1932108, U2032102, 12061131006)
© Wuhan University 2022
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
0 Introduction
The Large Hadron Collider beauty (LHCb) experiment^{[1]} has the advantages of high energy and large data samples, and it provides an excellent opportunity to study semileptonic decays^{[2]}. However, significant challenges remain. Determining the square of the invariant mass (q^{2}) of the ℓν system (the lepton and neutrino system) in a hadron collider experiment is an important and challenging task. The form factors of the decay are functions of q^{2}, so improving the resolution of q^{2} improves the precision of the form factors. The Cabibbo-Kobayashi-Maskawa (CKM) matrix elements can in turn be extracted more precisely^{[3-5]}.
The q^{2} can be determined from the momenta of the particles in the semileptonic decay. However, the neutrino escapes detection because it interacts only weakly, so the reconstructed momentum of the parent particle lacks the neutrino contribution. The neutrino closure method^{[6]}, which is based on topological information and a mass hypothesis, can be used to reconstruct the momentum of the decaying parent particle in semileptonic decays. It can be applied in studies of q^{2} to reduce the impact of the missing neutrino on the parent momentum. However, the method has a limitation: the underlying quadratic equation has two solutions. In the original approach one of the two solutions is chosen at random, and the resulting resolution of q^{2} is not satisfactory. Recently, a method^{[7]} based on a linear regression (LR) algorithm has been proposed to resolve the two-solution ambiguity. It is worthwhile to look for ways to further improve the resolution of q^{2} and of the momentum of the decaying parent particle.
This paper proposes a new approach based on deep neural networks to resolve the two-solution ambiguity in the neutrino closure method. The uncertainties of the form factors and CKM matrix elements in semileptonic decays are expected to be further reduced by this approach.
1 Simulation of Semileptonic Decay
In this paper, RapidSim^{[8]} is used to generate sample events of different decay processes in the LHCb environment^{[9]}, with the requested acceptance set to “AllDownStream”. Pythia 8.2^{[10]} is used to produce pp collision events at a center-of-mass energy of 13 TeV. The hadron decays are described by EvtGen^{[11]} and the final-state radiation is described by PHOTOS^{[12]}.
The semileptonic decay processes in the LHCb experiment selected for this study are ^{[13]}, (, )^{[14]}, ^{[15]} and . We define the momentum of a particle as P, the transverse momentum relative to the Z-axis as P_{T} = P·sinθ, and the component of the momentum along the Z-axis as P_{z}, where θ is the polar angle with respect to the Z-axis. The pseudorapidity of a particle is defined as η = −ln(tan(θ/2)). To make the generated samples meet the acceptance requirement of LHCb^{[16]}, pure sample events are selected using the pseudorapidity (η), transverse momentum (P_{T}) and momentum (P) of the particles for the different decay processes. The signal is required to satisfy 2<η<5 for and , and 1.7<η<5.2 for , to be within the LHCb detector^{[17]}. The distributions of η of , and are shown in Fig. 1. The selection conditions for the different decay processes^{[18,19]} are shown in Table 1. The distributions of P and P_{T} are shown in Fig. 2 and Fig. 3. Natural units with c (speed of light) = 1 are used in this paper. In the following, the decay processes are referred to by the bracketed numbers given in Table 1. “(1)” is used to represent decay process, “(2)” is used to represent decay process, “(3)” is used to represent and, “(4)” is used to represent decay process.
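The kinematic definitions and the acceptance requirement above can be illustrated with a short sketch. It assumes that θ is the polar angle measured from the Z-axis and uses the standard definition η = −ln(tan(θ/2)); the function names and sample values are hypothetical, and the default window 2 < η < 5 is the one quoted in the text.

```python
import math


def transverse_momentum(p, theta):
    """P_T = P * sin(theta), theta being the polar angle from the Z-axis."""
    return p * math.sin(theta)


def longitudinal_momentum(p, theta):
    """P_z = P * cos(theta), the momentum component along the Z-axis."""
    return p * math.cos(theta)


def pseudorapidity(theta):
    """eta = -ln(tan(theta / 2)); valid for 0 < theta < pi."""
    return -math.log(math.tan(theta / 2.0))


def in_acceptance(theta, eta_min=2.0, eta_max=5.0):
    """True if the particle direction falls inside the eta window."""
    return eta_min < pseudorapidity(theta) < eta_max


# Hypothetical forward-going particle: P = 50 (c = 1), theta = 0.1 rad.
p, theta = 50.0, 0.1
print(transverse_momentum(p, theta), pseudorapidity(theta), in_acceptance(theta))
```

A particle emitted at θ = π/2 has η close to zero and fails the cut, which is why only forward-going particles survive the selection.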
Fig. 1 The distribution of pseudorapidity (η) of decayed particles (a) , (b) , (c) and (d) 
Fig. 2 The distribution of momentum of decayed particles for total decay processes in sample events (a) , (b) , (c) and (d) ; The natural unit c (light speed)=1 
Fig. 3 The distribution of transverse momentum of decayed particles for total decay processes in sample events (a) , (b) , (c) and (d) ; The natural unit c (light speed)=1 
Table 1 Selection conditions for all decay processes
2 Variables Used to Correct Momentum of Particles
Before q^{2} can be reconstructed, the momentum of the parent particle must be reconstructed. Because of momentum conservation, the momenta of the final-state particles produced in the decay are directly related to the momentum of the parent particle. The three-momentum and mass information of the final-state particles are also important for reconstructing the four-dimensional Lorentz vector of the parent particle in every decay process. The simulated three-momenta (P_{x}, P_{y}, P_{z}) of the final-state particles are therefore chosen as part of the input features.
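To illustrate why the parent momentum is needed first, the sketch below computes q^{2} as the squared invariant mass of the difference between the parent four-vector and the four-vector of the hadronic part of the final state. This is the standard kinematic definition rather than code from this work; the function names, the four-vector layout (E, P_x, P_y, P_z) and the numerical values are hypothetical, and c = 1.

```python
import math


def four_vector(mass, px, py, pz):
    """Build (E, px, py, pz) for a particle of given mass and three-momentum."""
    energy = math.sqrt(mass ** 2 + px ** 2 + py ** 2 + pz ** 2)
    return (energy, px, py, pz)


def invariant_mass_squared(vector):
    """E^2 - |p|^2 for a four-vector (E, px, py, pz)."""
    energy, px, py, pz = vector
    return energy ** 2 - (px ** 2 + py ** 2 + pz ** 2)


def q_squared(parent, hadron):
    """q^2 = (p_parent - p_hadron)^2, the squared mass of the lepton-neutrino system."""
    difference = tuple(a - b for a, b in zip(parent, hadron))
    return invariant_mass_squared(difference)


# Hypothetical example: parent of mass 5 at rest, hadron of mass 1 at rest.
print(q_squared(four_vector(5.0, 0.0, 0.0, 0.0), four_vector(1.0, 0.0, 0.0, 0.0)))
```

Any error on the parent momentum propagates directly into the parent four-vector and therefore into q^{2}.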
As mentioned in the study of ^{[13]}, the corrected momentum of the can be predicted by linear regression (LR) using variables related to the flight vector. The flight vector is crucial because it is used to model the momentum resolution and to calculate the corrected mass. The other input features considered in this study therefore describe the line between the primary vertex and the decay vertex of the decaying particle. Because the model is nonlinear, we directly use the X, Y and Z coordinates of the primary vertex and the end vertex as part of the input features. In addition, the angle between P and P_{z} of the particles in the spherical coordinate system is included in order to reduce the smearing effect of the primary vertex and end vertex. The same strategy is used for the other three decay processes.
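The flight-related quantities described above can be sketched as follows. The example derives the unit flight vector from the primary vertex and the end vertex and the polar angle of a direction with respect to the Z-axis; the function names and vertex coordinates are hypothetical and serve only to make the geometry explicit.

```python
import math


def flight_direction(primary_vertex, end_vertex):
    """Unit vector pointing from the primary vertex to the end (decay) vertex."""
    delta = [e - p for p, e in zip(primary_vertex, end_vertex)]
    length = math.sqrt(sum(c * c for c in delta))
    if length == 0.0:
        raise ValueError("primary vertex and end vertex coincide")
    return tuple(c / length for c in delta)


def polar_angle(direction):
    """Angle between a unit direction and the Z-axis, in radians."""
    cos_theta = max(-1.0, min(1.0, direction[2]))
    return math.acos(cos_theta)


# Hypothetical vertices (same length unit for both).
direction = flight_direction((0.0, 0.0, 0.0), (0.0, 3.0, 4.0))
print(direction, polar_angle(direction))
```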
The impact parameters of final state particles are chosen to be the input variables because the impact parameter is critical to the multiplicity and average transverse momentum of the particles produced by the collision.
3 Introduction of Deep Neural Network Algorithm
3.1 Deep Neural Network Regression Algorithm
A fully connected regression deep neural network is chosen to predict the momentum. All the variables described in Section 2 are used as input features of the fully connected deep neural network regression model (DNNR). The DNNR consists of four layers with PReLU^{[20]} activation and is trained with the Adam^{[21]} optimizer in order to predict the momentum of the decaying parent particle more accurately. The model is built with the Keras^{[22]} and TensorFlow^{[23]} packages. The details of the regression algorithm are given in Algorithm 1.
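The structure of such a network can be made explicit with a minimal forward-pass sketch in plain Python. It is not the Keras/TensorFlow model used in this work: the layer widths, weights and the single fixed PReLU slope are hypothetical (in a trained PReLU layer the slope is a learned parameter), and training with Adam is not shown.

```python
def prelu(value, slope):
    """PReLU activation: identity for positive inputs, slope * value otherwise."""
    return value if value > 0 else slope * value


def dense(inputs, weights, biases):
    """Fully connected layer; weights[j][i] links input i to output j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def forward(features, layers, slope=0.25):
    """Propagate features through the layers; PReLU is applied between
    layers but not after the final (linear) regression output."""
    activations = list(features)
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:
            activations = [prelu(a, slope) for a in activations]
    return activations[0]


# Hypothetical tiny network: 2 inputs -> 2 hidden -> 1 hidden -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.0]),
    ([[2.0]], [1.0]),
]
print(forward([1.0, 3.0], layers))
```

The three weight matrices connect the four layers (input, two hidden, output); the final layer is left linear because the target, a momentum, is a continuous quantity.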
The testing samples are generated with the same options as the training samples. The training samples are used to train the machine learning model, and the testing samples are used to test the model and to perform the physics calculations. The Scikit-learn package^{[24]} is used to split the sample randomly. The uproot4 package, part of the HepML^{[25]} package, is used to import data into the machine learning models. The NumPy package^{[26]} is used to convert the data to arrays when validating the model, and the correlation coefficients are calculated with the pandas package^{[27]}. Training is performed on an NVIDIA GeForce RTX 2080 Ti for speed. Figure 4 shows a sketch of the data flow of the DNNR.
Fig. 4 Flow chart of the deep neural network regression and classification algorithms
The correlation coefficients (ρ) between P_{pred} (the momentum predicted by the DNNR (P_{DNN}) or by LR (P_{Linear})) and the true momentum (P_{true}) of the decaying particle for the different decay processes are shown in Table 2. The more information the model receives, the closer the distribution of P_{pred} is to that of P_{true}, so the correlation between P_{DNN} and P_{true} is higher than that between P_{Linear} and P_{true}. Although a nonlinear model can describe the target distribution better than a linear model, possible overfitting of the model also deserves attention.
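For reference, the correlation coefficient between predicted and true momenta can be computed as in the sketch below. It assumes that ρ is the Pearson correlation coefficient and is written in plain Python for clarity, whereas this work obtains the coefficients with the pandas package; the function name and sample values are hypothetical.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variance_x = sum((a - mean_x) ** 2 for a in x)
    variance_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(variance_x * variance_y)


# Hypothetical predicted and true momenta (c = 1).
p_pred = [95.0, 120.0, 150.0, 210.0]
p_true = [100.0, 118.0, 160.0, 200.0]
print(pearson_correlation(p_pred, p_true))
```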
According to Fig. 5, the distribution of (P_{pred} − P_{true})/P_{true} predicted by the DNNR is narrower than that predicted by LR for the different decay processes. From Table 3, the RMS obtained with the DNNR is 35.26% better than that obtained with LR for , 36.12% better for , 39.23% better for and 33.54% better for . The distribution of the momentum predicted by the DNNR model is therefore closer to the distribution of the true momentum than that predicted by LR. This also means that the DNNR performs better than LR in selecting the solution closer to the true value in the neutrino closure method.
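The figure of merit used here can be sketched as follows. Two assumptions are made that the text does not spell out: the RMS is taken literally as the root mean square of the relative residuals (P_{pred} − P_{true})/P_{true}, and "better by x%" is read as the relative reduction of the RMS with respect to LR. Function names and numbers are hypothetical.

```python
import math


def relative_residuals(predicted, true):
    """(P_pred - P_true) / P_true for each event."""
    return [(p - t) / t for p, t in zip(predicted, true)]


def rms(values):
    """Root mean square of a sequence of values."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def relative_improvement(rms_baseline, rms_new):
    """Fractional reduction of the RMS with respect to the baseline."""
    return (rms_baseline - rms_new) / rms_baseline


# Hypothetical example with two events.
residuals = relative_residuals([110.0, 180.0], [100.0, 200.0])
print(rms(residuals), relative_improvement(0.2, 0.1))
```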
Fig. 5 Comparison of the distribution (P_{pred} − P_{true})/P_{true} for LR and DNNR (a) , (b) , (c) and (d) 
Algorithm 1 Deep neural network regression algorithm used for different decay processes.  

Input: the original feature variables dataset , size is, is the number of input variables, is the number of events  
Output: the regression results (predicted momentum) for testing dataset .  
1. for each group input dataset, construct it by the value of input variables for one event, exist groups  
2. for each group target dataset, construct by true momentum of decayed parent particle,  
3. for each I _{ i } in I ^{0}  
(a) input the model of 4 layer fully connected neural regression network, and PReLU activation is used in the middle of different layers;  
(b) train the model by Adam optimization;  
(c) obtain the regression result of training dataset and the weight parameters of the model.  
4. calculate the RMS between and for deciding the performance of the model;  
5. for each in test dataset , input it to the trained model;  
6. return regression result (predicted momentum) of I ^{ t }, y ^{ t }. 
Table 2 The correlation coefficient (ρ) of P_{pred} (P_{DNN}, P_{Linear}) vs. P_{true} for all decay processes
Table 3 RMS of (P_{pred} − P_{true})/P_{true} for all decay processes (unit: %)
3.2 Deep Neural Network Classification Algorithm
In addition to the deep neural network regression algorithm for momentum correction, we propose another method based on a fully connected deep neural network classification algorithm (DNNC), which treats the two-solution ambiguity of the neutrino closure method as a mathematical classification problem. The two solutions are regarded as two classes, and each solution is assigned a label, “0” or “1”. The classification model selects the better solution by predicting this label. The model is built with the MLPClassifier of the Scikit-learn package running on a CPU, and its structure is the same as that of the DNNR, consisting of one input layer, two hidden layers, and one output layer. The activation function is the default option, ReLU^{[28]}, and the optimizer is Adam, as for the DNNR. The details of the algorithm are given in Algorithm 2.
Compared with the regression algorithm, the classification algorithm skips the intermediate step of predicting the momentum and gives the final corrected momentum directly. The regression algorithm has two sources of error, the momentum prediction and the subsequent selection, whereas the classification algorithm has only one because the process is more direct. The structure of the DNNC model is shown in Fig. 4.
Algorithm 2 Deep neural network classification algorithm used for different decay processes.  

Input: the original feature variables dataset , size is , is the number of input variables, is the number of events  
Output: the classification results for testing dataset R ^{ t }.  
1. for each group input dataset , construct it by the value of input variables for one event, exist groups.  
2. for each group target dataset, use the true momentum to select the better of the two solutions for every event, the two solutions being ; if the better result is the “+” (“−”) solution, the label is “1” (“0”); the target dataset is constructed from the labels of all events  
3. for each in  
(a) input the model of 4 layer fully connected neural network, ReLU activation is used in the middle of different layers;  
(b) train the model by Adam optimization;  
(c) obtain the classification result (predicted label) of training dataset and the weight parameters of the model;  
4. validate the accuracy between and for deciding the robustness of the model;  
5. for each (each group) in test dataset , input it to the trained model;  
6. return classification result of , . 
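Step 2 of Algorithm 2 and the accuracy check of step 4 can be sketched as below. The labeling rule follows the text ("1" when the "+" solution is closer to the true momentum, "0" otherwise); the function names, the tie-breaking choice in favor of "+", and the sample values are hypothetical. The classifier itself (the MLPClassifier of Scikit-learn in this work) is not shown.

```python
def solution_label(p_plus, p_minus, p_true):
    """Return 1 if the '+' solution is closer to the true momentum, else 0.
    Ties are assigned to '+' (an arbitrary choice made for this sketch)."""
    return 1 if abs(p_plus - p_true) <= abs(p_minus - p_true) else 0


def build_labels(events):
    """events: iterable of (p_plus, p_minus, p_true); returns the target labels."""
    return [solution_label(p_plus, p_minus, p_true)
            for p_plus, p_minus, p_true in events]


def accuracy(predicted_labels, target_labels):
    """Fraction of events whose predicted label equals the target label."""
    matches = sum(1 for p, t in zip(predicted_labels, target_labels) if p == t)
    return matches / len(target_labels)


# Hypothetical events: (P_+, P_-, P_true), with c = 1.
events = [(100.0, 60.0, 95.0), (150.0, 80.0, 85.0)]
print(build_labels(events))
```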
4 Physics Application to the Study of Semileptonic Decay Processes
The purpose of using the DNNR or DNNC to predict the momentum of the decaying particle is to select the better of the two solutions of the neutrino closure method^{[6]} for the different decay processes. The summed momentum vector of the system reconstructed by the detector is defined as the visible momentum, which is denoted . The momentum of the invisible neutrino from a semileptonic decay is denoted as . With respect to the flight direction of the parent particle, the visible momentum is decomposed into two parts, the perpendicular component () and the parallel component ()^{[7]}:(1) (2)
In the specified coordinate system, the perpendicular component of the missing momentum is required to be equal to the perpendicular component of the visible momentum, =. Assuming that the mass of the decaying parent particle is m and that there is a single massless unreconstructed particle (the neutrino) in the final state, can be obtained from the solutions of a quadratic equation. The α and β of the quadratic equation are defined as:(3) (4) is the visible energy and M_{vis} is the visible mass. is defined as:(5) This yields two solutions for the momentum in the neutrino closure method:(6) (7) The distribution of (P_{−} − P_{true})/P_{true} vs. (P_{+} − P_{true})/P_{true} for the different decay processes is shown in Fig. 6. A horizontal band can be seen clearly for P_{+}, and likewise a band for P_{−} in the vertical direction. Although the effect of the vertex smearing is visible, the two bands are nevertheless well separated, and the difference between P_{+} and P_{−} is apparent.
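The two-fold ambiguity can be reproduced with a short sketch derived from energy and momentum conservation alone. Writing x for the neutrino momentum component along the flight direction, the condition sqrt(m^2 + (P_par + x)^2) = E_vis + sqrt(P_perp^2 + x^2) leads to x = [A·P_par ± E_vis·sqrt(A^2 − P_perp^2·D)]/D with A = (m^2 − M_vis^2)/2 − P_perp^2 and D = M_vis^2 + P_perp^2. The symbols A, D and x and the function name are introduced here for illustration and need not coincide with the α and β of the text; the numerical values are hypothetical and c = 1.

```python
import math


def closure_solutions(p_perp, p_par, m_vis, m_parent):
    """Return (P_plus, P_minus), the two candidate parent momenta along the
    flight direction, or None when the discriminant is negative (which can
    happen once vertex and momentum smearing are included)."""
    a = (m_parent ** 2 - m_vis ** 2) / 2.0 - p_perp ** 2
    d = m_vis ** 2 + p_perp ** 2
    e_vis = math.sqrt(d + p_par ** 2)
    discriminant = a ** 2 - p_perp ** 2 * d
    if discriminant < 0.0:
        return None
    root = e_vis * math.sqrt(discriminant)
    x_plus = (a * p_par + root) / d
    x_minus = (a * p_par - root) / d
    # Parent momentum = visible parallel component + neutrino parallel component.
    return p_par + x_plus, p_par + x_minus


# Hypothetical decay: parent mass 5, visible mass 3, true parent momentum 3.75.
print(closure_solutions(0.96, 4.15, 3.0, 5.0))
```

In this constructed example the "−" solution reproduces the true parent momentum of 3.75 while the "+" solution is far larger; in other events the roles are exchanged, which is why a selection strategy is required.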
Fig. 6 The distribution of (P_{−} − P_{true})/P_{true} vs. (P_{+} − P_{true})/P_{true} (a) , (b) , (c) and (d) 
More information is used in the DNNR and DNNC than in LR, because LR is restricted by the physical formula. The DNNR and DNNC therefore perform better in selecting the better solution of the neutrino closure method for the different decay processes. The solution selected using the true momentum of the decaying particle represents the upper threshold. The closer the result obtained with a machine learning model is to the upper threshold, the better the performance of the model.
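The selection step for a regression-type method can be sketched as follows: the solution closer to the predicted momentum is kept, and the same rule applied with the true momentum gives the upper threshold. This is an illustrative reading of the procedure, not code from this work; the function names and sample values are hypothetical.

```python
def select_solution(p_plus, p_minus, p_reference):
    """Keep the closure solution that lies closer to the reference momentum."""
    if abs(p_plus - p_reference) <= abs(p_minus - p_reference):
        return p_plus
    return p_minus


def agreement_with_upper_threshold(events):
    """events: iterable of (p_plus, p_minus, p_pred, p_true).
    Returns the fraction of events in which the choice based on the
    predicted momentum equals the choice based on the true momentum."""
    events = list(events)
    matches = 0
    for p_plus, p_minus, p_pred, p_true in events:
        chosen = select_solution(p_plus, p_minus, p_pred)
        best = select_solution(p_plus, p_minus, p_true)
        if chosen == best:
            matches += 1
    return matches / len(events)


# Hypothetical events: (P_+, P_-, P_pred, P_true), with c = 1.
events = [(100.0, 60.0, 90.0, 95.0), (150.0, 80.0, 140.0, 85.0)]
print(agreement_with_upper_threshold(events))
```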
We can now compare the performance of the new approach on q^{2} and the momentum for the different decay processes. According to Fig. 7, the distributions obtained with the DNNR and DNNC methods are closer to the upper threshold than those obtained with LR for all decay processes, meaning that the methods based on deep neural networks perform better in reconstructing the squared invariant mass q^{2} of the ℓν system for the semileptonic decay processes used in this paper. The resolution improvements of the momentum of the decaying parent particle for the different decay processes are shown in Table 4. According to Table 5, relative to LR the resolution of q^{2} improves by 2.04% with the DNNR and 2.25% with the DNNC for , by 2.53% and 3.84% for , by 0.83% and 2.67% for , and by 1.53% and 2.02% for . The resolution improvement of the momentum and of q^{2} is evident, which means that the deep neural network approach, with either the regression or the classification algorithm, resolves the two-solution ambiguity of the neutrino closure method better than the linear regression algorithm.
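As a cross-check, the average q^{2} improvements quoted in the abstract and conclusion follow from the four per-process values above, reading each quoted pair as (DNNR, DNNC) and assuming a simple unweighted mean over the four decay processes. The variable names are hypothetical.

```python
# Per-process q^2 resolution improvements relative to LR, in percent,
# in the order in which the four decay processes are quoted in the text.
dnnr_improvements = [2.04, 2.53, 0.83, 1.53]
dnnc_improvements = [2.25, 3.84, 2.67, 2.02]


def mean(values):
    """Unweighted arithmetic mean."""
    return sum(values) / len(values)


mean_dnnr = mean(dnnr_improvements)
mean_dnnc = mean(dnnc_improvements)
print(round(mean_dnnr, 1), round(mean_dnnc, 1))
```

The unweighted means are about 1.73% and 2.70%, consistent with the quoted 1.7% and 2.7%.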
Fig. 7 Comparison of distribution (a), (b), (c) and (d) . True: q^{2} obtained by using the true momentum of the decaying parent particle to select the better solution, which is the upper threshold in this study; Rand: q^{2} obtained by selecting a solution at random; DNNC: q^{2} obtained by using the trained DNNC model to select the better solution; DNNR (LR): q^{2} obtained by using the momentum (P_{pred}) predicted by the DNNR (LR) to select the better solution 
Table 4 The RMS of for different decay processes (unit: %)
Table 5 The RMS of for different decay processes (unit: %)
5 Conclusion
Compared with the linear regression method, the distributions of the solutions selected by the deep neural network regression and classification methods move toward the upper threshold. Compared with the linear regression algorithm, the resolution of q^{2} (P) is improved on average by 1.7% (8.2%) using the deep neural network regression algorithm and by 2.7% (9.6%) using the deep neural network classification algorithm. With the improved momentum resolution, the effect of the missing neutrino is reduced. The momentum reconstructed by the neutrino closure method is then closer to the true value, which provides a good opportunity to study the kinematics of semileptonic decays. The improved resolution of q^{2} will also lead to better precision of the form factors, providing an opportunity to calculate the branching fractions of undiscovered semileptonic decay processes more accurately and to improve the knowledge of the CKM matrix elements.
References
[1] Alves Jr A A, Andrade Filho L M, Barbosa A F, et al. The LHCb detector at the LHC [J]. Journal of Instrumentation, 2008, 3(8): S08005.
[2] Dominik M. Study of Semileptonic D^{0} Decays for a Measurement of Charm Mixing at LHCb [D]. Heidelberg: University of Heidelberg, 2014.
[3] Ablikim M, Achasov M N, Adlarson P, et al. Future physics programme of BESIII [J]. Chinese Physics C, 2020, 44(4): 040001.
[4] Na H, Davies C T, Follana E, et al. semileptonic decay scalar form factor and from lattice QCD [J]. Physical Review D, 2011, 84(11): 114506.
[5] Shen Y L, Wei Y B. form factors with the B-meson light-cone sum rules [J]. Advances in High Energy Physics, 2022, 2022: 2755821.
[6] Dambach S, Langenegger U, Starodumov A. Neutrino reconstruction with topological information [J]. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 2006, 569(3): 824-828.
[7] Ciezarek G, Lupato A, Rotondo M, et al. Reconstruction of semileptonically decaying beauty hadrons produced in high energy pp collisions [J]. Journal of High Energy Physics, 2017, 2017(2): 21.
[8] Cowan G A, Craik D C, Needham M D. RapidSim: An application for the fast simulation of heavy-quark hadron decays [J]. Computer Physics Communications, 2017, 214: 239-246.
[9] Aaij R, Albrecht J, Alessio F, et al. The LHCb trigger and its performance in 2011 [J]. Journal of Instrumentation, 2013, 8(4): P04022.
[10] Sjöstrand T, Ask S, Christiansen J R, et al. An introduction to PYTHIA 8.2 [J]. Computer Physics Communications, 2015, 191: 159-177.
[11] Lange D J. The EvtGen particle decay simulation package [J]. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 2001, 462(1-2): 152-155.
[12] Golonka P, Kersevan B, Pierzchała T, et al. The tauola-photos-F environment for the TAUOLA and PHOTOS packages, release II [J]. Computer Physics Communications, 2006, 174(10): 818-835.
[13] Aaij R, Beteta C A, Ackernley T, et al. First observation of the decay and a measurement of [J]. Physical Review Letters, 2021, 126(8): 081804.
[14] Flynn J, Hill R, Jüttner A, et al. Semileptonic , , , and decays [J]. Proceedings of Science, 2020, 363: 184.
[15] Aaij R, Adeva B, Adinolfi M, et al. Determination of the quark coupling strength using baryonic decays [J]. Nature Phys, 2015, 11: 743-747.
[16] Aaij R, Beteta C A, Adeva B, et al. Measurement of b hadron fractions in 13 TeV pp collisions [J]. Physical Review D, 2019, 100(3): 031102.
[17] Antunes Nobrega R, Franca Barbos A, Bediaga I, et al. LHCb reoptimized detector design and performance: Technical design report [EB/OL]. [2021-12-10]. https://hal.archives-ouvertes.fr/in2p3-00025912.
[18] Detmold W, Lehner C, Meinel S. and form factors from lattice QCD with relativistic heavy quarks [J]. Physical Review D, 2015, 92(3): 034503.
[19] Aaij R, Beteta C A, Ackernley T, et al. Measurement of with decays [J]. Physical Review D, 2020, 101(7): 072004.
[20] He K M, Zhang X Y, Ren S Q, et al. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification [C]//2015 IEEE International Conference on Computer Vision. New York: IEEE, 2015: 1026-1034.
[21] Chang Z H, Zhang Y, Chen W B. Effective Adam-optimized LSTM neural network for electricity price forecasting [C]//2018 IEEE 9th International Conference on Software Engineering and Service Science. New York: IEEE, 2018: 245-248.
[22] Ketkar N. Deep Learning with Python [M]. Berkeley: Apress, 2017: 97-111.
[23] Abadi M, Barham P, Chen J, et al. TensorFlow: A system for large-scale machine learning [C]//12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16). New York: ACM, 2016: 265-283.
[24] Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: machine learning in python [J]. The Journal of Machine Learning Research, 2011, 12: 2825-2830.
[25] Belov S, Dudko L, Kekelidze D, et al. HepML, an XML based format for describing simulated data in high energy physics [J]. Computer Physics Communications, 2010, 181(10): 1758-1768.
[26] Oliphant T E. A Guide to NumPy [M]. New York: Trelgol Publishing, 2006.
[27] Snider L A, Swedo S E. PANDAS: Current status and directions for research [J]. Molecular Psychiatry, 2004, 9(10): 900-907.
[28] Hanin B. Universal function approximation by deep neural nets with bounded width and Relu activations [J]. Mathematics, 2019, 7(10): 992.