Issue 
Wuhan Univ. J. Nat. Sci.
Volume 27, Number 3, June 2022



Page(s)  240  254  
DOI  https://doi.org/10.1051/wujns/2022273240  
Published online  24 August 2022 
Computer Science
CLC number: TP 182
Improving the Efficiency of MultiObjective Grasshopper Optimization Algorithm to Enhance Ontology Alignment
School of Computer Science, Wuhan University, Wuhan
430072, Hubei, China
^{†} To whom correspondence should be addressed. Email: rongpeng@whu.edu.cn
Received:
18
March
2022
Ontology alignment is an essential and complex task to integrate heterogeneous ontology. The metaheuristic algorithm has proven to be an effective method for ontology alignment. However, it only applies the inherent advantages of metaheuristics algorithm and rarely considers the execution efficiency, especially the multiobjective ontology alignment model. The performance of such multiobjective optimization models mostly depends on the welldistributed and the fastconverged set of solutions in realworld applications. In this paper, two multiobjective grasshopper optimization algorithms (MOGOA) are proposed to enhance ontology alignment. One is εdominance concept based GOA (EMOGOA) and the other is fast Nondominated Sorting based GOA (NSMOGOA). The performance of the two methods to align the ontology is evaluated by using the benchmark dataset. The results demonstrate that the proposed EMOGOA and NSMOGOA improve the quality of ontology alignment and reduce the running time compared with other wellknown metaheuristic and the stateoftheart ontology alignment methods.
Key words: ontology alignment / multiobjective grasshopper optimization algorithm / εdominance / fast nondominated sorting / knowledge integration
Biography: LV Zhaoming, male,Ph. D. candidate, research direction:knowledge engineering, ontology matching and swarm intelligence. Email:zhaominglv@whu.edu.cn
Foundation item: Supported by the Ministry of EducationChina Mobile Joint Fund Project (MCM2020J01)
© Wuhan University 2022
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
0 Introduction
Ontology is a clear formal specification for sharing concepts and is widely used to support knowledge integration and interoperability such as intelligent transportation systems ^{[1]}, industrial megaprojects analytics ^{[2]}, product lifecycle management ^{[3]}, and cognitive and robotic systems^{ [4]}. In recent years, more and more ontologies have been developed by using the Semantic Web standard technologies in the same domain of interest. However, different ontology design experts may use different terms to describe the concept of the same thing, which emerges the heterogeneity problem between ontologies. Therefore, ontologybased data management is becoming more and more complex. In order to apply these knowledge widely, ontology alignment is a key process. The goal of ontology alignment is to find a set of semantic correspondences between different ontologies for further handled knowledgebased systems. In the past few years, many ontology alignment methods or techniques have been introduced in the literature. Especially the method of weighted aggregation for combining multiple basic matchers through swarm and evolutionary algorithms has demonstrated to be an effective method. These methods utilize weights to combine lexicalbased, linguisticbased, and structurebased matchers for finding more correspondences between entities to make full use of the information contained within context of an ontology. Weights based methods are divided into two categories: fixed weights and dynamic weights. However, the fixed weights manually configured according to experimental results have certain limitations, which may reduce the quality of alignment. Therefore, adaptive weights allocation are an effective method for combining matchers adequately.
In recent years, the successful application of metaheuristic algorithm is very noteworthy. Swarm intelligence algorithm has successfully solved the ontology alignment problem and obtained the better ontology alignment quality in the form of a single objective. However, these methods only apply the inherent advantages of metaheuristics and rarely consider the execution efficiency, especially the multiobjective ontology alignment model. Further, the performance of such multiobjective optimization models mostly depends on the welldistributed and the fastconverged set of solutions in specific applications. In order to improve the efficiency and effect of multiobjective ontology alignment methods, in this work, the grasshopper optimization algorithms are improved to perform multiobjective ontology alignment tasks more efficiently. Grasshopper Optimization Algorithm (GOA) is a population based metaheuristic algorithm inspired by the behavioral characteristics of grasshopper swarms to simulate exploration and exploitation processes of the swarm intelligence algorithm ^{[5]}. According to the claim of Bock et al ^{[6]}, ontology alignment as an optimization problem needs to maintain the number of correspondences and a set of correct correspondences at the same time. Therefore, the ontology alignment can be regarded as a multiobjective optimization problem with two objectives: the number of correspondences and the set of correct correspondences. Further, GOA balances the exploration and exploitation in the search process through an adaptive parameter. The high repulsive force between grasshoppers gives GOA the advantages of high exploration and local optimal avoidance. The attraction between grasshoppers makes GOA algorithm have strong exploitation and convergence performance. These characteristics make the GOA algorithm more adaptable to complex multiobjective search problems and surpass other algorithms such as particle swarm optimization (PSO), bat optimization algorithm (BA), Bonobo Optimizer (BO), and firefly algorithm (FA). These advantages motivate us to develop an ontology alignment system based on multiobjective GOA. In this paper, two different multiobjective GOA are proposed to perform ontology alignment tasks in a more efficient way. One is εdominance concept based GOA (EMOGOA) and the other is fast nondominated sorting based GOA (NSMOGOA). The proposed EMOGOA approach uses the grid technology based on εdominance concept to keep the diversity of the nondominated solution in the archive, and reduce the time to converge to the Pareto optimal solution. The NSMOGOA uses a fast nondominated sorting method by assigning ranking and crowding distance to the population, which speeds up the completion of the ontology alignment task.
The remainder of this paper is organized as follows. Section 1 describes the related work of this paper. Section 2 describes the related concepts underlying our research work. Section 3 gives the mathematical model of the multiobjective ontology alignment problem. The modified EMOGOA is proposed in Section 4. Section 5 proposes the NSMOGOA. Section 6 initializes the conditions required for the experiments and the results are discussed. Section 7 concludes with an overview of our future work.
1 Related Works
In the past few years, evolutionary computing technology based ontology alignment methods have been proposed to address the problem of combining multiple basic matchers. These methods are divided into two categories: direct optimization alignment and the ontology metamatching technique. The method of directly optimizing an alignment is to search for a correspondence set in the candidate correspondences. This set is also called an alignment. Furthermore, each confidence value for a correspondence is calculated by combining multiple basic matchers. The most famous method of this type is MapPSO, which directly optimizes an alignment evaluated by a fitness function with two objectives ^{[6]}. The method minimizes the aggregate values of lexicalbased, linguisticbased, and structurebased matchers and evaluates candidate alignments by using fitness functions. The optimization process is performed by using discrete particle swarm optimization. In detail, a candidate solution is randomly selected from the Class, Data property and Object property space. The update process deletes some correspondences when the fitness function decreases, and some new correspondences are added. When the termination condition is reached, a suboptimal alignment is found. The limitation of this method is that the weight value of the combined matchers is fixed. This similar work is GAOM, which uses an evaluation function to evaluate candidate alignments in the optimization process of genetic algorithms^{ [7]}.In recent years, Mohammadi et al ^{ [8]} proposed a method to directly optimize ontology alignment based on simulated annealing algorithm from the perspective of reducing the time of calculation and memory. The authors model the ontology alignment problem as a function of optimizing a state, and an optimal state is obtained by performing a simulated annealing algorithm. This method uses several matchers to calculate a similarity matrix. The confidence of each correspondence is taken from the maximum value calculated by these matchers which utilize all information of a pair of entities such as name, label and comment.
Another category is the ontology metamatching technology, which can be considered to automatically select or configure appropriate weights and filtering thresholds during optimization process ^{[9]}. The earliest metamatching method is GOAL ^{[10]}. Instead of directly optimizing the candidate alignment between two ontologies, this method uses a genetic algorithm to set the weights, and to find a suboptimal alignment. This algorithm searches for different weight configurations, i.e., a candidate solution vector, and finds the best set of weights in infinite search space. The evaluation metric of a solution is to use the precision and recall of the reconciliation. Similarly, Naya et al ^{ [11]} used the genetic algorithm to find out appropriate weights for aggregating different similarity metrics into a final measure. Each individual represents a specific combination of metrics, and the method finds the combination and provides the best alignment. The genetic algorithm was also used in Ref.[12] for optimizing a suboptimal alignment. The difference is that the weight and the threshold are coded in the chromosome at the same time. Xue et al ^{ [13]} utilized multiobjective genetic algorithm (NSGAII) to align ontology. In addition, also in Ref.[14], the multiobjective evolutionary algorithm based on decomposition (MOEA/D) was utilized for solving ontology alignment problem. For these two multiobjective approaches, precision and recall are used as two objectives. Although these metamatching methods have obtained good alignment quality, they rarely consider the execution efficiency of metaheuristics algorithms, especially the multiobjective ontology alignment model.
In recent years, researchers have proposed different methods based on newly developed metaheuristic algorithms to solve the ontology metamatching problem, and these methods do not require a priori knowledge in the alignment process. For example, Acampora et al ^{[15] } proposed an ontology alignment process based on a hybrid evolutionary algorithm, which can effectively aggregate a variety of basic matchers without a priori knowledge. In the end, a suboptimal alignment was obtained between the given two ontologies. In addition, also in Ref.[16], the NSGAII was utilized to solve the ontology metamatching problem and the problem of prior knowledge. Ryma et al ^{[17]} proposed a hybridization method by combining the Hill climbing local search technique into the genetic algorithm for large ontology alignment without a priori knowledge. Forsati et al ^{ [18]} proposed an effective method based on Harmony Search (HSOMap). This method uses a weighted harmonicmean method to aggregate various matchers into a single confidence among all possible correspondences between two ontologies. Lv et al ^{ [19]} proposed a novel approach by using grasshopper optimization algorithm to solve the ontology metamatching problem. This method makes use of various matchers and does not require a priori knowledge in the alignment process. As each basic matcher can capture the information of an entity from a different perspective, the aggregation technique of multiple matchers can enhance the matching effect of the algorithm. For deep reviews of other metaheuristics based methods, consider the following articles: utilizing Memetic algorithm through partial reference alignment to evaluate candidate solution^{ [20]}, using Memetic algorithm through MatchFmeasure and Unanimous improvement ratio to evaluate candidate solution ^{[21]}, using the Tabu search improved compact evolutionary algorithm to match sensor ontologies ^{[22]}, and matching ontology based on instance by using Memetic algorithm ^{[23]}. These methods also only focus on solving the problem of relying on prior knowledge and improving the quality of ontology alignment without considering the execution efficiency of metaheuristic algorithms.
In this paper, the welldistributed and the fastconverged set of solutions for multiobjective metaheuristic algorithms are considered. Two different GOA are proposed for multiobjective ontology alignment.
2 Preliminaries
In this section, the ontology, ontology alignment, and two basic concepts related to MOOP are defined as follows.
2.1 Ontology and Ontology Alignment
Definition 1 (Ontology) An ontology is formalized as O =<C, P, I, V, R> , where:
1) C represents a collection of classes;
2) P = {P _{1}, P _{2}, ⋯, P_{n} } is a collection of properties;
3) I = {I _{1}, I _{2}, ⋯, I_{n} } represents instance information of concepts;
4) V indicates the annotations information of concepts;
5) R = {R _{1}, R _{2}, ⋯, R_{n} } represents axioms of concepts.
Definition 2 (Ontology alignment)
Ontology alignment is a task of finding a set of pair of entities between different ontologies. In this work, a pair of entities is also called correspondence, matching or mapping. The algorithm functioned to achieve this task can be defined as follows. This algorithm takes three parameters ${O}_{\mathrm{1}},{O}_{\mathrm{2}},\mathrm{a}\mathrm{n}\mathrm{d}\text{}P$ as input and returns an alignment M, which is a set of finding correspondences. The ${O}_{\mathrm{1}}$ and ${O}_{\mathrm{2}}$ are two ontologies, and P is a set of parameters^{ [24]}.
$\begin{array}{c}M=F\left({O}_{\mathrm{1}},{O}_{\mathrm{2}},P\right)\end{array}$(1)
Definition 3 (Pareto dominance)
Assuming x and y are any two solution vectors in the evolutionary population, then if x is called to dominate y (denoted by$\text{}\mathit{x}\prec \mathit{y}$) , it must meet two conditions: For all subobjectives, x is not worse than y and at least one subobjective makes that x is better than y . This mathematical model is defined as follows.
$\begin{array}{c}\begin{array}{l}\left(\forall k\in \left\{\mathrm{1,2},\cdots ,r\right\}:\text{}{f}_{k}\left(\mathit{x}\right)\le {f}_{k}\left(\mathit{y}\right)\right)\bigwedge \\ \left(\exists l\in \left\{\mathrm{1,2},\cdots ,r\right\}:\text{}{f}_{l}\left(\mathit{x}\right){f}_{l}\left(\mathit{y}\right)\right)\end{array}\end{array}$(2)
Definition 4 (Pareto optimal solution) A solution x is said a Pareto optimal solution because it is not dominated by any vector in the decision space$\text{}\mathrm{\Omega}$. The Pareto optimal solution is mathematically defined as follows:
$\begin{array}{c}\nexists y\in \text{}\mathrm{\Omega}\text{}:\mathit{y}\prec \mathit{x}\end{array}$(3)
Definition 5 (Pareto optimal solution set) A Pareto optimal solution set (POS) is composed of multiple Pareto optimal solutions ^{[25]}.
$\begin{array}{c}\mathrm{P}\mathrm{O}\mathrm{S}=\end{array}\{\mathit{x}\text{}\nexists \mathit{y}\in \text{}\mathrm{\Omega}\text{}:\mathit{y}\prec \text{}\mathit{x}\}$(4)
2.2 MultiObjective Grasshopper Optimization Algorithm
In 2018, Mirjalili et al ^{[26]} proposed a multiobjective grasshopper optimization algorithm inspired from the behavior of grasshoppers swarms. In order to solve the optimization problem, a mathematical model was proposed by simulating the foraging behavior of grasshopper swarms. In this model, a social interaction function realizes the attraction and repulsion between grasshopper swarms. Furthermore, this repulsive force realizes the exploration process of GOA in a search space, and this attraction force realizes the exploitation and convergence process. In order to balance these two search processes, an adaptive comfort zone coefficient leads to an appropriate balance between exploration and exploitation. In the multiobjective version, the Pareto dominance, an archive and target selection techniques are introduced into the algorithm to optimize a Pareto optimal solution set. Further, the Pareto optimal dominance is integrated into GOA to compare solutions with multiple objectives. In order to obtain a nondominated solution set, i.e., Pareto optimal solution set, an external archiving technique is used to store the nondominated solutions and is updated after the population evolves. The mathematical model is defined as follows:
$\begin{array}{c}{\mathit{X}}_{i}={c}_{w}\left({\displaystyle \sum _{\begin{array}{c}j=\mathrm{1}\\ j\ne i\end{array}}^{N}}{c}_{n}\frac{pq}{\mathrm{2}}S\left({d}_{ij}\right)\stackrel{\u0311}{{d}_{ij}}\right)+\stackrel{\u0311}{{T}_{d}}\end{array}$(5)
${\mathit{X}}_{i}$ is the updated position vector of the ith grasshopper; ${c}_{w}$ is similar to the inertia weight of particle swarm optimization, which balances the exploration and exploitation; p and q are the upper and lower bounds of the search range; $\stackrel{\u0311}{{T}_{d}}$ is the currently searched optimal target value, and $\stackrel{\u0311}{{d}_{ij}\text{}}$ is a unit vector from the ith grasshopper to jth grasshopper; S is the attraction function which is defined as follows.
$\begin{array}{c}S\left({d}_{ij}\right)=u{\mathrm{e}}^{\frac{{d}_{ij}}{\tau}}{\mathrm{e}}^{{d}_{ij}}\text{}\end{array}$(6)
where d ${}_{ij}$ is the distance between the ith and the jth grasshopper; u is the strength of attraction; τ indicates the attractive length scale. The unit vector $\stackrel{\u0311}{{d}_{ij}\text{}}$ is defined as follows:
$\begin{array}{c}\stackrel{\u0311}{{d}_{ij}}=\frac{{\mathit{x}}_{j}{\mathit{x}}_{i}}{{d}_{ij}}\end{array}$(7)
where ${\mathit{x}}_{j}$ is the current position vector of the jth grasshopper; ${\mathit{x}}_{i}$ is the current position vector of the ith grasshopper. In equation (5), ${c}_{n}$ is an internal parameter, which is the decreasing coefficient to shrink the comfort zone, repulsion zone and attraction zone; The ${c}_{w}$ and ${c}_{n}$ is defined as follows:
$\begin{array}{c}{c}_{w}={c}_{n}={c}_{\mathrm{m}\mathrm{a}\mathrm{x}}t\frac{{c}_{\mathrm{m}\mathrm{a}\mathrm{x}}{c}_{\mathrm{m}\mathrm{i}\mathrm{n}}}{L}\end{array}$(8)
where ${c}_{\mathrm{m}\mathrm{a}\mathrm{x}}$ and ${c}_{\mathrm{m}\mathrm{i}\mathrm{n}}$ indicate the maximum value and minimum value of the parameter c, respectively; t is the current iteration number; L is the maximum number of iterations.
3 Mathematical Model of the MultiObjective Ontology Alignment Method
According to the above analysis, the approach using metaheuristics needs to be modeled as multiobjective optimization problem to solve the ontology alignment problem in a more suitable way. A mathematical model for the multiobjective ontology matching problem is defined. Subsequently, the data structure of the grasshopper is represented, and two objective functions are introduced. The mathematical model of the multiobjective optimization problem is defined as follows:
Given a vector$\text{}\mathit{X}=\left({x}_{\mathrm{1}},{x}_{\mathrm{2}},\cdots ,\text{}{x}_{n}\right),\text{}\mathit{X}\in \mathrm{\Omega}$ and it satisfies the following conditions:
${g}_{i}\left(\mathit{X}\right)\ge \mathrm{0},\text{}i=\mathrm{1,2},\cdots ,k$
${h}_{l}\left(\mathit{X}\right)=\mathrm{0},\text{}i=\mathrm{1,2},\cdots ,l$
Considering a maximization problem for ontology alignment, there are k conflicting objectives to be optimized. This optimization function is defined as follows:
$\begin{array}{c}\mathrm{M}\mathrm{a}\mathrm{x}\mathrm{i}\mathrm{m}\mathrm{i}\mathrm{z}\mathrm{e}\text{}f\left(\mathit{X}\right)=\left({f}_{\mathrm{1}}\left(\mathit{X}\right),{f}_{\mathrm{2}}\left(\mathit{X}\right),\cdots ,{f}_{k}\left(\mathit{X}\right)\right)\end{array}$(9)
For a candidate solution in the population, each grasshopper position also represents a set of weights used in the basic matcher aggregation task and is associated with a candidate alignment in the metamatching method. The value of each weight indicates the contribution for a basic matcher. In order to adequately capture the information of the ontology, the matchers with different capabilities are used in the proposed method. Therefore, a candidate solution vector X is represented as follows.
$\{\begin{array}{l}\mathit{X}=\left({r}_{\mathrm{1}},{r}_{\mathrm{2}},\cdots ,\text{}{r}_{n}\right)\\ {r}_{i}={\displaystyle \sum _{i=\mathrm{1}}^{n}}{w}_{i}\times {m}_{i}({c}_{e}),\mathrm{s}.\mathrm{t}.{\displaystyle \sum _{i=\mathrm{1}}^{n}}{w}_{i}=\mathrm{1}\end{array}$(10)
where ${c}_{e}$ is a correspondence, ${w}_{i}$ is the weight of the ith matcher, ${m}_{i}$ (i∈[1,n]) is the ith matcher, n is the number of matchers.
Inspired by Bock et al ^{[6]}, the transformation of ontology alignment task into an optimization problem requires two objectives: 1) the number of correspondences; 2) a set of correct correspondences. Therefore, two evaluation functions were proposed. Furthermore, we consider the matching probability of correspondences for a candidate alignment found during perform. It is formulated to the first objective, which is formally defined as follows:
$\begin{array}{c}{f}_{\mathrm{1}}\left(\mathit{X}\right)=\frac{\left\mathit{X}\right}{\left{O}_{\mathrm{1}}\times {O}_{\mathrm{2}}\right}\end{array}$(11)
In order to protect the elite individuals, it is necessary to maximize the average confidence of correspondences in an alignment, which is formally defined as follows:
$\begin{array}{c}{f}_{\mathrm{2}}\left(\mathit{X}\right)=\frac{{\displaystyle \sum _{i=\mathrm{1}}^{\left\mathit{X}\right}}{r}_{i}}{\left\mathit{X}\right}\end{array}$(12)
where $\left\mathit{X}\right$ is the number of correspondences for an alignment found so far, ${r}_{i}$ is the confidence weighted for a correspondence, and $\left{O}_{\mathrm{1}}\times {O}_{\mathrm{2}}\right$ represents the total number of candidate correspondences between two ontologies.
In the next sections, two versions of the multiobjective grasshopper optimization algorithm are proposed to investigate the matching quality and calculation time for solving ontology alignment problem.
4 Proposed EMOGOA Algorithm
Considering that the two essential components of a multiobjective optimization algorithm are the selecting of nondominated solutions and the maintaining of the diversity of nondominated solutions in the optimization process, we propose an archive update mechanism based on the εdomination concept to improve the distribution and convergence speed. In the proposed EMOGOA, a grid or hyperbox mechanism based on the εdominance concept is used to improve the original MOGOA which is only based on the Pareto dominance. The main reason for this modification is that in the steadystate MOEA experiment by εdominance, it has been proved to be a good compromise method in terms of approaching the Pareto optimal front, the diversity of the solution, and the wellconverged ^{[27]}. Furthermore, this mechanism maintains the diversity of solutions in the archive by allowing only one solution to be displayed in each preallocated hyperbox. In EMOGOA optimization process, the new solutions generated are compared with each member in the archive by using εdominance. First of all, an identification array $\{\mathit{B}=({B}_{\mathrm{1}},{B}_{\mathrm{2}},\cdots ,{B}_{k}{)}^{\mathrm{T}}\}$ is assigned to each member in the archive, and it is defined as follows:
$\begin{array}{c}{B}_{k}\left(f\right)=\{\begin{array}{c}\u23bf\frac{{f}_{k}{f}_{k}^{\mathrm{m}\mathrm{i}\mathrm{n}}}{{\epsilon}_{k}}\u23cc,\text{}\mathrm{f}\mathrm{o}\mathrm{r}\text{}\mathrm{m}\mathrm{i}\mathrm{n}\mathrm{i}\mathrm{m}\mathrm{i}\mathrm{z}\mathrm{i}\mathrm{n}\mathrm{g}\text{}{f}_{k}\\ \u23be\frac{{f}_{k}{f}_{k}^{\mathrm{m}\mathrm{i}\mathrm{n}}}{{\epsilon}_{k}}\u23cb,\text{}\mathrm{f}\mathrm{o}\mathrm{r}\text{}\mathrm{m}\mathrm{a}\mathrm{x}\mathrm{i}\mathrm{m}\mathrm{i}\mathrm{z}\mathrm{i}\mathrm{n}\mathrm{g}\text{}{f}_{k}\text{}\end{array}\end{array}$(13)
where k is the number of problem objectives, and ${f}_{k}^{\mathrm{m}\mathrm{i}\mathrm{n}}$ is the minimum value of the kth objectie and ${\epsilon}_{k}$ is the allowable tolerance in the kth objective.
Therefore, this mechanism divides the whole objective space into grid clusters. The identification array ${B}_{x}$ of the new solution x and the identification array ${B}_{a}$ of each archive member a are calculated, respectively. This new solution x is not accepted when the identification array ${B}_{a}$ of any member a in the archive dominates ${B}_{a}$ of the new solution x . That is, the new solution x is dominated by the archive member εdominance. Instead, this archive member a is deleted, and the new solution x is accepted when the new solution ${B}_{x}$ dominates the ${B}_{a}$ of any archive member a. If neither of the above two situations occurs, it means that the new solution x and the members in the archive are εnondominance. Two cases need to be analyzed. If the new solution shares a hyperbox with an archive member, the general nondominated relationships are used to detect them first. If this new solution x dominates the archive member or it is nondominated by the archive member and is closer to the B vector than the archive member, then the new solution x is retained and the archive member is removed from the archive. If the new solution x does not share the hyperbox with any archive member, then the new solution x is accepted ^{[27]}.
The above analysis shows that each hyperbox is occupied by only one hyperbox at the Pareto optimal front. The εdominance concept have three advantages: 1) the maintenance of diversity; 2) consistency of the size of the final archive and the total number of Pareto optimal solutions; 3) the rapid convergence of Pareto optimal solutions. The EMOGOA is summarized in Algorithm 1.
Algorithm 1: Summary of EMOGOA 

1: Load two ontologies O
_{1} and O
_{2}
2: Construct solution space by similarity measure 3: Initialize maximum number of iterations(maxIter), c ${}_{\mathrm{m}\mathrm{a}\mathrm{x}}$ , c ${}_{\mathrm{m}\mathrm{i}\mathrm{n}}$ , the population ${X}_{i}(i=\mathrm{1,2},\cdots ,n)$; 4: Calculate the fitness of each grasshopper by using Eqs. (11) and (12). 5: Select the target T of population by using Eq.(2). 6: while (t < maxIter) 7: Update c using Eq. (8) 8: for each search individual 9: Normalize the distances between individuals in [1,4] 10: Update the position of the current individual by using Eq. (5) 11: End for 12: for each individual 13: Evaluate individual by using Eqs. (11) and (12) 14: Update the archive by using the ε dominance technique. 15: End for 16: Select the target from the archive by using the roulette wheel strategy. 17: Update target position and fitness. 18: t=t+1; 19: End while 20: End 
5 Proposed NSMOGOA Algorithm
In the proposed NSMOGOA, a fast nondominated sorting (FNS) algorithm is used to replace the calculation of nondominated solutions in MOGOA, and to improve the performance of MOGOA. The simulation results of this FNS algorithm in NSGAII show that it can preserve the diversity of solutions and better convergence. This FNS algorithm mainly includes two operations: 1) Nondominated ranking for the whole population; 2) Calculating the crowding distance of each individual^{ [28]}. The NSMOGOA is summarized in Algorithm 2.
First, for the nondominated ranking, there are two suboperations that need to be performed. The first one is to count the number of dominating solutions p. It is also called the dominating number N ${}_{p}$; the second operation saves the solutions dominated by this solution p. It is worth noting that the solution with a dominance number 0 is naturally called the first nondominated level. Next, the algorithm scans all solutions with N ${}_{p}$=0 and decrements their dominance numbers. When the dominance number of any solution q decreases by 0, the solution q is saved in a list Γ. The members from list Γ are considered the second nondominant level. This operation continues until all levels are recognized.
For the calculation of crowding distance, this operation estimates the density around a particular solution in the population. Specifically, this operation sorts the population in ascending order according to each objective value. For each objective, the solution with the largest and smallest function value is assigned an infinite distance value. The intermediate solution is assigned a difference v, which is equal to the difference between the two neighboring solutions of this solution. This calculation is continued until all target values are reached. As a result, the crowding distance of each individual is the sum of each object difference v of each solution. Further, the crowding distance for the zth solution is half the circumference of the cuboid. The method is depicted in Fig. 1.
Fig. 1 The method of calculating for the crowding distance of the zth solution Note: The points linked by dashed lines indicate that they are at the same nondominant level 
In the selection process, the FNS algorithm preferentially selects the solution with a low nondominated rank when two solutions with different nondominated ranks. When these two solutions have the same ranking, the one in a spacious area (that is, a less crowded area) is selected first.
Algorithm 2: Summary of NSMOGOA 

1: Load two ontologies O
_{1} and O
_{2}
2: Construct solution space by similarity measure 3: Initialize maximum number of iterations(maxIter), c _{max}, c _{min},the population ${X}_{i}(i=\mathrm{1,2},\cdots ,n)$; 4: Calculate the fitness of each grasshopper by using Eqs. (11) and (12). 5: Select the target T of population by using Eq.(2). 6: while (t < maxIter) 7: Update c using Eq. (8) 8: for each search individual 9: Normalize the distances between individuals in [1,4] 10: Update the position of the current individual by using Eq. (5) 11: Evaluate individual by using Eqs. (11) and (12) 12: End for 13: Update the population by using the fast nondominated sorting technique. 14: Select the target from the individuals with lower rankings by using the roulette wheel strategy. 15: Update target position and fitness.// Update current target 16: t=t+1; 17: End while 18: End 
6 Experiments
In order to evaluate the performance of the improved algorithms in performing ontology alignment tasks, three experiments are performed. In the first experiment, the selection strategy for this target is studied. The second experiment compares traditional MOGOA with other commonly used metaheuristic algorithms. In the third experiment, two improved methods were compared with the Ontology Alignment Evaluation Initiative (OAEI) methods.
6.1 Evaluation Methods and Dataset
The result of ontology alignment is a set of correspondences, which are usually evaluated using three indicators: precision, recall and Fmeasure from the Information Retrieval field^{ [29]}. Precision represents the percentage of correctly matched mappings in all mappings found. Recall represents the percentage of correctly matched mappings in all correct mappings from the reference alignment. It is worth noting that if a system only retrieves the correct correspondence, the number of correct correspondence retrieved is reduced, but the total number of retrieved correspondences will be significantly reduced, which leads to a high precision. When the number of correct correspondences decreases, the recall becomes smaller. Therefore, precision and recall are two conflicting indicators. In the actual evaluation, it is not complete to evaluate the performance of a system by only applying the precision and the recall, and they must be well combined^{[9]}. For this reason, the Fmeasure is used, which represents the weighted harmonic mean of precision and recall. In this experiment, precision is expressed as Prec, recall is expressed as Rec , Fmeasure is represented as F1.
$\begin{array}{c}\mathrm{P}\mathrm{r}\mathrm{e}\mathrm{c}\mathrm{i}\mathrm{s}\mathrm{i}\mathrm{o}\mathrm{n}=\frac{\left{R}_{\mathrm{r}\mathrm{e}\mathrm{l}}\bigcap {A}_{\mathrm{r}\mathrm{e}\mathrm{t}}\right}{\left{A}_{\mathrm{r}\mathrm{e}\mathrm{t}}\right}\end{array}$(14)
$\begin{array}{c}\mathrm{R}\mathrm{e}\mathrm{c}\mathrm{a}\mathrm{l}\mathrm{l}=\frac{\left{R}_{\mathrm{r}\mathrm{e}\mathrm{l}}\bigcap {A}_{\mathrm{r}\mathrm{e}\mathrm{t}}\right}{\left{R}_{\mathrm{r}\mathrm{e}\mathrm{t}}\right}\end{array}$(15)
$\begin{array}{c}F\mathrm{m}\mathrm{e}\mathrm{a}\mathrm{s}\mathrm{u}\mathrm{r}\mathrm{e}=\mathrm{2}\times \frac{\mathrm{P}\mathrm{r}\mathrm{e}\mathrm{c}\mathrm{i}\mathrm{s}\mathrm{i}\mathrm{o}\mathrm{n}\times \mathrm{R}\mathrm{e}\mathrm{c}\mathrm{a}\mathrm{l}\mathrm{l}}{\mathrm{P}\mathrm{r}\mathrm{e}\mathrm{c}\mathrm{i}\mathrm{s}\mathrm{i}\mathrm{o}\mathrm{n}+\mathrm{R}\mathrm{e}\mathrm{c}\mathrm{a}\mathrm{l}\mathrm{l}}\end{array}$(16)
where$\text{}{R}_{\mathrm{r}\mathrm{e}\mathrm{l}}\text{}$is the relevant mappings given by OAEI, and ${A}_{\mathrm{r}\mathrm{e}\mathrm{t}}$ is the mappings retrieved by the system.
In order to study the performance of the proposed EMOGOA and NSMOGOA methods, twentythree ontology alignment tasks are performed, which come from the OAEI. Since the maximum number of executions for the OAEI2016 systems is five, we performed each task independently 5 times and calculated the average value for the fairness. These test tasks are described in Table 1.
Brief of benchmarks
6.2 Parameter Settings
In order to maintain an ontology alignment system with high accuracy and applicability to input different ontologies, in this work, basic matchers with different capabilities are used to calculate the similarity for a pair of entities.The matcher means Jero measure^{[30]}, Vector Space Model^{[31]}, WordNet^{[32]},Levhenstein^{[33]}, Hierarchy distance^{[15]} and Numberedhierarchy distance^{[15]}, respectively. The parameter settings of each method are shown in Table 2.
Parameters configuration of each method
6.3 Experiment and Comparision
6.3.1 Target selection strategy
The next position of the grasshopper is moved by learning the current position, the target position and the positions of all other grasshoppers. Therefore, the choice of the target is an important component, which improves the distribution of nondominated solutions in the archive and guides the grasshopper population toward promising regions. However, since the choice of this target needs to choose one from the Pareto optimal solution set, the choice of this target is a challenge in multiobjective MOGOA. In this work, we experimented with two selection techniques: roulette wheel selection (RWS) technique and square root distance (SRD) technique^{ [34]}. Therefore, in our work, the roulette selection strategy and the square root distance strategy are verified to select targets from the archive. For roulette selection, the author counts the number of neighbors of each solution in the archive based on a fixed distance as crowding distance. Subsequently, the members in the archive are ranked according to the crowding distance. The archive ranked is employed by the roulette selection strategy to select a target. For the square root distance, we calculated the distance between each member in the archive and the current target by using Eq.(17). The member with the smallest SRD is selected as the new target.
$\begin{array}{c}\mathrm{S}\mathrm{R}\mathrm{D}\left({x}_{\mathrm{1}},{x}_{\mathrm{2}}\right)={\displaystyle \sum _{k=\mathrm{1}}^{\mathrm{2}}}\sqrt[]{\left{f}_{k}\left({x}_{\mathrm{1}}\right){f}_{k}\left({x}_{\mathrm{2}}\right)\right}\end{array}$(17)
where $k$ is the number of the problem objective.
The results of this experiment are shown in Table 3. As Table 3 shown, the RWS strategy is better than the SRD strategy in terms of matching accuracy and execution time. Therefore, RWS strategy is used to select the target that is suitable for ontology matching.
Comparison of target selection strategy
6.3.2 Comparison between proposed methods and the other metaheuristic algorithms
In order to investigate the performance of the proposed EMOGOA and NSMOGOA methods, we summarized the average Fmeasure and time, and compared them with the classic single and multiobjective metaheuristic algorithms. Table 4 shows the comparison results with the multiobjective metaheuristic algorithms. For the average F1, this modified EMOGOA and NSMOGOA achieve better alignment quality compared with other multiobjective algorithms. The main reason is that EMOGOA maintains the diversity in the archive by allowing only one solution in each preallocated hyperbox of the Pareto optimal front. Another reason is that it emphasizes nondominated solutions. Compared with the original MOGOA, the NSMOGOA can maintain a better solution diversity and better converge in the obtained nondominated front. The SPEA2^{[27]} obtains a poor alignment quality. Considering the precision for Eq.(14), if we select as many correct correspondences as possible, that is, the precision is high, then only the most likely correct correspondences will be selected. In this case, the total number of correspondences selected by the algorithm will be reduced, and the correct correspondences selected by the algorithm will be correspondingly reduced, but the denominator is reduced faster so that the precision becomes higher. The precision of EMOGOA and NSMOGOA are 0.985 and 0.989, respectively, which shows that, in most cases, the performance of the proposed methods to find the correct correspondence is enhanced.
Comparison results with the singleobjective algorithm in Table 5 shows that the proposed EMOGOA and NSMOGOA outperform the GAbased and PSObased methods. The main reason is that multiobjective optimization algorithms can maintain a better solution diversity and better converge in the obtained nondominated front. Furthermore, the multiobjective optimization algorithm has better performance than singleobjective for ontology alignment problem.
Table 6 shows the average running time of each algorithm. It is clear that the time of the proposed EMOGOA is much less than NSMOGOA, MOGOA, SPEA2, and GA. The main reason is that EMOGOA employs the technology of εdominance to update the archive. This mechanism can reduce the time to converge to the Pareto optimal solution set and find a highquality alignment. Although the running time of NSMOGOA is higher than that of EMOGOA, it is lower than MOGOA. Furthermore, this NSMOGOA modified by employing a fast nondominated solution sorting can speed up the convergence of nondominated solutions and better maintain the diversity of solutions.
Comparison with other multiobjective metaheuristic algorithms on 23 matching tasks
Comparison results with other singleobjective metaheuristic algorithms on 23 matching tasks
Comparison of the time with MOGOA, SPEA2, GA and PSO (unit:ms)
6.3.3 Comparison between proposed methods and the other OAEI methods
In order to further study the performance of the proposed methods, the top system AML${}^{[\mathrm{35}]}$, LogMap${}^{[\mathrm{36}]}$ and XMap${}^{[\mathrm{37}]}$ in the OAEI2016 competition are selected. It is worth noting that there are no results of tasks 103, 104, 203, 204, 205 and 231 on the OAEI official website. Therefore, these results are removed from our experimental results. Table 7 is the comparision of the proposed method with the top system of OAEI competition,and the results show that the proposed EMOGOA and NSMOGOA methods significantly improve the quality of alignment.
Comparison of the proposed method with the top system of OAEI competition
7 Conclusion
The improvement of multiobjective evolutionary algorithms usually considers the two goals of maintaining the diversity of nondominated solution sets and converging to the true Pareto optimal front. However, it is often ignored to achieve these two goals with rapid calculation. In this paper, the fast calculation manner for multiobjective metaheuristic algorithms to align ontology are considered. Two different grasshopper optimization algorithms (GOA) are proposed for multiobjective ontology alignment. The performance of the proposed two methods is evaluated on the benchmark dataset. The results show that the proposed two methods outperform other metaheuristicbased methods in terms of alignment quality. In particular, the EMOGOA method achieves the best performance. For the running time, compared with SPEA2, GA, PSO, despite the long execution time of NSMOGOA, the alignment quality has been improved. This EMOGOA is only 0.019 49 seconds higher than PSO. It can be concluded that the two proposed multiobjective ontology alignment methods are successful for improving execution efficiency. In future work, we plan to further improve the performance of the proposed two methods in large ontology alignment.
References
 Fernández S, Ito T. Semantic integration of sensor data with SSN ontology in a multiagent architecture for intelligent transportation systems[J]. IEICE Trans Inf Syst, 2017,12: 29152922. [CrossRef] [Google Scholar]
 Zangeneh P, McCabe B. Ontologybased knowledge representation for industrial megaprojects analytics using linked data and the semantic web[J]. Adv Eng Informatics, 2020, 46:101164. [CrossRef] [Google Scholar]
 Talhi A, Fortineau V, Huet J C, et al. Ontology for cloud manufacturing based product lifecycle management[J]. J Intell Manuf, 2019, 30(5) : 21712192. [CrossRef] [Google Scholar]
 Azevedo H, Belo J P R, Romero R A F. Using ontology as a strategy for modeling the interface between the cognitive and robotic systems[J]. J Intell Robotic Syst, 2020, 99(3):431449. [CrossRef] [Google Scholar]
 Saremi S, Mirjalili S, Lewis A. Grasshopper optimisation algorithm: Theory and application[J]. Advances in Engineering Software, 2017, 105: 3047. [CrossRef] [Google Scholar]
 Bock J, Hettenhausen J. Discrete particle swarm optimisation for ontology alignment[J]. Inf Sci, 2012, 192: 152173. [CrossRef] [Google Scholar]
 Wang J L, Ding Z J, Jing C J. GAOM: Genetic algorithm based ontology alignment[C]// AsiaPacific Services Computing Conference (APSCC) . Washington D C: IEEE, 2006: 617620. [Google Scholar]
 Mohammadi M, Hofman W, Tan Y H. Simulated annealingbased ontology alignment[J]. ACM Trans Manag Inf Syst, 2019, 10(1): 3:13:24. [Google Scholar]
 Gil J M, Montes J F A. Evaluation of two heuristic approaches to solve the ontology metamatching problem[J]. Knowl Inf Syst, 2011, 26(2): 225247. [CrossRef] [Google Scholar]
 Gil J M, Alba E, Montes J F A. Optimizing ontology alignments by using genetic algorithms [C]// 7th International Semantic Web Conference (ISWC). Berlin Heidelberg: SpringerVerlag, 2008: 115. [Google Scholar]
 Naya J M V, Romero M M, Loureiro J P, et al. Improving ontology alignment through genetic algorithms[C]// Soft Computing Methods for Practical Environment Solutions: Techniques and Studies. Hershey: IGI Global, 2010: 240259. [Google Scholar]
 Ginsca A L, Iftene A. Using a genetic algorithm for optimizing the similarity aggregation step in the process of ontology alignment[C]// 9th Roedunet International Conference (RoEduNet). Washington D C: IEEE, 2010: 118122. [Google Scholar]
 Xue X S, Wang Y P. Improving the efficiency of NSGAII based ontology aligning technology[J]. Data Knowl Eng, 2017, 108: 114. [CrossRef] [Google Scholar]
 Xue X S, Wang Y P, Hao W C. Using MOEA/D for optimizing ontology alignments[J]. Soft Comput, 2014,18(8):15891601. [CrossRef] [Google Scholar]
 Acampora G, Loia V, Vitiello A. Enhancing ontology alignment through a memetic aggregation of similarity measures[J]. Inf Sci, 2013, 250: 120. [CrossRef] [Google Scholar]
 Acampora G, Kaymak U, Loia V, et al. Applying NSGAII for solving the ontology alignment problem[C]// International Conference on Systems, Man, and Cybernetics(SMC). Washington D C: IEEE , 2013: 10981103. [Google Scholar]
 Ryma G, Kholladi M K. Genetic algorithm with hill climbing for correspondences discovery in ontology mapping[J]. J Inf Technol Res, 2019, 12(4):153170. [CrossRef] [Google Scholar]
 Forsati R, Shamsfard M. Symbiosis of evolutionary and combinatorial ontology mapping approaches[J]. Inf Sci, 2016, 342: 5380. [CrossRef] [Google Scholar]
 Lv Z M, Peng R. A novel metamatching approach for ontology alignment using grasshopper optimization[J]. Knowl Based Syst, 2020: 201202. Article ID: 106050. [Google Scholar]
 Xue X S, Wang Y P, Ren A H. Optimizing ontology alignment through memetic algorithm based on partial reference alignment[J]. Expert Syst Appl, 2014, 41(7): 32133222. [CrossRef] [Google Scholar]
 Xue X S, Wang Y P. Optimizing ontology alignments through a memetic algorithm using both MatchFmeasure and unanimous improvement ratio[J]. Artif Intell, 2015, 223:6581. [CrossRef] [Google Scholar]
 Xue X S, Wang Y P. Using compact evolutionary Tabu Search algorithm for matching sensor ontologies[J]. Swarm Evol Comput, 2019, 48: 2530. [CrossRef] [Google Scholar]
 Xue X S, Wang Y P. Using memetic algorithm for instance coreference resolution[J]. IEEE Trans Knowl Data Eng, 2016, 28(2): 580591. [CrossRef] [Google Scholar]
 Euzenat J, Shvaiko P. Ontology Matching[M]. Berlin Heidelberg: SpringerVerlag, 2013. [CrossRef] [Google Scholar]
 Das A K, Nikum A K, Krishnan S V, et al. Multiobjective Bonobo Optimizer (MOBO): An intelligent heuristic for multicriteria optimization[J]. Knowl Inf Syst, 2020, 62(11):44074444. [CrossRef] [Google Scholar]
 Mirjalili S Z, Mirjalili S, Saremi S, et al. Grasshopper optimization algorithm for multiobjective optimization problems[J]. Appl Intell, 2018, 48 (4): 805820. [CrossRef] [Google Scholar]
 Deb K, Mohan M, Mishra S. Evaluating the epsilondomination based multiobjective evolutionary algorithm for a quick computation of Paretooptimal solutions[J]. Evol Comput, 2005,13(4): 501525. [CrossRef] [PubMed] [Google Scholar]
 Deb K, Agrawal S, Pratap A. et al. A fast and elitist multiobjective genetic algorithm: NSGAII[J]. IEEE Trans Evol Comput, 2002, 6(2): 182197. [CrossRef] [Google Scholar]
 Yates R A B, Neto B A R. Modern Information Retrieval[M]. New York: ACM Press / AddisonWesley ,1999. [Google Scholar]
 Cohen W W, Ravikumar P, Fienberg S E. A comparison of string distance metrics for namematching tasks [C]// Proc of the KDD Workshop on Data Cleaning and Object Consolidation. Washington D C: IEEE, 2003, 3: 7378. [Google Scholar]
 Salton G, Wong A, Yang C S. A vector space model for automatic indexing[J]. Commun ACM, 1975, 18(11): 613620. [CrossRef] [Google Scholar]
 Miller G A. WordNet: A lexical database for English[J].Communications of the ACM, 1995, 38(11): 3941. [CrossRef] [Google Scholar]
 Scherbina A, Kuznetsov S. Clustering of Web sessions using Levenstein metric[J]// Industrial Conference on Data Mining . Berlin: SpringerVerlag, 2004: 127133. [Google Scholar]
 Saxena N, Mishra K K. Improved multiobjective particle swarm optimization algorithm for optimizing watermark strength in color image watermarking[J]. Appl Intell, 2017, 47(2): 362381. [CrossRef] [Google Scholar]
 Faria D, Pesquita C, Balasubramani B S, et al. OAEI 2016 results for AML[C]// Proceedings of the 11th International Workshop on Ontology Matching. Berlin: SpringerVerlag, 2016: 154160. [Google Scholar]
 Ruiz E J, Grau B C, Cross V V. LogMap family participation in the OAEI 2016[C]// Proceedings of the 11th International Workshop on Ontology Matching. Berlin: SpringerVerlag, 2016: 201203. [Google Scholar]
 Djeddi W E, Khadir M T, Yahia S B. XMap: results for OAEI 2016[C]// Proceedings of the 11th International Workshop on Ontology Matching, Berlin: SpringerVerlag, 2016: 222226. [Google Scholar]
All Tables
Comparison with other multiobjective metaheuristic algorithms on 23 matching tasks
Comparison results with other singleobjective metaheuristic algorithms on 23 matching tasks
All Figures
Fig. 1 The method of calculating for the crowding distance of the zth solution Note: The points linked by dashed lines indicate that they are at the same nondominant level 

In the text 
Current usage metrics show cumulative count of Article Views (fulltext article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 4896 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.