Open Access
Issue
Wuhan Univ. J. Nat. Sci.
Volume 27, Number 3, June 2022
Page(s) 240 - 254
DOI https://doi.org/10.1051/wujns/2022273240
Published online 24 August 2022

© Wuhan University 2022

Licence Creative CommonsThis is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

0 Introduction

Ontology is a clear formal specification for sharing concepts and is widely used to support knowledge integration and interoperability such as intelligent transportation systems [1], industrial megaprojects analytics [2], product lifecycle management [3], and cognitive and robotic systems [4]. In recent years, more and more ontologies have been developed by using the Semantic Web standard technologies in the same domain of interest. However, different ontology design experts may use different terms to describe the concept of the same thing, which emerges the heterogeneity problem between ontologies. Therefore, ontology-based data management is becoming more and more complex. In order to apply these knowledge widely, ontology alignment is a key process. The goal of ontology alignment is to find a set of semantic correspondences between different ontologies for further handled knowledge-based systems. In the past few years, many ontology alignment methods or techniques have been introduced in the literature. Especially the method of weighted aggregation for combining multiple basic matchers through swarm and evolutionary algorithms has demonstrated to be an effective method. These methods utilize weights to combine lexical-based, linguistic-based, and structure-based matchers for finding more correspondences between entities to make full use of the information contained within context of an ontology. Weights based methods are divided into two categories: fixed weights and dynamic weights. However, the fixed weights manually configured according to experimental results have certain limitations, which may reduce the quality of alignment. Therefore, adaptive weights allocation are an effective method for combining matchers adequately.

In recent years, the successful application of meta-heuristic algorithm is very noteworthy. Swarm intelligence algorithm has successfully solved the ontology alignment problem and obtained the better ontology alignment quality in the form of a single objective. However, these methods only apply the inherent advantages of meta-heuristics and rarely consider the execution efficiency, especially the multi-objective ontology alignment model. Further, the performance of such multi-objective optimization models mostly depends on the well-distributed and the fast-converged set of solutions in specific applications. In order to improve the efficiency and effect of multi-objective ontology alignment methods, in this work, the grasshopper optimization algorithms are improved to perform multi-objective ontology alignment tasks more efficiently. Grasshopper Optimization Algorithm (GOA) is a population based meta-heuristic algorithm inspired by the behavioral characteristics of grasshopper swarms to simulate exploration and exploitation processes of the swarm intelligence algorithm [5]. According to the claim of Bock et al [6], ontology alignment as an optimization problem needs to maintain the number of correspondences and a set of correct correspondences at the same time. Therefore, the ontology alignment can be regarded as a multi-objective optimization problem with two objectives: the number of correspondences and the set of correct correspondences. Further, GOA balances the exploration and exploitation in the search process through an adaptive parameter. The high repulsive force between grasshoppers gives GOA the advantages of high exploration and local optimal avoidance. The attraction between grasshoppers makes GOA algorithm have strong exploitation and convergence performance. These characteristics make the GOA algorithm more adaptable to complex multi-objective search problems and surpass other algorithms such as particle swarm optimization (PSO), bat optimization algorithm (BA), Bonobo Optimizer (BO), and firefly algorithm (FA). These advantages motivate us to develop an ontology alignment system based on multi-objective GOA. In this paper, two different multi-objective GOA are proposed to perform ontology alignment tasks in a more efficient way. One is ε-dominance concept based GOA (EMO-GOA) and the other is fast non-dominated sorting based GOA (NS-MOGOA). The proposed EMO-GOA approach uses the grid technology based on ε-dominance concept to keep the diversity of the non-dominated solution in the archive, and reduce the time to converge to the Pareto optimal solution. The NS-MOGOA uses a fast non-dominated sorting method by assigning ranking and crowding distance to the population, which speeds up the completion of the ontology alignment task.

The remainder of this paper is organized as follows. Section 1 describes the related work of this paper. Section 2 describes the related concepts underlying our research work. Section 3 gives the mathematical model of the multi-objective ontology alignment problem. The modified EMO-GOA is proposed in Section 4. Section 5 proposes the NS-MOGOA. Section 6 initializes the conditions required for the experiments and the results are discussed. Section 7 concludes with an overview of our future work.

1 Related Works

In the past few years, evolutionary computing technology based ontology alignment methods have been proposed to address the problem of combining multiple basic matchers. These methods are divided into two categories: direct optimization alignment and the ontology meta-matching technique. The method of directly optimizing an alignment is to search for a correspondence set in the candidate correspondences. This set is also called an alignment. Furthermore, each confidence value for a correspondence is calculated by combining multiple basic matchers. The most famous method of this type is MapPSO, which directly optimizes an alignment evaluated by a fitness function with two objectives [6]. The method minimizes the aggregate values of lexical-based, linguistic-based, and structure-based matchers and evaluates candidate alignments by using fitness functions. The optimization process is performed by using discrete particle swarm optimization. In detail, a candidate solution is randomly selected from the Class, Data property and Object property space. The update process deletes some correspondences when the fitness function decreases, and some new correspondences are added. When the termination condition is reached, a suboptimal alignment is found. The limitation of this method is that the weight value of the combined matchers is fixed. This similar work is GAOM, which uses an evaluation function to evaluate candidate alignments in the optimization process of genetic algorithms [7].In recent years, Mohammadi et al [8] proposed a method to directly optimize ontology alignment based on simulated annealing algorithm from the perspective of reducing the time of calculation and memory. The authors model the ontology alignment problem as a function of optimizing a state, and an optimal state is obtained by performing a simulated annealing algorithm. This method uses several matchers to calculate a similarity matrix. The confidence of each correspondence is taken from the maximum value calculated by these matchers which utilize all information of a pair of entities such as name, label and comment.

Another category is the ontology meta-matching technology, which can be considered to automatically select or configure appropriate weights and filtering thresholds during optimization process [9]. The earliest meta-matching method is GOAL [10]. Instead of directly optimizing the candidate alignment between two ontologies, this method uses a genetic algorithm to set the weights, and to find a suboptimal alignment. This algorithm searches for different weight configurations, i.e., a candidate solution vector, and finds the best set of weights in infinite search space. The evaluation metric of a solution is to use the precision and recall of the reconciliation. Similarly, Naya et al [11] used the genetic algorithm to find out appropriate weights for aggregating different similarity metrics into a final measure. Each individual represents a specific combination of metrics, and the method finds the combination and provides the best alignment. The genetic algorithm was also used in Ref.[12] for optimizing a suboptimal alignment. The difference is that the weight and the threshold are coded in the chromosome at the same time. Xue et al [13] utilized multi-objective genetic algorithm (NSGA-II) to align ontology. In addition, also in Ref.[14], the multi-objective evolutionary algorithm based on decomposition (MOEA/D) was utilized for solving ontology alignment problem. For these two multi-objective approaches, precision and recall are used as two objectives. Although these meta-matching methods have obtained good alignment quality, they rarely consider the execution efficiency of meta-heuristics algorithms, especially the multi-objective ontology alignment model.

In recent years, researchers have proposed different methods based on newly developed meta-heuristic algorithms to solve the ontology meta-matching problem, and these methods do not require a priori knowledge in the alignment process. For example, Acampora et al [15] proposed an ontology alignment process based on a hybrid evolutionary algorithm, which can effectively aggregate a variety of basic matchers without a priori knowledge. In the end, a suboptimal alignment was obtained between the given two ontologies. In addition, also in Ref.[16], the NSGA-II was utilized to solve the ontology meta-matching problem and the problem of prior knowledge. Ryma et al [17] proposed a hybridization method by combining the Hill climbing local search technique into the genetic algorithm for large ontology alignment without a priori knowledge. Forsati et al [18] proposed an effective method based on Harmony Search (HSOMap). This method uses a weighted harmonic-mean method to aggregate various matchers into a single confidence among all possible correspondences between two ontologies. Lv et al [19] proposed a novel approach by using grasshopper optimization algorithm to solve the ontology meta-matching problem. This method makes use of various matchers and does not require a priori knowledge in the alignment process. As each basic matcher can capture the information of an entity from a different perspective, the aggregation technique of multiple matchers can enhance the matching effect of the algorithm. For deep reviews of other meta-heuristics based methods, consider the following articles: utilizing Memetic algorithm through partial reference alignment to evaluate candidate solution [20], using Memetic algorithm through MatchFmeasure and Unanimous improvement ratio to evaluate candidate solution [21], using the Tabu search improved compact evolutionary algorithm to match sensor ontologies [22], and matching ontology based on instance by using Memetic algorithm [23]. These methods also only focus on solving the problem of relying on prior knowledge and improving the quality of ontology alignment without considering the execution efficiency of meta-heuristic algorithms.

In this paper, the well-distributed and the fast-converged set of solutions for multi-objective meta-heuristic algorithms are considered. Two different GOA are proposed for multi-objective ontology alignment.

2 Preliminaries

In this section, the ontology, ontology alignment, and two basic concepts related to MOOP are defined as follows.

2.1 Ontology and Ontology Alignment

Definition 1   (Ontology) An ontology is formalized as O =<C, P, I, V, R> , where:

1) C represents a collection of classes;

2) P = {P 1, P 2, ⋯, Pn } is a collection of properties;

3) I = {I 1, I 2, ⋯, In } represents instance information of concepts;

4) V indicates the annotations information of concepts;

5) R = {R 1, R 2,, Rn } represents axioms of concepts.

Definition 2   (Ontology alignment)

Ontology alignment is a task of finding a set of pair of entities between different ontologies. In this work, a pair of entities is also called correspondence, matching or mapping. The algorithm functioned to achieve this task can be defined as follows. This algorithm takes three parameters O1,O2,and P as input and returns an alignment M, which is a set of finding correspondences. The O1 and O2 are two ontologies, and P is a set of parameters [24].

M=F(O1,O2,P)(1)

Definition 3   (Pareto dominance)

Assuming x and y are any two solution vectors in the evolutionary population, then if x is called to dominate y (denoted by xy) , it must meet two conditions: For all sub-objectives, x is not worse than y and at least one sub-objective makes that x is better than y . This mathematical model is defined as follows.

(k{1,2,,r}: fk(x)fk(y))(l{1,2,,r}: fl(x)<fl(y))(2)

Definition 4   (Pareto optimal solution) A solution x is said a Pareto optimal solution because it is not dominated by any vector in the decision space Ω. The Pareto optimal solution is mathematically defined as follows:

y Ω :yx(3)

Definition 5   (Pareto optimal solution set) A Pareto optimal solution set (POS) is composed of multiple Pareto optimal solutions [25].

POS={x| y Ω :y x}(4)

2.2 Multi-Objective Grasshopper Optimization Algorithm

In 2018, Mirjalili et al [26] proposed a multi-objective grasshopper optimization algorithm inspired from the behavior of grasshoppers swarms. In order to solve the optimization problem, a mathematical model was proposed by simulating the foraging behavior of grasshopper swarms. In this model, a social interaction function realizes the attraction and repulsion between grasshopper swarms. Furthermore, this repulsive force realizes the exploration process of GOA in a search space, and this attraction force realizes the exploitation and convergence process. In order to balance these two search processes, an adaptive comfort zone coefficient leads to an appropriate balance between exploration and exploitation. In the multi-objective version, the Pareto dominance, an archive and target selection techniques are introduced into the algorithm to optimize a Pareto optimal solution set. Further, the Pareto optimal dominance is integrated into GOA to compare solutions with multiple objectives. In order to obtain a non-dominated solution set, i.e., Pareto optimal solution set, an external archiving technique is used to store the non-dominated solutions and is updated after the population evolves. The mathematical model is defined as follows:

Xi=cw(j=1jiNcnp-q2S(dij)dij̑)+Td̑(5)

Xi is the updated position vector of the i-th grasshopper; cw is similar to the inertia weight of particle swarm optimization, which balances the exploration and exploitation; p and q are the upper and lower bounds of the search range; Td̑ is the currently searched optimal target value, and dij ̑ is a unit vector from the i-th grasshopper to j-th grasshopper; S is the attraction function which is defined as follows.

S(dij)=ue-dijτ-e-dij (6)

where d ij is the distance between the i-th and the j-th grasshopper; u is the strength of attraction; τ indicates the attractive length scale. The unit vector dij ̑ is defined as follows:

dij̑=xj-xidij(7)

where xj is the current position vector of the j-th grasshopper; xi is the current position vector of the i-th grasshopper. In equation (5), cn is an internal parameter, which is the decreasing coefficient to shrink the comfort zone, repulsion zone and attraction zone; The cw and cn is defined as follows:

cw=cn=cmax-tcmax-cminL(8)

where cmax and cmin indicate the maximum value and minimum value of the parameter c, respectively; t is the current iteration number; L is the maximum number of iterations.

3 Mathematical Model of the Multi-Objective Ontology Alignment Method

According to the above analysis, the approach using meta-heuristics needs to be modeled as multi-objective optimization problem to solve the ontology alignment problem in a more suitable way. A mathematical model for the multi-objective ontology matching problem is defined. Subsequently, the data structure of the grasshopper is represented, and two objective functions are introduced. The mathematical model of the multi-objective optimization problem is defined as follows:

Given a vector X=(x1,x2,, xn), XΩ and it satisfies the following conditions:

gi(X)0, i=1,2,,k

hl(X)=0, i=1,2,,l

Considering a maximization problem for ontology alignment, there are k conflicting objectives to be optimized. This optimization function is defined as follows:

Maximize f(X)=(f1(X),f2(X),,fk(X))(9)

For a candidate solution in the population, each grasshopper position also represents a set of weights used in the basic matcher aggregation task and is associated with a candidate alignment in the meta-matching method. The value of each weight indicates the contribution for a basic matcher. In order to adequately capture the information of the ontology, the matchers with different capabilities are used in the proposed method. Therefore, a candidate solution vector X is represented as follows.

{X=(r1,r2,, rn)ri=i=1nwi×mi(ce),s.t.i=1nwi=1(10)

where ce is a correspondence, wi is the weight of the i-th matcher, mi (i∈[1,n]) is the i-th matcher, n is the number of matchers.

Inspired by Bock et al [6], the transformation of ontology alignment task into an optimization problem requires two objectives: 1) the number of correspondences; 2) a set of correct correspondences. Therefore, two evaluation functions were proposed. Furthermore, we consider the matching probability of correspondences for a candidate alignment found during perform. It is formulated to the first objective, which is formally defined as follows:

f1(X)=|X||O1×O2|(11)

In order to protect the elite individuals, it is necessary to maximize the average confidence of correspondences in an alignment, which is formally defined as follows:

f2(X)=i=1|X|ri|X|(12)

where |X| is the number of correspondences for an alignment found so far, ri is the confidence weighted for a correspondence, and |O1×O2| represents the total number of candidate correspondences between two ontologies.

In the next sections, two versions of the multi-objective grasshopper optimization algorithm are proposed to investigate the matching quality and calculation time for solving ontology alignment problem.

4 Proposed EMO-GOA Algorithm

Considering that the two essential components of a multi-objective optimization algorithm are the selecting of non-dominated solutions and the maintaining of the diversity of non-dominated solutions in the optimization process, we propose an archive update mechanism based on the ε-domination concept to improve the distribution and convergence speed. In the proposed EMO-GOA, a grid or hyper-box mechanism based on the ε-dominance concept is used to improve the original MOGOA which is only based on the Pareto dominance. The main reason for this modification is that in the steady-state MOEA experiment by ε-dominance, it has been proved to be a good compromise method in terms of approaching the Pareto optimal front, the diversity of the solution, and the well-converged [27]. Furthermore, this mechanism maintains the diversity of solutions in the archive by allowing only one solution to be displayed in each pre-allocated hyper-box. In EMO-GOA optimization process, the new solutions generated are compared with each member in the archive by using ε-dominance. First of all, an identification array {B=(B1,B2,,Bk)T} is assigned to each member in the archive, and it is defined as follows:

Bk(f)={fk-fkminεk, for minimizing  fkfk-fkminεk, for maximizing  fk (13)

where k is the number of problem objectives, and fkmin is the minimum value of the k-th objectie and εk is the allowable tolerance in the k-th objective.

Therefore, this mechanism divides the whole objective space into grid clusters. The identification array Bx of the new solution x and the identification array Ba of each archive member a are calculated, respectively. This new solution x is not accepted when the identification array Ba of any member a in the archive dominates Ba of the new solution x . That is, the new solution x is dominated by the archive member ε-dominance. Instead, this archive member a is deleted, and the new solution x is accepted when the new solution Bx dominates the Ba of any archive member a. If neither of the above two situations occurs, it means that the new solution x and the members in the archive are ε-non-dominance. Two cases need to be analyzed. If the new solution shares a hyper-box with an archive member, the general non-dominated relationships are used to detect them first. If this new solution x dominates the archive member or it is non-dominated by the archive member and is closer to the B vector than the archive member, then the new solution x is retained and the archive member is removed from the archive. If the new solution x does not share the hyper-box with any archive member, then the new solution x is accepted [27].

The above analysis shows that each hyper-box is occupied by only one hyper-box at the Pareto optimal front. The ε-dominance concept have three advantages: 1) the maintenance of diversity; 2) consistency of the size of the final archive and the total number of Pareto optimal solutions; 3) the rapid convergence of Pareto optimal solutions. The EMO-GOA is summarized in Algorithm 1.

Algorithm 1: Summary of EMO-GOA
1: Load two ontologies O 1 and O 2
2: Construct solution space by similarity measure
3: Initialize maximum number of iterations(maxIter), c max , c min , the population Xi(i=1,2,,n);
4: Calculate the fitness of each grasshopper by using Eqs. (11) and (12).
5: Select the target T of population by using Eq.(2).
6: while (t < maxIter)
7:  Update c using Eq. (8)
8:  for each search individual
9:   Normalize the distances between individuals in [1,4]
10:   Update the position of the current individual by using Eq. (5)
11: End for
12: for each individual
13:   Evaluate individual by using Eqs. (11) and (12)
14:   Update the archive by using the ε- dominance technique.
15: End for
16:  Select the target from the archive by using the roulette wheel strategy.
17:  Update target position and fitness.
18:  t=t+1;
19:  End while
20: End

5 Proposed NS-MOGOA Algorithm

In the proposed NS-MOGOA, a fast non-dominated sorting (FNS) algorithm is used to replace the calculation of non-dominated solutions in MOGOA, and to improve the performance of MOGOA. The simulation results of this FNS algorithm in NSGAII show that it can preserve the diversity of solutions and better convergence. This FNS algorithm mainly includes two operations: 1) Non-dominated ranking for the whole population; 2) Calculating the crowding distance of each individual [28]. The NS-MOGOA is summarized in Algorithm 2.

First, for the non-dominated ranking, there are two sub-operations that need to be performed. The first one is to count the number of dominating solutions p. It is also called the dominating number N p; the second operation saves the solutions dominated by this solution p. It is worth noting that the solution with a dominance number 0 is naturally called the first non-dominated level. Next, the algorithm scans all solutions with N p=0 and decrements their dominance numbers. When the dominance number of any solution q decreases by 0, the solution q is saved in a list Γ. The members from list Γ are considered the second non-dominant level. This operation continues until all levels are recognized.

For the calculation of crowding distance, this operation estimates the density around a particular solution in the population. Specifically, this operation sorts the population in ascending order according to each objective value. For each objective, the solution with the largest and smallest function value is assigned an infinite distance value. The intermediate solution is assigned a difference v, which is equal to the difference between the two neighboring solutions of this solution. This calculation is continued until all target values are reached. As a result, the crowding distance of each individual is the sum of each object difference v of each solution. Further, the crowding distance for the z-th solution is half the circumference of the cuboid. The method is depicted in Fig. 1.

thumbnail Fig. 1 The method of calculating for the crowding distance of the z-th solution

Note:   The points linked by dashed lines indicate that they are at the same non-dominant level

In the selection process, the FNS algorithm preferentially selects the solution with a low non-dominated rank when two solutions with different non-dominated ranks. When these two solutions have the same ranking, the one in a spacious area (that is, a less crowded area) is selected first.

Algorithm 2: Summary of NS-MOGOA
1: Load two ontologies O 1 and O 2
2: Construct solution space by similarity measure
3: Initialize maximum number of iterations(maxIter), c max, c min,the population Xi(i=1,2,,n);
4: Calculate the fitness of each grasshopper by using Eqs. (11) and (12).
5: Select the target T of population by using Eq.(2).
6: while (t < maxIter)
7:  Update c using Eq. (8)
8:  for each search individual
9:   Normalize the distances between individuals in [1,4]
10:   Update the position of the current individual by using Eq. (5)
11:   Evaluate individual by using Eqs. (11) and (12)
12:  End for
13:  Update the population by using the fast non-dominated sorting technique.
14:  Select the target from the individuals with lower rankings by using the roulette wheel strategy.
15:  Update target position and fitness.// Update current target
16:   t=t+1;
17:  End while
18: End

6 Experiments

In order to evaluate the performance of the improved algorithms in performing ontology alignment tasks, three experiments are performed. In the first experiment, the selection strategy for this target is studied. The second experiment compares traditional MOGOA with other commonly used meta-heuristic algorithms. In the third experiment, two improved methods were compared with the Ontology Alignment Evaluation Initiative (OAEI) methods.

6.1 Evaluation Methods and Dataset

The result of ontology alignment is a set of correspondences, which are usually evaluated using three indicators: precision, recall and F-measure from the Information Retrieval field [29]. Precision represents the percentage of correctly matched mappings in all mappings found. Recall represents the percentage of correctly matched mappings in all correct mappings from the reference alignment. It is worth noting that if a system only retrieves the correct correspondence, the number of correct correspondence retrieved is reduced, but the total number of retrieved correspondences will be significantly reduced, which leads to a high precision. When the number of correct correspondences decreases, the recall becomes smaller. Therefore, precision and recall are two conflicting indicators. In the actual evaluation, it is not complete to evaluate the performance of a system by only applying the precision and the recall, and they must be well combined[9]. For this reason, the F-measure is used, which represents the weighted harmonic mean of precision and recall. In this experiment, precision is expressed as Prec, recall is expressed as Rec , F-measure is represented as F1.

Precision=|RrelAret||Aret|(14)

Recall=|RrelAret||Rret|(15)

F-measure=2×Precision×RecallPrecision+Recall(16)

where Rrel is the relevant mappings given by OAEI, and Aret is the mappings retrieved by the system.

In order to study the performance of the proposed EMO-GOA and NS-MOGOA methods, twenty-three ontology alignment tasks are performed, which come from the OAEI. Since the maximum number of executions for the OAEI-2016 systems is five, we performed each task independently 5 times and calculated the average value for the fairness. These test tasks are described in Table 1.

Table 1

Brief of benchmarks

6.2 Parameter Settings

In order to maintain an ontology alignment system with high accuracy and applicability to input different ontologies, in this work, basic matchers with different capabilities are used to calculate the similarity for a pair of entities.The matcher means Jero measure[30], Vector Space Model[31], WordNet[32],Levhenstein[33], Hierarchy distance[15] and Numberedhierarchy distance[15], respectively. The parameter settings of each method are shown in Table 2.

Table 2

Parameters configuration of each method

6.3 Experiment and Comparision

6.3.1 Target selection strategy

The next position of the grasshopper is moved by learning the current position, the target position and the positions of all other grasshoppers. Therefore, the choice of the target is an important component, which improves the distribution of non-dominated solutions in the archive and guides the grasshopper population toward promising regions. However, since the choice of this target needs to choose one from the Pareto optimal solution set, the choice of this target is a challenge in multi-objective MOGOA. In this work, we experimented with two selection techniques: roulette wheel selection (RWS) technique and square root distance (SRD) technique [34]. Therefore, in our work, the roulette selection strategy and the square root distance strategy are verified to select targets from the archive. For roulette selection, the author counts the number of neighbors of each solution in the archive based on a fixed distance as crowding distance. Subsequently, the members in the archive are ranked according to the crowding distance. The archive ranked is employed by the roulette selection strategy to select a target. For the square root distance, we calculated the distance between each member in the archive and the current target by using Eq.(17). The member with the smallest SRD is selected as the new target.

SRD(x1,x2)=k=12|fk(x1)-fk(x2)|(17)

where k is the number of the problem objective.

The results of this experiment are shown in Table 3. As Table 3 shown, the RWS strategy is better than the SRD strategy in terms of matching accuracy and execution time. Therefore, RWS strategy is used to select the target that is suitable for ontology matching.

Table 3

Comparison of target selection strategy

6.3.2 Comparison between proposed methods and the other meta-heuristic algorithms

In order to investigate the performance of the proposed EMO-GOA and NS-MOGOA methods, we summarized the average F-measure and time, and compared them with the classic single- and multi-objective meta-heuristic algorithms. Table 4 shows the comparison results with the multi-objective meta-heuristic algorithms. For the average F1, this modified EMO-GOA and NS-MOGOA achieve better alignment quality compared with other multi-objective algorithms. The main reason is that EMO-GOA maintains the diversity in the archive by allowing only one solution in each pre-allocated hyper-box of the Pareto optimal front. Another reason is that it emphasizes non-dominated solutions. Compared with the original MOGOA, the NS-MOGOA can maintain a better solution diversity and better converge in the obtained non-dominated front. The SPEA2[27] obtains a poor alignment quality. Considering the precision for Eq.(14), if we select as many correct correspondences as possible, that is, the precision is high, then only the most likely correct correspondences will be selected. In this case, the total number of correspondences selected by the algorithm will be reduced, and the correct correspondences selected by the algorithm will be correspondingly reduced, but the denominator is reduced faster so that the precision becomes higher. The precision of EMO-GOA and NS-MOGOA are 0.985 and 0.989, respectively, which shows that, in most cases, the performance of the proposed methods to find the correct correspondence is enhanced.

Comparison results with the single-objective algorithm in Table 5 shows that the proposed EMO-GOA and NS-MOGOA outperform the GA-based and PSO-based methods. The main reason is that multi-objective optimization algorithms can maintain a better solution diversity and better converge in the obtained non-dominated front. Furthermore, the multi-objective optimization algorithm has better performance than single-objective for ontology alignment problem.

Table 6 shows the average running time of each algorithm. It is clear that the time of the proposed EMO-GOA is much less than NS-MOGOA, MOGOA, SPEA2, and GA. The main reason is that EMO-GOA employs the technology of ε-dominance to update the archive. This mechanism can reduce the time to converge to the Pareto optimal solution set and find a high-quality alignment. Although the running time of NS-MOGOA is higher than that of EMO-GOA, it is lower than MOGOA. Furthermore, this NS-MOGOA modified by employing a fast non-dominated solution sorting can speed up the convergence of non-dominated solutions and better maintain the diversity of solutions.

Table 4

Comparison with other multi-objective metaheuristic algorithms on 23 matching tasks

Table 5

Comparison results with other single-objective metaheuristic algorithms on 23 matching tasks

Table 6

Comparison of the time with MOGOA, SPEA2, GA and PSO (unit:ms)

6.3.3 Comparison between proposed methods and the other OAEI methods

In order to further study the performance of the proposed methods, the top system AML[35], LogMap[36] and XMap[37] in the OAEI-2016 competition are selected. It is worth noting that there are no results of tasks 103, 104, 203, 204, 205 and 231 on the OAEI official website. Therefore, these results are removed from our experimental results. Table 7 is the comparision of the proposed method with the top system of OAEI competition,and the results show that the proposed EMO-GOA and NS-MOGOA methods significantly improve the quality of alignment.

Table 7

Comparison of the proposed method with the top system of OAEI competition

7 Conclusion

The improvement of multi-objective evolutionary algorithms usually considers the two goals of maintaining the diversity of non-dominated solution sets and converging to the true Pareto optimal front. However, it is often ignored to achieve these two goals with rapid calculation. In this paper, the fast calculation manner for multi-objective meta-heuristic algorithms to align ontology are considered. Two different grasshopper optimization algorithms (GOA) are proposed for multi-objective ontology alignment. The performance of the proposed two methods is evaluated on the benchmark dataset. The results show that the proposed two methods outperform other meta-heuristic-based methods in terms of alignment quality. In particular, the EMO-GOA method achieves the best performance. For the running time, compared with SPEA2, GA, PSO, despite the long execution time of NS-MOGOA, the alignment quality has been improved. This EMO-GOA is only 0.019 49 seconds higher than PSO. It can be concluded that the two proposed multi-objective ontology alignment methods are successful for improving execution efficiency. In future work, we plan to further improve the performance of the proposed two methods in large ontology alignment.

References

  1. Fernández S, Ito T. Semantic integration of sensor data with SSN ontology in a multi-agent architecture for intelligent transportation systems[J]. IEICE Trans Inf Syst, 2017,12: 2915-2922. [CrossRef] [Google Scholar]
  2. Zangeneh P, McCabe B. Ontology-based knowledge representation for industrial megaprojects analytics using linked data and the semantic web[J]. Adv Eng Informatics, 2020, 46:101164. [CrossRef] [Google Scholar]
  3. Talhi A, Fortineau V, Huet J C, et al. Ontology for cloud manufacturing based product lifecycle management[J]. J Intell Manuf, 2019, 30(5) : 2171-2192. [CrossRef] [Google Scholar]
  4. Azevedo H, Belo J P R, Romero R A F. Using ontology as a strategy for modeling the interface between the cognitive and robotic systems[J]. J Intell Robotic Syst, 2020, 99(3):431-449. [CrossRef] [Google Scholar]
  5. Saremi S, Mirjalili S, Lewis A. Grasshopper optimisation algorithm: Theory and application[J]. Advances in Engineering Software, 2017, 105: 30-47. [CrossRef] [Google Scholar]
  6. Bock J, Hettenhausen J. Discrete particle swarm optimisation for ontology alignment[J]. Inf Sci, 2012, 192: 152-173. [CrossRef] [Google Scholar]
  7. Wang J L, Ding Z J, Jing C J. GAOM: Genetic algorithm based ontology alignment[C]// Asia-Pacific Services Computing Conference (APSCC) . Washington D C: IEEE, 2006: 617-620. [Google Scholar]
  8. Mohammadi M, Hofman W, Tan Y H. Simulated annealing-based ontology alignment[J]. ACM Trans Manag Inf Syst, 2019, 10(1): 3:1-3:24. [Google Scholar]
  9. Gil J M, Montes J F A. Evaluation of two heuristic approaches to solve the ontology meta-matching problem[J]. Knowl Inf Syst, 2011, 26(2): 225-247. [CrossRef] [Google Scholar]
  10. Gil J M, Alba E, Montes J F A. Optimizing ontology alignments by using genetic algorithms [C]// 7th International Semantic Web Conference (ISWC). Berlin Heidelberg: Springer-Verlag, 2008: 1-15. [Google Scholar]
  11. Naya J M V, Romero M M, Loureiro J P, et al. Improving ontology alignment through genetic algorithms[C]// Soft Computing Methods for Practical Environment Solutions: Techniques and Studies. Hershey: IGI Global, 2010: 240-259. [Google Scholar]
  12. Ginsca A L, Iftene A. Using a genetic algorithm for optimizing the similarity aggregation step in the process of ontology alignment[C]// 9th Roedunet International Conference (RoEduNet). Washington D C: IEEE, 2010: 118-122. [Google Scholar]
  13. Xue X S, Wang Y P. Improving the efficiency of NSGA-II based ontology aligning technology[J]. Data Knowl Eng, 2017, 108: 1-14. [CrossRef] [Google Scholar]
  14. Xue X S, Wang Y P, Hao W C. Using MOEA/D for optimizing ontology alignments[J]. Soft Comput, 2014,18(8):1589-1601. [CrossRef] [Google Scholar]
  15. Acampora G, Loia V, Vitiello A. Enhancing ontology alignment through a memetic aggregation of similarity measures[J]. Inf Sci, 2013, 250: 1-20. [CrossRef] [Google Scholar]
  16. Acampora G, Kaymak U, Loia V, et al. Applying NSGA-II for solving the ontology alignment problem[C]// International Conference on Systems, Man, and Cybernetics(SMC). Washington D C: IEEE , 2013: 1098-1103. [Google Scholar]
  17. Ryma G, Kholladi M K. Genetic algorithm with hill climbing for correspondences discovery in ontology mapping[J]. J Inf Technol Res, 2019, 12(4):153-170. [CrossRef] [Google Scholar]
  18. Forsati R, Shamsfard M. Symbiosis of evolutionary and combinatorial ontology mapping approaches[J]. Inf Sci, 2016, 342: 53-80. [CrossRef] [Google Scholar]
  19. Lv Z M, Peng R. A novel meta-matching approach for ontology alignment using grasshopper optimization[J]. Knowl Based Syst, 2020: 201-202. Article ID: 106050. [Google Scholar]
  20. Xue X S, Wang Y P, Ren A H. Optimizing ontology alignment through memetic algorithm based on partial reference alignment[J]. Expert Syst Appl, 2014, 41(7): 3213-3222. [CrossRef] [Google Scholar]
  21. Xue X S, Wang Y P. Optimizing ontology alignments through a memetic algorithm using both MatchFmeasure and unanimous improvement ratio[J]. Artif Intell, 2015, 223:65-81. [CrossRef] [Google Scholar]
  22. Xue X S, Wang Y P. Using compact evolutionary Tabu Search algorithm for matching sensor ontologies[J]. Swarm Evol Comput, 2019, 48: 25-30. [CrossRef] [Google Scholar]
  23. Xue X S, Wang Y P. Using memetic algorithm for instance coreference resolution[J]. IEEE Trans Knowl Data Eng, 2016, 28(2): 580-591. [CrossRef] [Google Scholar]
  24. Euzenat J, Shvaiko P. Ontology Matching[M]. Berlin Heidelberg: Springer-Verlag, 2013. [CrossRef] [Google Scholar]
  25. Das A K, Nikum A K, Krishnan S V, et al. Multi-objective Bonobo Optimizer (MOBO): An intelligent heuristic for multi-criteria optimization[J]. Knowl Inf Syst, 2020, 62(11):4407-4444. [CrossRef] [Google Scholar]
  26. Mirjalili S Z, Mirjalili S, Saremi S, et al. Grasshopper optimization algorithm for multi-objective optimization problems[J]. Appl Intell, 2018, 48 (4): 805-820. [CrossRef] [Google Scholar]
  27. Deb K, Mohan M, Mishra S. Evaluating the epsilon-domination based multi-objective evolutionary algorithm for a quick computation of Pareto-optimal solutions[J]. Evol Comput, 2005,13(4): 501-525. [CrossRef] [PubMed] [Google Scholar]
  28. Deb K, Agrawal S, Pratap A. et al. A fast and elitist multiobjective genetic algorithm: NSGA-II[J]. IEEE Trans Evol Comput, 2002, 6(2): 182-197. [CrossRef] [Google Scholar]
  29. Yates R A B, Neto B A R. Modern Information Retrieval[M]. New York: ACM Press / Addison-Wesley ,1999. [Google Scholar]
  30. Cohen W W, Ravikumar P, Fienberg S E. A comparison of string distance metrics for name-matching tasks [C]// Proc of the KDD Workshop on Data Cleaning and Object Consolidation. Washington D C: IEEE, 2003, 3: 73-78. [Google Scholar]
  31. Salton G, Wong A, Yang C S. A vector space model for automatic indexing[J]. Commun ACM, 1975, 18(11): 613-620. [CrossRef] [Google Scholar]
  32. Miller G A. WordNet: A lexical database for English[J].Communications of the ACM, 1995, 38(11): 39-41. [CrossRef] [Google Scholar]
  33. Scherbina A, Kuznetsov S. Clustering of Web sessions using Levenstein metric[J]// Industrial Conference on Data Mining . Berlin: Springer-Verlag, 2004: 127-133. [Google Scholar]
  34. Saxena N, Mishra K K. Improved multi-objective particle swarm optimization algorithm for optimizing watermark strength in color image watermarking[J]. Appl Intell, 2017, 47(2): 362-381. [CrossRef] [Google Scholar]
  35. Faria D, Pesquita C, Balasubramani B S, et al. OAEI 2016 results for AML[C]// Proceedings of the 11th International Workshop on Ontology Matching. Berlin: Springer-Verlag, 2016: 154-160. [Google Scholar]
  36. Ruiz E J, Grau B C, Cross V V. LogMap family participation in the OAEI 2016[C]// Proceedings of the 11th International Workshop on Ontology Matching. Berlin: Springer-Verlag, 2016: 201-203. [Google Scholar]
  37. Djeddi W E, Khadir M T, Yahia S B. XMap: results for OAEI 2016[C]// Proceedings of the 11th International Workshop on Ontology Matching, Berlin: Springer-Verlag, 2016: 222-226. [Google Scholar]

All Tables

Table 1

Brief of benchmarks

Table 2

Parameters configuration of each method

Table 3

Comparison of target selection strategy

Table 4

Comparison with other multi-objective metaheuristic algorithms on 23 matching tasks

Table 5

Comparison results with other single-objective metaheuristic algorithms on 23 matching tasks

Table 6

Comparison of the time with MOGOA, SPEA2, GA and PSO (unit:ms)

Table 7

Comparison of the proposed method with the top system of OAEI competition

All Figures

thumbnail Fig. 1 The method of calculating for the crowding distance of the z-th solution

Note:   The points linked by dashed lines indicate that they are at the same non-dominant level

In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.