1. Introduction
With the rapid development of artificial intelligence, hypernetwork representation learning has gradually become a research hotspot in machine learning. Hypernetwork representation learning maps the nodes of a hypernetwork into a low-dimensional vector representation space. The learned node representation vectors can be applied to node classification [1], link prediction [2], community detection [3], and so on.
According to the type of hypernetwork, hypernetwork representation learning can be divided into homogeneous and heterogeneous hypernetwork representation learning. Homogeneous hypernetwork representation learning aims to map the nodes of a single type in a homogeneous hypernetwork into a low-dimensional vector representation space. For example, Zhou et al. [4] proposed a hypergraph embedding approach on the basis of spectral hypergraph clustering [5], but its high computational complexity limits its wide application. HGNN [6] extended the convolution operation to hypergraph embedding, but the datasets used in this approach are not real hypernetwork datasets. In a word, although the above homogeneous hypernetwork representation learning approaches have good representation learning abilities, they do not consider the heterogeneity of the hypernetwork. Therefore, researchers have proposed heterogeneous hypernetwork representation learning approaches, which aim to learn distinguishable representation vectors for different types of nodes in a heterogeneous hypernetwork. For example, HHNE [7] designs a fully connected graph convolution layer to project different types of nodes into a common low-dimensional vector representation space, but the computational complexity of this approach is too high for large-scale heterogeneous networks. DHNE [8] realizes the local and global proximity of nonlinear tuple similarity functions in the embedding space, but because DHNE uses a multi-layer perceptron, it is limited to heterogeneous hyperedges of fixed size and cannot model the relationships among multi-type instances of unfixed size.
To sum up, although the above hypernetwork representation learning approaches can obtain good node representation vectors, they suffer from various issues, especially high computational complexity and the restriction to hyperedges of fixed size. Therefore, to solve these issues, a hypernetwork representation learning approach based on hyperedge modeling (abbreviated as HRHM) is proposed to effectively capture the complex tuple relationships (i.e., hyperedges) among the nodes; the approach is suitable for hypernetworks whose hyperedges have unfixed size and improves computational efficiency.
The features of this paper are the following two aspects:
- A hypernetwork representation learning approach based on hyperedge modeling is proposed to map the nodes of the hypernetwork into a low-dimensional vector representation space, where the learned node representation vectors mainly encode the hypernetwork structure and the hyperedges;
- The advantage of the HRHM approach is that hyperedges of unfixed size are encoded in the learned node representation vectors. The disadvantage of the HRHM approach is that part of the hypernetwork structure information is lost, because the hypernetwork, abstracted as a hypergraph, is transformed into an ordinary network abstracted as a two-section graph.
2. Related Works
Nowadays, researchers have proposed a number of hypernetwork representation learning approaches to obtain node representation vectors that are rich in hypernetwork structure information. The existing approaches can be divided into homogeneous and heterogeneous hypernetwork representation learning approaches. With regard to homogeneous hypernetwork representation learning, Zhou et al. [4] proposed a hypergraph embedding approach on the basis of spectral hypergraph clustering. HyperGCN [9] approximates each hyperedge of a hypergraph by a set of pairwise edges connecting the nodes in the hyperedge and thereby treats hypergraph learning as graph learning. HGNN [6] extends the convolution operation to hypergraph embedding; the convolution is defined via the hypergraph Laplacian in the spectral domain and further approximated by truncated Chebyshev polynomials. LHCN [10] maps the hypergraph to a weighted attributed line graph and learns graph convolutions on this line graph. With regard to heterogeneous hypernetwork representation learning, HHNE [7] designs a fully connected graph convolutional layer to project different types of nodes into a common low-dimensional space, uses a tuple similarity function to preserve the network structure, and employs a rank-based loss function to improve the similarity scores of hyperedges in the embedding space. DHNE [8] is a deep model that realizes the local and global proximity of the nonlinear tuple similarity function in the embedding space. HPHG [11] proposes a deep model called Hyper-gram to capture pairwise and tuple relationships in the node sequences. Hyper-SAGNN [12] utilizes a self-attention mechanism [13] to aggregate hypergraph information.
3. Problem Definition of HRHM Approach
The hypernetwork, abstracted as a hypergraph $H = (V, E)$, consists of the node set $V$ and the hyperedge set $E$. The HRHM approach aims to learn a low-dimensional representation vector $\mathbf{r}_v \in \mathbb{R}^d$ for each node $v \in V$ in the hypernetwork, where the dimension $d$ is much smaller than $|V|$.
In order to understand the process of hypernetwork representation learning, take a drug hypernetwork as an example, where the triplet <user, drug, adverse reaction> is the hyperedge. Since each drug has several specific adverse reactions, there is a semantic relevance between a drug and an adverse reaction. Assessing this semantic relevance is a real-life problem that hypernetwork representation learning solves: the learned node representation vectors are used to calculate the similarity between the drug and the adverse reaction, as sketched below.
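To make this concrete, the following minimal sketch compares a drug vector and an adverse-reaction vector by cosine similarity. The vectors here are hypothetical, hand-picked placeholders; in practice they would be the vectors learned by a hypernetwork representation model.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two node representation vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 4-dimensional representation vectors for illustration only.
drug_vec = np.array([0.12, -0.48, 0.33, 0.05])
reaction_vec = np.array([0.10, -0.51, 0.29, 0.11])

# A score close to 1 suggests a strong semantic relevance between
# the drug node and the adverse-reaction node.
print(f"relevance score: {cosine_similarity(drug_vec, reaction_vec):.3f}")
```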
4. Preliminaries
4.1. Two-Section Graph Transformed from Hypergraph
According to the literature [14], the two-section graph structure is closer to the hypergraph structure than the line graph and the incidence graph. Therefore, the two-section graph transformed from the hypergraph is used in this paper to carry out the research on hypernetwork representation learning. A hypergraph and its corresponding two-section graph are shown in Figure 1.
The two-section graph S transformed from the hypergraph H is an ordinary graph that meets the following two conditions (a construction sketch follows the list):
- The node set V′ of the two-section graph S is identical to the node set V of the hypergraph H;
- Any two distinct nodes are joined by an edge if and only if these two nodes belong to at least one hyperedge simultaneously.
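The construction can be illustrated with a short sketch; representing the hypergraph as a list of node sets is an assumption made for illustration only.

```python
from itertools import combinations

def two_section_graph(hyperedges):
    """Build the edge set of the two-section graph: two distinct nodes
    are joined by an edge if and only if they co-occur in at least one
    hyperedge."""
    edges = set()
    for hyperedge in hyperedges:
        for u, v in combinations(sorted(hyperedge), 2):
            edges.add((u, v))
    return edges

# Toy hypergraph with two hyperedges that share the node 'c'.
H = [{"a", "b", "c"}, {"c", "d"}]
print(sorted(two_section_graph(H)))
# [('a', 'b'), ('a', 'c'), ('b', 'c'), ('c', 'd')]
```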
4.2. TransE Model
TransE [15] is a knowledge representation model with a translation mechanism: if the head entity $h$ and the tail entity $t$ are connected by the relationship $r$, the triplet $\langle h, r, t \rangle$ holds. Moreover, when the triplet $\langle h, r, t \rangle$ holds, the head entity vector $\mathbf{h}$ plus the relationship vector $\mathbf{r}$ is almost identical to the tail entity vector $\mathbf{t}$, namely $\mathbf{h} + \mathbf{r} \approx \mathbf{t}$; otherwise, $\mathbf{h} + \mathbf{r} \neq \mathbf{t}$. The TransE framework is shown in Figure 2.
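As a rough illustration of the translation mechanism, the sketch below scores triplets by the distance $\|\mathbf{h} + \mathbf{r} - \mathbf{t}\|$. The random vectors are placeholders, not trained TransE embeddings.

```python
import numpy as np

def transe_distance(h: np.ndarray, r: np.ndarray, t: np.ndarray) -> float:
    """Distance ||h + r - t||: close to zero when the triplet <h, r, t>
    holds, large otherwise."""
    return float(np.linalg.norm(h + r - t))

rng = np.random.default_rng(0)
h, r = rng.normal(size=8), rng.normal(size=8)
t_true = h + r + rng.normal(scale=0.01, size=8)  # tail satisfying t ≈ h + r
t_false = rng.normal(size=8)                     # unrelated tail entity

print(transe_distance(h, r, t_true))   # small distance: triplet holds
print(transe_distance(h, r, t_false))  # large distance: triplet does not hold
```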
5. HRHM Approach
This section introduces the hypernetwork representation learning approach based on hyperedge modeling in more detail. Firstly, the cognitive topology model is introduced in Section 5.1. Secondly, the cognitive hyperedge model is introduced in Section 5.2. Finally, the joint optimization of the two models is described in Section 5.3.
5.1. Cognitive Topology Model
In order to improve the computational efficiency, a cognitive topology model with negative sampling [16] is introduced to capture the hypernetwork structure. Given the node sequences $S$, we try to maximize the following target function of the cognitive topology model to obtain representation vectors rich in the hypernetwork structure:

$$ G_1 = \prod_{v \in S} \prod_{u \in \{v\} \cup NEG(v)} p(u \mid Context(v)), \tag{1} $$

where $NEG(v)$ is the subset of negative samples of the center node $v$, which is itself regarded as the positive sample, and $Context(v)$ is the set of contextual nodes of the center node $v$. The conditional probability $p(u \mid Context(v))$ is defined as follows:

$$ p(u \mid Context(v)) = \begin{cases} \sigma(\mathbf{X}_v^\top \boldsymbol{\theta}_u), & L^v(u) = 1, \\ 1 - \sigma(\mathbf{X}_v^\top \boldsymbol{\theta}_u), & L^v(u) = 0, \end{cases} \tag{2} $$

where $\sigma(\cdot)$ is the sigmoid function, $\mathbf{X}_v$ is the sum of the representation vectors of all nodes in $Context(v)$, and $\boldsymbol{\theta}_u$ is the parameter vector of the node $u$. For $u \in \{v\} \cup NEG(v)$, the node label $L^v(u)$ is defined as follows:

$$ L^v(u) = \begin{cases} 1, & u = v, \\ 0, & u \neq v. \end{cases} \tag{3} $$

By means of Equation (3), Equation (2) can also be written as a unified expression:

$$ p(u \mid Context(v)) = \left[\sigma(\mathbf{X}_v^\top \boldsymbol{\theta}_u)\right]^{L^v(u)} \cdot \left[1 - \sigma(\mathbf{X}_v^\top \boldsymbol{\theta}_u)\right]^{1 - L^v(u)}. \tag{4} $$

By substituting Equation (4) into Equation (1), Equation (1) can be rewritten as follows:

$$ G_1 = \prod_{v \in S} \prod_{u \in \{v\} \cup NEG(v)} \left[\sigma(\mathbf{X}_v^\top \boldsymbol{\theta}_u)\right]^{L^v(u)} \cdot \left[1 - \sigma(\mathbf{X}_v^\top \boldsymbol{\theta}_u)\right]^{1 - L^v(u)}. \tag{5} $$

Formally, maximizing this target function makes the learned node representation vectors rich in the hypernetwork structure.
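The following sketch evaluates the logarithm of the inner product over $u \in \{v\} \cup NEG(v)$ in Equation (5) for one center node. Storing the parameter vectors in a plain dictionary is an assumption of this sketch; it illustrates the objective, not the full training procedure.

```python
import numpy as np

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + np.exp(-x))

def log_factor(X_v: np.ndarray, theta: dict, center, negatives) -> float:
    """Log of the product over u in {v} ∪ NEG(v) in Equation (5): the
    center node carries label L^v(u) = 1, negative samples label 0."""
    ll = np.log(sigmoid(X_v @ theta[center]))        # positive sample term
    for u in negatives:
        ll += np.log(1.0 - sigmoid(X_v @ theta[u]))  # negative sample terms
    return float(ll)

# Toy example: X_v stands for the sum of the context node vectors of v.
rng = np.random.default_rng(1)
theta = {name: rng.normal(size=4) for name in ["v", "n1", "n2"]}
X_v = rng.normal(size=4)
print(log_factor(X_v, theta, "v", ["n1", "n2"]))
```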
5.2. Cognitive Hyperedge Model
Because the node representation vectors obtained from the above cognitive topology model do not take the hyperedges into account, their quality is not high. Therefore, a novel cognitive hyperedge model with negative sampling is proposed to incorporate the hyperedges and learn node representation vectors of high quality, where the hyperedges are deemed interaction relationships among the nodes and, as in the TransE model, are regarded as translation operations in the representation vector space.

Under the hyperedge constraint, we try to maximize the following target function of the cognitive hyperedge model to obtain representation vectors rich in the hyperedges:

$$ G_2 = \prod_{v \in S} \prod_{e \in E(v)} \prod_{t \in T_e(v)} \prod_{u \in \{v\} \cup NEG(v)} \left[\sigma(\mathbf{r}_t^\top \boldsymbol{\theta}'_u)\right]^{L^v(u)} \cdot \left[1 - \sigma(\mathbf{r}_t^\top \boldsymbol{\theta}'_u)\right]^{1 - L^v(u)}, \tag{6} $$

where $E(v)$ is the set of hyperedges, namely relationships, associated with the center node $v$; $T_e(v)$ is the set of nodes related to the center node $v$ through the relationship $e$; $NEG(v)$ is the subset of negative samples of the center node $v$; $\boldsymbol{\theta}_u$ is the parameter vector of the node $u$; and the parameter vector $\boldsymbol{\theta}'_u$ is the sum of the parameter vector $\boldsymbol{\theta}_u$ and the relationship vector $\mathbf{r}_e$, namely $\boldsymbol{\theta}'_u = \boldsymbol{\theta}_u + \mathbf{r}_e$.

For $u \in \{v\} \cup NEG(v)$, the node label $L^v(u)$ in Equation (6) is defined as follows:

$$ L^v(u) = \begin{cases} 1, & u = v, \\ 0, & u \neq v. \end{cases} \tag{7} $$

Formally, maximizing this target function makes the learned node representation vectors rich in the hyperedges.
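Analogously, a minimal sketch of the translated score in Equation (6): the relationship vector shifts the parameter vector of a candidate node before the sigmoid is applied. The randomly drawn vectors are placeholders for illustration.

```python
import numpy as np

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + np.exp(-x))

def hyperedge_prob(r_t: np.ndarray, theta_u: np.ndarray,
                   r_e: np.ndarray, label: int) -> float:
    """Per-sample factor of Equation (6): the relationship vector r_e acts
    as a translation on the parameter vector, theta'_u = theta_u + r_e."""
    score = sigmoid(r_t @ (theta_u + r_e))
    return score if label == 1 else 1.0 - score

rng = np.random.default_rng(2)
r_t, theta_u, r_e = (rng.normal(size=4) for _ in range(3))
print(hyperedge_prob(r_t, theta_u, r_e, label=1))  # u is the center node v
print(hyperedge_prob(r_t, theta_u, r_e, label=0))  # u is a negative sample
```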
5.3. Joint Optimization
In this subsection, the hypernetwork representation learning approach based on hyperedge modeling, abbreviated as HRHM, is proposed. The HRHM approach jointly optimizes the cognitive topology model and the cognitive hyperedge model. Figure 3 shows the HRHM framework. As illustrated in Figure 3, the network topology representation and the hyperedge representation, produced by the cognitive topology model and the cognitive hyperedge model, respectively, share the same node representation vectors.
For ease of calculation, we take the logarithm of $G_1$ and $G_2$ and maximize the following joint optimization target function so that the hyperedges are fully incorporated into the node representation vectors:

$$ \mathcal{L} = \alpha \log G_1 + (1 - \alpha) \log G_2, \tag{8} $$

where the harmonic factor $\alpha$ is used to equilibrate the contribution rates of the cognitive topology model and the cognitive hyperedge model.

For ease of derivation, the contribution $\mathcal{L}(v, u)$ of a center node $v$ and a sample $u \in \{v\} \cup NEG(v)$ to the target function is denoted as follows:

$$ \begin{aligned} \mathcal{L}(v, u) = {} & \alpha \left\{ L^v(u) \log \sigma(\mathbf{X}_v^\top \boldsymbol{\theta}_u) + \left[1 - L^v(u)\right] \log \left[1 - \sigma(\mathbf{X}_v^\top \boldsymbol{\theta}_u)\right] \right\} \\ & + (1 - \alpha) \sum_{e \in E(v)} \sum_{t \in T_e(v)} \left\{ L^v(u) \log \sigma(\mathbf{r}_t^\top \boldsymbol{\theta}'_u) + \left[1 - L^v(u)\right] \log \left[1 - \sigma(\mathbf{r}_t^\top \boldsymbol{\theta}'_u)\right] \right\}. \end{aligned} \tag{9} $$
The stochastic gradient ascent approach is used to optimize the target function $\mathcal{L}$. The key step is the calculation of the following four kinds of gradients of $\mathcal{L}(v, u)$.
Firstly, the gradient with respect to the parameter vector $\boldsymbol{\theta}_u$ is calculated as follows:

$$ \frac{\partial \mathcal{L}(v, u)}{\partial \boldsymbol{\theta}_u} = \alpha \left[L^v(u) - \sigma(\mathbf{X}_v^\top \boldsymbol{\theta}_u)\right] \mathbf{X}_v. \tag{10} $$

Therefore, the parameter vector $\boldsymbol{\theta}_u$ is updated as follows:

$$ \boldsymbol{\theta}_u \leftarrow \boldsymbol{\theta}_u + \eta \alpha \left[L^v(u) - \sigma(\mathbf{X}_v^\top \boldsymbol{\theta}_u)\right] \mathbf{X}_v, \tag{11} $$

where $\eta$ is the learning rate.

Secondly, the gradient with respect to the sum vector $\mathbf{X}_v$ is calculated as follows via the symmetry property between $\mathbf{X}_v$ and $\boldsymbol{\theta}_u$ in Equation (9):

$$ \frac{\partial \mathcal{L}(v, u)}{\partial \mathbf{X}_v} = \alpha \left[L^v(u) - \sigma(\mathbf{X}_v^\top \boldsymbol{\theta}_u)\right] \boldsymbol{\theta}_u. \tag{12} $$

Therefore, the representation vector $\mathbf{r}_w$ is updated as follows, where $w \in Context(v)$:

$$ \mathbf{r}_w \leftarrow \mathbf{r}_w + \eta \alpha \sum_{u \in \{v\} \cup NEG(v)} \left[L^v(u) - \sigma(\mathbf{X}_v^\top \boldsymbol{\theta}_u)\right] \boldsymbol{\theta}_u. \tag{13} $$

Thirdly, for each hyperedge $e \in E(v)$ and each node $t \in T_e(v)$, the gradient with respect to the representation vector $\mathbf{r}_t$ is calculated as follows:

$$ \frac{\partial \mathcal{L}(v, u)}{\partial \mathbf{r}_t} = (1 - \alpha) \left[L^v(u) - \sigma(\mathbf{r}_t^\top \boldsymbol{\theta}'_u)\right] \boldsymbol{\theta}'_u. \tag{14} $$

Therefore, the representation vector $\mathbf{r}_t$ is updated as follows:

$$ \mathbf{r}_t \leftarrow \mathbf{r}_t + \eta (1 - \alpha) \sum_{u \in \{v\} \cup NEG(v)} \left[L^v(u) - \sigma(\mathbf{r}_t^\top \boldsymbol{\theta}'_u)\right] \boldsymbol{\theta}'_u. \tag{15} $$

Finally, the gradient with respect to the parameter vector $\boldsymbol{\theta}'_u$ is calculated as follows via the symmetry property between $\mathbf{r}_t$ and $\boldsymbol{\theta}'_u$:

$$ \frac{\partial \mathcal{L}(v, u)}{\partial \boldsymbol{\theta}'_u} = (1 - \alpha) \left[L^v(u) - \sigma(\mathbf{r}_t^\top \boldsymbol{\theta}'_u)\right] \mathbf{r}_t. \tag{16} $$

Specially, since $\boldsymbol{\theta}'_u = \boldsymbol{\theta}_u + \mathbf{r}_e$, the gradient in Equation (16) is used to update the parameter vectors $\boldsymbol{\theta}_u$ and $\mathbf{r}_e$, respectively, as follows:

$$ \boldsymbol{\theta}_u \leftarrow \boldsymbol{\theta}_u + \eta (1 - \alpha) \left[L^v(u) - \sigma(\mathbf{r}_t^\top \boldsymbol{\theta}'_u)\right] \mathbf{r}_t, \tag{17} $$

$$ \mathbf{r}_e \leftarrow \mathbf{r}_e + \eta (1 - \alpha) \left[L^v(u) - \sigma(\mathbf{r}_t^\top \boldsymbol{\theta}'_u)\right] \mathbf{r}_t. \tag{18} $$
The stochastic gradient ascent approach is used for optimization. More details are shown in Algorithm 1.
Algorithm 1: HRHM

Input: hypernetwork H = (V, E); vector dimension size d
Output: node representation matrix R

1  for each node v in V do
2      initialize the representation vector r_v and the parameter vector θ_v
3  end for
4  for each hyperedge e in E do
5      initialize the relationship vector r_e
6  end for
7  generate the node sequences S by random walks on the two-section graph
8  for each center node v in S do
9      update the parameter vectors θ_u according to Equation (11)
10     update the representation vectors r_w, w ∈ Context(v), according to Equation (13)
11     for each hyperedge e in E(v) do
12         for each node t in T_e(v) do
13             update the representation vector r_t according to Equation (15)
14             update the parameter vectors θ_u according to Equation (17)
15             update the relationship vector r_e according to Equation (18)
16         end for
17     end for
18 end for
19 for each node v in V do
20     R[v] ← r_v
21 end for
22 return R
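A condensed sketch of one stochastic-gradient-ascent step for the topology part (Equations (10)-(13)) may make the update rules concrete. The dictionary-based storage, the array shapes, and the in-place updates are assumptions of this sketch.

```python
import numpy as np

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + np.exp(-x))

def topology_step(R, Theta, context, center, negatives, alpha, eta):
    """One update of Equations (10)-(13): R and Theta map node ids to
    d-dimensional vectors; context holds the context nodes of the center."""
    X_v = sum(R[w] for w in context)        # sum vector over Context(v)
    e = np.zeros_like(X_v)                  # accumulated gradient on X_v
    for u, label in [(center, 1)] + [(n, 0) for n in negatives]:
        g = alpha * (label - sigmoid(X_v @ Theta[u]))
        e += g * Theta[u]                   # Equation (12), accumulated
        Theta[u] += eta * g * X_v           # Equation (11)
    for w in context:                       # Equation (13)
        R[w] += eta * e
    return R, Theta

# Toy usage with random 4-dimensional vectors.
rng = np.random.default_rng(3)
R = {n: rng.normal(size=4) for n in "abcv"}
Theta = {n: rng.normal(size=4) for n in "abcv"}
topology_step(R, Theta, context=["a", "b"], center="v",
              negatives=["c"], alpha=0.5, eta=0.025)
```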
6. Experiments and Result Analysis
6.1. Datasets
Four hypernetwork datasets are used to evaluate the HRHM approach. Detailed dataset statistics are shown in Table 1.
The four datasets are as follows:
- GPS [17] describes situations where a user takes part in an activity at a location. The triplet <user, location, activity> is utilized to construct the hypernetwork;
- MovieLens [18] describes personal tagging activities from MovieLens. The triplet <user, movie, tag> is utilized to construct the hypernetwork, where each movie has at least one genre;
- Drug (http://www.fda.gov/Drugs/, accessed on 27 January 2020) describes situations where a user who takes some drugs has certain reactions that lead to adverse events. The triplet <user, drug, reaction> is utilized to construct the hypernetwork;
- Wordnet [15] is made up of a set of triplets <head, relation, tail> extracted from WordNet 3.0. The triplet <head, relation, tail> is utilized to construct the hypernetwork.
6.2. Baseline Approaches
- DeepWalk: DeepWalk [19] is a popular approach for learning node representation vectors that encode social relationships.
- Node2vec: Node2vec [20] maps the nodes in the network to a low-dimensional representation space that preserves network neighborhoods of the nodes.
- LINE: LINE [21] embeds huge networks into low-dimensional vector spaces to preserve both local and global network structures.
- GraRep: GraRep [22] integrates global graph structural information into the process of representation learning.
- HOPE: HOPE [23] preserves higher-order proximities of large-scale graphs and captures the asymmetric transitivity.
- SDNE: SDNE [24] is a semi-supervised learning approach with multiple layers of non-linear functions to capture the highly non-linear network structure.
- HPHG: HPHG [11] proposes a deep model called Hyper-gram to capture pairwise and tuple relationships in the node sequences.
- HRHM: HRHM regards the interaction relationships among the nodes as translation operations in the representation space and incorporates these relationships into the node representation vectors.
6.3. Node Classification
Because labels are only available on the MovieLens and wordnet datasets, our approach is assessed via node classification [1] on these two datasets. The node classification accuracies are calculated by means of an SVM classifier [25].
The following observations can be made:
- With regard to the two datasets, the mean node classification accuracy of the HRHM approach surpasses that of the baseline approaches, and at every training ratio the HRHM approach surpasses the baseline approaches. Furthermore, the node classification accuracies of the HRHM approach grow with the training ratio, which shows that a larger amount of training data is helpful for node classification. It is worth noting that the absolute node classification accuracies of the HRHM approach are not high; the reason is that the categorical attributes of the two datasets are less prominent;
- The mean node classification accuracy of DeepWalk ranks second only to that of the HRHM approach, because DeepWalk captures the hypernetwork structure to a certain extent through the node sequences generated by random walks.

In short, the node representation vectors from the HRHM approach are better than those from the baseline approaches, which shows that the HRHM approach is effective.
6.4. Link Prediction
The following observations can be made:
- The HRHM approach performs worse than the HPHG approach. To be specific, the mean AUC value of the HPHG approach exceeds that of the HRHM approach by about 29%, 9%, 24%, and 4% on GPS, MovieLens, drug, and wordnet, respectively. The reason is that the HRHM approach transforms the hypergraph into a two-section graph, which loses part of the hypernetwork structure information, whereas the HPHG approach does not decompose the hyperedges and therefore preserves the hypernetwork structure information almost completely;
- Except for the HPHG approach, the mean AUC value of the HRHM approach surpasses that of the other baseline approaches on GPS, drug, and wordnet, and is very close to that of the best remaining baseline, DeepWalk, on MovieLens. To sum up, the HRHM approach surpasses most baseline approaches, which shows that it is effective;
- The HRHM approach performs consistently across different training ratios compared to the baseline approaches, which shows its feasibility and robustness;
- The HRHM approach almost always surpasses the baseline approaches that do not consider the hyperedges, which verifies the assumption that the hyperedges are useful for link prediction.
In short, the above observations show that the HRHM approach can learn node representation vectors of high quality.
6.5. Parameter Sensitivity
The contribution rates of the cognitive topology model and the cognitive hyperedge model are equilibrated by the harmonic factor $\alpha$. We set the training ratio to 50% and calculate the node classification accuracies for different values of $\alpha$ ranging from 0.1 to 0.9 on MovieLens and wordnet. Figure 4 shows the comparisons of node classification accuracies.

From Figure 4, it is found that the variation ranges of the node classification accuracies on the two datasets are both within 2%, which indicates that the HRHM approach is not sensitive to the parameter $\alpha$ and shows the robustness of the HRHM approach. In short, the best node classification results on both the MovieLens and wordnet datasets are achieved at an appropriate value of the harmonic factor $\alpha$.
7. Conclusions
The proposed hypernetwork representation learning approach based on hyperedge modeling consists of the cognitive topology model and the cognitive hyperedge model, which incorporate the hypernetwork topology structure and the hyperedges into the node representation vectors, respectively. The learning of the node representation vectors is regarded as a joint optimization problem, which is resolved via the stochastic gradient ascent approach. The advantage of the HRHM approach is that hyperedges of unfixed size are encoded in the learned node representation vectors. The experimental results show that the performance of the HRHM approach is better than that of the baseline approaches in almost all cases, except for the HPHG approach. In future research, we will not transform the hypergraph into an ordinary graph but will treat the hyperedges as a whole in the study of hypernetwork representation learning.
Author Contributions
Conceptualization, Y.Z. and H.Z.; methodology, Y.Z. and H.Z.; software, Y.Z.; validation, Y.Z.; formal analysis, Y.Z.; investigation, Y.Z. and H.Z.; resources, Y.Z.; data curation, Y.Z.; writing—original draft preparation, Y.Z.; writing—review and editing, Y.Z. and H.Z.; visualization, Y.Z.; supervision, Y.Z. and H.Z.; project administration, Y.Z. and H.Z.; funding acquisition, Y.Z., X.W. and J.H. All authors have read and agreed to the published version of the manuscript.
Funding
This research is funded by the National Natural Science Foundation of China, grant numbers 62166032, 62162053, and 62062059, and by the Natural Science Foundation of Qinghai Province, grant number 2022-ZJ-961Q.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Data are contained in the article.
Conflicts of Interest
The authors declare no conflict of interest.
Correction Statement
This article has been republished with a minor correction to the Funding statement. This change does not affect the scientific content of the article.
References
- Pedroche, F.; Tortosa, L.; Vicent, J.F. An eigenvector centrality for multiplex networks with data. Symmetry 2019, 11, 763. [Google Scholar] [CrossRef]
- Papageorgiou, I.; Bittner, D.; Psychogios, M.N.; Hadjidemetriou, S. Brain immunoinformatics: A symmetrical link between informatics, wet lab and the clinic. Symmetry 2021, 13, 2168. [Google Scholar] [CrossRef]
- Guerrero, M.; Banos, R.; Gil, C.; Montoya, F.G.; Alcayde, A. Evolutionary algorithms for community detection in continental-scale high-voltage transmission grids. Symmetry 2019, 11, 1472. [Google Scholar] [CrossRef]
- Zhou, D.Y.; Huang, J.Y.; Schölkopf, B. Learning with hypergraphs: Clustering, classification and embedding. In Proceedings of the 19th International Conference on Neural Information Processing Systems, Vancouver, Canada, 4–7 December 2006; pp. 1601–1608. [Google Scholar]
- Sharma, K.K.; Seal, A.; Herrera-Viedma, E.; Krejcar, O. An enhanced spectral clustering algorithm with s-distance. Symmetry 2021, 13, 596. [Google Scholar] [CrossRef]
- Feng, Y.F.; You, H.X.; Zhang, Z.Z.; Ji, R.R.; Gao, Y. Hypergraph neural networks. In Proceedings of the 33rd AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; pp. 3558–3565. [Google Scholar]
- Baytas, I.M.; Xiao, C.; Wang, F.; Jain, A.K.; Zhou, J.Y. Heterogeneous hyper-network embedding. In Proceedings of the 18th IEEE International Conference on Data Mining, Singapore, 17–20 November 2018; pp. 875–880. [Google Scholar]
- Tu, K.; Cui, P.; Wang, X.; Wang, F.; Zhu, W.W. Structural deep embedding for hyper-networks. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; pp. 426–433. [Google Scholar]
- Yadati, N.; Nimishakavi, M.; Yadav, P.; Nitin, V.; Louis, A.; Talukdar, P. HyperGCN: A new approach of training graph convolutional networks on hypergraphs. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019. [Google Scholar]
- Bandyopadhyay, S.; Das, K.; Murty, M.N. Line hypergraph convolution network: Applying graph convolution for hypergraphs. arXiv 2020, arXiv:2002.03392. [Google Scholar]
- Huang, J.; Liu, X.; Song, Y.Q. Hyper-path-based representation learning for hyper-networks. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China, 3–7 November 2019; pp. 449–458. [Google Scholar]
- Zhang, R.C.; Zou, Y.S.; Ma, J. Hyper-SAGNN: A self-attention based graph neural network for hypergraphs. arXiv 2019, arXiv:1911.02613. [Google Scholar]
- Song, G.; Li, J.W.; Wang, Z. Occluded offline handwritten chinese character inpainting via generative adversarial network and self-attention mechanism. Neurocomputing 2020, 415, 146–156. [Google Scholar] [CrossRef]
- Zhu, Y.; Zhao, H.X. Hypernetwork representation learning with the set constraint. Appl. Sci. 2022, 12, 2650. [Google Scholar] [CrossRef]
- Bordes, A.; Usunier, N.; Garcia-Duran, A.; Weston, J.; Yakhnenko, O. Translating embeddings for modeling multi-relational data. In Proceedings of the 26th International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA, 5–10 December 2013; pp. 2787–2795. [Google Scholar]
- Zhu, Y.; Ye, Z.L.; Zhao, H.X.; Zhang, K. Text-enhanced network representation learning. Front. Comput. Sci. 2020, 14, 146322. [Google Scholar] [CrossRef]
- Zheng, V.W.; Cao, B.; Zheng, Y.; Xie, X.; Yang, Q. Collaborative filtering meets mobile recommendation: A user-centered approach. In Proceedings of the 24th AAAI Conference on Artificial Intelligence, Atlanta, GA, USA, 11–15 July 2010; pp. 236–241. [Google Scholar]
- Harper, F.M.; Konstan, J.A. The movielens datasets: History and context. ACM Trans. Interact. Intell. Syst. 2015, 5, 19. [Google Scholar] [CrossRef]
- Perozzi, B.; Al-Rfou, R.; Skiena, S. DeepWalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD Internatonal Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 24–27 August 2014; pp. 701–710. [Google Scholar]
- Grover, A.; Leskovec, J. Node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 855–864. [Google Scholar]
- Tang, J.; Qu, M.; Wang, M.Z.; Zhang, M.; Yan, J.; Mei, Q.Z. Line: Large-scale information network embedding. In Proceedings of the 24th International Conference on World Wide Web, Florence, Italy, 18–22 May 2015; pp. 1067–1077. [Google Scholar]
- Cao, S.S.; Lu, W.; Xu, Q.K. Grarep: Learning graph representations with global structural information. In Proceedings of the 24th ACM International Conference on Information and Knowledge Management, Melbourne, Australia, 19–23 October 2015; pp. 891–900. [Google Scholar]
- Ou, M.D.; Cui, P.; Pei, J.; Zhang, Z.W.; Zhu, W.W. Asymmetric transitivity preserving graph embedding. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1105–1114. [Google Scholar]
- Wang, D.X.; Cui, P.; Zhu, W.W. Structural deep network embedding. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1225–1234. [Google Scholar]
- Gholamnia, K.; Nachappa, T.G.; Ghorbanzadeh, O.; Blaschke, T. Comparisons of diverse machine learning approaches for wildfire susceptibility mapping. Symmetry 2020, 12, 604. [Google Scholar] [CrossRef]
- Almardeny, Y.; Boujnah, N.; Cleary, F. A novel outlier detection method for multivariate data. IEEE Trans. Knowl. Data Eng. 2020, 34, 4052–4062. [Google Scholar] [CrossRef]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).