ORIGINAL PAPERS

Convergence of Hyperbolic Neural Networks Under Riemannian Stochastic Gradient Descent

  • 1. Department of Mathematics, University of California, Irvine, CA, USA;
    2. Department of Mathematics, Scientific Computing and Imaging Institute, University of Utah, Salt Lake City, UT, USA

Received date: 2022-10-31

  Revised date: 2023-07-28

  Accepted date: 2023-07-31

  Online published: 2023-10-05

Supported by

The work was partially supported by NSF Grants DMS-1854434, DMS-1952644, and DMS-2151235 at UC Irvine, and Bao Wang is supported by NSF Grants DMS-1924935, DMS-1952339, DMS-2110145, DMS-2152762, and DMS-2208361, and DOE Grants DE-SC0021142 and DE-SC0002722.

Abstract

We prove, under mild conditions, the convergence of a Riemannian gradient descent method for a hyperbolic neural network regression model, in both the batch and stochastic gradient descent settings. We also discuss a Riemannian version of the Adam algorithm, and we present numerical simulations of these algorithms on various benchmarks.
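To make the abstract concrete, the following is a minimal sketch of one Riemannian SGD step on the Poincaré ball model of hyperbolic space, the setting used in hyperbolic neural networks: the Euclidean gradient is rescaled by the inverse of the conformal metric factor, and the update follows the exponential map (via Möbius addition). This is an illustrative sketch, not the paper's implementation; the toy loss, learning rate, and target point are assumptions chosen for the example.

```python
import numpy as np

def mobius_add(x, y):
    """Mobius addition on the Poincare ball (curvature -1)."""
    xy = np.dot(x, y)
    x2, y2 = np.dot(x, x), np.dot(y, y)
    num = (1 + 2 * xy + y2) * x + (1 - x2) * y
    den = 1 + 2 * xy + x2 * y2
    return num / den

def exp_map(x, v):
    """Exponential map at x in direction v on the Poincare ball."""
    v_norm = np.linalg.norm(v)
    if v_norm < 1e-15:
        return x
    lam = 2.0 / (1.0 - np.dot(x, x))  # conformal factor lambda_x
    step = np.tanh(lam * v_norm / 2.0) * v / v_norm
    return mobius_add(x, step)

def rsgd_step(x, euclid_grad, lr):
    """One Riemannian SGD step: rescale the Euclidean gradient by the
    inverse metric, (1 - ||x||^2)^2 / 4, then move along the exponential map."""
    riem_grad = ((1 - np.dot(x, x)) ** 2 / 4.0) * euclid_grad
    return exp_map(x, -lr * riem_grad)

# Toy problem (illustrative only): pull a point toward a fixed target by
# minimizing f(x) = ||x - target||^2 / 2, whose Euclidean gradient is x - target.
target = np.array([0.3, -0.2])
x = np.array([-0.5, 0.4])
for _ in range(500):
    x = rsgd_step(x, x - target, lr=0.1)
```

Because the rescaled step stays inside the unit ball by construction (the exponential map never leaves the manifold), no explicit projection is needed, which is one practical appeal of the Riemannian update over a projected Euclidean one.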

Cite this article

Wes Whiting, Bao Wang, Jack Xin. Convergence of Hyperbolic Neural Networks Under Riemannian Stochastic Gradient Descent[J]. Communications on Applied Mathematics and Computation, 2024, 6(2): 1175-1188. DOI: 10.1007/s42967-023-00302-9

References

[1] Bécigneul, G., Ganea, O.-E.: Riemannian adaptive optimization methods. arXiv:1810.00760 (2019)
[2] Bonnabel, S.: Stochastic gradient descent on Riemannian manifolds. IEEE Trans. Autom. Control 58(9), 2217-2229 (2013)
[3] De Sa, C., Gu, A., Ré, C., Sala, F.: Representation tradeoffs for hyperbolic embeddings. arXiv:1804.03329 (2018)
[4] Ganea, O.-E., Bécigneul, G., Hofmann, T.: Hyperbolic neural networks. arXiv:1805.09112 (2018)
[5] Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv:1412.6980v9 (2014)
[6] Kratsios, A., Bilokopytov, I.: Non-Euclidean universal approximation. arXiv:2006.02341 (2020)
[7] Miller, G.A., Beckwith, R., Fellbaum, C., Gross, D., Miller, K.J.: Introduction to WordNet: an on-line lexical database. Int. J. Lexicogr. 3(4), 235-244 (1990)
[8] Nagano, Y., Yamaguchi, S., Fujita, Y., Koyama, M.: A wrapped normal distribution on hyperbolic space for gradient-based learning. arXiv:1902.02992 (2019)
[9] Neto, J.C., De Lima, L., Oliveira, P.: Geodesic algorithms in Riemannian geometry. Balkan J. Geom. Appl. (BJGA) 3, 01 (1998)
[10] Nickel, M., Kiela, D.: Poincaré embeddings for learning hierarchical representations. arXiv:1705.08039v2 (2017)
[11] Peng, W., Varanka, T., Mostafa, A., Shi, H., Zhao, G.: Hyperbolic deep neural networks: a survey. arXiv:2101.04562 (2021)
[12] Ungar, A.A.: A Gyrovector Space Approach to Hyperbolic Geometry. Morgan & Claypool Publishers, San Rafael (2009)