We prove, under mild conditions, the convergence of Riemannian gradient descent for a hyperbolic neural network regression model, in both the batch and the stochastic setting. We also discuss a Riemannian version of the Adam algorithm, and we present numerical simulations of these algorithms on various benchmarks.
Wes Whiting, Bao Wang, Jack Xin. Convergence of Hyperbolic Neural Networks Under Riemannian Stochastic Gradient Descent. Communications on Applied Mathematics and Computation, 2024, 6(2): 1175-1188. DOI: 10.1007/s42967-023-00302-9
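The Riemannian gradient descent discussed in the abstract can be illustrated by a minimal sketch of one update step on the Poincaré ball of curvature −1. This is not the paper's implementation; the function names are illustrative, and the formulas used (the conformal metric rescaling of the Euclidean gradient, Möbius addition, and the exponential map) are the standard ones for this model of hyperbolic space.

```python
import numpy as np

def poincare_rgrad(x, egrad):
    """Riemannian gradient on the Poincare ball: the metric is conformal,
    g_x = (2 / (1 - ||x||^2))^2 * I, so the Euclidean gradient is rescaled
    by the inverse metric factor (1 - ||x||^2)^2 / 4."""
    return ((1.0 - np.dot(x, x)) ** 2 / 4.0) * egrad

def mobius_add(x, y):
    """Mobius addition on the Poincare ball (curvature -1)."""
    xy, x2, y2 = np.dot(x, y), np.dot(x, x), np.dot(y, y)
    num = (1 + 2 * xy + y2) * x + (1 - x2) * y
    den = 1 + 2 * xy + x2 * y2
    return num / den

def exp_map(x, v):
    """Exponential map at x: follow the geodesic with initial velocity v."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return x
    lam = 2.0 / (1.0 - np.dot(x, x))  # conformal factor at x
    return mobius_add(x, np.tanh(lam * norm_v / 2.0) * v / norm_v)

def rsgd_step(x, egrad, lr=0.1):
    """One Riemannian (S)GD step: rescale the gradient by the metric,
    then move along the geodesic via the exponential map."""
    return exp_map(x, -lr * poincare_rgrad(x, egrad))
```

For example, iterating `rsgd_step` on `f(x) = ||x||^2` (Euclidean gradient `2 * x`) drives the iterate toward the origin while keeping it inside the unit ball; with a retraction in place of the exact exponential map, the same skeleton becomes the retraction-based variant analyzed in the stochastic setting.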