Universally Consistent Hyperbolic Deep Neural Networks

Date

2025-06

Publisher

Indian Statistical Institute, Kolkata

Abstract

Deep Neural Networks have become pivotal in modern computer science applications, ranging from Computer Vision to Pattern Recognition and Machine Translation. Although these deep architectures are primarily based on Euclidean spaces, Hyperbolic Neural Networks (HNNs) have gained traction in recent times for tackling more complex non-Euclidean data with inherent hierarchical structure. HNN architectures have shown commendable improvements in test performance on tree- or graph-like data by exploiting the exponentially growing metric distances of hyperbolic spaces, which make these spaces better suited to embedding non-Euclidean data. Although HNNs surpass their conventional Euclidean counterparts by commendable margins, little theory is known behind these surprising results. This thesis attempts to bridge that gap by analyzing universal consistency results for two types of HNNs: one Convolutional and one Transformer-based. We begin by defining the corresponding architectures on hyperbolic spaces in a form suitable for our theoretical analyses, and we also define the statistical terminology required for the theoretical justification. The primary contribution of this work is establishing the universal consistency of HNNs. Universal consistency refers to the property that, as the training sample size grows, the risk of the learned model converges to the Bayes-optimal risk for every data distribution. The thesis rigorously proves that hyperbolic neural networks, when trained under suitable regularity constraints, achieve universal consistency, ensuring they are capable of learning complex relationships across a broad range of tasks. Additionally, we present empirical results showcasing the strong performance of HNNs on benchmark datasets, including those with hierarchical or non-Euclidean structure.
These results highlight the potential of hyperbolic neural networks to outperform traditional Euclidean-based models in tasks such as graph classification and representation learning. In conclusion, this thesis establishes the universal consistency of two hyperbolic neural networks, providing a framework for tackling machine learning problems that involve complex, hierarchical, or non-Euclidean relationships, and paving the way for future research and applications in this area.
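As a concrete illustration of the "exponential metric distances" the abstract refers to (this sketch is not taken from the thesis), the geodesic distance in the Poincaré ball model of hyperbolic space blows up as points approach the unit boundary, giving tree-like data exponentially more "room" than Euclidean space:

```python
import math

def poincare_distance(x, y):
    """Geodesic distance between two points inside the unit Poincaré ball:
    d(x, y) = arcosh(1 + 2*||x - y||^2 / ((1 - ||x||^2) * (1 - ||y||^2)))."""
    sq_norm = lambda v: sum(c * c for c in v)
    diff = sq_norm([a - b for a, b in zip(x, y)])
    denom = (1.0 - sq_norm(x)) * (1.0 - sq_norm(y))
    return math.acosh(1.0 + 2.0 * diff / denom)

# The Euclidean distance from the origin to (0.99, 0) is just 0.99,
# but the hyperbolic distance is already more than 5 and diverges
# as the point approaches the boundary of the ball.
print(poincare_distance([0.0, 0.0], [0.99, 0.0]))
```

Hierarchies embed well under this metric because the number of points within hyperbolic distance r of the origin grows exponentially in r, matching the exponential growth of nodes in a tree.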

Description

Dissertation under the supervision of Dr. Swagatam Das

Keywords

Neural Networks, Hyperbolic Neural Networks (HNN)

Citation

74p.
