Universally Consistent Hyperbolic Deep Neural Networks
Date
2025-06
Publisher
Indian Statistical Institute, Kolkata
Abstract
The ubiquitous applicability of Deep Neural Networks has made them pivotal in modern computer science, with applications ranging from computer vision to pattern recognition and machine translation. Although these deep architectures are primarily built on Euclidean spaces, Hyperbolic Neural Networks (HNNs) have gained traction in recent years as a way to handle more complex non-Euclidean data with inherent hierarchical structure. HNN architectures have shown commendable improvements on tree- and graph-like data by exploiting the exponentially growing metric distances of hyperbolic spaces, which make these spaces better suited to embedding non-Euclidean data. Although HNNs surpass their conventional Euclidean counterparts by considerable margins, little to no theory is known that explains these surprising results. This thesis attempts to bridge that gap by analyzing universal consistency results for two types of HNNs: one Convolutional and one Transformer-based.
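As a small illustration of the exponentially growing metric distances mentioned above, the following sketch (not taken from the thesis; the function name and setup are ours) computes the geodesic distance in the Poincaré ball, one common model of hyperbolic space:

```python
import math

def poincare_distance(x, y):
    """Geodesic distance between points x, y inside the unit Poincare ball.

    d(x, y) = arccosh(1 + 2 * ||x - y||^2 / ((1 - ||x||^2) * (1 - ||y||^2)))
    """
    sq_norm = lambda v: sum(c * c for c in v)          # squared Euclidean norm
    diff = sq_norm([a - b for a, b in zip(x, y)])      # ||x - y||^2
    denom = (1.0 - sq_norm(x)) * (1.0 - sq_norm(y))    # shrinks near the boundary
    return math.acosh(1.0 + 2.0 * diff / denom)

# Distances blow up near the boundary: the Euclidean distance from the
# origin to (0.9, 0) is 0.9, but the hyperbolic distance is far larger.
print(poincare_distance([0.0, 0.0], [0.9, 0.0]))
```

Because the denominator vanishes as points approach the boundary of the ball, the volume of a hyperbolic neighborhood grows exponentially with its radius, which is why tree-like hierarchies embed with low distortion.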
We start by defining the corresponding architectures on hyperbolic spaces in a form suitable for our theoretical analyses, along with the statistical terminology required for the theoretical justification. The primary contribution of this work is establishing the universal consistency of HNNs. Universal consistency means that, as the sample size grows, the risk of the learned model converges to the best achievable (Bayes) risk for every underlying data distribution. The thesis rigorously proves that hyperbolic neural networks, when trained under sufficient regularity constraints, achieve universal consistency, ensuring they are capable of learning complex relationships across a broad range of tasks.
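In the standard statistical-learning formalization (our notation, not verbatim from the thesis), a classification rule $f_n$ trained on $n$ samples is universally consistent if its risk converges to the Bayes risk for every distribution of the data $(X, Y)$:

```latex
\lim_{n \to \infty} \mathbb{E}\!\left[ R(f_n) \right] = R^{*}
\quad \text{for every distribution of } (X, Y),
```

where $R(f) = \mathbb{P}\{ f(X) \neq Y \}$ is the misclassification risk and $R^{*} = \inf_{f} R(f)$ is the Bayes risk.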
Additionally, we present empirical results showcasing the superior performance of HNNs
on benchmark datasets, including those with hierarchical or non-Euclidean structures. These results
highlight the potential of hyperbolic neural networks to outperform traditional Euclidean-based models in tasks such as graph classification and representation learning.
In conclusion, this thesis establishes the universal consistency of two hyperbolic neural networks, providing a powerful framework for a wide variety of machine learning problems, particularly those involving complex, hierarchical, or non-Euclidean data, and paving the way for future research and applications in this area.
Description
Dissertation under the supervision of Dr. Swagatam Das
Keywords
Neural Networks, Hyperbolic Neural Networks (HNN)
Citation
74p.
