Make LLMs Private
Date
2024-07
Authors
Publisher
Indian Statistical Institute, Kolkata
Abstract
Large Language Models (LLMs) are machine learning models trained on large text
corpora that can understand and generate human language. Cryptographic techniques
can protect both the input and output privacy of their users. Secure Multi-Party
Computation (MPC) is a cryptographic technique that allows multiple parties to
jointly compute a function on their secret inputs without revealing those inputs
to one another. One application of MPC lies in the domain of privacy-preserving
machine learning (PPML). In this project, “Make LLMs Private”, our goal is to
choose a large language model and use MPC to make it private. We choose BERT, a
large language model, and combine different MPC protocols into an efficient MPC
framework for evaluating it. Our focus is solely on BERT inference. To achieve
this, we require MPC protocols for multiplication, comparison, truncation, softmax,
division, and normalization. ABY3 [10] and Falcon [17] provide three-party MPC
protocols for multiplication, division, and comparison. Protocols based on function
secret sharing have smaller communication costs but larger computation costs;
Sigma [6] introduces two-party protocols based on function secret sharing for
multiplication, comparison, exponentiation, truncation, and maximum. In a
three-party setup, MPC becomes considerably faster, enabling practical deployment
even for larger machine learning models. Computing the exponential function in MPC
is very expensive: SecureML [12] efficiently replaces the exponential with ReLU,
while Puma [5] and Bolt [13] approximate it with polynomials. Bicoptor [19]
introduces two-round MPC protocols for multiplication, comparison, and maximum
with no pre-processing. We use these techniques to protect the BERT model. We
integrate basic MPC protocols to construct protocols for complex functions, such
as the softmax function, optimizing both round complexity and communication cost,
and we propose combined three-party MPC frameworks tailored to different scenarios
in the BERT model.
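To illustrate the secret-sharing idea the framework builds on, the following is a minimal sketch of three-party additive secret sharing over the ring Z_{2^64} (a common choice in frameworks such as ABY3); it is an illustration only, not the dissertation's actual protocol, and the function names are ours. A secret is split into random shares that sum to it, reconstruction adds the shares back, and addition of two shared values is a purely local operation for each party.

```python
import random

MOD = 2 ** 64  # shares live in the ring Z_{2^64}


def share(x, n=3):
    """Split secret x into n additive shares that sum to x mod MOD."""
    shares = [random.randrange(MOD) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % MOD)
    return shares


def reconstruct(shares):
    """Recover the secret by summing all shares mod MOD."""
    return sum(shares) % MOD


def add_shares(xs, ys):
    """Secure addition: each party adds its own shares of x and y locally,
    with no communication between parties."""
    return [(a + b) % MOD for a, b in zip(xs, ys)]


x, y = 42, 100
xs, ys = share(x), share(y)
assert reconstruct(xs) == x
assert reconstruct(add_shares(xs, ys)) == (x + y) % MOD
```

Multiplication of shared values, by contrast, requires interaction between the parties (e.g. via multiplication triples or replicated sharing), which is why the round and communication costs of multiplication-heavy layers dominate MPC inference.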
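Because exponentiation is expensive in MPC, softmax is typically evaluated with an approximation built from MPC-friendly operations (maximum, multiplication, one division), in the spirit of the polynomial approximations in Puma [5] and Bolt [13]. The sketch below, in plaintext Python for clarity, uses the repeated-squaring approximation exp(x) ≈ (1 + x/2^k)^(2^k), which needs only multiplications; the specific approximation and function names are our illustrative assumptions, not the dissertation's exact construction.

```python
def exp_by_squaring(x, k=6):
    """Approximate exp(x) for x <= 0 as (1 + x/2^k)^(2^k),
    computed with k squarings (multiplications only)."""
    t = 1.0 + x / (2 ** k)
    for _ in range(k):
        t = t * t  # repeated squaring: MPC-friendly, no native exp needed
    return t


def softmax_mpc_style(z, k=6):
    """Softmax built from MPC-friendly pieces: one secure maximum,
    multiplication-only exp approximations, and one secure division."""
    m = max(z)  # in MPC this would be a secure-maximum protocol
    exps = [exp_by_squaring(v - m, k) for v in z]  # all exponents are <= 0
    s = sum(exps)  # local addition of shares
    return [e / s for e in exps]  # one secure division, amortized over the vector
```

Subtracting the maximum keeps every exponent non-positive, which bounds the approximation error and avoids overflow; this is why a secure maximum protocol (as in Sigma [6] or Bicoptor [19]) is listed among the required building blocks.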
Description
Dissertation under the guidance of Prof. Dr. Bimal Kumar Roy and Prof. Dr. Bart Preneel
Keywords
Secure Multi-Party Computation (MPC), Privacy-preserving machine learning (PPML), LLMs, Cryptographic techniques
Citation
53p.
