Make LLMs Private

dc.contributor.author: Pradhan, Sayan
dc.date.accessioned: 2025-03-18T10:21:52Z
dc.date.available: 2025-03-18T10:21:52Z
dc.date.issued: 2024-07
dc.description: Dissertation under the guidance of Prof. Dr. Bimal Kumar Roy and Prof. Dr. Bart Preneel
dc.description.abstract: Large Language Models (LLMs) are machine learning models trained on large text corpora, with the capability to understand and generate human language. Cryptographic techniques can protect both the input and output privacy of users. Secure Multi-Party Computation (MPC) is a cryptographic technique that allows multiple parties to jointly compute a function on their secret inputs without revealing any information about those inputs. One application of MPC lies in the domain of privacy-preserving machine learning (PPML). In this project, "Make LLMs Private", our goal is to choose a large language model and use MPC to make it private. We choose BERT, a large language model, and using different MPC protocols we construct an efficient combined MPC framework to evaluate it. Our focus is only on the inference phase of BERT. To achieve this, we require MPC protocols for multiplication, comparison, truncation, softmax, division, and normalization operations. ABY3 [10] and Falcon [17] provide MPC protocols for multiplication, division, and comparison between three parties. Protocols based on function secret sharing have smaller communication costs but larger computation costs. Sigma [6] introduces two-party MPC protocols for multiplication, comparison, exponential, truncation, and maximum using function secret sharing. In a 3-party setup, MPC becomes significantly faster, enabling practical deployment even for larger machine learning models. Computing the exponential in MPC is very expensive: SecureML [12] replaces the exponential with ReLU in an efficient way, while PUMA [5] and BOLT [13] approximate the exponential by polynomials. Bicoptor [19] introduces two-round MPC protocols for multiplication, comparison, and maximum with no pre-processing. We use these techniques to protect the BERT model.
We integrate various basic MPC protocols to construct MPC protocols for complex functions, such as the softmax function, and optimize both the number of rounds and the communication cost. We propose combined MPC frameworks, involving three parties, tailored to different scenarios in the BERT model.
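As an illustration of the secret-sharing setting the abstract describes, the sketch below shows 3-party additive secret sharing over the ring Z_{2^64}, the kind of arithmetic sharing used by three-party frameworks such as ABY3 and Falcon: a value is split into random shares that sum to it, no single share reveals anything, and addition is a purely local operation. The helper names (`share`, `reconstruct`, `add_shares`) are hypothetical and not taken from the dissertation.

```python
import secrets

# Illustrative sketch only: 3-party additive secret sharing over Z_{2^64}.
MOD = 2 ** 64


def share(x, n_parties=3):
    """Split x into n_parties random additive shares summing to x mod 2^64."""
    shares = [secrets.randbelow(MOD) for _ in range(n_parties - 1)]
    shares.append((x - sum(shares)) % MOD)  # last share fixes the sum
    return shares


def reconstruct(shares):
    """Recombine all shares; any proper subset reveals nothing about x."""
    return sum(shares) % MOD


def add_shares(a, b):
    """Secure addition is local: each party adds its own two shares,
    with no communication between parties."""
    return [(ai + bi) % MOD for ai, bi in zip(a, b)]


x, y = 42, 100
assert reconstruct(share(x)) == x
assert reconstruct(add_shares(share(x), share(y))) == x + y
```

Multiplication, comparison, and the softmax building blocks mentioned in the abstract require interaction between the parties on top of this local-addition structure, which is why round and communication costs are the optimization targets.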
dc.identifier.citation: 53p.
dc.identifier.uri: http://hdl.handle.net/10263/7540
dc.language.iso: en
dc.publisher: Indian Statistical Institute, Kolkata
dc.relation.ispartofseries: Dissertation;;CrS;22-12
dc.subject: Secure Multi-Party Computation (MPC)
dc.subject: Privacy-preserving machine learning (PPML)
dc.subject: LLMs
dc.subject: Cryptographic techniques
dc.title: Make LLMs Private
dc.type: Other

Files

Original bundle

Name: Sayan Pradhan-Crs2212-2024.pdf
Size: 936.2 KB
Format: Adobe Portable Document Format
Description: Dissertations - M Tech (CRS)

License bundle

Name: license.txt
Size: 1.71 KB
Description: Item-specific license agreed upon to submission