Make LLMs Private

Date

2024-07

Publisher

Indian Statistical Institute, Kolkata

Abstract

Large Language Models (LLMs) are machine learning models trained on large text corpora that can understand and generate human language. Cryptographic techniques can protect both the input and output privacy of their users. Secure Multi-Party Computation (MPC) is a cryptographic technique that allows multiple parties to jointly compute a function on their secret inputs without revealing anything about those inputs. One application of MPC lies in the domain of privacy-preserving machine learning (PPML). In this project, "Make LLMs Private", our goal is to choose a large language model and use MPC to make it private. We choose BERT and, by combining different MPC protocols, construct an efficient MPC framework to evaluate it. Our focus is solely on BERT inference. To achieve this, we require MPC protocols for multiplication, comparison, truncation, softmax, division, and normalization. ABY3 [10] and Falcon [17] provide three-party MPC protocols for multiplication, division, and comparison. Protocols based on function secret sharing have smaller communication costs but larger computation costs; Sigma [6] introduces two-party protocols for multiplication, comparison, exponentiation, truncation, and maximum built on function secret sharing. In a three-party setup, MPC becomes considerably faster, enabling practical deployment even for larger machine learning models. Computing the exponential under MPC is particularly expensive: SecureML [12] efficiently replaces the exponential with ReLU, while Puma [5] and Bolt [13] approximate it by polynomials. Bicoptor [19] introduces two-round MPC protocols for multiplication, comparison, and maximum that require no pre-processing. We use these techniques to protect the BERT model.
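To make the secret-sharing idea underlying these protocols concrete, the following is a minimal plaintext sketch of three-party additive secret sharing over a fixed-width ring. The ring size 2**64 and the party count are illustrative assumptions, not details taken from the dissertation; real frameworks such as ABY3 use replicated sharing and dedicated multiplication protocols on top of this primitive.

```python
import secrets

# Assumption: arithmetic sharing over the ring Z_{2^64}, a common
# choice in MPC frameworks (mimics 64-bit machine integers).
MOD = 2**64

def share(x, n=3):
    """Split secret x into n additive shares that sum to x mod MOD.
    Any n-1 shares alone reveal nothing about x."""
    shares = [secrets.randbelow(MOD) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % MOD)
    return shares

def reconstruct(shares):
    """Recover the secret by summing all shares mod MOD."""
    return sum(shares) % MOD

def add_shares(xs, ys):
    """Secure addition is purely local: each party adds its own
    shares of x and y, with no communication."""
    return [(a + b) % MOD for a, b in zip(xs, ys)]

x_shares = share(42)
y_shares = share(100)
z_shares = add_shares(x_shares, y_shares)
assert reconstruct(z_shares) == 142
```

Addition being communication-free is what makes linear layers of a model cheap under MPC; the round and bandwidth costs discussed above come from the non-linear operations (comparison, exponential, division).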
We integrate these basic MPC protocols to construct protocols for complex functions, such as softmax, optimizing both the number of rounds and the communication cost. We propose combined three-party MPC frameworks tailored to different scenarios in the BERT model.
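As a plaintext illustration of how softmax decomposes into the primitives listed above, the sketch below computes softmax via maximum, subtraction, exponential, summation, and division; each commented step corresponds to a dedicated MPC sub-protocol in a secure evaluation. The low-degree Taylor approximation of the exponential is an illustrative stand-in for the polynomial approximations used by works like Puma and Bolt; the degree chosen here is an assumption.

```python
import math

def softmax_via_primitives(xs):
    """Plaintext reference for the MPC softmax decomposition."""
    m = max(xs)                           # secure maximum protocol
    exps = [math.exp(x - m) for x in xs]  # secure exponential (or a
                                          # polynomial approximation)
    total = sum(exps)                     # local addition of shares
    return [e / total for e in exps]      # secure division protocol

def exp_poly(x, degree=4):
    """Truncated Taylor series of exp(x) around 0 -- the kind of
    low-degree polynomial that replaces the exponential under MPC.
    Accurate only on a small interval around 0 (an assumption here);
    the max-subtraction above keeps inputs in such a range."""
    return sum(x**k / math.factorial(k) for k in range(degree + 1))

probs = softmax_via_primitives([1.0, 2.0, 3.0])
assert abs(sum(probs) - 1.0) < 1e-9
```

Subtracting the maximum serves two purposes: it gives numerical stability and keeps the exponential's inputs non-positive, which is exactly the regime where a low-degree polynomial approximation stays accurate.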

Description

Dissertation under the guidance of Prof. Dr. Bimal Kumar Roy and Prof. Dr. Bart Preneel

Keywords

Secure Multi-Party Computation (MPC), Privacy-preserving machine learning (PPML), LLMs, Cryptographic techniques

Citation

53p.
