Enhancing Text to SQL Generation with Dynamic Vector Search

Mondal, Soumyadweep

Please use this identifier to cite or link to this item: http://hdl.handle.net/10263/7530

Full metadata record

DC Field	Value	Language
dc.contributor.author	Mondal, Soumyadweep	-
dc.date.accessioned	2025-02-21T11:16:04Z	-
dc.date.available	2025-02-21T11:16:04Z	-
dc.date.issued	2024-07	-
dc.identifier.citation	30p.	en_US
dc.identifier.uri	http://hdl.handle.net/10263/7530	-
dc.description	Dissertation under the guidance of Jayanta Kumar Mukherjee and Prof. Dipti Prasad Mukherjee	en_US
dc.description.abstract	Generating accurate SQL from natural language questions (text-to-SQL) is a longstanding challenge due to the complexities involved in understanding user queries, comprehending database schemas, and generating SQL statements. Traditional text-to-SQL systems have utilized human-engineered solutions and deep neural networks. More recently, pre-trained language models (PLMs) have been employed for text-to-SQL tasks, showing promising results. However, as modern databases and user queries become increasingly complex, the limited comprehension capabilities of PLMs can lead to incorrect SQL generation. This necessitates sophisticated and tailored optimization methods, which restrict the applicability of PLM-based systems. In contrast, large language models (LLMs) have demonstrated significant advancements in natural language understanding as their scale increases. This thesis explores the integration of LLMs into text-to-SQL systems, highlighting unique opportunities, challenges, and solutions. We propose a novel approach that leverages examples similar to user queries, allowing the model to better understand and generate accurate SQL. This work provides a comprehensive review of LLM-based text-to-SQL systems, outlining current challenges and the evolutionary process of the field. We introduce datasets and metrics designed for evaluating text-to-SQL systems. Finally, we discuss remaining challenges and propose future directions for research in this domain.	en_US
dc.language.iso	en	en_US
dc.publisher	Indian Statistical Institute, Kolkata	en_US
dc.relation.ispartofseries	Dissertation;;CrS;22-17	-
dc.subject	pre-trained language models (PLMs)	en_US
dc.subject	large language models (LLMs)	en_US
dc.subject	Zero-Shot Experiments	en_US
dc.title	Enhancing Text to SQL Generation with Dynamic Vector Search	en_US
dc.type	Other	en_US
Appears in Collections:	Dissertations - M Tech (CRS)

Files in This Item:

File	Description	Size	Format
Soumyadweep_CrS2217_2024.pdf	Dissertations - M Tech (CRS)	601.32 kB	Adobe PDF
