Exploring Character-level Attacks on Neural Ranking Models

Halder, Surjyanee

dc.contributor.author	Halder, Surjyanee
dc.date.accessioned	2025-07-15T10:26:21Z
dc.date.available	2025-07-15T10:26:21Z
dc.date.issued	2025-06
dc.description	Dissertation under the supervision of Dr. Debapriyo Majumdar	en_US
dc.description.abstract	Neural ranking models (NRMs) have achieved state-of-the-art performance in information retrieval, yet they remain highly susceptible to subtle adversarial inputs such as character-level typos. This project explores the robustness of such systems by introducing a reinforcement learning (RL)-based query perturbation framework. RL agents—PPO, DQN, and A2C—were trained to minimally modify user queries (e.g., through character deletions or swaps) with the goal of significantly altering the resulting document rankings, as measured by Kendall’s Tau. Experiments were conducted on the TREC DL 2019 and 2020 benchmarks using two different neural rankers: MiniLM and a fine-tuned CharacterBERT model. The perturbation attacks were shown to succeed in over 85% of cases for MiniLM and approximately 40% for CharacterBERT, indicating varying degrees of vulnerability. To mitigate these effects, a set of pretrained query recovery models—such as T5-large-spell, spelling-correction-base, and grammar correction modules—were applied to restore the original query form. When used in combination, these recovery mechanisms reduced the MiniLM attack success rate to around 52%, demonstrating partial robustness. This study underscores both the fragility of neural rankers to character-level noise and the value of lightweight correction pipelines in improving retrieval resilience.	en_US
dc.identifier.citation	45p.	en_US
dc.identifier.uri	http://hdl.handle.net/10263/7569
dc.language.iso	en	en_US
dc.publisher	Indian Statistical Institute, Kolkata	en_US
dc.relation.ispartofseries	MTech(CS) Dissertation;23-31
dc.subject	Neural Ranking Models	en_US
dc.subject	Reinforcement learning (RL)	en_US
dc.subject	TREC DL 2019	en_US
dc.subject	MiniLM	en_US
dc.subject	CharacterBERT model	en_US
dc.title	Exploring Character-level Attacks on Neural Ranking Models	en_US
dc.type	Other	en_US
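
The abstract describes minimal character-level query edits (deletions or swaps) whose effect on the ranking is measured by Kendall's Tau. The following is a minimal sketch of those two ingredients only, not the dissertation's implementation: the function names `perturb` and `kendall_tau` are hypothetical, the perturbation is chosen at random rather than by an RL agent, and the tau-a variant (no tie handling) is an assumption, since the record does not say which variant was used.

```python
import random
from itertools import combinations


def perturb(query: str, rng: random.Random) -> str:
    """Apply one random character deletion or adjacent swap (illustrative only)."""
    if len(query) < 2:
        return query
    i = rng.randrange(len(query) - 1)
    if rng.random() < 0.5:
        # Delete the character at position i.
        return query[:i] + query[i + 1:]
    # Swap the characters at positions i and i + 1.
    return query[:i] + query[i + 1] + query[i] + query[i + 2:]


def kendall_tau(rank_a: list[str], rank_b: list[str]) -> float:
    """Kendall's tau-a between two rankings of the same document IDs."""
    pos_b = {doc: r for r, doc in enumerate(rank_b)}
    concordant = discordant = 0
    # combinations() preserves order, so x always precedes y in rank_a.
    for x, y in combinations(rank_a, 2):
        if pos_b[x] < pos_b[y]:
            concordant += 1
        else:
            discordant += 1
    n_pairs = concordant + discordant
    return (concordant - discordant) / n_pairs if n_pairs else 1.0
```

In use, one would rank documents for the original query and for `perturb(query, rng)` with the same ranker, then compare the two rankings with `kendall_tau`; a value near 1 means the ranking barely changed, while a low or negative value means the perturbation reordered the results substantially.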

Files

Original bundle

Name: Surjyaneeh_dissertation (2) (1).pdf
Size: 1.11 MB
Format: Adobe Portable Document Format
Description: Dissertations - M Tech (CS)

Name: CS2331.pdf
Size: 959.39 KB
Format: Adobe Portable Document Format
Description: Plagiarism report

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed to upon submission
Description:

Collections

Dissertations - M Tech (CS)