Please use this identifier to cite or link to this item: http://hdl.handle.net/10263/7518
Title: Scene Text Detection
Authors: Ghorai, Sugata
Keywords: You Only Look Once (YOLO); Feature Pyramid Network (FPN); PANet
Issue Date: Jun-2024
Publisher: Indian Statistical Institute, Kolkata
Citation: 40p.
Series/Report no.: MTech(CS) Dissertation;22-32
Abstract: Scene text detection is crucial for numerous applications, including autonomous driving and assistive technology for visually impaired individuals. This project leverages various versions of the You Only Look Once (YOLO) model to achieve efficient and accurate text detection in natural scenes. Given YOLO’s balance between speed and accuracy, it is an ideal candidate for real-time text detection tasks. Throughout this project, we compare different versions of YOLO, evaluating their performance on various multilingual datasets. These datasets comprise diverse scene text images with varying backgrounds, lighting conditions, and font styles. Each model is assessed based on metrics such as precision, recall, and mean Average Precision (mAP) score. As the YOLO versions are updated, their capability to detect text improves. Additionally, transfer learning is applied to datasets with common root languages, such as Hindi and Bengali or Telugu and Kannada. Our approach involves training these models in different ways and analyzing their performance on datasets with shared linguistic roots. Experimental results demonstrate that later YOLO versions significantly enhance text detection capabilities. This comparative analysis provides valuable insights into selecting the most suitable YOLO version for specific real-time text detection applications and highlights the benefits of transfer learning in multilingual contexts.
Description: Dissertation under the supervision of Dr. Ujjwal Bhattacharya
URI: http://hdl.handle.net/10263/7518
Appears in Collections: Dissertations - M Tech (CS)

Files in This Item:
File: Sugata_Ghorai-cs2232.pdf
Description: Dissertations - M Tech (CS)
Size: 842.43 kB
Format: Adobe PDF

