Please use this identifier to cite or link to this item: http://hdl.handle.net/10263/7518
Full metadata record
DC Field | Value | Language
dc.contributor.author | Ghorai, Sugata | -
dc.date.accessioned | 2025-02-07T12:07:07Z | -
dc.date.available | 2025-02-07T12:07:07Z | -
dc.date.issued | 2024-06 | -
dc.identifier.citation | 40p. | en_US
dc.identifier.uri | http://hdl.handle.net/10263/7518 | -
dc.description | Dissertation under the supervision of Dr. Ujjwal Bhattacharya | en_US
dc.description.abstract | Scene text detection is crucial for numerous applications, including autonomous driving and assistive technology for visually impaired individuals. This project leverages various versions of the You Only Look Once (YOLO) model to achieve efficient and accurate text detection in natural scenes. Given YOLO’s balance between speed and accuracy, it is an ideal candidate for real-time text detection tasks. Throughout this project, we compare different versions of YOLO, evaluating their performance on various multilingual datasets. These datasets comprise diverse scene text images with varying backgrounds, lighting conditions, and font styles. Each model is assessed based on metrics such as precision, recall, and mean Average Precision (mAP) score. As the YOLO versions are updated, their capability to detect text improves. Additionally, transfer learning is applied to datasets with common root languages, such as Hindi and Bengali or Telugu and Kannada. Our approach involves training these models in different ways and analyzing their performance on datasets with shared linguistic roots. Experimental results demonstrate that later YOLO versions significantly enhance text detection capabilities. This comparative analysis provides valuable insights into selecting the most suitable YOLO version for specific real-time text detection applications and highlights the benefits of transfer learning in multilingual contexts. | en_US
dc.language.iso | en | en_US
dc.publisher | Indian Statistical Institute, Kolkata | en_US
dc.relation.ispartofseries | MTech(CS) Dissertation;22-32 | -
dc.subject | You Only Look Once (YOLO) | en_US
dc.subject | Feature Pyramid Network(FPN) | en_US
dc.subject | PANet | en_US
dc.title | Scene Text Detection | en_US
dc.type | Other | en_US
Appears in Collections: Dissertations - M Tech (CS)

Files in This Item:
File | Description | Size | Format
Sugata_Ghorai-cs2232.pdf | Dissertations - M Tech (CS) | 842.43 kB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.