Please use this identifier to cite or link to this item:
http://hdl.handle.net/10263/7518
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Ghorai, Sugata | - |
dc.date.accessioned | 2025-02-07T12:07:07Z | - |
dc.date.available | 2025-02-07T12:07:07Z | - |
dc.date.issued | 2024-06 | - |
dc.identifier.citation | 40p. | en_US |
dc.identifier.uri | http://hdl.handle.net/10263/7518 | - |
dc.description | Dissertation under the supervision of Dr. Ujjwal Bhattacharya | en_US |
dc.description.abstract | Scene text detection is crucial for numerous applications, including autonomous driving and assistive technology for visually impaired individuals. This project leverages various versions of the You Only Look Once (YOLO) model to achieve efficient and accurate text detection in natural scenes. Given YOLO’s balance between speed and accuracy, it is an ideal candidate for real-time text detection tasks. Throughout this project, we compare different versions of YOLO, evaluating their performance on various multilingual datasets. These datasets comprise diverse scene text images with varying backgrounds, lighting conditions, and font styles. Each model is assessed based on metrics such as precision, recall, and mean Average Precision (mAP) score. As the YOLO versions are updated, their capability to detect text improves. Additionally, transfer learning is applied to datasets with common root languages, such as Hindi and Bengali or Telugu and Kannada. Our approach involves training these models in different ways and analyzing their performance on datasets with shared linguistic roots. Experimental results demonstrate that later YOLO versions significantly enhance text detection capabilities. This comparative analysis provides valuable insights into selecting the most suitable YOLO version for specific real-time text detection applications and highlights the benefits of transfer learning in multilingual contexts. | en_US |
dc.language.iso | en | en_US |
dc.publisher | Indian Statistical Institute, Kolkata | en_US |
dc.relation.ispartofseries | MTech(CS) Dissertation;22-32 | - |
dc.subject | You Only Look Once (YOLO) | en_US |
dc.subject | Feature Pyramid Network (FPN) | en_US |
dc.subject | PANet | en_US |
dc.title | Scene Text Detection | en_US |
dc.type | Other | en_US |
Appears in Collections: | Dissertations - M Tech (CS) |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Sugata_Ghorai-cs2232.pdf | Dissertations - M Tech (CS) | 842.43 kB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.